2008-FAST-Avoiding the Disk Bottleneck in the Data Domain De(5)

Date: 2026-01-19   Source: unknown

A FAST-related paper.

sequence of client data bytes and has intrinsic and client-settable attributes or metadata. An object may be a conventional file, a backup image of an entire volume or a tape cartridge.

To write a range of bytes into an object, Content Store performs several operations.

Anchoring partitions the byte range into variable-length segments in a content-dependent manner [Man93, BDH94].

Segment fingerprinting computes the SHA-1 hash and generates the segment descriptor based on it. Each segment descriptor contains per-segment information of at least the fingerprint and size.

Segment mapping builds the tree of segments that records the mapping between object byte ranges and segment descriptors. The goal is to represent a data object using references to deduplicated segments.
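
The three write-side operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's implementation: the window length, anchor mask, and size bounds are made-up values; the anchor test hashes each window from scratch rather than using the fingerprinting schemes cited above; and a flat list of (offset, descriptor) pairs stands in for the tree of segments. The names `anchor`, `describe`, and `map_object` are hypothetical.

```python
import hashlib
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the paper.
WINDOW = 16          # bytes examined to decide whether a position is an anchor
MASK = (1 << 6) - 1  # anchor when the low 6 bits of the window hash are zero
MIN_SIZE = 32        # lower bound on segment length (must be >= WINDOW)
MAX_SIZE = 512       # upper bound on segment length


@dataclass(frozen=True)
class SegmentDescriptor:
    fingerprint: bytes  # SHA-1 digest of the segment contents
    size: int           # segment length in bytes


def anchor(data: bytes) -> list[bytes]:
    """Split data into variable-length segments at content-defined anchors."""
    segments = []
    start = 0
    for i in range(len(data)):
        length = i - start + 1
        if length < MIN_SIZE:
            continue
        # Hash the last WINDOW bytes; the cut depends only on local content.
        window = data[i + 1 - WINDOW : i + 1]
        h = int.from_bytes(hashlib.sha1(window).digest()[:4], "big")
        if (h & MASK) == 0 or length >= MAX_SIZE:
            segments.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])  # trailing segment may be short
    return segments


def describe(segment: bytes) -> SegmentDescriptor:
    """Segment fingerprinting: SHA-1 hash plus size."""
    return SegmentDescriptor(hashlib.sha1(segment).digest(), len(segment))


def map_object(data: bytes) -> list[tuple[int, SegmentDescriptor]]:
    """Segment mapping, flattened: (byte offset, descriptor) per segment."""
    mapping = []
    offset = 0
    for segment in anchor(data):
        mapping.append((offset, describe(segment)))
        offset += len(segment)
    return mapping
```

Because cut points depend on the bytes in a small window rather than on absolute offsets, an insertion near the start of an object tends to shift segment boundaries only locally, which is what lets later segments keep their fingerprints.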


Figure 2: Containers are self-describing, immutable, units of storage several megabytes in size. All segments are stored in containers.

3.2 Segment Store

To read a range of bytes in an object, Content Store traverses the tree of segments created by the segment mapping operation above to obtain the segment descriptors for the relevant segments. It fetches the segments from Segment Store and returns the requested byte range to the client.
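
As a rough sketch of this read path, the following Python assumes the segment mapping has been flattened into a sorted list of (offset, fingerprint, size) entries and that Segment Store is reachable through a `fetch` callable. Both are simplifications introduced here for illustration, and the names `build_mapping` and `read_range` are hypothetical.

```python
import bisect
import hashlib


def build_mapping(segments):
    """Return (mapping, store) for a list of segment byte strings.

    mapping: list of (offset, fingerprint, size), sorted by offset.
    store:   dict fingerprint -> segment bytes (stand-in for Segment Store).
    """
    mapping, store, offset = [], {}, 0
    for segment in segments:
        fp = hashlib.sha1(segment).digest()
        mapping.append((offset, fp, len(segment)))
        store[fp] = segment
        offset += len(segment)
    return mapping, store


def read_range(mapping, fetch, start, length):
    """Return the bytes in [start, start + length) of the mapped object."""
    end = start + length
    offsets = [off for off, _, _ in mapping]
    # Find the last segment whose offset is <= start.
    i = max(bisect.bisect_right(offsets, start) - 1, 0)
    out = bytearray()
    while i < len(mapping) and mapping[i][0] < end:
        off, fp, size = mapping[i]
        segment = fetch(fp)            # fetch the segment by its descriptor
        lo = max(start - off, 0)       # trim the head of the first segment
        hi = min(end - off, size)      # trim the tail of the last segment
        if lo < hi:
            out += segment[lo:hi]
        i += 1
    return bytes(out)
```

For example, with segments `b"hello "`, `b"dedup "`, and `b"world"`, reading 6 bytes at offset 4 touches the first two segments and returns `b"o dedu"`.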


Segment lookup finds the container storing the requested segment. This operation may trigger disk I/Os to look in the on-disk index, thus it is throughput sensitive.

Container retrieval reads the relevant portion of the indicated container by invoking the Container Manager.

Container unpacking decompresses the retrieved portion of the container and returns the requested data segment.

Segment Store is essentially a database of segments keyed by their segment descriptors. To support writes, it accepts segments with their segment descriptors and stores them. To support reads, it fetches segments designated by their segment descriptors.

To write a data segment, Segment Store performs several operations.

Segment filtering determines if a segment is a duplicate. This is the key operation to deduplicate segments and may trigger disk I/Os, thus its overhead can significantly impact throughput performance.

Container packing adds segments to be stored to a container, which is the unit of storage in the system. The packing operation also compresses segment data using a variation of the Ziv-Lempel algorithm. A container, when fully packed, is appended to the Container Manager.

Segment Indexing updates the segment index that maps segment descriptors to the container holding the segment, after the container has been appended to the Container Manager.
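
A toy in-memory model can make the ordering of these steps concrete. The sketch below is an assumption-laden illustration, not the system's design: the class name `ToySegmentStore` is hypothetical, a Python list stands in for the Container Manager, a dict stands in for the on-disk segment index, containers are sealed after a fixed number of segments rather than at a byte size, and `zlib` (DEFLATE) substitutes for the unspecified Ziv-Lempel variant. The `read` method mirrors the lookup, retrieval, and unpacking steps of the read path.

```python
import hashlib
import zlib


class ToySegmentStore:
    def __init__(self, container_capacity=4):
        self.capacity = container_capacity  # segments per container (assumed)
        self.index = {}       # fingerprint -> container ID (segment index)
        self.containers = []  # sealed containers; position is the container ID
        self.pending = {}     # fingerprint -> compressed data, container being packed

    def write(self, segment: bytes) -> bytes:
        fp = hashlib.sha1(segment).digest()
        # Segment filtering: skip segments that are already stored or queued.
        if fp in self.index or fp in self.pending:
            return fp
        # Container packing: compress and add to the open container.
        self.pending[fp] = zlib.compress(segment)
        if len(self.pending) >= self.capacity:
            self.flush()
        return fp

    def flush(self):
        """Seal the open container and append it to the container log."""
        if not self.pending:
            return
        self.containers.append(self.pending)
        container_id = len(self.containers) - 1
        # Segment indexing: done only after the container has been appended.
        for fp in self.pending:
            self.index[fp] = container_id
        self.pending = {}

    def read(self, fp: bytes) -> bytes:
        if fp in self.pending:                      # not yet sealed
            return zlib.decompress(self.pending[fp])
        container_id = self.index[fp]               # segment lookup
        container = self.containers[container_id]   # container retrieval
        return zlib.decompress(container[fp])       # container unpacking
```

Writing the same segment twice returns the same fingerprint and stores the data once; the filtering check is the only point where a duplicate is detected, which is why its cost matters so much for throughput.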

3.3 Container Manager

The Container Manager provides a storage container log abstraction, not a block abstraction, to Segment Store. Containers, shown in Figure 2, are self-describing in that a metadata section includes the segment descriptors for the stored segments. They are immutable in that new containers can be appended and old containers deleted, but containers cannot be modified once written. When Segment Store appends a container, the Container Manager returns a container ID which is unique over the life of the system.

The Container Manager is responsible for allocating, deallocating, reading, writing and reliably storing containers. It supports reads of the metadata section or a portion of the data section, but it only supports appends of whole containers. If a container is not full but needs to be written to disk, it is padded out to its full size. Container Manager is built on top of standard block storage. Advanced techniques such as Software RAID-6, continuous data scrubbing, container verification, and end-to-end data checks are applied to ensure a high level of data integrity and reliability.
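
The interface described in the two paragraphs above might look roughly like the following in-memory sketch. Everything here is hypothetical: the class and method names are invented for illustration, the container size is a small made-up value (the paper says several megabytes), and a dict replaces the block storage, RAID, and integrity machinery entirely.

```python
class ToyContainerManager:
    CONTAINER_SIZE = 1024  # assumed fixed size in bytes, for illustration only

    def __init__(self):
        self._log = {}     # container ID -> (metadata, padded data section)
        self._next_id = 0  # monotonically increasing, so IDs are never reused

    def append(self, descriptors, data: bytes) -> int:
        """Append a whole container and return its unique container ID."""
        if len(data) > self.CONTAINER_SIZE:
            raise ValueError("data section exceeds container size")
        # A container that is not full is padded out to its full size.
        padded = data.ljust(self.CONTAINER_SIZE, b"\0")
        container_id = self._next_id
        self._next_id += 1
        self._log[container_id] = (tuple(descriptors), padded)
        return container_id

    def read_metadata(self, container_id: int):
        """Return the segment descriptors stored in the metadata section."""
        return self._log[container_id][0]

    def read_data(self, container_id: int, offset: int, length: int) -> bytes:
        """Return a portion of the data section."""
        return self._log[container_id][1][offset : offset + length]

    def delete(self, container_id: int) -> None:
        """Delete an old container; there is deliberately no update method."""
        del self._log[container_id]
```

The absence of any in-place update method is the point of the sketch: callers can append, read, and delete, which is what makes the containers immutable once written.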

The container abstraction offers several benefits.

To read a data segment, Segment Store performs the following operations.

USENIX Association    FAST ’08: 6th USENIX Conference on File and Storage Technologies

