Detailed sparse file information on Linux

I have a sparse file, in which only some blocks are allocated:

~% du -h --apparent-size example
100K    example
~% du -h example
52K     example

I would like to know which blocks of the file are actually allocated. Is there a system call or kernel interface that could be used to get a list of either the allocations or the holes of a file?

Simply checking for a long enough string of zeros (the approach used by GNU cp, rsync, etc) does not work correctly:

~% cp example example1  
~% du -h example1 
32K     example1

It detected other sequences of zeros that were actually allocated.

Answers:


Method 1

There is a similar question on SO. The currently accepted answer by @ephemient suggests using an ioctl called fiemap which is documented in linux/Documentation/filesystems/fiemap.txt. Quoting from that file:

The fiemap ioctl is an efficient method for userspace to get file
extent mappings. Instead of block-by-block mapping (such as bmap),
fiemap returns a list of extents.

Sounds like this is the kind of information you’re looking for. Support by filesystems is again optional:

File systems wishing to support fiemap must implement a ->fiemap
callback on their inode_operations structure.
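As a rough illustration of the ioctl described above, here is a minimal Python sketch that asks the kernel for a file's extents. The helper name fiemap_extents and the batch size are made up for this example. The request number 0xC020660B and the struct layouts are my reading of linux/fiemap.h on x86 and ARM and should be treated as assumptions, since other architectures may encode the ioctl number differently. On filesystems without a ->fiemap callback the call fails with OSError.

```python
import fcntl
import os
import struct

# Assumed values from linux/fiemap.h and linux/fs.h (x86/ARM encoding).
FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1            # flush dirty data before mapping
FIEMAP_EXTENT_LAST = 0x1          # set on the final extent of the file

# struct fiemap header: start, length, flags, mapped_extents,
# extent_count, reserved (32 bytes).
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: logical, physical, length, 2 reserved u64,
# flags, 3 reserved u32 (56 bytes).
EXTENT = struct.Struct("=QQQQQIIII")
BATCH = 32                        # extents requested per ioctl (arbitrary)


def fiemap_extents(path):
    """Return a list of (logical, physical, length, flags) tuples."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        start = 0
        done = False
        while not done and start < size:
            buf = bytearray(HEADER.size + BATCH * EXTENT.size)
            HEADER.pack_into(buf, 0, start, size - start,
                             FIEMAP_FLAG_SYNC, 0, BATCH, 0)
            # The kernel fills in mapped_extents and the extent array.
            fcntl.ioctl(fd, FS_IOC_FIEMAP, buf, True)
            mapped = HEADER.unpack_from(buf, 0)[3]
            if mapped == 0:
                break  # nothing allocated in the remaining range
            for i in range(mapped):
                fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
                logical, physical, length, flags = (
                    fields[0], fields[1], fields[2], fields[5])
                extents.append((logical, physical, length, flags))
                if flags & FIEMAP_EXTENT_LAST:
                    done = True
                    break
            # Continue after the last extent seen in this batch.
            start = logical + length
    finally:
        os.close(fd)
    return extents
```

Calling fiemap_extents("example") would list the allocated ranges; any gap between consecutive logical ranges is a hole.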

Support for the SEEK_DATA and SEEK_HOLE arguments to lseek, which you mentioned from Solaris, was added in Linux 3.1 according to the man page, so you could use that as well. The fiemap ioctl appears to be older, so it may be more portable across Linux versions for now, whereas lseek may be more portable across operating systems if Solaris offers the same interface.
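To make the lseek approach concrete, here is a minimal Python sketch that walks a file with os.SEEK_DATA and os.SEEK_HOLE and reports alternating hole and data ranges. The function name sparse_map is made up for this example. Note that on filesystems without real support, the kernel may simply report the whole file as data.

```python
import errno
import os


def sparse_map(path):
    """Return a list of (kind, offset, length), kind is 'data' or 'hole'."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Find the next offset >= `offset` that contains data.
                data_start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno != errno.ENXIO:
                    raise
                # ENXIO: no more data, so the rest of the file is a hole.
                segments.append(("hole", offset, size - offset))
                break
            if data_start > offset:
                segments.append(("hole", offset, data_start - offset))
            # Find the end of this data run (EOF counts as a hole).
            data_end = os.lseek(fd, data_start, os.SEEK_HOLE)
            segments.append(("data", data_start, data_end - data_start))
            offset = data_end
    finally:
        os.close(fd)
    return segments
```

For instance, looping over sparse_map("example") and printing each tuple gives a listing of the holes and allocated ranges, at whatever granularity the filesystem reports.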

Method 2

There is a collection of Python programs called sparseutils that use SEEK_HOLE and SEEK_DATA to determine which sections of a file are represented as holes and which are data. Usage is quite straightforward. mksparse can be used to generate a sparse file according to a given layout.

 $ echo hole,data,hole | mksparse --hole-size 4096 --data-size 4096 example
 $ du -sh example
 4.0K   example

The sparsemap program can be used to print the layout to stdout:

 $ sparsemap example
 HOLE 4096
 DATA 4096
 HOLE 4096

Method 3

It depends on the file system. I don’t believe there is a call, which may be why many tools don’t handle copying sparse files well. The GNU tool chain searches for large blocks of zeros, as that allows it to remove unused allocated blocks. Many copy tools will convert a sparse file into a file with all blocks allocated.

You will likely have to open the inode, and parse the result. Inode format is file system dependent. Some file systems may have part of your data in the inode itself.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
