I have a bash script that uses rsync to back up files on Arch Linux. I noticed that rsync failed to copy a file from /sys, while cp worked just fine:
# rsync /sys/class/net/enp3s1/address /tmp
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
ERROR: address failed verification -- update discarded.
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]
# cp /sys/class/net/enp3s1/address /tmp ## this works
Why does rsync fail, and is it possible to copy the file with it?
Answers:
Method 1
Rsync has code which specifically checks whether a file is truncated during the read and reports this error, ENODATA. I don’t know why the files in /sys have this behavior, but since they’re not real files, I guess it’s not too surprising. There doesn’t seem to be a way to tell rsync to skip this particular check.
I think you’re probably better off not rsyncing /sys and using specific scripts to cherry-pick out the particular information you want (like the network card address).
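As a rough illustration of the cherry-picking approach, here is a minimal Python sketch. The function names read_sysfs_attribute and save_attribute are made up for this example; the only real assumption about sysfs is that the attribute is small enough to read in one go until end-of-file.

```python
from pathlib import Path


def read_sysfs_attribute(path):
    """Return the text of a small pseudo file, read until end-of-file.

    The size reported by stat() is deliberately ignored, because sysfs
    reports a full page (typically 4096 bytes) for every attribute.
    """
    with open(path, "rb") as handle:
        data = handle.read()  # read() without a size stops at EOF
    return data.decode("ascii", errors="replace").strip()


def save_attribute(source, destination):
    """Write one attribute's value to a regular file and return the value."""
    value = read_sysfs_attribute(source)
    Path(destination).write_text(value + "\n")
    return value
```

With the path from the question, save_attribute("/sys/class/net/enp3s1/address", "/tmp/address") would store the MAC address in a regular file that rsync can then back up like any other.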
Method 2
First off, /sys is a pseudo filesystem. If you look at /proc/filesystems you will find a list of registered filesystems, quite a few of which have nodev
in front. This indicates that they are pseudo filesystems: they exist
on a running kernel as RAM-based filesystems and do not require a
block device.
$ cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev
...
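If you want that list programmatically, something like the following Python sketch should do. The function name pseudo_filesystems is invented here, and the ext4 line in the sample is only added to show a block-backed entry being skipped; on a real system you would pass in the contents of /proc/filesystems.

```python
def pseudo_filesystems(listing):
    """Return the filesystem names marked nodev in /proc/filesystems text."""
    names = []
    for line in listing.splitlines():
        fields = line.split()
        # Pseudo filesystems appear as "nodev <name>"; block-backed
        # filesystems appear as just "<name>" with leading whitespace.
        if len(fields) == 2 and fields[0] == "nodev":
            names.append(fields[1])
    return names


# Illustrative sample; the ext4 entry is an assumption for the demo.
sample = "nodev\tsysfs\nnodev\trootfs\nnodev\tbdev\n\text4\n"
print(pseudo_filesystems(sample))  # prints ['sysfs', 'rootfs', 'bdev']
```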
At boot the kernel mounts this filesystem and updates its entries as needed, e.g. when
new hardware is found during boot or by udev.
In /etc/mtab you typically find the mount by:
sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0
For a nice paper on the subject, read
Patrick Mochel’s The sysfs Filesystem.
stat of /sys files
If you go into a directory under /sys and run ls -l you will notice that
all files have the same size, typically 4096 bytes. This is what sysfs reports.
:/sys/devices/pci0000:00/0000:00:19.0/net/eth2$ ls -l
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_assign_type
-r--r--r-- 1 root root 4096 Apr 24 20:09 address
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_len
...
Further, you can run stat on a file and notice another distinct feature:
it occupies 0 blocks. Also, the inode of the root (stat /sys) is 1, /sys/fs typically
has inode 2, and so on.
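To see the mismatch for yourself, a small Python sketch along these lines compares what stat reports with what can actually be read. The helper name compare_reported_and_actual is hypothetical; os.stat and its st_size and st_blocks fields are standard on Linux.

```python
import os


def compare_reported_and_actual(path):
    """Return the stat() size, allocated blocks, and bytes actually readable."""
    info = os.stat(path)
    with open(path, "rb") as handle:
        actual = len(handle.read())  # read until EOF, ignoring st_size
    return {
        "reported_size": info.st_size,
        "blocks": info.st_blocks,
        "actual_size": actual,
    }
```

For a regular file the reported and actual sizes agree. For a sysfs attribute such as the address file above, you would expect a reported size of 4096, 0 blocks, and an actual size of only a few bytes.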
rsync vs. cp
The easiest way to explain why rsync fails to synchronize pseudo files is
perhaps by example.
Say we have a file named address that is 18 bytes. An ls or stat of the
file reports 4096 bytes.
rsync
- Opens file descriptor, fd.
- Uses fstat(fd) to get information such as size.
- Sets out to read size bytes, i.e. 4096 (read_size == 4096). That would be line 253 of the code linked by @mattdm.
  - Ask; read: 4096 bytes.
  - A short string is read, i.e. 18 bytes (nread == 18).
  - read_size = read_size - nread (4096 - 18 = 4078).
  - Ask; read: 4078 bytes.
  - 0 bytes read (as the first read consumed all bytes in the file).
  - nread == 0, line 255.
  - Unable to read 4096 bytes. Zero out buffer.
  - Set error ENODATA.
  - Return.
- Report error.
- Retry. (Above loop).
- Fail.
- Report error.
- FINE.
During this process it actually reads the entire file. But with no reliable size
available it cannot validate the result, so failure is the only option.
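The rsync steps above can be modelled in a few lines of Python. This is only a simplified simulation of the described behaviour, not rsync's actual C code; the function name read_like_rsync is made up, and an in-memory stream stands in for the 18-byte pseudo file whose stat size claims 4096.

```python
import errno
import io


def read_like_rsync(stream, reported_size):
    """Try to read exactly reported_size bytes, as in the steps above.

    Returns (buffer, error). On a short file the buffer is zero-padded
    up to reported_size and error is errno.ENODATA; otherwise error is 0.
    """
    chunks = []
    remaining = reported_size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # A 0-byte read arrived before the promised size was reached.
            chunks.append(b"\0" * remaining)  # zero out the rest
            return b"".join(chunks), errno.ENODATA
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks), 0


pseudo_file = io.BytesIO(b"00:11:22:33:44:55\n")  # 18 bytes of real content
buffer, error = read_like_rsync(pseudo_file, 4096)  # stat claimed 4096
print(len(buffer), errno.errorcode.get(error))  # prints 4096 ENODATA
```

Note that the first 18 bytes of the buffer hold the complete file content; the error comes purely from trusting the reported size.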
cp
- Opens file descriptor, fd.
- Uses fstat(fd) to get information such as st_size (also uses lstat and stat).
- Checks if the file is likely to be sparse, that is, whether it has holes etc.
  copy.c:1010
  /* Use a heuristic to determine whether SRC_NAME contains any sparse
   * blocks. If the file has fewer blocks than would normally be
   * needed for a file of its size, then at least one of the blocks in
   * the file is a hole. */
  sparse_src = is_probably_sparse (&src_open_sb);
  As stat reports the file to have zero blocks, it is categorized as sparse.
- Tries to read the file by extent-copy (a more efficient way to copy normal
  sparse files), and fails.
- Copy by sparse-copy.
  - Starts out with max read size of MAXINT.
    Typically 18446744073709551615 bytes on a 32 bit system.
  - Ask; read 4096 bytes. (Buffer size allocated in memory from stat information.)
  - A short string is read, i.e. 18 bytes.
  - Check if a hole is needed, nope.
  - Write buffer to target.
  - Subtract 18 from max read size.
  - Ask; read 4096 bytes.
  - 0 bytes, as all got consumed in the first read.
  - Return success.
- All OK. Update flags for file.
- FINE.
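For contrast, the core of the cp steps above can be sketched the same way in Python. Again this is a simplified model under my own naming (copy_like_cp), leaving out the sparse detection and hole handling; the point is only that the loop stops at end-of-file rather than at the size stat reported.

```python
import io


def copy_like_cp(source, target, buffer_size=4096):
    """Copy until read() reports end-of-file; return the bytes written."""
    total = 0
    while True:
        chunk = source.read(buffer_size)
        if not chunk:
            return total  # a 0-byte read means EOF, which counts as success
        target.write(chunk)
        total += len(chunk)


pseudo_file = io.BytesIO(b"00:11:22:33:44:55\n")  # 18 bytes of real content
copy = io.BytesIO()
print(copy_like_cp(pseudo_file, copy))  # prints 18
```

Because the reported size never enters the loop, the 4096 versus 18 mismatch goes unnoticed and the copy succeeds.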
Method 3
Might be related, but extended attribute calls will fail on sysfs:
# lsattr address
lsattr: Inappropriate ioctl for device While reading flags on address
Looking at my strace it looks like rsync tries to pull in extended attributes by default:
22964 <… getxattr resumed> , 0x7fff42845110, 132) = -1 ENODATA (No data available)
I tried finding a flag to give rsync to see if skipping extended attributes resolves the issue but wasn’t able to find anything (--xattrs turns them on at the destination).
Method 4
Rsync normally reads the file’s information, transfers the file contents or deltas to a temporary file in the destination directory, and then, after verifying the file’s data, renames it to the destination filename.
I believe the problem with sysfs is that all files show up as 4k (one memory page) yet they may contain only a few bytes. To avoid copying a potentially corrupt file to the destination, rsync cancels the copy when it sees a mismatch between the file’s metadata and what was actually copied.
At least on rsync v3.0.6 this behaviour can be avoided using the --inplace switch. Rsync will still detect errors, but since the destination files will already have been overwritten by the time it does, it will leave the potentially corrupt files there.
Note though that a side effect of it is that files end up being zero-padded to 4k as this is the size rsync thinks the files are. It shouldn’t make a difference in most cases as null bytes are usually ignored.
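If the padding does matter for your use, the trailing null bytes can be stripped after the transfer. A minimal Python sketch, assuming the copied files are small text attributes (the helper name strip_nul_padding is made up, and it would also remove legitimate trailing NUL bytes from binary data):

```python
def strip_nul_padding(data):
    """Remove trailing NUL bytes such as the padding described above."""
    return data.rstrip(b"\0")


# 18 bytes of content padded to 4096, as a copy made with --inplace might be.
padded = b"00:11:22:33:44:55\n" + b"\0" * 4078
print(len(padded), len(strip_nul_padding(padded)))  # prints 4096 18
```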
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.