I need to download a large file (1GB). I also have access to multiple computers running Linux, but each is limited to a 50kB/s download speed by an admin policy.
How do I distribute the download of this file across several computers and merge the segments once they have all been downloaded, so that I receive the file faster?
Answers:
Method 1
The common protocols HTTP, FTP and SFTP support range requests, so you can
request part of a file. Note that this also requires server support, so it
might or might not work in practice.
You can use curl with the -r (or --range) option to request a specific
byte range, then concatenate the downloaded parts. Example:
curl -r 0-104857600 -o distro1.iso 'http://files.cdn/distro.iso'
curl -r 104857601-209715200 -o distro2.iso 'http://files.cdn/distro.iso'
[…]
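Once you have gathered the individual parts, concatenate them in order. The glob below matches distro1.iso through distro9.iso and cannot match the output file itself; with ten or more parts, list them explicitly, since distro10.iso sorts before distro2.iso:
cat distro[1-9].iso > distro.iso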
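A shell glob sorts names lexically, so the order of the parts matters once there are ten or more of them. Below is a minimal sketch of a helper that joins parts in numeric order, assuming they share a prefix, a 1-based index, and a suffix. The name merge_parts and its parameters are hypothetical, not part of curl or cat.

```shell
# merge_parts OUTPUT PREFIX SUFFIX COUNT   (hypothetical helper)
# Concatenates PREFIX1SUFFIX .. PREFIX<COUNT>SUFFIX into OUTPUT in numeric
# order. The body runs in a subshell, so variables do not leak and `exit`
# only leaves the function.
merge_parts() (
    output=$1 prefix=$2 suffix=$3 count=$4

    # Refuse to start if any part is missing.
    i=1
    while [ "$i" -le "$count" ]; do
        if [ ! -f "${prefix}${i}${suffix}" ]; then
            echo "missing part: ${prefix}${i}${suffix}" >&2
            exit 1
        fi
        i=$(( i + 1 ))
    done

    # Truncate the output, then append each part in index order.
    : > "$output" || exit 1
    i=1
    while [ "$i" -le "$count" ]; do
        cat -- "${prefix}${i}${suffix}" >> "$output" || exit 1
        i=$(( i + 1 ))
    done
)
```

For the example above this would be called as: merge_parts distro.iso distro .iso 9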
You can get further information about the file, including its size, with the --head option:
curl --head 'http://files.cdn/distro.iso'
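The two headers of interest in that output are Content-Length (the file size in bytes) and Accept-Ranges (whether the server advertises range support). Here is a small sketch that extracts them from headers read on standard input. The name parse_head and the output format are made up for illustration. A server that omits Accept-Ranges may still honor range requests, so "no" here only means "not advertised".

```shell
# parse_head   (hypothetical helper)
# Reads HTTP response headers on stdin, e.g. from:
#   curl -s --head 'http://files.cdn/distro.iso' | parse_head
# Prints the advertised size and whether "Accept-Ranges: bytes" was seen.
parse_head() {
    awk '
        BEGIN { size = ""; ranges = "no" }
        { sub(/\r$/, "") }                       # strip CR from CRLF line ends
        /^HTTP\// { size = ""; ranges = "no" }   # new response (e.g. after a redirect)
        tolower($1) == "content-length:" { size = $2 }
        tolower($1) == "accept-ranges:" && tolower($2) == "bytes" { ranges = "yes" }
        END { print "size=" size; print "accept_ranges=" ranges }
    '
}
```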
You can retrieve the last chunk with an open-ended range, which runs to the end of the file:
curl -r 838860801- -o distro9.iso 'http://files.cdn/distro.iso'
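Rather than working out the ranges by hand, you could compute them from the file size and the number of machines. The sketch below assumes the size is already known (for instance from the Content-Length header). The function name print_ranges is hypothetical. Ranges are zero-based and inclusive at both ends, and the last one is left open so it absorbs any remainder.

```shell
# print_ranges SIZE PARTS   (hypothetical helper)
# Prints one value per line, suitable for curl's -r option, splitting
# SIZE bytes into PARTS contiguous, non-overlapping ranges.
print_ranges() (
    size=$1 parts=$2

    # Each part needs at least one byte.
    if [ "$parts" -lt 1 ] || [ "$size" -lt "$parts" ]; then
        echo "print_ranges: need PARTS >= 1 and SIZE >= PARTS" >&2
        exit 1
    fi

    chunk=$(( size / parts ))   # floor; the last part takes the remainder
    i=0
    while [ "$i" -lt "$parts" ]; do
        start=$(( i * chunk ))
        if [ "$i" -eq $(( parts - 1 )) ]; then
            echo "${start}-"                        # open-ended last range
        else
            echo "${start}-$(( start + chunk - 1 ))"
        fi
        i=$(( i + 1 ))
    done
)
```

For example, print_ranges 10 4 prints 0-1, 2-3, 4-5 and 6- on separate lines. Each line can then be passed to curl -r on a different machine.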
Read the curl man page for more options and explanations.
You can further leverage ssh and tmux to ease running and keeping
track of the downloads on multiple servers.
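One way this might look: start each download inside a detached tmux session over ssh, so it keeps running after you disconnect. The sketch below only prints the command so you can review it before running it. The function name launch_cmd, the host name and the session name are hypothetical, and it assumes none of the arguments contain quotes or other shell metacharacters.

```shell
# launch_cmd HOST SESSION RANGE OUTFILE URL   (hypothetical helper)
# Prints an ssh command that starts curl in a detached tmux session
# named SESSION on HOST. Nothing is executed here.
launch_cmd() {
    printf "ssh %s \"tmux new-session -d -s %s 'curl -r %s -o %s %s'\"\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Example (host1 is a placeholder host name):
#   launch_cmd host1 dl 0-104857600 distro1.iso http://files.cdn/distro.iso
# To run the printed command, pipe it to sh:
#   launch_cmd host1 dl 0-104857600 distro1.iso http://files.cdn/distro.iso | sh
```

You can check progress later with ssh -t host1 tmux attach -t dl, and copy the finished part back with scp.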
Method 2
It would take about 5.5 hours to download a 1 gigabyte file at 50 kilobytes per second.
Coordinating multiple computers so that each fetches part of the file may save some time.
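As a rough check of that estimate, here is a small sketch that computes the download time for a given number of machines. It assumes 1 GB = 1,000,000,000 bytes and 50 kB/s = 50,000 bytes per second, that the file splits evenly, and that copying the parts between machines afterwards costs nothing. The function names are hypothetical.

```shell
# eta_seconds SIZE_BYTES RATE_BYTES_PER_SEC MACHINES   (hypothetical helper)
# Seconds needed when MACHINES computers each download an equal share
# at RATE_BYTES_PER_SEC, rounded up.
eta_seconds() {
    echo $(( ($1 + $2 * $3 - 1) / ($2 * $3) ))
}

# format_hm SECONDS: prints whole hours and minutes, e.g. 5h33m.
format_hm() {
    echo "$(( $1 / 3600 ))h$(( ($1 % 3600) / 60 ))m"
}
```

With these assumptions, eta_seconds 1000000000 50000 1 gives 20000 seconds (about 5.5 hours), and four machines bring it down to 5000 seconds.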
You can look at BitTorrent and use web seeding along with transfers via peer exchange. Each client can receive pieces and share completed pieces within the local area network (LAN). You end up with the same 1 GB file on each computer, but the merging of pieces is automated for you.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.