I have a scenario where one line needs to be added at the beginning and another at the end of a huge file.
I tried the following:
- For the first line:
sed -i '1i'"$FirstLine" $Filename
- For the last line:
sed -i '$ a'"$Lastline" $Filename
The problem is that the first command traverses the entire file to insert the first line, and the second command traverses the entire file again to append the last line. Since the file is very large (14GB), this takes a very long time.
How can I add a line to the beginning and another to the end of a file while only reading the file once?
Answers:
Method 1
sed -i uses tempfiles as an implementation detail, which is what you are experiencing. However, prepending data to the beginning of a data stream without overwriting the existing contents requires rewriting the file. There’s no way to get around that, even when avoiding sed -i.
If rewriting the file is not an option, you might consider manipulating it when it is read, for example:
{ echo some prepended text ; cat file ; } | command
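The same grouping extends to adding both a first and a last line at read time, without touching the file on disk. This is only a sketch: the function name wrapped_stream is made up for illustration, and it assumes the two lines and the file name are passed as arguments.

```shell
# wrapped_stream FIRST LAST FILE  (hypothetical helper name)
# Writes FIRST, then the contents of FILE, then LAST to standard output.
wrapped_stream() {
    printf '%s\n' "$1"
    cat -- "$3"
    printf '%s\n' "$2"
}
```

It could then be used as wrapped_stream "$FirstLine" "$Lastline" "$Filename" | command, so the consumer sees the extra lines while the 14GB file stays unchanged.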
Also, sed is for editing streams, and a file is not a stream. Use a program that is meant for this purpose, like ed or ex. The -i option to sed is not only non-portable, it will also break any symlinks to your file, since it essentially deletes the file and recreates it, which is pointless.
You can do this in a single command with ed like so:
ed -s file << 'EOF'
0a
prepend these lines
to the beginning
.
$a
append these lines
to the end
.
w
EOF
Note that depending on your implementation of ed, it may use a paging file, requiring you to have at least that much space available.
Method 2
Note that if you want to avoid allocating a whole copy of the file on disk, you could do:
sed '1i\
begin
$a\
end' < file 1<> file
That relies on the fact that when its stdin/stdout is a file, sed reads and writes by block. So here, it’s OK for it to overwrite the file it is reading as long as the first line you’re adding is smaller than sed’s block size (should be something like 4k or 8k).
Note though that if for some reason sed fails (killed, machine crash…), you’ll end up with the file half processed, which means that some data the size of the first line will be missing somewhere in the middle.
Also note that unless your sed is the GNU sed, that won’t work for binary data (but since you’re using -i, you are using GNU sed).
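To see the redirection trick in isolation, it can be tried on a small throwaway file first. The following is a demonstration sketch only; the scratch file and its contents are invented for the example, and it says nothing about how a 14GB file behaves if sed is interrupted.

```shell
# Work on a throwaway file, never on the real data.
scratch=$(mktemp)
printf 'line1\nline2\n' > "$scratch"

# "< file" opens the file for reading on stdin.
# "1<> file" opens the same file read-write on stdout WITHOUT truncating it,
# so sed's output overwrites the file from the start while sed reads ahead.
sed '1i\
begin
$a\
end' < "$scratch" 1<> "$scratch"

cat "$scratch"
```

Comparing the result against a copy produced the safe way (for example with cat into a new file) is a reasonable sanity check before trusting this on anything large.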
Method 3
Here are some choices (all of which will create a new copy of the file so make sure you have enough space for that):
- simple echo/cat
echo "first" > new_file; cat $File >> new_file; echo "last" >> new_file;
- awk/gawk etc
gawk 'BEGIN{print "first"}{print}END{print "last"}' $File > NewFile
awk and its ilk read files line by line. The BEGIN{} block is executed before the first line and the END{} block after the last line. So, the command above means: print "first" at the beginning, then print every line in the file, and print "last" at the end. (awk’s print already appends a newline, so none is needed in the strings.)
- Perl
perl -ne 'BEGIN{print "first\n"} print;END{print "last\n"}' $File > NewFile
This is essentially the same thing as the gawk above, just written in Perl.
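The echo/cat variant can be wrapped into a small function that writes the new copy next to the original and then renames it over the original. This is a sketch under assumptions: add_first_last is a hypothetical name, mktemp is assumed to be available, and the directory needs room for a full second copy.

```shell
# add_first_last FIRST LAST FILE  (hypothetical helper name)
# Reads FILE once, writes FIRST + contents + LAST to a temp file, then renames it.
add_first_last() {
    first=$1 last=$2 file=$3
    # Create the temp file beside the original so mv is a rename, not a copy.
    tmp=$(mktemp "$file.XXXXXX") || return 1
    {
        printf '%s\n' "$first"
        cat -- "$file"
        printf '%s\n' "$last"
    } > "$tmp" && mv -- "$tmp" "$file" || { rm -f -- "$tmp"; return 1; }
}
```

Like sed -i, this replaces the file with a new one, so symlinks, hard links, and the original permissions are not preserved (mktemp creates the file with restrictive permissions).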
Method 4
I prefer the much simpler:
gsed -i '1s/^/foo\n/gm; $s/$/\nbar/gm' filename.txt
This transforms the file:
asdf
qwer
to the file:
foo
asdf
qwer
bar
Method 5
You can use Vim in Ex mode:
ex -sc '1i|ALFA' -c '$a|BRAVO' -cx file
- 1 select first line
- i insert text and newline
- $ select last line
- a append text and newline
- x save and close
Method 6
There is no way to insert data at the beginning of a file¹; all you can do is create a new file, write the additional data, and append the old data. So you’ll have to rewrite the whole file at least once to insert the first line. You can append the last line without rewriting the file, however.
sed -i '1i'"$FirstLine" $Filename
echo "$Lastline" >>$Filename
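One detail worth guarding when appending with >>: if the file does not already end with a newline, the appended text is glued onto the existing last line. The sketch below checks the final byte first; append_last_line is a hypothetical name, and it assumes a tail that supports -c.

```shell
# append_last_line LINE FILE  (hypothetical helper name)
# Appends LINE as a new last line; only the end of FILE is read or written.
append_last_line() {
    line=$1 file=$2
    # Command substitution strips a trailing newline, so the result is empty
    # exactly when the last byte of the file is a newline.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
}
```

Since tail -c 1 only needs the end of the file, this stays cheap even for a very large file.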
Alternatively, you can combine the two commands in one run of sed.
sed -i -e '1i'"$FirstLine" -e '$ a'"$Lastline" $Filename
sed -i creates a new output file and then moves it over the old file. This means that while sed is working, there is a second copy of the file using up space. You can avoid this by overwriting the file in place, but with major restrictions: the line you’re adding has to be smaller than sed’s buffer, and if your system crashes you’ll end up with a damaged file and some content lost in the middle, so I strongly recommend against it.
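Because a second copy exists while sed -i runs, a rough pre-flight check can compare the file size with the free space in its directory. This is one possible check, not something from the answer itself: has_room_for_copy is a hypothetical name, and it assumes the temporary copy is written to the same directory as the file.

```shell
# has_room_for_copy FILE  (hypothetical helper name)
# Succeeds if the filesystem holding FILE has at least FILE's size available.
has_room_for_copy() {
    file=$1
    dir=$(dirname -- "$file")
    # File size in kilobytes, rounded up.
    size_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))
    # df -P prints a header, then one line per filesystem; field 4 is "available".
    avail_kb=$(df -P -k -- "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$size_kb" ]
}
```

A call such as has_room_for_copy "$Filename" && sed -i ... would then skip the rewrite when the disk is too full. The check is approximate, since other processes can consume space in the meantime.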
¹ Linux does have a way to insert data into a file, but it can only insert a whole number of filesystem blocks, it can’t insert strings of arbitrary lengths. It’s useful for some applications, such as databases and virtual machines, but it’s useless for text files.
Method 7
$ (echo "Some Text" ; cat file1) > file2
Method 8
Modern Linux kernels (4.1 and later for XFS, 4.2 and later for ext4) support inserting data at the beginning of a file via the fallocate() system call with FALLOC_FL_INSERT_RANGE. In essence this is a logical shifting operation: the data is logically relocated to a higher offset.
A constraint exists regarding the granularity of the range you want to insert at the beginning of the file: it must be a multiple of the filesystem block size. For text files you can probably allocate a little more than required (up to the granularity boundary) and fill the rest with spaces or carriage returns, but that depends on your application.
I don’t know of any readily available Linux utility that manipulates file extents, but it isn’t difficult to write one: get a file descriptor and call fallocate() with the appropriate arguments. For further details, refer to the man page of the fallocate system call: http://man7.org/linux/man-pages/man2/fallocate.2.html
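One candidate, assuming a reasonably recent util-linux, is the fallocate command, which has an --insert-range option. The sketch below shifts the file by one filesystem block and overwrites the resulting hole with the new first line padded with spaces. Treat it as an untested-at-scale illustration: both function names are hypothetical, it assumes GNU stat, ASCII text shorter than one block, a non-empty file, and an ext4 or XFS filesystem.

```shell
# pad_line TEXT WIDTH  (hypothetical helper name)
# Prints TEXT padded with trailing spaces so the line, including its
# newline, is exactly WIDTH bytes long (ASCII text assumed).
pad_line() {
    printf '%-*s\n' "$(( $2 - 1 ))" "$1"
}

# prepend_in_place TEXT FILE  (hypothetical helper name)
prepend_in_place() {
    text=$1 file=$2
    # Filesystem block size (GNU stat); the inserted range must be a multiple of it.
    blk=$(stat -f -c %S -- "$file") || return 1
    # The text plus its newline must fit in one block.
    [ "${#text}" -lt "$blk" ] || return 1
    # Shift existing data up by one block, leaving a hole at offset 0.
    fallocate --insert-range --offset 0 --length "$blk" -- "$file" || return 1
    # Fill the hole with the padded line; notrunc keeps the rest of the file.
    pad_line "$text" "$blk" | dd of="$file" conv=notrunc 2>/dev/null
}
```

The cost is a first line with trailing spaces up to the block boundary, which is the trade-off described above. The last line can still be appended with >> as usual.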
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.