Add lines to the beginning and end of the huge file

I need to add a line at the beginning and another at the end of a huge file.

I have tried as shown below.

  • for the first line:
    sed -i '1i'"$FirstLine" $Filename
  • for the last line:
    sed -i '$ a'"$Lastline" $Filename

The problem is that the first command traverses the entire file to insert the first line, and the second command traverses the entire file again to append the last line. Since the file is very large (14GB), this takes a very long time.

How can I add a line to the beginning and another to the end of a file while only reading the file once?

Answers:


Method 1

sed -i uses temporary files as an implementation detail, which is what you are experiencing. However, prepending data to the beginning of a data stream without overwriting the existing contents requires rewriting the file. There’s no way to get around that, even when avoiding sed -i.

If rewriting the file is not an option, you might consider manipulating it when it is read, for example:

{ echo some prepended text ; cat file ; } | command
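The same grouping can also supply a trailing line, so the consumer sees both extra lines while the file on disk stays untouched. A minimal sketch, assuming a throwaway sample file, a helper named with_header_footer, and wc -l standing in for the real consumer (all three are placeholders, not part of the original answer):

```shell
# Hypothetical helper: emit a header line, the file's contents, then a footer line.
# Arguments: $1 = file, $2 = header text, $3 = footer text.
with_header_footer() {
    printf '%s\n' "$2"
    cat -- "$1"
    printf '%s\n' "$3"
}

# Placeholder sample data; in practice this would be the existing large file.
sample=$(mktemp)
printf 'asdf\nqwer\n' > "$sample"

# wc -l stands in for whatever command consumes the data.
# The file is read once and never modified.
with_header_footer "$sample" 'first line' 'last line' | wc -l
```

This only helps when the extra lines are needed by a downstream reader; the file itself still has no header or footer afterwards.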

Also, sed is for editing streams — a file is not a stream. Use a program that is meant for this purpose, like ed or ex. The -i option to sed is not only not portable, it will also break any symlinks to your file, since it essentially deletes it and recreates it, which is pointless.

You can do this in a single command with ed like so:

ed -s file << 'EOF'
0a
prepend these lines
to the beginning
.
$a
append these lines
to the end
.
w
EOF

Note that depending on your implementation of ed, it may use a paging file, requiring you to have at least that much space available.

Method 2

Note that if you want to avoid allocating a whole copy of the file on disk, you could do:

sed '
1i\
begin
$a\
end' < file 1<> file

That uses the fact that when its stdin/stdout is a file, sed reads and writes by block. So here, it’s OK for it to overwrite the file it is reading as long as the first line you’re adding is smaller than sed’s block size (should be something like 4k or 8k).

Note though that if for some reason sed fails (killed, machine crash…), you’ll end up with the file half processed which will mean some data the size of the first line missing somewhere in the middle.

Also note that unless your sed is the GNU sed, that won’t work for binary data (but since you’re using -i, you are using GNU sed).

Method 3

Here are some choices (all of which will create a new copy of the file so make sure you have enough space for that):

  • simple echo/cat
    echo "first" > new_file; cat "$File" >> new_file
    echo "last" >> new_file
  • awk/gawk etc
    gawk 'BEGIN{print "first"}{print}END{print "last"}' "$File" > NewFile

    awk and its ilk read files line by line. The BEGIN{} block is executed before the first line and the END{} block after the last line. So, the command above means print "first" at the beginning, then print every line in the file and print "last" at the end.
  • Perl
    perl -ne 'BEGIN{print "first\n"} print;END{print "last\n"}' "$File" > NewFile

    This is essentially the same thing as the gawk above just written in Perl.
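Any of the choices above can be followed by a rename so that the new copy takes the place of the original. A minimal sketch of that pattern using printf and cat, assuming a hypothetical sample file and placeholder values for FirstLine and LastLine:

```shell
# Hypothetical sample file and line values.
File=$(mktemp)
printf 'asdf\nqwer\n' > "$File"
FirstLine='first'
LastLine='last'

# Create the temporary copy in the same directory as the original,
# so the final mv is a rename rather than a second full copy.
tmp=$(mktemp "$(dirname "$File")/wrapped.XXXXXX")

# One pass over the data: header, original contents, footer.
# The rename only happens if the whole group succeeded.
{
    printf '%s\n' "$FirstLine"
    cat -- "$File"
    printf '%s\n' "$LastLine"
} > "$tmp" && mv -- "$tmp" "$File"
```

The original is read once and written once, but the disk briefly holds two copies. The replacement file also takes on the temporary file's ownership and permissions, so those may need to be restored afterwards.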

Method 4

I prefer the much simpler:

gsed -i '1s/^/foo\n/; $s/$/\nbar/' filename.txt

This transforms the file:
asdf
qwer

to the file:
foo
asdf
qwer
bar

Method 5

You can use Vim in Ex mode:

ex -sc '1i|ALFA' -c '$a|BRAVO' -cx file
  1. 1 select first line
  2. i insert text and newline
  3. $ select last line
  4. a append text and newline
  5. x save and close

Method 6

There is no way to insert data at the beginning of a file¹. All you can do is create a new file, write the additional data, and append the old data. So you’ll have to rewrite the whole file at least once to insert the first line. You can, however, append the last line without rewriting the file.

sed -i '1i'"$FirstLine" "$Filename"
echo "$Lastline" >>"$Filename"

Alternatively, you can combine the two commands in one run of sed.
sed -i -e '1i'"$FirstLine" -e '$ a'"$Lastline" "$Filename"

sed -i creates a new output file and then moves it over the old file. This means that while sed is working, there is a second copy of the file using up space. You can avoid this by overwriting the file in place, but with major restrictions: the line you’re adding has to be smaller than sed’s buffer, and if your system crashes you’ll end up with a damaged file and some content lost in the middle, so I strongly recommend against it.
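For the append-only route with >>, one caveat: if the file does not already end with a newline, the new text is glued onto the last existing line. A minimal sketch that checks the final byte first, assuming a hypothetical sample file and a placeholder value for Lastline:

```shell
# Hypothetical sample file that lacks a trailing newline.
Filename=$(mktemp)
printf 'asdf\nqwer' > "$Filename"
Lastline='last'

# tail -c 1 reads only the final byte. Command substitution strips a
# trailing newline, so a non-empty result means the file does not end in one.
if [ -n "$(tail -c 1 "$Filename")" ]; then
    printf '\n' >> "$Filename"
fi

# Appending writes only the new bytes; the existing data is not rewritten.
printf '%s\n' "$Lastline" >> "$Filename"
```

Neither step reads the body of the file, so the cost does not grow with the file size.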

¹ Linux does have a way to insert data into a file, but it can only insert a whole number of filesystem blocks, it can’t insert strings of arbitrary lengths. It’s useful for some applications, such as databases and virtual machines, but it’s useless for text files.

Method 7

$ (echo "Some Text" ; cat file1) > file2

Method 8

Modern Linux kernels (4.1 and later for xfs, 4.2 and later for ext4) support inserting data at the beginning of a file via the fallocate() system call with FALLOC_FL_INSERT_RANGE. In essence this is a logical shifting operation: the data is logically relocated to a higher offset.

The constraint is granularity: the offset and length of the inserted range must be multiples of the filesystem block size. For text files you can probably allocate a little more than required (up to the block boundary) and fill the remainder with spaces or carriage returns, but that depends on your application.

I don’t know of any readily available Linux utility that manipulates file extents, but it isn’t difficult to write one: get a file descriptor and call fallocate() with the appropriate arguments. For further details, refer to the man page of the fallocate system call: http://man7.org/linux/man-pages/man2/fallocate.2.html


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
