There are specific lines that I want to remove from a file. Let’s say it’s lines 20-37 and then line 45. How would I do that without specifying the content of those lines?
Answers:
Method 1
With sed, like so:
sed '20,37d; 45d' < input.txt > output.txt
If you wanted to do this in-place (the long option below is GNU sed syntax):
sed --in-place '20,37d; 45d' file.txt
Method 2
If the file fits comfortably in memory, you could also use ed.
The commands are quite similar to the sed ones above, with one notable difference: you have to pass the line numbers/ranges to be deleted in descending order (from the highest line number/range to the lowest). The reason is that when you delete, insert, split, or join lines with ed, the text buffer is updated after each subcommand, so once you delete some lines, the lines that follow are no longer at the same position in the buffer when the next subcommand runs. So you have to work backwards [1].
In-place editing:
ed -s in_file <<IN
45d
20,37d
w
q
IN
or
ed -s in_file <<< $'45d\n20,37d\nw\nq\n'
or
printf '%s\n' 45d 20,37d w q | ed -s in_file
Replace w (write) with ,p (print) if you want to print the resulting output instead of writing to the file. If you want to keep the original file intact and write to another file, you can pass the new file name to the w subcommand:
ed -s in_file <<IN
78,86d
65d
51d
20,37d
w out_file
q
IN
[1] Unless you are willing to calculate the new line numbers after each delete, which is quite trivial for this particular case (after deleting lines 20-37, i.e. 18 lines, line 45 becomes line 27), so you could run:
ed -s in_file <<IN
20,37d
27d
w
q
IN
However, if you have to delete multiple line numbers/ranges, working backwards is a no-brainer.
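The renumbering argument can be checked with a small Python sketch. It is only an illustration of the reasoning, not of how ed works internally: the helper name delete_ranges and the 50-line sample are made up for this example.

```python
def delete_ranges(lines, ranges):
    """Hypothetical helper: drop 1-based inclusive (start, end) ranges.

    Ranges are processed from highest to lowest, so an earlier deletion
    never shifts the position of a range that is still waiting.
    """
    result = list(lines)
    for start, end in sorted(ranges, reverse=True):
        del result[start - 1:end]
    return result


# Made-up sample: 50 lines named after their original line number.
lines = [f"line {n}" for n in range(1, 51)]

# Working backwards: the original line numbers can be used as-is.
backwards = delete_ranges(lines, [(20, 37), (45, 45)])

# Working forwards: after removing 18 lines, old line 45 is now line 27.
forward = list(lines)
del forward[19:37]  # lines 20-37
del forward[26:27]  # line 27, i.e. the original line 45

assert forward == backwards
```

Both orders give the same result here, but only the backwards one lets you keep using the line numbers you started with.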
Method 3
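Just read it into memory, alter it, then write it back. You can do something like:
filename = "foo"
linenums = {1, 3}  # 1-based numbers of the lines to delete
with open(filename, "r+") as f:
    # Keep every line whose 1-based number is not in linenums.
    kept = [line for number, line in enumerate(f, start=1) if number not in linenums]
    f.seek(0)          # rewind so the kept lines overwrite the old content
    f.writelines(kept)
    f.truncate()       # cut off whatever is left past the new end of the file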
Tested with a 5 line file. Credits to http://pleac.sourceforge.net/pleac_python/fileaccess.html, see section “Modifying a File in Place Without a Temporary File”. See also https://stackoverflow.com/questions/125703/how-do-i-modify-a-text-file-in-python
Some notes:
- One could first truncate the file, then write to it, rather than write, then truncate, as above. However, I don’t know of a Python flag that allows one to read, and then do a truncated write. But maybe I’m missing something, as the document isn’t all that clear. Which brings me to
- Sometimes the Python docs really suck. See http://docs.python.org/library/functions.html#open: "Modes ‘r+’, ‘w+’ and ‘a+’ open the file for updating (note that ‘w+’ truncates the file)." Does this mean anything to you? What the hell is “open for updating”?
- I don’t know if doing this in Python as opposed to something unixy like the stream editor is better. It might be more portable, but I don’t know how portable sed is. I just wrote it like that because I’m more comfortable with low level programming than using classic unix tools, which are good if they do exactly what you want, but (I think) are generally less flexible.
- This approach (manipulating the file in memory) trades memory for disk space. It should work OK on machines with a few GB of memory for files up to a few hundred MB. Python doesn’t handle strings very efficiently, so switching to C/C++ for example would slightly increase performance and greatly reduce memory usage.
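If memory is the concern, one alternative (not part of the original answer) is to stream the file line by line into a temporary file and then swap it into place. The following is a minimal sketch under that assumption; the function name delete_lines and its (start, end) range format are invented for illustration.

```python
import os
import tempfile


def delete_lines(path, ranges):
    """Hypothetical helper: remove 1-based inclusive (start, end) line ranges.

    Only one line is held in memory at a time; the result is written to a
    temporary file in the same directory, which then replaces the original.
    """
    doomed = set()
    for start, end in ranges:
        doomed.update(range(start, end + 1))

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", newline="") as dst, open(path, newline="") as src:
            for number, line in enumerate(src, start=1):
                if number not in doomed:
                    dst.write(line)
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temporary file behind
        raise
```

For the question's case this would be called as delete_lines("file.txt", [(20, 37), (45, 45)]). Unlike the in-memory version, this trades a second copy on disk for low memory use, and the replaced file gets the temporary file's permissions rather than the original's.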
Method 4
You can use Vim in Ex mode:
ex -sc '20,37d|45d|x' file
- d: delete
- x: save and close
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.