How do open files behave on Linux systems?

I just renamed a log file to “foo.log.old”, and assumed that the application will start writing a new logfile at “foo.log”. I was surprised to discover that it tracked the logfile to its new name, and kept appending lines to “foo.log.old”.

In Windows, I’m not familiar with this kind of behavior – I don’t know if it’s even possible to implement it. How exactly is this behavior implemented in Linux? Where can I learn more about it?

Answers:

Method 1

Programs connect to files through a number maintained by the filesystem (called an inode on traditional Unix filesystems). A file name is just a reference to that number, and possibly not a unique reference at that.

So several things to be aware of:

  1. Moving a file using mv does not change that underlying number unless you move it across filesystems (which is equivalent to using cp and then rm on the original).
  2. Because more than one name can connect to a single file (i.e. we have hard links), the data in “deleted” files doesn’t go away until all references to the underlying file go away.
  3. Perhaps most important: when a program opens a file, it makes a reference to it that is (for the purposes of when the data will be deleted) equivalent to having a file name connected to it.
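The first two points can be checked from a script. The following is only a minimal sketch, assuming a POSIX system and a temporary directory on a filesystem that supports hard links; the file names and the helper `inode_demo` are made up for illustration.

```python
import os
import tempfile

def inode_demo():
    """Rename a file, add a hard link, and report what happens to the inode."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "foo.log")      # hypothetical names
        renamed = os.path.join(d, "foo.log.old")
        alias = os.path.join(d, "alias.log")

        with open(original, "w") as f:
            f.write("hello\n")

        ino_before = os.stat(original).st_ino
        os.rename(original, renamed)               # same filesystem, like mv
        ino_after = os.stat(renamed).st_ino

        os.link(renamed, alias)                    # second name for the same inode
        links = os.stat(renamed).st_nlink          # number of names pointing at it

        os.remove(renamed)                         # drop one name only
        with open(alias) as f:                     # data is still reachable
            data = f.read()

        return ino_before == ino_after, links, data

same_inode, link_count, surviving_data = inode_demo()
print(same_inode, link_count, repr(surviving_data))
```

Removing one name only decrements the link count; the data stays reachable through the remaining name.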

This gives rise to several behaviors like:

  • A program can open a file for reading, but not actually read it until after the user has rm’ed it at the command line, and the program will still have access to the data.
  • The one you encountered: mv’ing a file does not disconnect the relationship between the file and any programs that have it open (unless you move across filesystem boundaries, in which case the program keeps its reference to the original file, while the new name refers to a copy).
  • If a program has opened a file for writing, and the user rm’s its last filename at the command line, the program can keep right on putting stuff into the file, but as soon as it closes the file there will be no more references to that data and it will go away.
  • Two programs that communicate through one or more files can obtain a crude, partial security by removing the file(s) after both have finished opening them. (This is not actual security, mind; it just transforms a gaping hole into a race condition.)
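The rename and rm behaviors above can be reproduced in a few lines. This is a rough sketch, assuming a POSIX system (it will not behave this way on Windows); `open_file_demo` and the file names are hypothetical and simply mirror the scenario in the question.

```python
import os
import tempfile

def open_file_demo():
    """Keep writing to a file after it is renamed, then read it after it is removed."""
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "foo.log")
        old = os.path.join(d, "foo.log.old")

        writer = open(log, "w")
        writer.write("before rename\n")
        writer.flush()

        os.rename(log, old)                  # like: mv foo.log foo.log.old
        writer.write("after rename\n")       # still goes to the same file
        writer.flush()

        new_log_exists = os.path.exists(log) # nothing recreated foo.log
        with open(old) as f:
            old_contents = f.read()

        reader = open(old)                   # open first, read later
        os.remove(old)                       # like: rm foo.log.old
        writer.close()
        name_gone = not os.path.exists(old)
        still_readable = reader.read()       # data outlives its last name
        reader.close()

        return new_log_exists, old_contents, name_gone, still_readable

new_log_exists, old_contents, name_gone, still_readable = open_file_demo()
print(new_log_exists, repr(old_contents), name_gone, repr(still_readable))
```

For the original log file problem, this is why an application has to close and reopen “foo.log” itself before a new file appears under that name.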

Method 2

To really see how this behavior is implemented you could look at some Unix programming books. Mathepic is right that it is related to inodes. The actual pathname is only used to open the file; once that’s done, the program references the file by its open file descriptor. The file descriptor in turn references the inode, which in this case doesn’t care whether the underlying file’s name has changed.
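The descriptor-to-inode relationship can be observed with `os.fstat`, which queries the file through the descriptor rather than through a path. The sketch below assumes a POSIX system; the `/proc/self/fd` lookup is Linux-specific and is skipped where it is unavailable. The helper `descriptor_demo` and the file names are illustrative only.

```python
import os
import tempfile

def descriptor_demo():
    """Show that a file descriptor follows the inode, not the name."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "foo.log")          # hypothetical names
        new_path = os.path.join(d, "foo.log.old")

        # The pathname is used once, here, to obtain a descriptor.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            ino_via_fd = os.fstat(fd).st_ino
            os.rename(path, new_path)
            os.write(fd, b"still the same file\n")

            # The descriptor still refers to the same inode, now under a new name.
            same = os.fstat(fd).st_ino == ino_via_fd == os.stat(new_path).st_ino

            # On Linux, /proc/self/fd/<n> is a symlink to the file's current name.
            proc_link = f"/proc/self/fd/{fd}"
            current_name = os.readlink(proc_link) if os.path.islink(proc_link) else None
        finally:
            os.close(fd)
        return same, current_name

same_inode_via_fd, current_name = descriptor_demo()
print(same_inode_via_fd, current_name)
```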

As far as implementing this in Windows, that’s a question for another site.

To read more about this without hitting the books, just search around for Linux filesystems and inodes. You might not find a single clear answer, but you’ll be able to understand why it behaves this way.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
