Newlines in filenames

I understand and accept the premise that defensive[1] shell scripting is both prudent and, in the longer term, more sustainable.

Many of the answers to text-processing questions here follow this principle by building in contingencies for unorthodox filenames, which might contain spaces, dashes and newlines.

How prevalent are newlines in filenames? Specifically:

  • Do any applications create filenames that include newlines by default?
  • Are there any situations where it would be desirable to create such filenames?
  • Or are they predominantly an instance of user error?

[1] Meaning planning for and managing the broadest possible range of scenarios and contingencies…
Question inspired by the (rather plaintive) comment on this question.

Answers:


Method 1

I’ve never seen a file name with a newline other than ones deliberately created to test applications that manipulate file names. File names containing newlines can appear because:

  • Some bug or user error (e.g. a bad copy-paste) resulted in an unintended file name.
  • Some filesystem corruption affected a file name.
  • Someone deliberately created a “strange” file name to exploit a security hole, where an application put more trust in the file names it was passed than it should have.

POSIX defines a filename as “a name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the slash character and the null byte. The filenames dot and dot-dot have special meaning.” There is no guarantee that every filesystem will accept “strange” file names (the only guaranteed characters are ASCII letters, digits, period, hyphen and underscore, i.e. A-Z, a-z, 0-9 and ._-, with hyphen forbidden in first position), but most native filesystems on modern unices do.
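To make the definition concrete, here is a minimal sketch (not part of the original answer) that creates a file whose name contains a newline and compares a line-based count with a glob-based one. The file names are hypothetical, and the sketch assumes `mktemp -d` is available and that the temporary directory lives on a filesystem that accepts such names.

```shell
#!/bin/sh
# Sketch: a newline is a legal filename byte on most native Unix filesystems.
dir=$(mktemp -d) || exit 1

# A variable holding a single newline character.
nl='
'

# Hypothetical names: one ordinary file, one whose name contains a newline.
touch "$dir/plain.txt" "$dir/two${nl}lines.txt"

# Line-based counting: the second name spans two lines of ls output.
# The arithmetic expansion strips any padding that wc adds.
line_count=$(( $(ls "$dir" | wc -l) ))

# Glob-based counting: each match becomes exactly one positional parameter.
set -- "$dir"/*
glob_count=$#

echo "lines: $line_count, files: $glob_count"
rm -rf "$dir"
```

On a filesystem that accepts the name, the line count should come out one higher than the number of files that actually exist, which is the kind of discrepancy that breaks scripts that parse `ls` output.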

Method 2

When writing a paper, I often collect a bibliography of PDF files from various sources. Not all of these contain the correct metadata, which means I sometimes copy-paste the title of the paper from the PDF viewer into the filename. This often results in newlines within the file name, but has never been an issue with any tools I have used.

IMHO there is nothing ‘defensive’ about coding to a standard, a standard which states that newlines are allowed in filenames. If your script does not handle all file names allowed in the standard, then your script is broken.
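As an illustration of what handling every allowed name can look like, here is a hedged sketch that lets `find` pass pathnames to an inner shell as separate arguments instead of as lines of text. The PDF names are made up to resemble a pasted title, and the `printf` call is only a stand-in for whatever per-file work a real script would do.

```shell
#!/bin/sh
# Sketch: process every regular file without ever treating names as text lines.
dir=$(mktemp -d) || exit 1

nl='
'

# Hypothetical bibliography files; the second name contains a pasted newline.
touch "$dir/paper one.pdf" "$dir/A Title${nl}Split Over Lines.pdf"

# find hands each pathname to the inner shell as one argument,
# so spaces and newlines inside a name are never re-split.
processed=$(
  find "$dir" -type f -exec sh -c '
    for path
    do
      printf "x"   # placeholder for real per-file work on "$path"
    done
  ' sh {} +
)

# One "x" was printed per file, so the string length is the file count.
echo "processed ${#processed} files"
rm -rf "$dir"
```

The `-exec ... {} +` form is specified by POSIX, so this approach does not depend on GNU-only options such as `-print0`.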

Method 3

I’ve never seen NORMAL users use newlines in filenames. It appears that their primary purpose is to (1) make it easy for attackers to subvert your system, and to (2) make it harder to write secure programs :-(. However, modern Unix-likes (such as Linux) allow them, so you have to prepare for them if you want a program that resists attack.

“Filenames and Pathnames in Shell: How to do it correctly” shows how to handle this correctly.
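A small sketch of the kind of precaution this implies is shown below. It is not taken from the cited essay, and the hostile file names are invented for illustration. It assumes the two usual safeguards: prefixing globs with `./` so that no name can be mistaken for an option, and double-quoting every expansion so that no name is split.

```shell
#!/bin/sh
# Sketch: iterate over awkward names without letting them change what runs.
dir=$(mktemp -d) || exit 1

nl='
'

# Hypothetical hostile names: one looks like an option, one contains a newline.
touch "$dir/-rf" "$dir/evil${nl}name"

cd "$dir" || exit 1
seen=0
for f in ./*; do
  # Skip the literal pattern if the directory happens to be empty.
  [ -e "$f" ] || continue
  # "./-rf" cannot be parsed as an option, and the quotes keep the
  # newline inside a single argument.
  wc -c "$f" >/dev/null && seen=$((seen + 1))
done
cd - >/dev/null || exit 1

echo "files handled: $seen"
rm -rf "$dir"
```

Without the `./` prefix, a bare `*` would expand to `-rf` as the first argument, which many commands would read as flags rather than as a file name.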


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
