I like being able to name files and directories with an underscore prefix if it’s something I want to keep separate from other files and directories at the same level. On Windows and Mac, for example, prefixing a file with an underscore sorts it to the top, in front of files starting with an alphanumeric character.
My googling suggests that the different ordering I get from ls on Linux has to do with LC_COLLATE and my current locale (en_US). That’s fine, though I really don’t understand why en_US doesn’t sort as expected.
Based on the ICU Collate demonstration site, setting the locale to en_US_POSIX certainly appears to give the sort order I’m looking for (you have to edit the sample data and add some underscores to test it out). But I don’t really see how to apply this in my Linux shell.
Ideally, I’d like to be able to set up something in my bash config so that ls always sorts underscores first. How would I go about doing this?
Answers:
Method 1
If you don’t mind uppercase and lowercase names being sorted separately, set LC_COLLATE to C, which orders characters by their numeric code. In ASCII, _ falls between the uppercase and the lowercase letters.
$ LC_COLLATE=C ls
BAR FOO _score _under hello world
$ LC_COLLATE=en_US ls
BAR FOO hello _score _under world
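The byte-order claim can be checked without creating any files. This is just a quick illustration, with three one-character sample strings chosen for the purpose, piped through sort in the C locale:

```shell
# Sort three sample strings by byte value (C locale).
# Byte values: Z = 90, _ = 95, a = 97.
printf '%s\n' a _ Z | LC_ALL=C sort
# prints Z, then _, then a, one per line
```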
The locale settings LC_MESSAGES (language of error messages), LC_CTYPE (character sets) and LC_TIME (date and time format) are very useful. LC_COLLATE and LC_NUMERIC are usually more trouble than they’re worth, and I don’t recommend setting them. Proper lexicographic sorting is more complicated than what LC_COLLATE is designed to specify, and setting it can cause all sorts of weird behavior when you use character ranges in regular expressions. LC_NUMERIC is mostly cosmetic, except when something goes horribly wrong because some program produced a number with a decimal separator other than a period (.).
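To get this behavior from a bash config, one possible sketch is a small wrapper function in ~/.bashrc. The name lsc is hypothetical, and the sketch assumes LC_ALL is unset or empty, since a non-empty LC_ALL takes precedence over LC_COLLATE.

```shell
# Hypothetical helper for ~/.bashrc: run ls with C collation only,
# leaving the rest of the session's locale settings alone.
# Assumption: LC_ALL is unset or empty (it would override LC_COLLATE).
lsc() {
    # "command" skips aliases and functions and runs the real ls.
    LC_COLLATE=C command ls "$@"
}
```

In an interactive shell, adding alias ls=lsc after the function would make a plain ls go through the wrapper. Any options you normally pass to ls through an alias (coloring, for example) would need to be added to the function.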
Method 2
If you can’t get ls to sort the way you want, try shell expansion.
You can use file name patterns to run ls with a list of files that the shell already sorted, bypassing the method that ls uses.
ls -lf _* [!_]*
Assuming you have the files
_a a _b b _c c
this is like running
ls -lf _a _b _c a b c
Explanation:
_* is a shell pattern matching any file name beginning with an underscore, expanded in alphabetic order.
[!_]* matches any file name not beginning with an underscore, expanded in alphabetic order.
-f tells ls not to sort, because the shell already did. (With GNU ls, -U also disables sorting, without the other effects of -f such as implying -a.)
More information: bash filename expansion
If there are directories in the current directory, add -d so that ls lists the directories themselves rather than their contents:
ls -lfd _* [!_]*
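The pattern approach can also be wrapped in a function for a shell config. This is a rough sketch rather than part of the original answer: the name lsu is hypothetical, and the existence check is there so that a pattern matching nothing is not handed to ls as a literal _* or [!_]*.

```shell
# Hypothetical helper: list the current directory with underscore-prefixed
# names first. The body runs in a subshell ( ... ), so changes to the
# positional parameters and variables do not leak into the calling shell.
lsu() (
    # Keep any options the caller passed, then mark the end of options.
    set -- "$@" --
    n=0
    for f in _* [!_]*; do
        # A pattern with no match is left as literal text; skip it.
        if [ -e "$f" ] || [ -L "$f" ]; then
            set -- "$@" "$f"
            n=$((n + 1))
        fi
    done
    # -f: keep the order the shell produced; -d: list directories
    # themselves, not their contents.
    if [ "$n" -gt 0 ]; then
        command ls -fd "$@"
    fi
)
```

Calling lsu -l gives the long listing. As with the plain patterns, names starting with a dot are not matched, so hidden files are left out.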
Method 3
Unfortunately, Linux uses glibc for its locale data, not ICU, so there is no way to apply the ICU collation rules directly on Linux without expending a lot of effort, either retrofitting ICU into glibc or supplementing the locale data in glibc.
Method 4
Adding the -f switch (no sorting) made it show that way for me.
man ls
[root@dusknoir ~/java/test]# ls -fl
total 0
-rw-r--r-- 1 root wheel 0 Jun 1 13:27 _1
-rw-r--r-- 1 root wheel 0 Jun 1 13:27 _2
-rw-r--r-- 1 root wheel 0 Jun 1 13:27 _3
-rw-r--r-- 1 root wheel 0 Jun 1 13:27 1
-rw-r--r-- 1 root wheel 0 Jun 1 13:27 2
-rw-r--r-- 1 root wheel 0 Jun 1 13:27 3
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.