‘sort’ produces output in a weird order

Consider the following input to sort:

cat > foo <<EOM
D,,5014978
DD,,25
D,I,1972765530
D,Y,4223624
-,Y,71285059
YA,I,2
EOM

Now try running sort foo.

The output is not sorted when trying this on any of my Linux boxes (GNU coreutils versions 6.9-8.26). I get this instead:

$ sort foo
D,,5014978
DD,,25
D,I,1972765530
D,Y,4223624
-,Y,71285059
YA,I,2

Obviously, all the lines with D, should be together, and - should come before any letters.

The output is sorted when run under Cygwin (GNU coreutils 8.5). Comments?

Answers:

Method 1

Sorting depends on the locale; specifically, it depends on $LC_COLLATE (possibly overridden by $LC_ALL), falling back to $LANG if $LC_COLLATE is unset. The locale command shows which values are actually in effect. See man 3 strcoll, man 3 setlocale, etc.
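To make that precedence concrete, here is a minimal sketch in plain shell. The function name effective_collate is made up for illustration, and the final fallback to C is an assumption: what happens when none of the three variables is set is up to the implementation, though C/POSIX is the usual default.

```shell
# effective_collate is a hypothetical helper, not a standard command.
# It mirrors the lookup order described above:
#   1. LC_ALL, if set and non-empty, wins
#   2. otherwise LC_COLLATE, if set and non-empty
#   3. otherwise LANG, if set and non-empty
#   4. otherwise "C" (assumed default; the real one is implementation-defined)
effective_collate() {
    printf '%s\n' "${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}"
}

echo "effective collation locale: $(effective_collate)"
```

On a system that has the locale command, the LC_COLLATE line of its output should report the same value.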

LC_COLLATE=C (or POSIX or no locale at all) results in a strict byte-by-byte comparison.

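LC_COLLATE=en_US.utf8 results in an alphabetical-equivalence sort, with punctuation ignored and characters within the same equivalence class treated as equal. That explains the output in the question: with the commas and the hyphen ignored, the lines compare as D5014978, DD25, DI1972765530, DY4223624, Y71285059 and YAI2, which is exactly the order sort printed.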
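To get the order the question expected, one option is to force the C locale for the sort command alone. The sketch below recreates the question's input in a temporary file (the temporary file is only there to keep the example self-contained). It sets LC_ALL rather than LC_COLLATE because, as noted above, LC_ALL overrides LC_COLLATE when both are present.

```shell
# Recreate the sample input from the question in a temporary file
tmp=$(mktemp)
cat > "$tmp" <<'EOM'
D,,5014978
DD,,25
D,I,1972765530
D,Y,4223624
-,Y,71285059
YA,I,2
EOM

# Force byte-by-byte comparison for this one command only;
# the rest of the shell session keeps its normal locale
sorted=$(LC_ALL=C sort "$tmp")
printf '%s\n' "$sorted"

rm -f "$tmp"
```

In the C locale the hyphen (0x2D) sorts before every letter and the comma (0x2C) before both, so the - line comes first and all the D, lines stay together ahead of DD,,25.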


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
