For a long time I thought the sort program used ASCII order by default. However, when I feed the following lines to sort without any arguments:
#
@
I got:
@
#
But according to the ASCII table, # is 35 and @ is 64. Another example is:
A
a
And the output is:
a
A
Can anybody explain this? By the way, what is ‘dictionary-order’ when using sort -d?
Answers:
Method 1
Looks like you are using a non-POSIX locale.
Try:
export LC_ALL=C
and then run sort again.
info sort clearly says:
(1) If you use a non-POSIX locale (e.g., by setting `LC_ALL' to
`en_US'), then `sort' may produce output that is sorted differently
than you're accustomed to. In that case, set the `LC_ALL' environment
variable to `C'. Note that setting only `LC_COLLATE' has two problems.
First, it is ineffective if `LC_ALL' is also set. Second, it has
undefined behavior if `LC_CTYPE' (or `LANG', if `LC_CTYPE' is unset) is
set to an incompatible value. For example, you get undefined behavior
if `LC_CTYPE' is `ja_JP.PCK' but `LC_COLLATE' is `en_US.UTF-8'.
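As a minimal sketch of the advice above, the question's input can be sorted twice: once under whatever locale is currently active, and once with LC_ALL=C set for that single command only. The variable names input and c_sorted are made up for this illustration. The first result depends on your system's locale settings, so no output is shown for it.

```shell
# The four lines from the question, one item per line.
input=$(printf '%s\n' '#' '@' 'A' 'a')

# Sorted under the current locale; the order depends on LC_ALL/LC_COLLATE/LANG.
printf '%s\n' "$input" | sort

# Sorted with the C locale applied to this one command: plain byte-value order.
c_sorted=$(printf '%s\n' "$input" | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

In the C locale the second listing is #, @, A, a, which matches the ASCII codes 35, 64, 65, and 97. Prefixing the command with LC_ALL=C avoids changing the locale for the rest of the shell session, unlike export.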
Method 2
As man sort says, “dictionary-order” means
“consider only blanks and alphanumeric characters”.
For example, given the data
The
!quick
brown
@fox
jumps
#over
17
$lazy
  dogs
%42
times.
the unadorned sort command produces
  dogs
!quick
#over
$lazy
%42
@fox
17
brown
jumps
The
times.
(putting the lines that begin with space characters
and the !, #, $, %, and @ symbols [1]
ahead of the lines that begin with letters and numbers;
i.e., alphanumeric characters), but sort -d produces
  dogs
17
%42
brown
@fox
jumps
$lazy
#over
!quick
The
times.
dogs is still first, because it begins with spaces,
but the special (punctuation) characters are ignored.
17 comes before 42, and fox comes between brown and jumps,
in spite of the fact that 42 and fox have characters in front of them
that would normally move them before the 17.
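The effect of -d can be reproduced with a small subset of the data. This is only a sketch: it forces the C locale so that the result does not depend on the local collation rules, and the variable names data, plain, and dict are chosen here for illustration.

```shell
# A subset of the sample data: two entries carry a leading punctuation character.
data=$(printf '%s\n' '%42' '17' '@fox' 'brown' 'jumps')

# Plain byte-order sort: % (045) and @ (0100) sort ahead of the letters.
plain=$(printf '%s\n' "$data" | LC_ALL=C sort)

# Dictionary order: only blanks and alphanumeric characters are compared,
# so %42 is compared as 42 and @fox as fox.
dict=$(printf '%s\n' "$data" | LC_ALL=C sort -d)

printf 'plain: %s\n' "$(printf '%s' "$plain" | tr '\n' ' ')"
printf 'dict:  %s\n' "$(printf '%s' "$dict" | tr '\n' ' ')"
```

The plain sort gives %42 17 @fox brown jumps, while sort -d gives 17 %42 brown @fox jumps. Note that in the C locale uppercase letters sort before lowercase ones, so a line such as The would land in a different position than in the listing above.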
____________
[1] In order of their ASCII values (octal):
space=040, !=041, #=043, $=044, %=045, and @=0100.
Note that (disregarding the space bar)
this is approximately left-to-right order on some keyboards.
Method 3
To determine the sort order, simply create a file with a different character on each line and then sort it. The resulting output will tell you the sort order.
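One way to try this without creating a file by hand is sketched below. It assumes awk is available and only covers the printable ASCII characters (codes 33 to 126). The variable names chars and c_order are illustrative.

```shell
# Generate every printable ASCII character (codes 33-126), one per line.
chars=$(awk 'BEGIN { for (i = 33; i <= 126; i++) printf "%c\n", i }')

# Order under the current locale, joined onto one line for easy reading.
printf '%s\n' "$chars" | sort | tr -d '\n'; echo

# Order under the C locale, for comparison: ascending byte values.
c_order=$(printf '%s\n' "$chars" | LC_ALL=C sort | tr -d '\n')
printf '%s\n' "$c_order"
```

If the two printed lines differ, the current locale is not collating in plain ASCII order.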
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.