Wednesday, May 1, 2013

Locales and Sort Order

I've been annoyed by how Linux handles sort order for a long while, but today it hit my gag reflex: I edited the wrong file because Emacs sorted the directory listing incorrectly. I had switched to the local directory listing, hit down arrow and Enter, and was puzzled when the file looked wrong. The file was fine; it was just the wrong file. Who was telling Emacs to ignore dots when sorting file names? Annoying.

I need an intuitive sort order, and that is the one given by the ASCII collating sequence. Interestingly, some applications were getting it right while others were providing goofy results. What I expect:
  • A dot should sort before letters and digits.
  • Upper case should precede lower case. They should not be intermingled.
  • Digits should be treated as characters, not as numbers. E.g., 10 should appear before 5, since '1' appears before '5' in the collating sequence. The problem with numeric sorting is that someone is trying to predetermine how I name files or interpret file names.
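The three rules above can be checked with a minimal Python sketch. The file names are made up for illustration, and the sketch relies on Python's built-in sorted() comparing strings by code point, which for ASCII-only names is the same as the ASCII collating sequence.

```python
# Hypothetical file names, chosen only to exercise the three rules above.
names = ["notes.txt", "README", "5.txt", ".emacs", "abc", "10.txt", "Makefile"]

# sorted() compares str values by code point. For ASCII-only names that is
# the ASCII collating sequence: '.' < digits < upper case < lower case.
ascii_order = sorted(names)

for name in ascii_order:
    print(name)
```

The dot file comes first, 10.txt lands ahead of 5.txt, and the upper-case names are grouped ahead of the lower-case ones.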
The fix is easy. My /etc/default/locale had just one line, the LANG setting. I added the second line below, logged out, and logged back in:

LANG="en_US.UTF-8"
LC_COLLATE="C"
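To see which collation a program would actually pick up after logging back in, something like the following Python sketch may help. It is only an illustration under a few assumptions: the helper name sort_with_collation and the sample file names are hypothetical, and it uses the standard locale module (setlocale and strxfrm) to sort once with the "C" collation and once with whatever the environment provides.

```python
import locale

# Hypothetical file names, used only for illustration.
names = ["notes.txt", "README", "5.txt", ".emacs", "abc", "10.txt", "Makefile"]


def sort_with_collation(names, locale_name):
    """Sort names by the LC_COLLATE rules of locale_name ("" means: use the environment)."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each string into a key that compares per LC_COLLATE.
        return sorted(names, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore the old setting


if __name__ == "__main__":
    for locale_name in ("C", ""):
        try:
            print(repr(locale_name), sort_with_collation(names, locale_name))
        except locale.Error:
            # The locale named in the environment is not installed on this system.
            print(repr(locale_name), "is not available here")
```

If the two printed lists agree, the environment is already collating in "C" order. If they differ, some other LC_COLLATE (or LC_ALL, which overrides it) is still in effect.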


[  Added 2013-05-02:

  I've got a 64-bit Lubuntu 12.10/MATE 1.6.0 box that seems to sort files okay with this setting:

  LC_COLLATE="en_US.UTF-8"

  However, Caja 1.6.1 still doesn't sort file names correctly.

]
