UNIX Power Tools

Chapter 36: Sorting
 

36.5 Alphabetic and Numeric Sorting

sort performs two fundamentally different kinds of sorting operations: alphabetic sorts and numeric sorts. An alphabetic sort is performed according to the traditional "dictionary order," using the ASCII (51.3) collating sequence. Uppercase letters come before lowercase letters (unless you specify the -f option, which "folds" uppercase and lowercase together), with numerals and punctuation interspersed.
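The effect of -f is easiest to see on a tiny list. What follows is a minimal sketch, not anything from the original text: the three words and the variable names are made up for illustration, and LC_ALL=C is set on the assumption that your sort honors locale settings and would otherwise use a different collating sequence.

```shell
# Three sample words; only one starts with an uppercase letter.
words='cherry
Banana
apple'

# LC_ALL=C asks for plain ASCII byte order, whatever the user's locale is.
plain=$(printf '%s\n' "$words" | LC_ALL=C sort | paste -s -d ' ' -)
folded=$(printf '%s\n' "$words" | LC_ALL=C sort -f | paste -s -d ' ' -)

echo "plain:  $plain"     # Banana apple cherry
echo "folded: $folded"    # apple Banana cherry
```

In the plain sort, Banana comes first only because uppercase B precedes every lowercase letter in ASCII; with -f the case difference is ignored and the words fall into the order a dictionary would use.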

This is all fairly trivial and common sense. However, it's worth belaboring the difference, because it's a frequent source of bugs in shell scripts. Say you sort the numbers 1 through 12. A numeric sort gives you these numbers "in order," just like you'd expect. An alphabetic sort gives you:

1
10
11
12
2
...

Of course, this is how you'd sort the numbers if you applied dictionary rules to the list. Numeric sorts can handle decimal numbers (for example, 123.44565778); they can't handle floating-point numbers written in exponential notation (for example, 1.2344565778E+02).
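Both kinds of sort can be tried on the numbers 1 through 12 directly. This is a small sketch under a few assumptions: the variable names are invented for the example, paste is used only to print each result on one line, and LC_ALL=C is set so that the alphabetic sort uses plain ASCII order.

```shell
# The numbers 1 through 12, one per line.
nums=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12)

# LC_ALL=C selects plain ASCII byte order, whatever the user's locale is.
alpha=$(printf '%s\n' "$nums" | LC_ALL=C sort | paste -s -d ' ' -)
numeric=$(printf '%s\n' "$nums" | LC_ALL=C sort -n | paste -s -d ' ' -)

echo "alphabetic: $alpha"     # 1 10 11 12 2 3 4 5 6 7 8 9
echo "numeric:    $numeric"   # 1 2 3 4 5 6 7 8 9 10 11 12

# A value in exponential notation mixed in with plain numbers.
# No expected output is shown here; see the note that follows.
printf '%s\n' 20 1.5E+02 3 | LC_ALL=C sort -n
```

For the last command, a sort that reads only the leading 1.5 and stops at the E will print 1.5E+02 first, rather than last where 150 belongs. GNU sort has a separate -g option intended for this notation, but that is an extension and you shouldn't count on finding it everywhere.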

What happens if you include alphabetic characters in a numeric sort? Although the results are predictable, I would prefer to say that they're "undefined." Including alphabetic characters in a numeric sort is a mistake, and there's no guarantee that different versions of sort will handle them the same way. As far as I know, there is no provision for sorting hexadecimal numbers.
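If you do need hexadecimal values in order, one possible workaround is to put a decimal copy of each value in front of it, sort numerically on that copy, and then strip it off again. The sketch below is only that, a sketch: sort_hex is a hypothetical helper name, and it assumes one hexadecimal number per line with no 0x prefix, no blank lines, and a printf that accepts 0x-prefixed arguments for %d.

```shell
# Hypothetical helper: read hex numbers (no 0x prefix) on standard input,
# one per line, and write them back out in ascending numeric order.
sort_hex() {
    while IFS= read -r hex; do
        # printf converts the 0x-prefixed argument to decimal; the original
        # text is kept as a second field so it can be recovered afterward.
        printf '%d %s\n' "0x$hex" "$hex"
    done |
    LC_ALL=C sort -n -k 1,1 |   # numeric sort on the decimal copy
    cut -d ' ' -f 2             # throw the decimal copy away
}

hex_sorted=$(printf '%s\n' ff 1A 9 100 | sort_hex | paste -s -d ' ' -)
echo "$hex_sorted"    # 9 1A ff 100
```

The sort itself never sees a hexadecimal digit; it is given ordinary decimal numbers, which is the one thing the numeric sort handles reliably.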

One final note: Under System V, the numeric sort treats initial blanks as significant, so numbers with additional spaces before them will be sorted ahead of numbers without the additional spaces. This is an incredibly stupid misfeature. There is a workaround: use the -b option (ignore leading blanks) and always specify a sort field. [2] That is, sort -nb +0 will do what you expect; sort -n won't.

[2] Stupid misfeature number 2: -b doesn't work unless you specify a sort field explicitly, with a +n option.
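Here is the workaround applied to a few padded numbers. It is a sketch with one substitution you should know about: it uses -k 1, the newer POSIX spelling of +0, on the assumption that your sort accepts -k. The sample values and variable names are invented for the example.

```shell
# Sample data: numbers with varying amounts of leading space.
padded=$(printf '%s\n' '  10' '9' ' 100')

# -b ignores leading blanks, -n compares numerically, and -k 1 names the
# first field explicitly (the newer spelling of +0).
sorted=$(printf '%s\n' "$padded" | LC_ALL=C sort -nb -k 1)

# Prints 9, then 10, then 100, each with its original leading spaces.
printf '%s\n' "$sorted"
```

POSIX specifies that -n skips leading blanks by itself, so on a sort that follows it, plain sort -n may well give the same order. Spelling out -b and the sort field is the defensive form for systems that behave the way System V did.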

- ML

