The environment variable `LC_COLLATE` is part of the POSIX standard. It
controls the locale-specific sort order for tools like `sort`. It is
documented in `man 5 locale` and `man 7 locale`.
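`LC_COLLATE` is not the only variable that can decide the collation order: `LC_ALL` overrides it when set, and `LANG` is the fallback when neither is set. The lookup order can be sketched as below. The helper name `effective_collate` is made up for illustration, and falling back to `C` when nothing is set is an assumption about the system default.

```bash
# Hypothetical helper: print the value that decides the collation order.
# Precedence: LC_ALL first, then LC_COLLATE, then LANG.
# An empty value counts as unset; "C" as last resort is an assumption.
effective_collate() {
  printf '%s\n' "${LC_ALL:-${LC_COLLATE:-${LANG:-C}}}"
}
```

Calling `effective_collate` in a shell shows which setting currently wins; running `locale` without arguments prints the effective value of every category.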
The following bash script can be used to create the table below. It covers the
printable ASCII characters, codes 32 to 126. That will most likely include every
character you have ever used in a file name, and it lets you predict the order
in which names appear if you prefix file and folder names, e.g. with `~`:
paste \
<(echo -e "|num\n|-" ; printf "|%s\n" $(seq 32 126)) \
<(echo -e "|C\n|-" ; for i in $(seq 32 126); do printf "| \\$(printf %o $i)\n"; done | LC_COLLATE="C" sort) \
<(echo -e "|de_DE\n|-" ; for i in $(seq 32 126); do printf "| \\$(printf %o $i)\n"; done | LC_COLLATE="de_DE" sort) \
<(echo -e "|de_DE.UTF-8\n|-" ; for i in $(seq 32 126); do printf "| \\$(printf %o $i)\n"; done | LC_COLLATE="de_DE.UTF-8" sort)
num | C | de_DE | de_DE.UTF-8 |
---|---|---|---|
32 | (space) | (space) | (space) |
33 | ! | ! | ! |
34 | " | " | " |
35 | # | # | # |
36 | $ | $ | % |
37 | % | % | & |
38 | & | & | ' |
39 | ' | ' | ( |
40 | ( | ( | ) |
41 | ) | ) | * |
42 | * | * | + |
43 | + | + | , |
44 | , | , | - |
45 | - | - | . |
46 | . | . | / |
47 | / | / | : |
48 | 0 | 0 | ; |
49 | 1 | 1 | < |
50 | 2 | 2 | = |
51 | 3 | 3 | > |
52 | 4 | 4 | ? |
53 | 5 | 5 | @ |
54 | 6 | 6 | [ |
55 | 7 | 7 | \ |
56 | 8 | 8 | ] |
57 | 9 | 9 | ^ |
58 | : | : | _ |
59 | ; | ; | ` |
60 | < | < | { |
61 | = | = | \| |
62 | > | > | } |
63 | ? | ? | ~ |
64 | @ | @ | $ |
65 | A | A | 0 |
66 | B | B | 1 |
67 | C | C | 2 |
68 | D | D | 3 |
69 | E | E | 4 |
70 | F | F | 5 |
71 | G | G | 6 |
72 | H | H | 7 |
73 | I | I | 8 |
74 | J | J | 9 |
75 | K | K | a |
76 | L | L | A |
77 | M | M | b |
78 | N | N | B |
79 | O | O | c |
80 | P | P | C |
81 | Q | Q | d |
82 | R | R | D |
83 | S | S | e |
84 | T | T | E |
85 | U | U | f |
86 | V | V | F |
87 | W | W | g |
88 | X | X | G |
89 | Y | Y | h |
90 | Z | Z | H |
91 | [ | [ | i |
92 | \ | \ | I |
93 | ] | ] | j |
94 | ^ | ^ | J |
95 | _ | _ | k |
96 | ` | ` | K |
97 | a | a | l |
98 | b | b | L |
99 | c | c | m |
100 | d | d | M |
101 | e | e | n |
102 | f | f | N |
103 | g | g | o |
104 | h | h | O |
105 | i | i | p |
106 | j | j | P |
107 | k | k | q |
108 | l | l | Q |
109 | m | m | r |
110 | n | n | R |
111 | o | o | s |
112 | p | p | S |
113 | q | q | t |
114 | r | r | T |
115 | s | s | u |
116 | t | t | U |
117 | u | u | v |
118 | v | v | V |
119 | w | w | w |
120 | x | x | W |
121 | y | y | x |
122 | z | z | X |
123 | { | { | y |
124 | \| | \| | Y |
125 | } | } | z |
126 | ~ | ~ | Z |
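In the `C` locale the sort order is simply the byte value, so the row of a character in the `num` column can be looked up directly. A small sketch, assuming a POSIX-style `printf`; the helper name `char_code` is invented here:

```bash
# Print the numeric code of a single ASCII character.
# A leading quote makes printf output the value of the character that follows.
char_code() {
  printf '%d\n' "'$1"
}

char_code '_'   # 95
char_code '~'   # 126
```

A lower number means the character sorts earlier under `LC_COLLATE=C`.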
System-wide configuration
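With `LC_COLLATE=C`, an underscore as the first character of a folder name
makes that folder appear before all folders starting with a lowercase letter.
Names starting with a digit or an uppercase letter still sort first: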
# /etc/locale.conf
LC_COLLATE=C
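The effect can be tried out without touching the system configuration. The sketch below creates a few sample folders in a temporary directory and lists them in `C` order. The function name `list_c_order` and the folder names are made up, and `LC_ALL=C` is used for the single command so that an existing `LC_ALL` setting cannot override it.

```bash
# Hypothetical demo: list sample folders in C (byte value) order.
list_c_order() {
  demo_dir=$(mktemp -d) || return 1
  mkdir "$demo_dir/_archive" "$demo_dir/docs" "$demo_dir/Zeta" "$demo_dir/1first"
  LC_ALL=C ls "$demo_dir"
  rm -r "$demo_dir"
}

list_c_order
# 1first
# Zeta
# _archive
# docs
```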
Listing all interesting UTF-8 symbols
The Unicode standard contains a block for user-defined characters, the "Private Use Area". Nerd Fonts, for example, uses it to store its icon glyphs.
for i in `seq 57344 63743`; do echo -ne "\u$(printf '%x\n' $i) "; done
where 57344 is `E000` and 63743 is `F8FF` in hexadecimal, the first and last
code points of the Private Use Area (U+E000 to U+F8FF).
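The conversion between the decimal loop bounds and the usual hexadecimal code point notation can be checked with `printf`. This is a minimal sketch; the helper names `to_codepoint` and `to_decimal` are invented for illustration.

```bash
# Convert between decimal numbers and hexadecimal code point notation.
to_codepoint() { printf 'U+%04X\n' "$1"; }   # decimal    -> U+XXXX
to_decimal()   { printf '%d\n' "0x$1"; }     # hex digits -> decimal

to_codepoint 57344               # U+E000
to_decimal F8FF                  # 63743
echo $(( 0xF8FF - 0xE000 + 1 ))  # 6400 code points in the block
```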