The environment variable LC_COLLATE is part of the POSIX standard. It
controls the locale-specific sort order used by tools like sort. It is
documented in man 5 locale and man 7 locale.

The following bash script generates the table below. It covers the printable
ASCII characters, codes 32 to 126, which most likely include every character
you have ever used in a file name. The table lets you predict where a file or
folder will sort if you prefix its name with a special character, e.g. with
~:
# chars prints ASCII 32..126, one table cell per line
chars() { for i in $(seq 32 126); do printf "| \\$(printf %o "$i")\n"; done; }
paste \
<(echo -e "|num\n|-" ; printf "|%s\n" $(seq 32 126)) \
<(echo -e "|C\n|-" ; chars | LC_COLLATE="C" sort) \
<(echo -e "|de_DE\n|-" ; chars | LC_COLLATE="de_DE" sort) \
<(echo -e "|de_DE.UTF-8\n|-" ; chars | LC_COLLATE="de_DE.UTF-8" sort)
| num | C | de_DE | de_DE.UTF-8 |
|---|---|---|---|
| 32 | (space) | (space) | (space) |
| 33 | ! | ! | ! |
| 34 | " | " | " |
| 35 | # | # | # |
| 36 | $ | $ | % |
| 37 | % | % | & |
| 38 | & | & | ' |
| 39 | ' | ' | ( |
| 40 | ( | ( | ) |
| 41 | ) | ) | * |
| 42 | * | * | + |
| 43 | + | + | , |
| 44 | , | , | - |
| 45 | - | - | . |
| 46 | . | . | / |
| 47 | / | / | : |
| 48 | 0 | 0 | ; |
| 49 | 1 | 1 | < |
| 50 | 2 | 2 | = |
| 51 | 3 | 3 | > |
| 52 | 4 | 4 | ? |
| 53 | 5 | 5 | @ |
| 54 | 6 | 6 | [ |
| 55 | 7 | 7 | \ |
| 56 | 8 | 8 | ] |
| 57 | 9 | 9 | ^ |
| 58 | : | : | _ |
| 59 | ; | ; | ` |
| 60 | < | < | { |
| 61 | = | = | \| |
| 62 | > | > | } |
| 63 | ? | ? | ~ |
| 64 | @ | @ | $ |
| 65 | A | A | 0 |
| 66 | B | B | 1 |
| 67 | C | C | 2 |
| 68 | D | D | 3 |
| 69 | E | E | 4 |
| 70 | F | F | 5 |
| 71 | G | G | 6 |
| 72 | H | H | 7 |
| 73 | I | I | 8 |
| 74 | J | J | 9 |
| 75 | K | K | a |
| 76 | L | L | A |
| 77 | M | M | b |
| 78 | N | N | B |
| 79 | O | O | c |
| 80 | P | P | C |
| 81 | Q | Q | d |
| 82 | R | R | D |
| 83 | S | S | e |
| 84 | T | T | E |
| 85 | U | U | f |
| 86 | V | V | F |
| 87 | W | W | g |
| 88 | X | X | G |
| 89 | Y | Y | h |
| 90 | Z | Z | H |
| 91 | [ | [ | i |
| 92 | \ | \ | I |
| 93 | ] | ] | j |
| 94 | ^ | ^ | J |
| 95 | _ | _ | k |
| 96 | ` | ` | K |
| 97 | a | a | l |
| 98 | b | b | L |
| 99 | c | c | m |
| 100 | d | d | M |
| 101 | e | e | n |
| 102 | f | f | N |
| 103 | g | g | o |
| 104 | h | h | O |
| 105 | i | i | p |
| 106 | j | j | P |
| 107 | k | k | q |
| 108 | l | l | Q |
| 109 | m | m | r |
| 110 | n | n | R |
| 111 | o | o | s |
| 112 | p | p | S |
| 113 | q | q | t |
| 114 | r | r | T |
| 115 | s | s | u |
| 116 | t | t | U |
| 117 | u | u | v |
| 118 | v | v | V |
| 119 | w | w | w |
| 120 | x | x | W |
| 121 | y | y | x |
| 122 | z | z | X |
| 123 | { | { | y |
| 124 | \| | \| | Y |
| 125 | } | } | z |
| 126 | ~ | ~ | Z |
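
To check a prediction from the table before renaming anything, a few candidate names can be piped through sort under a chosen locale. The following is a minimal sketch: sort_with is a hypothetical helper name, not part of any tool. It clears LC_ALL for the call, because a non-empty LC_ALL takes precedence over LC_COLLATE and would otherwise hide the effect.

```bash
# Hypothetical helper: sort stdin using the collation locale given as $1.
# An exported LC_ALL would override LC_COLLATE, so it is cleared for this call.
sort_with() {
  LC_ALL= LC_COLLATE="$1" sort
}

printf '%s\n' apple '~notes' Zebra _archive | sort_with C
# prints Zebra, _archive, apple, ~notes (one per line)
```

Passing de_DE.UTF-8 instead of C should follow the last column of the table, provided that locale is installed (locale -a lists the available ones).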
System-wide configuration
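With LC_COLLATE=C, an underscore as the first character of a folder name makes
it sort before every folder whose name starts with a lowercase letter (digits
and uppercase letters still come first, see the C column above):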
# /etc/locale.conf
LC_COLLATE=C
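
The effect can be tried out without touching the system configuration, since ls honours LC_COLLATE as well. This is only a sketch under assumptions: list_c is a hypothetical helper name, the folder names are made up, and mktemp -d is assumed to be available.

```bash
# Hypothetical helper: list a directory in the order LC_COLLATE=C produces.
# LC_ALL is cleared so that it cannot override LC_COLLATE.
list_c() {
  LC_ALL= LC_COLLATE=C ls "$1"
}

demo=$(mktemp -d)                      # throwaway directory for the demo
mkdir "$demo/src" "$demo/_inbox" "$demo/Documents"
list_c "$demo"
# prints Documents, _inbox, src (one per line)
rm -r "$demo"
```

The underscore folder lands ahead of src but behind Documents, which matches the byte order of the C column.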
Listing all interesting UTF-8 symbols
Unicode reserves a block for user-defined characters, the "private use area". Nerd Fonts, for example, uses it to store its icons.
for i in `seq 57344 63743`; do echo -ne "\u$(printf '%x\n' $i) "; done

Here 57344 is 0xE000 and 63743 is 0xF8FF, the first and last code points of that block.
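
When hunting for a particular symbol it helps to see the code point next to each glyph. The sketch below is one way to do that. It reuses the octal-escape trick from the table script and builds the UTF-8 bytes by hand instead of relying on \u, so it does not depend on the shell's locale. pua_list is a hypothetical name, and the function assumes the range stays within U+0800 to U+FFFF, where UTF-8 always uses three bytes.

```bash
# Print code points $1..$2 (decimal, inclusive) as "U+XXXX glyph" lines.
# Assumption: the range lies in U+0800..U+FFFF (three-byte UTF-8 sequences).
pua_list() {
  i=$1
  while [ "$i" -le "$2" ]; do
    b1=$(( 0xE0 | (i >> 12) ))           # lead byte: 1110xxxx
    b2=$(( 0x80 | ((i >> 6) & 0x3F) ))   # continuation byte: 10xxxxxx
    b3=$(( 0x80 | (i & 0x3F) ))          # continuation byte: 10xxxxxx
    printf 'U+%04X ' "$i"
    printf "\\$(printf '%o' "$b1")\\$(printf '%o' "$b2")\\$(printf '%o' "$b3")\n"
    i=$(( i + 1 ))
  done
}

pua_list 57344 57359   # first 16 code points of the private use area
```

Whether a glyph or a placeholder box shows up depends on the terminal font. Piping the full range through a pager, e.g. pua_list 57344 63743 | less, keeps the output manageable.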