
rI18N or The Real Internationalization Project

By Anonymous

My article "Keymap Blues in Ubuntu's Text Console" in LG#157 left a poster in LG#158 a bit annoyed.

He says that I didn't do this or didn't do that. And he is right: I did not.

Specifically, I don't feel capable of proposing "[...] a good consistent solution to all the woes of the Linux console." Please address that challenge to Linus Torvalds.

I am, however, willing to take up a smaller challenge posited by that poster: namely "[...] a sample keymap which is 'sized down' and [fits] the author's needs".

Fine - let's go for it. As noted before, we are discussing the text console, no X involved.

1 What has to be included?

The text console keymap covers, inter alia, the self-insertion keys that we need to enter text. These keys vary a lot from country to country, so I'm going to leave them out. I'm not even going to try defining them for the US default keymap. The real concern, when considering text mode applications, is the set of 'functional keys'.

2 What are functional keys?

This is a term I'm using for lack of anything better. Alternative suggestions are welcome.

Functional keys are defined here by enumeration. The names for the keys come from the physical keyboard I'm typing this article on. They are quite common, actually:

    F1  F2  F3  F4  F5  F6  F7  F8  F9  F10  F11  F12

    Tab Backspace PrintScreen/Sys Rq Pause/Break
    Insert  Home  PageUp Delete  End   PageDown

                       Up
                Left        Right
                      Down

3 Is anything missing?

You could argue that other keys should also be in the set 'functional keys'. For instance, Escape or Enter, or the modifier keys Shift, Ctrl, Alt. The reason they are not in the set is that they are not troublesome. I have checked the default keymaps offered by the kbd project for US, Germany, France, Italy, Spain, and Russia, and I would say these extra keys are safe. They are already consistent, and the differences are practically irrelevant.

4 Terminology

Again, a note on terminology: keymap normally refers to a file where the key assignments are defined. The assignments can refer to plain keys, but they can also refer to modified keys, e.g. <ctrl><left>. In the keymap (the file!), a table of assignments for given modifiers is also called a keymap, so we get a keymap for <ctrl>, a keymap for <alt> and so on.

Additionally, you'll need to keep in mind the difference between key names and assigned keymap variables. Examples:

    variable 'Delete'       is distinct from        the key Delete
    variable 'BackSpace'    is distinct from        the key Backspace
    variable 'F14'          does not need           a physical key F14

5 The approach via multiple strings

What you see, especially in Ubuntu (implying Debian, although I have not checked), is that the modifier keymaps rely on multiple strings. Examples:

    Keys                          Assignments
    <f4>                          F4
    <shift><f4>                   F14
    <ctrl><f4>                    F24
    <shift><ctrl><f4>             F34
    <altgr><f4>                   F44
    <shift><altgr><f4>            F54
    <altgr><ctrl><f4>             F64
    <shift><altgr><ctrl><f4>      F74

Variables F4 to F74 would deliver strings to the application expecting keyboard input and the application could then take action. The funny thing is that Ubuntu only has strings for F4 and F14, while F24 to F74 are empty, and no action can be taken on receiving the empty string.
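The numbering in that table follows a simple pattern: each further modifier combination adds 10 to the number of the variable. The sketch below only reproduces the table for F4. The names COMBOS and string_variable are made up for this illustration, and whether other keys follow the same arithmetic in a given keymap is an assumption.

```python
# Modifier combinations in the order in which the table lists them.
COMBOS = [
    (),
    ("shift",),
    ("ctrl",),
    ("shift", "ctrl"),
    ("altgr",),
    ("shift", "altgr"),
    ("altgr", "ctrl"),
    ("shift", "altgr", "ctrl"),
]


def string_variable(fkey, combo):
    """Return the variable name the table assigns to <combo><fkey>."""
    step = COMBOS.index(tuple(combo))  # position in the table, 0 = plain key
    return "F%d" % (fkey + 10 * step)


# Print the table again, this time computed instead of typed in.
for combo in COMBOS:
    keys = "".join("<%s>" % m for m in combo) + "<f4>"
    print("%-28s %s" % (keys, string_variable(4, combo)))
```

Every line of that output needs its own string in the keymap before an application can tell the combinations apart.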

This is, however, not the point here. The point is: is it a good idea to define all those keys via strings?

All those variables up to F256 are inherited from Unix. They were meant to make the keyboard flexible - i.e. customizable - on a case-by-case basis without assuming consensus. Unix and consensus don't mix. Everybody was welcome to do with those variables whatever they wanted, and there is old software that relies on such flexibility: define F74 in the keymap, and you are bound to step on somebody's toes.

6 The approach via modifier status

There is a way to recognize, for example, <ctrl><f4> even if it has no unique string attached to it. It must have a string, of course - otherwise it would be ignored when the keyboard is in translation mode (either ASCII or UTF-8), which is the normal case. The approach relies on just reading the status of the modifiers - pressed down or not. All the modified keys get the same string as the plain key and then you find out about the modifier status. Example:

    Keys                          Assignments
    <f4>                          F4
    <shift><f4>                   F4
    <ctrl><f4>                    F4
    <shift><ctrl><f4>             F4
    <altgr><f4>                   F4
    <shift><altgr><f4>            F4
    <altgr><ctrl><f4>             F4
    <shift><altgr><ctrl><f4>      F4

You want to know if <ctrl><f4> was received? Check the input for the F4 string, then read the status of <ctrl>. If <ctrl> is pressed you got <ctrl><f4>; if not, you got <f4>.
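A minimal sketch of that check, in Python, might look as follows. It assumes the Linux-only TIOCLINUX ioctl with subcode 6, which reports the kernel's modifier state, and it assumes that F4 sends the default Linux console string ESC [ [ D. The function names are invented for this example, and read_modifiers() only works when the file descriptor really is a Linux text console.

```python
import fcntl
import termios

# Request number of the Linux-only TIOCLINUX ioctl; the fallback is the value
# from the kernel headers, used if this Python build does not export the name.
TIOCLINUX = getattr(termios, "TIOCLINUX", 0x541C)
GET_SHIFT_STATE = 6  # TIOCLINUX subcode that reports the modifier state

# Bits of the kernel's modifier state byte.
MODIFIER_BITS = {"shift": 1, "altgr": 2, "ctrl": 4, "alt": 8}

# Assumption: the string the default Linux console keymap sends for F4.
F4_STRING = "\x1b[[D"


def decode_modifiers(state):
    """Turn the modifier state byte into a set of modifier names."""
    return {name for name, bit in MODIFIER_BITS.items() if state & bit}


def read_modifiers(fd=0):
    """Ask the console behind fd which modifiers are pressed right now.

    Raises OSError when fd is not a Linux text console.
    """
    buf = bytearray([GET_SHIFT_STATE])
    fcntl.ioctl(fd, TIOCLINUX, buf, True)  # the kernel overwrites buf[0]
    return decode_modifiers(buf[0])


def identify(received, modifiers):
    """Name the key combination, given the input string and the modifiers."""
    if received != F4_STRING:
        return None
    order = ("shift", "altgr", "ctrl", "alt")
    return "".join("<%s>" % m for m in order if m in modifiers) + "<f4>"
```

An application would read its input as usual and, on seeing the F4 string, call identify(received, read_modifiers()) right away, so that the two reads sit next to each other.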

Nice, isn't it? Not among the Unixsaurs. You see, reading the modifier status is a Linux specialty. Even the Linux manpage for ioctl_codes, where the trick is explained, warns strongly against using these ioctls and recommends POSIX functions instead. The catch is that there are no such POSIX functions - so you either use the Linux IOCTLs or you're out of luck.

Ah, I hear, but that's not platform neutral. So what? Go through the source code of any text console editor and count the pre-processor directives that are there to accommodate peculiarities of Unix variants 1-999. There are also pre-processor directives to accommodate Linux, modifier status and all. If Midnight Commander can do it, why not others?

There are text console editors that use the modifiers for their Windows version but not for their Linux version. Why not? Because Windows delivers the key and the modifier at once, while Linux needs distinct commands, one to read the key, one to read the modifier. Therefore there is a slight time difference between the results - and theoretically, a risk of incurring an error. A lame excuse: when the two commands are next to each other in the source code, that error will never materialize. We are talking about microseconds.

My choice is to use plain keys everywhere in the set of functional keys whatever the modifiers may be, except the <ctrl><alt> combo which will be reserved for system operations like switching consoles.
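To make that choice concrete, here is a small sketch that prints keymap lines for one key. It is an illustration only, not the keymap proposed in this article: it assumes that F4 has keycode 62 (as on a standard PC keyboard), that Console_4 is the symbol wanted for the <ctrl><alt> combo, and it covers just a handful of modifier combinations. The names COMBOS and keymap_lines are invented for the example.

```python
F4_KEYCODE = 62  # assumption: F4 on a standard PC keyboard, as shown by showkey

# A few modifier prefixes in the notation used by keymap files.
COMBOS = ["plain", "shift", "control", "shift control", "alt", "control alt"]


def keymap_lines(keycode, plain_symbol, system_symbol):
    """Give every combination the plain symbol, except <ctrl><alt>."""
    lines = []
    for combo in COMBOS:
        symbol = system_symbol if combo == "control alt" else plain_symbol
        lines.append("%s keycode %d = %s" % (combo, keycode, symbol))
    return lines


for line in keymap_lines(F4_KEYCODE, "F4", "Console_4"):
    print(line)
```

Only the last line differs from the rest: everything except <ctrl><alt> delivers the plain F4 string, and the application finds out about the modifiers by itself.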

7 Which modifiers do we reasonably need?

How many keymaps do you need in the keymap? (If you're confused, please review the terminology warning above.) Ubuntu has 64 keymaps in the keymap, a mighty overkill. Fedora and OpenSUSE are a lot more reasonable. I'll stick close to their version:

    plain               0
    <shift>             1
    <altgr>             2
    <ctrl>              4
    <shift><ctrl>       5
    <altgr><ctrl>       6
    <alt>               8
    <ctrl><alt>        12

This choice gives the entry "keymaps 0-2,4-6,8,12" in the keymap (the file), with a total of 8 keymaps (the assignment tables). As already mentioned, Ubuntu has keymaps 0-63.
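The keymap numbers are nothing more than sums of modifier weights: 1 for <shift>, 2 for <altgr>, 4 for <ctrl> and 8 for <alt>. The following sketch derives the numbers for the eight chosen combinations and folds them into the range notation of the keymaps entry. The helper names are hypothetical.

```python
WEIGHTS = {"shift": 1, "altgr": 2, "ctrl": 4, "alt": 8}

# The eight modifier combinations chosen above.
CHOSEN = [
    (),
    ("shift",),
    ("altgr",),
    ("ctrl",),
    ("shift", "ctrl"),
    ("altgr", "ctrl"),
    ("alt",),
    ("ctrl", "alt"),
]


def keymap_number(combo):
    """Add up the weights of the modifiers in one combination."""
    return sum(WEIGHTS[m] for m in combo)


def compress(numbers):
    """Write a set of numbers as comma-separated ranges, e.g. 0-2,4."""
    numbers = sorted(numbers)
    ranges = []
    start = prev = numbers[0]
    for n in numbers[1:]:
        if n != prev + 1:          # gap found: close the current range
            ranges.append((start, prev))
            start = n
        prev = n
    ranges.append((start, prev))
    return ",".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in ranges)


print("keymaps " + compress(keymap_number(c) for c in CHOSEN))
```

Adding a ninth combination later only means adding one tuple to CHOSEN; the entry is recomputed from the weights.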

Note that defining 8 keymaps does not preclude defining more. But those 8 keymaps should be defined as we dare to propose here.

Note also to the users of the US keyboard: <altgr> is nothing more than the Alt key on the right side, which must be kept distinct since it plays a role on non-US keyboards.

8 Control characters 28-31

The characters 28-31, which are control codes, are desperately difficult to find on non-US keyboards. All the mnemonics implied by their name get lost. Besides, they also get shifted and are awkward to generate.

These are Control_backslash, Control_bracketright, Control_asciicircum, Control_underscore. A language and keyboard neutral solution could be as follows:

    Name                    Code        Assignment
    Control_backslash       char. 28    <ctrl><8> on numeric keypad
    Control_bracketright    char. 29    <ctrl><9> on numeric keypad
    Control_asciicircum     char. 30    <ctrl><0> on numeric keypad
    Control_underscore      char. 31    <ctrl><1> on numeric keypad
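The codes themselves come from the usual ASCII rule that Ctrl keeps only the lowest five bits of a character. The sketch below checks that for the four characters involved and pairs each code with the keypad digit proposed above, which is simply the last digit of the code. The names control_code and keypad_digit are made up for the example.

```python
def control_code(char):
    """ASCII code produced by Ctrl plus the given character."""
    return ord(char) & 0x1F


def keypad_digit(code):
    """Keypad digit proposed for a control code: its last decimal digit."""
    return code % 10


for char in "\\]^_":
    code = control_code(char)
    print("Ctrl+%s = char. %d, proposed as <ctrl><%d> on numeric keypad"
          % (char, code, keypad_digit(code)))
```

The last digit is the whole mnemonic: 8 and 9 for codes 28 and 29, 0 and 1 for codes 30 and 31.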

9 Immediate and likely effects

The immediate effects of the proposed partial keymap for functional keys concern system operations:

This would seem to conflict with DOSEMU - but it doesn't, because DOSEMU uses raw keyboard mode.

The non-immediate effects depend on text mode applications following Midnight Commander's example and using the Linux ioctls to read the modifier status.

If it spreads, then it would be normal to move to the start of a buffer with <ctrl><home>, while <home> takes you to the start of the line. To move to the next word, <ctrl><right> would be available. And you could highlight a selection by holding down <shift> and moving the cursor. Last but not least, a large number of keybindings based on F1-F12 would become available, and they would be language and country independent!

To anybody who only has experience with the US keyboard running the US default keymap, please try Nano on a Spanish or French keyboard. When you are done, please come back and agree with me that this little partial keymap should be called rI18N or the Real Internationalization Project.

10 The goodies

So, after all those clarifications, here is the partial keymap for the functional keys.




A. N. Onymous has been writing for LG since the early days - generally by sneaking in at night and leaving a variety of articles on the Editor's desk. A man (woman?) of mystery, claiming no credit and hiding in darkness... probably something to do with large amounts of treasure in an ancient Mayan temple and a beautiful dark-eyed woman with a snake tattoo winding down from her left hip. Or maybe he just treasures his privacy. In any case, we're grateful for his contributions.
-- Editor, Linux Gazette

Copyright © 2009, Anonymous. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 159 of Linux Gazette, February 2009
