Are the pairings of ASCII control characters and key sequences part of a standard?

There are common pairings of key sequences to ASCII control characters, such as Ctrl-C and Ctrl-Z to ETX and SUB, respectively.

The Wikipedia Control Codes page lists most of the pairings, but cites no reference for them.

Are the control character and key sequence pairings part of a standard?

Where are they listed for Linux and other OSes?

Are there man pages listing these pairings?

Are they purely decades of unwritten convention?

References

The Linux man page for termios(3) lists some of them.
The command stty -a lists some for your system.

Answers:


Method 1

The pairings follow the Latin alphabet: Ctrl-A through Ctrl-Z give byte values 1 through 26 (and the neighbouring ASCII characters cover the rest).

Ctrl-C gives ETX, byte value 3 (0x03, 00000011); C is ASCII 67 (0x43, 01000011). Flip bit 7, the bit worth 64 (that is, add or subtract 64), to get from one to the other. SUB is byte value 26, and so on, as listed on the Wikipedia page you mention, in order from 1-26 and A-Z.

The other C0 controls correspond to Ctrl plus other non-alphabetic characters: NUL is Ctrl-@, since @ is ASCII 64, and [ (91) corresponds to ESC (27), and so on until you hit space.
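As a rough illustration of the arithmetic above, here is a minimal Python sketch. The function names ctrl and describe are made up for this example, and the subtraction only models the pairing for keys in the range @ through _; it is not how any particular terminal implements it.

```python
# C0 control-code names in byte order 0..31 (from the ASCII chart).
C0_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()


def ctrl(key: str) -> int:
    """Return the control byte paired with a key in the range '@'..'_'."""
    code = ord(key)
    if not 64 <= code <= 95:
        raise ValueError(f"{key!r} is outside the range '@'..'_'")
    return code - 64  # same as clearing the bit worth 64


def describe(key: str) -> str:
    """Format one pairing with decimal, hex and binary byte values."""
    byte = ctrl(key)
    return f"Ctrl-{key} -> {C0_NAMES[byte]} ({byte}, 0x{byte:02x}, {byte:08b})"


if __name__ == "__main__":
    for key in "@CZ[_":
        print(describe(key))
```

For example, describe("C") yields "Ctrl-C -> ETX (3, 0x03, 00000011)", matching the values quoted above.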

ASCII defines these bytes with those labels and, to some extent, their meanings, as does Unicode, and as do numerous other encoding standards. The use of Ctrl to flip that bit is determined by the terminal or input driver, but the name “control characters” is fairly suggestive of how that pairing comes about. They will have the same correspondence between letter and byte on any system following this tradition.

On the other hand, many of the ASCII controls and their key sequences are either not used or used for different purposes than originally envisaged, in modern Unix-like systems at least. Ctrl-C and Ctrl-D are still reasonably parallel in the effect they have, but Ctrl-V is usually used to initiate literal input these days rather than synchronous idle, for example, and I’ve never seen a group separator in the wild.

Method 2

I wrote a document in 1984 that summarizes ANSI Codes X3.64-1979, ANSI X3.4-1977, and ANSI X3.41-1974. This ansicode.txt describes how the control codes affect DEC LA-series hardcopy terminals and the VT-series video terminals.

Method 3

De-jure standardization – maybe no

We can rule out that it is part of POSIX. POSIX tries to avoid requiring either the full ASCII set, or the ASCII encodings (numeric values), on which this mapping is based. For example, the ETX character you mention is not required by POSIX. Nor does POSIX mention anything about Control-C / ETX or Control-Z / SUB as defaults when discussing the characters used for INTR and SUSP, for example.
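Since the characters used for INTR, SUSP and friends are settings rather than fixed values, you can ask the terminal what it currently uses. The sketch below uses Python's termios module for that; the helper names caret_notation and special_chars are invented here, and a disabled slot is not special-cased, so it may print differently from stty -a.

```python
import sys
import termios

# Names of a few special-character slots defined by termios.
SLOTS = ("VINTR", "VQUIT", "VERASE", "VKILL", "VEOF", "VSUSP")


def caret_notation(value: bytes) -> str:
    """Render a one-byte control character in caret form, e.g. ^C."""
    code = value[0]
    if code < 32 or code == 127:
        return "^" + chr(code ^ 0x40)
    return chr(code)


def special_chars(fd: int) -> dict:
    """Map slot names to the bytes currently configured on the terminal fd."""
    cc = termios.tcgetattr(fd)[6]  # index 6 is the list of special characters
    return {name: cc[getattr(termios, name)] for name in SLOTS}


if __name__ == "__main__":
    if sys.stdin.isatty():
        for name, value in special_chars(sys.stdin.fileno()).items():
            print(name, caret_notation(value))
    else:
        print("stdin is not a terminal")
```

The output depends on your terminal settings, which is the point: the pairing of INTR with ETX is a configured default, not something fixed by the interface.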

As pointed out by others, the Control key behaviour is not inherently part of the definition of a character set / character encoding. It looks like the Control key mapping was not specified as part of ASCII. Nor does it seem to be part of the series of ANSI terminal standards.

De-facto standardization – the VT100?

I think you can partly explain the expectation of this behaviour in terms of the VT100 / VT102 having become the de-facto standard, though I expect the behaviour pre-dates it. See “What protocol/standard is used by terminals?”

See VT102 User Guide, “Transmitted Characters” -> “Function Keys” -> “Control Character Keys”.

Figure 4-3 shows the keys that generate control characters. You can generate control characters in two ways.

Hold down CTRL and press any unshaded key in Figure 4-3.

Press any shaded key in Figure 4-3 without using CTRL. These dedicated keys generate control characters without the use of CTRL.

Table 4-2 lists the control characters generated by the keyboard. Different computer systems may use each control character differently.

NOTE: The VT102 generates some control characters differently than previous DIGITAL terminals. Table 4-3 lists the changes.

I found this last note particularly interesting. The VT102 uses Control-space for NUL, whereas “previous terminals” from Digital used Ctrl-@. It also changes the key sequences for the last two C0 controls, RS and US. (I wonder how this fits into the pattern of flipping bit 7.) The VT100 also uses Control-space, so I assume “previous terminals” refers to the VT52 family.
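One way to look at the Control-space question is to compare two candidate rules: flipping the bit worth 64, and keeping only the low five bits. The sketch below is plain arithmetic with made-up function names; it does not claim to describe how any DEC terminal actually worked.

```python
def ctrl_by_flip(key: str) -> int:
    """Flip the bit worth 64 (the rule described for letters)."""
    return ord(key) ^ 0x40


def ctrl_by_mask(key: str) -> int:
    """Keep only the low five bits, so the result is always 0..31."""
    return ord(key) & 0x1F


if __name__ == "__main__":
    for key in ("@", "C", " "):
        print(repr(key), ctrl_by_flip(key), ctrl_by_mask(key))
```

Both rules agree for @ (0) and C (3). For space (32), flipping gives 96, which is not a control character, while masking gives 0, i.e. NUL. So Control-space producing NUL is consistent with a masking rule rather than a strict bit flip.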

Linux kernel

This is not replicated in the different hardware drivers, e.g. PS/2 keyboard vs. USB keyboard. Instead it is handled in the VT layer. See vt/keyboard.c. The state of the keyboard modifiers, including Control, is maintained in shift_state. The shift state is then used to select a keymap.

param.shift = shift_final = (shift_state | kbd->slockstate) ^ kbd->lockstate;
param.ledstate = kbd->ledflagstate;
key_map = key_maps[shift_final];

https://elixir.bootlin.com/linux/v4.16.8/source/drivers/tty/vt/keyboard.c#L1393

So for more information I think you would have to look into keymaps as used by the VT layer. I assume the key map for Control is set up to produce the pairing you describe.
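To make the idea of selecting a keymap by modifier state concrete, here is a toy Python model. It is not kernel code: the bit positions, the table contents and the lookup function are illustrative assumptions, and the real tables are indexed by keycode rather than by letter. Only the expression for shift_final mirrors the excerpt above.

```python
# Illustrative modifier bits; the real values live in kernel headers.
SHIFT = 1 << 0
CTRL = 1 << 2

# One table per modifier combination, indexed here by a made-up key name.
KEY_MAPS = {
    0: {"c": "c", "z": "z"},
    SHIFT: {"c": "C", "z": "Z"},
    CTRL: {"c": "\x03", "z": "\x1a"},  # ETX and SUB
}


def lookup(key: str, shift_state: int, slockstate: int = 0, lockstate: int = 0):
    """Pick a table from the combined modifier state, then index it by key."""
    shift_final = (shift_state | slockstate) ^ lockstate
    key_map = KEY_MAPS.get(shift_final)
    if key_map is None:
        return None  # no table for this modifier combination
    return key_map.get(key)


if __name__ == "__main__":
    print(repr(lookup("c", CTRL)))   # Control held
    print(repr(lookup("c", SHIFT)))  # Shift held
```

In this model the Ctrl pairing is simply whatever the Control table contains, which is why a different keymap could produce a different pairing.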

The loadkeys man page also mentions the kernel default key map. This has been moved since the man page was written; it is now located at drivers/tty/vt/defkeymap.c_shipped. To read these tables, you must know the Linux keycodes used to index it. They are based on a QWERTY keyboard, so the letters are neither alphabetical nor contiguous. See include/uapi/linux/input-event-codes.h. Or better, this table which shows both keycodes and the default Control mapping.

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
