
The conversion tables were built using Python.
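The script actually used is not reproduced here. As a minimal sketch of how such a table can be derived from Python's built-in codecs, assuming a single-byte encoding (the helper names `build_table` and `describe` are hypothetical, not taken from the original script):

```python
import unicodedata

def build_table(encoding):
    """Map each byte value 0x00-0xff to its decoded character (None if undefined)."""
    table = {}
    for byte in range(256):
        try:
            table[byte] = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            # Python's codec leaves this byte unassigned in that encoding.
            table[byte] = None
    return table

def describe(char):
    """Human-readable label; control characters have no Unicode name."""
    if char is None:
        return "(undefined)"
    return f"U+{ord(char):04X} {unicodedata.name(char, '<control>')}"

latin1 = build_table("iso8859_1")
print(describe(latin1[0xE9]))  # a printable character
print(describe(latin1[0x9B]))  # a C1 control (CSI)
```

Python's `iso8859_1` codec maps every byte to the code point of the same value, so the 0x80-0x9f range comes out as C1 controls, which matches how the tables here present it.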

The ISO 8859-X encodings are shown with C1 control characters in positions 0x80-0x9f (the C0 controls being the familiar ASCII 0x00-0x1f plus DEL).

The C1 control codes come from an ancient era of computer history and don't seem to have seen much use; Wikipedia says they were almost never used. Some of the functions they describe could be implemented in 7-bit environments by using ESC followed by an ASCII printable character (e.g. 0x9B CSI could be represented in a 7-bit environment as 0x1B 0x5B, that is ESC [, which is the start of ANSI terminal control sequences). Since they were redundant, those control characters were hardly used. Indeed, Windows-1252 extended ISO 8859-1 by reassigning those codepoints. Still according to Wikipedia, U+008E SS2 and U+008F SS3 in EUC-JP are amongst the few C1 control codes used for their intended purpose, besides round-tripping to EBCDIC.
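The CSI example generalises: as far as I understand the ISO/IEC 2022 convention, each C1 byte in 0x80-0x9f has a 7-bit equivalent made of ESC followed by the byte minus 0x40. A small sketch under that assumption (the function name `c1_to_7bit` is made up for illustration):

```python
ESC = 0x1B

def c1_to_7bit(byte):
    """Return the two-byte ESC sequence equivalent to an 8-bit C1 control."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError(f"not a C1 control: {byte:#04x}")
    # 0x80-0x9f maps onto the ASCII range 0x40-0x5f ('@' to '_').
    return bytes([ESC, byte - 0x40])

print(c1_to_7bit(0x9B))  # CSI becomes ESC [
print(c1_to_7bit(0x8E))  # SS2 becomes ESC N
print(c1_to_7bit(0x8F))  # SS3 becomes ESC O
```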

Further, U+0080 PAD, U+0081 HOP and U+0099 SGC are marked as "figment", meaning they are labels for C1 control code points that were never actually approved in any standard, which I suppose means they're even more obscure; see also this posting by Ken Whistler.

The ISO/IEC 8859-1:1997 document does not give any meaning to codepoints 0x80-0x9f, deferring their meaning to ISO/IEC 6429.

For HTML5, ISO 8859-1 is aliased to Windows-1252. It wouldn't make sense to use the C1 control characters in an HTML document.

In summary, the C1 controls included in the tables for ISO 8859-X are an obscure relic of computer history. If you have data that uses those codepoints, in all likelihood it is actually using Windows-1252 or another extension of ISO 8859-1.
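To illustrate with Python's built-in codecs, here is a short sketch on a made-up byte string (the sample data and the helper `has_c1_bytes` are assumptions for the example, not part of the tables):

```python
# Made-up sample: "smart quotes" and an en dash as a Windows program would save them.
data = b"\x93quoted\x94 and a dash \x96 here"

as_latin1 = data.decode("iso8859_1")  # bytes 0x93, 0x94, 0x96 become C1 controls
as_cp1252 = data.decode("cp1252")     # same bytes become printable punctuation

def has_c1_bytes(raw):
    """True if any byte falls in 0x80-0x9f, a hint that the data is not plain ISO 8859-1."""
    return any(0x80 <= b <= 0x9F for b in raw)

print(repr(as_latin1))
print(as_cp1252)
print(has_c1_bytes(data))
```

Note that Python's `cp1252` codec raises `UnicodeDecodeError` on the few bytes Windows-1252 leaves unassigned (such as 0x81), so a strict decode can also serve as a rough sanity check.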


The DOS CP encodings are shown with C0 control characters in the 0x00-0x1f range. This follows the Unicode translation tables. Other sources (e.g. Wikipedia's CP437 page) show the graphical characters instead. A SO answer goes into some details about the interpretation of that range.
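Python's `cp437` codec follows the same convention as the tables here: bytes in the low range decode to control characters. If the graphical interpretation is wanted instead, a translation step can be added afterwards. The sketch below assumes the commonly cited glyph assignment for 0x01-0x1f (as listed on Wikipedia's CP437 page) and leaves 0x00 and 0x7f untouched; the names `CP437_LOW_GLYPHS` and `decode_cp437_graphical` are hypothetical:

```python
# Glyphs commonly shown for CP437 bytes 0x01 through 0x1f (assumed from Wikipedia's table).
CP437_LOW_GLYPHS = "☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"

# Translation table: control code point -> graphical character.
_GRAPHICAL = {code: glyph for code, glyph in enumerate(CP437_LOW_GLYPHS, start=1)}

def decode_cp437_graphical(raw):
    """Decode CP437 bytes, showing 0x01-0x1f as glyphs rather than C0 controls."""
    return raw.decode("cp437").translate(_GRAPHICAL)

print(repr(b"\x03".decode("cp437")))           # control-character interpretation
print(decode_cp437_graphical(b"\x03 Hi \x0e"))  # graphical interpretation
```

Which interpretation is right depends on context: the same byte could be a line feed in a text file and a glyph when written directly to the screen.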


Frédéric Perrin, October 2024.