In cxterm, each Chinese character is represented by 2 bytes, and each ASCII character by 1 byte. There are two kinds of Chinese encoding schemes: one requires that the most significant bit (MSB) be set to 1 in both bytes of a Chinese character; the other requires only that the MSB of the first byte of a Chinese character be set to 1. Both schemes require that the MSB of every ASCII code be unset, i.e. the second half of the 8-bit code set (from 128 to 255) cannot be used for single-byte characters in cxterm. The cxterm program recognizes both the GB and BIG5 encodings, which are examples of the two schemes respectively. However, the two encodings cannot be used at the same time.
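The byte-level rule above can be illustrated with a short Python sketch. It is not taken from the cxterm sources; the function name split_units and the both_msb flag are hypothetical and exist only for this illustration.

```python
def split_units(data, both_msb):
    """Split a byte string into 1-byte ASCII and 2-byte Chinese units.

    both_msb=True models the first scheme (MSB set in both bytes, as in GB);
    both_msb=False models the second scheme (MSB required only in the
    first byte, as in BIG5). Hypothetical helper, for illustration only.
    """
    units = []
    i = 0
    while i < len(data):
        if data[i] & 0x80:
            # MSB set: this is the first byte of a double-byte character.
            if i + 1 >= len(data):
                raise ValueError("truncated double-byte character")
            if both_msb and not (data[i + 1] & 0x80):
                raise ValueError("second byte must also have its MSB set")
            units.append(data[i:i + 2])
            i += 2
        else:
            # MSB clear: a single 7-bit ASCII character.
            units.append(data[i:i + 1])
            i += 1
    return units
```

For example, split_units(b"A\xb0\xa1B", True) separates the ASCII letters from the double-byte unit in the middle, while a byte pair such as b"\xa4\x40" is accepted only when both_msb is False.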
Please refer to the xterm(1) manual page for the usage of xterm. This manual covers only those features related to Chinese language processing.
If the current input keystroke sequence yields more candidates than can be displayed in the input area, only the first group of ten candidates is shown. The rest of the selection list will be shown one group at a time upon each press of the "move-right" key. Conceptually, the display can be considered a "viewport" that reveals only a segment of the list. The size of each group or viewport is usually ten candidates; the actual number can be redefined and depends on the current width of the window. Initially the viewport is at the beginning of the list. A ``>'' sign at the right edge of the viewport indicates that more candidates are available to the right. A move-right key (usually ``>'' or ``.'', both on the same physical key) will move the viewport to the right and reveal the next group of candidates. A ``<'' sign at the left edge of the viewport indicates that more candidates are available to the left. A move-left key (usually ``<'' or ``,'', both on the same physical key) will move the viewport backward and reveal the previous group.
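The viewport behavior described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not cxterm's implementation: the function names are hypothetical, and the group size is fixed at ten although the real value depends on the window width.

```python
def viewport(candidates, start, size=10):
    """Return (left_marker, visible_candidates, right_marker)."""
    shown = candidates[start:start + size]
    left = "<" if start > 0 else " "                        # more to the left
    right = ">" if start + size < len(candidates) else " "  # more to the right
    return left, shown, right


def move_right(candidates, start, size=10):
    """Advance to the next group if there is one; otherwise stay put."""
    return start + size if start + size < len(candidates) else start


def move_left(start, size=10):
    """Go back to the previous group, never before the list start."""
    return max(0, start - size)
```

With 25 candidates, the viewport starts at index 0 with only the right marker shown, and two presses of the move-right key reach the last group, where only the left marker is shown.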
The Chinese input area in cxterm is not a separate window. Whether the X cursor is in the input area or elsewhere in the cxterm window makes no difference to keyboard input.
Cxterm is always in one of several input modes, each corresponding to an input method. The two built-in methods are "ASCII" and "IC". The "ASCII" method inputs 7-bit ASCII characters without any translation (just like xterm). The "IC" (Internal Coding) method translates every 4 hexadecimal digits into one double-byte code, corresponding to a Chinese character in the adopted encoding.
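As a rough illustration of the "IC" translation, the Python sketch below turns four hexadecimal digits into one double-byte code. The function name ic_to_bytes is hypothetical, and the decoding step assumes the adopted encoding is GB (Python's "gb2312" codec).

```python
import string


def ic_to_bytes(hex_digits):
    """Translate exactly 4 hexadecimal digits into one double-byte code."""
    if len(hex_digits) != 4 or any(c not in string.hexdigits for c in hex_digits):
        raise ValueError("IC input needs exactly 4 hexadecimal digits")
    return bytes.fromhex(hex_digits)


# Assuming GB encoding, the digits B0A1 name the first hanzi of the GB table.
code = ic_to_bytes("B0A1")
hanzi = code.decode("gb2312")
```

The same four digits would name a different character (or none) under BIG5, which is why the result depends on the encoding cxterm was started with.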
Most other input methods are stored as external files and are loaded at run time on demand. Such external input methods are user-accessible and expandable. See tit2cit(1) for the format of the input method specification file and for how to add your own input method. The name of an input method is determined by the name of the external file in which it is stored.
An input method can be character-based (each candidate is a single Chinese character), phrase-based (each candidate is a unit of multiple Chinese characters), or both. This makes no difference to the cxterm input mechanism. If a keystroke sequence yields a phrase (a multi-character word), the phrase will be displayed as a whole in the selection list, and when the corresponding selection key is pressed, it will be input as a single unit.
Chinese input in cxterm requires only minimal keystrokes. If a keystroke sequence matches an input unit (single character or phrase), all its non-empty prefixes also match that input unit. A selection can be made at any time without waiting for the whole key sequence to be typed. It is, however, common practice to type more keys to reduce the size of the candidate list and make selection easier.
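The prefix rule can be sketched in Python as below. The table contents and the function name candidates are made up for illustration; real input methods are loaded from .cit files.

```python
# Hypothetical input table: full key sequence -> input units.
TABLE = {"abc": ["X"], "abd": ["Y"], "b": ["Z"]}


def candidates(typed, table):
    """Return every unit whose full key sequence starts with the typed keys."""
    if not typed:
        return []
    found = []
    for keys, units in table.items():
        if keys.startswith(typed):
            found.extend(units)
    return found
```

Typing only "a" already offers both X and Y for selection; typing "abc" narrows the list to X alone.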
If the exact keystroke sequence for an input unit is forgotten, the wildcard characters (usually ``*'' and ``?'') can be used in place of the actual keys. The expansion is similar to file name generation in a Unix shell. A wildchar (usually ``?'') can substitute for any single key, while a wildcard (usually ``*'') can substitute for any number (including zero) of keys.
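Because the expansion resembles shell file name generation, it can be approximated with Python's standard fnmatch module. This is only an analogy under the assumption that the wildchar is ``?'' and the wildcard is ``*''; the helper name wildcard_match is hypothetical, and fnmatch also gives ``['' a special meaning that cxterm may not.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern, keys):
    """True if the full key sequence `keys` matches the typed `pattern`.

    '?' stands for exactly one key, '*' for any number of keys (including zero).
    """
    return fnmatchcase(keys, pattern)
```

For a unit whose key sequence is "abc", the patterns "a?c" and "a*" both match, while "a?c" does not match a unit keyed "ac".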
Automatic selection refers to the automatic input of a candidate when it is the only candidate matching the current input keystroke sequence. Cxterm supports three kinds of automatic selection: ``NEVER'', ``ALWAYS'', and ``WHENNOMATCH''. Under the ``NEVER'' mode, cxterm will not take any action; the user has to press the corresponding selection key to pick the input unit. Under the ``ALWAYS'' mode, cxterm will automatically pick the unique choice and send it to the terminal. Under the ``WHENNOMATCH'' mode, cxterm will input the unique choice only after the user types another key into the keystroke sequence, and only if the new sequence no longer matches any candidate. The newly typed key will start a new keystroke sequence for the next input cycle. For example, if the current keystroke sequence ``ab'' yields only one candidate ``X'', but after the user types a key ``c'' the key sequence ``abc'' yields nothing, then cxterm will pick the choice ``X'' for the keystrokes ``ab'' and leave ``c'' alone in the keystroke sequence for the next input. If ``abc'' matches one or more candidates, the input process continues and the automatic selection is aborted.
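The three modes can be modelled with a small Python simulation. This is a sketch, not cxterm's code: the table, lookup() and feed() are hypothetical, and candidates are found by the prefix rule described earlier.

```python
# Hypothetical table: full key sequence -> input unit.
TABLE = {"ab": "X", "cd": "Y", "ce": "Z"}


def lookup(typed):
    """Candidates whose key sequence starts with the typed keys."""
    return [unit for keys, unit in TABLE.items() if keys.startswith(typed)]


def feed(keys, mode):
    """Feed keys one at a time; return (committed units, leftover buffer)."""
    committed = []
    buf = ""
    for key in keys:
        if (mode == "WHENNOMATCH" and buf
                and len(lookup(buf)) == 1 and not lookup(buf + key)):
            # The old buffer had a unique choice and the new key breaks the
            # match: commit the choice, start a new sequence with the key.
            committed.append(lookup(buf)[0])
            buf = key
        else:
            buf += key
        if mode == "ALWAYS" and len(lookup(buf)) == 1:
            # A unique choice is committed as soon as it appears.
            committed.append(lookup(buf)[0])
            buf = ""
    return committed, buf
```

Feeding the keys "abc" in ``WHENNOMATCH'' mode reproduces the example above: ``X'' is committed for ``ab'' and ``c'' is left in the buffer. In ``NEVER'' mode nothing is committed and the buffer simply grows.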
Cxterm provides an alternative way to input a Chinese phrase: by composition. If the input keystroke sequence contains two or more segments connected by association keys (usually ``-''), cxterm will match it against a predefined list of phrases (called the glossary, or the association list). It produces only those phrase candidates in which the first Chinese character matches the first segment, the second character matches the second segment, and so on.
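A possible reading of composition is sketched below in Python. Everything here is assumed for illustration: the key table, the glossary entries, the function name compose, and the choice to accept phrases that are longer than the number of typed segments.

```python
# Hypothetical per-character key sequences (pinyin-like) and glossary.
CHAR_KEYS = {"中": "zhong", "国": "guo", "文": "wen", "钟": "zhong"}
GLOSSARY = ["中国", "中文", "钟表"]  # assumed to be in usage-frequency order


def compose(typed, glossary, char_keys, assoc_key="-"):
    """Return glossary phrases whose i-th character matches the i-th segment."""
    segments = typed.split(assoc_key)
    if len(segments) < 2 or not all(segments):
        return []  # composition needs two or more non-empty segments
    result = []
    for phrase in glossary:
        if len(phrase) < len(segments):
            continue
        # zip pairs each segment with the character in the same position.
        if all(char_keys.get(ch, "").startswith(seg)
               for ch, seg in zip(phrase, segments)):
            result.append(phrase)
    return result
```

With this made-up data, typing "zh-g" yields only the phrase whose second character is keyed "guo", and a sequence without an association key yields no composed candidates.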
Cxterm usually relies on the association keys typed by the user to segment the keystroke buffer. It also supports an auto-segmentation mode. When this mode is on, cxterm can automatically insert association keys following the rule of longest match of keystroke segments. That is, if the current keystroke sequence matches some candidates but no longer does once a new input key is typed, cxterm inserts an artificial association key before the newly input keystroke. Combined with the ``WHENNOMATCH'' auto-selection mode, it is sometimes possible to keep typing input keys and have cxterm make the selections.
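The longest-match rule can be sketched in Python as follows. The key sequences, matches() and auto_segment() are hypothetical stand-ins; the sketch only shows where an artificial association key would be inserted.

```python
# Hypothetical set of full key sequences known to the input method.
KEY_SEQUENCES = {"zhong", "guo", "wen"}


def matches(segment):
    """True if the segment is a prefix of at least one known key sequence."""
    return any(keys.startswith(segment) for keys in KEY_SEQUENCES)


def auto_segment(keys, assoc_key="-"):
    """Insert an association key wherever the current segment stops matching."""
    out = []
    segment = ""
    for key in keys:
        if segment and matches(segment) and not matches(segment + key):
            # Longest match reached: close the segment before the new key.
            out.append(assoc_key)
            segment = key
        else:
            segment += key
        out.append(key)
    return "".join(out)
```

With these sequences, typing "zhongguo" without any association key is segmented as "zhong-guo", as if the user had typed the ``-'' by hand.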
If the association input mode is on (the default), cxterm will form and display a new candidate list whenever a Chinese character is input. The list consists of all possible subsequences of the latest input Chinese character. The user can pick a candidate from the list instead of typing another key sequence.
Both phrase input by composition and post-selection association input require a predefined list of phrases (the association list). The list is stored as an external file and is loaded each time cxterm starts up. The user can change the file to add or drop phrases. The list is usually kept in order of usage frequency, since both composition and association search for and display candidates in that order.
BS or DEL    delete the previous typed input key
ctrl-F       move cursor forward one key
ctrl-B       move cursor backward one key
ctrl-A       move cursor to start of the buffer
ctrl-E       move cursor to end of the buffer
ctrl-D       delete input key at the cursor position
ctrl-U       delete all keys and clear the buffer
ctrl-P       fetch the keystrokes of the last input
All the font selection entries in fontMenu can be used to set Chinese font or fonts as well.
Cxterm adds two more entries to the vtMenu:
parameter string           meaning
----------------           -------
auto-select=never          set auto-select mode to "NEVER"
auto-select=always         set auto-select mode to "ALWAYS"
auto-select=whennomatch    set auto-select mode to "WHENNOMATCH"
auto-segment=no            disable auto-segmentation
auto-segment=yes           enable auto-segmentation
association=no             disable post-selection association
association=yes            enable post-selection association
input-conv=disable         temporarily disable HZ input conversion
input-conv=enable          enable HZ input conversion again
input-conv=toggle          toggle HZ input conversion enable/disable

The hanzi input conversion is usually enabled. When it is disabled, the cxterm input area becomes ``insensitive'' and all input keys are left uninterpreted and treated as ASCII. If hanzi input is currently disabled (or enabled), input-conv=toggle enables (or disables) it. Switching the input method automatically enables HZ input.
The default bindings in the cxterm window are:
         Shift <KeyPress> Prior:   scroll-back(1,halfpage) \n\
         Shift <KeyPress> Next:    scroll-forw(1,halfpage) \n\
         Shift <KeyPress> Select:  select-cursor-start() \
                                   select-cursor-end(PRIMARY, CUT_BUFFER0) \n\
         Shift <KeyPress> Insert:  insert-selection(PRIMARY, CUT_BUFFER0) \n\
               <KeyPress> F1:      switch-HZ-mode(ASCII) \n\
               <KeyPress> F2:      switch-HZ-mode(IC) \n\
              ~Meta<KeyPress>:     insert-seven-bit() \n\
               Meta<KeyPress>:     insert-eight-bit() \n\
         Ctrl ~Meta<Btn1Down>:     popup-menu(mainMenu) \n\
              ~Meta <Btn1Down>:    select-start() \n\
              ~Meta <Btn1Motion>:  select-extend() \n\
         Ctrl ~Meta <Btn2Down>:    popup-menu(vtMenu) \n\
        ~Ctrl ~Meta <Btn2Down>:    ignore() \n\
        ~Ctrl ~Meta <Btn2Up>:      insert-selection(PRIMARY, CUT_BUFFER0) \n\
         Ctrl ~Meta <Btn3Down>:    popup-menu(fontMenu) \n\
        ~Ctrl ~Meta <Btn3Down>:    start-extend() \n\
              ~Meta <Btn3Motion>:  select-extend() \n\
        ~Ctrl ~Meta <BtnUp>:       select-end(PRIMARY, CUT_BUFFER0) \n\
                    <BtnDown>:     bell(0)
Below is a sample of how to use the switch-HZ-mode() action to add more input methods, or to redefine the input mode switch keys:
cxterm*VT100.Translations: #override \
               <KeyPress> F1:  set-HZ-parameter(input-conv=toggle) \n\
               <KeyPress> F2:  switch-HZ-mode(IC) \n\
               <KeyPress> F3:  popup-panel(config) \n\
        ~Shift <KeyPress> F4:  switch-HZ-mode(TONEPY) \n\
         Shift <KeyPress> F4:  switch-HZ-mode(PY) \n\
        ~Shift <KeyPress> F5:  switch-HZ-mode(WuBi) \n\
         Shift <KeyPress> F5:  switch-HZ-mode(CangJie) \n\
        ~Meta <KeyPress> Escape:  insert() set-HZ-parameter(input-conv=disable)
In this example, pressing <F2> will switch the current input method to IC; <F4> will switch to the TONEPY method (an external input method, which requires TONEPY.cit to be in the search path(s) of the .cit files); <shift>+<F4> will switch to the PY method, and so on. The last line above may be a good setting for those who use vi (or celvis). Pressing <ESC> will pass ESC to vi to end insert mode, and temporarily disable cxterm hanzi input (so that you can enter subsequent vi commands as ASCII).
The following xterm actions have additional meaning:
Start a cxterm in reverse video with a scroll bar (it uses GB encoding and the X11 fonts hanzigb16st and 8x16 by default):
cxterm -rv -sb
Start a cxterm in BIG5 encoding (where hku16et is a BIG5 encoding X11 font):
cxterm -fh hku16et -fn 8x16 -BIG5