Next: , Previous: , Up: International   [Contents][Index]


22.2 Disabling Multibyte Characters

By default, Emacs starts in multibyte mode: it stores the contents of buffers and strings using an internal encoding that represents non-ASCII characters using multi-byte sequences. Multibyte mode allows you to use all the supported languages and scripts without limitations.

Under very special circumstances, you may want to disable multibyte character support, for a specific buffer. When multibyte characters are disabled in a buffer, we call that unibyte mode. In unibyte mode, each character in the buffer has a character code ranging from 0 through 255 (0377 octal); 0 through 127 (0177 octal) represent ASCII characters, and 128 (0200 octal) through 255 (0377 octal) represent non-ASCII characters.

To edit a particular file in unibyte representation, visit it using find-file-literally. See Visiting. You can convert a multibyte buffer to unibyte by saving it to a file, killing the buffer, and visiting the file again with find-file-literally. Alternatively, you can use C-x RET c (universal-coding-system-argument) and specify ‘raw-text’ as the coding system with which to visit or save a file. See Text Coding. Unlike find-file-literally, finding a file as ‘raw-text’ doesn’t disable format conversion, uncompression, or auto mode selection.

Emacs normally loads Lisp files as multibyte. This includes the Emacs initialization file, .emacs, and the initialization files of packages such as Gnus. However, you can specify unibyte loading for a particular Lisp file, by adding an entry ‘coding: raw-text’ in a file local variables section. See Specify Coding. Then that file is always loaded as unibyte text. You can also load a Lisp file as unibyte, on any one occasion, by typing C-x RET c raw-text RET immediately before loading it.

The buffer-local variable enable-multibyte-characters is non-nil in multibyte buffers, and nil in unibyte ones. The mode line also indicates whether a buffer is multibyte or not. See Mode Line. With a graphical display, in a multibyte buffer, the portion of the mode line that indicates the character set has a tooltip that (amongst other things) says that the buffer is multibyte. In a unibyte buffer, the character set indicator is absent. Thus, in a unibyte buffer (when using a graphical display) there is normally nothing before the indication of the visited file’s end-of-line convention (colon, backslash, etc.), unless you are using an input method.

You can turn off multibyte support in a specific buffer by invoking the command toggle-enable-multibyte-characters in that buffer.


Next: , Previous: , Up: International   [Contents][Index]