ULIX TxT Editor on Sourceforge.net

Using Different Language Formats

ULIX TxT Editor lets you create and open documents in four formats: ANSI, Unicode, big-endian Unicode, and UTF-8. These formats let you work with documents that use different character sets.

By default, your documents will be saved as standard ANSI text.

Unicode is a character standard that covers all the major scripts of the world, including the character sets common to business and computer use. When you save a document in Unicode, you can use Unicode control characters to manage text flow and direction for right-to-left languages such as Arabic and Hebrew.
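To illustrate what direction-related characters look like, here is a minimal Python sketch. Python is used only for demonstration and is not part of ULIX TxT Editor. It uses the standard unicodedata module to look up two Unicode directional marks and the direction class of a Hebrew and an Arabic letter.

```python
import unicodedata

# Two invisible Unicode control characters that influence text direction.
LRM = "\u200e"  # left-to-right mark
RLM = "\u200f"  # right-to-left mark

lrm_name = unicodedata.name(LRM)
rlm_name = unicodedata.name(RLM)

# Bidirectional class of a Hebrew letter (alef) and an Arabic letter (alef).
hebrew_class = unicodedata.bidirectional("\u05d0")
arabic_class = unicodedata.bidirectional("\u0627")

print(lrm_name, rlm_name, hebrew_class, arabic_class)
```

The class "R" marks a right-to-left character and "AL" marks an Arabic letter, which is also laid out right to left.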

Some fonts cannot display all of the Unicode characters. If characters appear to be missing from your text file, change the font to one that includes those characters. Generally, Microsoft Sans Serif is a good choice for Unicode characters.

A Unicode document stores text as a sequence of multi-byte words (a byte is a unit of storage). In a document created on a big-endian processor, such as the PowerPC processors in older Macintosh computers, the bytes in each word are arranged in the opposite order from those in a document created on an Intel processor: the most significant byte has the lowest address, so the word is stored big end first. To make your documents accessible to users on these types of computers, save your ULIX TxT Editor file in the big-endian Unicode format.
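The difference in byte order can be seen in a minimal Python sketch. Python is used only for demonstration and is not part of ULIX TxT Editor. The sketch assumes that the Unicode and big-endian Unicode formats correspond to the UTF-16 little-endian and UTF-16 big-endian encodings, which is an assumption and not something this page confirms.

```python
import codecs

text = "\u20ac"  # the euro sign, code point U+20AC

# Assumption: "Unicode" is UTF-16 little-endian (Intel byte order) and
# "big-endian Unicode" is UTF-16 big-endian.
little_hex = text.encode("utf-16-le").hex()  # least significant byte first
big_hex = text.encode("utf-16-be").hex()     # most significant byte first

# Byte order marks that conventionally start a UTF-16 file.
bom_le = codecs.BOM_UTF16_LE.hex()
bom_be = codecs.BOM_UTF16_BE.hex()

print(little_hex, big_hex)  # ac20 20ac
print(bom_le, bom_be)       # fffe feff
```

The same 16-bit word, 0x20AC, comes out as the bytes AC 20 in little-endian order and 20 AC in big-endian order.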

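UTF stands for Universal Character Set Transformation Format (also expanded as Unicode Transformation Format). UTF-8 is the form of Unicode built from 8-bit units: each character is stored as one to four bytes, and plain ASCII characters take a single byte. Save your document in UTF-8 if you are using older transmission media or software that handle data one 8-bit byte at a time.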
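A minimal Python sketch shows how UTF-8 stores characters as byte sequences of different lengths. Python is used only for demonstration and is not part of ULIX TxT Editor, and the sample characters are arbitrary choices.

```python
# Sample characters, written as escapes so the example does not depend on
# the encoding of this file.
samples = {
    "A": "A",                  # U+0041, plain ASCII
    "e-acute": "\u00e9",       # U+00E9
    "euro": "\u20ac",          # U+20AC
    "emoji": "\U0001f600",     # U+1F600
}

# Number of bytes each character occupies when encoded as UTF-8.
utf8_lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}

# An ASCII character keeps the same single byte value in UTF-8.
ascii_bytes = samples["A"].encode("utf-8")
euro_hex = samples["euro"].encode("utf-8").hex()

print(utf8_lengths)  # {'A': 1, 'e-acute': 2, 'euro': 3, 'emoji': 4}
print(euro_hex)      # e282ac
```

Because ASCII text is unchanged in UTF-8, a file that contains only ASCII characters is byte-for-byte the same in both encodings.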