
bMighty SMB TechEncyclopedia
Result for: Unicode
A character code that defines the characters used in most of the world's written languages. Although commonly thought to be only a two-byte coding system, a Unicode character can take as little as one byte or as many as four bytes, depending on the encoding, to hold a Unicode "code point." The code point is a unique number for a character or some character aspect such as an accent mark or ligature. Unicode supports more than a million code points, which are written with a "U" followed by a plus sign and the number in hex; for example, the word "Hello" is written U+0048 U+0065 U+006C U+006C U+006F (see hex chart).
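The U+ notation can be reproduced with a few lines of Python. This is a minimal sketch; the helper name code_points is only illustrative and does not come from any library.

```python
def code_points(text):
    """Return the U+XXXX notation for each character in text."""
    # ord() gives the code point as an integer; format it as
    # at least four uppercase hex digits, as Unicode notation does.
    return [f"U+{ord(ch):04X}" for ch in text]


print(" ".join(code_points("Hello")))
# U+0048 U+0065 U+006C U+006C U+006F
```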
Character Encoding Schemes
There are several formats for storing Unicode code points. When combined with the byte order of the hardware (big endian or little endian), they are known officially as "character encoding schemes." They are also known by their UTF acronyms, which stand for "Unicode Transformation Format" or "Universal Character Set Transformation Format." See byte order.
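To illustrate how byte order changes the stored bytes, the sketch below encodes one character with Python's built-in big endian and little endian codecs. The helper name encode_all_orders is hypothetical; the codec names are standard Python ones.

```python
def encode_all_orders(ch):
    """Return the hex bytes of ch under each fixed-byte-order scheme."""
    schemes = ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le")
    return {name: ch.encode(name).hex() for name in schemes}


for scheme, hex_bytes in encode_all_orders("A").items():
    print(scheme, hex_bytes)
# utf-16-be 0041
# utf-16-le 4100
# utf-32-be 00000041
# utf-32-le 41000000
```

The same code point, U+0041, yields the same byte values in each pair; only their order differs.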
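UTF-8 is widely used because the first 128 code points (U+0000 through U+007F) are encoded as single bytes identical to ASCII, and although up to four bytes can be used per character, only one byte is required for unaccented English text. UTF-32 uses a fixed four bytes per code point, while UTF-16 uses two bytes for most characters and four bytes (a surrogate pair) for code points above U+FFFF. See DBCS.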
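The number of bytes each format needs for a given character can be checked directly. The following is a minimal sketch using Python's built-in codecs; the helper name byte_lengths and the sample characters are chosen here only for illustration.

```python
def byte_lengths(ch):
    """Return the encoded size in bytes of ch in UTF-8, UTF-16 and UTF-32."""
    # The explicit big endian codecs are used so that no byte order
    # mark is added to the output and the counts reflect the character alone.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


samples = {
    "U+0041 (Latin A)": "A",
    "U+00E9 (e with acute)": "\u00e9",
    "U+20AC (euro sign)": "\u20ac",
    "U+1F600 (emoji)": "\U0001F600",
}

for label, ch in samples.items():
    print(label, byte_lengths(ch))

# Plain English text produces identical bytes in ASCII and UTF-8.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))
```

In this sketch the UTF-8 size grows from one to four bytes across the samples, UTF-32 stays at four, and UTF-16 takes two bytes except for the code point above U+FFFF, which takes four.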
Unicode          ISO 10646     Number
Coding Scheme    Equivalent    of Bytes    Byte Order**

UTF-8                          1-4         n/a
UTF-16           (UCS-2)       2 or 4      BE or LE
UTF-16BE         (UCS-2)       2 or 4      BE
UTF-16LE         (UCS-2)       2 or 4      LE
UTF-32           (UCS-4)       4           BE or LE
UTF-32BE         (UCS-4)       4           BE
UTF-32LE         (UCS-4)       4           LE

Pure ASCII (compatible with early 7-bit e-mail systems)
UTF-7                          1-4         n/a

**Byte Order (see byte order)
BE = big endian
LE = little endian
n/a = not applicable (byte-oriented format)
THIS COPYRIGHTED DEFINITION IS FOR PERSONAL USE ONLY.
All other reproduction is strictly prohibited without permission from the publisher.
Copyright (©) 1981-2007 The Computer Language Company Inc All rights reserved.



