Chinese character
Mirror of English Wikipedia, the free encyclopedia
| Chinese character in various languages | |
|---|---|
| Chinese | |
| Traditional Chinese | 漢字 |
| Simplified Chinese | 汉字 |
| Pinyin (Mandarin) | Hànzì |
| Jyutping (Cantonese) | hon3 zi6 |
| Korean | |
| Hanja | 漢字 |
| Hangul | 한자 |
| Revised Romanization: | Hanja |
| McCune-Reischauer | Hancha |
| Japanese | |
| Kanji | 漢字 |
| Romaji | Kanji |
| Vietnamese | |
| Hán Tự/Chữ Nho | 漢字 |
| Quốc Ngữ (National Script) | Hán Tự |
A Chinese character (Traditional Chinese: 漢字; Simplified Chinese: 汉字; Hanyu Pinyin: hànzì) is a logogram used in writing Chinese, Japanese, Korean, and formerly Vietnamese. Its possible precursors appeared as early as 8000 years ago, and a complete writing system in Chinese characters was developed 3500 years ago in China, making it perhaps the oldest surviving writing system.
Contrary to popular belief, only about 4% of Chinese characters are derived directly from individual pictograms (象形字), and even in most of those cases the relationship is not necessarily clear to the modern reader; most of the remainder are logical aggregates (會意字) and pictophonetics (形聲字). Pictophonetics are characters containing two parts, one indicating a general category of meaning and the other the sound, though the sound is often only approximate to the modern pronunciation because of changes over time and differences between source languages. Most words in Mandarin Chinese are polysyllabic and thus require two or more characters to write; single-character abbreviations for polysyllabic words are also common. The Kangxi dictionary contains 47,035 characters, although a large number of these are rarely used variants accumulated throughout history. In China, literacy for the working citizen is defined as knowledge of 2000 characters. [1]
In Chinese tradition, each character corresponds to a single syllable, but pronunciation varies between dialects. The loose relationship between phonetics and characters has made it possible to use them for languages of widely different families.
The actual shape of many Chinese characters varies in different cultures. Mainland China adopted simplified characters in 1956, but Traditional Chinese characters are still used in Taiwan and Hong Kong. Japan has used its own simplified characters since 1946, while Korea has limited the use of Chinese characters, and Vietnam has completely abolished them in favour of romanized Vietnamese.
Chinese characters are also known more formally as sinographs and the system as sinography. Languages which have adopted sinography - and with the orthography a large number of loanwords from the Chinese language - are known as Sinoxenic whether they still use sinography or not at present. The term does not imply any genetic affiliation with Chinese. The major Sinoxenic languages are generally considered to be Japanese, Korean and Vietnamese.
History
According to legend, Chinese characters were invented by Cangjie (c. 2650 BC), a bureaucrat under the legendary emperor Huangdi. The fable tells that as Cangjie was hunting on Mount Yangxu (in today's Shanxi), he saw a tortoise whose veins caught his curiosity. Inspired by the possibility of a logical relation among those veins, he studied the animals of the world, the landscape of the earth, and the stars in the sky, and invented a symbolic system called zi (字), that is, Chinese characters. It was said that on the day the characters were born, the Chinese heard the devil mourning and saw crops falling like rain, as it marked the beginning of civilization, for good and for bad.
Modern archaeological evidence, however, has suggested an earlier Neolithic root of Chinese characters. The earliest evidence for what might be writing comes from Jiahu (賈湖), a Neolithic site in the basin of the Yellow River in Henan province, dated to c. 6500 BC [2]. It has yielded turtle carapaces that were pitted and inscribed with symbols. Later excavations in eastern China's Anhui province and the Dadiwan culture sites in the eastern part of northwestern China's Gansu province uncovered pottery shards, dated to c. 5000 BC, inscribed with symbols [3][4]. It is unknown whether these symbols formed part of an organized system of writing, but many of them bear resemblance to what are accepted as early Chinese characters, and it is speculated that they may be ancestors to the latter.
Inscription-bearing artifacts from the Dawenkou culture (大汶口) site in Juxian County, Shandong, dating to c. 2800 BC, have also been found [5]. The Chengziyai (城子崖) site in Longshan township, Shandong has produced fragments of inscribed bones used to divine the future, dating to 2500 - 1900 BC, and symbols on pottery vessels from Dinggong are thought by some scholars to be an early form of writing. Symbols of a similar nature have also been found on pottery shards from the Liangzhu culture (良渚) of the lower Yangtze valley.
Although the earliest forms of primitive Chinese writing are no more than individual symbols and therefore cannot be considered a true written script, the inscriptions found on bones (dated to 2500 - 1900 BC) used for the purposes of divination from the late Neolithic Longshan (龍山) Culture (c. 3200 - 1900 BC) are thought by some to be a proto-written script, similar to the earliest forms of writing in Mesopotamia and Egypt. It is possible that these inscriptions are ancestral to the later Oracle bone script of the Shang Dynasty and therefore the modern Chinese script, since late Neolithic culture found in Longshan is widely accepted by historians and archaeologists to be ancestral to the bronze age Erlitou culture and the later Shang and Zhou Dynasties.
The oldest Chinese inscriptions that are indisputably writing are the Oracle bone script (甲骨文 jiǎgǔwén, lit. shell-bone-script), a well-developed writing system of the Shang Dynasty (or Yin (殷) Dynasty), attested from about 1600 BC (from Zhengzhou) and 1300 BC (from Anyang), along with a very few logographs found on pottery shards and cast in bronzes, known as the Bronze script, which is very similar to but more complex and pictorial than the Oracle Bone Script. Only about 1,400 of the 2,500 known Oracle Bone logographs can be identified with later Chinese characters and therefore easily read. However, it should be noted that these 1,400 logographs include most of the commonly used ones.
Written Styles
The earliest Chinese characters are the Oracle Bone Script of the late Shang Dynasty and the Bronze Script (金文, jīnwén) of the Shang and Zhou Dynasties. These scripts are no longer in use, and are of purely academic interest.
The first script that is still in use today, albeit in restricted use, is the Seal Script (篆書, zhuànshū). It evolved organically out of the Zhou bronze script, and was adopted in a standardized form under the first emperor of China, Qin Shi Huang. The seal script, as the name suggests, is now only used in artistic seals. Few people are still able to read it effortlessly, although the art of carving a traditional seal in the seal script remains alive in China and Japan; some calligraphers also work in this style.
Scripts that are still used regularly are the "Clerical Script" (隸書, lìshū) of the Qin to Han dynasties, the Weibei (魏碑, wèibēi), the "Regular Script" (楷書, kǎishū) used for most printing, and the "Semi-cursive Script" (行書, xíngshū) used for most handwriting.
The Grass Script (草書, cǎoshū) is not in general use, and is a purely artistic calligraphic style. The basic character shapes are suggested, rather than explicitly realized, and the abbreviations are extreme. Despite being cursive to the point where individual strokes are no longer differentiable and the characters often illegible to the untrained eye, this script (also known as draft) is highly revered for the beauty and freedom that it embodies. Many simplified Chinese characters according to the CCP 1964 list are derived from cursive simplifications of traditional characters. The Japanese hiragana script is derived from the cursive script.
Just as Roman letters have a characteristic shape (lower-case letters occupying a roundish area, with ascenders or descenders on some letters), Chinese characters occupy a more or less square area. Characters made up of multiple parts squash these parts together in order to maintain a uniform size and shape — this is the case especially with characters written in the Sòngtǐ style. Because of this, beginners often practise on squared graph paper, and the Chinese sometimes call Han characters "Square-Block Characters" (方塊字, fāngkuài zì).
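| Oracle Bone Script | Seal Script | Regular Script (Traditional) | Regular Script (Simplified) | Pinyin | Meaning |
|---|---|---|---|---|---|
| (image) | (image) | 日 | (same) | rì | Sun |
| (image) | (image) | 月 | (same) | yuè | Moon |
| (image) | (image) | 山 | (same) | shān | Mountain |
| (image) | (image) | 水 | (same) | shuǐ | Water |
| (image) | (image) | 雨 | (same) | yǔ | Rain |
| (image) | (image) | 木 | (same) | mù | Wood |
| (image) | (image) | 禾 | (same) | hé | Rice Plant |
| (image) | (image) | 人 | (same) | rén | Human |
| (image) | (image) | 女 | (same) | nǚ | Woman |
| (image) | (image) | 母 | (same) | mǔ | Mother |
| (image) | (image) | 目 | (same) | mù | Eye |
| (image) | (image) | 牛 | (same) | niú | Bull |
| (image) | (image) | 羊 | (same) | yáng | Goat |
| (image) | (image) | 馬 | 马 | mǎ | Horse |
| (image) | (image) | 鳥 | 鸟 | niǎo | Bird |
| (image) | (image) | 龜 | 龟 | guī | Tortoise |
| (image) | (image) | 龍 | 龙 | lóng | Chinese Dragon |
| (image) | (image) | 鳳 | 凤 | fèng | Chinese Phoenix |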
Formation of Characters
- Main articles: Chinese character classification and radical
When Chinese characters were first invented, pictograms dominated the writing system, and it was possible to discern a character's meaning from its shape. As the characters evolved, the need to express abstract concepts and to write with greater ease encouraged the emergence of more conceptual characters.
Around AD 100, the linguist Xu Shen classified all Chinese characters into six categories, the liùshū (六書), in his dictionary of etymology Shuowen Jiezi (說文解字). Although the categories arguably do not reflect the nature of Chinese characters consistently or completely, the scheme has been perpetuated by its long history and pervasive use. [6]
1. Pictogram (象形字 xiàngxíngzì)
Contrary to popular belief, only a small portion of Chinese characters are pictograms, which reflect the shape of real objects. These characters have evolved into simpler forms that are easier to write.
Examples include 日 (rì) for "sun", 月 (yuè) for "moon", and 木 (mù) for "wood". There is no concrete data on the number of pictograms among modern characters, but nearly 2000 years ago Xu Shen estimated that 4% of Chinese characters fell into this category.
2. Ideograph (指示字, zhǐshìzì)
Also called a simple indicative, simple ideograph, or ideogram, it adds an indicator to a pictograph to make a new meaning. For instance, while 刀 (dāo) is a pictogram for "knife", placing an indicator on the knife makes 刃 (rèn), an ideogram for "blade". Other common examples are 上 (shàng) for "up" and 下 (xià) for "down". This category is small, as most concepts can be represented by characters in other categories.
3. Logical aggregates (會意字, huìyìzì)
Also translated as associative compounds, these characters symbolize an abstract concept by combining pictograms. For instance, while 木 (mù) is a pictograph for wood, putting two 木 together makes 林 (lín), a character for "forest". Combining 日 (rì), sun, and 月 (yuè), moon, makes 明 (míng), bright, which evokes sunlight and moonlight lighting up the sky. Xu Shen estimated that 13% of characters fall into this category.
4. Pictophonetic compounds (形聲字, Xíngshēngzì)
Also called semantic-phonetic compounds or phono-semantic compounds, these form the largest group of characters in Chinese. Each combines a simple pictograph, which hints at the meaning, with a phonetic component to make a new character.
Examples are 河 (hé) river, 湖 (hú) lake, 流 (liú) stream, 沖 (chōng) riptide, and 滑 (huá) slippery. All these characters begin with a radical of three dots, a simplified pictograph for a water drop. The other side is a phonetic indicator, mostly a homophone or near homophone of the word the character represents.
In AD 100, Xu Shen placed up to 82% of characters in this category, and the share had risen to around 90% in the Kangxi Dictionary some 1,600 years later. Pictophonetic compounds helped Chinese extend its vocabulary rapidly, but as the language evolved, the phonetic value of most phonetic components was lost, and the making of such compounds has often been arbitrary.
It is arguably difficult to associate the relevant concepts with some of those characters. For example, the radical of 貓 (māo), cat, is 豸 (zhì), a pictograph for worms; the radical 气 (qì), air, is also used to form 氧 (yǎng), oxygen, and 氨 (ān), ammonia.
5. Borrowing (假借字, Jiǎjièzì)
Also called phonetic loan characters, these are existing characters that were borrowed to represent another meaning. In most cases, this happens when a concept arises in speech but lacks a character to represent it. Occasionally, the new meaning replaces the old meaning altogether. 自 (zì) was a character for nose, but today it exclusively refers to oneself. The old meaning of 萬 (wàn) was scorpion, but it was completely replaced by ten thousand.
As the number of characters has grown, linguists often resist adding new characters based on this principle because it ignores the logic by which characters are formed. However, the need to write dialects, notably Cantonese and Taiwanese in Hong Kong and Taiwan, has slowly extended the number of such characters, which represent dialect vocabulary that has no recorded written form in existing Chinese characters.
6. Associate Transformation (轉注字, zhuǎnzhùzì)
These characters originally represented the same meaning but have bifurcated through orthographic and often semantic drift. For instance, 考 (kǎo) to verify and 老 (lǎo) old were once the same character for "elder person", but separated into two distinct words. As characters of this category are rare, associate transformation is often omitted or combined with other categories in modern classifications.
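The structure of pictophonetic compounds described above can be sketched as a small data table. The following Python example is only an illustration: the decompositions and readings are entered by hand for a few of the characters mentioned in this section, and the names `COMPOUNDS`, `group_by_radical` and `exact_homophones` are hypothetical, not taken from any standard library or dictionary.

```python
from collections import defaultdict

# Hand-entered sample (an assumption for illustration, not a complete dataset):
# character -> (semantic radical, phonetic component,
#               reading of the character, reading of the phonetic component)
COMPOUNDS = {
    "河": ("氵", "可", "hé", "kě"),
    "湖": ("氵", "胡", "hú", "hú"),
    "沖": ("氵", "中", "chōng", "zhōng"),
    "滑": ("氵", "骨", "huá", "gǔ"),
    "貓": ("豸", "苗", "māo", "miáo"),
}

def group_by_radical(compounds):
    """Group characters by the radical that hints at their meaning."""
    groups = defaultdict(list)
    for char, (radical, _phonetic, _reading, _phonetic_reading) in compounds.items():
        groups[radical].append(char)
    return dict(groups)

def exact_homophones(compounds):
    """Characters whose modern reading still matches the phonetic part exactly."""
    return [
        char
        for char, (_radical, _phonetic, reading, phonetic_reading) in compounds.items()
        if reading == phonetic_reading
    ]

if __name__ == "__main__":
    print(group_by_radical(COMPOUNDS))
    print(exact_homophones(COMPOUNDS))
```

In this small sample only one character still matches its phonetic component exactly, which is consistent with the remark above that the sound is often only approximate in modern pronunciation.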
Written Variants
Orthography
The nature of Chinese characters makes it very easy to produce allographs for any single character, and authorities have made continual efforts throughout history to standardize an orthographic character set.
Usually, each Chinese character takes up the same amount of space, due to its square, block-like nature. One of the easiest ways for beginners to get a proper start is, hence, to practise writing with a grid as a guide, which is indeed standard practice in primary schools for both normal exercises and calligraphy training. In addition to strictness in the amount of space a character takes up, Chinese characters are written with very precise rules. The three most important rules concern the strokes employed, stroke placement, and the order in which the strokes are written (stroke order). Most characters can be written with just one stroke order, though some also have variant stroke orders, which may occasionally result in different stroke counts. On a larger scale, Chinese text is traditionally written from top to bottom and then right to left, but it is more common today to see the same orientation as Western languages: going from left to right and then top to bottom (see Chinese written language). Most punctuation marks were adopted from the West, but there are a few exceptions: for example, names of books are marked with a wavy line drawn to their right in vertical text, or enclosed in a special double pointed bracket in horizontal text.
Common errors while writing Chinese characters include incorrect stroke direction, incorrect stroke order, incorrect stroke length relative to other strokes, and incorrect placement of strokes relative to other strokes, as well as incorrect weight given to the different parts of a stroke. Each mistake is highly visible to the literate eye, and such mistakes are often shunned, being marks of illiteracy or incompetence. In a culture that values scholarship as its highest virtue, such attributions are highly undesirable. Because of this strictness in not only the image of the character but also how the image is produced, the Chinese script is considered by many to be the most difficult to learn properly.
Allography
However, due to the long history of Chinese characters, the many stylistic variations that have developed, and numerous attempts by past rulers to standardise writing, some characters do possess multiple forms. These forms are merely allographs of each other in that their composition derives from the same root. These allographs should not be mistaken for simplified forms, as in most cases their stroke counts are the same and they often have generally the same appearance.
Reforms: Simplification
- Main articles: Simplified Chinese character, Shinjitai
The use of traditional characters versus simplified characters varies greatly, and can depend on both local customs and the medium. Before simplification was officially sanctioned, simplified forms were generally a result of caoshu writing or idiosyncratic reductions; traditional, standard characters were mandatory in printed, and especially official, works, while the (unofficial) simplified characters would be used in everyday writing or quick scribblings. Since the 1950s, and especially with the publication of the 1964 list, the PRC has officially adopted a simplified script, while Hong Kong, Macau, and Taiwan retain the use of the traditional characters. There is no absolute rule for using either system, and the choice is often determined by what the target audience understands, as well as by the upbringing of the writer. In addition, there is a special system of characters used for writing numerals in financial contexts; these characters are modifications or adaptations of the original, simple numerals, deliberately made complicated to prevent forgeries or unauthorised alterations.
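The financial numerals mentioned above can be illustrated with a simple character substitution. This is a minimal sketch, assuming the commonly used traditional forms of the complex numerals; the name `to_financial` is hypothetical, and the function substitutes character by character without validating that the input is a well-formed number.

```python
# Ordinary numerals and their deliberately complicated financial counterparts
# (traditional forms). The two strings must have the same length.
ORDINARY = "一二三四五六七八九十百千"
FINANCIAL = "壹貳參肆伍陸柒捌玖拾佰仟"

_TABLE = str.maketrans(ORDINARY, FINANCIAL)

def to_financial(text: str) -> str:
    """Replace each ordinary numeral character with its financial form."""
    return text.translate(_TABLE)

if __name__ == "__main__":
    # 三千五百二十 is 3520 written with ordinary numerals.
    print(to_financial("三千五百二十"))
```

The simple forms differ from one another by only a stroke or two (for example 一, 二, 三, 十), which is why the more elaborate forms make later alteration harder.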
Although most often associated with the PRC, character simplification predates the 1949 communist victory. Caoshu, cursive written text, almost always includes character simplification, and simplified forms have always existed in print, albeit not for the most formal works. In the 1930s and 1940s, discussions on character simplification took place within the Kuomintang government, and a large number of Chinese intellectuals and writers have long maintained that character simplification would help boost literacy in China. Indeed, this desire by the Kuomintang to simplify the Chinese writing system (inherited and implemented by the CCP) also nursed aspirations of some for the adoption of a phonetic script, in imitation of the Roman alphabet, and spawned such inventions as the Gwoyeu Romatzyh.
The PRC issued its first round of official character simplifications in two documents, the first in 1956 and the second in 1964. A second round of character simplifications (known as erjian, or "second round simplified characters") was promulgated in 1977. It was poorly received, and in 1986 the authorities rescinded the second round completely, while making six revisions to the 1964 list, including the restoration of three traditional characters that had been simplified: 叠 dié, 覆 fù, 像 xiàng.
Many of the simplifications adopted had been in use in informal contexts for a long time, as more convenient alternatives to their more complex standard forms. For example, the traditional character 來 lái (come) was written with the structure 来 in the clerical script (隸書 lìshū) of the Han dynasty. This clerical form uses two fewer strokes, and was thus adopted as a simplified form. And the character 雲 yún (cloud) was written with the structure 云 in the oracle bone script of the Shāng dynasty, and had remained in use later as a phonetic loan in the meaning of to say. The simplified form reverted to this original structure.
Southeast Asian Chinese communities
Singapore underwent three successive rounds of character simplification. These resulted in some simplifications that differed from those used in mainland China. It ultimately adopted the reforms of the PRC in their entirety as official, and has implemented them in the educational system.
Malaysia promulgated a set of simplified characters in 1981, which were identical to the Mainland China simplifications; here, however, the simplifications were not widely adopted at first, as the Chinese educational system fell outside the purview of the federal government. With the advent of the PRC as an economic powerhouse, simplified characters are now taught at school and are more commonly, if not almost universally, used. A large majority of the older literate Chinese generation nevertheless use the traditional characters. Chinese newspapers are published in either set of characters, with some even incorporating special Cantonese characters when reporting on the Cantonese celebrity scene of Hong Kong.
Japanese Kanji
In the years after World War II, the Japanese government also instituted a series of orthographic reforms. Some characters were given simplified forms called Shinjitai 新字体 (lit. "new character forms"; the older forms were then labelled the Kyūjitai 旧字体, lit. "old character forms"). The number of characters in common use was restricted, and formal lists of characters to be learned during each grade of school were established, first the 1850-character Tōyō kanji 当用漢字 list in 1946, and later the 1945-character Jōyō kanji 常用漢字 list in 1981. Many variant forms of characters and obscure alternatives for common characters were officially discouraged. This was done with the goal of facilitating learning for children and simplifying kanji use in literature and periodicals. These are simply guidelines, hence many characters outside these standards are still widely known and commonly used, especially those used for personal and place names (for the former, see Jinmeiyō kanji).
| | Traditional | Chinese simp. | Japanese simp. | Meaning |
|---|---|---|---|---|
| Simplified in Chinese, not Japanese | 電 | 电 | 電 | electricity |
| | 開 | 开 | 開 | open |
| | 東 | 东 | 東 | east |
| Simplified in Japanese, not Chinese | 佛 | 佛 | 仏 | Buddha |
| | 惠 | 惠 | 恵 | favour |
| | 拜 | 拜 | 拝 | kowtow, pray to, worship |
| Simplified in both, but differently | 圖 | 图 | 図 | picture, diagram |
| | 轉 | 转 | 転 | turn |
| | 廣 | 广 | 広 | wide, broad |
| Simplified in both in the same way | 學 | 学 | 学 | learn |
| | 體 | 体 | 体 | body |
| | 點 | 点 | 点 | dot, point |
Note: this table is merely cursory, and is not a complete listing.
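The four groupings in the table follow mechanically from comparing the three forms of each character. The sketch below shows one way to express that comparison in Python; the function name `classify` and the short labels it returns are hypothetical, and the comparison assumes that identical-looking forms are stored as the same Unicode code point.

```python
def classify(traditional: str, chinese_simp: str, japanese_simp: str) -> str:
    """Return which reform(s) changed a character, judged by string equality."""
    changed_in_chinese = chinese_simp != traditional
    changed_in_japanese = japanese_simp != traditional
    if changed_in_chinese and changed_in_japanese:
        if chinese_simp == japanese_simp:
            return "both, same form"
        return "both, different forms"
    if changed_in_chinese:
        return "Chinese only"
    if changed_in_japanese:
        return "Japanese only"
    return "neither"

# A few rows copied from the table above: (traditional, Chinese, Japanese, meaning)
ROWS = [
    ("電", "电", "電", "electricity"),
    ("佛", "佛", "仏", "Buddha"),
    ("圖", "图", "図", "picture, diagram"),
    ("學", "学", "学", "learn"),
]

if __name__ == "__main__":
    for trad, zh, ja, meaning in ROWS:
        print(trad, meaning, "->", classify(trad, zh, ja))
```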
Furthermore, some kanji have different meanings in Japanese than in Chinese even though the characters are the same. For example, 好 "hǎo" means 'good' in Chinese, while in Japanese it is used to write 'like', as in 好き "suki".
Japanese kanji generally have two kinds of reading, the 音読み "on-yomi" and the 訓読み "kun-yomi", meaning the sound (Chinese-derived) reading and the native Japanese reading, respectively. The on-yomi is a Japanese reading derived from the Chinese pronunciation, whereas the kun-yomi is the native Japanese word assigned to the character.
The on-yomi pronunciation is typically used when a word is written with two or more kanji in succession, while the kun-yomi is typically used when a kanji stands alone or is followed by kana. For example, in 大好き "daisuki", which is made up of the Chinese characters 大 "dà" and 好 "hǎo", 大 takes the on-yomi "dai" while 好き keeps its kun-yomi "suki". When the two characters are used separately, as in 大きい "ōkii" and 好き "suki", both take the kun-yomi pronunciation.
Dictionaries
The design and use of a dictionary of Chinese characters presents interesting problems. Dozens of indexing schemes have been created for the Chinese characters. The great majority of these schemes — beloved by their inventors but nobody else — have appeared in only a single dictionary; only one such system has achieved truly widespread use. This is the system of radicals.
Chinese character dictionaries often allow users to locate entries in several different ways. Many Chinese, Japanese, and Korean dictionaries of Chinese characters list characters in radical order: characters are grouped together by radical, and radicals containing fewer strokes come before radicals containing more strokes. Under each radical, characters are listed by their total number of strokes. In Japanese and Korean dictionaries, it is usually possible to search for characters by sound, using Kana and Hangul. Most dictionaries also allow searches by total number of strokes, and individual dictionaries often allow other search methods as well.
For instance, to look up the character 松 (pine tree) in a typical dictionary, the user first determines which part of the character is the radical (here 木), then counts the number of strokes in the radical (four), and turns to the radical index (usually located on the inside front or back cover of the dictionary). Under the number "4" for radical stroke count, the user locates 木, then turns to the page number listed, which is the start of the listing of all the characters containing this radical. This page will have a sub-index giving remainder stroke numbers (for the non-radical portions of characters) and page numbers. The right half of the character also contains four strokes, so the user locates the number 4, and turns to the page number given. From there, the user must scan the entries to locate the character he or she is seeking. Some dictionaries have a sub-index which lists every character containing each radical, and if the user knows the number of strokes in the non-radical portion of the character, he or she can locate the correct page directly.
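The lookup procedure just described amounts to a two-level index: first by radical, then by the number of strokes in the rest of the character. The following Python sketch models that index with a handful of hand-entered characters; the names `ENTRIES`, `build_index`, `radical_table` and `lookup` are hypothetical, and a real dictionary would of course cover far more characters and map to page numbers.

```python
from collections import defaultdict

# Hand-entered sample entries (an assumption for illustration):
# (character, radical, strokes in the radical, strokes in the remainder)
ENTRIES = [
    ("松", "木", 4, 4),
    ("林", "木", 4, 4),
    ("村", "木", 4, 3),
    ("校", "木", 4, 6),
    ("河", "氵", 3, 5),
    ("湖", "氵", 3, 9),
]

def build_index(entries):
    """Map radical -> remainder stroke count -> list of characters."""
    index = defaultdict(lambda: defaultdict(list))
    for char, radical, _radical_strokes, remainder in entries:
        index[radical][remainder].append(char)
    return index

def radical_table(entries):
    """Radicals ordered by stroke count, as in a printed radical index."""
    strokes = {radical: n for _char, radical, n, _rest in entries}
    return sorted(strokes, key=lambda radical: strokes[radical])

def lookup(index, radical, remainder_strokes):
    """Return the candidates the user must scan to find the character."""
    return index[radical][remainder_strokes]

if __name__ == "__main__":
    index = build_index(ENTRIES)
    print(radical_table(ENTRIES))   # radicals with fewer strokes come first
    print(lookup(index, "木", 4))   # candidates for 松: radical 木 plus 4 strokes
```

As in a printed dictionary, the final step still requires scanning a short list, since several characters can share the same radical and remainder stroke count.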
Another popular dictionary system is the four corner method, where characters are classified according to the "shape" of each of the four corners.
Most Chinese-English dictionaries and Chinese dictionaries sold to English speakers use the radical lookup method combined with an alphabetical listing of characters based on their pinyin romanization system. To use one of these dictionaries, the reader finds the radical and stroke number of the character, as before, and locates the character in the radical index. The character's entry will have the character's pronunciation in pinyin written down; the reader then turns to the main dictionary section and looks up the pinyin spelling alphabetically, just as if it were an English dictionary.
This system has also been reborrowed by Chinese-language dictionary editors, giving rise to dictionaries with the traditional radical-based character listings in a section at the front, while the main body of the dictionary carries character listings by their pronunciation listed alphabetically according to their pinyin spelling.
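Alphabetical ordering by pinyin can be sketched in a few lines of Python. This example is an assumption about how such sorting might be done, not a description of any particular dictionary: it strips tone marks using Unicode decomposition and sorts on the bare letters, with the original spelling as an arbitrary tie-breaker. Real dictionaries typically also order homophones by tone, and stripping combining marks here also folds ü into u. The name `pinyin_key` is hypothetical.

```python
import unicodedata

def pinyin_key(pinyin: str):
    """Sort key: the pinyin with tone marks removed, then the original spelling."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.lower(), pinyin)

# Hand-entered sample entries: (character, pinyin)
ENTRIES = [("松", "sōng"), ("林", "lín"), ("木", "mù"), ("河", "hé"), ("湖", "hú")]

if __name__ == "__main__":
    for char, pinyin in sorted(ENTRIES, key=lambda entry: pinyin_key(entry[1])):
        print(pinyin, char)
```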
Sinoxenic Languages
Besides Korean and Japanese, a number of Asian languages have historically been written with Han characters, or with characters modified from Han characters. They include:
- Khitan language (契丹文字)
- Miao language
- Nakhi (Naxi) language (Geba script)
- Tangut language (西夏文) [7][8]
- Vietnamese language (Chữ nôm)
- Zhuang language
The Jurchen language (女真文字) used an ideographic script consisting of original characters with a few Han borrowings.
In addition, the Yi script is similar to Han, but is not known to be directly related to it.
Number of Chinese characters
The question of how many characters there are is still the subject of debate. In the 18th century, European scholars claimed the total tally to be about 80,000. This number, however, is thought to be exaggerated, as the character count varies by dictionary and its comprehensiveness. For example, the Kangxi Dictionary lists about 47,000 characters, while the modern Zhonghua Zihai lists in excess of 80,000 (the most comprehensive Korean hanja dictionary, Han-Han Dae Sajeon, contains about 60,000 characters, while its Japanese counterpart, the kanji dictionary Dai Kan-Wa Jiten, lists 50,000 entries). One reason for the overwhelming number of characters is the existence of rarely occurring variants and obscure characters (many of which are unused, even in Classical Chinese). Note, however, that no two characters are ever contextually identical.
The large number of Chinese characters is due to their logographic nature: for every morpheme a glyph is required, and variant characters have at times developed for the same morpheme. Furthermore, in the centuries after the standardisation of the Chinese script by Qin Shi Huang to the zhuanshu, the literati multiplied the total stock of characters by modifying extant characters using the xíngshēngzì (形聲字) method, that is, by altering the radical of a homonym character to provide a distinct glyph for either new words or words that had till then been homographs. It has also been claimed that the sheer number of characters is used as a way to separate scholars from the ordinary, and perhaps even to keep certain texts from being read by all but the most scholarly.
Chinese
It is usually said that about 2,000 characters are needed for basic literacy in Chinese (for example, to read a Chinese newspaper), and a well-educated person will know well in excess of 4,000 to 5,000 characters. Note that it is not necessary to know a character for every known word of Chinese, as the majority of modern Chinese words, unlike their Ancient Chinese and Middle Chinese counterparts, are bimorphemic compounds, i.e. they are made up of two, usually common, characters. There are 6763 code points in GB2312, an early version of the national encoding standard used in the People's Republic of China. GB18030, the modern, mandatory standard, has a much higher number. The Hanyu Shuiping Kaoshi proficiency test covers approximately 5000 characters.
In the Taiwanese Ministry of Education's Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo (常用國字標準字體表, a list of standard forms for regularly used Chinese characters), 4808 characters are listed. The Chinese Standard Interchange Code (CNS11643), the official national encoding standard, supports 48027 characters, while the most widely used encoding scheme, BIG-5, supports only 13053.
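The practical effect of these differing repertoires can be seen by testing whether a given character is representable in an encoding. The sketch below uses Python's built-in `gb2312`, `big5` and `gb18030` codecs; the helper name `can_encode` is hypothetical, and the behaviour shown reflects Python's codec tables, which are assumed here to match the standards discussed above.

```python
def can_encode(char: str, encoding: str) -> bool:
    """Return True if the character is representable in the given encoding."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

if __name__ == "__main__":
    # 汉 is the simplified form and 漢 the traditional form of the same character.
    for char in ("汉", "漢"):
        for encoding in ("gb2312", "big5", "gb18030"):
            print(char, encoding, can_encode(char, encoding))
```

A character set designed around simplified forms, such as GB2312, and one designed around traditional forms, such as BIG-5, therefore cannot freely exchange text, whereas the much larger GB18030 covers both forms.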
In addition, there is a large corpus of dialect characters, which are not used in formal written Chinese but represent colloquial terms in non-Mandarin Chinese spoken forms. One such variety is Written Cantonese, in widespread use in Hong Kong even for certain formal documents, due to the former British colonial administration's recognition of Cantonese for use for official purposes. In Taiwan, there is also an informal body of characters used to represent the spoken Min Nan dialect.
Most common characters
These are the five hundred (500) most commonly-used Chinese characters.[citation needed] According to research, these 500 Chinese characters cover 72.1% of those used in classical and modern Chinese texts. [citation needed]
的 一 不 是 了 人 在 有 我 他 这 为 之 来 大 以 个 中 上 们 到 说 国 和 地 也 子 时 道 出 而 要 于 就 下 得 可 你 年 生 自 会 那 后 能 对 着 事 其 里 所 去 行 过 家 十 用 发 天 如 然 作 方 成 者 多 日 都 三 小 军 二 无 同 么 经 法 当 起 与 好 看 学 进 种 将 还 分 此 心 前 面 又 定 见 只 主 没 公 从 知 使 部 本 动 现 因 开 些 理 长 明 样 意 已 月 正 想 实 把 但 相 两 民 她 力 文 等 外 第 王 高 问 太 头 情 西 机 它 回 并 间 手 四 关 重 应 工 性 全 门 老 点 身 东 由 何 向 至 物 战 业 被 政 内 五 儿 及 入 先 己 安 或 利 很 最 书 制 美 山 体 什 新 话 名 曰 合 加 世 平 水 常 果 位 信 度 产 立 声 南 代 走 女 言 马 金 处 便 通 命 特 给 数 次 海 今 表 原 斯 义 各 州 化 口 任 真 才 几 教 官 少 司 德 解 神 则 必 兵 气 打 员 再 论 别 听 提 万 死 更 比 受 百 做 尔 即 元 报 直 白 总 非 建 夫 北 未 张 令 反 士 师 许 条 变 系 计 且 认 目 光 管 路 接 城 活 保 结 题 却 指 感 难 量 务 治 取 场 思 电 空 边 统 件 期 克 帝 亲 复 住 请 市 六 放 风 资 求 史 色 形 望 传 八 府 眼 领 清 决 笑 告 叫 队 强 往 区 交 武 达 社 权 科 九 设 李 观 记 改 展 字 故 品 议 象 花 七 完 林 基 服 带 据 界 云 觉 像 院 飞 远 收 石 众 车 候 类 程 转 共 千 式 失 流 每 该 朝 始 连 术 近 格 济 干 运 怎 步 台 让 江 河 识 规 拉 切 极 持 若 英 争 功 深 备 造 阳 快 集 布 尽 周 宗 病 华 称 罗 爱 导 确 呢 办 节 根 击 商 陈 火 兴 京 注 虽 杀 父 存 臣 准 广 首 乎 具 甚 黄 满 容 单 联 调 吃 古 算 坐 早 引 须 离 证 约 母 组 房 曾 似 易 随 精 视 尚 断 乃 影 除 青 初 息 守 党 半 县 轻 质 语 越 况 举
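A coverage figure of this kind is the share of character occurrences in a text that belong to the common-character list. The Python sketch below shows how such a share might be computed; it is not the method behind the figure quoted above, the name `coverage` is hypothetical, only the first ten characters of the list are used as a sample, and for simplicity only code points in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) are counted as characters.

```python
def coverage(text: str, common: set) -> float:
    """Fraction of Chinese character occurrences in text found in common."""
    # Ignore punctuation, Latin letters, digits and whitespace.
    hanzi = [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]
    if not hanzi:
        return 0.0
    return sum(ch in common for ch in hanzi) / len(hanzi)

# The first ten characters of the list above, used here as a tiny sample.
SAMPLE_COMMON = set("的一不是了人在有我他")

if __name__ == "__main__":
    sentence = "我是中国人。"
    print(coverage(sentence, SAMPLE_COMMON))
```

Counting occurrences rather than distinct characters is what allows a short list to cover a large share of running text, since the most frequent characters recur constantly.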