SC-TC Lexeme Mappings

In 1996 we launched a project whose ultimate goal is to develop a Chinese-to-Chinese conversion system that gives near-perfect results. To achieve a high level of conversion accuracy, our mapping tables are comprehensive, and include approximately three million general vocabulary lexemes, technical terms, and proper nouns.

This is a sample of lexemic mappings between Simplified Chinese (SC) and Traditional Chinese (TC). A lexemic difference is one where the two varieties use different words for the same concept, much like American 'gas' vs. British 'petrol'. For example, 'laser' is translated as 激光 in SC but as 雷射 in TC. For details, see this paper.
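
To make the idea of a lexemic mapping concrete, here is a minimal sketch of dictionary-based conversion using greedy longest-match lookup. It is an illustration only, not the conversion system described above: the table `LEXEME_MAP` and the function `convert_lexemes` are hypothetical names, the entries are taken from the examples on this page, and characters with no entry are simply passed through unchanged.

```python
# Hypothetical SC-to-TC lexeme table; entries come from the examples on this page.
LEXEME_MAP = {
    "激光": "雷射",      # laser
    "摄像机": "攝影機",  # video camera
    "减震器": "避震器",  # shock absorber
}


def convert_lexemes(text, table=LEXEME_MAP):
    """Replace SC lexemes with TC lexemes, trying the longest match first."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No lexeme starts here: copy the character as-is (assumption).
            out.append(text[i])
            i += 1
    return "".join(out)
```

A real system would also need character-level conversion and disambiguation by context; this sketch covers only the word-level lookup step.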

Field Description

  1. SCID: ID for SC lexeme
  2. L/N
  3. SC Lexeme: Simplified Chinese lexeme
  4. TC Lexeme: Traditional Chinese lexeme
  5. SC Pinyin: Reading in SC pinyin
  6. TC Pinyin: Reading in TC pinyin
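
The six fields above can be read into a simple record type. The sketch below assumes each row is a single line of six whitespace-separated fields, as in the data sample on this page; `LexemeRecord` and `parse_record` are hypothetical names introduced for illustration, and rows that do not have exactly six fields are rejected rather than guessed at.

```python
from typing import NamedTuple


class LexemeRecord(NamedTuple):
    scid: str        # ID for SC lexeme
    ln: str          # L/N flag, kept as the raw string
    sc_lexeme: str   # Simplified Chinese lexeme
    tc_lexeme: str   # Traditional Chinese lexeme
    sc_pinyin: str   # Reading in SC pinyin
    tc_pinyin: str   # Reading in TC pinyin


def parse_record(line: str) -> LexemeRecord:
    """Split one whitespace-separated row into its six named fields."""
    parts = line.split()
    if len(parts) != 6:
        # Assumption: incomplete rows are treated as errors, not repaired.
        raise ValueError(f"expected 6 fields, got {len(parts)}: {line!r}")
    return LexemeRecord(*parts)
```

For example, `parse_record("S0006514A L 协议 協定 xie2-yi4 xie2-ding4").tc_lexeme` gives `協定`.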

Data Sample

SCID L/N SC Lexeme TC Lexeme SC Pinyin TC Pinyin
S0002011B L 乡下佬 鄉巴佬 xiang1-xia4-lao3 xiang1-ba1-lao3
S0002641A N 亲昵 親暱 qin1-ni4 qin1-ni4
S0003226A L 伤心 心碎 shang1-xin1 xin1-sui4
S0005072A L 减震器 避震器 jian3-zhen4-qi4 bi4-zhen4-qi4
S0005256B N 击球区 打擊區 da3-ji2-qu1 da3-ji2-qu1
S0005411A L 刘海 瀏海 liu2-hai3 liu2-hai3
S0005867A L 劝戒 勸誡 quan4-jie4 quan4-jie4
S0006144A L 劳伦斯 羅倫斯 lao2-lun2-si1 luo2-lun2-si1
S0006514A L 协议 協定 xie2-yi4 xie2-ding4
S0006714B N 单晶硅 單晶矽 dan1-jing1-xi4 dan1-jing1-xi4
S0006756A N 单糖 單醣 dan1-tang2 dan1-tang2
S0008661B N 咖喱 咖哩 ga1-li3 ga1-li3
S0009080A N yin1 yin1
S0017006A L 扰流器 隔音板 rao3-liu2-qi4 ge2-yin1-ban3
S0017201A L 护卫舰 護航艦 hu4-wei4-jian4 hu4-hang2-jian4
S0018815A L 摄像机 攝影機 she4-xiang4-ji1 she4-ying3-ji1
S0020451A L 极乐鸟 天堂鳥 ji2-le4-niao3 tian1-tang2-niao3
S0020481A L 极坐标 極座標 ji2-zuo4-biao1 ji2-zuo4-biao1
S0021511A N 欢迎词 歡迎辭 huan1-ying2-ci2 huan1-ying2-ci2
S0022528B N 泔水 餿水 sou1-shui3 sou1-shui3
S0023093A L 涤纶 達克龍 di2-lun2 da2-ke4-long2
S0024559A L 热销 暢銷 re4-xiao1 chang4-xiao1
S0024617B N 热身 暖身 nuan3-shen1 nuan3-shen1
S0024761B N 煊赫 烜赫 xuan3-he4 xuan3-he4
S0025012B N li2 li2
S0025014A N 牦牛 犛牛 mao2-niu2 mao2-niu2
S0353388B N 波多诺伏 新港 xin1-gang3 xin1-gang3
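
The readings in the sample are written as hyphen-separated syllables, each ending in a tone digit (for example `xie2-yi4`). As a rough sketch under that assumption, the following splits a reading into (syllable, tone) pairs and checks whether the SC and TC readings of a row differ. The function names `split_reading` and `readings_differ` are hypothetical.

```python
def split_reading(reading):
    """Split 'xie2-yi4' into [('xie', 2), ('yi', 4)]."""
    syllables = []
    for syl in reading.split("-"):
        # Assumption: every syllable is letters followed by one tone digit.
        if len(syl) < 2 or not syl[-1].isdigit():
            raise ValueError(f"unexpected syllable format: {syl!r}")
        syllables.append((syl[:-1], int(syl[-1])))
    return syllables


def readings_differ(sc_pinyin, tc_pinyin):
    """Return True when the SC and TC readings are not identical."""
    return split_reading(sc_pinyin) != split_reading(tc_pinyin)
```

With the sample rows, `readings_differ("xie2-yi4", "xie2-ding4")` is true for 协议 / 協定, while `readings_differ("qin1-ni4", "qin1-ni4")` is false for 亲昵 / 親暱.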