Database of Foreign Names in Arabic
Because personal names play an important role in natural language applications such as named entity extraction and machine translation, The CJK Dictionary Institute is continuously expanding and revising its proper noun resources, which provide systematic coverage of Arabic orthographic variants and common orthographic errors.
Our institute, in an international collaborative effort that includes Arabic name specialists, has developed new techniques for the collection, validation and attestation of non-Arab names written in Arabic, and is now in the process of building a comprehensive Database of Foreign Names in Arabic, referred to as DAFNA.
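To illustrate how a resource covering orthographic variants might be used in an application, the following is a minimal Python sketch of a variant lookup. It is not the institute's method or data format: the table `VARIANTS`, the functions `normalize` and `lookup`, and the Arabic spellings listed are all hypothetical and chosen only for illustration. The sketch assumes that diacritics, the tatweel (elongation) character and hamza-bearing alef forms can be folded away before matching.

```python
import re
import unicodedata

# Hypothetical variant table. These entries are illustrative only
# and are NOT taken from DAFNA.
VARIANTS = {
    "John": ["جون", "جوهن"],
    "Davis": ["ديفيس", "دافيس", "ديفز"],
}

# Arabic short-vowel marks (U+064B-U+0652) and superscript alef (U+0670).
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")
_TATWEEL = "\u0640"
# Fold alef with hamza above/below and alef with madda into bare alef.
_ALEF_FORMS = str.maketrans({
    "\u0623": "\u0627",
    "\u0625": "\u0627",
    "\u0622": "\u0627",
})


def normalize(text):
    """Fold away differences assumed to be irrelevant for matching."""
    text = unicodedata.normalize("NFC", text)
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    return text.translate(_ALEF_FORMS).strip()


# Reverse index: normalized Arabic spelling -> original-script name.
INDEX = {
    normalize(variant): name
    for name, variants in VARIANTS.items()
    for variant in variants
}


def lookup(arabic):
    """Return the original-script name for an Arabic spelling, or None."""
    return INDEX.get(normalize(arabic))
```

With this sketch, `lookup("جون")` resolves to `"John"`, and a spelling that differs only by a vowel mark or a tatweel resolves to the same entry. A production system would need a far larger, attested variant list and a more careful treatment of genuine spelling errors than simple character folding.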
The sample below shows orthographic variants and spelling errors of a common American given name (John), and a common American surname (Davis). The original American name data was obtained from the U.S. Census Bureau. A larger sample is also available.