
May 30, 2024

Shan language in CLDR and LID

  • Unicode Standard: Unicode is a standard for encoding, representing, and handling text in digital form. It assigns unique code points (integer values) to every character in a wide range of writing systems, including alphabets, ideograms, and symbols.

  • Unicode Language Identifier (ULI): A Unicode Language Identifier (ULI) is a string that identifies a language, optionally narrowed by script and region, in a Unicode environment. It typically consists of a two- or three-letter ISO 639 language code, optionally followed by a four-letter ISO 15924 script code and a two-letter ISO 3166 region code, with the parts separated by hyphens. For example, "en" represents English, "fr" represents French, "zh" represents Chinese, and "shn" represents Shan.
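To make the structure of an identifier concrete, here is a minimal Python sketch that splits a tag into its language, script, and region parts. The function name `parse_language_identifier` and the regular expression are assumptions made for illustration, not part of any Unicode library, and the pattern is deliberately simplified (it ignores variants and extensions). The script code "Mymr" (Myanmar script) used in the usage lines is an example value, not something taken from the text above.

```python
import re

# Simplified pattern (an assumption for illustration, not the full grammar):
# language = 2-3 letters, optional script = 4 letters,
# optional region = 2 letters or 3 digits. Variants are not handled.
_ULI_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[-_](?P<script>[A-Za-z]{4}))?"
    r"(?:[-_](?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def parse_language_identifier(tag):
    """Split a tag such as 'shn-MM' into language, script, and region."""
    match = _ULI_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognised language identifier: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    return {
        # Conventional casing: language lowercase, script Titlecase,
        # region uppercase.
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
    }


print(parse_language_identifier("shn-MM"))
print(parse_language_identifier("shn-Mymr-TH"))
```

A real application would more likely rely on an existing internationalization library than on a hand-written pattern; the sketch only shows how the subtags line up.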

Conclusion, based on the Unicode Language Identifier (ULI) document

  • For the Shan language we can use two identifiers:

    • shn-MM for Shan as spoken in Myanmar
    • shn-TH for Shan as spoken in Thailand
      • These are the two countries where Shan is mostly spoken.
      • The Unicode Language Identifier (ULI) helps identify the language used in a document or application.
  • To update CLDR (Language/Script/Region Subtags)
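As a small illustration of the conclusion above, the following Python sketch picks one of the two Shan identifiers from a region code. The helper `shan_tag_for_region` and the fallback to the bare "shn" tag for other regions are assumptions made for this example, not behaviour defined by CLDR.

```python
SHAN_LANGUAGE = "shn"
# The two regions named in the conclusion above.
SHAN_REGIONS = {"MM", "TH"}


def shan_tag_for_region(region=None):
    """Return 'shn-MM' or 'shn-TH' for a known region, else plain 'shn'."""
    if region is None:
        return SHAN_LANGUAGE
    region = region.upper()
    if region in SHAN_REGIONS:
        return f"{SHAN_LANGUAGE}-{region}"
    # Assumption: fall back to the language-only tag for any other region.
    return SHAN_LANGUAGE


print(shan_tag_for_region("MM"))
print(shan_tag_for_region("th"))
print(shan_tag_for_region("US"))
```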