Mongolian character tables

This document lists the per-character shaping information needed to shape Mongolian text.

Mongolian character table

Mongolian characters should be classified as in the following table. Codepoints in the Mongolian block to which Unicode has not assigned a character are marked “unassigned” in the Unicode category column.
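As a rough illustration, the coarse labels in the Unicode category column can be approximated from the Unicode General Category. The sketch below uses Python's standard `unicodedata` module; the `table_category` helper and the `COARSE` mapping are hypothetical names introduced here, and the result depends on the Unicode version bundled with the Python build. The tables in this document also label some formatting characters as “Other”, which this sketch does not try to reproduce.

```python
import unicodedata

# Hypothetical mapping from the first letter of the General Category
# to the coarse labels used in the tables of this document.
COARSE = {
    "L": "Letter",
    "M": "Mark",
    "N": "Number",
    "P": "Punctuation",
    "S": "Symbol",
    "Z": "Separator",
}

def table_category(ch: str) -> str:
    """Approximate the 'Unicode category' column for one character."""
    gc = unicodedata.category(ch)
    if gc == "Cn":          # no character assigned at this codepoint
        return "unassigned"
    if gc == "Cf":          # format controls such as U+180E
        return "Formatting"
    return COARSE.get(gc[0], "Other")

for cp in (0x1800, 0x180B, 0x1810, 0x181A, 0x1820):
    print(f"U+{cp:04X}", table_category(chr(cp)))
```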

The Joining type column indicates whether each codepoint is defined as joining with adjacent characters on the left side, right side, left and right sides (“DUAL”), or neither side (“NON_JOINING”). Codepoints designated TRANSPARENT in the Joining type column do not join with adjacent characters and, in addition, do not affect the joining behavior of surrounding characters. Non-spacing marks are of type TRANSPARENT. Codepoints designated JOIN_CAUSING force adjacent characters to join.
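The interaction of these joining types can be sketched in a few lines. The example below is a simplified model under stated assumptions, not the algorithm of any particular shaping engine: the `joining_type` and `positional_forms` helpers are hypothetical, the codepoint ranges are read off the table in this document, and the form names (`isol`, `init`, `medi`, `fina`) follow the conventional positional-form labels. TRANSPARENT characters are skipped when looking for a neighbor, and JOIN_CAUSING characters count as joining neighbors.

```python
from enum import Enum

class JT(Enum):
    DUAL = "DUAL"
    NON_JOINING = "NON_JOINING"
    TRANSPARENT = "TRANSPARENT"
    JOIN_CAUSING = "JOIN_CAUSING"

def joining_type(cp: int) -> JT:
    """Hypothetical lookup covering the codepoints listed in this document."""
    if cp in (0x180A, 0x200D):
        return JT.JOIN_CAUSING
    if cp in (0x180B, 0x180C, 0x180D, 0x180F, 0x1885, 0x1886, 0x18A9):
        return JT.TRANSPARENT
    if (cp == 0x1807 or 0x1820 <= cp <= 0x1878
            or 0x1887 <= cp <= 0x18A8 or cp == 0x18AA):
        return JT.DUAL
    return JT.NON_JOINING  # everything else, including unassigned codepoints

def positional_forms(text: str) -> list:
    """Return a form name for each DUAL character, None for all others."""
    types = [joining_type(ord(ch)) for ch in text]

    def joins(i: int, step: int) -> bool:
        # Walk away from position i, skipping TRANSPARENT characters,
        # and report whether the first real neighbor can join.
        j = i + step
        while 0 <= j < len(types) and types[j] is JT.TRANSPARENT:
            j += step
        return 0 <= j < len(types) and types[j] in (JT.DUAL, JT.JOIN_CAUSING)

    forms = []
    for i, t in enumerate(types):
        if t is not JT.DUAL:
            forms.append(None)
            continue
        prev, nxt = joins(i, -1), joins(i, +1)
        if prev and nxt:
            forms.append("medi")
        elif prev:
            forms.append("fina")
        elif nxt:
            forms.append("init")
        else:
            forms.append("isol")
    return forms

print(positional_forms("\u1820\u180B\u1820"))  # variation selector is skipped
print(positional_forms("\u1820\u200C\u1820"))  # ZWNJ blocks joining
```

Here “previous” and “next” refer to logical (memory) order, so the sketch is independent of the direction in which the text is eventually laid out.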

The Joining group column lists the fundamental letter that the listed codepoint behaves like for joining purposes.

Assigned codepoints with a null in the Joining group column evoke no special behavior from the shaping engine during the join-computation stage.

The Mark class column indicates the Canonical Combining Class for the codepoint. Marks are assigned non-zero combining classes so that sequences of adjacent marks can be reordered as required by the orthography.

For Mongolian, a subset of marks in the 220 and 230 classes are also designated Modifier Combining Marks (MCM). These are denoted with 220_MCM and 230_MCM in the Mark class column. The MCM marks are treated differently during the mark-reordering stage.
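A minimal sketch of reordering by combining class follows. It performs only the generic step: each run of marks with a non-zero class is stably sorted by class value, which is what Unicode canonical ordering does. It deliberately ignores the special MCM handling mentioned above, and the `reorder_marks` name is hypothetical. The combining classes come from Python's standard `unicodedata` module.

```python
import unicodedata

def reorder_marks(text: str) -> str:
    """Stably sort each run of non-zero-class marks by combining class."""
    out = []
    run = []  # pending marks whose combining class is non-zero
    for ch in text:
        if unicodedata.combining(ch) != 0:
            run.append(ch)
        else:
            # A class-0 character ends the current run of marks.
            out.extend(sorted(run, key=unicodedata.combining))
            run = []
            out.append(ch)
    out.extend(sorted(run, key=unicodedata.combining))
    return "".join(out)

# U+18A9 (class 228) sorts before U+0301 (class 230).
sample = "\u1820\u0301\u18A9"
print([f"U+{ord(c):04X}" for c in reorder_marks(sample)])
```

Because Python's `sorted` is stable, marks that share a combining class keep their original relative order.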

Table 49 Mongolian character table

| Codepoint | Unicode category | Joining type | Joining group | Mark class | Glyph |
|---|---|---|---|---|---|
| U+1800 | Punctuation | NON_JOINING | null | 0 | ᠀ Mongolian Birga |
| U+1801 | Punctuation | NON_JOINING | null | 0 | ᠁ Mongolian Ellipsis |
| U+1802 | Punctuation | NON_JOINING | null | 0 | ᠂ Mongolian Comma |
| U+1803 | Punctuation | NON_JOINING | null | 0 | ᠃ Mongolian Full Stop |
| U+1804 | Punctuation | NON_JOINING | null | 0 | ᠄ Mongolian Colon |
| U+1805 | Punctuation | NON_JOINING | null | 0 | ᠅ Mongolian Four Dots |
| U+1806 | Punctuation [Pd] | NON_JOINING | null | 0 | ᠆ Todo Soft Hyphen |
| U+1807 | Punctuation | DUAL | null | 0 | ᠇ Sibe Syllable Boundary Mark |
| U+1808 | Punctuation | NON_JOINING | null | 0 | ᠈ Manchu Comma |
| U+1809 | Punctuation | NON_JOINING | null | 0 | ᠉ Manchu Full Stop |
| U+180A | Punctuation | JOIN_CAUSING | null | 0 | ᠊ Mongolian Nirugu |
| U+180B | Mark [Mn] | TRANSPARENT | null | 0 | ᠋ Free Variation Selector One |
| U+180C | Mark [Mn] | TRANSPARENT | null | 0 | ᠌ Free Variation Selector Two |
| U+180D | Mark [Mn] | TRANSPARENT | null | 0 | ᠍ Free Variation Selector Three |
| U+180E | Formatting | NON_JOINING | null | 0 | ᠎ Mongolian Vowel Separator |
| U+180F | Mark [Mn] | TRANSPARENT | null | 0 | ᠏ Free Variation Selector Four |
| U+1810 | Number | NON_JOINING | null | 0 | ᠐ Digit Zero |
| U+1811 | Number | NON_JOINING | null | 0 | ᠑ Digit One |
| U+1812 | Number | NON_JOINING | null | 0 | ᠒ Digit Two |
| U+1813 | Number | NON_JOINING | null | 0 | ᠓ Digit Three |
| U+1814 | Number | NON_JOINING | null | 0 | ᠔ Digit Four |
| U+1815 | Number | NON_JOINING | null | 0 | ᠕ Digit Five |
| U+1816 | Number | NON_JOINING | null | 0 | ᠖ Digit Six |
| U+1817 | Number | NON_JOINING | null | 0 | ᠗ Digit Seven |
| U+1818 | Number | NON_JOINING | null | 0 | ᠘ Digit Eight |
| U+1819 | Number | NON_JOINING | null | 0 | ᠙ Digit Nine |
| U+181A | unassigned | | | | |
| U+181B | unassigned | | | | |
| U+181C | unassigned | | | | |
| U+181D | unassigned | | | | |
| U+181E | unassigned | | | | |
| U+181F | unassigned | | | | |
| U+1820 | Letter | DUAL | null | 0 | ᠠ A |
| U+1821 | Letter | DUAL | null | 0 | ᠡ E |
| U+1822 | Letter | DUAL | null | 0 | ᠢ I |
| U+1823 | Letter | DUAL | null | 0 | ᠣ O |
| U+1824 | Letter | DUAL | null | 0 | ᠤ U |
| U+1825 | Letter | DUAL | null | 0 | ᠥ Oe |
| U+1826 | Letter | DUAL | null | 0 | ᠦ Ue |
| U+1827 | Letter | DUAL | null | 0 | ᠧ Ee |
| U+1828 | Letter | DUAL | null | 0 | ᠨ Na |
| U+1829 | Letter | DUAL | null | 0 | ᠩ Ang |
| U+182A | Letter | DUAL | null | 0 | ᠪ Ba |
| U+182B | Letter | DUAL | null | 0 | ᠫ Pa |
| U+182C | Letter | DUAL | null | 0 | ᠬ Qa |
| U+182D | Letter | DUAL | null | 0 | ᠭ Ga |
| U+182E | Letter | DUAL | null | 0 | ᠮ Ma |
| U+182F | Letter | DUAL | null | 0 | ᠯ La |
| U+1830 | Letter | DUAL | null | 0 | ᠰ Sa |
| U+1831 | Letter | DUAL | null | 0 | ᠱ Sha |
| U+1832 | Letter | DUAL | null | 0 | ᠲ Ta |
| U+1833 | Letter | DUAL | null | 0 | ᠳ Da |
| U+1834 | Letter | DUAL | null | 0 | ᠴ Cha |
| U+1835 | Letter | DUAL | null | 0 | ᠵ Ja |
| U+1836 | Letter | DUAL | null | 0 | ᠶ Ya |
| U+1837 | Letter | DUAL | null | 0 | ᠷ Ra |
| U+1838 | Letter | DUAL | null | 0 | ᠸ Wa |
| U+1839 | Letter | DUAL | null | 0 | ᠹ Fa |
| U+183A | Letter | DUAL | null | 0 | ᠺ Ka |
| U+183B | Letter | DUAL | null | 0 | ᠻ Kha |
| U+183C | Letter | DUAL | null | 0 | ᠼ Tsa |
| U+183D | Letter | DUAL | null | 0 | ᠽ Za |
| U+183E | Letter | DUAL | null | 0 | ᠾ Haa |
| U+183F | Letter | DUAL | null | 0 | ᠿ Zra |
| U+1840 | Letter | DUAL | null | 0 | ᡀ Lha |
| U+1841 | Letter | DUAL | null | 0 | ᡁ Zhi |
| U+1842 | Letter | DUAL | null | 0 | ᡂ Chi |
| U+1843 | Letter | DUAL | null | 0 | ᡃ Todo Long Vowel Sign |
| U+1844 | Letter | DUAL | null | 0 | ᡄ Todo E |
| U+1845 | Letter | DUAL | null | 0 | ᡅ Todo I |
| U+1846 | Letter | DUAL | null | 0 | ᡆ Todo O |
| U+1847 | Letter | DUAL | null | 0 | ᡇ Todo U |
| U+1848 | Letter | DUAL | null | 0 | ᡈ Todo Oe |
| U+1849 | Letter | DUAL | null | 0 | ᡉ Todo Ue |
| U+184A | Letter | DUAL | null | 0 | ᡊ Todo Ang |
| U+184B | Letter | DUAL | null | 0 | ᡋ Todo Ba |
| U+184C | Letter | DUAL | null | 0 | ᡌ Todo Pa |
| U+184D | Letter | DUAL | null | 0 | ᡍ Todo Qa |
| U+184E | Letter | DUAL | null | 0 | ᡎ Todo Ga |
| U+184F | Letter | DUAL | null | 0 | ᡏ Todo Ma |
| U+1850 | Letter | DUAL | null | 0 | ᡐ Todo Ta |
| U+1851 | Letter | DUAL | null | 0 | ᡑ Todo Da |
| U+1852 | Letter | DUAL | null | 0 | ᡒ Todo Cha |
| U+1853 | Letter | DUAL | null | 0 | ᡓ Todo Ja |
| U+1854 | Letter | DUAL | null | 0 | ᡔ Todo Tsa |
| U+1855 | Letter | DUAL | null | 0 | ᡕ Todo Ya |
| U+1856 | Letter | DUAL | null | 0 | ᡖ Todo Wa |
| U+1857 | Letter | DUAL | null | 0 | ᡗ Todo Ka |
| U+1858 | Letter | DUAL | null | 0 | ᡘ Todo Gaa |
| U+1859 | Letter | DUAL | null | 0 | ᡙ Todo Haa |
| U+185A | Letter | DUAL | null | 0 | ᡚ Todo Jia |
| U+185B | Letter | DUAL | null | 0 | ᡛ Todo Nia |
| U+185C | Letter | DUAL | null | 0 | ᡜ Todo Dza |
| U+185D | Letter | DUAL | null | 0 | ᡝ Sibe E |
| U+185E | Letter | DUAL | null | 0 | ᡞ Sibe I |
| U+185F | Letter | DUAL | null | 0 | ᡟ Sibe Iy |
| U+1860 | Letter | DUAL | null | 0 | ᡠ Sibe Ue |
| U+1861 | Letter | DUAL | null | 0 | ᡡ Sibe U |
| U+1862 | Letter | DUAL | null | 0 | ᡢ Sibe Ang |
| U+1863 | Letter | DUAL | null | 0 | ᡣ Sibe Ka |
| U+1864 | Letter | DUAL | null | 0 | ᡤ Sibe Ga |
| U+1865 | Letter | DUAL | null | 0 | ᡥ Sibe Ha |
| U+1866 | Letter | DUAL | null | 0 | ᡦ Sibe Pa |
| U+1867 | Letter | DUAL | null | 0 | ᡧ Sibe Sha |
| U+1868 | Letter | DUAL | null | 0 | ᡨ Sibe Ta |
| U+1869 | Letter | DUAL | null | 0 | ᡩ Sibe Da |
| U+186A | Letter | DUAL | null | 0 | ᡪ Sibe Ja |
| U+186B | Letter | DUAL | null | 0 | ᡫ Sibe Fa |
| U+186C | Letter | DUAL | null | 0 | ᡬ Sibe Gaa |
| U+186D | Letter | DUAL | null | 0 | ᡭ Sibe Haa |
| U+186E | Letter | DUAL | null | 0 | ᡮ Sibe Tsa |
| U+186F | Letter | DUAL | null | 0 | ᡯ Sibe Za |
| U+1870 | Letter | DUAL | null | 0 | ᡰ Sibe Raa |
| U+1871 | Letter | DUAL | null | 0 | ᡱ Sibe Cha |
| U+1872 | Letter | DUAL | null | 0 | ᡲ Sibe Zha |
| U+1873 | Letter | DUAL | null | 0 | ᡳ Manchu I |
| U+1874 | Letter | DUAL | null | 0 | ᡴ Manchu Ka |
| U+1875 | Letter | DUAL | null | 0 | ᡵ Manchu Ra |
| U+1876 | Letter | DUAL | null | 0 | ᡶ Manchu Fa |
| U+1877 | Letter | DUAL | null | 0 | ᡷ Manchu Zha |
| U+1878 | Letter | DUAL | null | 0 | ᡸ Cha With Two Dots |
| U+1879 | unassigned | | | | |
| U+187A | unassigned | | | | |
| U+187B | unassigned | | | | |
| U+187C | unassigned | | | | |
| U+187D | unassigned | | | | |
| U+187E | unassigned | | | | |
| U+187F | unassigned | | | | |
| U+1880 | Letter | NON_JOINING | null | 0 | ᢀ Ali Gali Anusvara One |
| U+1881 | Letter | NON_JOINING | null | 0 | ᢁ Ali Gali Visarga One |
| U+1882 | Letter | NON_JOINING | null | 0 | ᢂ Ali Gali Damaru |
| U+1883 | Letter | NON_JOINING | null | 0 | ᢃ Ali Gali Ubadama |
| U+1884 | Letter | NON_JOINING | null | 0 | ᢄ Ali Gali Inverted Ubadama |
| U+1885 | Mark [Mn] | TRANSPARENT | null | 0 | ᢅ Ali Gali Baluda |
| U+1886 | Mark [Mn] | TRANSPARENT | null | 0 | ᢆ Ali Gali Three Baluda |
| U+1887 | Letter | DUAL | null | 0 | ᢇ Ali Gali A |
| U+1888 | Letter | DUAL | null | 0 | ᢈ Ali Gali I |
| U+1889 | Letter | DUAL | null | 0 | ᢉ Ali Gali Ka |
| U+188A | Letter | DUAL | null | 0 | ᢊ Ali Gali Nga |
| U+188B | Letter | DUAL | null | 0 | ᢋ Ali Gali Ca |
| U+188C | Letter | DUAL | null | 0 | ᢌ Ali Gali Tta |
| U+188D | Letter | DUAL | null | 0 | ᢍ Ali Gali Ttha |
| U+188E | Letter | DUAL | null | 0 | ᢎ Ali Gali Dda |
| U+188F | Letter | DUAL | null | 0 | ᢏ Ali Gali Nna |
| U+1890 | Letter | DUAL | null | 0 | ᢐ Ali Gali Ta |
| U+1891 | Letter | DUAL | null | 0 | ᢑ Ali Gali Da |
| U+1892 | Letter | DUAL | null | 0 | ᢒ Ali Gali Pa |
| U+1893 | Letter | DUAL | null | 0 | ᢓ Ali Gali Pha |
| U+1894 | Letter | DUAL | null | 0 | ᢔ Ali Gali Ssa |
| U+1895 | Letter | DUAL | null | 0 | ᢕ Ali Gali Zha |
| U+1896 | Letter | DUAL | null | 0 | ᢖ Ali Gali Za |
| U+1897 | Letter | DUAL | null | 0 | ᢗ Ali Gali Ah |
| U+1898 | Letter | DUAL | null | 0 | ᢘ Todo Ali Gali Ta |
| U+1899 | Letter | DUAL | null | 0 | ᢙ Todo Ali Gali Zha |
| U+189A | Letter | DUAL | null | 0 | ᢚ Manchu Ali Gali Gha |
| U+189B | Letter | DUAL | null | 0 | ᢛ Manchu Ali Gali Nga |
| U+189C | Letter | DUAL | null | 0 | ᢜ Manchu Ali Gali Ca |
| U+189D | Letter | DUAL | null | 0 | ᢝ Manchu Ali Gali Jha |
| U+189E | Letter | DUAL | null | 0 | ᢞ Manchu Ali Gali Tta |
| U+189F | Letter | DUAL | null | 0 | ᢟ Manchu Ali Gali Ddha |
| U+18A0 | Letter | DUAL | null | 0 | ᢠ Manchu Ali Gali Ta |
| U+18A1 | Letter | DUAL | null | 0 | ᢡ Manchu Ali Gali Dha |
| U+18A2 | Letter | DUAL | null | 0 | ᢢ Manchu Ali Gali Ssa |
| U+18A3 | Letter | DUAL | null | 0 | ᢣ Manchu Ali Gali Cya |
| U+18A4 | Letter | DUAL | null | 0 | ᢤ Manchu Ali Gali Zha |
| U+18A5 | Letter | DUAL | null | 0 | ᢥ Manchu Ali Gali Za |
| U+18A6 | Letter | DUAL | null | 0 | ᢦ Ali Gali Half U |
| U+18A7 | Letter | DUAL | null | 0 | ᢧ Ali Gali Half Ya |
| U+18A8 | Letter | DUAL | null | 0 | ᢨ Manchu Ali Gali Bha |
| U+18A9 | Mark [Mn] | TRANSPARENT | null | 228 | ᢩ Ali Gali Dagalga |
| U+18AA | Letter | DUAL | null | 0 | ᢪ Manchu Ali Gali Lha |
| U+18AB | unassigned | | | | |
| U+18AC | unassigned | | | | |
| U+18AD | unassigned | | | | |
| U+18AE | unassigned | | | | |
| U+18AF | unassigned | | | | |

Mongolian Supplement character table

The Mongolian Supplement block includes variants of the birga mark used to denote the beginning of a text.

Table 50 Mongolian Supplement character table

| Codepoint | Unicode category | Joining type | Joining group | Mark class | Glyph |
|---|---|---|---|---|---|
| U+11660 | Punctuation | NON_JOINING | null | 0 | 𑙠 Birga with Ornament |
| U+11661 | Punctuation | NON_JOINING | null | 0 | 𑙡 Rotated Birga |
| U+11662 | Punctuation | NON_JOINING | null | 0 | 𑙢 Double Birga with Ornament |
| U+11663 | Punctuation | NON_JOINING | null | 0 | 𑙣 Triple Birga with Ornament |
| U+11664 | Punctuation | NON_JOINING | null | 0 | 𑙤 Birga with Double Ornament |
| U+11665 | Punctuation | NON_JOINING | null | 0 | 𑙥 Rotated Birga with Ornament |
| U+11666 | Punctuation | NON_JOINING | null | 0 | 𑙦 Rotated Birga with Double Ornament |
| U+11667 | Punctuation | NON_JOINING | null | 0 | 𑙧 Inverted Birga |
| U+11668 | Punctuation | NON_JOINING | null | 0 | 𑙨 Inverted Birga with Double Ornament |
| U+11669 | Punctuation | NON_JOINING | null | 0 | 𑙩 Swirl Birga |
| U+1166A | Punctuation | NON_JOINING | null | 0 | 𑙪 Swirl Birga with Ornament |
| U+1166B | Punctuation | NON_JOINING | null | 0 | 𑙫 Swirl Birga with Double Ornament |
| U+1166C | Punctuation | NON_JOINING | null | 0 | 𑙬 Turned Swirl Birga with Double Ornament |
| U+1166D | unassigned | | | | |
| U+1166E | unassigned | | | | |
| U+1166F | unassigned | | | | |
| U+11670 | unassigned | | | | |
| U+11671 | unassigned | | | | |
| U+11672 | unassigned | | | | |
| U+11673 | unassigned | | | | |
| U+11674 | unassigned | | | | |
| U+11675 | unassigned | | | | |
| U+11676 | unassigned | | | | |
| U+11677 | unassigned | | | | |
| U+11678 | unassigned | | | | |
| U+11679 | unassigned | | | | |
| U+1167A | unassigned | | | | |
| U+1167B | unassigned | | | | |
| U+1167C | unassigned | | | | |
| U+1167D | unassigned | | | | |
| U+1167E | unassigned | | | | |
| U+1167F | unassigned | | | | |

Miscellaneous character table

Other important characters that may be encountered when shaping runs of Mongolian text include the dotted-circle placeholder (U+25CC), the combining grapheme joiner (U+034F), the zero-width joiner (U+200D) and zero-width non-joiner (U+200C), the left-to-right mark (U+200E) and right-to-left mark (U+200F), and the no-break space (U+00A0).

The dotted-circle placeholder is frequently used when displaying a combining mark in isolation. Real-world text may also use other characters, such as hyphens or dashes, as placeholders in the same way; shaping engines should handle these cases gracefully.

Table 51 Miscellaneous character table

| Codepoint | Unicode category | Joining type | Joining group | Mark class | Glyph |
|---|---|---|---|---|---|
| U+00A0 | Separator | NON_JOINING | null | 0 |   No-break space |
| U+200C | Other | NON_JOINING | null | 0 | ‌ Zero-width non-joiner |
| U+200D | Other | JOIN_CAUSING | null | 0 | ‍ Zero-width joiner |
| U+2010 | Punctuation | NON_JOINING | null | 0 | ‐ Hyphen |
| U+2011 | Punctuation | NON_JOINING | null | 0 | ‑ No-break hyphen |
| U+2012 | Punctuation | NON_JOINING | null | 0 | ‒ Figure dash |
| U+2013 | Punctuation | NON_JOINING | null | 0 | – En dash |
| U+2014 | Punctuation | NON_JOINING | null | 0 | — Em dash |
| U+202F | Separator | NON_JOINING | null | 0 |   Narrow No-Break Space |
| U+25CC | Symbol | NON_JOINING | null | 0 | ◌ Dotted circle |

The zero-width joiner (ZWJ) is primarily used to force the cursive connecting form of a letter even when the adjoining letters would not otherwise trigger that form.

For example, to show the initial form of a letter in isolation (such as for displaying it in a table of forms), the sequence “Letter,ZWJ” would be used. To show the medial form of a letter in isolation, the sequence “ZWJ,Letter,ZWJ” would be used.
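The sequences described above can be built with a small helper. This is only a sketch: the `isolated_display_sequence` name is hypothetical, and the `"fina"` case (ZWJ before the letter) is an assumption extrapolated by symmetry, since the text above describes only the initial and medial cases.

```python
ZWJ = "\u200D"  # zero-width joiner

def isolated_display_sequence(letter: str, form: str) -> str:
    """Build the text needed to display one positional form on its own."""
    if form == "isol":
        return letter
    if form == "init":
        return letter + ZWJ          # "Letter,ZWJ"
    if form == "medi":
        return ZWJ + letter + ZWJ    # "ZWJ,Letter,ZWJ"
    if form == "fina":
        return ZWJ + letter          # assumption: mirror of the initial case
    raise ValueError(f"unknown form: {form!r}")

# Codepoint sequence for the medial form of U+1820 (Mongolian letter A).
seq = isolated_display_sequence("\u1820", "medi")
print([f"U+{ord(c):04X}" for c in seq])
```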

The no-break space is primarily used to display codepoints that are defined as non-spacing (such as diacritical marks) in an isolated context, as an alternative to displaying them superimposed on the dotted-circle placeholder.

The narrow no-break space is used in Mongolian to insert a small gap between a word and its suffix.
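Both space characters are ordinary codepoints placed in the text stream, so the two usages can be sketched as plain string construction. The helper names below are hypothetical, and the example letters are arbitrary placeholders rather than a real word and suffix.

```python
import unicodedata

NBSP = "\u00A0"           # no-break space
NNBSP = "\u202F"          # narrow no-break space
DOTTED_CIRCLE = "\u25CC"  # dotted-circle placeholder

def show_mark_in_isolation(mark: str, use_dotted_circle: bool = False) -> str:
    """Place a combining mark on a no-break space or on a dotted circle."""
    if not unicodedata.category(mark).startswith("M"):
        raise ValueError("expected a combining mark")
    base = DOTTED_CIRCLE if use_dotted_circle else NBSP
    return base + mark

def join_suffix(word: str, suffix: str) -> str:
    """Separate a word from its suffix with a narrow no-break space."""
    return word + NNBSP + suffix

# U+18A9 is the Ali Gali Dagalga mark from the Mongolian table.
print([f"U+{ord(c):04X}" for c in show_mark_in_isolation("\u18A9")])
print([f"U+{ord(c):04X}" for c in join_suffix("\u1820", "\u1821")])
```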