Conversion options
These flags control the linguistic rules applied during conversion.
Preset
--preset selects a preconfigured combination of defaults:
Individual flags below override the preset's defaults.
Segmentation strategy
--segmentation controls how word boundaries are found:
lattice (default): finds the globally optimal segmentation by evaluating all dictionary matches at every position with dynamic programming. Best for accuracy.
eager: greedy left-to-right longest-match. Faster but may mis-segment compound words.
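The difference between the two strategies can be seen in a toy model. The following is a minimal sketch, not Gukhanmun's implementation: the three-word dictionary, the cost values, and the function names are all assumptions made for illustration, and the real scoring is not described here.

```python
# Hypothetical toy dictionary; the bundled dictionary is far larger.
TOY_DICT = {"大學", "大學生", "生活"}
MAX_LEN = max(map(len, TOY_DICT))
WORD_COST = 1       # assumed cost of one dictionary word
UNKNOWN_COST = 10   # assumed penalty for a character with no dictionary match


def segment_eager(text):
    """Greedy left-to-right longest-match."""
    out, i = [], 0
    while i < len(text):
        # Try the longest candidate first and shrink until a match is found.
        for j in range(min(len(text), i + MAX_LEN), i, -1):
            if text[i:j] in TOY_DICT:
                break
        else:
            j = i + 1  # no match: emit a single fallback character
        out.append(text[i:j])
        i = j
    return out


def segment_lattice(text):
    """Dynamic programming over every dictionary match at every position."""
    n = len(text)
    # best[j] = (lowest total cost of text[:j], start index of its last piece)
    best = [(0, 0)] + [(float("inf"), 0)] * n
    for j in range(1, n + 1):
        for i in range(max(0, j - MAX_LEN), j):
            piece = text[i:j]
            if piece in TOY_DICT:
                cost = WORD_COST
            elif j - i == 1:
                cost = UNKNOWN_COST
            else:
                continue  # multi-character piece that is not a word
            total = best[i][0] + cost
            if total < best[j][0]:
                best[j] = (total, i)
    # Walk the back-pointers from the end to recover the pieces.
    out, j = [], n
    while j > 0:
        i = best[j][1]
        out.append(text[i:j])
        j = i
    return out[::-1]


print(segment_eager("大學生活"))    # ['大學生', '活']
print(segment_lattice("大學生活"))  # ['大學', '生活']
```

In this toy case the eager pass commits to 大學生 and strands 活 as a fallback character, while the lattice pass compares whole paths and prefers 大學 followed by 生活.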
Numeral handling
--numerals controls how hanja numerals are rendered:
Initial sound law
The initial sound law (頭音法則) is enabled by default for ko-kr and
disabled for ko-kp. It affects character-by-character fallback readings for
characters not found in any dictionary; dictionary entries already encode their
correct readings.
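As an illustration of what the law does to a fallback reading, here is a minimal sketch that rewrites only the first syllable of a hangul word. It is a simplification under stated assumptions, not Gukhanmun's implementation: the function name is hypothetical, and the orthographic exceptions (for example 렬 and 률 after a vowel or ㄴ, or dependent nouns) are not modelled.

```python
HANGUL_BASE = 0xAC00
SYLLABLES_PER_LEAD = 21 * 28          # 21 vowels x 28 final-consonant slots
LEAD_N, LEAD_R, LEAD_NULL = 2, 5, 11  # indices of initial ㄴ, ㄹ, ㅇ
# Indices of ㅑ ㅒ ㅕ ㅖ ㅛ ㅠ ㅣ, before which initial ㄴ/ㄹ is dropped.
Y_VOWELS = {2, 3, 6, 7, 12, 17, 20}


def apply_initial_sound_law(word):
    """Apply a simplified initial sound law to the first syllable of `word`."""
    if not word or not ("가" <= word[0] <= "힣"):
        return word
    lead, rest = divmod(ord(word[0]) - HANGUL_BASE, SYLLABLES_PER_LEAD)
    vowel = rest // 28
    if lead in (LEAD_N, LEAD_R) and vowel in Y_VOWELS:
        lead = LEAD_NULL   # 녀 -> 여, 력 -> 역
    elif lead == LEAD_R:
        lead = LEAD_N      # 로 -> 노
    else:
        return word        # the law does not apply
    first = chr(HANGUL_BASE + lead * SYLLABLES_PER_LEAD + rest)
    return first + word[1:]


print(apply_initial_sound_law("녀자"))  # 여자
print(apply_initial_sound_law("로인"))  # 노인
print(apply_initial_sound_law("력사"))  # 역사
```

With the law disabled, as under ko-kp, the readings would stay as 녀자, 로인 and 력사.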
Override with explicit flags:
Homophone disambiguation
Different hanja words can share the same hangul reading (for example, 連霸 and
連敗 are both 연패). In the default hangul-only rendering mode, Gukhanmun
can keep the hanja in parentheses for such words so readers can tell them
apart. --disambiguation sets the scope across which a reading is considered
ambiguous:
--homophone-detection chooses which readings count as ambiguous within the
window:
context-local keeps hangul-only output clean. dictionary-wide is broader,
but with the bundled Standard Korean Dictionary nearly every common reading has
some homophone, so it glosses most Sino-Korean words. To always gloss a
specific word regardless of context, use the --require-hanja flag instead
(see User directives).
Only recognized words are disambiguated
Homophone disambiguation operates on words the dictionary recognizes as units.
A hanja sequence with no dictionary entry of its own is not treated as a single
word, and its fallback (non-dictionary) characters are never glossed; any
recognized single-character entries inside it (such as 紫) are still handled
on their own. For example, 自由 and 子游 are both bundled entries read
자유, so 自由와 子游 becomes 자유(自由)와 자유(子游). By contrast, 紫楡 has
no entry of its own, so under the default context-local strategy 自由와 紫楡
becomes 자유와 자유 with no gloss: the engine never sees a second 자유 unit
to collide with 自由. To disambiguate the whole term, add it to
a custom dictionary and load it with --dictionary (see
Dictionaries) so the engine treats it as a single unit.
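The two cases above can be reproduced with a toy model of the context-local strategy. This is one reading of the behaviour described here, not Gukhanmun's code: the Token type, its fields, and render_context_local are hypothetical names, the tokens are built by hand rather than by a real segmenter, and the whole input is treated as a single window.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    surface: str   # original text
    reading: str   # hangul rendering
    is_word: bool  # True only if matched as a dictionary unit


def render_context_local(tokens):
    """Gloss a recognized word only if another recognized word in the
    same window shares its reading but has a different hanja form."""
    forms_by_reading = defaultdict(set)
    for t in tokens:
        if t.is_word:
            forms_by_reading[t.reading].add(t.surface)
    out = []
    for t in tokens:
        if t.is_word and len(forms_by_reading[t.reading]) > 1:
            out.append(f"{t.reading}({t.surface})")
        else:
            out.append(t.reading)  # fallback characters are never glossed
    return "".join(out)


particle = [Token("와", "와", False), Token(" ", " ", False)]

# Both words are dictionary units with the same reading: they collide.
both_known = [Token("自由", "자유", True), *particle, Token("子游", "자유", True)]
print(render_context_local(both_known))  # 자유(自由)와 자유(子游)

# 紫楡 has no entry: 紫 is a single-character unit, 楡 is a fallback reading.
unknown = [Token("自由", "자유", True), *particle,
           Token("紫", "자", True), Token("楡", "유", False)]
print(render_context_local(unknown))     # 자유와 자유
```

In the second case no unit other than 自由 carries the reading 자유, so nothing is glossed. Adding 紫楡 to a custom dictionary would correspond to replacing the last two tokens with a single Token("紫楡", "자유", True).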
First-occurrence clearing
--first-occurrence removes annotations from characters whose presentation
was already forced earlier in the window:
Error recovery
--recovery controls behaviour when an unrecoverable parse error occurs
(currently relevant for HTML input only):
strict (default): abort with an error.
lenient: skip the problematic fragment and continue.