Dictionaries
Gukhanmun uses dictionaries to look up the hangul readings of hanja. It ships with the Standard Korean Dictionary (標準國語大辭典) bundled and uses it by default.
Bundled Standard Korean Dictionary
The bundled dictionary is loaded automatically. No extra flags are needed for most Korean text.
To disable it, for example when you want to rely entirely on a custom
dictionary, pass --no-stdict.
Custom dictionaries
Supply one or more custom dictionaries with -d (or --dictionary). The
flag can be repeated, once per dictionary file.
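When gukhanmun is driven from a script, the dictionary flags compose as in the sketch below. The helper name `gukhanmun_args` and the file names are hypothetical; only `--no-stdict` and `-d` come from this page. How the input text is passed is not covered here, so the sketch stops at the dictionary flags.

```python
def gukhanmun_args(dictionaries, use_stdict=True):
    """Build an argument list for gukhanmun (hypothetical helper)."""
    args = ["gukhanmun"]
    if not use_stdict:
        # Skip the bundled Standard Korean Dictionary.
        args.append("--no-stdict")
    for path in dictionaries:
        # -d is repeated once per dictionary, in command-line order.
        args.extend(["-d", path])
    return args


# Hypothetical file names, one per supported format.
print(gukhanmun_args(["names.gukfst", "terms.gukcdb"], use_stdict=False))
```

The resulting list can be handed to `subprocess.run` once the remaining arguments for your input are appended.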
Gukhanmun supports two binary dictionary formats: FST (.gukfst files) and CDB (.gukcdb files).
Dictionaries are tried in the order they appear on the command line, with the bundled dictionary consulted last. The first match wins.
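The precedence rule can be modeled with Python's `collections.ChainMap`, which searches its mappings left to right and returns the first hit. This is only an illustration of "first match wins", not a description of how Gukhanmun matches text internally; the entries and variable names are made up for the example.

```python
from collections import ChainMap

# Hypothetical contents; real dictionaries are compiled binary files.
first_custom = {"樂": "요"}                      # first -d on the command line
second_custom = {"樂": "악", "金": "김"}          # second -d
bundled = {"樂": "락", "金": "금", "學校": "학교"}  # bundled dictionary, consulted last

# Earlier mappings shadow later ones, mirroring command-line order.
lookup = ChainMap(first_custom, second_custom, bundled)

print(lookup["樂"])    # resolved by the first custom dictionary
print(lookup["金"])    # falls through to the second custom dictionary
print(lookup["學校"])  # only the bundled dictionary has this key
```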
Building a custom dictionary
The .gukfst and .gukcdb files are compiled artifacts, not something you
edit by hand. You author your entries as a plain text table and compile them
with gukhanmun-mkdict.
The gukhanmun-mkdict builder is installed together with gukhanmun, whether
you install via mise or download a prebuilt archive. If you instead
installed gukhanmun from crates.io, install the builder the same way.
Write your entries as a tab-separated file with a hanja key column and a
hangul reading column.
Two optional columns control how renderers treat each entry: set
require_hanja to true to keep the source hanja visible (for homophones that
need disambiguation), and require_hangul to true to force a hangul gloss in
the original-script rendering mode.
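A table of this shape can be generated with Python's `csv` module. The sketch below is written under assumptions: it emits only the two required columns, in the order hanja key then hangul reading, with no header row. Whether a header row is expected, and how the optional require_hanja and require_hangul columns are laid out, is defined by the dictionary file format specification rather than by this example. The entries are illustrative.

```python
import csv
import io

# Illustrative entries: (hanja key, hangul reading).
entries = [
    ("學校", "학교"),
    ("圖書館", "도서관"),
]

buffer = io.StringIO()
# Tab-delimited output with plain "\n" line endings.
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(entries)
tsv_text = buffer.getvalue()

print(tsv_text, end="")
```

Saving `tsv_text` to a file gives an input table that can then be passed to gukhanmun-mkdict.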
Compile the table with gukhanmun-mkdict into an FST dictionary (a .gukfst file), which is the default format.
Pass --format cdb to produce a .gukcdb file instead. You can supply
several input files, which are merged in order; --merge selects how duplicate
keys are resolved (error, first-wins, or last-wins). Add --validate to
reopen the output and confirm every entry round-trips, and --metadata KEY=VAL
to embed provenance such as the source or license.
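The three duplicate-key policies can be pictured with a small model. This is a sketch of the semantics as described above, not the builder's actual code; the function name `merge_entries` and the sample tables are hypothetical, and the real builder may differ on details such as how it treats identical duplicate rows.

```python
POLICIES = ("error", "first-wins", "last-wins")


def merge_entries(tables, policy):
    """Merge (key, reading) tables in order under a duplicate-key policy."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    merged = {}
    for table in tables:
        for key, reading in table:
            if key in merged:
                if policy == "error":
                    raise ValueError(f"duplicate key: {key}")
                if policy == "first-wins":
                    continue  # keep the entry from the earlier input
            # New key, or last-wins overwriting the earlier entry.
            merged[key] = reading
    return merged


# Hypothetical inputs, merged in the order given.
base = [("樂", "락")]
overrides = [("樂", "악"), ("金", "금")]

print(merge_entries([base, overrides], "first-wins"))
print(merge_entries([base, overrides], "last-wins"))
```

With the error policy, the same two inputs raise `ValueError` at the first repeated key, which is useful when duplicates indicate a mistake in the source tables.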
Then load the result with -d like any other custom dictionary.
CSV and JSON Lines inputs are also accepted, and a few more advanced options are available. See the internals section for the full dictionary file format specification.