[ Source: unidic-mecab ]
Package: unidic-mecab (3.1.1-1)
Dictionary for Mecab (Corpus of Contemporary Written Japanese)
unidic-mecab is a dictionary for Mecab (a Japanese morphological analysis implementation), based on the Corpus of Contemporary Written Japanese (upstream publishes it as unidic-cwj).
* All entries follow the definition of "SUW (short-unit word)" specified by NINJAL (The National Institute for Japanese Language and Linguistics), which provides word segmentation in uniform units suited to linguistic research.
* It has a three-layered structure:
  - lemma
  - form
  - spelling
  This allows a clear distinction between two types of word variant: spelling variants and form variants.
* It is useful for speech-processing research, since accent and sound-change information can be attached to entries.
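As a rough illustration of the lemma / form / spelling layering described above, the following Python sketch models one lemma with two forms, each with its own spellings. The class names, the `classify` helper, and the sample entry are all hypothetical and chosen only for illustration; they are not the dictionary's actual schema or data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Form:
    """Middle layer: one pronunciation-level variant of a lemma."""
    reading: str
    spellings: List[str] = field(default_factory=list)  # bottom layer


@dataclass
class Lemma:
    """Top layer: the dictionary headword grouping all variants."""
    name: str
    forms: List[Form] = field(default_factory=list)


def classify(lemma: Lemma, a: str, b: str) -> str:
    """Describe how two surface spellings relate within one lemma."""
    # Map every spelling to the reading of the form it belongs to.
    form_of = {s: f.reading for f in lemma.forms for s in f.spellings}
    if a not in form_of or b not in form_of:
        return "unrelated"
    if a == b:
        return "same"
    # Same form, different writing: spelling variant.
    # Different form under the same lemma: form variant.
    return "spelling variant" if form_of[a] == form_of[b] else "form variant"


# Illustrative sample entry (assumed, not read from the package).
yahari = Lemma(
    name="矢張り",
    forms=[
        Form(reading="ヤハリ", spellings=["やはり", "矢張り"]),
        Form(reading="ヤッパリ", spellings=["やっぱり", "矢っ張り"]),
    ],
)

print(classify(yahari, "やはり", "矢張り"))
print(classify(yahari, "やはり", "やっぱり"))
```

In this sketch, two spellings that share a form are spelling variants, while spellings that belong to different forms of the same lemma are form variants.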
This package is huge. You need more than 10GB of free space to download and install.
Other packages related to unidic-mecab
- rec: mecab (>= 0.96)
  Japanese morphological analysis system
- rec: mecab-utils (>= 0.96)
  Support programs of Mecab