A Grapheme-level Approach for Constructing a Korean Morphological Analyzer without Linguistic Knowledge
Abstract
Morphological analysis is an essential step in processing Korean because the language is highly agglutinative. In this paper, we propose a novel approach for constructing a Korean morphological analyzer that captures linguistic properties using graphemes as the basic processing units. Because our model does not rely on prior linguistic knowledge, it can be applied to other training corpora with ease. Our model performs morphological analysis through two consecutive sequence labeling tasks: lexical form recovery and part-of-speech tagging. In the lexical form recovery step, the morphological changes in an input sentence are undone to restore the original lexical forms. Then, in the part-of-speech tagging step, the corresponding part-of-speech tags are attached to the recovered forms. Experimental results show that our model outperforms previous models that were constructed without prior linguistic knowledge.
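To make the notion of a grapheme-level processing unit concrete, the following minimal sketch decomposes precomposed Hangul syllables into their constituent letters using the standard Unicode syllable arithmetic. It is an illustration only: the function name `to_graphemes` is hypothetical, and the choice of conjoining jamo code points as the output alphabet is an assumption, not necessarily the grapheme inventory used by the model.

```python
# Illustrative sketch only: the grapheme inventory below (Unicode conjoining
# jamo) is an assumption, not necessarily the one used in the paper.

HANGUL_BASE = 0xAC00   # first precomposed syllable, U+AC00
HANGUL_LAST = 0xD7A3   # last precomposed syllable, U+D7A3
NUM_VOWELS = 21        # medial vowels per initial consonant
NUM_TAILS = 28         # final consonants per vowel, including "no final"

INITIALS = [chr(0x1100 + i) for i in range(19)]        # leading consonants
VOWELS = [chr(0x1161 + i) for i in range(NUM_VOWELS)]  # medial vowels
FINALS = [""] + [chr(0x11A8 + i) for i in range(27)]   # index 0 = no final


def to_graphemes(text):
    """Split each Hangul syllable into its letters; pass other characters through."""
    graphemes = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            lead, rest = divmod(index, NUM_VOWELS * NUM_TAILS)
            vowel, tail = divmod(rest, NUM_TAILS)
            graphemes.append(INITIALS[lead])
            graphemes.append(VOWELS[vowel])
            if tail:  # a syllable may have no final consonant
                graphemes.append(FINALS[tail])
        else:
            graphemes.append(ch)
    return graphemes
```

Under this assumption, each syllable yields two or three units, so a sequence labeler operating on the output sees one label position per letter rather than per syllable.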