A Grapheme-level Approach for Constructing a Korean Morphological Analyzer without Linguistic Knowledge

Information

Title A Grapheme-level Approach for Constructing a Korean Morphological Analyzer without Linguistic Knowledge
Authors Jihun Choi, Jonghem Youn, Sang-goo Lee
Year 2016 / 12
Keywords natural language processing, computational linguistics, morphological analysis
Acknowledgement BK
Publication Type International Workshop
Publication Big Data and Natural Language Processing workshop hosted at IEEE Big Data 2016, pp. 3872-3879
Link doi

Abstract

Morphological analysis is an essential step in processing Korean because the language is highly agglutinative. In this paper, we propose a novel approach for constructing a Korean morphological analyzer that captures linguistic properties by using graphemes as its basic processing units. Because our model does not rely on prior linguistic knowledge, it can be applied to other training corpora with ease. Our model performs morphological analysis through two consecutive sequence labeling tasks: lexical form recovery and part-of-speech tagging. In the lexical form recovery step, the morphological changes in an input sentence are undone to restore the original lexical forms. Then, in the part-of-speech tagging step, the corresponding part-of-speech tags are attached to the recovered forms. Experimental results show that our model outperforms previous models that were constructed without prior linguistic knowledge.
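
To make the idea of graphemes as processing units concrete, the following is a minimal sketch of how a Korean sentence could be split into grapheme-level symbols before sequence labeling. It assumes that "grapheme" corresponds to the Hangul letters (jamo) that make up each syllable block, and it uses the standard Unicode arithmetic for decomposing precomposed Hangul syllables. The function name `to_graphemes` is hypothetical, and the abstract does not specify the exact input representation used by the model, so this should be read as an illustration rather than the paper's preprocessing.

```python
# Constants for precomposed Hangul syllables, as defined by the Unicode standard.
S_BASE = 0xAC00                       # first precomposed syllable, "가"
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
L_BASE, V_BASE, T_BASE = 0x1100, 0x1161, 0x11A7


def to_graphemes(text):
    """Split text into a flat list of symbols (hypothetical helper).

    Each precomposed Hangul syllable becomes its leading consonant, vowel,
    and optional trailing consonant (conjoining jamo). Every other
    character is passed through unchanged.
    """
    symbols = []
    for ch in text:
        index = ord(ch) - S_BASE
        if 0 <= index < L_COUNT * V_COUNT * T_COUNT:
            lead, rest = divmod(index, V_COUNT * T_COUNT)
            vowel, tail = divmod(rest, T_COUNT)
            symbols.append(chr(L_BASE + lead))
            symbols.append(chr(V_BASE + vowel))
            if tail:  # tail index 0 means the syllable has no final consonant
                symbols.append(chr(T_BASE + tail))
        else:
            symbols.append(ch)
    return symbols


if __name__ == "__main__":
    # "갔다" is the contracted surface form of 가 + 았 + 다.
    print(to_graphemes("갔다"))
```

In a contracted form such as "갔다", the final consonant of the first syllable comes from a different morpheme than the rest of that syllable, which is one reason a symbol sequence finer than the syllable can be convenient for a labeling model.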