This site is deprecated and no longer maintained. Please visit the new site for up-to-date information.

Consensus Similarity Measure for Short Text Clustering

From IDSlab

Title Consensus Similarity Measure for Short Text Clustering
Authors Youhyun Shin, Yeonchan Ahn, Heesik Jeon, Sang-goo Lee
Date 2015-9
Keywords Short text, Clustering, Semantic similarity
Acknowledgement Samsung
Publication Type International Workshop
Publication Info 12th International Workshop on Text-based Information Retrieval, in conjunction with DEXA 2015 (TIR 2015)
Download Media:SecureFile-TIR201509.pdf

Abstract (English)
Measuring semantic similarity between short texts is challenging because, with so few words available, changing even a few of them can alter the meaning dramatically. In this paper, we propose a novel term similarity measure that yields better clustering performance than a state-of-the-art method. To achieve this, we combine knowledge-based and corpus-based term similarity measures to exploit the advantages of both approaches. We apply our method to a dialog-utterance dataset consisting of short dialog texts. An empirical study shows that the proposed method outperforms one of the state-of-the-art clustering algorithms for short text clustering.
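
The abstract does not spell out how the two kinds of term similarity are combined, so the following is only a minimal sketch of the general idea, not the measure proposed in the paper. It assumes a hypothetical toy taxonomy (`PARENT`) as a stand-in for a knowledge base, a hypothetical six-utterance corpus (`CORPUS`) for co-occurrence statistics, and a simple weighted average with an assumed mixing weight `alpha`. All names and data here are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy (child -> parent), standing in for a real knowledge base.
PARENT = {"song": "music", "track": "music", "music": "media",
          "movie": "media", "film": "media", "media": None}

# Hypothetical toy corpus of short utterances, used for co-occurrence statistics.
CORPUS = ["play a song", "play this track", "play some music",
          "watch a movie", "watch this film", "stop the music"]


def ancestors(term):
    """Return the chain from a term up to the taxonomy root, term included."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT.get(term)
    return chain


def knowledge_sim(a, b):
    """Path-based similarity: 1 / (1 + path length through the lowest common ancestor)."""
    if a == b:
        return 1.0
    if a not in PARENT or b not in PARENT:
        return 0.0  # at least one term is unknown to the toy taxonomy
    chain_a, chain_b = ancestors(a), ancestors(b)
    common = [t for t in chain_a if t in chain_b]
    if not common:
        return 0.0
    lca = common[0]
    return 1.0 / (1.0 + chain_a.index(lca) + chain_b.index(lca))


def context_vectors(corpus):
    """Map each token to counts of the other tokens sharing its utterance."""
    vectors = {}
    for utterance in corpus:
        tokens = utterance.split()
        for i, tok in enumerate(tokens):
            vectors.setdefault(tok, Counter()).update(tokens[:i] + tokens[i + 1:])
    return vectors


VECTORS = context_vectors(CORPUS)


def corpus_sim(a, b):
    """Cosine similarity between the co-occurrence vectors of two terms."""
    if a == b:
        return 1.0
    u, v = VECTORS.get(a, Counter()), VECTORS.get(b, Counter())
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def combined_term_sim(a, b, alpha=0.5):
    """Assumed combination rule: weighted average of the two term similarities."""
    return alpha * knowledge_sim(a, b) + (1 - alpha) * corpus_sim(a, b)


def text_sim(s, t, alpha=0.5):
    """Symmetric average of each term's best match in the other text."""
    s_tok, t_tok = s.split(), t.split()
    if not s_tok or not t_tok:
        return 0.0

    def directed(xs, ys):
        return sum(max(combined_term_sim(x, y, alpha) for y in ys) for x in xs) / len(xs)

    return 0.5 * (directed(s_tok, t_tok) + directed(t_tok, s_tok))


if __name__ == "__main__":
    print(text_sim("play a song", "play this track"))
    print(text_sim("play a song", "watch this film"))
```

In this toy setting, the two music-related utterances score higher than the music/film pair. A pairwise similarity of this kind could then be passed to any clustering algorithm that accepts a precomputed similarity or distance matrix.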