Variational Deep Semantic Text Hashing with Pairwise Labels


Title Variational Deep Semantic Text Hashing with Pairwise Labels
Authors Richeng Xuan, Junho Shim, Sang-goo Lee
Year 2019 / 1
Keywords Natural Language Processing, Semantic Hashing, Machine Learning, Similarity Search
Acknowledgement ITRC
Publication Type International Conference
Publication International Conference on Ubiquitous Information Management and Communication (IMCOM 2019) (Honorable Paper Award), pp. 1076-1091


With the rapid growth of the Web, the amount of textual data has increased explosively over the past few years, and fast similarity search for text has become an essential requirement in many applications. Semantic hashing is one of the most powerful solutions and has been widely deployed for approximate large-scale similarity search: it represents the original text data as compact binary codes, so that similar documents can be found by comparing codes rather than full texts. Recent advances in neural network architectures have demonstrated the effectiveness of deep models for learning better hash functions. Most of these methods encode explicit features, such as categorical labels. Due to the special nature of textual data, previous semantic text hashing approaches do not utilize pairwise label information, even though pairwise labels reflect similarity more intuitively than categorical labels. In this paper, we propose a supervised semantic text hashing method that utilizes pairwise label information. Experimental results on three public datasets show that our method exploits pairwise label information well enough to outperform previous state-of-the-art hashing approaches.
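To make the two ideas in the abstract concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes binary codes are stored as Python integers, ranks documents by Hamming distance, and derives pairwise labels from categorical labels by marking two documents as similar when they share a category. The function names (`hamming`, `search`, `pairwise_labels`) and the 8-bit example codes are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def search(query: int, codes: List[int], k: int) -> List[Tuple[int, int]]:
    """Return the k (index, distance) pairs whose codes are closest to the query."""
    ranked = sorted(
        ((i, hamming(query, c)) for i, c in enumerate(codes)),
        key=lambda pair: pair[1],  # sort by Hamming distance; ties keep index order
    )
    return ranked[:k]


def pairwise_labels(labels: List[int]) -> List[List[int]]:
    """Assumed convention: s[i][j] = 1 if documents i and j share a category, else 0."""
    return [[1 if a == b else 0 for b in labels] for a in labels]


# Hypothetical 8-bit codes for four documents (illustrative values only).
codes = [0b10110010, 0b10110011, 0b01001100, 0b10100010]
query = 0b10110000

top2 = search(query, codes, k=2)
print(top2)  # [(0, 1), (1, 2)]

# Three documents with categories 0, 0, 1.
print(pairwise_labels([0, 0, 1]))  # [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
```

Comparing codes reduces to an XOR and a bit count, which is why compact binary codes make large-scale similarity search fast. A supervised hashing method would learn the codes so that pairs labeled 1 end up at small Hamming distance; that learning step is outside the scope of this sketch.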