Variational Deep Semantic Text Hashing with Pairwise Labels
With the rapid growth of the Web, the amount of textual data has increased explosively over the past few years, and fast similarity search over text is becoming an essential requirement in many applications. Semantic hashing, which represents the original text data as compact binary codes, is one of the most powerful solutions and has been widely deployed to approximate large-scale similarity search. Recent advances in neural network architectures have proven effective at learning better hash functions. Most of these methods encode explicit features, such as categorical labels. Due to the special nature of textual data, previous semantic text hashing approaches do not utilize pairwise label information, even though pairwise labels reflect similarity more intuitively than categorical labels. In this paper, we propose a supervised semantic text hashing method that utilizes pairwise label information. Experimental results on three public datasets show that our method exploits pairwise label information well enough to outperform previous state-of-the-art hashing approaches.