Retrofitting Word Vectors to Semantic Lexicons

Author

Faruqui, Manaal and Dodge, Jesse and Jauhar, Sujay Kumar and Dyer, Chris and Hovy, Eduard and Smith, Noah A.

Conference

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Year

2015

Figures & Tables

Table 5: Spearman’s correlation for word similarity evaluation using the original and retrofitted SG vectors.

Figure 2: Spearman’s correlation on the MEN word similarity task, before and after retrofitting.

Figure 1: Word graph with edges between related words showing the observed (grey) and the inferred (white) word vector representations.

Table 3: Absolute performance changes for including PPDB information while training LBL vectors. Spearman’s correlation (3 left columns) and accuracy (3 right columns) on different tasks. Bold indicates greatest improvement.

Table 4: Comparison of retrofitting for semantic enrichment against Yu and Dredze (2014), Xu et al. (2014). Spearman’s correlation (3 left columns) and accuracy (3 right columns) on different tasks.

Table 1: Approximate size of the graphs obtained from different lexicons.

Figure 3: Two-dimensional PCA projections of 100-dimensional SG vector pairs holding the “adjective to adverb” relation, before (left) and after (right) retrofitting.

Table 2: Absolute performance changes with retrofitting. Spearman’s correlation (3 left columns) and accuracy (3 right columns) on different tasks. Higher scores are always better. Bold indicates greatest improvement for a vector type.

Abstract
1 Introduction
2 Retrofitting with Semantic Lexicons
3 Word Vector Representations
4 Semantic Lexicons
5 Evaluation Benchmarks
6 Experiments
- 6.1 Retrofitting
- 6.2 Semantic Lexicons during Learning
- 6.3 Comparisons to Prior Work
- 6.4 Multilingual Evaluation
7 Further Analysis
8 Related Work
9 Conclusion
Acknowledgements
References

References

Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Paşca, and Aitor Soroa. 2009. A study on similarity and relatedness using distributional and WordNet-based approaches. In Proceedings of NAACL.
Andrei Alexandrescu and Katrin Kirchhoff. 2009. Graph-based learning for statistical machine translation. In Proceedings of NAACL.
Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of ACL.
Yoshua Bengio, Olivier Delalleau, and Nicolas Le Roux. 2006. Label propagation and quadratic criterion. In Semi-Supervised Learning.
Jiang Bian, Bin Gao, and Tie-Yan Liu. 2014. Knowledge-powered deep learning for word embedding. In Machine Learning and Knowledge Discovery in Databases.
Elia Bruni, Gemma Boleda, Marco Baroni, and Nam-Khanh Tran. 2012. Distributional semantics in technicolor. In Proceedings of ACL.
Bob Carpenter. 2008. Lazy sparse stochastic gradient descent for regularized multinomial logistic regression. Technical Report Alias-i Inc.
Kai-Wei Chang, Wen-tau Yih, and Christopher Meek. 2013. Multi-relational latent semantic analysis. In Proceedings of EMNLP.
Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of ICML.
Mark Culp and George Michailidis. 2008. Graph-based semisupervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Dipanjan Das and Slav Petrov. 2011. Unsupervised part-of-speech tagging with bilingual graph-based projections. In Proceedings of ACL.
Dipanjan Das and Noah A. Smith. 2011. Semisupervised frame-semantic parsing for unknown predicates. In Proceedings of ACL.
Gerard de Melo and Gerhard Weikum. 2009. Towards a universal wordnet by learning from combined evidence. In Proceedings of CIKM.
S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science.
John Duchi, Elad Hazan, and Yoram Singer. 2010. Adaptive subgradient methods for online learning and stochastic optimization. Technical Report UCB/EECS-2010-24, Mar.
Manaal Faruqui and Chris Dyer. 2014. Improving vector space word representations using multilingual correlation. In Proceedings of EACL.
Charles Fillmore, Christopher Johnson, and Miriam Petruck. 2003. International Journal of Lexicography.
Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman, and Eytan Ruppin. 2001. Placing search in context: the concept revisited. In Proceedings of WWW.
Daniel Fried and Kevin Duh. 2014. Incorporating both distributional and relational semantics in word representations. arXiv preprint arXiv:1412.4369.
Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2013. PPDB: The paraphrase database. In Proceedings of NAACL.
Andrew B. Goldberg and Xiaojin Zhu. 2006. Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. TextGraphs-1.
Jiang Guo, Wanxiang Che, Haifeng Wang, and Ting Liu. 2014. Revisiting embedding features for simple semi-supervised learning. In Proceedings of EMNLP.
Iryna Gurevych. 2005. Using the structure of a conceptual network in computing semantic relatedness. In Proceedings of IJCNLP.
Samer Hassan and Rada Mihalcea. 2009. Cross-lingual semantic relatedness using encyclopedic knowledge. In Proceedings of EMNLP.
Eric H Huang, Richard Socher, Christopher D Manning, and Andrew Y Ng. 2012. Improving word representations via global context and multiple word prototypes. In Proceedings of ACL.
Colette Joubarne and Diana Inkpen. 2011. Comparison of semantic similarity for different languages using the Google n-gram corpus and second-order co-occurrence measures. In Proceedings of CAAI.
Ross Kindermann and J. L. Snell. 1980. Markov Random Fields and Their Applications. AMS.
Emiel Krahmer, Sebastian van Erk, and André Verleg. 2003. Graph-based generation of referring expressions. Comput. Linguist.
Thomas K Landauer and Susan T. Dumais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review.
Joel Lang and Mirella Lapata. 2011. Unsupervised semantic role induction with graph partitioning. In Proceedings of EMNLP.
Omer Levy and Yoav Goldberg. 2014. Linguistic regularities in sparse and explicit word representations. In Proceedings of CoNLL.
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013b. Linguistic regularities in continuous space word representations. In Proceedings of NAACL.
George A Miller. 1995. WordNet: a lexical database for English. Communications of the ACM.
Andriy Mnih and Yee Whye Teh. 2012. A fast and simple algorithm for training neural probabilistic language models. In Proceedings of ICML.
Jerome L. Myers and Arnold D. Well. 1995. Research Design & Statistical Analysis. Routledge.
Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of EMNLP.
Herbert Rubenstein and John B. Goodenough. 1965. Contextual correlates of synonymy. Commun. ACM, 8(10):627–633, October.
Michael Schuhmacher and Simone Paolo Ponzetto. 2014. Knowledge-based graph document modeling. In Proceedings of WSDM.
Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of EMNLP.
Amarnag Subramanya, Slav Petrov, and Fernando Pereira. 2010. Efficient graph-based semi-supervised learning of structured tagging models. In Proceedings of EMNLP.
Partha Pratim Talukdar and Fernando Pereira. 2010. Experiments in graph-based semi-supervised learning methods for class-instance acquisition. In Proceedings of ACL.
Joseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Proceedings of ACL.
Peter D. Turney. 2006. Similarity of semantic relations. Comput. Linguist., 32(3):379–416, September.
Chang Xu, Yalong Bai, Jiang Bian, Bin Gao, Gang Wang, Xiaoguang Liu, and Tie-Yan Liu. 2014. RC-NET: A general framework for incorporating knowledge into word representations. In Proceedings of CIKM.
Wen-tau Yih, Geoffrey Zweig, and John C. Platt. 2012. Polarity inducing latent semantic analysis. In Proceedings of EMNLP.
Mo Yu and Mark Dredze. 2014. Improving lexical embeddings with semantic knowledge. In Proceedings of ACL.
Xiaojin Zhu. 2005. Semi-supervised Learning with Graphs. Ph.D. thesis, Pittsburgh, PA, USA. AAI3179046.
