Retrofitting Word Vectors to Semantic Lexicons

Author

Faruqui, Manaal and Dodge, Jesse and Jauhar, Sujay Kumar and Dyer, Chris and Hovy, Eduard and Smith, Noah A.

Conference

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Year

2015

Figures & Tables

Table 5: Spearman’s correlation for word similarity evaluation using the original and retrofitted SG vectors.
Figure 2: Spearman’s correlation on the MEN word similarity task, before and after retrofitting.
Figure 1: Word graph with edges between related words showing the observed (grey) and the inferred (white) word vector representations.
Table 3: Absolute performance changes for including PPDB information while training LBL vectors. Spearman’s correlation (3 left columns) and accuracy (3 right columns) on different tasks. Bold indicates greatest improvement.
Table 4: Comparison of retrofitting for semantic enrichment against Yu and Dredze (2014), Xu et al. (2014). Spearman’s correlation (3 left columns) and accuracy (3 right columns) on different tasks.
Table 1: Approximate size of the graphs obtained from different lexicons.
Figure 3: Two-dimensional PCA projections of 100-dimensional SG vector pairs holding the “adjective to adverb” relation, before (left) and after (right) retrofitting.
Table 2: Absolute performance changes with retrofitting. Spearman’s correlation (3 left columns) and accuracy (3 right columns) on different tasks. Higher scores are always better. Bold indicates greatest improvement for a vector type.

Table of Contents

  • Abstract
  • 1 Introduction
  • 2 Retrofitting with Semantic Lexicons
  • 3 Word Vector Representations
  • 4 Semantic Lexicons
  • 5 Evaluation Benchmarks
  • 6 Experiments
    • 6.1 Retrofitting
    • 6.2 Semantic Lexicons during Learning
    • 6.3 Comparisons to Prior Work
    • 6.4 Multilingual Evaluation
  • 7 Further Analysis
  • 8 Related Work
  • 9 Conclusion
  • Acknowledgements
  • References

References

  •   Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Paşca, and Aitor Soroa. 2009. A study on similarity and relatedness using distributional and wordnet-based approaches. In Proceedings of NAACL.
  •   Andrei Alexandrescu and Katrin Kirchhoff. 2009. Graph-based learning for statistical machine translation. In Proceedings of NAACL.
  •   Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The berkeley framenet project. In Proceedings of ACL.
  •   Yoshua Bengio, Olivier Delalleau, and Nicolas Le Roux. 2006. Label propagation and quadratic criterion. In Semi-Supervised Learning.
  •   Jiang Bian, Bin Gao, and Tie-Yan Liu. 2014. Knowledge-powered deep learning for word embedding. In Machine Learning and Knowledge Discovery in Databases.
  •   Elia Bruni, Gemma Boleda, Marco Baroni, and Nam-Khanh Tran. 2012. Distributional semantics in technicolor. In Proceedings of ACL.
  •   Bob Carpenter. 2008. Lazy sparse stochastic gradient descent for regularized multinomial logistic regression. Technical Report Alias-i Inc.
  •   Kai-Wei Chang, Wen-tau Yih, and Christopher Meek. 2013. Multi-relational latent semantic analysis. In Proceedings of EMNLP.
  •   Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of ICML.
  •   Mark Culp and George Michailidis. 2008. Graph-based semisupervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  •   Dipanjan Das and Slav Petrov. 2011. Unsupervised part-of-speech tagging with bilingual graph-based projections. In Proc. of ACL.
  •   Dipanjan Das and Noah A. Smith. 2011. Semisupervised frame-semantic parsing for unknown predicates. In Proc. of ACL.
  •   Gerard de Melo and Gerhard Weikum. 2009. Towards a universal wordnet by learning from combined evidence. In Proceedings of CIKM.
  •   S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science.
  •   John Duchi, Elad Hazan, and Yoram Singer. 2010. Adaptive subgradient methods for online learning and stochastic optimization. Technical Report UCB/EECS-2010-24, Mar.
  •   Manaal Faruqui and Chris Dyer. 2014. Improving vector space word representations using multilingual correlation. In Proceedings of EACL.
  •   Charles Fillmore, Christopher Johnson, and Miriam Petruck. 2003. International Journal of Lexicography.
  •   Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman, and Eytan Ruppin. 2001. Placing search in context: the concept revisited. In Proceedings of WWW.
  •   Daniel Fried and Kevin Duh. 2014. Incorporating both distributional and relational semantics in word representations. arXiv preprint arXiv:1412.4369.
  •   Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2013. PPDB: The paraphrase database. In Proceedings of NAACL.
  •   Andrew B. Goldberg and Xiaojin Zhu. 2006. Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. TextGraphs-1.
  •   Jiang Guo, Wanxiang Che, Haifeng Wang, and Ting Liu. 2014. Revisiting embedding features for simple semi-supervised learning. In Proceedings of EMNLP.
  •   Iryna Gurevych. 2005. Using the structure of a conceptual network in computing semantic relatedness. In Proceedings of IJCNLP.
  •   Samer Hassan and Rada Mihalcea. 2009. Cross-lingual semantic relatedness using encyclopedic knowledge. In Proc. of EMNLP.
  •   Eric H Huang, Richard Socher, Christopher D Manning, and Andrew Y Ng. 2012. Improving word representations via global context and multiple word prototypes. In Proceedings of ACL.
  •   Colette Joubarne and Diana Inkpen. 2011. Comparison of semantic similarity for different languages using the google n-gram corpus and second-order co-occurrence measures. In Proceedings of CAAI.
  •   Ross Kindermann and J. L. Snell. 1980. Markov Random Fields and Their Applications. AMS.
  •   Emiel Krahmer, Sebastian van Erk, and André Verleg. 2003. Graph-based generation of referring expressions. Comput. Linguist.
  •   Thomas K Landauer and Susan T. Dumais. 1997. A solution to plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological review.
  •   Joel Lang and Mirella Lapata. 2011. Unsupervised semantic role induction with graph partitioning. In Proceedings of EMNLP.
  •   Omer Levy and Yoav Goldberg. 2014. Linguistic regularities in sparse and explicit word representations. In Proceedings of CoNLL.
  •   Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  •   Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013b. Linguistic regularities in continuous space word representations. In Proceedings of NAACL.
  •   George A Miller. 1995. Wordnet: a lexical database for english. Communications of the ACM.
  •   Andriy Mnih and Yee Whye Teh. 2012. A fast and simple algorithm for training neural probabilistic language models. In Proceedings of ICML.
  •   Jerome L. Myers and Arnold D. Well. 1995. Research Design & Statistical Analysis. Routledge.
  •   Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global vectors for word representation. In Proceedings of EMNLP.
  •   Herbert Rubenstein and John B. Goodenough. 1965. Contextual correlates of synonymy. Commun. ACM, 8(10):627–633, October.
  •   Michael Schuhmacher and Simone Paolo Ponzetto. 2014. Knowledge-based graph document modeling. In Proceedings of WSDM.
  •   Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of EMNLP.
  •   Amarnag Subramanya, Slav Petrov, and Fernando Pereira. 2010. Efficient graph-based semi-supervised learning of structured tagging models. In Proceedings of EMNLP.
  •   Partha Pratim Talukdar and Fernando Pereira. 2010. Experiments in graph-based semi-supervised learning methods for class-instance acquisition. In Proceedings of ACL.
  •   Joseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Proc. of ACL.
  •   Peter D. Turney. 2006. Similarity of semantic relations. Comput. Linguist., 32(3):379–416, September.
  •   Chang Xu, Yalong Bai, Jiang Bian, Bin Gao, Gang Wang, Xiaoguang Liu, and Tie-Yan Liu. 2014. Rc-net: A general framework for incorporating knowledge into word representations. In Proceedings of CIKM.
  •   Wen-tau Yih, Geoffrey Zweig, and John C. Platt. 2012. Polarity inducing latent semantic analysis. In Proceedings of EMNLP.
  •   Mo Yu and Mark Dredze. 2014. Improving lexical embeddings with semantic knowledge. In ACL.
  •   Xiaojin Zhu. 2005. Semi-supervised Learning with Graphs. Ph.D. thesis, Pittsburgh, PA, USA. AAI3179046.