Inclusive yet Selective: Supervised Distributional Hypernymy Detection

Author

Roller, Stephen and Erk, Katrin and Boleda, Gemma

Conference

Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Year

2014

Figures & Tables

Table 1: Mean Average Precision for the unsupervised measures on three spaces.
Table 2: Average accuracy of Concat and Diff on BLESS and ENTAILMENT using different spaces for feature generation.
Table 3: Mean Average Precision for the unsupervised measures after selecting the top dimensions from a supervised model.
Figure 1: Distributions of relata invCL scores for the U+W2, U+Sent, and TypeDM spaces for each of the semantic relations, after per-concept z-normalization.
Figure 2: Distributions of relata scores across concepts using the cosine, ClarkeDE, and invCL measures (after per-concept z-normalization). Here we use the selected dimensions of the U+W2 proj space.

Table of Contents

  • Abstract
  • 1 Introduction
  • 2 Background
  • 3 Data
    • 3.1 Distributional Vector Spaces
    • 3.2 Evaluation Data Sets
  • 4 Distributional Inclusion across Spaces
  • 5 Supervised Hypernymy Detection
    • 5.1 Models, Features, and Method
    • 5.2 Results
  • 6 Selective Distributional Inclusion
    • 6.1 Experiment
  • 7 Conclusion
  • Acknowledgements
  • References

References

  • Marco Baroni and Alessandro Lenci. 2011. How we BLESSed distributional semantic evaluation. In Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics, pages 1–10, Edinburgh, UK, July. Association for Computational Linguistics.
  • Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The WaCky wide web: A collection of very large linguistically processed web-crawled corpora. Language Resources and Evaluation, 43(3):209–226.
  • Marco Baroni, Raffaella Bernardi, Ngoc-Quynh Do, and Chung-chieh Shan. 2012. Entailment above the word level in distributional semantics. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, pages 23–32, Avignon, France, April. Association for Computational Linguistics.
  • Matthew Berland and Eugene Charniak. 1999. Finding parts in very large corpora. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 57–64, College Park, Maryland, USA, June. Association for Computational Linguistics.
  • Paul Buitelaar, Philipp Cimiano, and Bernardo Magnini. 2005. Ontology Learning from Text: Methods, Evaluation and Applications. Frontiers in Artificial Intelligence and Applications Series. IOS Press, Amsterdam.
  • Timothy Chklovski and Patrick Pantel. 2004. VerbOcean: Mining the web for fine-grained semantic verb relations. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pages 33–40.
  • Philipp Cimiano, Aleksander Pivk, Lars Schmidt-Thieme, and Steffen Staab. 2005. Learning taxonomic relations from heterogeneous sources of evidence. Ontology Learning from Text: Methods, Evaluation and Applications.
  • Daoud Clarke. 2009. Context-theoretic semantics for natural language: an overview. In Proceedings of the Workshop on Geometrical Models of Natural Language Semantics, pages 112–119, Athens, Greece, March. Association for Computational Linguistics.
  • Maayan Geffet and Ido Dagan. 2004. Feature vector quality and distributional similarity. In Proceedings of the 20th International Conference on Computational Linguistics, page 247. Association for Computational Linguistics.
  • Roxana Girju, Adriana Badulescu, and Dan Moldovan. 2003. Learning semantic constraints for the automatic discovery of part-whole relations. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1, pages 1–8. Association for Computational Linguistics.
  • Roxana Girju, Adriana Badulescu, and Dan Moldovan. 2006. Automatic discovery of part-whole relations. Computational Linguistics, 32(1):83–135.
  • Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th Conference on Computational Linguistics, pages 539–545, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Aurélie Herbelot and Mohan Ganesalingam. 2013. Measuring semantic content in distributional vectors. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 440–445, Sofia, Bulgaria, August. Association for Computational Linguistics.
  • Lili Kotlerman, Ido Dagan, Idan Szpektor, and Maayan Zhitomirsky-Geffet. 2010. Directional distributional similarity for lexical inference. Natural Language Engineering, 16:359–389, October.
  • Alessandro Lenci and Giulia Benotto. 2012. Identifying hypernyms in distributional semantic spaces. In *SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), pages 75–79, Montréal, Canada, 7-8 June. Association for Computational Linguistics.
  • Alessandro Lenci. 2008. Distributional approaches in linguistic and cognitive research. Italian Journal of Linguistics, 20(1):1–31.
  • Dekang Lin, Shaojun Zhao, Lijuan Qin, and Ming Zhou. 2003. Identifying synonyms among distributionally similar words. In Proceedings of the 18th International Joint Conference on Artificial Intelligence, pages 1492–1493.
  • Dekang Lin. 1998. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, volume 98, pages 296–304.
  • Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013. Linguistic regularities in continuous space word representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 746–751, Atlanta, Georgia, June. Association for Computational Linguistics.
  • Gregory L. Murphy. 2002. The Big Book of Concepts. MIT Press, Boston, MA.
  • Patrick Pantel and Marco Pennacchiotti. 2006. Espresso: Leveraging generic patterns for automatically harvesting semantic relations. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics.
  • Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
  • Enrico Santus. 2013. SLQS: An entropy measure. Master’s thesis, University of Pisa.
  • Rion Snow, Daniel Jurafsky, and Andrew Y. Ng. 2005. Learning syntactic patterns for automatic hypernym discovery. In Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors, Advances in Neural Information Processing Systems 17, pages 1297–1304, Cambridge, MA. MIT Press.
  • Rion Snow, Daniel Jurafsky, and Andrew Y. Ng. 2006. Semantic taxonomy induction from heterogenous evidence. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, ACL-44, pages 801–808, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Peter Turney and Patrick Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37:141–188.
  • Peter D. Turney. 2006. Similarity of semantic relations. Computational Linguistics, 32(3):379–416.
  • Julie Weeds and David Weir. 2003. A general framework for distributional similarity. In Michael Collins and Mark Steedman, editors, Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pages 81–88.
  • Julie Weeds, David Weir, and Diana McCarthy. 2004. Characterising measures of lexical distributional similarity. In Proceedings of the 20th International Conference on Computational Linguistics, pages 1015–1021, Geneva, Switzerland, Aug 23–Aug 27. Association for Computational Linguistics, COLING.
  • Maayan Zhitomirsky-Geffet and Ido Dagan. 2005. The distributional inclusion hypotheses and lexical entailment. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pages 107–114, Ann Arbor, Michigan, June. Association for Computational Linguistics.
  • Maayan Zhitomirsky-Geffet and Ido Dagan. 2009. Bootstrapping distributional feature vector quality. Computational Linguistics, 35(3):435–461.