Specialising Word Vectors for Lexical Entailment

Authors

Vulić, Ivan and Mrkšić, Nikola

Conference

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Year

2018

Figures & Tables

Figure 3: Results on the graded LE task defined by HyperLex. Following Nickel and Kiela (2017), we report Spearman’s rank correlation scores on: a) the entire dataset (2,616 noun and verb pairs); and b) its noun subset (2,163 pairs). The summary table shows the performance of other well-known architectures on the full HyperLex dataset, compared with the best results achieved using LEAR specialisation.
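As a rough illustration of this evaluation protocol (not the authors' code), the sketch below computes Spearman's ρ between model scores and the gold HyperLex ratings using SciPy. The tab-separated file layout and the `le_score` scoring function are assumptions introduced here for illustration only.

```python
# Minimal sketch of graded LE evaluation on HyperLex via Spearman's rho.
# Assumptions (not from the paper): a tab-separated file with columns
# word1, word2, gold_score, and a user-supplied le_score(w1, w2) function.
from scipy.stats import spearmanr

def evaluate_hyperlex(pairs_path, le_score):
    gold, predicted = [], []
    with open(pairs_path, encoding="utf-8") as f:
        for line in f:
            w1, w2, score = line.rstrip("\n").split("\t")[:3]
            gold.append(float(score))
            predicted.append(le_score(w1, w2))
    rho, _ = spearmanr(predicted, gold)
    return rho
```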
Figure 2: Summary of the results on three different word-level LE subtasks: (a) directionality; (b) detection; (c) detection and directionality. Vertical bars denote the results obtained by different input word vector spaces which are post-processed/specialised by our LEAR specialisation model using three variants of the asymmetric distance (D1, D2, D3); see Section 2. Thick horizontal red lines refer to the best reported scores on each subtask for these datasets; the baseline scores are taken from Nguyen et al. (2017).
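For readers without the exact definitions from Section 2, the following sketch shows the general shape of such a scoring function: a symmetric cosine term combined with an asymmetric term based on L2 vector norms, reflecting the idea (also behind Table 1) that LEAR-specialised vectors encode concept generality in their norms. The specific asymmetric term below, a normalised norm difference, is an assumed stand-in for the paper's D1/D2/D3 variants, not their actual formulation.

```python
# Illustrative sketch only: combines symmetric cosine distance with an
# asymmetric term based on L2 vector norms. The asymmetric term here is an
# assumed stand-in for the D_1/D_2/D_3 variants defined in Section 2.
import numpy as np

def cosine_distance(x, y):
    return 1.0 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

def asymmetric_norm_term(x, y):
    # Normalised norm difference in [-1, 1]; under the convention assumed
    # here, the more general concept carries the larger norm.
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    return (nx - ny) / (nx + ny)

def le_score(x, y):
    # Lower score = stronger evidence that x entails (is a hyponym of) y;
    # directionality alone can be read off the sign of the asymmetric term.
    return cosine_distance(x, y) + asymmetric_norm_term(x, y)
```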
Table 2: Analysing the importance of the synergy in the full LEAR model on the final performance on WBLESS, BLESS, HyperLex-All (HL-A) and HyperLex-Nouns (HL-N). Input: FASTTEXT; D2.
Table 1: L2 norms for selected concepts from the WordNet hierarchy. Input: FASTTEXT; LEAR: D2.

Table of Contents

  • Abstract
  • 1 Introduction
  • 2 Methodology
    • 2.1 The ATTRACT-REPEL Framework
    • 2.2 LEAR: Encoding Lexical Entailment
  • 3 Experimental Setup
  • 4 Results and Discussion
    • 4.1 LE Directionality and Detection
    • 4.2 Graded Lexical Entailment
    • 4.3 Further Discussion
  • 5 Related Work
  • 6 Conclusion and Future Work
  • Acknowledgments
  • References

References

  • Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. Polyglot: Distributed word representations for multilingual NLP. In Proceedings of CoNLL, pages 183–192.
  • Marco Baroni, Raffaella Bernardi, Ngoc-Quynh Do, and Chung-chieh Shan. 2012. Entailment above the word level in distributional semantics. In Proceedings of EACL, pages 23–32.
  • Marco Baroni and Alessandro Lenci. 2011. How we BLESSed distributional semantic evaluation. In Proceedings of the GEMS 2011 Workshop, pages 1–10.
  • Richard Beckwith, Christiane Fellbaum, Derek Gross, and George A. Miller. 1991. WordNet: A lexical database organized on psycholinguistic principles. Lexical acquisition: Exploiting on-line resources to build a lexicon, pages 211–231.
  • Jiang Bian, Bin Gao, and Tie-Yan Liu. 2014. Knowledge-powered deep learning for word embedding. In Proceedings of ECML-PKDD, pages 132–148.
  • Or Biran and Kathleen McKeown. 2013. Classifying taxonomic relations between pairs of Wikipedia articles. In Proceedings of IJCNLP, pages 788–794.
  • Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the ACL, 5:135–146.
  • Samuel R. Bowman, Gabor Angeli, Christopher Potts,