Ruhr-Universität Bochum

Sprachwissenschaftliches Institut

Subjective Bias and Consistency in Human Evaluation of Natural Language Generation

Jacopo Amidei (The Open University, Milton Keynes), 26.01.2021, 16:00

The Natural Language Generation (NLG) community relies on shared evaluation techniques to understand progress in the field. In this talk, I will focus on the reliability of human evaluation studies in NLG. Based on an analysis of papers published between 2008 and 2018 in NLG-specific conferences and on an observational study, I will show some shortcomings of existing approaches to reporting the reliability of human intrinsic evaluation of NLG systems. Then, I will present a new proposal for reporting reliability based on correlation coefficients. Correlation coefficients can be used to measure the extent to which judges follow a systematic pattern in their assessments, even when their individual interpretations of the phenomena are not identical. Our proposal offers a new approach to measuring judges’ relative consistency, which provides insight into the trustworthiness of human judgements.
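
The distinction between agreement and relative consistency can be illustrated with a small sketch. It is not taken from the talk: the ratings are invented, the helper names (`average_ranks`, `pearson`, `spearman`, `exact_agreement`) are hypothetical, and the choice of Spearman's rank correlation is an assumption, since the abstract does not say which coefficients are proposed. The example shows two judges who rarely assign the same score, yet order the items almost identically.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def exact_agreement(x, y):
    """Fraction of items on which both judges gave the identical score."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Invented 1-5 quality ratings for eight generated texts (illustration only).
# Judge B is systematically more lenient than judge A.
judge_a = [1, 2, 2, 3, 4, 4, 5, 3]
judge_b = [2, 3, 3, 4, 5, 5, 5, 4]

print("exact agreement:", exact_agreement(judge_a, judge_b))
print("Spearman's rho: ", round(spearman(judge_a, judge_b), 3))
```

In this made-up data the judges give the same score on only one of eight items, so an agreement-based report would suggest low reliability, while the rank correlation stays close to 1 because judge B follows the same pattern as judge A with a consistent upward shift.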