ACL 2020 paper (originated from R250 Project)

Automated speech grader

ABOUT THE PROJECT

This work was presented at ACL 2020.

This project automatically predicts CEFR grades of L2 English speakers from ASR transcriptions of a business English exam. The motivation for working in an audio-free scenario is to simulate smart speaker systems: the APIs provided by smart speakers (such as Amazon Alexa and Google Home) expose only transcriptions to third parties, for privacy reasons.

The two key experiments performed in the paper are:

  • a comparison of neural models (RNNs and transformers) against a linear regression model with manually engineered features
  • an investigation into whether training with auxiliary sequence tagging objectives improves neural graders

The auxiliary objectives investigated in this work were:

  • Part-of-speech tags (Penn Treebank)
  • Grammatical relation to head token (Universal Dependencies)
  • Language modelling (next/previous token prediction, masked language modelling)
  • Native language prediction
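As a hedged illustration of how a sequence tagging auxiliary objective can be attached to a neural grader, the sketch below shares a BiLSTM encoder between a sentence-level grade regressor and a token-level tagging head (e.g. POS). All names and dimensions here are illustrative assumptions, not the paper's actual architecture or hyperparameters:

```python
import torch
import torch.nn as nn

class MultiTaskGrader(nn.Module):
    """Toy grader: a shared BiLSTM encoder feeds both a sentence-level
    grade regressor and a token-level tagging head (the auxiliary task)."""

    def __init__(self, vocab_size, num_tags, emb_dim=32, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                               bidirectional=True)
        self.grade_head = nn.Linear(2 * hid_dim, 1)       # grade (regression)
        self.tag_head = nn.Linear(2 * hid_dim, num_tags)  # auxiliary tagging

    def forward(self, tokens):
        hidden, _ = self.encoder(self.embed(tokens))      # (B, T, 2H)
        grade = self.grade_head(hidden.mean(dim=1)).squeeze(-1)  # (B,)
        tag_logits = self.tag_head(hidden)                # (B, T, num_tags)
        return grade, tag_logits

model = MultiTaskGrader(vocab_size=100, num_tags=17)
tokens = torch.randint(0, 100, (2, 5))   # batch of 2 sequences, length 5
grade, tag_logits = model(tokens)

# Joint loss: regression loss on the grade plus tagging loss on the tokens.
tag_targets = torch.randint(0, 17, (2, 5))
loss = (nn.functional.mse_loss(grade, torch.tensor([3.5, 4.0]))
        + nn.functional.cross_entropy(tag_logits.reshape(-1, 17),
                                      tag_targets.reshape(-1)))
```

The auxiliary head is discarded at inference time; it exists only to shape the shared encoder's representations during training.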

The paper also analyses the impact of filled pauses and of ASR word error rate on the performance of a speech grader.
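Word error rate, mentioned above, is the token-level Levenshtein distance between the ASR hypothesis and a reference transcription, divided by the reference length. A minimal self-contained sketch (the function name and inputs are illustrative):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / len(reference),
    computed via token-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[-1][-1] / len(ref)

# Two words ("on", "the") deleted from a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sat mat")
```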

The speech graders were built using Python, PyTorch and the HuggingFace Transformers library.

