Getting Over the Boring Stuff Quicker - Building a Semi-Automated Speech Audio Annotation Tool

Sat September 05, 11:35 AM–12:00 PM

Developing a new deep learning model requires a large amount of data to be collected and annotated. While data collection can be expedited by using publicly available data, annotating and labelling the large amounts of data needed to train a high-accuracy model can be time-consuming.

Annotation tools for audio data, especially speech data, are currently very limited. This talk explores the development of a tool that takes a novel ‘semi-automated’ approach to speech audio annotation. Built as a modular system with a graphical interface, the tool streamlines the normally monotonous process of manual annotation. It combines manual human annotation with automated annotation that leverages a mixture of technologies, including pre-trained models, existing speech-recognition APIs, and model training-inference loops.
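As a rough illustration of what combining manual and automated annotation could look like, the sketch below routes each clip through an automated annotator and falls back to a human only when the automated result is low-confidence. This is not the tool presented in the talk: the `Clip` class, the `annotate` function, the 0.9 threshold, and the stub annotators are all hypothetical names and values chosen for the example, and a real system would plug a pre-trained model or speech-recognition API in place of the stubs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical types for this sketch (not taken from the talk):
# an automated annotator returns (transcript, confidence in [0, 1]);
# a human annotator receives the clip id and the suggested transcript.
AutoAnnotator = Callable[[str], Tuple[str, float]]
HumanAnnotator = Callable[[str, str], str]


@dataclass
class Clip:
    clip_id: str
    transcript: str
    source: str  # "auto" if accepted automatically, "human" if reviewed


def annotate(
    clip_ids: Iterable[str],
    auto: AutoAnnotator,
    human: HumanAnnotator,
    threshold: float = 0.9,  # assumed cut-off, to be tuned per dataset
) -> List[Clip]:
    """Accept confident automated labels; send the rest to a human."""
    results = []
    for clip_id in clip_ids:
        suggestion, confidence = auto(clip_id)
        if confidence >= threshold:
            results.append(Clip(clip_id, suggestion, "auto"))
        else:
            # The human sees the suggestion, so they correct rather than retype.
            results.append(Clip(clip_id, human(clip_id, suggestion), "human"))
    return results


# Stub annotators standing in for a real model/API and a real review UI.
def fake_auto(clip_id: str) -> Tuple[str, float]:
    canned = {"a.wav": ("hello world", 0.97), "b.wav": ("helo wurld", 0.42)}
    return canned[clip_id]


def fake_human(clip_id: str, suggestion: str) -> str:
    return "hello world"  # pretend the reviewer fixed the transcript


labelled = annotate(["a.wav", "b.wav"], fake_auto, fake_human)
```

With these stubs, `a.wav` is accepted automatically and `b.wav` goes to the reviewer. In a training-inference loop, the human-corrected clips could then be fed back to retrain the automated annotator, so fewer clips fall below the threshold over time.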

The talk will discuss the concepts and building blocks of such a semi-automated pipeline for data annotation. A live demo of the annotation interface will be shown.

Xin Liang (she/her)

Xin is a machine learning engineer at Eliiza with considerable experience in both software engineering and machine learning. She has worked with technologies such as deep learning, computer vision, natural language processing, signal processing, web and product development, and cloud engineering. Passionate about artificial intelligence, she focuses on using her engineering skills to develop, build, productionise and scale machine learning solutions.