The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests
Authors:
Björn W. Schuller,
Anton Batliner,
Shahin Amiriparian,
Alexander Barnhill,
Maurice Gerczuk,
Andreas Triantafyllopoulos,
Alice Baird,
Panagiotis Tzirakis,
Chris Gagne,
Alan S. Cowen,
Nikola Lackovic,
Marie-José Caraty,
Claude Montacié
Abstract:
The ACM Multimedia 2023 Computational Paralinguistics Challenge addresses two different problems for the first time in a research competition under well-defined conditions: in the Emotion Share Sub-Challenge, a regression on speech has to be performed; and in the Requests Sub-Challenge, requests and complaints need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComPaRE features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectRum toolkit; in addition, wav2vec2 models are used.
Submitted 1 May, 2023; v1 submitted 28 April, 2023;
originally announced April 2023.
Prediction of User Request and Complaint in Spoken Customer-Agent Conversations
Authors:
Nikola Lackovic,
Claude Montacié,
Gauthier Lalande,
Marie-José Caraty
Abstract:
We present the HealthCall corpus, recorded in real-life conditions in the call center of Malakoff Humanis. It includes two separate audio channels, the first for the customer and the second for the agent. Each conversation was anonymized in compliance with the General Data Protection Regulation. The corpus includes a transcription of the spoken conversations and was divided into two sets: Train and Devel. Two important customer relationship management tasks were assessed on the HealthCall corpus: automatic prediction of the type of user request and detection of complaints. For this purpose, we investigated 14 feature sets: 6 linguistic feature sets, 6 audio feature sets, and 2 vocal interaction feature sets. We used Bidirectional Encoder Representations from Transformers models for the linguistic features, and openSMILE and Wav2Vec 2.0 for the audio features. The vocal interaction feature sets were designed and developed from Turn Takings. The results show that the linguistic features always give the best results (91.2% for the Request task and 70.3% for the Complaint task). The Wav2Vec 2.0 features seem more suitable for these two tasks than the ComPaRe16 features. Vocal interaction features outperformed ComPaRe16 features on the Complaint task, achieving a 57% rate with only six features.
Submitted 27 July, 2022;
originally announced August 2022.