A Scalable Video Search Engine Based on Audio Content Indexing and Topic Segmentation
Authors:
Julien Lawto,
Jean-Luc Gauvain,
Lori Lamel,
Gregory Grefenstete,
Guillaume Gravier,
Julien Despres,
Camille Guinaudeau,
Pascale Sébillot
Abstract:
One important class of online videos is that of news broadcasts. Most news organisations provide near-immediate access to topical news broadcasts over the Internet, through RSS streams or podcasts. Until lately, technology has not made it possible for a user to automatically go to the smaller parts, within a longer broadcast, that might interest them. Recent advances in both speech recognition sys…
▽ More
One important class of online videos is that of news broadcasts. Most news organisations provide near-immediate access to topical news broadcasts over the Internet, through RSS streams or podcasts. Until lately, technology has not made it possible for a user to automatically go to the smaller parts, within a longer broadcast, that might interest them. Recent advances in both speech recognition systems and natural language processing have led to a number of robust tools that allow us to provide users with quicker, more focussed access to relevant segments of one or more news broadcast videos. Here we present our new interface for browsing or searching news broadcasts (video/audio) that exploits these new language processing tools to (i) provide immediate access to topical passages within news broadcasts, (ii) browse news broadcasts by events as well as by people, places and organisations, (iii) perform cross lingual search of news broadcasts, (iv) search for news through a map interface, (v) browse news by trending topics, and (vi) see automatically-generated textual clues for news segments, before listening. Our publicly searchable demonstrator currently indexes daily broadcast news content from 50 sources in English, French, Chinese, Arabic, Spanish, Dutch and Russian.
△ Less
Submitted 27 November, 2011;
originally announced November 2011.
Utilisation de la linguistique en reconnaissance de la parole : un état de l'art
Authors:
Stéphane Huet,
Pascale Sébillot,
Guillaume Gravier
Abstract:
To transcribe speech, automatic speech recognition systems use statistical methods, particularly hidden Markov model and N-gram models. Although these techniques perform well and lead to efficient systems, they approach their maximum possibilities. It seems thus necessary, in order to outperform current results, to use additional information, especially bound to language. However, introducing su…
▽ More
To transcribe speech, automatic speech recognition systems use statistical methods, particularly hidden Markov model and N-gram models. Although these techniques perform well and lead to efficient systems, they approach their maximum possibilities. It seems thus necessary, in order to outperform current results, to use additional information, especially bound to language. However, introducing such knowledge must be realized taking into account specificities of spoken language (hesitations for example) and being robust to possible misrecognized words. This document presents a state of the art of these researches, evaluating the impact of the insertion of linguistic information on the quality of the transcription.
△ Less
Submitted 30 May, 2006;
originally announced May 2006.