Speech becomes the next frontier in search

'Audioclipping' promises to do for multimedia content what Google did for text on the world wide web, writes Derek Scally in Berlin

The search is on for the world's next Google and a hot new contender has entered the race. You could call it Google for the ears but its German developers call it "audioclipping", software that converts the spoken word into a searchable text index.

"This is a new search method with a multimedia context, be it podcast, radio, television. It makes all audio content available for search," says Dietmar Kneidel of Com Vision, the company developing the technology behind audioclipping.

Like so many other technological breakthroughs, audioclipping originated with a bunch of Com Vision's Trekkie techies and a heated discussion about being able to resolve disputes over Star Trek quotes by searching the dialogue of the science fiction programme.

Two years on, that initial idea has gone on the market as a fully fledged product, and its developers have their sights on the entire English-speaking world.

It works by putting an audio signal through a classic speech recognition program to convert it into a computer-readable file.

Software divides the speech into its smallest components - phones and phonemes. The audioclipping speech model then determines each word by comparing the phones in question with how often they occur statistically in particular sequences.

The process is optimised further by letting the program know what kind of audio is being scanned: entertainment or news and current affairs.
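As a rough illustration of the two ideas above, choosing a word from sequence statistics and telling the program what kind of audio it is hearing, here is a minimal Python sketch. It is not Com Vision's code: the lexicon, the word counts, the genre names and the function `pick_word` are all invented for the example.

```python
from collections import Counter

# Hypothetical pronunciation lexicon: one phone sequence can match
# several words that sound alike.
LEXICON = {
    ("p", "iy", "s"): ["peace", "piece"],
    ("n", "uw", "z"): ["news"],
}

# Hypothetical counts of (previous word, word) pairs, kept separately
# for each kind of audio. The numbers are made up for illustration.
BIGRAMS = {
    "news": Counter({("the", "peace"): 8, ("the", "piece"): 2}),
    "entertainment": Counter({("the", "piece"): 6, ("the", "peace"): 1}),
}


def pick_word(phones, previous_word, genre):
    """Return the candidate word seen most often after previous_word."""
    candidates = LEXICON.get(tuple(phones), [])
    if not candidates:
        return None  # unknown phone sequence
    counts = BIGRAMS[genre]
    # Counter returns 0 for pairs it has never seen.
    return max(candidates, key=lambda word: counts[(previous_word, word)])
```

In this toy model the same sounds after "the" come out as "peace" for news audio and "piece" for entertainment, which is the kind of gain the genre hint is meant to deliver.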

The technology behind Audioclipping became possible only once enough computing power became affordable, which has happened in the last two years. At the start, the program needed an hour to process 10 hours of audio material. Now it can process 40 hours in one hour with an accuracy of over 85 per cent.
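To put those figures in perspective, the short calculation below works out how long a full day of one station's output would take at each speed. The 24-hour workload is an assumption chosen for illustration; only the 10x and 40x rates come from the company's figures.

```python
def hours_to_process(audio_hours, speed_factor):
    """Processing time when speed_factor hours of audio are handled per hour."""
    return audio_hours / speed_factor


EARLY_SPEED = 10  # hours of audio per hour of computing, at the start
CURRENT_SPEED = 40  # hours of audio per hour of computing, now

early = hours_to_process(24, EARLY_SPEED)
current = hours_to_process(24, CURRENT_SPEED)
speedup = CURRENT_SPEED / EARLY_SPEED

print(f"A day of audio: {early * 60:.0f} minutes then, {current * 60:.0f} minutes now")
print(f"Speed-up: {speedup:.0f}x")
```

On these numbers, a day of broadcasting that once took almost two and a half hours to process now takes just over half an hour.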

Crucial to its success is Audioclipping's "universal ear", meaning it is apparently not bothered by accents or sound quality, a common problem with computer speech recognition programs.

Its accuracy is improving all the time thanks to the growing index.

It has already indexed the entire German dictionary and every day adds new words and new occurrences of existing words.

"Someone said that speech is a moving target. Two months ago nobody would have mentioned 'stroke' in connection with 'Ariel Sharon', now the whole world is talking about it, so that's saved in the archive," says Kneidel.

Audioclipping has just been launched in Germany and Com Vision has been overwhelmed by interview requests from television and radio stations. Often interviews end with a request for Com Vision to contact the media company's IT department to talk about buying the technology.

At the moment, the website offers a stripped-down version of the full service, allowing limited searches of radio stations. But for German business clients, Com Vision is offering trials of the full package, which produces contextual search results, with the original audio clip just a click away.
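How a search result can lead straight back to the audio can be sketched in a few lines of Python. This is an assumption about the general approach, not a description of Com Vision's system: the transcript format, the function names `build_index` and `search`, and the 30-second window are all hypothetical.

```python
from collections import defaultdict


def build_index(transcripts):
    """Map each word to the places it was spoken.

    transcripts: iterable of (source, [(word, start_seconds), ...]).
    """
    index = defaultdict(list)
    for source, words in transcripts:
        for word, start in words:
            index[word.lower()].append((source, start))
    return index


def search(index, terms, window=30.0):
    """Find where the first term is spoken with all other terms nearby.

    Returns (source, start_seconds) pairs, so a player could jump to
    that point in the original recording.
    """
    terms = [term.lower() for term in terms]
    if not terms:
        return []
    hits = []
    for source, start in index.get(terms[0], []):
        nearby = all(
            any(s == source and abs(t - start) <= window
                for s, t in index.get(term, []))
            for term in terms[1:]
        )
        if nearby:
            hits.append((source, start))
    return hits
```

Because every word is stored with its source and start time, a hit is less a block of text than a bookmark into the recording, which is what puts the clip a click away.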

"At the moment we are a garage firm the way Hewlett-Packard or Google once were. When it develops it would be cool to be like Google earning money through placements," says Kneidel.

Governments and companies spend huge sums of money each year on media monitoring teams who record what was said where, when and by whom.

Audioclipping could render that service obsolete with a lucrative Google-meets-LexisNexis subscription service for corporations, governments and PR companies.

"The existing media observation services tell their people where to watch and what to look for," says Kneidel. "In comparison to what other media observation services provide, we are cheap at the price and watch everything."

But the technology has potential beyond the business world too, and could bring a "daily me" television service into the home.

Viewers could choose the programmes they want to watch - a film, the sports results and news on a favourite politician - all drawn from the Audioclipping index.

Another department of Com Vision is already working on technology to do just that, steered by voice control.

It's early days for audioclipping but Com Vision says it wants to concentrate on technology and seek out partners to work on the business models and selling. And, one day, the company's busy employees still hope to get around to that Star Trek index.