March 18th, 2010
Advancing digital security
U.S. applauds Czech researchers for voice recognition software
By Brandon Swanson, Staff Writer, The Prague Post
July 19th, 2006 issue

BRNO, SOUTH MORAVIA: A narrow, cramped room on the top floor of a former monastery is an unlikely place to find researchers working on technology that could help the U.S. government track down Osama bin Laden. But a team of professors and students has developed something with the potential to do just that.

A voice recognition program designed by a group at Brno Technical University (VUT) recently took top honors in a premier international evaluation by the U.S. National Institute of Standards and Technology (NIST). The group presented its design at a conference in Puerto Rico June 3. The program is able to determine the identity of a speaker based on complex computer analysis of the briefest audio sample.

"This is statistical modeling," says Pavel Matějka, a Ph.D. student who is the program's primary author. "Nobody sits here and guesses, 'Does this voice belong to this person?' "

The technology has wide-ranging applications. Imagine searching the Internet for a phrase uttered in anything from speeches to sitcoms. Imagine calling a customer service center and having the operator access your records as soon as you say, "Hello." Imagine listening to a telephone conversation between two people and knowing immediately if one is a wanted criminal. Voice-recognition programs like the one developed in Brno are advancing toward that goal.

NIST, historically an agency responsible for promoting general innovation, saw its focus shift to homeland security after the Sept. 11, 2001, terrorist attacks. It now deals with identity authentication and tracking down terrorists.

While such software might set off alarms for civil libertarians (voice recognition software has major surveillance implications), the Brno researchers say this technology has little to do with Big Brother.

Defense Ministry interest

The team gets most of its funding from VUT, but its developments have also garnered support from the Defense Ministry, which awards it grants for specific projects.
The team also works on language identification, age estimation and automated speech-to-text transcription. Matějka says he doesn't know if the ministry has a speaker identification program, but that may change.
"There are some clues that they may want this," he says.

Although the Defense Ministry did not give financial support to this particular voice recognition program, ministry spokeswoman Jindřiška Verešová recognizes its merits.

"It is obvious what a remarkable tool this system can be for providing security," she says. "If this system could prevent us from a single violent act or uncover a potential threat for the country, then its benefit would be undoubtedly highly valued."

There are four other teams in the Czech Republic working on speech recognition technologies, including researchers at the IBM branch in Prague who are working on speech recognition for cell phones and other devices.

NIST gave 38 international teams two audio samples, and the teams then had to determine whether both were spoken by the same person.

"This NIST evaluation is one of the top system evaluations because it has precise rules and everyone has the same data," Matějka says. "They just give you a bunch of data and you have to figure it out."

NIST upped the ante by giving audio samples of different quality, with different levels of feedback and in different languages. Only about two-thirds of the samples were in English.

"Someone's voice is determined by the physical shape of their vocal tract," Matějka says. "That doesn't change with the language."

During the evaluation, the Brno team's program, which relies on the acoustics created by the speaker's voice rather than, say, his cadence or inflection, used original code as well as pieces of code from other teams to get the most accurate results. The team also collaborated with researchers in South Africa and the Netherlands, and its software produced the most accurate results of the 38 teams attending the conference.

Technology's limits

The program is far from perfect, and on-the-fly voice recognition is still many years away; speech-to-text transcription, for example, still gets one out of every four words wrong.
For now, voice recognition software can only determine the identity of a speaker to a certain degree of accuracy, says Petr Schwarz, another member of the team. Researchers can alter the threshold of a search depending on how accurate that search needs to be.

"For example, you would want a high threshold to access bank records, but a low one for doing an archive search on a particular word," he says.

The team will have another evaluation this fall, but it will be without the program's main author, Matějka, who will get his doctorate and move on. Other researchers, like Schwarz, are eligible for graduation but say they have stayed with the team because of the excitement of the work. That's what keeps him there.

"Many times information is lost, but if you can save it and then access it, you could help hundreds of thousands of lives," Schwarz says.

Sylvie Dejmková contributed to this report.
Brandon Swanson can be reached at bswanson@praguepost.com
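
The threshold trade-off Schwarz describes can be illustrated with a short Python sketch. It is not the Brno team's code: the similarity scores, the two threshold values and the function names are all invented for illustration, and a real system would score speech with a statistical model rather than a hand-written list. The sketch only shows why a high threshold suits bank access (few impostors get through, but more genuine speakers are turned away) and a low one suits archive search (few genuine matches are missed, at the cost of more false hits).

```python
# Illustrative sketch only: scores and thresholds are invented,
# not taken from the Brno system or the NIST evaluation.

def accept(score: float, threshold: float) -> bool:
    """Return True when a similarity score clears the decision threshold."""
    return score >= threshold


def error_rates(genuine, impostor, threshold):
    """Return (false_reject_rate, false_accept_rate) at the given threshold."""
    false_rejects = sum(1 for s in genuine if not accept(s, threshold))
    false_accepts = sum(1 for s in impostor if accept(s, threshold))
    return false_rejects / len(genuine), false_accepts / len(impostor)


# Hypothetical scores: higher means "more likely the same speaker".
genuine_scores = [0.91, 0.84, 0.77, 0.62, 0.95]   # same-speaker trials
impostor_scores = [0.12, 0.35, 0.58, 0.66, 0.20]  # different-speaker trials

BANK_THRESHOLD = 0.80     # assumed strict setting for account access
ARCHIVE_THRESHOLD = 0.50  # assumed lenient setting for archive search

for name, threshold in [("bank", BANK_THRESHOLD), ("archive", ARCHIVE_THRESHOLD)]:
    frr, far = error_rates(genuine_scores, impostor_scores, threshold)
    print(f"{name}: threshold={threshold} false_reject={frr} false_accept={far}")
```

With these made-up numbers, the strict setting lets no impostor through but turns away two of the five genuine speakers, while the lenient setting accepts every genuine speaker and two of the five impostors.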