Speech synthesis and recognition technologies


“Without… screen & keyboard!”

On Wednesday, January 20th, the SciFY team organized a seminar dedicated to speech synthesis and recognition technologies at the INNOVATHENS Innovation and Entrepreneurship Hub of the Technopolis of the Municipality of Athens.

Almost 100 people watched Emilios Halamandaris, co-founder and director of INNOETICS, and Nasos Katsamanis, researcher at the Athena R.C. and co-founder of Beenotes, analyze the technologies of speech synthesis (text-to-speech) and speech recognition (speech-to-text).

Vassilis Salapatas, one of the co-founders of SciFY, opened the event by presenting the new, improved version of the innovative application ICSee, funded by the Latsis Foundation. In particular, he said, “With ICSee (I Can See), people with limited vision can read small text, such as a restaurant menu or the price on a receipt, which would otherwise be very difficult or even impossible for them to read. Thus, their daily life improves significantly.” We suggest that those of you with Android devices (smartphone or tablet) download the application free of charge from Google Play.

“Look, I don’t need a screen!”

Then, Emilios Halamandaris took the floor. He talked about the approaches to imitating speech production (parametric, signal-based voice imitation, and hybrid), as well as the components used in speech synthesis, such as text normalization and unit selection. Additionally, he spoke of the areas of everyday life in which this technology is applied. Examples include education (audiobooks, talking dolls), GPS navigation, mass media announcements, human-machine interfaces, as well as the preservation of languages that are at risk of disappearing!
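To get a feel for what text-to-speech looks like in practice, here is a minimal sketch using the open-source pyttsx3 library, which drives the speech synthesis engines already installed on most operating systems. This library and the sample sentence are our own choices for illustration; they were not part of the presentation.

```python
# Minimal text-to-speech sketch using the pyttsx3 library (an illustrative choice,
# not one of the systems discussed in the talk).
import pyttsx3

engine = pyttsx3.init()            # use the platform's default speech synthesis engine
engine.setProperty("rate", 150)    # speaking rate, in words per minute
engine.say("Welcome to the SciFY Academy seminar on speech technologies.")
engine.runAndWait()                # block until the utterance has been spoken
```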

“Look, I don’t need a keyboard!”

In the second part of the event, Nasos Katsamanis analyzed speech recognition technology, that is, the conversion of spoken words into text. Characteristic examples of the use of this technology are the subtitling of videos on YouTube, Google voice input, and telephone monitoring. He then explained that building an application that uses speech recognition requires data, a list of phonemes, acoustic models, and speech recognition toolkits such as Kaldi and CMU Sphinx.
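As a rough illustration of the speech-to-text side, the sketch below uses the Python SpeechRecognition package with its CMU Sphinx (pocketsphinx) backend for offline decoding. The package choice and the audio file name are our own assumptions, not something shown at the event.

```python
# Minimal speech-to-text sketch with the SpeechRecognition package and its
# CMU Sphinx (pocketsphinx) backend. "notes.wav" is a hypothetical recording.
# Install with: pip install SpeechRecognition pocketsphinx
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("notes.wav") as source:   # a mono WAV recording works well
    audio = recognizer.record(source)       # read the whole file into memory

try:
    text = recognizer.recognize_sphinx(audio)   # offline decoding with CMU Sphinx
    print("Transcription:", text)
except sr.UnknownValueError:
    print("Sphinx could not understand the audio")
except sr.RequestError as e:
    print("Sphinx error:", e)
```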

Additionally, we were informed that in some cases it is good when… “the walls have ears”. More specifically, reference was made to how fellow citizens with some form of disability can operate the electrical appliances in their homes with simple voice commands. The project is called DIRHA.

Enjoy the relevant video: “Home sweet home… Listen!”

And finally, for those wondering about the relationship between beekeeping and the above: at SciFY Academy, we learned that beekeepers use speech recognition technology to take voice notes when taking inventory of their bees, since their protective clothing prevents them from writing notes!
In closing, we’d like to thank the guest speakers Emilios Halamandaris and Nasos Katsamanis, the Latsis Foundation for the funding that allowed us to upgrade the ICSee application, as well as all of you for attending the event.
See you soon! Download Mr. Halamandaris’s presentation here. Download Mr. Katsamanis’s presentation here. You can watch the full 7th SciFY Academy event here.
To stay informed about future SciFY Academy events, you can register here.