Speech synthesis and recognition technologies

“Without… screen & keyboard!”

On Wednesday, January 20th, the SciFY team organized a seminar at the INNOVATHENS Innovation and Entrepreneurship Hub of the Technopolis of the Municipality of Athens, dedicated to speech synthesis and recognition technologies. Almost 100 people watched Emilios Chalamandaris, co-founder and director of INNOETICS, and Nasos Katsamanis, a researcher at Athena R.C. and co-founder of Beenotes, analyze the technologies of speech synthesis (text-to-speech) and speech recognition (speech-to-text).

Vassilis Salapatas, one of the co-founders of SciFY, opened the event by presenting the new, improved version of the innovative application ICSee, funded by the Latsis Foundation. In particular, he said, “With ICSee (I Can See), people with limited vision can read small text, such as a restaurant menu or the price on a receipt, which would otherwise be very difficult or even impossible for them to read. Thus, their daily life improves significantly.” We recommend that those with Android devices (smartphones or tablets) download the application from Google Play for free.

“Look, I don’t need a screen!”

Then, Emilios Chalamandaris took the floor. He discussed the main approaches to imitating speech production (parametric, imitation of the voice as a signal, and hybrid), as well as the components of a speech synthesis system, such as text normalization and unit selection. He also covered the areas of everyday life where this technology is applied. Examples include education (audio books, talking dolls), GPS navigation, mass media announcements, human-machine interfaces, as well as the preservation of languages that are in danger of disappearing! See his presentation here.
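To give a flavor of the text normalization step mentioned above, here is a minimal Python sketch; the abbreviation table and digit-by-digit number expansion are illustrative assumptions, not the rules INNOETICS actually uses.

```python
import re

# Hypothetical expansion tables; a real TTS front end uses much richer,
# language-specific rules and dictionaries.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "etc.": "et cetera"}
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

def spell_out_number(match):
    """Spell a number digit by digit, e.g. '42' -> 'four two'."""
    return " ".join(DIGIT_WORDS[int(d)] for d in match.group())

def normalize(text):
    """Expand abbreviations and digits so the synthesizer sees plain words."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    return re.sub(r"\d+", spell_out_number, text)

print(normalize("Dr. Smith lives at 42 Main St."))
# -> Doctor Smith lives at four two Main Street
```

After a stage like this, a unit-selection engine picks recorded speech fragments that best match the normalized word sequence and stitches them together into the final audio.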

“Look, I don’t need a keyboard!”

In the second part of the event, Nasos Katsamanis analyzed speech recognition technology, which converts spoken words into text. Characteristic examples of its use include subtitling videos on YouTube, Google Voice input, and telephone monitoring. He then explained that building an application that uses speech recognition requires data, a list of phonemes, acoustic models, and speech recognition toolkits such as Kaldi and CMU Sphinx.
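As a rough illustration of the kind of tooling he mentioned, the sketch below drives CMU Sphinx offline through the third-party Python package SpeechRecognition; the WAV file name is a placeholder, and this is only one of several ways to use the toolkit.

```python
import speech_recognition as sr  # pip install SpeechRecognition pocketsphinx

recognizer = sr.Recognizer()

# "voice_note.wav" is a placeholder; use any mono, 16 kHz WAV recording.
with sr.AudioFile("voice_note.wav") as source:
    audio = recognizer.record(source)  # read the whole file into memory

try:
    # Offline decoding with CMU Sphinx's bundled acoustic and language models.
    print("Transcript:", recognizer.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand the audio.")
except sr.RequestError as e:
    print("Sphinx error:", e)
```

Kaldi follows the same recipe at a lower level: instead of the bundled defaults used here, you supply your own data, phoneme list, and acoustic and language models.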

Additionally, we were informed that in some cases it is good when… “the walls have ears”. More specifically, reference was made to how our fellow citizens with some form of disability can operate the electrical devices in their homes with simple voice commands. The project is called DIRHA.
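DIRHA's real pipeline involves distant microphones, room acoustics, and multilingual models, but the final step, turning a recognized phrase into a device action, can be as simple as a lookup. The command set and device names below are invented purely for illustration.

```python
# Hypothetical mapping from recognized phrases to home-device actions;
# the commands and devices are made up for this sketch.
COMMANDS = {
    "turn on the lights": ("lights", "on"),
    "turn off the lights": ("lights", "off"),
    "open the blinds": ("blinds", "open"),
}

def handle_transcript(transcript):
    """Look up a recognized phrase and describe the matching device action."""
    action = COMMANDS.get(transcript.strip().lower())
    if action is None:
        return "Sorry, I did not understand that command."
    device, state = action
    return f"Setting {device} to {state}."

print(handle_transcript("Turn on the lights"))  # -> Setting lights to on.
print(handle_transcript("Play some music"))     # -> Sorry, I did not understand...
```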

Enjoy the relevant video: “Home sweet home… Listen!”

Finally, for those wondering how beekeeping relates to all of the above: at SciFY Academy we learned that beekeepers use speech recognition technology to take voice notes while taking inventory of their bees, since their protective clothing prevents them from writing notes by hand.

See his entire presentation here.

In closing, we’d like to thank the guest speakers, Emilios Chalamandaris and Nasos Katsamanis, and the Latsis Foundation for the funding that enabled us to upgrade the ICSee application. We also thank all of you for attending the event.
See you soon!

You can watch the whole 7th SciFY Academy event here.
To stay informed about future SciFY Academy events, you can register here.