Artificial intelligence is quickly becoming part of everyday technology. In Pakistan, the main barrier to using it has been language, but that may no longer be the case. While most voice assistants are programmed to recognize speech in English, a group of Pakistani researchers at ITU’s Center for Speech and Language Technologies (CSaLT) laboratory has found a way to change that.
Dr. Raza, an assistant professor at Information Technology University (ITU), Lahore, who holds a PhD in Language Technologies, has worked with his team to create a corpus (a database of recorded speech containing the specific sounds used in everyday language) that covers all the distinct sounds of Urdu.
Named the “CSaLT Phonetically Rich Urdu Speech Corpus”, it consists of about 70 minutes of transcribed speech: 708 sentences that contain 5,656 unique words and cover 63 phonemes. The corpus can be downloaded from the research center’s website.
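For readers who want to see what figures like these mean in practice, the following is a minimal Python sketch of how sentence, unique-word and phoneme-coverage counts could be computed for a transcribed corpus. It is not the CSaLT team's tooling. The function name `corpus_stats`, the romanized toy sentences and the nine-symbol toy inventory are all hypothetical placeholders; the real corpus uses its own transcription format and a 63-phoneme inventory.

```python
def corpus_stats(entries, inventory):
    """Summarize a transcribed corpus.

    entries: iterable of (sentence_text, list_of_phoneme_symbols) pairs.
    inventory: the set of phoneme symbols the corpus is meant to cover.
    """
    inventory = set(inventory)
    words = set()
    seen_phonemes = set()
    sentence_count = 0
    for text, phonemes in entries:
        sentence_count += 1
        words.update(text.split())       # whitespace-separated word forms
        seen_phonemes.update(phonemes)   # phonemes observed in this sentence
    return {
        "sentences": sentence_count,
        "unique_words": len(words),
        "phonemes_covered": len(seen_phonemes & inventory),
        "missing_phonemes": sorted(inventory - seen_phonemes),
    }


# Toy data only (hypothetical symbols, not taken from the real corpus).
toy_inventory = {"a", "p", "k", "s", "e", "h", "n", "m", "z"}
toy_entries = [
    ("aap kaise hain", ["a", "p", "k", "e", "s", "h", "n"]),
    ("aap ka naam", ["a", "p", "k", "n", "m"]),
]

stats = corpus_stats(toy_entries, toy_inventory)
print(stats)
```

In this toy run the "z" symbol is reported as missing, which is the kind of gap a phonetically rich corpus is designed to avoid: sentences are added until every phoneme in the inventory appears at least once.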
“Speech recognition is a two-step process. The corpus will give the computer application access to all possible phonemes used in the formation of meaningful Urdu words from everyday speech,” said Dr. Raza.
“We hope that release of this corpus will also prove beneficial for regional languages in the country and languages lacking ample linguistic resources all over the world. Those interested in working on those languages can follow our technique to develop similar corpora of sentences in those languages.”
With this corpus available, developers have a foundation for building Urdu speech recognition into voice assistants and other applications, which could let Urdu speakers use modern AI-powered technology in their own language.