In an effort to make Wikipedia more accessible to visually impaired users, the online encyclopedia has teamed up with researchers from Sweden’s KTH Royal Institute of Technology to develop the world's first crowdsourced speech engine.
The speech synthesis platform will be optimized for Wikipedia, but it will also be available as open source software for any site running MediaWiki, the open source wiki package.
The Wikispeech pilot project will initially be developed in English, Swedish and Arabic and is set to be completed by September 2017, after which the service will be extended to the rest of the 280 languages in which Wikipedia is currently available.
According to Joakim Gustafson, professor of speech technology at KTH, the initial focus will be on Swedish, followed by a rudimentary English voice, which the team believes will be expedient given the large amount of open source linguistic resources available. Lastly, they plan to build a basic Arabic voice, which will serve more as a proof of concept.
Like the online encyclopedia itself, the speech engine will be crowdsourced. Researchers at KTH will rely on the online community’s contributions to the platform's development. The content generated will be open and freely licensed to everyone, in accordance with the rules of Wikimedia Commons.
Wikispeech is a partnership between KTH Royal Institute of Technology, the Swedish Post and Telecom Authority, Wikimedia Sweden and STTS speech technology services. Jonas Beskow, professor of speech communication at KTH, and Zofia Malisz will head the project.
According to the Swedish telecommunications regulator PTS, which is funding the project, about 25 percent of Wikipedia users, or roughly 125 million people per month, prefer to access its content in spoken form.
According to Gustafson, an open source module will be created so that any open source speech synthesizer can be plugged in. Because the framework is open, it will be easy to add or swap out individual modules in the text-to-speech (TTS) system. The open source TTS functionality could be used by anybody for any purpose, not just reading web pages, he added.
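The plug-in idea Gustafson describes can be pictured with a small sketch. This is not Wikispeech's actual code: the names `register`, `speak`, `dummy_backend` and `shouting_backend` are hypothetical, and the stand-in backends return encoded text rather than real audio. It only illustrates how a caller can stay unchanged while the synthesizer behind it is swapped.

```python
from typing import Callable, Dict

# Assumption for illustration: a synthesizer backend is any function
# that takes text and returns audio as bytes.
Synthesizer = Callable[[str], bytes]

_BACKENDS: Dict[str, Synthesizer] = {}


def register(name: str, backend: Synthesizer) -> None:
    """Make a synthesizer backend available under a short name."""
    _BACKENDS[name] = backend


def speak(text: str, backend: str) -> bytes:
    """Render text with whichever registered backend the caller selects."""
    try:
        synth = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"no synthesizer registered as {backend!r}") from None
    return synth(text)


# Two stand-in backends (hypothetical). A real one would wrap an actual
# speech synthesizer and return audio data.
def dummy_backend(text: str) -> bytes:
    return text.encode("utf-8")


def shouting_backend(text: str) -> bytes:
    return text.upper().encode("utf-8")


register("dummy", dummy_backend)
register("shouting", shouting_backend)
```

Switching from `speak("hej", "dummy")` to `speak("hej", "shouting")` changes the engine without touching the calling code, which is the property an open, modular framework is meant to provide.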
The researchers also want to look into the possibility of letting users record how a word should actually be pronounced and then having the transcription corrected automatically. Initially, corrections to the dictionary will have to be made using phonetic transcription (IPA), Gustafson said.
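A crowdsourced pronunciation dictionary of the kind described here could, in its simplest form, look like the sketch below. The `Lexicon` class, its methods and the sample IPA strings are assumptions made for illustration, not part of the Wikispeech project; the sketch shows only the manual step of overriding an entry with an IPA transcription.

```python
class Lexicon:
    """A toy pronunciation dictionary mapping words to IPA strings."""

    def __init__(self, entries):
        # Store keys in lower case so lookups ignore capitalization.
        self._entries = {word.lower(): ipa for word, ipa in entries.items()}

    def lookup(self, word):
        """Return the stored IPA transcription, or None if the word is unknown."""
        return self._entries.get(word.lower())

    def correct(self, word, ipa):
        """Replace (or add) the transcription for a word with a user-supplied one."""
        ipa = ipa.strip()
        if not ipa:
            raise ValueError("an IPA transcription is required")
        self._entries[word.lower()] = ipa


def transcribe(text, lexicon):
    """Look up each whitespace-separated word; unknown words are marked <word>."""
    return [lexicon.lookup(word) or f"<{word}>" for word in text.split()]


# Illustrative data only: the first transcription is deliberately wrong.
lexicon = Lexicon({"Göteborg": "ɡøːtebɔrɡ"})
lexicon.correct("Göteborg", "jœtɛˈbɔrj")  # a contributor supplies the fix
```

After the correction, every later lookup of the word returns the contributor's transcription, so a single fix benefits all readers of that language.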
The aim is eventually to enable Wikipedia to deliver its content in spoken form to users in all of the languages in which it is available.