Microsoft Research Collaborates With Wikipedia to Enhance Multilingual Content

REDMOND, Wash. — Oct. 18, 2010 — Microsoft Research today announced the launch of the beta version of WikiBhasha, a multilingual content creation tool for Wikipedia. The WikiBhasha tool enables contributors to Wikipedia to find content from other Wikipedia articles, translate the content into other languages, and then either compose new articles or enhance existing articles in multilingual Wikipedias. The WikiBhasha beta is available as an open source MediaWiki extension, under the Apache License 2.0 at http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/WikiBhasha, and as a user gadget in Wikipedia. The tool is also available as an installable bookmarklet at http://www.wikibhasha.org, which is hosted on the Windows Azure platform from Microsoft Corp. The name WikiBhasha derives from the well-known term “wiki,” denoting collaboration, and “bhasha,” which means “language” in Hindi and Sanskrit.

WikiBhasha will support content creation in more than 30 languages. The beta version of WikiBhasha will enable easy content creation in non-English Wikipedias by leveraging the large volume of English Wikipedia content as the source of information. Initially, the Wikimedia Foundation and Microsoft Research will work closely with the Wikipedia user communities, focusing on content creation in Arabic, German, Hindi, Japanese, Portuguese and Spanish.

“We’re always happy to see work on improving multilingual collaboration between wikis,” said Danese Cooper, CTO of the Wikimedia Foundation. “Microsoft Research is doing some interesting work with WikiBhasha, and we’re very pleased that it chose to share its client code in open source as well.”

By making it easier for the Wikipedia community to create multilingual content, Wikipedia and Microsoft Research hope to inspire a new wave of multilingual content creation.

“The WikiBhasha beta holds the promise of enabling easy creation of content in multiple languages, and also of generating a large body of parallel language data for researchers to work on to further machine translation technology,” said P. Anandan, managing director, Microsoft Research India. “Creating quality content in multiple languages can be greatly improved and accelerated with the active participation of the Wikipedia communities.”

The WikiBhasha beta is a browser-based tool that works on Wikipedia sites. It features an intuitive and simple user interface (UI) layer that stays on the target-language Wikipedia for the entire content creation process. This UI layer integrates content discovery with linguistic and collaborative services, focusing the user primarily on content creation in the target Wikipedia. A simple three-step process guides the user through discovering and sourcing content from English Wikipedia articles, composing a local-language Wikipedia article and publishing it in the target Wikipedia. Although a typical session enhances an existing target-language Wikipedia article, new articles can also be created through a similar process. The WikiBhasha beta currently works on Windows Internet Explorer (7.0 and 8.0) on Windows XP, Windows Vista and Windows 7, and on Firefox (3.5 or above) on Linux Fedora (11 and 12), Windows XP, Windows Vista and Windows 7.

WikiBhasha, which is supported by Microsoft’s Machine Translation system and Microsoft’s Collaborative Translations Framework, was conceptualized by the Multilingual Systems Group at Microsoft Research India. The Multilingual Systems Group explores multilingual and cross-language technologies that work seamlessly across languages, as well as the creation of resources to aid computational linguistics research. The tool was developed in collaboration with the Natural Language Processing Group at Microsoft Research Redmond.

A video tutorial that introduces users to the features of the WikiBhasha beta can be found at http://www.wikibhasha.org.

About Microsoft Research

Founded in 1991, Microsoft Research is dedicated to conducting both basic and applied research in computer science and software engineering. Researchers focus on more than 55 areas of computing and collaborate with leading academic, government and industry researchers to advance the state of the art. Microsoft Research has expanded over the years to eight locations worldwide and a number of collaborative projects that bring together the best minds in computer science to advance a research agenda based on their unique talents and interests.

Microsoft Research has locations in Redmond, Wash.; Cambridge, Mass.; Silicon Valley, Calif.; Cambridge, England; Beijing, China; and Bangalore, India, and also conducts research at the Cairo Microsoft Innovation Center in Egypt; European Microsoft Innovation Centre in Aachen, Germany; and the eXtreme Computing Group in Redmond. Microsoft Research collaborates openly with colleges and universities worldwide to enhance the teaching and learning experience, inspire technological innovation, and broadly advance the field of computer science. More information can be found at http://www.research.microsoft.com.

About Microsoft

Founded in 1975, Microsoft (Nasdaq “MSFT”) is the worldwide leader in software, services and solutions that help people and businesses realize their full potential.
