Wikipedia has plans to use generative AI.

The Wikimedia Foundation is the nonprofit organization that operates Wikipedia and other Wikimedia free knowledge projects. (Image – Shutterstock)

Managing content: How Wikipedia uses AI

  • Wikipedia volunteers have been using AI tools and bots to support their work since 2002. 
  • The Wikimedia Foundation submitted an experimental Wikipedia plugin to OpenAI, now available to ChatGPT Plus users.
  • The Wikimedia Foundation also creates, deploys, and maintains hundreds of machine-learning models serving various roles.

With content at its core, it would only make sense for Wikipedia, the world’s largest online encyclopedia, to leverage AI to generate additional content. Given the capabilities of generative AI, Wikipedia could implement tools to help users find content much faster. However, the organization is taking a cautious approach for several reasons.

Wikipedia has been around for more than two decades. The free online encyclopedia, created and edited by volunteers around the world, is hosted by the Wikimedia Foundation. Over the years, the content of Wikipedia has continued to grow. Today, the site is available in 334 languages and has information on almost every topic imaginable.

But when it comes to using AI, there are concerns about how data is collected and presented for some use cases. In an era of misinformation and disinformation, many view the use of AI, especially for fact-gathering, with skepticism. Apart from concerns about data sources, information collected by AI can also be outdated.

ChatGPT, for example, acknowledges that it does not have the latest information on some topics because its training data only extends to a cutoff date. Wikipedia’s volunteers, by contrast, manually update the site’s information on an ongoing basis.

For Wikipedia, AI would be a game changer. However, with challenges such as misinformation, disinformation, and fake news, among others, adequate testing is needed before such solutions can be deployed on the site.

To understand how Wikipedia plans to use AI and deal with the challenges that could arise from it, Tech Wire Asia spoke to Selena Deckelmann, Chief Product and Technology Officer at the Wikimedia Foundation.

TWA: As the world’s largest encyclopedia, what’s the biggest challenge facing Wikipedia today?  

Selena Deckelmann, Chief Product and Technology Officer at the Wikimedia Foundation.


Every day, Wikipedia is viewed by millions of people around the world seeking reliable information on topics that affect their lives and shape their decisions. In an age where anyone can generate a million blog posts, videos, or other low-quality content for practically no cost, the spread of misinformation and disinformation is a crucial issue affecting not just Wikipedia but the internet as a whole.

With the rise of generative AI, these concerns are increasing. That’s where Wikipedia’s human-centered approach to creating reliable, neutral content verified by secondary sources becomes even more valuable.

TWA: Given that Wikipedia is made up of primarily human-generated information, how does Wikipedia deal with issues of misinformation?

The global community of volunteers who contribute to Wikipedia serves as a vigilant first line of defense against misinformation on the platform.

Since the online encyclopedia was created in 2001, volunteers have developed processes and guidelines to ensure the information is as reliable as possible. 

For example, all content on Wikipedia must be verifiable and validated by reliable sources. The information must also be written from a neutral point of view. Part of what makes this system of content moderation work so well is that the site is radically transparent. The public can see every edit and change on a Wikipedia article’s page history, the article talk page where editors discuss changes to an article, and more.

Wikipedia administrators can take disciplinary action to address negative behavior on the site, including disinformation. For example, when a user account or IP address repeatedly violates policies, Wikipedia can block or ban a user or IP address. Violations can include repeated vandalism, ‘sock puppetry,’ undisclosed paid editing, or edit warring. 

Moreover, volunteer editors for different language Wikipedias can create specific guidelines and policies based on particular issues that they come across in their language editions.

While volunteers generally address misinformation, the Wikimedia Foundation also has a Trust and Safety function that investigates more significant issues of disinformation or harmful content that might appear on the site. 

In addition, the Foundation has a technology team that regularly works with volunteers on features designed to prevent misinformation. We also give grants to organizations to explore issues of misinformation and their impact, and we partner with global organizations to prevent misinformation from spreading.

Did you know that Wikipedia has been using AI since 2002?

Wikipedia has actually been using AI for some time. (Image – Shutterstock)

TWA: How much AI is Wikipedia using to fact-check and manage the content on the site? 

The use of AI on Wikipedia is not new. Volunteer contributors have used artificial intelligence tools and bots since 2002 to support their work. The Wikimedia Foundation has had a team dedicated to machine learning since 2017.

We believe that AI works best as an augmentation for the work that humans do on Wikipedia and other Wikimedia platforms. Our approach to AI is through closed-loop systems where humans can edit, improve, and audit the work done by AI.

For example, ClueBot NG, a bot created by volunteers that has been active for over a decade, detects vandalism on Wikipedia using a machine-learning model. The model is trained on human-labeled data: volunteers use an interface to manually label edits as vandalism or not. A training algorithm then uses that data to build the model, which flags new edits suspected of vandalism and reverts them.
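The human-in-the-loop pipeline described above — volunteers label edits, a model is trained on those labels, and new edits scoring above a threshold are flagged — can be sketched in miniature. ClueBot NG’s real implementation (a neural network over many edit features) is far more sophisticated; this toy version, with hypothetical function names and made-up training examples, simply scores edits by word frequencies learned from the labels.

```python
from collections import Counter

def train(labeled_edits):
    """labeled_edits: list of (text, is_vandalism) pairs from human reviewers."""
    vandal_words, good_words = Counter(), Counter()
    for text, is_vandalism in labeled_edits:
        # Count words separately for vandalism and good-faith edits.
        (vandal_words if is_vandalism else good_words).update(text.lower().split())
    return vandal_words, good_words

def score(model, text):
    """Return a vandalism score: positive means more vandal-like."""
    vandal_words, good_words = model
    return sum(vandal_words[w] - good_words[w] for w in text.lower().split())

# Toy volunteer-labeled training data (invented for illustration).
labels = [
    ("fixed citation formatting", False),
    ("added sourced population figures", False),
    ("lol this page is garbage", True),
    ("garbage garbage blanked the page lol", True),
]
model = train(labels)

# A new edit scoring above the threshold would be flagged for reversion.
flagged = score(model, "lol total garbage") > 0
```

In the real system, a flagged edit is reverted automatically only when the model’s confidence is high enough to keep false positives rare; borderline cases are left for human patrollers.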

The Wikimedia Foundation also creates, deploys, and maintains hundreds of machine learning models serving various roles, from anti-vandalism to improving the experiences of Wikipedia editors.

For example, the Content Translation Tool, developed by the Wikimedia Foundation in 2014, integrates with external machine translation models and has been used by volunteers to help translate more than 1.5 million Wikipedia articles.

TWA: Lastly, as an open-source platform, are there plans to integrate ChatGPT-like functions for searches and queries?

We recognize that AI represents an opportunity to scale the work of volunteers on Wikipedia and other Wikimedia projects.

We recently submitted an experimental Wikipedia plugin to OpenAI, now available for ChatGPT Plus users with access to plugins. The Wikipedia plugin allows ChatGPT to fetch and summarize the most recent, up-to-date information from Wikipedia for any general knowledge query while attributing and sharing links to the Wikipedia articles from where the data is sourced.
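The plugin’s internals are not public, but the kind of attributed lookup it performs can be sketched against Wikipedia’s public REST summary endpoint. The `summary_request` helper below is a hypothetical name of mine; it only builds the request URL and the attribution link back to the source article, without performing a network call.

```python
from urllib.parse import quote

# Wikipedia's public REST endpoint for short article summaries.
REST_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/"

def summary_request(title):
    """Return the summary-endpoint URL and a source link for attribution."""
    slug = quote(title.replace(" ", "_"), safe="")
    return {
        "fetch": REST_SUMMARY + slug,
        # Attribution link back to the article the data came from.
        "source": "https://en.wikipedia.org/wiki/" + slug,
    }

req = summary_request("Machine learning")
```

Fetching `req["fetch"]` returns JSON containing the article extract, which a conversational tool can summarize while citing `req["source"]` — the attribution step the plugin treats as essential.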

Our aim with this experimental plugin is to understand how we can potentially open access to Wikipedia’s free, reliable knowledge through the growing channel of conversational AI in a future where more knowledge searches may begin with these tools.

Notably, the plugin makes this information available in a way that recognizes the contributions of the thousands of volunteers who create and curate the knowledge on Wikipedia. At the same time, it gives plugin users the ability to sustain Wikipedia’s unique, collaborative model for the future by offering opportunities for them to contribute back to a topic on its respective Wikipedia page.

Without clear attribution and source links to where information is collected, AI applications risk introducing an unprecedented amount of misinformation into the world, where users can’t quickly and clearly distinguish accurate information from hallucinations.

The plugin is in an experimental phase as we continue to explore the evolving AI ecosystem.


Wikimania Singapore: Wikipedia and AI

Deckelmann and the rest of the Wikipedia team were also recently in Singapore for Wikimania, an annual conference celebrating Wikimedia projects. Marking its triumphant return as an in-person gathering following the COVID-19 pandemic, the event brought together more than 2,100 global attendees, including 670 in-person Wikimedians — enthusiastic volunteers contributing to Wikimedia projects — and key figures from the digital information and education landscape.

As one of the world’s most visited websites, Wikipedia is powered by around 300,000 volunteers worldwide, all committed to promoting free access to knowledge. A highlight of the event was the Wikimedian of the Year Award presentation by Wikipedia Founder Jimmy Wales, recognizing the remarkable contributions of a small group of exceptional volunteers to Wikimedia projects across the globe.

This year’s awards featured three honorees from the Asia Pacific (APAC) region, including Malaysia, New Zealand, and Japan, highlighting the significant work being done by Wikimedians in preserving the rich heritage and culture within Asia for others to access and learn about online.

“At a time when trusted knowledge sources like Wikipedia are more important than ever, the Wikimedian of the Year awards celebrate the incredible individuals who make open knowledge possible,” said Jimmy Wales, the founder of Wikipedia.

Wikimania participants also engaged in workshops, interactive sessions, and panel discussions covering topics such as open-source technology advancements, the evolving role of Wikipedia in the age of generative AI, and strategies for closing knowledge gaps and advancing gender inclusion across Wikimedia projects.
