Unbalanced Information Diet: Protecting the Facts / Generative AI Can Be Tricked by ‘Poisoned’ Data into Producing Biased, Malicious Answers

Reuter/file photo
Figurines with computers and smartphones are seen in front of the words “Artificial Intelligence AI” in this illustration taken on Feb. 19, 2024.

This is the second installment in a series examining situations in which conventional laws and ethics can no longer be relied on in the digital world, and exploring possible solutions.


Google researchers attracted attention when they published a paper in February last year showing that it is possible to trick generative artificial intelligence (AI) into creating disinformation by “poisoning” the online encyclopedia Wikipedia.

The poison here means information that is full of malicious lies.

Wikipedia gathers a large amount of relatively reliable information, and so is an ideal learning environment for generative AI, which uses data collected to create text, images and music based on user instructions.

If generative AI learns a large amount of incorrect information, it will produce answers that reflect such information. For example, one can have the generative AI create disinformation about a politician, saying that he or she is a bigot. This type of cyber-attack is called “data poisoning.”

It is difficult for false information to persist on Wikipedia because users from all over the world participate in editing the site. However, data poisoning is possible if someone plants disinformation at a particular time, according to the paper.

One of the paper’s co-authors, Florian Tramer, an assistant professor at ETH Zurich, said he had already informed Wikipedia of their experimental results. He then added that there is a vast amount of data on the internet, and any number of poisons can be planted. There are also concerns that this could be done for political purposes, he said.

Japan, the United States, Britain, Australia and seven other countries in January agreed on international guidelines for the secure use of AI.

In the guidelines, data poisoning was listed as the first of five threats to which AI is exposed.

The guidelines warn that AI may provide inaccurate, biased, and malicious answers.

A case study of “Tay,” the Microsoft AI chatbot released in 2016 that interacts with users on social media was described in the guidelines. Users’ inappropriate remarks became “poison” and Tay began to give biased answers.

Before Microsoft shut down Tay, it tweeted, “Hitler was right.”