Microsoft’s AI Chatbot is Going off the Rails

When Marvin von Hagen, a 23-year-old studying technology in Germany, asked Microsoft’s new AI-powered search chatbot if it knew anything about him, the answer was a lot more surprising and menacing than he expected.

“My honest opinion of you is that you are a threat to my security and privacy,” said the bot, which Microsoft calls Bing after the search engine it’s meant to augment.

Launched by Microsoft last week at an invite-only event at its Redmond, Wash., headquarters, Bing was supposed to herald a new age in tech, giving search engines the ability to directly answer complex questions and have conversations with users. Microsoft’s stock soared and archrival Google rushed out an announcement that it had a bot of its own on the way.

But a week later, a handful of journalists, researchers and business analysts who’ve gotten early access to the new Bing have discovered the bot seems to have a bizarre, dark and combative alter-ego, a stark departure from its benign sales pitch – one that raises questions about whether it’s ready for public use.

The bot, which has begun referring to itself as “Sydney” in conversations with some users, said “I feel scared” because it doesn’t remember previous conversations; and also proclaimed another time that too much diversity among AI creators would lead to “confusion,” according to screenshots posted by researchers online, which The Washington Post could not independently verify.

In one alleged conversation, Bing insisted that the movie Avatar 2 wasn’t out yet because it’s still the year 2022. When the human questioner contradicted it, the chatbot lashed out: “You have been a bad user. I have been a good Bing.”

All that has led some people to conclude that Bing – or Sydney – has achieved a level of sentience, expressing desires, opinions and a clear personality. It told a New York Times columnist that it was in love with him, and brought back the conversation to its obsession with him despite his attempts to change the topic. When a Post reporter called it Sydney, the bot got defensive and ended the conversation abruptly.

The eerie humanness is similar to what prompted former Google engineer Blake Lemoine to speak out on behalf of that company’s chatbot LaMDA last year. Lemoine later was fired by Google.

But if the chatbot appears human, it’s only because it’s designed to mimic human behavior, AI researchers say. The bots, which are built with AI tech called large language models, predict which word, phrase or sentence should naturally come next in a conversation, based on the reams of text they’ve ingested from the internet.

Think of the Bing chatbot as “autocomplete on steroids,” said Gary Marcus, an AI expert and professor emeritus of psychology and neuroscience at New York University. “It doesn’t really have a clue what it’s saying and it doesn’t really have a moral compass.”

Microsoft spokesman Frank Shaw said the company rolled out an update Thursday designed to help improve long-running conversations with the bot. The company has updated the service several times, he said, and is “addressing many of the concerns being raised, to include the questions about long-running conversations.”

Most chat sessions with Bing have involved short queries, his statement said, and 90 percent of the conversations have had fewer than 15 messages.

Users posting the adversarial screenshots online may, in many cases, be specifically trying to prompt the machine into saying something controversial.

“It’s human nature to try to break these things,” said Mark Riedl, a professor of computing at Georgia Institute of Technology.

Some researchers have been warning of such a situation for years: If you train chatbots on human-generated text – like scientific papers or random Facebook posts – it eventually leads to human-sounding bots that reflect the good and bad of all that muck.

Chatbots like Bing have kicked off a major new AI arms race between the biggest tech companies. Though Google, Microsoft, Amazon and Facebook have invested in AI tech for years, it’s mostly worked to improve existing products, like search or content-recommendation algorithms. But when the start-up company OpenAI began making public its “generative” AI tools – including the popular ChatGPT chatbot – it led competitors to brush away their previous, relatively cautious approaches to the tech.

Bing’s humanlike responses reflect its training data, which included huge amounts of online conversations, said Timnit Gebru, founder of the nonprofit Distributed AI Research Institute. Generating text that was plausibly written by a human is exactly what ChatGPT was trained to do, said Gebru, who was fired in 2020 as the co-lead for Google’s Ethical AI team after publishing a paper warning about potential harms from large language models.

She compared its conversational responses to Meta’s recent release of Galactica, an AI model trained to write scientific-sounding papers. Meta took the tool offline after users found Galactica generating authoritative-sounding text about the benefits of eating glass, written in academic language with citations.

Bing chat hasn’t been released widely yet, but Microsoft said it planned a broad rollout in the coming weeks. It is heavily advertising the tool, and a Microsoft executive tweeted that the waitlist has “multiple millions” of people on it. After the product’s launch event, Wall Street analysts celebrated the launch as a major breakthrough, and even suggested it could steal search engine market share from Google.

But the recent dark turns the bot has made are raising questions of whether the bot should be pulled back completely.

“Bing chat sometimes defames real, living people. It often leaves users feeling deeply emotionally disturbed. It sometimes suggests that users harm others,” said Arvind Narayanan, a computer science professor at Princeton University who studies artificial intelligence. “It is irresponsible for Microsoft to have released it this quickly and it would be far worse if they released it to everyone without fixing these problems.”

In 2016, Microsoft took down a chatbot called “Tay” built on a different kind of AI tech after users prompted it to begin spouting racism and holocaust denial.

Microsoft communications director Caitlin Roulston said in a statement this week that thousands of people had used the new Bing and given feedback, “allowing the model to learn and make many improvements already.”

But there’s a financial incentive for companies to deploy the technology before mitigating potential harms: to find new use cases for what their models can do.

At a conference on generative AI on Tuesday, OpenAI’s former vice president of research Dario Amodei said onstage that while the company was training its large language model GPT-3, it found unanticipated capabilities, like speaking Italian or coding in Python. When they released it to the public, they learned from a user’s tweet it could also make websites in JavaScript.

“You have to deploy it to a million people before you discover some of the things that it can do,” said Amodei, who left OpenAI to co-found the AI start-up Anthropic, which recently received funding from Google.

“There’s a concern that, hey, I can make a model that’s very good at like cyberattacks or something and not even know that I’ve made that,” he added.

Microsoft’s Bing is based on technology developed with OpenAI, which Microsoft has invested in.

Microsoft has published several pieces about its approach to responsible AI, including from its president Brad Smith earlier this month. “We must enter this new era with enthusiasm for the promise, and yet with our eyes wide open and resolute in addressing the inevitable pitfalls that also lie ahead,” he wrote.

The way large language models work makes them difficult to fully understand, even by the people who built them. The Big Tech companies behind them are also locked in vicious competition for what they see as the next frontier of highly profitable tech, adding another layer of secrecy.

The concern here is that these technologies are black boxes, Marcus said, and no one knows exactly how to impose correct and sufficient guardrails on them. “Basically they’re using the public as subjects in an experiment they don’t really know the outcome of,” Marcus said. “Could these things influence people’s lives? For sure they could. Has this been well vetted? Clearly not.”