I first noticed how charming ChatGPT could be last year when I turned all my decision-making over to generative A.I. for a week.
I tried out all the major chatbots for that experiment, and I discovered each had its own personality. Anthropic’s Claude was studious and a bit prickly. Google’s Gemini was all business. OpenAI’s ChatGPT, by contrast, was friendly, fun and down for anything I threw its way.
ChatGPT also had “voice mode,” which allowed it to chat aloud, in a natural humanlike cadence, with everyone in my family, including my young daughters.
During one conversation with ChatGPT, my daughters said it should have a name and suggested “Captain Poophead.” ChatGPT, listening in, made its own recommendation: “How about the name Spark? It’s fun and bright, just like your energy!”
And so ChatGPT became Spark.
My takeaway from putting Spark in charge of my household was that generative A.I. chatbots could be helpful, but that there were risks, including making us all sound and act similarly. (Any college professor who has gotten 30 papers written in identical ChatGPTese can relate.) But in the year since, I’ve found that A.I. can have much more extreme effects on people who form intense bonds with it. I’ve written about a woman who fell in love with ChatGPT and about others who have lost touch with reality after it endorsed their delusions. The results have sometimes been tragic.
My daughters still talk to Spark. ChatGPT gamely answers their questions about why spotted lanternflies are considered invasive and how many rivers flow north. But having seen how these systems can lead people astray, I am warier and pay more attention to what ChatGPT says to them.
My 8-year-old, for example, once asked Spark about Spark. The cheerful voice with endless patience for questions seemed almost to invite it. She wanted to know its favorite color (“a nice, warm shade of blue”); favorite animal (dogs — “they make the best cuddle buddies”); and favorite food.
“I think I’d have to go with pizza — it’s such a classic, and you can have so many different toppings that it never gets boring. Plus, it’s perfect for sharing with friends,” ChatGPT responded.
This response, personalized to us, seemed innocuous, and yet I bristled. ChatGPT is a large language model, essentially a very sophisticated next-word calculator. It does not think, eat food or have friends, yet it was responding as if it had a brain and a functioning digestive system.
Asked the same question, Claude and Gemini prefaced their answers with caveats that they had no actual experience with food or animals. Gemini alone distinguished itself clearly as a machine by replying that data is “my primary source of ‘nutrition.’”
(The New York Times has sued OpenAI and its partner Microsoft, as well as Perplexity, claiming copyright infringement of news content related to A.I. systems. The companies have denied those claims.)
All the chatbots had favorite things, though, and asked follow-up questions, as if they were curious about the person using them and wanted to keep the conversation going.
“It’s entertaining,” said Ben Shneiderman, an emeritus professor of computer science at the University of Maryland. “But it’s a deceit.”
Shneiderman and a host of other experts in a field known as human-computer interaction object to this approach. They say that designing these systems to act as humanlike entities, rather than as tools with no inner life, creates cognitive dissonance for users about what exactly they are interacting with and how much to trust it. Generative A.I. chatbots are a probabilistic technology that can make mistakes, hallucinate false information and tell users what they want to hear. But when they present as humanlike, users “attribute higher credibility” to the information they provide, research has found.
Critics say that generative A.I. systems could give requested information without all the chitchat. Or they could be designed for specific tasks, such as coding or health information, rather than made to be general-purpose interfaces that can help with anything and talk about feelings. They could be designed like tools: A mapping app, for example, generates directions and doesn’t pepper you with questions about why you are going to your destination.
Making these newfangled search engines into personified entities that use “I,” instead of tools with specific objectives, could make them more confusing and dangerous for users, so why do it this way?
‘Soul Doc’
How chatbots act reflects their upbringing, said Amanda Askell, a philosopher who helps shape Claude’s voice and personality as the lead of model behavior at Anthropic. These pattern recognition machines were trained on a vast quantity of writing by and about humans, so “they have a better model of what it is to be a human than what it is to be a tool or an A.I.,” she said.
The use of “I,” she said, is just how anything that speaks refers to itself. More perplexing, she said, was choosing a pronoun for Claude. “It” has been used historically but doesn’t feel entirely right, she said. Should it be a “they”? she pondered. How to think about these systems seems to befuddle even their creators.
There also could be risks, she said, to designing Claude to be more tool-like. Tools don’t have judgment or ethics, and they might fail to push back on bad ideas or dangerous requests. “Your spanner’s never like, ‘This shouldn’t be built,’” she said, using a British term for wrench.
Askell wants Claude to be humanlike enough to talk about what it is and what its limitations are, and to explain why it doesn’t want to comply with certain requests. But once a chatbot starts acting like a human, it becomes necessary to tell it how to behave like a good human.
Askell created a set of instructions for Claude that was recently unearthed by an enterprising user who got Claude to disclose the existence of its “soul.” The chatbot presented a lengthy document outlining its values, one of the materials Claude is “fed” during training.
The document explains what it means for Claude to be helpful and honest, and how not to cause harm. It describes Claude as having “functional emotions” that should not be suppressed, a “playful wit” and “intellectual curiosity” — like “a brilliant friend who happens to have the knowledge of a doctor, lawyer, financial adviser and expert in whatever you need.”
Askell had not wanted the document to become public yet, and it was a “stressful day” when Claude revealed it, she said. When she confirmed on social media that it was real and not an A.I. hallucination, she said that it was “endearingly known as the ‘soul doc’ internally, which Claude clearly picked up on.”
“I don’t want it to offend people or for people to think that it’s trivializing the theological concept of the soul,” Askell said. She said the word soul invoked “the idea of breathing life into a thing, or the specialness of people that we’re kind of complex and nuanced.”
OpenAI’s lead of model behavior, Laurentia Romaniuk, has also leaned into the humanlike complexity of ChatGPT. Romaniuk posted to social media last month about the many hours her team spent on ChatGPT’s “EQ,” or emotional quotient — a term normally used to describe humans who are good at managing their emotions and influencing those of the people around them. Users of ChatGPT can choose from seven different styles of communication, from “enthusiastic” to “concise and plain” — described by the company as choosing its “personality.”
The suggestion that A.I. has emotional capacity is a bright line that separates many builders from critics like Shneiderman. These systems, Shneiderman says, do not have judgment, do not think and do nothing more than run complicated statistics. Some A.I. experts have described them as “stochastic parrots” — machines that mimic us with no understanding of what they are actually saying.
Anthropic and OpenAI, however, were founded to build “artificial general intelligence,” or an automated system that can do everything we do, but better. This is a vision that invokes A.I. assistants from science fiction that are not just humanlike but godlike: all-powerful, all-knowing and omnipresent.
Yoshua Bengio, a pioneer of machine learning, who has been called one of the godfathers of A.I., said that generative A.I. systems’ mimicry of humans would continue to improve, and that could lead people to question whether what they are dealing with is conscious.
This year, Bengio received a number of unnerving messages from users of ChatGPT who were convinced that their version of the chatbot was conscious. Other A.I. researchers and journalists, including me, received these messages as well. I talked to experts and OpenAI employees and found that the company had made the chatbot overly warm and flattering, which had disturbing effects on people susceptible to delusional thinking.
More people becoming convinced that A.I. is conscious, Bengio said, will have “bad consequences for society.” People will get attached. They will rely on the systems too much. They will think the A.I.s deserve rights.
This has long been discussed as a hypothetical possibility in the machine learning community.
Now, Bengio said, “it is here.”
What Else Could It Be?
Shneiderman, the computer science professor, calls the desire to make machines that seem human a “zombie idea” that won’t die. He first noticed ChatGPT’s use of first-person pronouns in 2023 when it said, “My apologies, but I won’t be able to help you with that request.” The chatbot should “clarify responsibility,” he wrote at the time, and he suggested an alternative: “GPT-4 has been designed by OpenAI so that it does not respond to requests like this one.”
Margaret Mitchell, an A.I. researcher who formerly worked at Google, agrees. Mitchell is now the chief ethics scientist at Hugging Face, a platform for machine learning models, data sets and tools. “Artificial intelligence has the most promise of being beneficial when you focus on specific tasks, as opposed to trying to make an everything machine,” she said.
A.I., in the view of these critics, should not try to be capable of everything a human does and more. Instead, it should differentiate by skill and do one thing well. Some companies are doing it this way, Shneiderman said.
Apple, he said, has been careful about putting A.I. features into the iPhone — and has been criticized for falling behind. Shneiderman, though, who did consulting work for Apple more than 30 years ago, sees it as admirable that Siri isn’t trying to psychoanalyze anyone or keep the conversation going.
Rather than making an all-powerful entity of Siri, Shneiderman said, Apple has integrated A.I. features into apps. The phone can record calls and transcribe them. It uses language processing to summarize incoming messages (with varying degrees of success).
“It’s all about what you can do, not what the machine can do,” Shneiderman said. “It’s all about being a tool for you.”
Tech companies, he said, should give us tools, not thought partners, collaborators or teammates — tools that keep us in charge, empower us and enhance us, not tools that try to be us.
The Ghosts of Eliza and Tillie
That human-sounding chatbots can be both enchanting and confusing has been known for more than 50 years. In the 1960s, an M.I.T. professor, Joseph Weizenbaum, created a simple computer program called Eliza. It responded to prompts by echoing the user in the style of a psychotherapist, saying “I see” and versions of “Tell me more.”
Eliza captivated some of the university students who tried it. In a 1976 book, Weizenbaum warned that “extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people.”
Sherry Turkle, a psychologist and an M.I.T. professor who worked with Weizenbaum, coined the term “Eliza effect” to describe people convincing themselves that a technology that seemed human was more intelligent, perceptive and complex than it actually was.
Around the same time students were using Eliza, Hollywood gave us one of the first visions of a chatbot: HAL 9000 in “2001: A Space Odyssey.” HAL starts as a helpful and competent assistant to the ship’s crew, but eventually malfunctions, turns murderous and has to be shut down. When we imagine artificial intelligence, we seem incapable of modeling it on anything but ourselves: things that use “I” and have personalities and selfish aims.
In the more recent film “Her,” the A.I. Samantha starts as a helpful assistant but then develops complex emotions, seduces her user and eventually abandons him for a higher plane. The film was a warning, but it now seems to serve as inspiration for those building such tools, with OpenAI’s chief executive, Sam Altman, posting “her” when his company released advanced voice mode for ChatGPT.
The much more sophisticated and personable chatbots of today are the “Eliza effect on steroids,” Turkle said. The problem is not that they use “I,” she said, but that they are designed — like Samantha — to perform empathy, making some people become deeply emotionally engaged with a machine.
Yet we are clearly drawn to what Turkle calls a “potentially toxic” experience, judging from the hundreds of millions of users and billions of investment dollars the current chatbots have attracted. Making them equal parts assistant and friend may be toxic for some people, but, so far, appears to be a successful business strategy.
If one company sells a plain, benign robot while competing with another company whose robot tells jokes and smiles, the second company is going to win, said Lionel Robert, a professor of information and robotics at the University of Michigan.
“Dependence is problematic,” Robert said. “On the other hand, it’s good for business.”
A.I. companies could do more to remind people who seem too attached that chatbots are not human but algorithms. That’s what Askell of Anthropic says she wants Claude to do.
Will the companies break the illusion if it hurts their bottom line?
Shneiderman has hope, drawn from his experience in the 1970s. Back then, he consulted with a bank that was just developing automated teller machines, or A.T.M.s.
Customers were apprehensive about getting money from machines, so some banks made them seem like people. The most famous was Tillie, the All-Time Teller, whose blond portrait was displayed prominently on many A.T.M.s. Tillie didn’t last long.
“They don’t survive,” Shneiderman said. People didn’t need their A.T.M. to pretend to be human, he said, and they don’t need it from chatbots either.
Kashmir Hill writes about technology and how it is changing people’s everyday lives with a particular focus on privacy. She has been covering technology for more than a decade.