www.theatlantic.com › technology › archive › 2024 › 10 › chatbot-transcript-data-advertising › 680112

This past spring, a man in Washington State worried that his marriage was on the verge of collapse. “I am depressed and going a little crazy, still love her and want to win her back,” he typed into ChatGPT. With the chatbot’s help, he wanted to write a letter protesting her decision to file for divorce and post it to their bedroom door. “Emphasize my deep guilt, shame, and remorse for not nurturing and being a better husband, father, and provider,” he wrote. In another message, he asked ChatGPT to write his wife a poem “so epic that it could make her change her mind but not cheesy or over the top.”

The man’s chat history was included in the WildChat data set, a collection of 1 million ChatGPT conversations gathered consensually by researchers to document how people are interacting with the popular chatbot. Some conversations are filled with requests for marketing copy and homework help. Others might make you feel as if you’re gazing into the living rooms of unwitting strangers. Here, the most intimate details of people’s lives are on full display: A school case manager reveals details of specific students’ learning disabilities, a minor frets over possible legal charges, a girl laments the sound of her own laugh.

People share personal information about themselves all the time online, whether in Google searches (“best couples therapists”) or Amazon orders (“pregnancy test”). But chatbots are uniquely good at getting us to reveal details about ourselves. Common usages, such as asking for personal advice and résumé help, can expose more about a user “than they ever would have to any individual website previously,” Peter Henderson, a computer scientist at Princeton, told me in an email. For AI companies, your secrets might turn out to be a gold mine.

Would you want someone to know everything you’ve Googled this month? Probably not. But whereas most Google queries are only a few words long, chatbot conversations can stretch on, sometimes for hours, each message rich with data. And with a traditional search engine, a query that’s too specific won’t yield many results. By contrast, the more information a user includes in any one prompt to a chatbot, the better the answer they will receive. As a result, alongside text, people are uploading sensitive documents, such as medical reports, and screenshots of text conversations with their ex. With chatbots, as with search engines, it’s difficult to verify how perfectly each interaction represents a user’s real life. The man in Washington might have just been messing around with ChatGPT.

But on the whole, users are disclosing real things about themselves, and AI companies are taking note. OpenAI CEO Sam Altman recently told my colleague Charlie Warzel that he has been “positively surprised about how willing people are to share very personal details with an LLM.” In some cases, he added, users may even feel more comfortable talking with AI than they would with a friend. There’s a clear reason for this: Computers, unlike humans, don’t judge. When people converse with one another, we engage in “impression management,” says Jonathan Gratch, a professor of computer science and psychology at the University of Southern California—we intentionally regulate our behavior to hide weaknesses. People “don’t see the machine as sort of socially evaluating them in the same way that a person might,” he told me.

Of course, OpenAI and its peers promise to keep your conversations secure. But on today’s internet, privacy is an illusion. AI is no exception. This past summer, a bug in ChatGPT’s Mac-desktop app failed to encrypt user conversations and briefly exposed chat logs to bad actors. Last month, a security researcher shared a vulnerability that could have allowed attackers to inject spyware into ChatGPT in order to extract conversations. (OpenAI has fixed both issues.)

Chat logs could also provide evidence in criminal investigations, just as material from platforms such as Facebook and Google Search long has. The FBI tried to discern the motive of the Donald Trump–rally shooter by looking through his search history. When former Senator Robert Menendez of New Jersey was charged with accepting gold bars from associates of the Egyptian government, his search history was a major piece of evidence that led to his conviction earlier this year. (“How much is one kilo of gold worth,” he had searched.) Chatbots are still new enough that they haven’t widely yielded evidence in lawsuits, but they might provide a much richer source of information for law enforcement, Henderson said.

AI systems also present new risks. Chatbot conversations are commonly retained by the companies that develop them and are then used to train AI models. Something you reveal to an AI tool in confidence could theoretically later be regurgitated to future users. Part of The New York Times’ lawsuit against OpenAI hinges on the claim that GPT-4 memorized passages from Times stories and then relayed them verbatim. As a result of this concern over memorization, many companies have banned ChatGPT and other bots in order to prevent corporate secrets from leaking. (The Atlantic recently entered into a corporate partnership with OpenAI.)

Of course, these are all edge cases. The man who asked ChatGPT to save his marriage probably doesn’t have to worry about his chat history appearing in court; nor are his requests for “epic” poetry likely to show up alongside his name to other users. Still, AI companies are quietly accumulating tremendous amounts of chat logs, and their data policies generally let them do what they want. That may mean—what else?—ads. So far, many AI start-ups, including OpenAI and Anthropic, have been reluctant to embrace advertising. But these companies are under great pressure to prove that the many billions in AI investment will pay off. It’s hard to imagine that generative AI might “somehow circumvent the ad-monetization scheme,” Rishi Bommasani, an AI researcher at Stanford, told me.

In the short term, that could mean that sensitive chat-log data is used to generate targeted ads much like the ones that already litter the internet. In September 2023, Snapchat, which is used by a majority of American teens, announced that it would be using content from conversations with My AI, its in-app chatbot, to personalize ads. If you ask My AI, “Who makes the best electric guitar?,” you might see a response accompanied by a sponsored link to Fender’s website.

If that sounds familiar, it should. Early versions of AI advertising may continue to look much like the sponsored links that sometimes accompany Google Search results. But because generative AI has access to such intimate information, ads could take on completely new forms. Gratch doesn’t think technology companies have figured out how best to mine user-chat data. “But it’s there on their servers,” he told me. “They’ll figure it out some day.” After all, for a large technology company, even a 1 percent difference in a user’s willingness to click on an advertisement translates into a lot of money.

People’s readiness to offer up personal details to chatbots can also reveal aspects of users’ self-image and how susceptible they are to what Gratch called “influence tactics.” In a recent evaluation, OpenAI examined how effectively its latest series of models could manipulate an older model, GPT-4o, into making a payment in a simulated game. Before safety mitigations, one of the new models was able to successfully con the older one more than 25 percent of the time. If the new models can sway GPT-4o, they might also be able to sway humans. An AI company blindly optimizing for advertising revenue could encourage a chatbot to manipulatively act on private information.

The potential value of chat data could also lead companies outside the technology industry to double down on chatbot development, Nick Martin, a co-founder of the AI start-up Direqt, told me. Trader Joe’s could offer a chatbot that assists users with meal planning, or Peloton could create a bot designed to offer insights on fitness. These conversational interfaces might encourage users to reveal more about their nutrition or fitness goals than they otherwise would. Instead of companies inferring information about users from messy data trails, users are telling them their secrets outright.

For now, the most dystopian of these scenarios are largely hypothetical. A company like OpenAI, with a reputation to protect, surely isn’t going to engineer its chatbots to swindle a divorced man in distress. Nor does this mean you should quit telling ChatGPT your secrets. In the mental calculus of daily life, the marginal benefit of getting AI to assist with a stalled visa application or a complicated insurance claim may outweigh the accompanying privacy concerns. This dynamic is at play across much of the ad-supported web. The arc of the internet bends toward advertising, and AI may be no exception.

It’s easy to get swept up in all the breathless language about the world-changing potential of AI, a technology that Google’s CEO has described as “more profound than fire.” That people are willing to so easily offer up such intimate details about their life is a testament to the AI’s allure. But chatbots may become the latest innovation in a long lineage of advertising technology designed to extract as much information from you as possible. In this way, they are not a radical departure from the present consumer internet, but an aggressive continuation of it. Online, your secrets are always for sale.