Heartburn

Don’t Be Misled by GPT-4’s Gift of Gab

This is an edition of The Atlantic Daily, a newsletter that guides you through the biggest stories of the day, helps you discover new ideas, and recommends the best in culture. Sign up for it here.

Yesterday, not four months after unveiling the text-generating AI ChatGPT, OpenAI launched its latest marvel of machine learning: GPT-4. The new large-language model (LLM) aces select standardized tests, works across languages, and can even detect the contents of images. But is GPT-4 smart?

First, here are three new stories from The Atlantic:

Welcome to the big blur.

Ted Lasso is no longer trying to feel good.

How “please” stopped being polite

A Chatty Child

Before I get into OpenAI’s new robot wonder, a quick personal story.

As a high-school student studying for my college-entrance exams roughly two decades ago, I absorbed a bit of trivia from my test-prep CD-ROM: Standardized tests such as the SAT and ACT don’t measure how smart you are, or even what you know. Instead, they are designed to gauge your performance on a specific set of tasks—that is, on the exams themselves. In other words, as I gleaned from the nice people at Kaplan, they are tests to test how you test.

I share this anecdote not only because, as has been widely reported, GPT-4 scored better than 90 percent of test takers on a simulated bar exam and got a 710 out of 800 on the reading and writing section of the SAT, but also because it provides an example of how one’s mastery of certain categories of tasks can easily be mistaken for broader skill command or competence. This misconception worked out well for teenage me, a mediocre student who nonetheless conned her way into a respectable university on the merits of a few crams.

But just as tests are unreliable indicators of scholastic aptitude, GPT-4’s facility with words and syntax doesn’t necessarily amount to intelligence (simply put, a capacity for reasoning and analytic thought). What it does reveal is how difficult it can be for humans to tell the difference.

“Even as LLMs are great at producing boilerplate copy, many critics say they fundamentally don’t and perhaps cannot understand the world,” my colleague Matteo Wong wrote yesterday. “They are something like autocomplete on PCP, a drug that gives users a false sense of invincibility and heightened capacities for delusion.”

How false is that sense of invincibility, you might ask? Quite, as even OpenAI will admit.

“Great care should be taken when using language model outputs, particularly in high-stakes contexts,” OpenAI representatives cautioned yesterday in a blog post announcing GPT-4’s arrival.

Although the new model has such facility with language that, as the writer Stephen Marche noted yesterday in The Atlantic, it can generate text that’s virtually indistinguishable from that of a human professional, its user-prompted bloviations aren’t necessarily deep—let alone true. Like other large-language models before it, GPT-4 “‘hallucinates’ facts and makes reasoning errors,” according to OpenAI’s blog post. Predictive text generators come up with things to say based on the likelihood that a given combination of word patterns would come together in relation to a user’s prompt, not as the result of a process of thought.

My partner recently came up with a canny euphemism for what this means in practice: AI has learned the gift of gab. And it is very difficult not to be seduced by such seemingly extemporaneous bursts of articulate, syntactically sound conversation, regardless of their source (to say nothing of their factual accuracy). We’ve all been dazzled at some point or another by a precocious and chatty toddler, or momentarily swayed by the bloated assertiveness of business-dude-speak.

There is a degree to which most, if not all, of us instinctively conflate rhetorical confidence, a way with words, with comprehensive smarts. As Matteo writes, “That belief underpinned Alan Turing’s famous imitation game, now known as the Turing Test, which judged computer intelligence by how ‘human’ its textual output read.”

But, as anyone who’s ever bullshitted a college essay or listened to a random sampling of TED Talks can surely attest, speaking is not the same as thinking. The ability to distinguish between the two is important, especially as the LLM revolution gathers speed.

It’s also worth remembering that the internet is a strange and often sinister place, and its darkest crevasses contain some of the raw material that’s training GPT-4 and similar AI tools. As Matteo detailed yesterday:

Microsoft’s original chatbot, named Tay and released in 2016, became misogynistic and racist, and was quickly discontinued. Last year, Meta’s BlenderBot AI rehashed anti-Semitic conspiracies, and soon after that, the company’s Galactica—a model intended to assist in writing scientific papers—was found to be prejudiced and prone to inventing information (Meta took it down within three days). GPT-2 displayed bias against women, queer people, and other demographic groups; GPT-3 said racist and sexist things; and ChatGPT was accused of making similarly toxic comments. OpenAI tried and failed to fix the problem each time. New Bing, which runs a version of GPT-4, has written its own share of disturbing and offensive text—teaching children ethnic slurs, promoting Nazi slogans, inventing scientific theories.

The latest in LLM tech is certainly clever, if debatably smart. What’s becoming clear is that those of us who opt to use these programs will need to be both.

Related:

ChatGPT changed everything. Now its follow-up is here.

The difference between speaking and thinking

Today’s News

A federal judge in Texas heard a case that challenges the U.S. government’s approval of one of the drugs used for medication abortions.

Credit Suisse’s stock price fell to a record low, prompting the Swiss National Bank to pledge financial support if necessary.

General Mark Milley, the chair of the Joint Chiefs of Staff, said that the crash of a U.S. drone over the Black Sea resulted from a recent increase in “aggressive actions” by Russia.

Dispatches

The Weekly Planet: The Alaska oil project will be obsolete before it’s finished, Emma Marris writes.

Up for Debate: Conor Friedersdorf argues that Stanford Law’s DEI dean handled a recent campus conflict incorrectly.

Explore all of our newsletters here.

Evening Read

Nora Ephron’s Revenge

By Sophie Gilbert

In the 40 years since Heartburn was published, there have been two distinct ways to read it. Nora Ephron’s 1983 novel is narrated by a food writer, Rachel Samstat, who discovers that her esteemed journalist husband is having an affair with Thelma Rice, “a fairly tall person with a neck as long as an arm and a nose as long as a thumb and you should see her legs, never mind her feet, which are sort of splayed.” Taken at face value, the book is a triumphant satire—of love; of Washington, D.C.; of therapy; of pompous columnists; of the kind of men who consider themselves exemplary partners but who leave their wives, seven months pregnant and with a toddler in tow, to navigate an airport while they idly buy magazines. (Putting aside infidelity for a moment, that was the part where I personally believed that Rachel’s marriage was past saving.)

Unfortunately, the people being satirized had some objections, which leads us to the second way to read Heartburn: as historical fact distorted through a vengeful lens, all the more salient for its smudges. Ephron, like Rachel, had indeed been married to a high-profile Washington journalist, the Watergate reporter Carl Bernstein. Bernstein, like Rachel’s husband—whom Ephron named Mark Feldman in what many guessed was an allusion to the real identity of Deep Throat—had indeed had an affair with a tall person (and a future Labour peer), Margaret Jay. Ephron, like Rachel, was heavily pregnant when she discovered the affair. And yet, in writing about what had happened to her, Ephron was cast as the villain by a media ecosystem outraged that someone dared to spill the secrets of its own, even as it dug up everyone else’s.

Read the full article.

Nora Ephron’s Revenge

found Mar '23 The Atlantic

The pushback was inevitably personal. “There are also those who say that Heartburn, though funny and sad, is a great misuse of talent, a book whose only point is to nail Carl Bernstein,” New York’s Jesse Kornbluth observed. Writing under the pseudonym Tristan Vox (possibly a play on the Latin for “sorrowful voice”) in Vanity Fair in 1985, the literary critic Leon Wieseltier huffed so tempestuously about the proposed movie adaptation of Heartburn that one can only assume he passed out midway. Ephron, he insisted, had written “one of the most indecent exploitations of celebrity in recent memory.” To be unfaithful to one’s pregnant wife, he concluded, was “banal compared with the infidelity of a mother toward her children,” and if Bernstein had committed adultery, Ephron, by exposing her family to strangers with only the lightest of fictional glosses, was committing “child abuse.”

I’m a few months younger than Heartburn; I grew up amid the wreckage of a similarly busted marriage and contentious divorce. And I’ve come to think of the book over the years as something more than a juicy revenge novel or an infinitely pleasurable roman à clef. Arriving in the tail winds of the fast-and-loose 1970s, it made, amid the jokes, a sincere point about infidelity: that it wasn’t banal at all but could in fact be an irrevocable cleaving open of one’s life, one’s heart, one’s sense of home and stability and self. More radically, Heartburn also emphatically rejected the idea that infidelity was something women—or men, given the portrayal of Thelma’s husband—should have to tacitly endure.

This argument, I think, was what led to such vigorous denunciations of the book (and the movie) from certain quarters. It was too iconoclastic, too righteous. After all, excavating one’s romantic life for the sake of art and a paycheck wasn’t particularly original: In a 2004 introduction to Heartburn, Ephron wrote, “Philip Roth and John Updike picked away at the carcasses of their early marriages in book after book, but to the best of my knowledge they were never hit with the ‘thinly disguised’ thing.” Rather, the collective outrage over the novel was an attempt to wrest the narrative away from Ephron, who, some parties complained, wasn’t being fair with it. Bernstein reportedly threatened to sue; he also requested explicit provisions in their custody agreement that would give him sway over how he might be portrayed in the film.

His reaction, Ephron noted in the 2004 introduction, was “one of the most fascinating things to me about the whole episode: he cheated on me, and then got to behave as if he was the one who had been wronged because I wrote about it!” And yet, it’s undeniable that Heartburn achieved what she wanted it to: It cast the story of her marriage definitively in her terms. This is the power a gifted writer can wield. Is it fair? Not necessarily. But it’s also a power that, as Ephron accurately discerns, is almost exclusively critiqued when it’s exercised by women. Late last year, the internet erupted over an essay by the writer Isabel Kaplan about a boyfriend who had broken up with her because he was threatened by her job. “The more I share about our relationship and breakup, the more vindicated he will feel in his fears,” Kaplan wrote, citing Ephron as an example. “But if I don’t write about it, he succeeds in forcing my silence.”

[Read: The redemption of the bad mother]

That tension runs through Heartburn too. But to take the novel on its own terms for a moment, it is a wholly joyful read, a 178-page stand-up routine about marriage that’s entirely one-sided and openly so. Mark, Rachel’s husband, is introduced as a man who’s both immediately unfaithful and vividly humorless, prone to perusing home-design magazines in bed, forgetting to clean his nails, and lying about books he’s read. Thelma, apart from being tall, makes “gluey puddings.” (Rachel, a food writer, is doubly betrayed when she realizes that during the affair, she gave Thelma one of her recipes.) Rachel also skewers her parents—like Ephron’s, both alcoholics who got rich by investing in Tampax stock—her therapist, Mark’s “dumb Hemingway style he always reserved for his slice-of-life columns,” and sensitive types who express themselves through poetry. (“Show me a woman who cries when the trees lose their leaves in autumn,” Rachel observes in one chapter, “and I’ll show you a real asshole.”)

Some critics have raised stylistic objections to the novel, particularly its structural looseness—wherein Rachel recounts a few weeks of her life while thinking insistently about food—that was perhaps ahead of its time. More often, though, Heartburn’s detractors focused exclusively on Ephron’s supposed sin of betrayal. The movie, Mark Harris notes in his biography of its director, Mike Nichols, was subsequently dismissed as a trifling “woman’s picture” with “the tunnel-vision point of view of the offended party.” And yet, for the past four decades, people have pressed it into one another’s hands, as a friend pressed it into mine. They have read it and shared it and read it again. They’ve found something thrilling and metamorphic in the way that Ephron, by putting her pain on the page, transforms it into comedy. “If I tell the story, I control the version,” Rachel explains at the end of the novel. “If I tell the story, it doesn’t hurt as much.” Heartburn, you may conclude, is ultimately less about revenge than about self-preservation.