Matteo Wong

We’re Entering Uncharted Territory for Math

The Atlantic

www.theatlantic.com › technology › archive › 2024 › 10 › terence-tao-ai-interview › 680153

Terence Tao, a mathematics professor at UCLA, is a real-life superintelligence. The “Mozart of Math,” as he is sometimes called, is widely considered the world’s greatest living mathematician. He has won numerous awards, including the equivalent of a Nobel Prize for mathematics, for his advances and proofs. Right now, AI is nowhere close to his level.

But technology companies are trying to get it there. Recent, attention-grabbing generations of AI—even the almighty ChatGPT—were not built to handle mathematical reasoning. They were instead focused on language: When you asked such a program to answer a basic question, it did not understand and execute an equation or formulate a proof, but instead presented an answer based on which words were likely to appear in sequence. For instance, the original ChatGPT can’t add or multiply, but has seen enough examples of algebra to solve x + 2 = 4: “To solve the equation x + 2 = 4, subtract 2 from both sides …” Now, however, OpenAI is explicitly marketing a new line of “reasoning models,” known collectively as the o1 series, for their ability to problem-solve “much like a person” and work through complex mathematical and scientific tasks and queries. If these models are successful, they could represent a sea change for the slow, lonely work that Tao and his peers do.

After I saw Tao post his impressions of o1 online—he compared it to a “mediocre, but not completely incompetent” graduate student—I wanted to understand more about his views on the technology’s potential. In a Zoom call last week, he described a kind of AI-enabled, “industrial-scale mathematics” that has never been possible before: one in which AI, at least in the near future, is not a creative collaborator in its own right so much as a lubricant for mathematicians’ hypotheses and approaches. This new sort of math, which could unlock terra incognitae of knowledge, will remain human at its core, embracing how people and machines have very different strengths that should be thought of as complementary rather than competing.

This conversation has been edited for length and clarity.

Matteo Wong: What was your first experience with ChatGPT?

Terence Tao: I played with it pretty much as soon as it came out. I posed some difficult math problems, and it gave pretty silly results. It was coherent English, it mentioned the right words, but there was very little depth. Anything really advanced, the early GPTs were not impressive at all. They were good for fun things—like if you wanted to explain some mathematical topic as a poem or as a story for kids. Those are quite impressive.

Wong: OpenAI says o1 can “reason,” but you compared the model to “a mediocre, but not completely incompetent” graduate student.

Tao: That initial wording went viral, but it got misinterpreted. I wasn’t saying that this tool is equivalent to a graduate student in every single aspect of graduate study. I was interested in using these tools as research assistants. A research project has a lot of tedious steps: You may have an idea and you want to flesh out computations, but you have to do it by hand and work it all out.

Wong: So it’s a mediocre or incompetent research assistant.

Tao: Right, it’s the equivalent, in terms of serving as that kind of an assistant. But I do envision a future where you do research through a conversation with a chatbot. Say you have an idea, and the chatbot went with it and filled out all the details.

It’s already happening in some other areas. AI famously conquered chess years ago, but chess is still thriving today, because it’s now possible for a reasonably good chess player to speculate what moves are good in what situations, and they can use the chess engines to check 20 moves ahead. I can see this sort of thing happening in mathematics eventually: You have a project and ask, “What if I try this approach?” And instead of spending hours and hours actually trying to make it work, you guide a GPT to do it for you.

With o1, you can kind of do this. I gave it a problem I knew how to solve, and I tried to guide the model. First I gave it a hint, and it ignored the hint and did something else, which didn’t work. When I explained this, it apologized and said, “Okay, I’ll do it your way.” And then it carried out my instructions reasonably well, and then it got stuck again, and I had to correct it again. The model never figured out the most clever steps. It could do all the routine things, but it was very unimaginative.

One key difference between graduate students and AI is that graduate students learn. You tell an AI its approach doesn’t work, it apologizes, it will maybe temporarily correct its course, but sometimes it just snaps back to the thing it tried before. And if you start a new session with AI, you go back to square one. I’m much more patient with graduate students because I know that even if a graduate student completely fails to solve a task, they have potential to learn and self-correct.

Wong: The way OpenAI describes it, o1 can recognize its mistakes, but you’re saying that’s not the same as sustained learning, which is what actually makes mistakes useful for humans.

Tao: Yes, humans have growth. These models are static—the feedback I give to GPT-4 might be used as 0.00001 percent of the training data for GPT-5. But that’s not really the same as with a student.

AI and humans have such different models for how they learn and solve problems—I think it’s better to think of AI as a complementary way to do tasks. For a lot of tasks, having both AIs and humans doing different things will be most promising.

Wong: You’ve also said previously that computer programs might transform mathematics and make it easier for humans to collaborate with one another. How so? And does generative AI have anything to contribute here?

Tao: Technically they aren’t classified as AI, but proof assistants are useful computer tools that check whether a mathematical argument is correct or not. They enable large-scale collaboration in mathematics. That’s a very recent advent.

Math can be very fragile: If one step in a proof is wrong, the whole argument can collapse. If you make a collaborative project with 100 people, you break your proof in 100 pieces and everybody contributes one. But if they don’t coordinate with one another, the pieces might not fit properly. Because of this, it’s very rare to see more than five people on a single project.

With proof assistants, you don’t need to trust the people you’re working with, because the program gives you this 100 percent guarantee. Then you can do factory production–type, industrial-scale mathematics, which doesn’t really exist right now. One person focuses on just proving certain types of results, like a modern supply chain.

The problem is these programs are very fussy. You have to write your argument in a specialized language—you can’t just write it in English. AI may be able to do some translation from human language to the programs. Translating one language to another is almost exactly what large language models are designed to do. The dream is that you just have a conversation with a chatbot explaining your proof, and the chatbot would convert it into a proof-system language as you go.
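
As an editorial illustration (not part of the interview), here is roughly what such a "specialized language" looks like in Lean 4, one widely used proof assistant. The sketch formalizes the x + 2 = 4 example from earlier in the article. The theorem name is hypothetical, x is assumed to be a natural number, and the proof assumes a recent Lean 4 toolchain in which the `omega` tactic is built in.

```lean
-- Hypothetical theorem name; x is assumed to range over natural numbers.
-- Statement: if x + 2 = 4, then x = 2.
theorem solve_x_plus_two (x : Nat) (h : x + 2 = 4) : x = 2 := by
  -- `omega` decides linear arithmetic facts over Nat and Int,
  -- so it closes the goal using the hypothesis `h`.
  omega
```

The English sentence "subtract 2 from both sides" does not appear anywhere: the proof assistant accepts only a statement it can check mechanically, which is the fussiness described above.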

Wong: So the chatbot isn’t a source of knowledge or ideas, but a way to interface.

Tao: Yes, it could be a really useful glue.

Wong: What are the sorts of problems that this might help solve?

Tao: The classic idea of math is that you pick some really hard problem, and then you have one or two people locked away in the attic for seven years just banging away at it. The types of problems you want to attack with AI are the opposite. The naive way you would use AI is to feed it the most difficult problem that we have in mathematics. I don’t think that’s going to be super successful, and also, we already have humans that are working on those problems.

The type of math that I’m most interested in is math that doesn’t really exist. The project that I launched just a few days ago is about an area of math called universal algebra, which is about whether certain mathematical statements or equations imply that other statements are true. The way people have studied this in the past is that they pick one or two equations and they study them to death, like how a craftsperson used to make one toy at a time, then work on the next one. Now we have factories; we can produce thousands of toys at a time. In my project, there’s a collection of about 4,000 equations, and the task is to find connections between them. Each is relatively easy, but there’s a million implications. There’s like 10 points of light, 10 equations among these thousands that have been studied reasonably well, and then there’s this whole terra incognita.
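
To make "one equation implies another" concrete, here is a minimal Lean 4 sketch added for illustration. It assumes the equations are laws about a single binary operation with no other axioms (a structure often called a magma). The two laws, the `◇` notation, and all names are hypothetical choices for this example and are not taken from the interview.

```lean
-- A magma: a type with one binary operation and no axioms (assumed setting).
class Magma (α : Type) where
  op : α → α → α

-- Hypothetical infix notation for the operation.
infixl:65 " ◇ " => Magma.op

-- Implication: the "constant" law (every product equals every other product)
-- implies the commutative law x ◇ y = y ◇ x.
theorem constant_implies_commutative {α : Type} [Magma α]
    (h : ∀ x y z w : α, x ◇ y = z ◇ w) :
    ∀ x y : α, x ◇ y = y ◇ x :=
  -- Instantiate the constant law with z := y and w := x.
  fun x y => h x y y x
```

Each such implication is a short, mechanically checkable statement, which is why thousands of them can be split across many contributors without the pieces failing to fit together.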

There are other fields where this transition has happened, like in genetics. It used to be that if you wanted to sequence a genome of an organism, this was an entire Ph.D. thesis. Now we have these gene-sequencing machines, and so geneticists are sequencing entire populations. You can do different types of genetics that way. Instead of narrow, deep mathematics, where an expert human works very hard on a narrow scope of problems, you could have broad, crowdsourced problems with lots of AI assistance that are maybe shallower, but at a much larger scale. And it could be a very complementary way of gaining mathematical insight.

Wong: It reminds me of how an AI program made by Google DeepMind, called AlphaFold, figured out how to predict the three-dimensional structure of proteins, which was for a long time something that had to be done one protein at a time.

Tao: Right, but that doesn’t mean protein science is obsolete. You have to change the problems you study. A hundred and fifty years ago, mathematicians’ primary usefulness was in solving partial differential equations. There are computer packages that do this automatically now. Six hundred years ago, mathematicians were building tables of sines and cosines, which were needed for navigation, but these can now be generated by computers in seconds.

I’m not super interested in duplicating the things that humans are already good at. It seems inefficient. I think at the frontier, we will always need humans and AI. They have complementary strengths. AI is very good at converting billions of pieces of data into one good answer. Humans are good at taking 10 observations and making really inspired guesses.

What If Your ChatGPT Transcripts Leaked?

The Atlantic

www.theatlantic.com › newsletters › archive › 2024 › 10 › what-if-your-chatgpt-transcripts-leaked › 680165

This is Atlantic Intelligence, a newsletter in which our writers help you wrap your mind around artificial intelligence and a new machine age.

Shortly after Facebook became popular, the company launched an ad network that would allow businesses to gather data on people and target them with marketing. So many issues with the web’s social-media era stemmed from this original sin. It was from this technology that Facebook, now Meta, would make its fortune and become dominant. And it was here that our perception of online privacy forever changed, as people became accustomed to various bits of their identity being mined and exploited by political campaigns, companies with something to sell, and so on.

AI may shift how we experience the web, but it is unlikely to turn back the clock on the so-called surveillance economy that defines it. In fact, as my colleague Lila Shroff explained in a recent article for The Atlantic, chatbots may only supercharge data collection.

“AI companies are quietly accumulating tremendous amounts of chat logs, and their data policies generally let them do what they want. That may mean—what else?—ads,” Lila writes. “So far, many AI start-ups, including OpenAI and Anthropic, have been reluctant to embrace advertising. But these companies are under great pressure to prove that the many billions in AI investment will pay off.”

Ad targeting may be inevitable—in fact, since Lila wrote this article, Google has begun rolling out related advertisements in some of its AI Overviews—but there are other issues to contend with here. Users have long conversations with chatbots, and frequently share sensitive information with them. AI companies have a responsibility to keep those data locked down. But, as Lila explains, there have already been glitches that have leaked information. So think twice about what you type into that text box: You never know who’s going to see it.

Shh, ChatGPT. That’s a Secret.

By Lila Shroff

This past spring, a man in Washington State worried that his marriage was on the verge of collapse. “I am depressed and going a little crazy, still love her and want to win her back,” he typed into ChatGPT. With the chatbot’s help, he wanted to write a letter protesting her decision to file for divorce and post it to their bedroom door. “Emphasize my deep guilt, shame, and remorse for not nurturing and being a better husband, father, and provider,” he wrote. In another message, he asked ChatGPT to write his wife a poem “so epic that it could make her change her mind but not cheesy or over the top.”

The man’s chat history was included in the WildChat data set, a collection of 1 million ChatGPT conversations gathered consensually by researchers to document how people are interacting with the popular chatbot. Some conversations are filled with requests for marketing copy and homework help. Others might make you feel as if you’re gazing into the living rooms of unwitting strangers.

What to Read Next

It’s time to stop taking Sam Altman at his word: “Understand AI for what it is, not what it might become,” David Karpf writes.

We’re entering uncharted territory for math: “Terence Tao, the world’s greatest living mathematician, has a vision for AI,” Matteo Wong writes.

P.S.

Meta and other companies are still trying to make smart glasses happen—and generative AI may be the secret ingredient that makes the technology click, my colleague Caroline Mimbs Nyce wrote in a recent article. What do you think: Would you wear them?

— Damon