Itemoids

ChatGPT

The Order That Defines the Future of AI in America

The Atlantic

www.theatlantic.com › technology › archive › 2023 › 10 › biden-white-house-ai-executive-order › 675837

Earlier today, President Joe Biden signed the most sweeping set of regulatory principles on artificial intelligence in America to date: a lengthy executive order that directs all types of government agencies to make sure America is leading the way in developing the technology while also addressing the many dangers that it poses. The order explicitly pushes agencies to establish rules and guidelines, write reports, and create funding and research initiatives for AI—“the most consequential technology of our time,” in the president’s own words.

The scope of the order is impressive, especially given that the generative-AI boom began just about a year ago. But the document’s many parts—and there are many—are at times in tension, revealing a broader confusion over what, exactly, America’s primary attitude toward AI should be: Is it a threat to national security, or to a just society? Is it a geopolitical weapon? Is it a way to help people?

The Biden administration has answered “all of the above,” demonstrating a belief that the technology will soon be everywhere. “This is a big deal,” Alondra Nelson, a professor at the Institute for Advanced Study who previously served as acting director of the White House Office of Science and Technology Policy, told us. AI will be “as ubiquitous as operating systems in our cellphones,” Nelson said, which means that regulating it will involve “the whole policy space itself.” That very scale almost necessitates ambivalence, and it is as if the Biden administration has taken into account conflicting views without deciding on one approach.

One section of the order adopts wholesale the talking points of a handful of influential AI companies such as OpenAI and Google, while others center the concerns of workers, vulnerable and underserved communities, and civil-rights groups most critical of Big Tech. The order also makes clear that the government is concerned that AI will exacerbate misinformation, privacy violations, and copyright infringement. Even as it heeds the recommendations of Big AI, the order additionally outlines approaches to support smaller AI developers and researchers. And there are plenty of nods toward the potential benefits of the technology as well: AI, the executive order notes, has the “potential to solve some of society’s most difficult challenges.” It could be a boon for small businesses and entrepreneurs, create new categories of employment, develop new medicines, improve health care, and much more.  

If the document reads like a smashing-together of papers written by completely different groups, that’s because it likely is. The president and vice president have held meetings with AI-company executives, civil-rights leaders, and consumer advocates to discuss regulating the technology, and the Biden administration published a Blueprint for an AI Bill of Rights before the launch of ChatGPT last November. That document called for advancing civil rights, racial justice, and privacy protections, among other things. Today’s executive order cites and expands that earlier proposal—it directly addresses AI’s demonstrated ability to contribute to discrimination in contexts such as health care and hiring, the risks of using AI in sentencing and policing, and more. These issues existed long before the arrival of generative AI, a subcategory of artificial intelligence that creates new—or at least compellingly remixed—material based on training data, but those older AI programs stir the collective imagination less than ChatGPT, with its alarmingly humanlike language.

[Read: The future of AI is GOMA]

The executive order, then, is naturally focused to a great extent on the kind of ultrapowerful and computationally intensive software that underpins that newer technology. At particular issue are so-called dual-use foundation models, which have also been called “frontier AI” models—a term for future generations of the technology with supposedly devastating potential. The phrase was popularized by many of the companies that intend to build these models, and chunks of the executive order match the regulatory framing that these companies have recommended. One influential policy paper from this summer, co-authored in part by staff at OpenAI and Google DeepMind, suggested defining frontier-AI models as including those that would make designing biological or chemical weapons easier, those that would be able to evade human control “through means of deception and obfuscation,” and those that are trained above a threshold of computational power. The executive order uses almost exactly the same language and the same threshold.

A senior administration official speaking to reporters framed the sprawling nature of the document as a feature, not a bug. “AI policy is like running a decathlon,” the official said. “We don’t have the luxury of just picking, of saying, ‘We’re just going to do safety,’ or ‘We’re just going to do equity,’ or ‘We’re just going to do privacy.’ We have to do all of these things.” After all, the order has huge “signaling power,” Suresh Venkatasubramanian, a computer-science professor at Brown University who helped co-author the earlier AI Bill of Rights, told us. “I can tell you Congress is going to look at this, states are going to look at this, governors are going to look at this.”

Anyone looking at the order for guidance will come away with a mixed impression of the technology—which has about as many possible uses as a book has possible subjects—and likely also confusion about what the president decided to focus on or omit. The order spends quite a lot of words detailing how different agencies should prepare to address the theoretical impact of AI on chemical, biological, radiological, and nuclear threats, a framing drawn directly from the policy paper supported by OpenAI and Google. In contrast, the administration spends far fewer on the use of AI in education, a massive application for the technology that is already happening. The document acknowledges the role that AI can play in boosting resilience against climate change—such as by enhancing grid reliability and enabling clean-energy deployment, a common industry talking point—but it doesn’t once mention the enormous energy and water resources required to develop and deploy large AI models, nor the carbon emissions they produce. And it discusses the possibility of using federal resources to support workers whose jobs may be disrupted by AI but does not mention workers who are arguably exploited by the AI economy: for example, people who are paid very little to manually give feedback to chatbots.

[Read: America already has an AI underclass]

International concerns are also a major presence in the order. Among the most aggressive actions the order takes is directing the secretary of commerce to propose new regulations that would require U.S. cloud-service providers, such as Microsoft and Google, to notify the government if foreign individuals or entities who use their services start training large AI models that could be used for malicious purposes. The order also directs the secretary of state and the secretary of homeland security to streamline visa approval for AI talent, and urges several other agencies, including the Department of Defense, to prepare recommendations for streamlining the approval process for noncitizens with AI expertise seeking to work within national labs and access classified information.

Where the surveillance of foreign entities is an implicit nod to the U.S.’s fierce competition with and concerns about China in AI development, China is also the No. 1 source of foreign AI talent in the U.S. In 2019, 27 percent of top-tier U.S.-based AI researchers received their undergraduate education in China, compared with 31 percent who were educated in the U.S., according to a study from MacroPolo, a Chicago-based think tank that studies China’s economy. The document, in other words, suggests actions against foreign agents developing AI while underscoring the importance of international workers to the development of AI in the U.S.

[Read: The new AI panic]

The order’s international focus is no accident; it is being delivered right before a major U.K. AI Safety Summit this week, where Vice President Kamala Harris will be delivering a speech on the administration’s vision for AI. Unlike the U.S.’s broad approach, or that of the EU’s AI Act, the U.K. has been almost entirely focused on those frontier models—“a fairly narrow lane,” Nelson told us. In contrast, the U.S. executive order considers a full range of AI and automated decision-making technologies, and seeks to balance national security, equity, and innovation. The U.S. is trying to model a different approach to the world, she said.

The Biden administration is likely also using the order to make a final push on its AI-policy positions before the 2024 election consumes Washington and a new administration potentially comes in, Paul Triolo, an associate partner for China and a technology-policy lead at the consulting firm Albright Stonebridge, told us. The document expects most agencies to complete their tasks before the end of this term. The resulting reports and regulatory positions could shape any AI legislation brewing in Congress, which will likely take much longer to pass, and preempt a potential Trump administration that, if the past is any indication, may focus its AI policy almost exclusively on America’s global competitiveness.

Still, given that only 11 months have passed since the release of ChatGPT, and its upgrade to GPT-4 came less than five months after that, many of those tasks and timelines appear somewhat vague and distant. The order gives 180 days for the secretaries of defense and homeland security to complete a cybersecurity pilot project, 270 days for the secretary of commerce to launch an initiative to create guidance in another area, 365 days for the attorney general to submit a report on something else. The senior administration official told reporters that a newly formed AI Council among the agency heads, chaired by Bruce Reed, a White House deputy chief of staff, would ensure that each agency makes progress at a steady clip. Once the final deadline passes, perhaps the federal government’s position on AI will have crystallized.

But perhaps its stance and policies cannot, or even should not, settle. Like the internet itself, artificial intelligence is a capacious technology that could be developed, and deployed, in a dizzying combination of ways; Congress is still trying to figure out how copyright and privacy laws, as well as the First Amendment, apply to the decades-old web, and every few years the terms of those regulatory conversations seem to shift again.

A year ago, few people could have imagined how chatbots and image generators would change the basic way we think about the internet’s effects on elections, education, labor, or work; only months ago, the deployment of AI in search engines seemed like a fever dream. All of that, and much more in the nascent AI revolution, has begun in earnest. The executive order’s internal conflict over, and openness to, different values and approaches to AI may have been inevitable, then—the result of an attempt to chart a path for a technology when nobody has a reliable map of where it’s going.

The New Big Tech

The Atlantic

www.theatlantic.com › technology › archive › 2023 › 10 › big-ai-silicon-valley-dominance › 675752

Just about everything you do on the internet is filtered through a handful of tech companies. Google is synonymous with search, Amazon with shopping; much of that happens on phones made by Apple. You might not always know when you’re interacting with the tech giants. Google and Meta alone capture something like half of online ad revenue in the United States. Movies, music, workplace software, and government benefits are all hosted on Big Tech’s data servers.

Big Tech’s stranglehold has lasted for so long that, even with recent antitrust lawsuits and whistleblower exposés, it’s difficult to imagine a world in which these companies are not so dominant. But the craze over generative AI is raising that very possibility. OpenAI, a start-up with only a few hundred employees, kicked off the generative-AI boom with ChatGPT last November and, almost a year later, is still making fools of trillion-dollar rivals. In an age when AI promises to transform everything, new companies are hurtling forward, and some of the behemoths are struggling to keep up. “We’re at one of these moments that could be a succession moment” for the tech industry, Tim Wu, a professor at Columbia Law School who helped design the Biden administration’s antitrust and tech policy, told me.

Succession is hardly guaranteed, but a post–Big Tech world might not herald actual competition so much as a Silicon Valley dominated by another slate of fantastically large and powerful companies, some old and some new. Big Tech has wormed its way into every corner of our lives; now Big AI could be about to do the same.

Chatbots and their ilk are still in their early stages, but everything in the world of AI is already converging around just four companies. You could refer to them by the acronym GOMA: Google, OpenAI, Microsoft, and Anthropic. Shortly after OpenAI released ChatGPT last year, Microsoft poured $10 billion into the start-up and shoved OpenAI-based chatbots into its search engine, Bing. Not to be outdone, Google announced that more AI features were coming to Search, Maps, Docs, and more, and introduced Bard, its own rival chatbot. Microsoft and Google are now in a race to integrate generative AI into just about everything. Meanwhile, Anthropic, a start-up launched by former OpenAI employees, has raised billions of dollars in its own right, including from Google. Companies such as Slack, Expedia, Khan Academy, Salesforce, and Bain are integrating ChatGPT into their products; many others are using Anthropic’s chatbot, Claude.

Executives from GOMA have also met with leaders and officials around the world to shape the future of AI’s deployment and regulation. The four have overlapping but separate proposals for AI safety and regulation, but they have joined together to create the Frontier Model Forum, a consortium whose stated mission is to protect against the supposed world-ending dangers posed by terrifyingly capable models that do not yet exist but, it warns, are right around the corner. That existential language—about bioweapons and nuclear robots—has since migrated its way into all sorts of government proposals and language. If AI is truly reshaping the world, these companies are the sculptors.

Some of Big Tech’s old guard, meanwhile, haven’t been at the forefront of AI and are scrambling to get there. Apple has moved slowly on developing or incorporating generative AI, with one of its flashiest AI announcements centered on the mundane autocorrect. Siri remains the same old Siri. Amazon doesn’t have a salient language model and took almost a year to begin backing a major AI start-up in Anthropic; Meta’s premier language model is free to use, perhaps as a way to dissuade people from paying for OpenAI products. The company’s AI division is robust, but as a whole, Meta continues to lurch between social media, the metaverse, and chatbots.

Despite the large number of start-ups unleashed by the AI frenzy, the big four are already amassing technical and business advantages that are starting to look a lot like those of the current tech behemoths. Search, e-commerce, and the other Big Tech kingdoms were “prone towards tipping to just one or two dominant firms,” Charlotte Slaiman, the vice president of the nonprofit Public Knowledge, told me. “And I fear that AI may be like that as well.” Running a generative AI model such as ChatGPT comes at an “eye-watering” cost, in the words of OpenAI CEO Sam Altman, because the most advanced software requires a huge amount of computing power. One analysis estimated that Altman’s chatbot costs $700,000 a day to run, which OpenAI would not confirm or deny. A conversation with Bard could cost 10 times more than a Google Search, according to Alphabet Chairman John Hennessy (other estimates are much higher).

Those computing and financial costs mean that companies that have already built huge amounts of cloud services, such as Google and Microsoft, or start-ups closely partnered with them, such as Anthropic and OpenAI, might be uncatchable in the AI race. In addition to raw computing power, creating these programs also demands a huge amount of training data, and these companies have a big head start in collecting them: Every chat with GPT-4 might be fodder for GPT-5. “There’s a lot of potential for anticompetitive conduct or just natural business-model pressures” to crowd out competition, Adam Conner, the vice president of technology policy at the Center for American Progress, a left-of-center think tank, told me.

These companies’ access to Washington, D.C., might also help lock in their competitive advantage. Framing their technology as powerful enough to end civilization has turned out to be perversely fantastic PR, allowing GOMA to present itself as trustworthy and steer conversations around AI regulation. “I don’t think we’ve ever seen this particular brand of corporate policy posturing as public relations,” Amba Kak, the executive director of the AI Now Institute and a former adviser on AI at the Federal Trade Commission, told me. If regulators continue to listen, America’s AI policy could functionally amount to Big AI regulating itself.

For their part, the four GOMA companies have provided various visions for a healthy AI industry. A spokesperson from Google noted the company’s support for a competitive AI environment, including the large and diverse set of third-party and open-source programs offered on Google Cloud, as well as the company’s partnerships with numerous AI start-ups. Kayla Wood, a spokesperson for OpenAI, pointed me to a blog post in which the company states that it supports start-up and open-source AI projects that don’t pose “existential risk.” Katie Lowry, a spokesperson for Microsoft, told me that the company has said that AI companies choose Microsoft’s cloud services “to enable AI innovation,” and the company’s CEO, Satya Nadella, has framed Bing as a challenger of Google’s dominance. Anthropic, which did not respond to multiple requests for comment, might be better known for its calls to develop trustworthy models than for an actual product.

A scenario in which Big AI dislodges, or at least unsettles, Big Tech is far from preordained. Exactly where the tech industry and the internet are headed will be hard to discern until it becomes clearer exactly what AI can do, and exactly how it will make money. If AI ends up being nothing more than empty hype, Big AI may not be that big at all. Still, the most successful chatbots are, at least for now, built on top of the data and computing infrastructure that existing Silicon Valley giants have been constructing for years. “There is no AI today without Big Tech,” Kak said. Microsoft, Google, and Amazon control some two-thirds of cloud-computing resources around the world, and Meta has its own formidable network of data centers.

Even if their own programs don’t take off, then, Amazon and Meta are still likely to prosper in a world of generative AI as a result of their large cloud-computing services. Those data centers may also tip the power balance among Big AI toward Microsoft and Google and away from the start-ups. Even if OpenAI or Anthropic find unbelievable success, if their chatbots run on Microsoft’s and Amazon’s cloud services, then Microsoft and Amazon will profit. “It’s hard for me to see any Big Tech companies being dislodged,” Conner said. And if people talk to those chatbots on an iPhone, then Apple isn’t going anywhere either.

Then again, the social-media landscape had its dominant players in the mid-2000s, and instead, Facebook conquered all. Yahoo predated Google by years. Certainly, in the 1980s, nobody thought that some college dropouts could beat IBM in personal computing, yet Apple did just that. “If you bet against the online bookstore, you made the wrong bet,” Wu said, later adding, “Taking a look at the necessary scale now and extrapolating that into the future is a very common error.” More efficient programs, better computers, or efforts to build new data centers could make newer AI companies less dependent on existing cloud computing, for instance. Already, there are whispers that OpenAI is exploring making its own, specialized computer chips for AI. And other start-ups and open-source software, such as from MosaicML and Stability AI, could very well find rapid success and reconfigure the makeup of Big AI as it currently stands.

More likely is not a future in which Big AI takes over the internet entirely or one in which Big Tech sets itself up for another decade of rule, but a future in which they coexist: Google, Amazon, Apple, and the rest of the old guard continue to dominate search and shopping and smartphones and cloud computing, while a related set of companies control the chatbots and other AI models weaving their way into how we purchase, socialize, learn, work, and entertain ourselves. Microsoft offers a lesson in how flexible a tech giant can be: After massive success in the dot-com era, the company fell behind in the age of Apple and Google; it reinvented itself in the 2010s and is now riding the AI wave.

If GOMA has its way, perhaps one day Bing will make your travel plans and suggest convenient restaurants; ChatGPT will do your taxes and give medical advice; Claude will tutor your children; Bard will do your Christmas shopping. A Microsoft or OpenAI AI assistant will have helped code the apps you use for everything, and DALL-E will have helped animate your favorite television show. And all of that will happen via Google Chrome or Safari, on a physical MacBook or a Microsoft Surface or an Android purchased on Amazon. Somehow, Big Tech might be just emerging from its infancy.

OpenAI is set to see its valuation at $80 billion—making it the third most valuable startup in the world

Quartz

qz.com › openai-is-set-to-see-its-valuation-at-80-billion-makin-1850950928

OpenAI is in talks to strike a deal involving employee shares that would boost its value to $80 billion, according to recent reports. For the maker of generative artificial intelligence tools ChatGPT and DALL-E, that number is triple its valuation in January.

Read more...

AI Takes on Expiration Dates

The Atlantic

www.theatlantic.com › technology › archive › 2023 › 10 › ai-food-preservation-chemistry-rancidity-detection › 675723

This article was originally published by The Conversation.

Have you ever bitten into a nut or a piece of chocolate expecting a smooth, rich taste only to encounter an unexpected and unpleasant chalky or sour flavor? That taste is rancidity in action, and it affects pretty much every product in your pantry. Now artificial intelligence can help scientists tackle this issue more precisely and efficiently.

We’re a group of chemists who study ways to extend the life of food products, including those that go rancid. We recently published a study describing the advantages of AI tools to help keep oil and fat samples fresh for longer. Because oils and fats are common components in many food types, including chips, chocolate, and nuts, the outcomes of the study could be broadly applied and even affect other areas, including cosmetics and pharmaceuticals.

Food can go rancid when it’s exposed to the air for a while—a process called oxidation. In fact, many common ingredients react with oxygen, especially lipids, which are fats and oils. Heat or UV light can accelerate the process.

Oxidation leads to the formation of smaller molecules, such as ketones and aldehydes, that give rancid foods a characteristic rank scent. Repeatedly consuming rancid foods can threaten your health.

Fortunately, both nature and the food industry have an excellent shield against rancidity: antioxidants. Antioxidants include a broad range of natural molecules, such as vitamin C, and synthetic molecules capable of protecting your food from oxidation.

[Read: A crumpled, dried-out relic of the pandemic]

While there are a few ways antioxidants work, overall they can neutralize some of the processes that cause rancidity and preserve the flavors and nutritional value of your food for longer. Many customers don’t even know they are consuming added antioxidants, because food manufacturers typically add them in small amounts during preparation.

But you can’t just sprinkle some vitamin C on your food and expect to see a preservative effect. Researchers have to carefully choose a specific set of antioxidants and precisely calculate the amount of each.

Combining antioxidants does not always strengthen their effect. In fact, there are cases in which using the wrong antioxidants, or mixing them in the wrong ratios, can decrease their protective effect—that’s called “antagonism.” Finding out which combinations work for which types of food requires many experiments that are time-consuming, require specialized personnel, and increase the food’s overall cost.

Exploring all possible combinations would necessitate an enormous amount of time and resources, so researchers are stuck with a few mixtures that provide only some level of protection against rancidity. Here’s where AI comes into play.

You’ve probably seen AI tools such as ChatGPT in the news or have played around with them yourself. These types of systems can take in big sets of data and identify patterns, then generate an output that could be useful to the user.

As chemists, we wanted to teach an AI tool how to look for new combinations of antioxidants. For this, we selected a type of AI capable of working with textual representations, which are written codes describing the chemical structure of antioxidants. First, we fed our AI a list of about a million chemical reactions and taught the program some simple chemistry concepts, like how to identify important features of molecules.

Once the machine could recognize general chemical patterns, such as how certain molecules react with one another, we fine-tuned it by teaching it some more advanced chemistry. For this step, our team used a database of roughly 1,100 mixtures previously described in the research literature.
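
The workflow described above (pretraining on textual chemical representations, then fine-tuning on known mixtures so the program can predict a combination’s effect) can be illustrated with a much simpler stand-in. The sketch below swaps the study’s text-based model for an off-the-shelf regressor over character n-grams of SMILES-like strings; the molecules, the scores, and the model choice are hypothetical placeholders, not the study’s actual data or code.

```python
# Illustrative sketch only, not the model from the study. The idea: represent
# each antioxidant mixture as text (dot-separated SMILES-like strings),
# featurize that text, and learn to predict a "protection score" from known
# examples. All molecules and scores below are made-up placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

# Hypothetical training data: textual representations of two-antioxidant
# mixtures and a measured protection score (higher = better protection).
mixtures = [
    "OC1=CC=CC=C1O . CC(C)(C)C1=CC=CC=C1O",
    "OC(=O)C(O)C(O)C(=O)O . OC1=CC=CC=C1O",
    "CC(C)(C)C1=CC=CC=C1O . OC(=O)C(O)C(O)C(=O)O",
]
protection_scores = [0.82, 0.41, 0.65]  # placeholder lab measurements

# Character n-grams stand in for the chemical patterns a real text-based
# model would pick up during pretraining on reactions.
model = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 4)),
    RandomForestRegressor(n_estimators=200, random_state=0),
)
model.fit(mixtures, protection_scores)  # analogue of the fine-tuning step

# Predicting the effect of an unseen combination takes well under a second.
print(model.predict(["OC1=CC=CC=C1O . OC(=O)C(O)C(O)C(=O)O"]))
```

In the study’s setup, the analogue of the fit step is the fine-tuning on the roughly 1,100 mixtures from the literature; the analogue of the predict step is the sub-second prediction described below.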

[Read: Computers are learning to smell]

At this point, the AI could predict the effect of combining any set of two or three antioxidants in under a second. Its predictions aligned with the effects described in the literature 90 percent of the time.

But these predictions didn’t quite align with the experiments our team performed in the lab. In fact, we found that our AI was able to correctly predict the outcomes of only a few of the oxidation experiments we performed with real lard, which shows the complexities of transferring results from a computer to the lab.

Luckily, AI models aren’t static tools with predefined yes-and-no pathways. They’re dynamic learners, so our research team can continue feeding the model new data until it sharpens its predictive capabilities and can accurately predict the effect of each antioxidant combination. The more data the model gets, the more accurate it becomes, much like how humans grow through learning.

We found that adding about 200 examples from the lab enabled the AI to learn enough chemistry to predict the outcomes of the experiments performed by our team, with only a slight difference between the predicted and the real value.

A model like ours may one day be able to aid scientists developing better ways to preserve food by coming up with the best antioxidant combinations for the specific foods they’re working with—kind of like having a very clever assistant.

We Don’t Actually Know If AI Is Taking Over Everything

The Atlantic

www.theatlantic.com › technology › archive › 2023 › 10 › ai-technology-secrecy-transparency-index › 675699

Since the release of ChatGPT last year, I’ve heard some version of the same thing over and over again: What is going on? The rush of chatbots and endless “AI-powered” apps has made starkly clear that this technology is poised to upend everything—or, at least, something. Yet even the AI experts are struggling with a dizzying feeling that for all the talk of its transformative potential, so much about this technology is veiled in secrecy.

It isn’t just a feeling. More and more of this technology, once developed through open research, has become almost completely hidden within corporations that are opaque about what their AI models are capable of and how they are made. Transparency isn’t legally required, and the secrecy is causing problems: Earlier this year, The Atlantic revealed that Meta and others had used nearly 200,000 books to train their AI models without the compensation or consent of the authors.

Now we have a way to measure just how bad AI’s secrecy problem actually is. Yesterday, Stanford University’s Center for Research on Foundation Models launched a new index that tracks the transparency of 10 major AI companies, including OpenAI, Google, and Anthropic. The researchers graded each company’s flagship model based on whether its developers publicly disclosed 100 different pieces of information—such as what data it was trained on, the wages paid to the data and content-moderation workers who were involved in its development, and when the model should not be used. One point was awarded for each disclosure. Among the 10 companies, the highest-scoring model barely got more than 50 out of the 100 possible points; the average is 37. Every company, in other words, gets a resounding F.
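
The grading scheme itself is simple to picture. Below is a minimal sketch of the tally as described above, assuming one point per publicly disclosed item; the criterion names and yes/no values are hypothetical placeholders, not the index’s actual indicators.

```python
# Hypothetical illustration of the index's one-point-per-disclosure tally,
# not the Stanford team's actual criteria or code.
criteria = {
    "training_data_sources_described": True,
    "data_worker_wages_disclosed": False,
    "unsuitable_uses_documented": True,
    # ...the real index checks 100 such items for each flagship model
}

score = sum(criteria.values())  # True counts as 1 point, False as 0
print(f"Transparency score: {score} of {len(criteria)} criteria disclosed")
```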

Take OpenAI, which was named to indicate a commitment to transparency. Its flagship model, GPT-4, scored a 48—losing significant points for not revealing information such as the data that were fed into it, how it treated personally identifiable information that may have been captured in said scraped data, and how much energy was used to produce the model. Even Meta, which has prided itself on openness by allowing people to download and adapt its model, scored only 54 points. “A way to think about it is: You are getting baked cake, and you can add decorations or layers to that cake,” says Deborah Raji, an AI accountability researcher at UC Berkeley who wasn’t involved in the research. “But you don’t get the recipe book for what’s actually in the cake.”

[Read: These 183,000 books are fueling the biggest fight in publishing and tech]

Many companies, including OpenAI and Anthropic, have held that they keep such information secret for competitive reasons or to prevent risky proliferation of their technology, or both. I reached out to the 10 companies indexed by the Stanford researchers. An Amazon spokesperson said the company looks forward to carefully reviewing the index. Margaret Mitchell, a researcher and chief ethics scientist at Hugging Face, said the index misrepresented BLOOMZ as the firm’s model; it was in fact produced by an international research collaboration called the BigScience project that was co-organized by the company. (The Stanford researchers acknowledge this in the body of the report. For this reason, I marked BLOOMZ as a BigScience model, not a Hugging Face one, on the chart above.) OpenAI and Cohere declined a request for comment. None of the other companies responded.

The Stanford researchers selected the 100 criteria based on years of existing AI research and policy work, focusing on inputs into each model, facts about the model itself, and the final product’s downstream impacts. For example, the index references scholarly and journalistic investigations into the poor pay for data workers who help perfect AI models to explain its determination that the companies should specify whether they directly employ the workers and any labor protections they put in place. The lead creators of the index, Rishi Bommasani and Kevin Klyman, told me they tried to keep in mind the kinds of disclosures that would be most helpful to a range of different groups: scientists conducting independent research about these models, policy makers designing AI regulation, and consumers deciding whether to use a model in a particular situation.

In addition to insights about specific models, the index reveals industry-wide gaps in information. Not a single model the researchers assessed provides information about whether the data it was trained on had copyright protections or other rules restricting its use. Nor do any models disclose sufficient information about the authors, artists, and others whose works were scraped and used for training. Most companies are also tight-lipped about the shortcomings of their models, whether their embedded biases or how often they make things up.

That every company performs so poorly is an indictment of the industry as a whole. In fact, Amba Kak, the executive director of the AI Now Institute, told me that in her view, the index does not set a high enough standard. The opacity within the industry is so pervasive and ingrained, she told me, that even 100 criteria don’t fully reveal the problems. And transparency is not an esoteric concern: Without full disclosures from companies, Raji told me, “it’s a one-sided narrative. And it is almost always the optimistic narrative.”

[Read: AI’s present matters more than its imagined future]

In 2019, Raji co-authored a paper showing that several facial-recognition products, including ones being sold to the police, worked poorly on women and people of color. The research shed light on the risk of law enforcement using faulty technology. As of August, there have been six reported cases of police falsely accusing people of a crime in the U.S. based on flawed facial recognition; all of the accused are Black. These latest AI models pose similar risks, Raji said. Without giving policy makers or independent researchers the evidence they need to audit and back up corporate claims, AI companies can easily inflate their capabilities in ways that lead consumers or third-party app developers to use faulty or inadequate technology in crucial contexts such as criminal justice and health care.

There have been rare exceptions to the industry-wide opacity. One model not included in the index is BLOOM, which was similarly produced by the BigScience project (but is different from BLOOMZ). The researchers for BLOOM conducted one of the few analyses available to date of the broader environmental impacts of large-scale AI models and also documented information about data creators, copyright, personally identifiable information, and source licenses for the training data. It shows that such transparency is possible. But changing industry norms will require regulatory mandates, Kak told me. “We cannot rely on researchers and the public to be piecing together this map” of information, she said.

Perhaps the biggest clincher is that across the board, the tracker finds that all of the companies have particularly abysmal disclosures in “impact” criteria, which include the number of people who use their products, the applications being built on top of the technology, and the geographic distribution of where these technologies are being deployed. This makes it far more difficult for regulators to track each firm’s sphere of control and influence, and to hold them accountable. It’s much harder for consumers as well: If OpenAI technology is helping your kid’s teacher, assisting your family doctor, and powering your office productivity tools, you may not even know. In other words, we know so little about these technologies we’re coming to rely on that we can’t even say how much we rely on them.

Secrecy, of course, is nothing new in Silicon Valley. Nearly a decade ago, the tech and legal scholar Frank Pasquale coined the phrase black-box society to refer to the way tech platforms were growing ever more opaque as they solidified their dominance in people’s lives. “Secrecy is approaching critical mass, and we are in the dark about crucial decisions,” he wrote. And yet, despite the litany of cautionary tales from other AI technologies and social media, many people have grown comfortable with black boxes. Silicon Valley spent years establishing a new and opaque norm; now it’s just accepted as a part of life.

The New AI Panic

The Atlantic

www.theatlantic.com › technology › archive › 2023 › 10 › technology-exports-ai-programs-regulations-china › 675605

For decades, the Department of Commerce has maintained a little-known list of technologies that, on grounds of national security, are prohibited from being sold freely to foreign countries. Any company that wants to sell such a technology overseas must apply for permission, giving the department oversight and control over what is being exported and to whom.

These export controls are now inflaming tensions between the United States and China. They have become the primary way for the U.S. to throttle China’s development of artificial intelligence: The department last year limited China’s access to the computer chips needed to power AI and is now in discussions to expand those controls. A semiconductor analyst told The New York Times that the strategy amounts to a kind of economic warfare.

The battle lines may soon extend beyond chips. Commerce is considering a new blockade on a broad category of general-purpose AI programs, not just physical parts, according to people familiar with the matter. (I am granting them anonymity because they are not authorized to speak to the press.) Although much remains to be seen about how the controls would roll out—and, indeed, whether they will ultimately roll out at all—experts described alarming stakes. If enacted, the limits could generate more friction with China while weakening the foundations of AI innovation in the U.S.

Of particular concern to Commerce are so-called frontier models. The phrase, popularized in the Washington lexicon by some of the very companies that seek to build these models—Microsoft, Google, OpenAI, Anthropic—describes a kind of “advanced” artificial intelligence with flexible and wide-ranging uses that could also develop unexpected and dangerous capabilities. By their determination, frontier models do not exist yet. But an influential white paper published in July and co-authored by a consortium of researchers, including representatives from most of those tech firms, suggests that these models could result from the further development of large language models—the technology underpinning ChatGPT. The same prediction capabilities that allow ChatGPT to write sentences might, in their next generation, be advanced enough to produce individualized disinformation, create recipes for novel biochemical weapons, or enable other unforeseen abuses that could threaten public safety.

This is a distinctly different concern from the use of AI to develop autonomous military systems, which has been part of the motivation for limiting the export of computer chips. The threats of frontier models are nebulous, tied to speculation about how new skill sets could suddenly “emerge” in AI programs. The paper authors argue that now is the time to consider them regardless. Once frontier models are invented and deployed, they could cause harm quickly and at scale. Among the proposals the authors offer, in their 51-page document, to get ahead of this problem: creating some kind of licensing process that requires companies to gain approval before they can release, or perhaps even develop, frontier AI. “We think that it is important to begin taking practical steps to regulate frontier AI today,” the authors write.

The white paper arrived just as policy makers were contemplating the same dread that many have felt since the release of ChatGPT: an inability to parse what it all means for the future. Shortly after the paper’s publication, the White House used some of the language and framing in its voluntary AI commitments, a set of guidelines for leading AI firms that are intended to ensure the safe deployment of the technology without sacrificing its supposed benefits. Microsoft, Google, OpenAI, and Anthropic subsequently launched the Frontier Model Forum, an industry group for producing research and recommendations on “safe and responsible” frontier-model development.

[Read: AI’s present matters more than its imagined future]

Markus Anderljung, one of the white paper’s lead authors and a researcher at the Centre for the Governance of AI and the Center for a New American Security, told me that the point of the document was simply to encourage timely regulatory thinking on an issue that had become top of mind for him and his collaborators. AI models advance rapidly, he reasoned, which necessitates forward thinking. “I don’t know what the next generation of models will be capable of, but I’m really worried about a situation where decisions about what models are put out there in the world are just up to these private companies,” he said.

For the four private companies at the center of discussions about frontier models, though, this kind of regulation could prove advantageous. Conspicuously absent from the gang is Meta, which similarly develops general-purpose AI programs but has recently touted a commitment to releasing at least some of them for free. This has posed a challenge to the other firms’ business models, which rest in part on being able to charge for the same technology. Convincing regulators to control frontier models could restrict the ability of Meta and any other firms to continue publishing and developing their best AI models through open-source communities on the internet; if the technology must be regulated, better for it to happen on terms that favor the bottom line.

Reached for comment, the tech companies at the center of this conversation were fairly tight-lipped. A Google DeepMind spokesperson told me the company believes that “a focus on safety is essential to innovating responsibly,” which is why it is working with industry peers through the forum to advance research on both near- and long-term harms. An Anthropic spokesperson told me the company believes that models should be tested prior to any kind of deployment, commercial or open-source, and that identifying the appropriate tests is the most important question for government, industry, academia, and civil society to work on. Microsoft’s president, Brad Smith, has previously emphasized the need for government to play a strong role in promoting secure, accountable, and trustworthy AI development. OpenAI did not respond to a request for comment.

The obsession with frontier models has now collided with mounting panic about China, fully intertwining ideas for the models’ regulation with national-security concerns. Over the past few months, members of Commerce have met with experts to hash out what controlling frontier models could look like and whether it would be feasible to keep them out of reach of Beijing. A spokesperson for the department told me it routinely assesses the landscape and adjusts its regulations as needed. She declined a more detailed request for comment.

That the white paper took hold in this way speaks to a precarious dynamic playing out in Washington. The tech industry has been readily asserting its power, and the AI panic has made policy makers uniquely receptive to its messaging, says Emily Weinstein, who spoke with me as a research fellow at Georgetown’s Center for Security and Emerging Technology and has since joined Commerce as a senior adviser. Combined with concerns about China and the upcoming election, it’s engendering new and confused policy thinking about how exactly to frame and address the AI-regulatory problem. “Parts of the administration are grasping onto whatever they can because they want to do something,” Weinstein told me.

[Read: The AI crackdown is coming]

The discussions at Commerce “are uniquely symbolic” of this dynamic, she added. The department’s previous chip-export controls “really set the stage for focusing on AI at the cutting edge”; now export controls on frontier models could be seen as a natural continuation. Weinstein, however, called it “a weak strategy”; other AI and tech-policy experts I spoke with sounded their own warnings as well.

The decision would represent an escalation against China, further destabilizing a fractured relationship. Since the chip-export controls were announced on October 7 last year, Beijing has engaged in different apparent retaliatory measures, including banning products from the U.S. chip maker Micron Technology and restricting the export of certain chipmaking metals. Many Chinese AI researchers I’ve spoken with in the past year have expressed deep frustration and sadness over having their work—on things such as drug discovery and image generation—turned into collateral in the U.S.-China tech competition. Most told me that they see themselves as global citizens contributing to global technology advancement, not as assets of the state. Many still harbor dreams of working at American companies.

AI researchers also have a long-standing tradition of regularly collaborating online. Whereas major tech firms, including those represented in the white paper, have the resources to develop their own models, smaller organizations rely on open sourcing—sharing and building on code released to the broader community. Preventing researchers from releasing code would give smaller developers fewer pathways than ever to develop AI products and services, while the AI giants currently lobbying Washington may see their power further entrenched. “If the export controls are broadly defined to include open-source, that would touch on a third-rail issue,” says Matt Sheehan, a Carnegie Endowment for International Peace fellow who studies global technology issues with a focus on China.

What’s frequently left out of considerations as well is how much this collaboration happens across borders in ways that strengthen, rather than detract from, American AI leadership. As the two countries that produce the most AI researchers and research in the world, the U.S. and China are each other’s No. 1 collaborator in the technology’s development. They have riffed off each other’s work to advance the field and a wide array of applications far faster than either one would alone. Whereas the transformer architecture that underpins generative-AI models originated in the U.S., one of the most widely used algorithms, ResNet, was published by Microsoft researchers in China. This trend has continued with Meta’s open-source model, Llama 2. In one recent example, Sheehan saw a former acquaintance in China who runs a medical-diagnostics company post on social media about how much Llama 2 was helping his work. Assuming they’re even enforceable, export controls on frontier models could thus “be a pretty direct hit” to the large community of Chinese developers who build on U.S. models and in turn contribute their own research and advancements to U.S. AI development, Sheehan told me.

[Read: Tech companies’ friendly new strategy to destroy one another]

But the technical feasibility of such export controls is up in the air as well. Because the premise of these controls rests entirely on hypothetical threats, it’s essentially impossible to specify exactly which AI models should be restricted. Any specifications could also be circumvented easily, whether through China accelerating its own innovation or through American firms finding work-arounds, as the previous round of controls showed. Within a month of the Commerce Department announcing its blockade on powerful chips last year, the California-based chipmaker Nvidia announced a less powerful chip that fell right below the export controls’ technical specifications, and was able to continue selling to China. ByteDance, Baidu, Tencent, and Alibaba have each since placed orders for about 100,000 of Nvidia’s China chips to be delivered this year, and more for future delivery—deals that are worth roughly $5 billion, according to the Financial Times.

An Nvidia spokesperson said the kinds of chips that the company sells are crucial to accelerating beneficial applications globally, and that restricting its exports to China “would have a significant, harmful impact on U.S. economic and technology leadership.” The company is, however, unsurprisingly in favor of controlling frontier-AI models as an alternative, which it called a more targeted action with fewer unintended consequences. ByteDance, Baidu, Tencent, and Alibaba did not respond to a request for comment.

In some cases, fixating on AI models would serve as a distraction from addressing the root challenge: The bottleneck for producing novel biochemical weapons, for example, is not finding a recipe, says Weinstein, but rather obtaining the materials and equipment to actually synthesize the armaments. Restricting access to AI models would do little to solve that problem.

Sarah Myers West, the managing director of the AI Now Institute, told me there could be another benefit to the four companies pushing for frontier-model regulation. Evoking the specter of future threats shifts the regulatory attention away from present-day harms of their existing models, such as privacy violations, copyright infringements, and job automation. The idea that “this is a technology that carries significant dangers, so we don’t want it to fall into the wrong hands—I think that very much plays into the fear-mongering anti-China frame that has often been used as a means to pretty explicitly stave off any efforts and regulatory intervention” of the here and now, she said.

I asked Anderljung what he thinks of this. “People overestimate how much this is in the interest of these companies,” he told me, caveating that as an external collaborator he cannot fully know what the companies are thinking. A regulator could very well tell a company after a billion-dollar investment in developing a model that it is not allowed to deploy the technology. “I don’t think it’s at all clear that that would be in the interest of companies,” he said. He added that such controls would be a “yes, and” kind of situation. They would not in any way replace the need for other types of AI regulation on existing models and their harms. “It would be sad,” he said, if the fixation on frontier models crowded out those other discussions.

But West, Weinstein, and others I spoke with said that this is exactly what’s happening. “AI safety as a domain even a few years ago was much more heterogeneous,” West told me. Now? “We’re not talking about the effects on workers and the labor impacts of these systems. We’re not talking about the environmental concerns.” It’s no wonder: When resources, expertise, and power have concentrated so heavily in a few companies, and policy makers are steeped in their own cocktail of fears, the landscape of policy ideas collapses under pressure, eroding the base of a healthy democracy.

Biden’s New Student-Debt Strategy

The Atlantic

www.theatlantic.com › newsletters › archive › 2023 › 10 › student-loan-repayments-biden › 675551

This is an edition of The Atlantic Daily, a newsletter that guides you through the biggest stories of the day, helps you discover new ideas, and recommends the best in culture. Sign up for it here.

Yesterday, President Joe Biden announced an additional $9 billion in student-loan forgiveness. Since Biden’s mass student-loan-forgiveness plan was struck down by the Supreme Court this past summer (student-loan repayments officially resumed on October 1), his administration has been focusing on narrower strategies for relieving student debt, such as an income-driven repayment plan. I called Atlantic staff writer Adam Harris, who covers higher education, to discuss what’s next for the Americans most affected by the return of repayment, and the case for higher education as a public good.

First, here are three new stories from The Atlantic:

The red pill of humility
Kevin McCarthy got what he wanted.
This movie plot is the stuff of HR nightmares.

The Basis of Public Happiness

Isabel Fattal: What do you make of yesterday’s news of another $9 billion in debt relief?

Adam Harris: There are a few different programs that this relief, which covers about 125,000 people, is coming out of; it is the result of changes Biden made to income-driven repayment plans, as well as public-service loan forgiveness and relief for some borrowers with disabilities.

Over the past several years, the Biden administration has forgiven something like $127 billion in student debt—more than any other administration. Now it’s using some of the programs and levers already available to try to relieve even more. The current total is nothing to scoff at, but it still is only a small crack in the armor of this $1 trillion debt burden we have in the United States. What they’re trying to do is provide as much relief as possible under the programs that they believe are still legal.

Isabel: Who will likely be most affected by the return of student-loan payments this month?

Adam: A consistent fact over the past 20 years is that the borrowers who are most at risk for being in default, who are struggling to repay their student debt, are typically low income and from racial-minority groups—Black borrowers, Latino borrowers. A few months ago, the Consumer Financial Protection Bureau warned that basically one in five student borrowers has risk factors that indicate they could struggle now that student-loan payments have resumed. We know that discretionary spending helps the economy, and big-box retailers like Best Buy and Target have recently expressed concerns about the impacts of the return of repayment on their businesses. A Goldman Sachs report said that something like $70 billion of discretionary income will now be going toward these student-loan payments. If you think of discretionary income, it's not necessarily people going out and buying TVs. It’s that they have a little bit of additional money to do things with.

It’s not necessarily the folks who have $40,000, $50,000, $60,000, $70,000 in student debt, who went to medical school or went to law school, who make up the majority of borrowers who struggle. It’s people who started college and didn’t end up finishing. It’s people who have fewer than $10,000 in student-loan debt who will be likely struggling to repay that debt, even with a repayment plan that’s something like an extra $100 or $200 a month. That’s a car payment. That’s a bill that they will have to consider paying late.

Isabel: You wrote last year that mass student-debt forgiveness is not a solution for the underlying issue of college affordability in America. Are there notable government initiatives in place to tackle the issue of college affordability right now?

Adam: The Biden administration reintroduced a free-community-college proposal in its budget plan this past March. It was ultimately unsuccessful, but it shows that the administration is still interested in some of those programs that will remove the necessity for debt on the front end. Oftentimes we think of higher education as a private good, something that is for the benefit of the student who gets the degree, rather than thinking of it as a public good. At the founding of this nation, some of the Founding Fathers effectively said there is nothing that better deserves your patronage than education.

“Knowledge is in every country the surest basis of public happiness.” That’s George Washington to Congress in his first State of the Union address, saying that in order to build good citizens, you need educated citizens. I often think of that in this moment, when we’re requiring people to go deeply into debt in order to afford this thing that at the beginning people thought was essential to citizenship.

Isabel: Is there anything else you’re thinking about these days in terms of student debt?

Adam: There was a really interesting paper released recently, less focused on student-loan repayment and more about how we think and talk about student loans and how the media covers student loans. Dominique Baker was the lead researcher on it. One of the biggest findings was that very few of the people who had written articles about student loans among eight major publications had ever attended a community college, and the majority of them attended Ivy Plus or public flagship colleges.

If you look across America, around 40 percent of students who are enrolled in higher education in the nation attend community colleges. I have a lot of friends who started college, did not finish college, and now have something like $8,000 of student debt that they’re looking at, saying, How am I going to pay that off with my job that is only giving me enough to afford the basics of living? There are a lot of opportunities for the situation that we are in to spiral into an unsustainable one for a lot of people.

Related:

How student debt has contributed to “delayed” adulthood

Why some students are skipping college

Today’s News

At least 51 people have died after a Russian missile strike near the Ukrainian city of Kupiansk, in one of the deadliest attacks on civilians of the war.

In a sweeping move, the Biden administration waived 26 federal laws in South Texas to allow for border-wall construction.

Last month was the hottest September ever recorded, to the alarm of climate scientists.

Evening Read

Illustration by Ricardo Rey

Does Sam Altman Know What He’s Creating?

By Ross Andersen

On a Monday morning in April, Sam Altman sat inside OpenAI’s San Francisco headquarters, telling me about a dangerous artificial intelligence that his company had built but would never release. His employees, he later said, often lose sleep worrying about the AIs they might one day release without fully appreciating their dangers. With his heel perched on the edge of his swivel chair, he looked relaxed. The powerful AI that his company had released in November had captured the world’s imagination like nothing in tech’s recent history. There was grousing in some quarters about the things ChatGPT could not yet do well, and in others about the future it may portend, but Altman wasn’t sweating it; this was, for him, a moment of triumph.

In small doses, Altman’s large blue eyes emit a beam of earnest intellectual attention, and he seems to understand that, in large doses, their intensity might unsettle. In this case, he was willing to chance it: He wanted me to know that whatever AI’s ultimate risks turn out to be, he has zero regrets about letting ChatGPT loose into the world. To the contrary, he believes it was a great public service.

Read the full article.

More From The Atlantic

The cases against Trump: A guide

The West armed Ukraine for a caricature of modern war.

Culture Break

Sharon Core / Trunk Archives

Read. C Pam Zhang’s new novel, Land of Milk and Honey, asks whether seeking pleasure amid collapse is inherently immoral.

Listen. In the latest episode of Radio Atlantic, host Hanna Rosin explains the real reason Biden’s political wins don’t register with voters.

Play our daily crossword.

Katherine Hu contributed to this newsletter.

When you buy a book using a link in this newsletter, we receive a commission. Thank you for supporting The Atlantic.

Artists Are Losing the War Against AI

The Atlantic

www.theatlantic.com › technology › archive › 2023 › 10 › openai-dall-e-3-artists-work › 675519

Late last month, after a year-plus wait, OpenAI quietly released the latest version of its image-generating AI program, DALL-E 3. The announcement was filled with stunning demos—including a minute-long video demonstrating how the technology could, given only a few chat prompts, create and merchandise a character for a children’s story. But perhaps the widest-reaching and most consequential update came in two sentences slipped in at the end: “DALL-E 3 is designed to decline requests that ask for an image in the style of a living artist. Creators can now also opt their images out from training of our future image generation models.”

The language is a tacit response to hundreds of pages of litigation and countless articles accusing tech firms of stealing artists’ work to train their AI software, and provides a window into the next stage of the battle between creators and AI companies. The second sentence, in particular, cuts to the core of debates over whether tech giants like OpenAI, Google, and Meta should be allowed to use human-made work to train AI models without the creator’s permission—models that, artists say, are stealing their ideas and work opportunities.

OpenAI is claiming to offer artists a way to prevent, or “opt out” of, having their work included among the millions of photos, paintings, and other images that AI programs like DALL-E 3 train on to eventually generate images of their own. But opting out is an onerous process, and may be too complex to meaningfully implement or enforce. The ability to withdraw one’s work might also be coming too late: Current AI models have already digested a massive amount of work, and even if a piece of art is kept away from future programs, it’s possible that current models will pass the data they’ve extracted from those images on to their successors. If opting out affords artists any protection, it might extend only to what they create from here on out; work published online before 2023 could already be claimed by the machines.

“The past? It’s done—most of it, anyway,” Daniel Gervais, a law professor at Vanderbilt University who studies copyright and AI, told me. Image-generating programs and chatbots in wide commercial use have already consumed terabytes of images and text, some of which has likely been obtained without permission. Once such a model has been completed and deployed, it is not economically feasible for companies to retrain it in response to individual opt-out requests.

Even so, artists, writers, and others have been agitating to protect their work from AI recently. The ownership of not only paintings and photographs but potentially everything on the internet is at stake. Generative-AI programs like DALL-E and ChatGPT have to process and extract patterns from enormous amounts of pixels and text to produce realistic images and write coherent sentences, and the software’s creators are always looking for more data to improve their products: Wikipedia pages, books, photo libraries, social-media posts, and more.

In the past several days, award-winning and self-published authors alike have expressed outrage at the revelation, first reported in this magazine, that nearly 200,000 of their books had been used to train language models from Meta, Bloomberg, and other companies without permission. Lawsuits have been filed against OpenAI, Google, Meta, and several other tech companies accusing them of copyright infringement in the training of AI programs. Amazon is reportedly collecting user conversations to train an AI model for Alexa; in response to the generative-AI boom, Reddit now charges companies to scrape its forums for “human-to-human conversations”; Google has been accused of training AI on user data; and personal information from across the web is fed into these models. Any bit of content or data that any person has ever created on the web could be fodder for AI, and as of now it’s unclear whether anyone can stop tech companies from harvesting it, or how.

[Read: These 183,000 books are fueling the biggest fight in publishing and tech]

In theory, opting out should provide artists with a clear-cut way to protect a copyrighted work from being vacuumed into generative-AI models. They just have to add a piece of code to their website to stop OpenAI from scraping it, or fill out a form requesting that OpenAI remove an image from any training datasets. And if the company is building future models, such as a hypothetical DALL-E 4, from scratch, it should be “straightforward to remove these images,” Alex Dimakis, a computer scientist at the University of Texas at Austin and a co-director of the National AI Institute for Foundations of Machine Learning, told me. OpenAI would prune opted-out images from the training data before commencing any training, and the resulting model would have no knowledge of those works.
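
For the scraping side, the “piece of code” is typically a few lines in a site’s robots.txt file; OpenAI has said its GPTBot web crawler respects such directives. A minimal sketch, assuming a site owner wants to keep the crawler away from the entire site:

User-agent: GPTBot
Disallow: /

Blocking the crawler, of course, affects only future scraping of that site; images already collected, or gathered from other sources, still require the separate removal form.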

In practice, the mechanism might not be so simple. If DALL-E 4 is based on earlier iterations of the program, it will inevitably learn from the earlier training data, opted-out works included. Even if OpenAI trains new models entirely from scratch, it is possible, perhaps even probable, that AI-generated images from DALL-E 3, or images produced by similar models found across the internet, will be included in future training datasets, Alex Hanna, the director of research at the Distributed AI Research Institute, told me. Those synthetic training images, in turn, will bear traces of the human art underlying them.

Such is the labyrinthine, recursive world emerging from generative AI. Based on human art, machines create images that may be used to train future machines. Those machines will then create their own images, for which human art is still, albeit indirectly, a crucial source. And the cycle begins anew. A painting is a bit like a strand of DNA passed from one generation to the next, accumulating some mutations along the way.

Research has suggested that repeatedly training on synthetic data could be disastrous, compounding biases and producing more hallucinations. But many developers also believe that, if selected carefully, machine outputs can rapidly and cheaply augment training datasets. AI-generated data are already being used or experimented with to train new models from OpenAI, Google, Anthropic, and other companies. As more and more synthetic images and text flood the web, that feedback loop—generation after generation of AI models passing on patterns learned from human work, regardless of the creators’ permission—could become inescapable.

[Read: AI is an existential threat to itself]

In the opt-out form released last month, OpenAI wrote that, once trained, AI programs “no longer have access to [their training] data. The models only retain the concepts that they learned.” That is technically true, but the experts I spoke with agreed that generative-AI programs can retain a startling amount of information from an image in their training data—sometimes enough to reproduce it almost perfectly. “It seems to me AI models learn more than just concepts, in the sense that they also learn the form such concepts have assumed,” Giorgio Franceschelli, a computer scientist at the University of Bologna, told me over email. “In the end, they are trained to reproduce the work as-is, not its concepts.”

There are more quotidian concerns as well. The opt-out policy shifts the burden away from ultra-wealthy companies, which would otherwise have to ask for permission, and onto individual creators, who must now revoke it: The assumption is that a piece of art is available to AI models unless the artist says otherwise. “The subset of artists who are even aware and have the time of day to go and learn how to [opt out] is a pretty small subset,” Kelly McKernan, a painter who is suing Stability AI and Midjourney for allegedly infringing artists’ copyrights with their image-generating models, told me. (A spokesperson for Stability AI wrote in a statement that the company “has proactively solicited opt-out requests from creators, and will honor these over 160 million opt-out requests in upcoming training.” Midjourney did not immediately respond to a request for comment, but has filed a motion to dismiss the lawsuit.) The same could be true of an author having to separately flag every book, editorial, or blog post they’ve written.

Exactly how OpenAI will remove flagged images, or by what date, is unclear. The company declined an interview and did not respond to a written request for comment. Multiple computer scientists told me the company will likely use some sort of computer-vision model to comb through the dataset, similar to a Google Image search. But every time an image is cropped, compressed, or otherwise edited, it might become harder to identify, Dimakis said. It’s unclear whether a company would catch a photograph of an artist’s painting, rather than the image itself, or whether it might unknowingly feed that photo into an AI model.

Copyright and fair use are complicated, and far from decided matters when it comes to AI training data—courts could very well rule that nonconsensually using an image to train AI models is perfectly legal. All of this could make removing images, or winning litigation, even harder, Gervais told me. Artists who have allowed third-party websites to license their work may have no recourse to claw those images back at all. And OpenAI is only one piece of the puzzle—one company perfectly honoring every opt-out request will do nothing about the countless others until there is some sort of national, binding regulation.

Not everyone is skeptical of the opt-out mechanism, which has also been implemented for future versions of the popular image-generating model from Stability AI. Problems identifying copies of images or challenges with enforcement will exist with any policy, Jason Schultz, the director of the Technology Law and Policy Clinic at NYU, told me, and might end up being “edge case–ish.” Federal Trade Commission enforcement could keep companies compliant. And he worries that more artist-friendly alternatives, such as an opt-in mechanism—no training AI on copyrighted images unless given explicit permission—or some sort of revenue-sharing deal, similar to Spotify royalties, would benefit large companies with the resources to go out and ask every artist or divvy up some of their profits. Extremely strict copyright law when it comes to training generative AI, in other words, could further concentrate the power of large tech companies.

The proliferation of opt-out mechanisms, regardless of what one makes of their shortcomings, also shows that artists and publishers will play a key role in the future of AI. To build better, more accurate, or even “smarter” computers, companies will need to keep updating them with original writing, images, music, and so on, and originality remains a distinctly human trait.