You may be entitled to financial compensation… for your data

How the next wave of technology is upending the global economy and its power structures
Jun 29, 2023

By Barath Raghavan and Bruce Schneier

With help from Mohar Chatterjee and Derek Robertson

A photo of OpenAI's website. | Marco Bertorello/AFP/Getty Images

Earlier this year we reported on the concept of “data dignity,” or the belief that individuals should be acknowledged and even compensated for the data they contribute to AI models. Today two experts propose in POLITICO Magazine an “AI dividend,” a deceptively simple policy scheme for how the average American could get paid for what they contribute to systems like ChatGPT. Read the full op-ed below.

For four decades, Alaskans have opened their mailboxes to find checks waiting for them, their cut of the black gold beneath their feet. This is Alaska’s Permanent Fund, funded by the state’s oil revenues and paid to every Alaskan each year. We’re now in a different sort of resource rush, with companies peddling bits instead of oil: generative AI.

Everyone is talking about these new AI technologies — like ChatGPT — and AI companies are touting their awesome power. But they aren’t talking about how that power comes from all of us. Without all of our writings and photos that AI companies are using to train their models, they would have nothing to sell. Big Tech companies are taking the work of the American people without our knowledge or consent, without licensing it, and pocketing the proceeds.

You are owed profits for your data that powers today’s AI, and we have a way to make that happen. We call it the AI Dividend.

Our proposal is simple, and harkens back to the Alaskan plan. When Big Tech companies produce output from generative AI that was trained on public data, they would pay a tiny licensing fee, by the word or pixel or relevant unit of data. Those fees would go into the AI Dividend fund. Every few months, the Commerce Department would send out the entirety of the fund, split equally, to every resident nationwide. That’s it.

There’s no reason to complicate it further. Generative AI needs a wide variety of data, which means all of us are valuable — not just those of us who write professionally, or prolifically, or well. Figuring out who contributed to which words the AIs output would be both challenging and invasive, given that even the companies themselves don’t quite know how their models work. Paying the dividend to people in proportion to the words or images they create would just incentivize them to create endless drivel, or worse, use AI to create that drivel. The bottom line for Big Tech is that if their AI model was created using public data, they have to pay into the fund. If you’re an American, you get paid from the fund.

Under this plan, hobbyists and American small businesses would be exempt from fees. Only Big Tech companies — those with substantial revenue — would be required to pay into the fund. And they would pay at the point of generative AI output, such as from ChatGPT, Bing, Bard, or their embedded use in third-party services via Application Programming Interfaces.

Our proposal also includes a compulsory licensing plan. By agreeing to pay into this fund, AI companies will receive a license that allows them to use public data when training their AI. This won’t supersede normal copyright law, of course. If a model starts producing copyrighted material beyond fair use, that’s a separate issue.

Using today’s numbers, here’s what it would look like. The licensing fee could be small, starting at $0.001 per word generated by AI. A similar type of fee would be applied to other categories of generative AI outputs, such as images. That’s not a lot, but it adds up. Since most of Big Tech has started integrating generative AI into products, these fees would mean an annual dividend payment of a couple hundred dollars per person.
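To make that arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. The $0.001-per-word fee comes from the proposal itself; the annual output volume and the population figure are illustrative assumptions, not estimates from the authors.

```python
# Back-of-the-envelope sketch of the AI Dividend arithmetic described above.
FEE_PER_WORD = 0.001          # dollars per AI-generated word (from the op-ed)
US_POPULATION = 332_000_000   # rough 2023 U.S. population (assumption)

# Hypothetical annual volume of AI-generated words across Big Tech services;
# 60 trillion is an assumed figure chosen only to illustrate the scale.
WORDS_PER_YEAR = 60_000_000_000_000

fund_total = FEE_PER_WORD * WORDS_PER_YEAR       # annual fund, in dollars
dividend_per_person = fund_total / US_POPULATION

print(f"Annual fund: ${fund_total:,.0f}")                   # $60,000,000,000
print(f"Dividend per person: ${dividend_per_person:,.2f}")  # about $180
```

Under those assumptions the fund would collect about $60 billion a year, or roughly $180 per person, in line with the “couple hundred dollars” ballpark.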

The idea of paying you for your data isn’t new, and some companies have tried to do it themselves for users who opted in. And the idea of the public being repaid for use of their resources goes back to well before Alaska’s oil fund. But generative AI is different: It uses data from all of us whether we like it or not, it’s ubiquitous, and it’s potentially immensely valuable. It would cost Big Tech companies a fortune to create a synthetic equivalent to our data from scratch, and synthetic data would almost certainly result in worse output. They can’t create good AI without us.

Our plan would apply to generative AI used in the U.S. It also only issues a dividend to Americans. Other countries can create their own versions, applying a similar fee to AI used within their borders. Just as an American company collects VAT for services sold in Europe but not at home, each country can manage its AI policy independently.

Don’t get us wrong; this isn’t an attempt to strangle this nascent technology. Generative AI has interesting, valuable and possibly transformative uses, and this policy is aligned with that future. Even with the fees of the AI Dividend, generative AI will be cheap and will only get cheaper as technology improves. There are also risks — both everyday and esoteric — posed by AI, and the government may need to develop policies to remedy any harms that arise.

Our plan can’t make sure there are no downsides to the development of AI, but it would ensure that all Americans will share in the upsides — particularly since this new technology isn’t possible without our contribution.

 


 
 
class action on data rights

A class action lawsuit was filed against OpenAI Wednesday over the data the company scrapes from the internet to train its powerful artificial intelligence models. The lawsuit, which joins the ranks of cases like Getty Images vs. Stability AI, is part of a growing discussion over whether AI-generated content infringes on the intellectual property and rights of internet users.

AI companies argue that content created by generative AI falls under fair use because their models transform the original work. And while the Supreme Court issued a landmark fair use decision this year, it has yet to weigh in on the generative AI issue specifically.

Tracey Cowan, a partner at the Clarkson Law Firm, also flagged an additional angle to consider: that the data scraped from the internet includes the personal information and photographs of minors. “We can’t let these sharp corporate practices continue to go unchallenged, if for no other reason than to protect our children from the mass exploitation that we’re already seeing across the internet,” Cowan said in a press release. The Washington Post reported that the law firm is actively looking for more plaintiffs to join the sixteen they already have.

The lawsuit asks for a temporary freeze on the commercial use of OpenAI’s GPT-3.5, GPT-4, DALL-E, and VALL-E models. It also asks for “data dividends” as financial compensation for those people whose data was used to develop and train these models.

OpenAI did not immediately respond to a request for comment. — Mohar Chatterjee

tracking ai trackers

The landscape around AI “compliance” with still-nascent regulations is about as hazy as the skies in Washington today.

That means efforts to grade AI companies on how well they follow the rules — at least those that exist — are tricky to figure out. Stanford University’s Kevin Klyman, one of the authors of a report on compliance with the draft AI Act that we covered in DFD last week, spoke with POLITICO Digital Bridge’s Mark Scott today about his efforts to track how well systems from Meta, OpenAI, and others fare in the regulatory grinder.

“Because there’s so much up in the air, and because some of the requirements are under-specified, we had to fill in some of the gaps,” Klyman told Mark. One of the big sticking points for AI watchdogs and regulators is transparency, where Klyman says the major players are doing pretty well: “The way providers that are doing a good job at handling risks and mitigations is they have a section of their work related to the model where they say, ‘Here are all of the potential dangers from this model,’” he said.

Our colleague Mark, on the other hand, isn’t quite buying it yet. “Having a transparency section on a website about potential downsides and how a company will handle such risks doesn’t mean such safeguards will be enforced,” he wrote. “For that, you need greater disclosures on how these models operate — something, unfortunately, that almost all the firms did badly on.” — Derek Robertson

Tweet of the Day

AI fearmongering has been around for literal decades before the tech even existed, but AI accelerationism only became a thing in the last 6-12 months when capital made it attractive

the future in 5 links

Stay in touch with the whole team: Ben Schreckinger (bschreckinger@politico.com); Derek Robertson (drobertson@politico.com); Mohar Chatterjee (mchatterjee@politico.com); and Steve Heuser (sheuser@politico.com). Follow us @DigitalFuture on Twitter.

If you’ve had this newsletter forwarded to you, you can sign up and read our mission statement at the links provided.

 


 
 
 

Follow us on Twitter

Ben Schreckinger @SchreckReports

Derek Robertson @afternoondelete

Steve Heuser @sfheuser

 
