Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Ziff Davis study says AI firms rely on publisher data to train models

Axios: “Leading AI companies such as OpenAI, Google and Meta rely more on content from premium publishers to train their large language models (LLMs) than they publicly admit, according to new research from executives at Ziff Davis, one of the largest publicly-traded digital media companies.

Why it matters: Publishers believe that the more they can show that their high-end content has contributed to training LLMs, the more leverage they will have in seeking copyright protection and compensation for their material in the AI era.

Zoom in: While AI firms generally do not say exactly what data they use for training, executives from Ziff Davis say their analysis of publicly available datasets makes it clear that AI firms rely disproportionately on commercial publishers of news and media websites to train their LLMs.

  • The paper — authored by Ziff Davis’ lead AI attorney, George Wukoson, and its chief technology officer, Joey Fortuna — finds that for some large language models, content from a set of 15 premium publishers made up a significant amount of the data sets used for training.
  • For example, when analyzing an open-source replication of the OpenWebText dataset from OpenAI that was used to train GPT-2, executives found that nearly 10% of the URLs featured came from the set of 15 premium publishers it studied.

Of note: Ziff Davis is a member of the News/Media Alliance (NMA), a trade group that represents thousands of premium publishers. The new study’s findings resemble those of a research paper submitted by NMA to the U.S. Copyright Office last year…”

Streaming subscription fees have been rising while content quality is dropping

Ars Technica: “Subscription fees for video streaming services have been on a steady incline. But despite subscribers paying more, surveys suggest they’re becoming less satisfied with what’s available to watch. At the start of 2024, the industry began declaring the end of Peak TV, a term coined by FX Networks Chairman John Landgraf that refers…

Inside the Massive Crime Industry That’s Hacking Billion-Dollar Companies

Wired unpaywalled: “…AT&T. Ticketmaster. Santander Bank. Neiman Marcus. Electronic Arts. These were not entirely isolated incidents. Instead, they were all hacked thanks to “infostealers,” a type of malware that is designed to pillage passwords and cookies stored in the victim’s browser. In turn, infostealers have given birth to a complex ecosystem that has been allowed…

Forgot Your Wi-Fi Passwords?

Gizmodo: “We’ve all been there: You’ve got a fancy new phone or laptop, and it’s time to set it up, but you have no idea what the Wi-Fi password is. Maybe it’s a long string of characters on the back of your router in another room or written on a Post-It note somewhere in the…

Managing Election Stress

Your Local Epidemiologist:

Managing Election Stress
  • 5 Therapists Share How They Plan to Manage Election Day Stress
  • Managing Stress Related to Political Change
  • 8 Ways to Cope with Election Anxiety

Navigating Conversations
  • Managing Conversations When You Disagree Politically
  • Are political disagreements stressing you out? Here are tips to bridge the divide.

Self-Care Strategies
  • Take Good…

Google Asked to Remove 10 Billion “Pirate” Search Results

TorrentFreak – “Rightsholders have asked Google to remove more than 10 billion ‘copyright infringing’ URLs from its search results. The search engine doesn’t celebrate the milestone in any way, but the takedown notices document intriguing shifts in volume over time, as well as shifting takedown interests. While search engines are extremely helpful for the average…

In Praise of Hearing Aids

Slate – unpaywalled: Millions of Americans who could benefit from them don’t use them. Why not? – “One morning this past March, I stirred from slumber convinced that someone had snuck in overnight and packed my ear with Jell-O. Still drowsy, I registered that the fullness seemed to extend horizontally toward the window, some six…

Justice Department to Monitor Polls in 27 States for Compliance with Federal Voting Rights Laws

Federal prosecutors are on call in every district in the country. Interfering with voting rights is a federal offense. Any effort to intimidate, harass, threaten or harm voters will result in a visit from the FBI. Witnesses or victims may call 9-1-1. (Making a false report is a crime, too!) Let’s keep voting safe, free,…