Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft

Wired – “The project’s leader says that allowing everyone to access the collection of public-domain books will help ‘level the playing field’ in the AI industry. Harvard University announced Thursday it’s releasing a high-quality dataset of nearly 1 million public-domain books that could be used by anyone to train large language models and other AI tools. The dataset was created by Harvard’s newly formed Institutional Data Initiative with funding from both Microsoft and OpenAI. It contains books scanned as part of the Google Books project that are no longer protected by copyright. Around five times the size of the notorious Books3 dataset that was used to train AI models like Meta’s Llama, the Institutional Data Initiative’s database spans genres, decades, and languages, with classics from Shakespeare, Charles Dickens, and Dante included alongside obscure Czech math textbooks and Welsh pocket dictionaries. Greg Leppert, executive director of the Institutional Data Initiative, says the project is an attempt to ‘level the playing field’ by giving the general public, including small players in the AI industry and individual researchers, access to the sort of highly-refined and curated content repositories that normally only established tech giants have the resources to assemble. ‘It’s gone through rigorous review,’ he says… However the IDI’s dataset is released, it will be joining a host of similar projects, startups, and initiatives that promise to give companies access to substantial and high-quality AI training materials without the risk of running into copyright issues. Firms like Calliope Networks and ProRata have emerged to issue licenses and manage compensation schemes designed to get creators and rights holders paid for providing AI training data…”

TIME Person of the Year Fact Check Transcript

Fact-Checking What Donald Trump Said in His 2024 Person of the Year Interview With TIME. Question: has TIME ever fact-checked the statements of its choice for Person of the Year? They had to interpret, clarify, restate and correct his statements. TIME has published the transcript of that conversation. In addition, below is a review for…

Teens, Social Media and Technology 2024

Pew: Most teens use social media and have a smartphone, and nearly half say they’re online almost constantly. “Nine-in-ten teens report using YouTube, slightly down from 95% in 2022. Roughly six-in-ten teens say they use TikTok and Instagram, and 55% say the same for Snapchat. YouTube tops the list of the online platforms we asked…

Website Shows How Much Google’s AI Can Glean From Your Photos

Wired – A photo sharing startup founded by an ex-Google engineer found a clever way to turn Google’s tech against itself. “…Last month, Ente launched https://Theyseeyourphotos.com, a website and marketing stunt designed to turn Google’s technology against itself. People can upload any photo to the website, which is then sent to a Google Cloud computer…

Nikon Comedy Wildlife Photography Awards

2024 Winners Portfolio – “The free competition is open to all wildlife photography novices, amateurs and professionals and celebrates the hilarity of our natural world. From a surprised otter to a swearing turtle, Comedy Wildlife’s photographs transcend cultures and ages to bring a smile to everyone’s face. You can find out more about our competition,…

Courtroom Seating Pilot Program

The Supreme Court is implementing a pilot program in which members of the public may apply for Courtroom seating through a fully automated online lottery. Individuals who receive tickets through the lottery will be able to come to the Court knowing that they have reserved seating for a particular argument or non-argument session. The pilot…

Trump Advisers Seek to Shrink or Eliminate Bank Regulators

WSJ via MSN – “The Trump transition team has started to explore pathways to dramatically shrink, consolidate or even eliminate the top bank watchdogs in Washington. In recent interviews with potential nominees to lead bank regulatory agencies, Trump advisers and officials from his newfound Department of…

AI GOvernance and Regulatory Archive

Center for Security and Emerging Technology at Georgetown University – Welcome to ETO AGORA (AI GOvernance and Regulatory Archive), a living collection of AI-relevant laws, regulations, standards, and other governance. Browse Documents or search Thematic Collection. How do I use the interface? What’s included and what isn’t? Download AGORA data in bulk

PetSavers Ageing Canine Toolkit

“This PetSavers funded resource, developed from research carried out at the University of Liverpool, has been designed to inform pet owners on the common conditions associated with canine ageing, and encourage regular health screening, veterinary communication and age-appropriate care. Click here to see our collection of resources supporting the toolkit.” BSAVA PetSavers Ageing Canines Booklet…