Ars Technica: “Newly unsealed emails allegedly provide the “most damning evidence” yet against Meta in a copyright case raised by book authors alleging that Meta illegally trained its AI models on pirated books. Last month, Meta admitted to torrenting a controversial large dataset known as LibGen, which includes tens of millions of pirated books. But details around the torrenting were murky until yesterday, when Meta’s unredacted emails were made public for the first time.

The new evidence showed that Meta torrented “at least 81.7 terabytes of data across multiple shadow libraries through the site Anna’s Archive, including at least 35.7 terabytes of data from Z-Library and LibGen,” the authors’ court filing said. And “Meta also previously torrented 80.6 terabytes of data from LibGen.” “The magnitude of Meta’s unlawful torrenting scheme is astonishing,” the authors’ filing alleged, insisting that “vastly smaller acts of data piracy—just .008 percent of the amount of copyrighted works Meta pirated—have resulted in Judges referring the conduct to the US Attorneys’ office for criminal investigation.”

Seeding expands authors’ distribution theory

Book authors had been pressing Meta for more information on the torrenting because of the seemingly obvious copyright concern of Meta seeding, and thus seemingly distributing, the pirated books in the dispute. But Meta resisted those discovery attempts after an order denied authors’ request to review Meta’s torrenting and seeding data. That didn’t stop authors from gathering evidence anyway, including a key document that starts with at least one staffer appearing to uncomfortably joke about the possible legal risks, eventually growing more serious about raising his concerns.

“Torrenting from a corporate laptop doesn’t feel right,” Nikolay Bashlykov, a Meta research engineer, wrote in an April 2023 message, adding a smiley emoji.
In the same message, he expressed “concern about using Meta IP addresses ‘to load through torrents pirate content.’” By September 2023, Bashlykov had seemingly dropped the emojis, consulting the legal team directly and emphasizing in an email that “using torrents would entail ‘seeding’ the files—i.e., sharing the content outside, this could be legally not OK.” Emails discussing torrenting prove that Meta knew it was “illegal,” authors alleged. And Bashlykov’s warnings seemingly landed on deaf ears, with authors alleging that evidence showed Meta chose to instead hide its torrenting as best it could while downloading and seeding terabytes of data from multiple shadow libraries as recently as April 2024…”