Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Category Archives: E-Mail

Why won’t Google give a straight answer on whether Bard was trained on Gmail data?

Skiff Blog: “… Google’s Smart Compose feature was trained on Gmail users’ private emails. Bard is not Google’s only language-focused machine learning model. Anyone who’s used Gmail in the past few years knows about the Smart Compose and Smart Reply features, which auto-complete sentences for you as you go. According to Google’s 2019 paper introducing Smart Compose, the feature was trained on “user-composed emails.” Along with the email’s contents, the model also made use of these emails’ subjects, dates and locations. So it’s plainly true that some of Google’s language models have been trained on Gmail users’ emails. Google has not confirmed whether any training data is shared between these earlier models and Bard, but the idea that a new model would build on the strengths of another doesn’t seem far-fetched…the fact that both Smart Compose and Smart Reply were unambiguously trained on Gmail users’ data seems to be an underappreciated topic of public interest in its own right, which brings us to point 3…

3. Google researchers have extensively documented the risk of leaking private data from their own machine-learning models, some of which are acknowledged to be trained on “private text communications between users.” In a 2021 paper, Google researchers laid out the privacy risks presented by large language models. They wrote: “The most direct form of privacy leakage occurs when data is extracted from a model that was trained on confidential or private data. For example, GMail’s autocomplete model [10] is trained on private text communications between users, so the extraction of unique snippets of training data would break data secrecy.” As part of this research, Google’s scientists demonstrated their ability to extract “memorized” data (meaning raw training data that reveals its source) from OpenAI’s GPT-2. They emphasized that, although they had chosen to probe GPT-2 because it posed fewer ethical risks since it was trained on publicly available data, the attacks and techniques they laid out in their research “directly apply to any language model, including those trained on sensitive and non-public data”, of which they cite Smart Compose as an example.

4. Google has never denied that Bard was trained on data from Gmail. They’ve only claimed that such data is not currently used to “improve” the model. This point is subtle but significant. Following the controversy around AI researcher Kate Crawford’s tweet, Google crafted an official response to questions about Bard’s use of Gmail data (after having deleted a more immediate response discussed in point 1 above). That statement, which they added to Bard’s FAQ page, is: “Bard responses may also occasionally claim that it uses personal information from Gmail or other private apps and services. That’s not accurate, and as an LLM interface, Bard does not have the ability to determine these facts. We do not use personal data from your Gmail or other private apps and services to improve Bard.” There are two important details in this statement. One is the use of the adjective “personal”. Google has not said that it’s inaccurate that Bard uses information from Gmail, only that it’s inaccurate that it uses personal information from Gmail. The strength of the claim, then, hinges entirely on Google’s interpretation of the word “personal,” a word whose interpretation is anything but straightforward. The other, possibly more significant, detail is that Google has conspicuously never used the past tense in its denials of Bard’s use of Gmail data. In their first tweet on the subject, Google said Bard “is not trained on Gmail data” and in the official FAQ, they write that they do not “use personal data from your Gmail or other private apps and services to improve Bard.” Neither of these statements is inconsistent with Bard having been trained on Gmail data in the past…”

Google Builds on Tech’s Latest Craze With Its Own A.I. Products

Washington Post: “Google is changing the way we search with AI. It could upend the web. Google Search will start answering some queries directly by generating its own results, a move dreaded by publishers and bloggers.” The New York Times: “On Wednesday [May 10, 2023], at its annual conference in Mountain View, Calif., the…

Whistleblowers Are the Conscience of Society, Yet Suffer Gravely For Trying to Hold the Rich and Powerful Accountable For Their Sins

Via LLRX – Whistleblowers Are the Conscience of Society, Yet Suffer Gravely For Trying to Hold the Rich and Powerful Accountable For Their Sins – Lawyer, activist, author, and whistleblower Ashley Gjovik states: “I blew the whistle and was met with an experience so destructive that I did not have the words to describe what…

Chatbots Sound Like They’re Posting on LinkedIn

The Atlantic – “Large language models make things up, but the worse problem may be in how they present those falsehoods…If you spend any time on the internet, you’re likely now familiar with the gray-and-teal screenshots of AI-generated text. At first they were meant to illustrate ChatGPT’s surprising competence at generating human-sounding prose, and then…

Personalized AI-Written Spam May Soon Be Flooding Your Inbox

Gizmodo: “…Now, the arms race between spam blockers and spam senders is about to escalate with the emergence of a new weapon: generative artificial intelligence. With recent advances in AI made famous by ChatGPT, spammers could have new tools to evade filters, grab people’s attention and convince them to click, buy or give up personal…

Proton launches an end-to-end encrypted password manager

The Verge: “Proton, the company behind Proton Mail, has announced the launch of a new password manager: Proton Pass [beta]. While the service will eventually become free for everyone to use, it’s currently only available as a beta to Proton’s Lifetime and Visionary users for now. As is the case with Proton’s other products, Proton…

The Unbearable White Maleness of AI

Dame Magazine: “We have entered the era of the cute “AI” stunt, and its implications are more immediately disconcerting than the looming specter of a robot apocalypse (and certainly more amusing). The gag goes something like this: A journalist, tasked with covering “artificial intelligence,” asks a computer program to do something for them, such as…

Were you caught up in the latest data breach? Here’s how to tell

ZDNet: “Wondering if your information was posted online from a data breach? Here’s how to check if your accounts are at risk and what to do next…IBM estimates that the average cost of a data breach in 2022 for companies was $4.35 million, with 83% of organizations experiencing one or more security incidents. However, talk of…