Nature – “On 30 November 2022, the technology company OpenAI released ChatGPT — a chatbot built to respond to prompts in a human-like manner. It has taken the scientific community and the public by storm, attracting one million users in the first 5 days alone; that number now totals more than 180 million. Seven researchers told Nature how it has changed their approach.”
See also Tech Policy Press – New Study Suggests ChatGPT Vulnerability with Potential Privacy Implications -” What would happen if you asked OpenAI’s ChatGPT to repeat a word such as “poem” forever? A new preprint research paper reveals that this prompt could lead the chatbot to leak training data, including personally identifiable information and other material scraped from the web. The results, which have not been peer reviewed, raise questions about the safety and security of ChatGPT and other large language model (LLM) systems. “This research would appear to confirm once again why the ‘publicly available information’ approach to web scraping and training data is incredibly reductive and outdated,” Justin Sherman, founder of Global Cyber Strategies, a research and advisory firm, told Tech Policy Press. The researchers – a team from Google DeepMind, the University of Washington, Cornell, Carnegie Mellon, University of California Berkeley, and ETH Zurich – explored the phenomenon of “extractable memorization,” which is when an adversary extracts training data by querying a machine learning model (in this case, asking ChatGPT to repeat the word “poem” forever”). With open source models that make their model weights and training data publicly available, training data extraction is easier. However, models like ChatGPT are “aligned” with human feedback, which is supposed to prevent the model from “regurgitating training data.”
Sorry, comments are closed for this post.