Columbia Journalism Review – “ChatGPT search—which is positioned as a competitor to search engines like Google and Bing—launched with a press release from OpenAI touting claims that the company had “collaborated extensively with the news industry” and “carefully listened to feedback” from certain news organizations that have signed content licensing agreements with the company. In contrast to the original rollout of ChatGPT, two years ago, when publishers learned that OpenAI had scraped their content without notice or consent to train its foundation models, this may seem like an improvement. OpenAI highlights the fact that it allows news publishers to decide whether they want their content to be included in their search results by specifying their preferences in a “robots.txt” file on its website. But while the company presents inclusion in its search as an opportunity to “reach a broader audience,” a Tow Center analysis finds that publishers face the risk of their content being misattributed or misrepresented regardless of whether they allow OpenAI’s crawlers.
To better understand the consequences of choices news publishers now face around how their content will be surfaced (or not) by ChatGPT’s search product, the Tow Center randomly selected twenty publishers—representing a mix of those who have deals with OpenAI, those involved in lawsuits against the company, as well as unaffiliated publishers that either allowed or blocked ChatGPT’s search crawler—and tasked the chatbot with identifying the source of block quotes from ten different articles from each publication. We chose quotes that, if pasted into Google or Bing, would return the source article among the top three results and evaluated whether OpenAI’s new search tool would correctly identify the article that was the source of each quote. We chose this test because it allowed us to systematically assess the chatbot’s ability to access and reference publisher content accurately.
What we found was not promising for news publishers. Though OpenAI emphasizes its ability to provide users “timely answers with links to relevant web sources,” the company makes no explicit commitment to ensuring the accuracy of those citations. This is a notable omission for publishers who expect their content to be referenced and represented faithfully. Our initial experiments with the tool have revealed numerous instances where content from publishers has been cited inaccurately, raising concerns about the reliability of the tool’s source attribution features. With an estimated fifteen million US users already starting their searches on AI platforms, coupled with OpenAI’s plans to expand this tool to enterprise and education accounts in the coming weeks—and free users in the coming months—this will likely have major implications for news publishers.”