Engel, Christoph and McAdams, Richard H., Asking GPT for the Ordinary Meaning of Statutory Terms (February 6, 2024). MPI Collective Goods Discussion Paper, No. 2024/5, Available at SSRN: https://ssrn.com/abstract=4718347 or http://dx.doi.org/10.2139/ssrn.4718347
“We report on our test of the Large Language Model (LLM) ChatGPT (GPT) as a tool for generating evidence of the ordinary meaning of statutory terms. We explain why the most useful evidence for interpretation involves a distribution of replies rather than only what GPT regards as the single “best” reply. That motivates our decision to use GPT 3.5 Turbo instead of GPT 4 and to run each prompt we use 100 times. Asking GPT whether the statutory term “vehicle” includes a list of candidate objects (e.g., bus, bicycle, skateboard) allows us to test it against a benchmark, the results of a high-quality experimental survey (Tobia 2020) that asked over 2,800 English speakers the same questions. After learning which prompts fail and which one works best (a belief prompt combined with a Likert scale reply), we use the successful prompt to test the effects of “informing” GPT that the term appears in a particular rule (one of five possible) or that the legal rule using the term has a particular purpose (one of six possible). Finally, we explore GPT’s sensitivity to meaning at a particular moment in the past (the 1950s) and its ability to distinguish extensional from intensional meaning. To our knowledge, these are the first tests of GPT as a tool for generating empirical data on the ordinary meaning of statutory terms. Legal actors have good reason to be cautious, but LLMs have the potential to radically facilitate and improve legal tasks, including the interpretation of statutes.”
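The core of the method described above is repeated sampling: the same prompt is sent 100 times and the Likert-scale replies are tabulated into a distribution, rather than taking a single "best" answer. A minimal sketch of that aggregation step follows. The Likert labels, the prompt wording, and the `ask_gpt` stub are all hypothetical stand-ins (the paper's exact prompts and scale are not reproduced here); a real run would replace the stub with a chat-completion call to gpt-3.5-turbo at a temperature above zero.

```python
from collections import Counter

# Hypothetical five-point Likert scale; the paper's actual labels may differ.
LIKERT = ["definitely not", "probably not", "unsure", "probably yes", "definitely yes"]

def ask_gpt(term: str, candidate: str) -> str:
    """Stub standing in for one API call to the model.

    A real implementation would send a 'belief'-style prompt such as
    'Do ordinary English speakers believe a {candidate} is a {term}?'
    and map the reply onto the Likert scale. Here we return a fixed
    label so the sketch runs offline.
    """
    return "probably yes"

def reply_distribution(term: str, candidate: str, n: int = 100) -> dict:
    """Repeat the prompt n times and tabulate the Likert replies,
    yielding a distribution of answers rather than a single reply."""
    counts = Counter(ask_gpt(term, candidate) for _ in range(n))
    return {label: counts.get(label, 0) / n for label in LIKERT}

dist = reply_distribution("vehicle", "bicycle")
print(dist)
```

With a live, non-deterministic model the resulting proportions would spread across the scale, and that spread is the evidence of ordinary meaning the authors compare against the survey benchmark.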