Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources

by Sabrina I. Pacifici on Mar 1, 2015

“The quality of web sources has been traditionally evaluated using exogenous signals such as the hyperlink structure of the graph. We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source. A source that has few false facts is considered to be trustworthy. The facts are automatically extracted from each source by information extraction methods commonly used to construct knowledge bases. We propose a way to distinguish errors made in the extraction process from factual errors in the web source per se, by using joint inference in a novel multi-layer probabilistic model. We call the trustworthiness score we computed Knowledge-Based Trust (KBT). On synthetic data, we show that our method can reliably compute the true trustworthiness levels of the sources. We then apply it to a database of 2.8B facts extracted from the web, and thereby estimate the trustworthiness of 119M webpages. Manual evaluation of a subset of the results confirms the effectiveness of the method.”

Xin Luna Dong, Evgeniy Gabrilovich, Kevin Murphy, Van Dang, Wilko Horn, Camillo Lugaresi, Shaohua Sun, Wei Zhang [Google research team] – arXiv:1502.03519 [cs.DB]
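
To make the core idea concrete (a source with few false facts is considered trustworthy), here is a minimal Python sketch. It is not the authors' multi-layer probabilistic model: it assumes extraction is error-free and uses a much simpler single-layer loop that alternates between picking the most-supported value for each data item and rescoring each source by how often it agrees. The function name `estimate_trust`, the `prior` parameter, and the sample sources and facts are all hypothetical.

```python
from collections import defaultdict


def estimate_trust(claims, iterations=10, prior=0.8):
    """Toy single-layer trust estimate (illustrative only).

    claims: list of (source, data_item, value) triples.
    Returns (trust, believed): per-source trust in [0, 1] and the
    value currently believed for each data item.
    """
    sources = {source for source, _, _ in claims}
    trust = {source: prior for source in sources}
    believed = {}

    for _ in range(iterations):
        # Step 1: each source votes for its value, weighted by its trust.
        votes = defaultdict(lambda: defaultdict(float))
        for source, item, value in claims:
            votes[item][value] += trust[source]
        believed = {item: max(vals, key=vals.get) for item, vals in votes.items()}

        # Step 2: a source's trust is the share of its claims that
        # match the currently believed values.
        correct = defaultdict(int)
        total = defaultdict(int)
        for source, item, value in claims:
            total[source] += 1
            correct[source] += int(value == believed[item])
        trust = {source: correct[source] / total[source] for source in sources}

    return trust, believed


# Hypothetical sample data: three sources, two data items.
claims = [
    ("a.example", "paris.country", "France"),
    ("b.example", "paris.country", "France"),
    ("c.example", "paris.country", "France"),
    ("a.example", "everest.height_m", "8849"),
    ("b.example", "everest.height_m", "8849"),
    ("c.example", "everest.height_m", "9000"),
]
trust, believed = estimate_trust(claims)
```

On this sample, `a.example` and `b.example` end with a trust of 1.0 and `c.example` with 0.5, because one of its two claims disagrees with the majority. The approach described in the abstract goes further by jointly inferring whether a wrong fact comes from the web source itself or from the extraction process, which this sketch does not attempt.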
