Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Category Archives: Microsoft

Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft

Wired – “The project’s leader says that allowing everyone to access the collection of public-domain books will help ‘level the playing field’ in the AI industry. Harvard University announced Thursday it’s releasing a high-quality dataset of nearly 1 million public-domain books that could be used by anyone to train large language models and other AI tools. The dataset was created by Harvard’s newly formed Institutional Data Initiative with funding from both Microsoft and OpenAI. It contains books scanned as part of the Google Books project that are no longer protected by copyright. Around five times the size of the notorious Books3 dataset that was used to train AI models like Meta’s Llama, the Institutional Data Initiative’s database spans genres, decades, and languages, with classics from Shakespeare, Charles Dickens, and Dante included alongside obscure Czech math textbooks and Welsh pocket dictionaries. Greg Leppert, executive director of the Institutional Data Initiative, says the project is an attempt to ‘level the playing field’ by giving the general public, including small players in the AI industry and individual researchers, access to the sort of highly-refined and curated content repositories that normally only established tech giants have the resources to assemble. ‘It’s gone through rigorous review,’ he says… However the IDI’s dataset is released, it will be joining a host of similar projects, startups, and initiatives that promise to give companies access to substantial and high-quality AI training materials without the risk of running into copyright issues. Firms like Calliope Networks and ProRata have emerged to issue licenses and manage compensation schemes designed to get creators and rights holders paid for providing AI training data…”

US officials urge Americans to use encrypted apps amid unprecedented cyberattack

The Cybersecurity and Infrastructure Security Agency (CISA), National Security Agency (NSA), Federal Bureau of Investigation (FBI) and international partners published today a joint guide, Enhanced Visibility and Hardening Guidance for Communications Infrastructure, that provides best practices to protect against a People’s Republic of China (PRC)-affiliated threat actor that has compromised networks of major global telecommunications…

Pete Recommends – Weekly highlights on cyber security issues, November 23, 2024

Via LLRX – Pete Recommends – Weekly highlights on cyber security issues, November 23, 2024 – Privacy and cybersecurity issues impact every aspect of our lives – home, work, travel, education, finance, health and medical records – to name but a few. On a weekly basis Pete Weiss highlights articles and information that focus on…

LinkedIn launches its first AI agent to take on the role of job recruiters

TechCrunch: “LinkedIn, the social platform used by professionals to connect with others in their field, hunt for jobs, and develop skills, is taking the wraps off its latest effort to build artificial intelligence tools for users. Hiring Assistant is a new product designed to take on a wide array of recruitment tasks, from ingesting scrappy…

Pete Recommends – Weekly highlights on cyber security issues, October 26, 2024

Pete Recommends – Weekly highlights on cyber security issues, October 26, 2024 – Privacy and cybersecurity issues impact every aspect of our lives – home, work, travel, education, finance, health and medical records – to name but a few. On a weekly basis Pete Weiss highlights articles and information that focus on the increasingly complex and…

Microsoft brings AI-powered overviews to Bing

TechCrunch: Microsoft has launched its answer to Google’s AI-powered search experiences: Bing generative search. On the heels of a pilot in July, Bing generative search — albeit still under development — began rolling out to all U.S. users this morning. The easiest way to invoke it is by searching “Bing generative search” on Bing; Microsoft also…

How I Use Microsoft Word to Instantly Check Documents for Plagiarism

How-To Geek: “Microsoft Word isn’t just for typing documents; it has a built-in feature called the Similarity Checker that checks your document for plagiarism right from your word editor. This tool not only highlights potential plagiarism but also guides you in citing sources correctly. [This is a frontline use case, and other applications are…

Slack, Teams, Google Chat: Is There Any Safe Place to Complain About Work Online?

WSJ via MSN: “Workers are getting too comfortable venting on their employers’ chat apps. We tend to forget that nothing we say there is private. Disney last week said it was quitting Slack, after a hacker gained access to an executive’s account and leaked millions of intraoffice messages. They included computer code, details about unreleased…

LinkedIn Is Training AI on User Data Before Updating Its Terms of Service

TechCrunch: “LinkedIn may have trained AI models on user data without updating its terms. LinkedIn users in the U.S. — but not the EU, EEA, or Switzerland, likely due to those regions’ data privacy rules — have an opt-out toggle in their settings screen disclosing that LinkedIn scrapes personal data to train “content creation AI…

Pete Recommends – Weekly highlights on cyber security issues, August 31, 2024

Via LLRX – Pete Recommends – Weekly highlights on cyber security issues, August 31, 2024 – Privacy and cybersecurity issues impact every aspect of our lives – home, work, travel, education, finance, health and medical records – to name but a few. On a weekly basis Pete Weiss highlights articles and information that focus on…