Microsoft Claims Its New Tool Can Correct AI Hallucinations

by

An anonymous reader quotes a report from TechCrunch: Microsoft today revealed Correction, a service that attempts to automatically revise AI-generated text that's factually wrong. Correction first flags text that may be erroneous -- say, a summary of a company's quarterly earnings call that possibly has misattributed quotes -- then fact-checks it by comparing the text with a source of truth (e.g. uploaded transcripts). Correction, available as part of Microsoft's Azure AI Content Safety API (in preview for now), can be used with any text-generating AI model, including Meta's Llama and OpenAI's GPT-4o.

"Correction is powered by a new process of utilizing small language models and large language models to align outputs with grounding documents," a Microsoft spokesperson told TechCrunch. "We hope this new feature supports builders and users of generative AI in fields such as medicine, where application developers determine the accuracy of responses to be of significant importance."
Experts caution that this tool doesn't address the root cause of hallucinations. "Microsoft's solution is a pair of cross-referencing, copy-editor-esque meta models designed to highlight and rewrite hallucinations," reports TechCrunch. "A classifier model looks for possibly incorrect, fabricated, or irrelevant snippets of AI-generated text (hallucinations). If it detects hallucinations, the classifier ropes in a second model, a language model, that tries to correct for the hallucinations in accordance with specified 'grounding documents.'"

Os Keyes, a PhD candidate at the University of Washington who studies the ethical impact of emerging tech, has doubts about this. "It might reduce some problems," they said, "But it's also going to generate new ones. After all, Correction's hallucination detection library is also presumably capable of hallucinating." Mike Cook, a research fellow at Queen Mary University specializing in AI, added that the tool threatens to compound the trust and explainability issues around AI. "Microsoft, like OpenAI and Google, have created this issue where models are being relied upon in scenarios where they are frequently wrong," he said. "What Microsoft is doing now is repeating the mistake at a higher level. Let's say this takes us from 90% safety to 99% safety -- the issue was never really in that 9%. It's always going to be in the 1% of mistakes we're not yet detecting."