Exploring Next
Exploring Next — Ep 375 w/ Justy & Cody — Hallucinations Undermine Trust; Metacognition is a Way Forward
Justy and Cody dig into a paper arguing that the real trust problem with language models is not merely being wrong, but being wrong with unwarranted confidence. They unpack the paper’s shift from answer-versus-abstain to ‘faithful uncertainty,’ where a model’s wording should reflect its actual internal uncertainty. Cody breaks down the discrimination-versus-calibration distinction and why that matters for both chatbots and tool-using agents. Justy pushes on what this means in production, where hedging can either build trust or feel slippery if it is not tied to real behavior.