Daily digest — 2026-06-21

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

The paper introduces the Heuristic Override Benchmark (HOB), 500 instances spanning several heuristic and constraint families, to evaluate large language models (LLMs) on reasoning tasks. An analysis of six models, including Gemini 3.1 Pro, finds that surface cues can significantly override implicit constraints, with no model exceeding 75% under strict evaluation. A minimal hint can improve performance by 15 percentage points, which points to a failure of constraint inference. For practitioners, the takeaway is that heuristic biases in LLMs need to be understood and mitigated, and that explicit goal decomposition and internal deliberation can improve reasoning.

arXiv cs.AI — 12 d ago · Research

Trust and Reliance on AI in Education: AI Literacy and Need for Cognition as Moderators

The study investigates the relationship between student trust in AI and their reliance on AI-generated suggestions during programming tasks, using a sample of 432 undergraduates. Findings reveal a non-linear relationship where increased trust correlates with decreased appropriate reliance on AI, moderated by AI literacy and need for cognition. This highlights the necessity for educational frameworks that foster critical evaluation of AI outputs, which is essential for practitioners developing AI tools in educational contexts.

arXiv cs.AI — 12 d ago · found 10 d ago · Safety

MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction

MedFeat is a feature engineering framework for clinical tabular prediction that uses model-awareness and feature importance signals to guide LLM-driven feature discovery. It reports a statistically significant average improvement of over 10% compared with state-of-the-art baselines across various clinical tasks, and addresses challenges such as class imbalance and interpretability in healthcare data. For practitioners, this allows more targeted feature transformations, which may improve model performance in clinical applications.

arXiv cs.AI — 12 d ago · Training

The day in AI, distilled.
