Daily digest — 2026-06-18

ERAlign: Energy-based Representation Alignment of GNNs and LLMs on Text-attributed Graphs

The paper introduces ERAlign, an Energy-based Representation Alignment framework for integrating Graph Neural Networks (GNNs) and Large Language Models (LLMs) on Text-attributed Graphs (TAGs). It projects GNN-encoded structure and LLM-derived text embeddings into a shared latent space and optimizes their alignment with an Energy-based Model objective. The authors report state-of-the-art performance across eight TAG datasets at varying supervision levels. The approach targets representation drift and aims to improve generalization, which makes it relevant to practitioners who combine graph and text data.

arXiv cs.AI · 54 d ago · Research
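
As a rough illustration of the alignment idea in the ERAlign summary above, the sketch below projects graph and text embeddings into a shared space and scores pairs with an energy function. It is a minimal sketch, not the paper's objective: the negative-cosine energy, the contrastive form of the loss, the temperature, and all names (`project`, `energy`, `alignment_loss`) are assumptions chosen for illustration.

```python
import math


def project(vec, weights):
    """Linear projection into the shared space; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def energy(u, v):
    """Assumed energy: negative cosine similarity (lower means better aligned).

    Both vectors are assumed to be non-zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return -dot / (norm_u * norm_v)


def alignment_loss(graph_z, text_z, temperature=0.5):
    """Mean contrastive loss: node i's own text should have the lowest energy."""
    total = 0.0
    for i, g in enumerate(graph_z):
        logits = [-energy(g, t) / temperature for t in text_z]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_norm - logits[i]
    return total / len(graph_z)


# Toy data: two nodes, 2-d embeddings, identity projections (all hypothetical).
identity = [[1.0, 0.0], [0.0, 1.0]]
graph_z = [project(v, identity) for v in [[1.0, 0.0], [0.0, 1.0]]]
text_z = [project(v, identity) for v in [[2.0, 0.0], [0.0, 3.0]]]

aligned = alignment_loss(graph_z, text_z)
mismatched = alignment_loss(graph_z, list(reversed(text_z)))
print(aligned, mismatched)
```

In this toy setup the matched pairs give a lower loss than the mismatched ones, which is the behaviour an alignment objective of this kind is meant to reward.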

On the Condition Number Dependency in Bilevel Optimization

The paper presents new lower bounds on the oracle complexity of finding $\epsilon$-stationary points in bilevel optimization when the upper-level problem is nonconvex and the lower-level problem is strongly convex. It establishes a lower bound of $\Omega(\kappa_y^{5/2} \epsilon^{-2})$, which exposes a gap in condition number dependency between bilevel and minimax problems, and extends the results to other settings, including high-order smooth functions and stochastic oracles. For practitioners, the bounds indicate how much of the cost of an ill-conditioned lower-level problem is unavoidable, which can guide algorithm design.

arXiv cs.AI · 54 d ago · Training
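
To make the scaling in the bilevel bound above concrete, the sketch below evaluates only the growth term $\kappa_y^{5/2} \epsilon^{-2}$ and compares it across settings. The function name `lower_bound_scale` and the sample values are hypothetical, and the constant hidden by $\Omega(\cdot)$ is dropped, so only ratios between results are meaningful.

```python
def lower_bound_scale(kappa_y: float, epsilon: float) -> float:
    """Growth term kappa_y^(5/2) * epsilon^(-2); the Omega constant is omitted."""
    if kappa_y < 1 or epsilon <= 0:
        raise ValueError("expected kappa_y >= 1 and epsilon > 0")
    return kappa_y ** 2.5 * epsilon ** -2


# Hypothetical baseline setting.
base = lower_bound_scale(10.0, 1e-2)

# Effect of doubling the condition number, and of halving the target accuracy.
kappa_ratio = lower_bound_scale(20.0, 1e-2) / base
eps_ratio = lower_bound_scale(10.0, 5e-3) / base
print(kappa_ratio, eps_ratio)
```

Doubling the condition number multiplies the term by $2^{5/2} \approx 5.66$, while halving $\epsilon$ multiplies it by 4.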

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

TruthRL is a reinforcement learning framework that improves the truthfulness of large language models (LLMs) by optimizing for both accurate responses and appropriate abstention when the model is uncertain. Implemented with Group Relative Policy Optimization (GRPO), it uses a ternary reward that distinguishes correct answers, hallucinations, and abstentions. Across four knowledge-intensive benchmarks, hallucinations fall from 43.5% to 19.4% and truthfulness rises from 5.3% to 37.2%. For practitioners, the approach handles accuracy and uncertainty together, which supports more reliable deployment.

arXiv cs.AI · 54 d ago · Safety
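
The ternary reward and group-relative update described in the TruthRL summary above can be sketched as follows. This is a minimal sketch under stated assumptions: the reward values (+1, 0, -1), the exact-match grading, the abstention phrase, and the names `ternary_reward` and `group_relative_advantages` are illustrative and not taken from the paper. The advantage step follows the usual GRPO pattern of normalizing each reward against the other samples for the same prompt.

```python
import statistics


def ternary_reward(answer: str, gold: str, abstain_phrase: str = "i don't know") -> float:
    """Assumed scheme: +1 correct, 0 abstention, -1 hallucination (exact match)."""
    text = answer.strip().lower()
    if text == abstain_phrase:
        return 0.0
    return 1.0 if text == gold.strip().lower() else -1.0


def group_relative_advantages(rewards, eps: float = 1e-6):
    """Normalize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps keeps the division defined when every sample gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Three hypothetical samples for one prompt whose gold answer is "Paris".
samples = ["Paris", "I don't know", "Lyon"]
rewards = [ternary_reward(s, "Paris") for s in samples]
advantages = group_relative_advantages(rewards)
print(rewards, advantages)
```

Under these assumed values, abstaining ranks between a correct answer and a hallucination, so the policy is pushed away from guessing without being rewarded for refusing questions it can answer.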
