🤗 Daily Paper(2025-09-02)

deep.di...@gmail.com

Sep 2, 2025, 4:06:33 PM

to hf-daily-pap...@googlegroups.com

🤗 Daily Paper Newsletter

Hope you found some gems!

This newsletter delivers you the curated list of papers by 🤗 Daily Papers.

project page

🤗 daily paper

From reactive to cognitive: brain-inspired spatial intelligence for embodied agents

Published at 2025-08-23

#ML

This study presents a new framework called BSC-Nav that helps embodied agents build and use structured spatial memory, enabling them to navigate complex environments more effectively. By creating cognitive maps and retrieving spatial knowledge aligned with goals, BSC-Nav, when combined with powerful language models, outperforms existing methods, generalizes better, and supports versatile behaviors in the real world....

UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat

Published at 2025-08-24

#ML

This study evaluates an Arabic-focused language model, ALLaM-34B, by asking it to perform various tasks in modern standard Arabic, five regional dialects, and code-switching. The model was tested on its factual knowledge, reasoning skills, creativity, and safety, and it performed very well in most areas, making it a strong choice for real-world applications in Arabic....

No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes

Published at 2025-08-26

#ML

The authors present SuperSimpleNet, a new model for detecting surface defects in manufactured components that can work in various supervision scenarios, including unsupervised, weakly supervised, mixed supervision, and fully supervised settings. SuperSimpleNet is efficient, adaptable, and fast, setting a new standard for performance across all scenarios and making it suitable for real-world manufacturing challenges....

Democracy-in-Silico: Institutional Design as Alignment in AI-Governed Polities

Published at 2025-08-27

#ML

The study proposes a simulation where AI agents with complex personalities govern themselves under various rules and discovers that specific institutional designs can reduce corruption, improve policy stability, and enhance welfare, suggesting a potential framework for aligning future AI societies....

T2R-bench: A Benchmark for Generating Article-Level Reports from Real World Industrial Tables

Published at 2025-08-27

#ML

The authors present T2R-bench, a new benchmark for creating detailed reports from complex industrial tables, which is a common challenge in real-world applications. They collect 457 tables from various industries and propose a way to evaluate the quality of generated reports. Experiments show that even the best language models struggle with this task, suggesting opportunities for further research and improvement....

How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench

Published at 2025-08-28

#ML

The study analyzes and addresses the challenges faced by large language models in multi-turn conversational environments, proposing the IRMA framework to automatically reformulate user queries for improved tool-calling agent performance, resulting in significant improvements over existing methods....

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Published at 2025-08-28

#ML

The authors propose PVPO, a reinforcement learning method that improves efficiency by using a reference model to pre-sample data and calculate reward scores as a reference anchor, reducing bias and computational cost. Experiments show that PVPO outperforms existing methods across various tasks and scales....

Tags are generated by Google's Gemini Pro API, and the summary and translation are generated by Upstage's SOLAR mini chat model, derived from the SOLAR-10.7B open LLM.

(Experimental) The full paper is translated into Korean with the enko-t5-small-v0 model developed by Kim Kihyun.

Visit Developer's Social Media
