🤗 Daily Paper (2025-07-29)

deep.di...@gmail.com

Jul 29, 2025, 4:07:12 PM

to hf-daily-pap...@googlegroups.com

🤗 Daily Paper Newsletter

Hope you found some gems!

This newsletter delivers a curated list of papers from 🤗 Daily Papers.

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty

Published at 2025-07-22

#ML

This study presents RLCR, a method for training language models that not only enhances their accuracy but also ensures their confidence estimates are reliable. RLCR improves calibration without sacrificing accuracy, outperforming traditional RL training and post-hoc confidence scoring, and can further enhance performance via confidence-weighted scaling methods....

Met^2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

Published at 2025-07-23

#ML

The paper presents a new method called Met^2Net that improves the accuracy of weather prediction for complex systems by using two separate stages of training for different variables, and a self-attention mechanism to better understand the relationships between variables, resulting in significant reductions in prediction errors....

ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

Published at 2025-07-25

#ML

The study presents a new framework called ScenePainter to create consistent and immersive 3D view sequences for long-term video synthesis and 3D scene reconstruction. This framework addresses the problem of semantic drift in generated scenes by aligning the outpainter's scene-specific prior with the current scene comprehension, using a hierarchical graph structure called SceneConceptGraph to construct relations among multi-level scene concepts....

UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities

Published at 2025-07-25

#ML

The authors propose a new method called UloRL to improve the reasoning abilities of large language models, particularly when generating ultra-long outputs. They do this by dividing the output into shorter segments and using dynamic masking to prevent issues during training, resulting in faster training and better performance on various tasks compared to a larger model....

Agentic Reinforced Policy Optimization

Published at 2025-07-26

#ML

The authors present a new algorithm called Agentic Reinforced Policy Optimization (ARPO) that improves the performance of large language models (LLMs) in multi-turn tool interactions by balancing their long-term reasoning abilities and tool usage proficiency. ARPO's entropy-based adaptive rollout mechanism promotes exploration in uncertain steps after tool usage, and it outperforms existing trajectory-level RL algorithms across various benchmarks while requiring only half the tool-use budget....

ForCenNet: Foreground-Centric Network for Document Image Rectification

Published at 2025-07-26

#ML

The study presents a new method called Foreground-Centric Network (ForCenNet) to correct geometric distortions in document images, which outperforms existing techniques by focusing on foreground elements and using a curvature consistency loss to better understand distorted geometric distributions....

Region-based Cluster Discrimination for Visual Representation Learning

Published at 2025-07-26

#ML

The authors present a new method called RICE that improves region-based visual understanding and OCR capabilities, specifically addressing the limitations of current vision-language contrastive models in dense prediction tasks. RICE uses a large-scale candidate region dataset, a Region Transformer layer, and a unified region cluster discrimination loss to enhance object and OCR learning. It outperforms previous methods on various tasks and is publicly available....

Diversity-Enhanced Reasoning for Subjective Questions

Published at 2025-07-27

#ML

This study presents a new method called MultiRole-R1 that improves the performance of large reasoning models on subjective questions by incorporating multiple perspectives. By using unsupervised data construction and reinforcement learning with diversity as a reward signal, MultiRole-R1 enhances both the accuracy and diversity of subjective reasoning tasks, demonstrating its effectiveness on six benchmarks....

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Published at 2025-07-28

#ML

The study presents a comprehensive review of self-evolving agents, focusing on their ability to adapt and evolve in real-time, contrasting them with the static nature of Large Language Models. The survey outlines a structured framework for understanding and designing self-evolving agents, discussing evolutionary mechanisms, adaptation methods, and evaluation metrics, while also highlighting applications and challenges in the pursuit of Artificial Super Intelligence....

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

Published at 2025-07-28

#ML

The research presents a new multimodal model, ARC-Hunyuan-Video, which can effectively understand real-world short videos by processing their visual, audio, and textual signals. This model excels in tasks like video captioning, summarization, and question answering, and has shown significant improvements in user engagement and satisfaction when deployed in real-world applications....

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Published at 2025-07-28

#ML

The researchers created a publicly accessible, large-scale image-editing dataset called GPT-IMAGE-EDIT-1.5M, which contains over 1.5 million high-quality image editing examples. They used GPT-4o to improve three popular image-editing datasets and fine-tuned open-source models on their new dataset, resulting in better image editing performance that outperforms previous open-source methods and comes close to proprietary models....

GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis

Published at 2025-07-28

#ML

GenoMAS is a new framework that uses a team of LLM-based agents to analyze gene expression data. This system integrates structured workflows with autonomous agents to handle complex and large datasets, outperforming previous methods and providing biologically plausible results....

Geometric-Mean Policy Optimization

Published at 2025-07-28

#ML

This study presents Geometric-Mean Policy Optimization (GMPO), a more stable alternative to a recent method called Group Relative Policy Optimization (GRPO) that enhances the reasoning abilities of large language models. GMPO improves stability by optimizing the geometric mean of token-level rewards instead of the arithmetic mean, and it outperforms GRPO on various benchmarks, including mathematical and multimodal reasoning tasks....
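To see why a geometric mean can be steadier than an arithmetic mean, here is a minimal sketch in plain Python. It is not the authors' implementation: the `token_values` list is made-up illustrative data standing in for positive per-token quantities, and the helper names are hypothetical. It only shows that a single outlier token moves the geometric mean much less than the arithmetic mean.

```python
import math


def arithmetic_mean(values):
    """Plain average: one large value can dominate the result."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed in log space.

    Requires strictly positive inputs. Working with logs avoids
    overflow or underflow when the sequence is long.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-token values (assumption, not data from the paper):
# three ordinary tokens and one outlier.
token_values = [1.0, 1.0, 1.0, 8.0]

am = arithmetic_mean(token_values)
gm = geometric_mean(token_values)

print(f"arithmetic mean: {am:.4f}")
print(f"geometric mean:  {gm:.4f}")
```

With these numbers the outlier pulls the arithmetic mean well above 1, while the geometric mean stays much closer to the typical token value.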

JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

Published at 2025-07-28

#ML

The JAM model is a flow-based song generator that provides fine-grained control over word timing and duration, and improves song quality through aesthetic alignment without manual data annotations. The model is evaluated using the public JAME dataset, and it outperforms existing lyrics-to-song models in music-specific attributes....

Music Arena: Live Evaluation for Text-to-Music

Published at 2025-07-28

#ML

Music Arena is a new platform for live evaluation of text-to-music models, allowing users to input prompts and compare system outputs. It offers a standardized, transparent, and music-specific approach to gathering human preferences, filling a gap in the current TTM evaluation methods....

Reconstructing 4D Spatial Intelligence: A Survey

Published at 2025-07-28

#ML

This study proposes a new method to categorize techniques for reconstructing 4D spatial intelligence, or understanding moving 3D scenes, into five levels ranging from basic 3D attributes to incorporating physical laws. The authors also discuss challenges and potential advancements in each level, and provide a resource page for tracking developments in the field....

Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning

Published at 2025-07-28

#ML

The authors propose Rep-MTL, a Multi-Task Learning approach that uses representation-level task saliency to improve interactions between task-specific optimization and shared representation learning. By balancing task-specific learning and cross-task sharing, Rep-MTL mitigates negative transfer and promotes complementary information sharing, resulting in competitive performance gains on various benchmarks....

SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers

Published at 2025-07-28

#ML

The authors have developed a pipeline called SAND-Math to generate high-quality, complex mathematical problems and solutions, which significantly improves the performance of mathematical reasoning LLMs. By increasing the difficulty of the problems, they achieved a 17.85-point absolute boost in performance on the AIME25 benchmark compared to the next-best synthetic dataset....

SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment

Published at 2025-07-28

#ML

The study presents SmallThinker, a new family of large language models optimized for local device deployment, addressing issues like limited memory and slow storage. The authors introduce innovative techniques such as sparse structures and a pre-attention router to reduce computational demands and storage latency, enabling the models to outperform larger LLMs on consumer CPUs....

Tags are generated by Google's Gemini Pro API, and the summary and translation are generated by Upstage's SOLAR mini chat model derived from SOLAR-10.7B open LLM.

(Experimental) The full paper is translated into Korean with the enko-t5-small-v0 model developed by Kim Kihyun.

Visit Developer's Social Media