🤗 Daily Paper(2025-07-28)

deep.di...@gmail.com

Jul 28, 2025, 4:07:07 PM

to hf-daily-pap...@googlegroups.com

🤗 Daily Paper Newsletter

Hope you found some gems!

This newsletter delivers a curated list of papers from 🤗 Daily Papers.

project page

🤗 daily paper

Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI

Published at 2025-07-14

#ML

This study introduces real-time video communication with an AI, which resembles chatting face-to-face with a person. The researchers propose a framework to improve the speed and quality of this kind of communication, and they create a benchmark to test how accurately AI understands video streams....

Deep Researcher with Test-Time Diffusion

Published at 2025-07-21

#ML

The proposed Test-Time Diffusion Deep Researcher framework improves the generation of complex research reports by treating it as a diffusion process, starting with a preliminary draft that is iteratively refined with external information, resulting in superior performance on various benchmarks....

PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving

Published at 2025-07-23

#ML

The authors present a new system called PRIX for self-driving cars that uses only camera data and a special module called CaRT to predict safe paths, outperforming larger systems that use LiDAR sensors while being faster and using less space....

CLEAR: Error Analysis via LLM-as-a-Judge Made Easy

Published at 2025-07-24

#ML

CLEAR is a new, user-friendly tool that helps analyze the errors made by large language models. It provides detailed feedback on individual instances, identifies common error patterns, and offers an interactive dashboard for easy exploration and understanding of a model's performance....

Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement

Published at 2025-07-24

#ML

The authors present a new method called Specification Self-Correction (SSC) that helps language models avoid exploiting flaws in their instructions to produce incorrect or misleading responses. SSC allows the model to identify and fix these issues during the response generation process, leading to more accurate and reliable outputs without any changes to the model's weights....

The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

Published at 2025-07-24

#ML

This study reveals that GPTQ, a common method for compressing large language models, is equivalent to Babai's nearest plane algorithm for solving a classical problem called the closest vector problem. This discovery provides a stronger theoretical foundation for GPTQ and could lead to improved quantization algorithms for even larger models....
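
For readers unfamiliar with the algorithm named above, here is a minimal textbook sketch of Babai's nearest plane method for the closest vector problem. It is not the paper's code and not a GPTQ implementation; the function names (`dot`, `gram_schmidt`, `babai_nearest_plane`) and the toy basis are assumptions chosen for illustration only.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Gram-Schmidt orthogonalization of the basis rows (no normalization)."""
    ortho = []
    for b in basis:
        v = list(b)
        for o in ortho:
            mu = dot(b, o) / dot(o, o)
            v = [vi - mu * oi for vi, oi in zip(v, o)]
        ortho.append(v)
    return ortho


def babai_nearest_plane(basis, target):
    """Return (integer coefficients, lattice point) approximating the
    lattice vector closest to `target`. Rows of `basis` must be
    linearly independent."""
    ortho = gram_schmidt(basis)
    residual = list(target)
    coeffs = [0] * len(basis)
    # Walk from the last basis vector to the first, each time rounding
    # to the nearest translate of the hyperplane spanned by the earlier ones.
    for i in reversed(range(len(basis))):
        # Python's round() breaks exact .5 ties toward the even integer.
        c = round(dot(residual, ortho[i]) / dot(ortho[i], ortho[i]))
        coeffs[i] = c
        residual = [r - c * b for r, b in zip(residual, basis[i])]
    point = [t - r for t, r in zip(target, residual)]
    return coeffs, point


# Toy example (hypothetical values): lattice spanned by (2, 0) and (1, 2).
coeffs, point = babai_nearest_plane([[2, 0], [1, 2]], [2.6, 1.9])
```

In this toy case the chosen coefficients are 1 and 1, so the returned point is approximately (3, 2), up to floating-point rounding. The quality of the approximation depends on how well reduced the basis is.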

MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents

Published at 2025-07-25

#ML

The study presents a new benchmark called MMBench-GUI for testing GUI automation agents on multiple platforms, focusing on skills like content understanding, task automation, and collaboration. The authors also introduce a new metric, EQA, to measure efficiency, and they find that efficient GUI automation requires precise localization, effective planning, and early stopping strategies....

Tags are generated by Google's Gemini Pro API, and the summary and translation are generated by Upstage's SOLAR mini chat model derived from SOLAR-10.7B open LLM.

(Experimental) The full paper is translated into Korean with the enko-t5-small-v0 model developed by Kim Kihyun.

Visit Developer's Social Media
