Scaling Laws for LLM Based Data Compression

ram

Sep 29, 2025, 4:34:27 PM

to Hutter Prize

I was experimenting with finding compression-related scaling laws for LLMs,
and wrote up my observations here: https://fullwrong.com/2025/07/23/scaling-compression/

I would like to know whether there are any flaws in the methodology, and whether this is a problem worth exploring.

Matt Mahoney

Sep 30, 2025, 1:37:39 PM

to Hutter Prize

Interesting. What I noticed was that the more parameters you have, the longer it takes before further training has no effect. That is consistent with the general rule that the information capacity of the parameters (on the order of 1 bit per parameter) should match the compressed size of the training data.
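A back-of-the-envelope sketch of that capacity rule may help make it concrete. It assumes exactly 1 bit of capacity per parameter and a hypothetical compression rate of 1 bit per byte of text. Both numbers, and the function name, are illustrative assumptions rather than measurements from the experiments discussed here.

```python
def saturation_text_bytes(num_params, bits_per_param=1.0, compressed_bits_per_byte=1.0):
    """Estimate the raw text size (in bytes) whose compressed size
    equals the information capacity of the parameters.

    bits_per_param: assumed capacity per parameter (order of 1 bit).
    compressed_bits_per_byte: assumed compression rate of the text
        (hypothetical; depends on the data and the model).
    """
    capacity_bits = num_params * bits_per_param
    return capacity_bits / compressed_bits_per_byte


# Larger models need proportionally more text before capacity is matched.
for n in (1e6, 1e8, 1e10):
    print(f"{n:.0e} params -> ~{saturation_text_bytes(n):.1e} bytes of text")
```

Under these assumptions the estimate grows linearly with parameter count, which is in line with larger models taking longer before further training stops helping.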

If you want to turn this into a research paper, I would suggest placing the source code and documentation online so that others can reproduce the results.

ram

Sep 30, 2025, 4:30:42 PM

to hutter...@googlegroups.com

Thanks for connecting this to the information capacity principle - I hadn't explicitly connected my observations to the 1-bit-per-parameter capacity rule before.

I've started writing a paper on this. I have limited research experience, so I'm not yet confident enough to post a preprint on arXiv, but I've prepared a draft here. The codebase is available here.

I'm still exploring what additional experiments would strengthen the work. I'm also interested in understanding the theoretical implications of compression scaling laws. Dr. Hutter previously pointed me to https://arxiv.org/abs/2102.04074, which I'm currently working through.

If you have any suggestions on experiments or theoretical directions worth exploring, I'd greatly appreciate the guidance.

paper.pdf

ram

Oct 13, 2025, 6:39:50 AM

to Hutter Prize

Hi Matt,

Thanks again for your earlier reply; the 1-bit-per-parameter capacity link was very helpful. I’d like to read and cite the relevant work you mentioned about parameter information capacity and the point where additional training stops helping. Could you point me to any papers or posts that formalize this idea?

Best,
Ram
