Thanks for connecting this to the information capacity principle. I hadn't explicitly linked my observations to the 1-bit-per-parameter capacity rule before.
I've started writing a paper on this. I have limited research experience, so I'm not yet confident enough to post a preprint on arXiv, but I've prepared a draft here. The codebase is available here.
I'm still exploring what additional experiments would strengthen the work. I'm also interested in understanding the theoretical implications of compression scaling laws. Dr. Hutter previously pointed me to https://arxiv.org/abs/2102.04074, which I'm currently working through.
If you have any suggestions on experiments or theoretical directions worth exploring, I'd greatly appreciate the guidance.