Begin 30 Day Comment Period On Kaido Orav's fx2-cmix

James Bowery

Sep 3, 2024, 5:56:07 PM

to Hutter Prize

We're now in the 30-day comment and verification period for fx2-cmix, Kaido Orav's submission (sharing credit with Bryan Knoll), which has exceeded the 1% improvement award threshold.

Source code is published at:

https://github.com/kaitz/fx2-cmix

Improvement:

1.58585% = 100*(1-110793000/112578322)
% improvement = 100*(1-S/priorS)
110793128 = 441463 + 110351665 (rounded to 110793000 above)

S := length(cmix)+length(archive9)
S := length(comp9.exe/zip)+length(archive9.exe)
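
The arithmetic above can be re-checked with a few lines of Python. This is only a sketch for verification: the constant names are made up for illustration, and which of the two summands is the compressor and which is the archive is an assumption, since the post does not label them.

```python
def improvement_percent(size: int, prior_size: int) -> float:
    """Percent improvement, as defined in the post: 100*(1 - S/priorS)."""
    return 100.0 * (1.0 - size / prior_size)

# Figures taken from the post. The labels are assumptions: the post
# gives the two summands without saying which is which.
COMPRESSOR_BYTES = 441463       # assumed: length(cmix) / comp9
ARCHIVE_BYTES = 110351665       # assumed: length(archive9)
PRIOR_S = 112578322             # priorS, the previous record
ROUNDED_S = 110793000           # S as used in the percentage line

exact_s = COMPRESSOR_BYTES + ARCHIVE_BYTES
AWARD_THRESHOLD_PERCENT = 1.0   # the 1% threshold mentioned in the post

if __name__ == "__main__":
    print("exact S:", exact_s)
    print("improvement (rounded S): %.5f%%" % improvement_percent(ROUNDED_S, PRIOR_S))
    print("improvement (exact S):   %.5f%%" % improvement_percent(exact_s, PRIOR_S))
    print("above threshold:", improvement_percent(exact_s, PRIOR_S) > AWARD_THRESHOLD_PERCENT)
```

Either value of S, rounded or exact, clears the 1% threshold; the two differ only in the fourth decimal place of the percentage.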

Submission Description

This submission contains the following major modifications on top of the recent fx-cmix Hutter Prize winner:

NLP (Natural language processing)
online reverse dictionary transform
single-pass Wikipedia transform
updated order of articles

More detailed changes

cmix changes:

Mixer contexts are more similar to the fxcm mixer contexts.
Mixers skip the weight update when the error is below a threshold (improves speed).
Removed the weight regularizer from the mixer (improves speed).
Executable binary size is reduced due to "simpler" code.
Removed 7 indirect nonstationary predictors, 6 match model predictors, and 3 mixers. This reduces compression time and at the same time allows fxcm to be more complex and slower.
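
To illustrate the weight update skipping mentioned above, here is a minimal sketch of a logistic mixer that leaves its weights untouched when the prediction error is small. It is not the fx2-cmix code (which is C++); the class name, learning rate, and threshold value are hypothetical and chosen only for illustration.

```python
import math

def stretch(p: float) -> float:
    """Logit of a probability, clamped to avoid log(0)."""
    p = min(max(p, 1e-6), 1.0 - 1e-6)
    return math.log(p / (1.0 - p))

def squash(x: float) -> float:
    """Logistic function, clamped to avoid overflow in exp()."""
    x = min(max(x, -30.0), 30.0)
    return 1.0 / (1.0 + math.exp(-x))

class SkippingMixer:
    """Hypothetical logistic mixer with a skip threshold (assumed values)."""

    def __init__(self, n_inputs: int, learning_rate: float = 0.02,
                 skip_threshold: float = 0.01):
        self.weights = [0.0] * n_inputs
        self.learning_rate = learning_rate
        self.skip_threshold = skip_threshold
        self.inputs = [0.0] * n_inputs
        self.p = 0.5

    def predict(self, probs):
        """Mix the input probabilities in the stretched (logit) domain."""
        self.inputs = [stretch(p) for p in probs]
        dot = sum(w * x for w, x in zip(self.weights, self.inputs))
        self.p = squash(dot)
        return self.p

    def update(self, bit: int) -> bool:
        """Gradient step on coding cost; returns False if the step was skipped."""
        err = bit - self.p
        if abs(err) < self.skip_threshold:
            return False  # prediction was already good: save the work
        for i, x in enumerate(self.inputs):
            self.weights[i] += self.learning_rate * err * x
        return True
```

The saving comes from the early return: when the mixer is already confident and correct, the loop over all weights is not executed for that bit.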

fxcm changes:

Reverse dictionary transform. The dictionary is loaded as soon as it is found in the decompressed data. Text has its own buffer, separate from the coded byte stream buffer.
Natural language processing using a stemmer (from paq8px(d)).
The stemmer has new word types: Article, Conjunction, Adposition, ConjunctiveAdverb.
Some word (related) contexts are changed based on the type of the last word. Some words are removed from word streams depending on the last word type. This improves compression.
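
As one possible reading of the word-type idea above, the sketch below classifies words into the four new types, drops some of them from a word stream, and builds a context key from the last word type plus the last kept words. Everything here is hypothetical: the tiny lexicon, the choice to drop only articles, and the function names are illustrative assumptions and do not reproduce the fxcm stemmer.

```python
from enum import Enum

class WordType(Enum):
    OTHER = 0
    ARTICLE = 1
    CONJUNCTION = 2
    ADPOSITION = 3
    CONJUNCTIVE_ADVERB = 4

# Tiny illustrative lexicon (hypothetical; not the real stemmer's word lists).
LEXICON = {
    "the": WordType.ARTICLE, "a": WordType.ARTICLE, "an": WordType.ARTICLE,
    "and": WordType.CONJUNCTION, "or": WordType.CONJUNCTION,
    "but": WordType.CONJUNCTION,
    "of": WordType.ADPOSITION, "in": WordType.ADPOSITION,
    "on": WordType.ADPOSITION, "to": WordType.ADPOSITION,
    "however": WordType.CONJUNCTIVE_ADVERB,
    "therefore": WordType.CONJUNCTIVE_ADVERB,
}

# Assumption for this sketch: only articles are removed from the stream.
DROPPED_TYPES = frozenset({WordType.ARTICLE})

def classify(word: str) -> WordType:
    """Look up the word type; unknown words are OTHER."""
    return LEXICON.get(word.lower(), WordType.OTHER)

def word_stream(words, dropped=DROPPED_TYPES):
    """Keep only the words whose type is not in `dropped`."""
    return [w for w in words if classify(w) not in dropped]

def word_context(words, order: int = 2):
    """Context key: type of the last word seen plus the last `order` kept words."""
    last_type = classify(words[-1]) if words else WordType.OTHER
    kept = word_stream(words)[-order:]
    return (last_type, tuple(w.lower() for w in kept))

if __name__ == "__main__":
    history = ["history", "of", "the"]
    print(word_context(history))
```

In a real context-mixing model the key would be hashed and used to select a predictor; the point of the sketch is only that the key changes with the last word type while the filtered stream skips low-information words.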
