Hi Aram,
Thank you for your question.
* What you are after is generally known under the names
'Prediction with Expert Advice' or 'Online Learning',
which makes no assumptions about the observation sequence but
restricts the class of predictors,
e.g. to the (countable) class of computable predictors.
* I briefly mention this in Sec. 3.7.4 of my book and expand on it
in http://www.hutter1.net/ai/bayespea.htm
* Under log-loss you can continue to use Bayesian mixture
predictors,
but for other losses you have to combine the predictors (experts)
somewhat differently to achieve good bounds.
See e.g. http://arxiv.org/abs/cs.AI/0504078
and references therein.
* This can be extended to acting agents, but is ugly: http://dx.doi.org/10.1007/11564089_28
* And here's an experimental comparison: http://arxiv.org/abs/cs.LG/0508073
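To make the expert-combination point above concrete: in the simplest case the combination is exponential weighting of the experts. Here is a minimal Python sketch of the exponentially weighted average forecaster under the absolute loss; the expert pool, data sequence, and tuning below are made up purely for illustration (see the cited survey for the general bounds):

```python
import math

def exp_weighted_forecast(expert_preds, outcomes, eta):
    """Exponentially weighted average forecaster over a finite expert pool.

    expert_preds: constant predictions in [0, 1], one per expert
    outcomes:     observed binary outcomes in {0, 1}
    eta:          learning rate
    Uses the absolute loss |p - y|, which is convex in p and bounded in [0, 1].
    Returns (forecaster's cumulative loss, best expert's cumulative loss).
    """
    weights = [1.0] * len(expert_preds)
    expert_loss = [0.0] * len(expert_preds)
    alg_loss = 0.0
    for y in outcomes:
        total = sum(weights)
        # Predict with the weight-averaged expert prediction (a convex combination).
        p = sum(w * e for w, e in zip(weights, expert_preds)) / total
        alg_loss += abs(p - y)
        # Multiplicative update: each expert's weight shrinks exponentially
        # in the loss it just incurred.
        for i, e in enumerate(expert_preds):
            loss = abs(e - y)
            expert_loss[i] += loss
            weights[i] *= math.exp(-eta * loss)
    return alg_loss, min(expert_loss)

# Toy run: 70% ones, three constant experts, eta tuned as sqrt(8 ln N / T).
T, experts = 1000, [0.0, 0.5, 1.0]
outcomes = [1 if t % 10 < 7 else 0 for t in range(T)]
eta = math.sqrt(8 * math.log(len(experts)) / T)
alg, best = exp_weighted_forecast(experts, outcomes, eta)
regret = alg - best  # guaranteed <= sqrt(T ln N / 2) for convex [0,1]-losses
```

Note that under log-loss with eta = 1 the same multiplicative update turns the weights into Bayesian posterior weights, which is why the Bayesian mixture suffices in that special case.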
Marcus
More information: http://www.hutter1.net/ait.htm
---
You received this message because you are subscribed to the Google Groups "Algorithmic Information Theory" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ait0+uns...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ait0/23ca95a3-537e-47a0-8c43-85153cc007den%40googlegroups.com.
If you make the action space (e.g. {0,1}) convex (e.g. [0,1]),
you also have to specify the loss function for this expanded
action space.
If you do this in the natural linear way, you can avoid
randomization,
but for a non-convex action space or a non-convex loss function,
randomization is unavoidable.
I have not been able to find a good reference for this.
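A tiny Python illustration of the linear extension (the 0-1 base loss and the function names are illustrative choices, not from the thread): playing the mixed action p in [0,1] deterministically under the linearly extended loss incurs exactly the expected loss of randomizing over {0,1} with bias p, which is why randomization can be dropped in this case.

```python
def ell(a, y):
    """An illustrative base loss on the binary action space {0, 1}: 0-1 loss."""
    return float(a != y)

def ell_linear(p, y):
    """The natural linear extension of ell to the convexified space [0, 1]."""
    return p * ell(1, y) + (1 - p) * ell(0, y)

# Deterministic loss of the mixed action p equals the expected loss of
# randomizing: play a = 1 with probability p, a = 0 with probability 1 - p.
p, y = 0.3, 1
expected_randomized_loss = sum(prob * ell(a, y) for a, prob in [(1, p), (0, 1 - p)])
assert ell_linear(p, y) == expected_randomized_loss  # equal by linearity
```

For a non-convex loss on [0,1] this identity fails, and that is exactly where randomization becomes necessary.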
Very roughly, if the loss function is convex and smooth at the
minimum,
you can even get a regret of O(log T) instead of O(√T).
https://doi.org/10.1111/j.1751-5823.2001.tb00457.x
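The O(log T) rate for curved losses can be seen with online gradient descent; the sketch below uses the squared loss and the standard 1/(lam*t) step-size schedule, which are textbook choices rather than anything taken from the reference above:

```python
import math

def ogd_strongly_convex(outcomes, lam=2.0):
    """Online gradient descent with step-size schedule eta_t = 1/(lam*t).

    The squared loss (p - y)^2 is lam = 2 strongly convex; the classical
    guarantee is regret <= (G^2 / (2*lam)) * (1 + ln T), G = max |gradient|.
    """
    p, alg_loss = 0.5, 0.0
    for t, y in enumerate(outcomes, start=1):
        alg_loss += (p - y) ** 2
        grad = 2 * (p - y)
        p -= grad / (lam * t)        # gradient step with eta_t = 1/(lam*t)
        p = min(1.0, max(0.0, p))    # project back onto the action space [0, 1]
    return alg_loss

T = 1000
outcomes = [t % 2 for t in range(T)]   # alternating 0, 1, 0, 1, ...
alg_loss = ogd_strongly_convex(outcomes)
# Best fixed action in hindsight under squared loss is the mean, here 0.5.
best_fixed = sum((0.5 - y) ** 2 for y in outcomes)
regret = alg_loss - best_fixed  # with G = 2 and lam = 2: regret <= 1 + ln T
```

The same schedule with a generic convex loss and eta_t ~ 1/sqrt(t) only gives the O(sqrt(T)) rate, so the curvature at the minimum is what buys the logarithmic regret.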
Oops, I just realized that my last reply slightly mischaracterized the Bayesian mixture-of-environments setting, saying that it measures regret with respect to computable agents, when in fact it measures regret with respect to the optimal agent. So, I suppose (2) is really a specialization rather than a generalization. If anyone points out additional errors or ambiguities on my part, I'll make sure to address them.
In any case, my question is about the setting with arbitrary sequences, where I'd like to bound the regret of some deterministic agent (perhaps Chapter 3.4's Λ_ξ) with respect to the set of computable agents. I'm under the impression that this question has a positive answer for "convex" action spaces, since they can emulate randomized PEA strategies, but is open in the general case. I'm curious to hear your thoughts on it.
Thanks again!
On Wednesday, March 16, 2022 at 8:24:30 PM UTC-7 Aram Ebtekar wrote: