| Berry's Paradox and universal distribution | Wei Dai | 28/05/09 04:56 | This is a reply to a part of Abram's "join post" on the everything-list (http://dragonlogic-ai.blogspot.com/2008/11/interesting-mailing-list-i-recently.html). I'm the owner of that mailing list but haven't been following it closely in recent times, so I only just noticed Abram's post. I'm replying here since the topic might be more suitable for this list.

Abram Demski wrote:

This is my position as well: there is no reason to assume that the universe …

The argument is very simple. Consider an arbitrary probability distribution …

(I've stated variants of this argument before at [1] and [2], but this …

Is there a way around this? If not, how should we (or an AI) deal with …

[1] "is induction unformalizable?" |
| Re: Berry's Paradox and universal distribution | Eliezer Yudkowsky | 28/05/09 20:21 | Solomonoff's universal distribution isn't trying to assume the universe is computable. It's trying to beat any computable probability distribution you could put over the universe. In other words, if your attempt to *calculate*, to *talk about* and *reason about* the universe is computable - even if you are talking *about* systems that you think are uncomputable - then your ratiocinations appear somewhere in Solomonoff induction.

> (http://dragonlogic-ai.blogspot.com/2008/11/interesting-mailing-list-i...).
> [1] "is induction unformalizable?" http://groups.google.com/group/everything-list/browse_thread/thread/c...
> [2] http://lesswrong.com/lw/f9/a_request_for_open_problems/bws |
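A sketch of the standard result behind Eliezer's claim (textbook algorithmic information theory, not spelled out in the thread itself): the universal semimeasure $M$ multiplicatively dominates every lower-semicomputable semimeasure $\mu$,

```latex
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)},
\qquad
M(x) \;\ge\; 2^{-K(\mu)}\,\mu(x) \quad \text{for all } x,
```

where $U$ is a universal prefix machine, $\ell(p)$ is the length of program $p$, and $K(\mu)$ is the length of the shortest program computing $\mu$. Any computable way of reasoning about the universe defines such a $\mu$, so it is absorbed into the mixture at the cost of only a constant factor.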
| Re: [one-logic] Re: Berry's Paradox and universal distribution | AbramDemski | 29/05/09 12:31 | Eliezer,

Good point. Although, since there are other uncomputable distributions that dominate the Solomonoff distribution, that description does not uniquely recommend the Solomonoff distribution. One alternative would be Jürgen Schmidhuber's universal distribution over what he called "descriptions". But that distribution is not even computable in the limit, unlike the Solomonoff distribution.

This could be used to argue that the Solomonoff distribution is the "best we can do" in some sense... but, even though we cannot approximate Jürgen's distribution to arbitrary precision given enough processing power, I suppose we could *partially* approximate it, using a theorem prover to prove convergence for as many descriptions as possible... and a similar argument holds for even more difficult distributions. The question is: is there a significant sense in which this is different from approximating the Solomonoff distribution?

(For those on this list not familiar with these issues, I recommend starting by looking up "algorithmic information theory".)

--Abram

--
Abram Demski
http://dragonlogic-ai.blogspot.com/ |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 29/05/09 12:34 | Eliezer Yudkowsky wrote: …

That's an interesting perspective. I agree that a human being, assuming he …

But what if the human could augment himself with uncomputable physics, and … |
| Re: [one-logic] Berry's Paradox and universal distribution | AbramDemski | 29/05/09 12:40 | Wei Dai,

Glad to hear from you. I am struck anew by the simplicity of this paradox every time I look at it.

For those on this list not familiar with the background material for this paradox, it essentially relies on a formalization of Occam's Razor. The basic assumption is that, in the absence of evidence, the only deciding factor between two theories is their length written down-- probability decreases as length increases. So, anything with a short description length is fairly probable.

I've attempted to reply to this paradox before, but I only sort of waved my hands and pointed to assumptions that might be questionable. Right now, my first thought is that "the lexicographically least object such that P(x) < 1/3^^^3" is ill-defined. (For those not familiar with Knuth's up-arrow notation, I recommend looking it up, just because it is fun. Although, really, all you need to know is that 1/3^^^3 is a very small probability with a fairly short description.)

So, what plausible arguments make such an entity ill-defined? Clearly it will not have a place on the Tarski hierarchy, since it refers to all levels. In a loopy theory, I'd think it would turn out to be one of those things about which we must remain silent. (That is, a loopy theory like Kripke's theory of truth, but applied to probability as well...) Of course, these strategies are (imo) not "referentially complete", which is the whole point...

Overall, my feeling is that any treatment of the more standard paradoxes tends to carry over to a treatment of this paradox.

--Abram

2009/5/28 Wei Dai <wei...@weidai.com>: |
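For readers who do look it up: Knuth's up-arrow notation has a one-line recursive definition. A minimal sketch follows (3^^^3 itself is far too large ever to evaluate, so the demo sticks to tiny arguments):

```python
# Knuth's up-arrow notation, defined recursively: one arrow is
# exponentiation, and each extra arrow iterates the previous operation.
def up_arrow(a, n, b):
    """Compute a ^...^ b with n up-arrows."""
    if n == 1:
        return a ** b          # base case: a ^ b
    if b == 0:
        return 1               # by convention a ^...^ 0 = 1
    return up_arrow(a, n - 1, up_arrow(a, n, b - 1))

print(up_arrow(3, 1, 3))   # 3^3 = 27
print(up_arrow(2, 2, 4))   # 2^^4 = 2**(2**(2**2)) = 65536
print(up_arrow(2, 3, 2))   # 2^^^2 = 2^^2 = 4
```

Even 3^^2 = 3^3 = 27 and 3^^3 = 3^27 ≈ 7.6 trillion; 3^^^3 iterates that tower astronomically, which is why 1/3^^^3 is such a tiny probability with such a short description.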
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 29/05/09 18:20 | Abram Demski wrote: …

Did you post your previous reply somewhere?

> Right now, my first thought is that the lexicographically least object …

But that object is well-defined for any particular P that's well-defined, …

> Clearly it will not have a place on the Tarski hierarchy, since it …

I might be out of my depth here, but it seems to me that the …

BTW, I've found some academic research into whether it's ok to quantify over … |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | AbramDemski | 30/05/09 16:34 | Wei Dai,

I sent it to you, if I recall correctly. All I did was question the assumption that description length should be the only factor in determining probability. I still find this objection plausible, but perhaps not the best objection. After all, I do not take your paradox as *proof* that description length should not be the only factor.

Does the paradox rely critically on an assumption that description length should be the only factor? Or can you derive the paradox just from the assumption that the prior should dominate other priors?

That was my intention, yes.

Yes, sounds right. In such an approach the prior would fail to be well-defined. The semantics of the logic would fail to be well-defined according to itself, too, so that is not surprising.

So, what I am saying is that I do not agree with *current* solutions to these problems, but for those solutions, it looks like everything carries over just fine to your paradox. So it seems as if any *actual* solution will have the same property. On the other hand, a solution might still fall prey to your paradox, but give a good story as to why the situation is acceptable. This seems fairly plausible, too.

--Abram |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 30/05/09 19:46 | > I sent it to you, if I recall correctly.

Yes, I see that we had an email conversation on this topic last June. I …

> Does the paradox rely critically on an assumption that description …

Ok, here's a version based on dominance: for any probability distribution P …

> So, what I am saying is that I do not agree with *current* solutions …

Yes, I'm hoping that's the case too, which is why I'm here. :)

> On the other hand, a solution might still fall prey to your paradox, …

I'm still thinking through implications of Eliezer's response, which seems … |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | AbramDemski | 02/06/09 06:58 | Wei,

Hmmm. With respect to inductive priors, I understand Eliezer's response as saying that what *I'm* doing, in asking for a better prior, is (at best) only making *more specific* requirements on a prior-- since it will be computably approximated, there is no way it can strictly dominate the Solomonoff prior. In other words: any description-length-based prior over some computable deduction system *is* a Solomonoff prior, if the deduction system is Turing-complete; but I'm hoping it may also have some additional desirable properties. (Such properties include explicit recognition of a halting oracle in the environment, if there is sufficient evidence for it.)

How would Eliezer's response carry over to other paradoxes? "First-order logic does not necessarily assume that the domain of discourse is computable, but rather that the logic must be run on a computer." Something like that?

--Abram |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 09/06/09 23:05 | > (Such properties include explicit recognition of a halting oracle in …

I have found some discussion about how human beings might be able to …
http://www.cs.nyu.edu/pipermail/fom/2004-March/008010.html

> "First-order logic does not necessarily assume that the domain of …

But what's the counterpart of Solomonoff's Universal Distribution in this … |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | AbramDemski | 10/06/09 11:58 | Wei Dai,

This proposal seems wrong to me. A language that contains all truths that can be proved by computable logics will simply contain all truths that can be stated finitely! For each truth, there is a computable logic that just spits out that one statement. So, your logic would just contain all true finite sentences.

Practically, what this means is that we can always usefully extend a computable logic by adding more true axioms. (The trouble, of course, is knowing which axioms are true.) For me, this is an indication that finding the widest set of knowable mathematical truths isn't the real problem (though it is still an interesting problem). The more basic question is what the broadest set of meaningful mathematical *statements* is. I.e., what is the universal language? Set theory? Kripke's theory of truth? Etc.

--Abram |
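Abram's "one computable logic per truth" construction is trivial enough to sketch directly (the example sentence is invented for illustration, and a string stands in for a formal sentence):

```python
# A (trivially) computable "logic": its theorem enumerator outputs
# exactly one fixed sentence and nothing else.
def single_statement_logic(statement):
    def theorems():
        yield statement
    return theorems

# For every true sentence there is such a logic, so the union of the
# theorems of all computable logics covers every finitely statable
# truth -- which is why the proposed language collapses as Abram says.
logic = single_statement_logic("BB(10) equals some particular enormous number")
print(list(logic()))
```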
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 10/06/09 20:34 | Eliezer Yudkowsky wrote: …

Damn my poor memory again. I just remembered that I already came up with an …

Let me try a different example here. Consider a hyperintelligent being (i.e. …

To make things simple, the HI chooses the computable algorithm to be "always …

So, Solomonoff's universal distribution is guaranteed to beat (technically, … |
| Re: Berry's Paradox and universal distribution | Eliezer Yudkowsky | 10/06/09 21:51 | On Jun 10, 8:34 pm, "Wei Dai" <wei...@weidai.com> wrote:

Actually, no. I believe the proofs are over probabilities, and the most elegant proofs are over accumulated log probabilities (Bayes-score). So what we want to do is establish that in the limit, and regardless of the environment, whether the environment is computable or not, there is no computable algorithm whose accumulated log-score beats the Solomonoff distribution - an (uncomputable) weighted mixture of all computable mixtures. And then the proof seems relatively trivial, since any computable algorithm has a finite weight in this mixture. Indeed the difference in their scores is bounded by (the log of) the initial finite weight, unless I've missed something.

So it's possible that you could put forth some sort of theorem about a computable decision process being defeated by an uncomputable environment. In fact this is quite easy - let a sequence of ladies and tigers always be placed by the environment to make a decision process perfectly wrong. Then it will be dominated by *any* other process that is not perfectly identical to it, including, probably, "always choose the door on the right", and so on.

But *epistemically* speaking, Solomonoff induction cannot do infinitely worse than anything computable. In the lady and the tiger case, I expect it would assign probabilities closer and closer to exactly 50%, so that despite being "wrong" on every occasion - despite assigning greater than 50% probability to the wrong answer on each occasion - the margin of greaterness would become so tiny that it would lose by only a finite amount to an algorithm that treated the ladies and tigers as a fair coin. (And of course there is no computable algorithm that actually makes the correct predictions!) |
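The log-score bound Eliezer gestures at here can be written out explicitly (standard notation, not from the thread): if $\mu$ is any computable predictor carrying weight $w_\mu$ in the universal mixture $M$, then for every finite sequence $x_{1:n}$,

```latex
M(x_{1:n}) \;\ge\; w_\mu \, \mu(x_{1:n})
\quad\Longrightarrow\quad
\log \mu(x_{1:n}) - \log M(x_{1:n}) \;\le\; -\log w_\mu,
```

so the accumulated log-score of $M$ trails that of any computable predictor by at most the constant $-\log w_\mu$, independent of $n$ and of whether the environment itself is computable.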
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 11/06/09 14:31 | Eliezer Yudkowsky wrote: …

Solomonoff's original proof was over probabilities, but other proofs are …

Another natural question is to ask for relations between the total number of …

> So what we want to do is establish that in the limit, and …

I think you're right, but only if the computable algorithm isn't allowed to …

So, given that the Solomonoff distribution is not guaranteed to be best … |
| Re: Berry's Paradox and universal distribution | Eliezer Yudkowsky | 11/06/09 17:23 | On Jun 11, 2:31 pm, "Wei Dai" <wei...@weidai.com> wrote:

Hey, well, so much the worse for them, I'd say. If you're dealing with non-exceptional situations - non-devilish environments - then shouldn't a proof of *epistemic* error-boundedness generally carry over to a proof of decision error-boundedness? In other words, are you sure you're not assuming that the environment is a superintelligent adversary which is strictly more superintelligent than you, which is the sort of reasoning that leads people to adopt randomized algorithms?

In general, even though I see epistemic rationality as ultimately deriving from instrumental rationality, I think that epistemics are simpler than decisions, and so I would trust more a proof that began with epistemic correctness and moved from there to instrumental correctness.

I'm not clear on what you mean by this. The Solomonoff distribution contains all computable and consistent probability mixtures. Is the idea that there are concrete decision problems where we can do better by admitting computable yet inconsistent probability mixtures? This, to say the least, is a non-obvious idea, and it needs to be cashed out a bit more clearly in terms of the admissible language of the problem and its solution. Any concrete sequence of sensory data can be predicted by Solomonoff as well as by any computable consistent mixture, up to a bounded total probabilistic error. So what sort of decision problem are we talking about, with a computable decision algorithm, which goes outside this formalism, in a way that enables it to exhibit decisive instrumental superiority despite its lack of decisive epistemic superiority? Is it only pathological cases with adversarial superior superintelligences that would be expected to produce an infinite series of lady-and-tiger bad bets, or is there some more natural case?

By the way, if at any point I stop making sense - I'm being briefer here than I am in my blog posts - be sure to let me know. |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | AbramDemski | 11/06/09 18:10 | Eliezer, Wei Dai,
I am thinking that a better way of interpreting Wei Dai is to add *incompletely specified* probability distributions, not inconsistent ones. So, for example, a probability distribution that is described in terms of the busy beaver function would be incompletely specified, because the logic we're using will only be able to prove some facts about it. In Wei Dai's example, the epistemology and the decision theory interact in an interesting way-- so that although we cannot compute the utility, we know something about how it compares to the probability. It is hard to see exactly how this would lead to action, however, since a specific utility value is not known. --Abram |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 11/06/09 20:00 | To restate my idea, a computable decision process isn't necessarily limited to using a computable probability distribution. Suppose the agent has a formally specified, consistent probability function that isn't computable. He may be able to prove that EU(A) > EU(B) even if he can't compute EU(A) and EU(B), where A and B are choices that he faces. (This can be the case whether or not his utility function is computable.) This way, he can possibly make better decisions than an agent using Solomonoff's distribution.

If you ask him to give you a prediction about something as a probability, he might have to give you the probability in a non-algorithmic notation, like 1/BB(100). Internally, he is also reasoning about probabilities and expected utilities in these non-algorithmic forms, so it seems reasonable to say that he in fact has P(0) = 1/BB(100) as an epistemic belief. Does that clear it up?

(Of course in general he's likely to have choices C and D such that he can't …

Eliezer asks for more natural examples. That's reasonable and something I'll … |
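A toy sketch of that style of reasoning (the utilities and the bound on BB(100) are invented for illustration): even when a probability like 1/BB(100) cannot be computed, provable interval bounds on it can settle which of two expected utilities is larger.

```python
from fractions import Fraction

class Interval:
    """A closed interval [lo, hi] bounding an uncomputable quantity."""
    def __init__(self, lo, hi):
        self.lo, self.hi = Fraction(lo), Fraction(hi)
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __mul__(self, other):  # assumes nonnegative endpoints
        return Interval(self.lo * other.lo, self.hi * other.hi)
    def certainly_greater(self, other):
        """True iff every value in self exceeds every value in other."""
        return self.lo > other.hi

# p = 1/BB(100): uncomputable, but (since BB(100) > 10**100, a very
# weak lower bound) provably between 0 and 10**-100.
p = Interval(0, Fraction(1, 10**100))
one_minus_p = Interval(1 - p.hi, 1 - p.lo)

# Hypothetical utilities: EU(A) = 10*p + 1*(1-p), EU(B) = 100*p.
eu_a = Interval(10, 10) * p + Interval(1, 1) * one_minus_p
eu_b = Interval(100, 100) * p

print(eu_a.certainly_greater(eu_b))  # → True: EU(A) > EU(B) is provable
```

Neither EU(A) nor EU(B) is ever evaluated exactly, yet the comparison is decided, which is the shape of the agent Wei Dai describes.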
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 12/06/09 12:02 | I've come up with a more natural example of a decision problem where Solomonoff's distribution does infinitely worse than a human/computable agent. It doesn't depend on an adversarial superintelligence.

Consider the sequence of unary encodings of BB(1), BB(2), BB(4), ... again. It occurs to me that Solomonoff's distribution actually makes two kinds of mistakes. The one I stated at http://www.overcomingbias.com/2008/11/complexity-and/ was that after seeing 1s long enough, it eventually predicts a probability for 0 that's too low. The other mistake is, for quite a while after seeing a 0, it predicts a probability for 0 that's too high. It's easier to construct an example using the latter error instead of the former. So consider a game where in each round the player is asked to bet on 0 or 1.

It seems clear that for a human, except for the first few rounds where he …

The uncomputable probability distribution that the human is using here is …

What about the agent using Solomonoff's distribution? After seeing …

Is this example more convincing?

Oh, in case there's objection that the frequency of Solomonoff's … |
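The structure of this game can be sketched in code. BB itself is uncomputable, so a fast-growing computable stand-in is used purely to show the game's shape; the real argument depends on BB's uncomputability, which no simulation can reproduce.

```python
# Hypothetical stand-in for BB(n); any fast-growing computable
# function illustrates the structure (but not the substance) of
# Wei Dai's game.
def fast(n):
    return 2 ** (2 ** n)

def game_sequence(max_k):
    """Runs of 1s of length fast(1), fast(2), fast(4), ..., each
    terminated by a single 0 -- the 'unary encodings' of the thread."""
    bits, k = [], 1
    while k <= max_k:
        bits += [1] * fast(k) + [0]
        k *= 2
    return bits

seq = game_sequence(4)
# A strategy in the spirit of the thread's human player: always bet
# on 1, accepting exactly one loss per block at the terminating 0.
losses = sum(1 for bit in seq if bit == 0)
print(losses, len(seq))  # → 3 65559
```

The always-bet-1 player loses once per block no matter how long the blocks grow; the claim in the thread is that Solomonoff's distribution, by overestimating the probability of 0 for a long stretch after each actual 0, accumulates unboundedly more loss on this kind of sequence.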
| Re: [one-logic] Re: Berry's Paradox and universal distribution | Wei Dai | 19/06/09 00:26 | I wrote earlier: …

I recalled that Jürgen Schmidhuber's Gödel Machine already contains ideas …

1. A Gödel machine starts with an initial strategy and tries to find …

2. A Gödel machine has fixed "environment axioms", essentially a … |
| Re: [one-logic] Re: Berry's Paradox and universal distribution | AbramDemski | 19/06/09 09:45 | Wei Dai,
Yesterday, I wrote up some thoughts on why uncomputable models might be useful in everyday life: http://dragonlogic-ai.blogspot.com/2009/06/importance-of-uncomputable-models-i.html --Abram |