# Randomly selecting from an infinite number of choices

Discussion in 'Physics & Math' started by Nasor, Jan 20, 2009.

1. ### RJBeeryNatural PhilosopherValued Senior Member

Messages:
4,136
Trippy: the reason I made RANDOM in all caps is because that is the term that necessitates a uniform distribution.

Also, with regards to your K-40 number generator it had more merit than I originally gave it. I tweaked the idea here:

...which would work. The problem that I was having is that your proposed system 1) had no way of terminating to know when the number has been selected, and 2) had no way of choosing a digit greater than 1 (either it decayed or it did not). It was therefore confusing when you would mention waiting 6 half-lives to determine 6 decimal digits. I realize now that, the binary/decimal digit issue aside, what you MEANT was that the LENGTH of the number being chosen was pre-determined, then you use the K-40 method to generate it.

If that's what you meant then you are correct that it would in fact be a uniform distribution for a given length.

3. ### RJBeeryNatural PhilosopherValued Senior Member

Messages:
4,136
Pete, the "infinite K-40 atom" proposal (or the ones that you mentioned) do produce finite numbers if the leading digits are all zero. If you claim that producing an infinite string of zeroes is not possible then I think we're just bumping up against the very impossibility of randomly choosing a finite number from the infinite set.

5. ### phytiRegistered Senior Member

Messages:
290
The question is what is the probability of selecting 1 choice from an infinite
number. Probability compares a part to the whole. In this case, the
whole is not defineable (unbounded), is indeterminate, thus probability does not apply.
You can match the set of even integers to the set of integers, 1 to 1, but you know that given a random sample of integers, 1/2 will be even.

7. ### PeteIt's not rocket surgeryRegistered Senior Member

Messages:
10,166
Yes, but that's a big if! Infinitely big! I mentioned it in post 186:
So, let the demon spin its infinite dice [or decay its infinite atoms, or flip its infinite coins] to get an infinite random string of digits.
And let it spin them again, and again, and again, in fact let it spin them and infinite number of times, until it spins a number that begins with an infinite string of zeros.​

Well, it's not necessarily that it's not possible, its that it turns the problem into a hypertask... the demon must not just complete infinite things in a finite time, it must complete uncountably infinite things in a finite time.

This process is more involved than it might appear:
If the demon sequentially generates countably infinite random strings of infinite digits, (eg it chooses one infinite string at t=0, another infinite string at t=1/2, a third at t=3/4, a fourth at t=7/8, etc), then I think that strictly zero of those strings will end with infinite zeros.
In order to have a non-zero chance of spinning a finite number (any finite at all!), the demon must spin its infinite dice uncountably infinite times.
Interesting, I think (not certain) that it follows that when the demon spins its first natural, it will have already completed uncountably infinite spins which were rejected for not being finite.

8. ### PeteIt's not rocket surgeryRegistered Senior Member

Messages:
10,166
Yes, I do.
Given any finite size, we can of course produce a uniform distribution of that size.

Trippy, the problem is about choosing from an infinite distribution, right?
So, what is interesting about the device? The distribution before the atom decay, or the distribution after the atom decay?

After the atom decay has chosen the order of magnitude, the distribution of possible numbers is finite and uniform.

Before the atom decay has chosen the order of magnitude, the distribution of possible numbers is infinite and nonuniform.

I agree that there are many opportunities for confusion and misunderstanding in this topic. We're touching on some difficult areas - set theory, information theory, and the seamier side of probability. I don't pretend to have a complete grasp on the topic, and there is a very high probability that I've made some serious errors (or at least naive misinterpretations) in the thread. But, I'm only a business major!

Cheers,
Pete

9. ### TrippyALEA IACTA ESTStaff Member

Messages:
10,890
No. A purely random distribution can still be biased, or polymodal.
Being random does not neccessitate a uniform distribution of probabilities.
An example - rolling 1d10 until a 6 comes up and recording the sequence of numbers as a number - it's biased towads short numbers ending in 6

You've misundestood my machine, re-read my original description:

The equivalent is you sitting there, rolling 1d10 until you're told to stop, and writing the results of each individual roll down on to a piece of paper, handing that piece of paper over to the other person who then reads the sting of single digit random numbers you've written down as a single random number.

Does that help clarify how you've misunderstood my machine?

It generates a string of single digit random numbers between 0 and 9.
The length of the string is determined by the decay of the K-40 atom, and how many random numbers I can generate per second.
That is the sole role of the K-40 atom: It determines how many digits are in the final number (that is, the string). The actual value of the random number that is generated (as the string) is determined by the random number generator.

The K-40 machine has a finite, non-zero probability of generating a number of any arbitrary length, and when it does so it has an equal chance of generating a number of any value.

Because the length of the number is important, before the machine activates and spits out a number there's a 5.26% chance (You can calculate this as the sum to infinity of a Geometric series) that it's going to produce a number that will be read as one, but that includes choices of (if I can miss apply some chemistry notation*):

$1,01,001,0001,00001,000001,0000001,...,000000[0]_n1$

However, if the machine produces a result after n halflives, then the result can only be in the interval:

$000000[0]_n =< x =< 999999[9]_n$

However the machine can not produce a result in the interval:

$000000[0]_{n-1} =< x =< 999999[9]_{n-1}$

But it can produce strings with the equivalent value.

For example, although it can't produce $999999[9]_{n-1}$ it can produce as a string $099999[9]_{n-1}$ which is the same length as $999999[9]_n$ but has the same value as a number as $999999[9]_{n-1}$, but the strings $9999999[9]_{n}$ $0999999[9]_{n-1}$ and $00000[0]_{n-1}9$ all have the same probability of being output if the machine gives you your number after n half lives.

Does this help any?

*In chemistry the notation $[ ]_n$ means "Repeat what's inside the square brackets n times.

Last edited: Jan 31, 2009
10. ### TrippyALEA IACTA ESTStaff Member

Messages:
10,890
There have been several already in fact

and for myself, I will confess to having been somewhat irritated at some of the suggestions that have been made - the prime one being that I am simply practicing my debating skills. If I continue arguing against an idea it is because I am genuinely unconvinced - for whatever reason, by the arguments being presented by the person I am debating with.

11. ### quadraphonicsBloodthirsty BarbarianValued Senior Member

Messages:
9,391
It's standard practice to assume that the distribution is uniform when it is not otherwise specified. Indeed, the entire point of this thought experiment is to illustrate how that assumption will get you into trouble in certain cases.

Likewise, it's not uncommon for the phrase "purely random" to denote "uniformly distributed," the idea being that, for any non-uniform distribution, there is some degree of "predictability," in the sense that you can predict that the outcome will be near the mode (or will not be near a low-probability region) and you will end up being correct more often than not. With a uniform distribution, you can't do that: any guess at what the outcome will be is as good as any other (provided it's within the support of the distribution).

The kicker to all this is that, for an infinite set, we MUST have a non-uniform distribution. But any non-uniform distribution we can apply will emphasize certain regions of the naturals over others. And, since we have no information on which regions should be more likely, that amounts to the addition of arbitrary information to the thought experiment. Which is to say that the correct answer is "there is no way to assign a unique probability to the event that the angel finds the demon without more information."

12. ### TrippyALEA IACTA ESTStaff Member

Messages:
10,890
No it isn't. In fact, in all of the Statistics i've ever studied, assuming a 'uniform' distribution can get you in a lot of trouble.

13. ### quadraphonicsBloodthirsty BarbarianValued Senior Member

Messages:
9,391
Any assumption can get you into a lot of trouble, and so it's always better to specify the distribution (uniform or otherwise). But expediency being what it is, uniformity has become the standard assumption for cases where the distribution is not specified (at least for finite sets). There are also other reasons to consider a uniform distribution as the "default," such as its uninformative nature (in the Bayes sense).

If you haven't observed this convention, well, give it some time. It's quite common, particularly once you get outside the realm of actual math classes (math professors are notoriously picky about this kind of thing, and so tend to always specify the distribution).

14. ### TrippyALEA IACTA ESTStaff Member

Messages:
10,890
Most of the distributions I deal with on a daily basis (I work for the equivalent of teh USEPA) are normal, or log normal. There's not much call for Binomial distrbibutions in environmental monitoring (that i've come across anyway).

15. ### quadraphonicsBloodthirsty BarbarianValued Senior Member

Messages:
9,391
Yeah, those two distributions, along with uniform, account for a huge proportion of the statistics that people actually use outside of math classes. The normal is very much the "default" distribution when your domain is the real line (or plane or etc.), although it lacks the "uninformative" features of the uniform distribution.

I'm sure you could find some situations if you looked. Indeed, many of the places where the Normal distribution is applied are actually approximations to binomial distributions where the number of Bernoulli trials is large.

16. ### TrippyALEA IACTA ESTStaff Member

Messages:
10,890
Perhaps, but the main place I could see that sort of thing being useful is if I wa slooking at things like Mean time between failures, which is something that might be useful in some regards, now that I think about it.

17. ### quadraphonicsBloodthirsty BarbarianValued Senior Member

Messages:
9,391
I'm pretty sure that uniformly drawing a natural number is a supertask. Drawing a real number would be a hypertask, in my understanding.

So I think the answer must be that if you let the demon pick digits for long enough, he will eventually type out an infinite set of zeros. But it would seem that the exact same argument would show that he will eventually type an infinite number of, say, 7's, so I must be missing something.

18. ### diseaseBannedBanned

Messages:
657
I think there might be some confusion about what a number is, and making a choice. One already exists, the other is a, well, a random walk, say.

It's an enumeration problem. A calculation is a Turing machine, you know.

P.S. which means the problem: The demon goes into a hotel with an infinite number of rooms, and randomly picks a room to hide in. The point of the story was that if the angel only gets to look behind one door, there is no possible way that he could ever find the demon.

...reduces to a decidability one. Can a program be written that can look into an infinite number of 'rooms' in one attempt?
Which translates into: can a Turing machine read an infinite number of symbols (on an infinite tape), in one algorithm, and decide one of them is "the demon".

Last edited: Jan 31, 2009
19. ### diseaseBannedBanned

Messages:
657
Note: if the machine reads just one symbol on the tape, there are an infinite number of unread symbols left.
But an infinite tape has no "start position", so the problem is now: which single symbol, on an infinite length tape should the machine read, or, how many symbols should it skip over before reading one? Now there are two decisions the "algorithm" needs to make.

Last edited: Jan 31, 2009
20. ### D HSome other guyValued Senior Member

Messages:
2,257
I agree with Trippy. It is standard practice to assume a Gaussian distribution. For one thing, there's the central limit theorem. For another, there's no random distribution more amenable to mathematical analysis than a Gaussian. This latter fact leads people to assume a Gaussian where they should not.

Correct again.

21. ### quadraphonicsBloodthirsty BarbarianValued Senior Member

Messages:
9,391
On the natural numbers? The only assumption that I consider 'standard' for the natural numbers is that the elements are ordered by decreasing probability (and of course that the probability adds up to 1).

What about it? There is no linear combination of independent variables going on here.

I would suggest that the uniform distribution is even more amenable to mathematical analysis than the Gaussian, or at least "as amenable." Expectations don't get much simpler than whent he density is constant, and the domain finite :]

22. ### PeteIt's not rocket surgeryRegistered Senior Member

Messages:
10,166
I would have thought so too, but I can't construct any countably infinite set of tasks that would do the job. That's not supposed to be conclusive, just a call for further research.

Drawing a real number is definitely a supertask - for example, roll a ten sided die infinite times, once for each digit in the decimal representation.

23. ### quadraphonicsBloodthirsty BarbarianValued Senior Member

Messages:
9,391
Yeah, same boat I'm in. Probably it's time to dig out some actual papers on this stuff and find out for real...

Seems like that would work just as well for a natural number, no? Except that you'd only have to draw "half" as many numbers, since there wouldn't be any to the right of the decimal point.

I am pretty confident that drawing a real number is a hypertask, and that drawing a natural number is only a supertask. I will let you know if I manage to either find examples of constructions that illustrate this, or find out that I am wrong.

Apparently there's also such a thing as a super-dupertask, which consists of a countable number of tasks, followed by a "last" task (0.999... != 1, anyone?). I nominate "super-dupertask" for the coolest name...