# Over-thinking the stochastic painting of spheres withdrawn from a box

Discussion in 'Physics & Math' started by rpenner, May 5, 2017.

1. ### rpennerFully WiredValued Senior Member

Messages:
4,833
On April 28, FiveThirtyEight.com posted the weekend puzzle contest (which closed Sunday night) with their solutions coming this Friday, May the Fifth. Here is a beautiful problem.

You play a game with four balls: One ball is red, one is blue, one is green and one is yellow. They are placed in a box. You draw a ball out of the box at random and note its color. Without replacing the first ball, you draw a second ball and then paint it to match the color of the first. Replace both balls, and repeat the process. The game ends when all four balls have become the same color. What is the expected number of turns to finish the game?

Extra credit: What if there are more balls and more colors?
https://fivethirtyeight.com/features/can-you-solve-these-colorful-puzzles/

Code:

Graphics3D[{
Blue,
Opacity[0.5],
Sphere[{0, 1, 0}, 0.45],
Opacity[1],
Sphere[{3, 1, 0}, 0.45],Sphere[{4.01, 1.01, 0}, 0.45],
Opacity[0.8],
Arrow[BSplineCurve[{{0, 1, 0.6}, {0, 1, 2}, {3, 1, 2}, {3, 1, 0.6}}]],
Opacity[1],
Arrow[BSplineCurve[{{3, 1, 0.6}, {3, 1, 2}, {6, 1, 2}, {6, 1, 0.6}}]],
Arrow[BSplineCurve[{{4, 1, 0.6}, {4, 1, 2}, {7, 2, 2}, {7, 2, 0.6}}]],
Green, Sphere[{0, 2, 0}, 0.45], Sphere[{6, 2, 0}, 0.45],
White, Opacity[0.5], Sphere[{1, 2, 0}, 0.45], Opacity[1], Sphere[{4, 1, 0}, 0.45],
GrayLevel[0.5], Opacity[0.8], Arrow[BSplineCurve[{{1, 2, 0.6}, {1, 2, 2}, {4, 1, 2}, {4, 1, 0.6}}]],Opacity[1],
Red, Sphere[{1, 1, 0}, 0.45], Sphere[{7, 1, 0}, 0.45],
Blue, Sphere[{6, 1, 0}, 0.45],
Blue, Sphere[{7, 2, 0}, 0.45],
Yellow, Opacity[0.3], Cuboid[{-0.5, 0.5, -0.5}, {1.5, 2.5, 0.5}], Cuboid[{5.5, 0.5, -0.5}, {7.5, 2.5, 0.5}]
}, Boxed->False, ViewPoint->{3.5,-4, 3}]


danshawen likes this.

3. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Underthinking
I predict a number of people will attempt simulation to attempt to model this. Simulation is reasonably fast to code, fast to run, and directly tackles the process. However, there is not much insight and what you get is only an estimate of the solution. This estimate is subject to sampling error (as well as side effects from limitations of pseudorandom number generators) and the potential for coding errors. And if the result suggests that the answer might be an integer, this method cannot prove it.

Code:
for loop in {0,1}{0,1,2,3,4,5,6,7,8,9} ; do perl -e 'my ($n,$original_word); $n ||= 500;$original_word = q(ABCD); $m = length$original_word; for $outer_loop ( 1..100 ) {$total_steps = 0; for  $i ( 1..$n ) { $steps = 0;$word = $original_word; while ($word ne ((substr $word, 0,1) x$m)){ my $first = int rand$m; my $second = ($first + 1 + int rand ($m -1) ) %$m ; substr($word,$second, 1) = substr($word,$first, 1); $steps ++ } ;$total_steps += $steps; }$ave_steps = $total_steps /$n ; $sum_x +=$ave_steps; $sum_xx +=$ave_steps * $ave_steps ; }$mean_steps = $sum_x / 100.0 ;$std_dev = sqrt($sum_xx / 100.0 -$mean_steps * $mean_steps ); print "$mean_steps +/- \$std_dev\n"; '  ; done
8.98804 +/- 0.261424402839419
9.02652 +/- 0.280059796471987
9.02866 +/- 0.251948257386407
9.02970 +/- 0.273957861723211
9.00462 +/- 0.274443829589966
8.97784 +/- 0.269200249628428
9.03252 +/- 0.227146405650825
8.97138 +/- 0.283900643887821
9.00004 +/- 0.268573711297285
8.96382 +/- 0.259882757411781
8.99058 +/- 0.264163176086292
8.97590 +/- 0.261922488534204
8.99922 +/- 0.235823136269459
9.05588 +/- 0.256076210531268
9.02954 +/- 0.272420095440842
9.04244 +/- 0.258056827849984
9.03304 +/- 0.24901734558044
9.00062 +/- 0.271533378426867
9.00456 +/- 0.303176922604778
9.03930 +/- 0.232207558016679

Code:
BoxWhiskerChart[{8.98804, 9.02652, 9.02866, 9.0297, 9.00462, 8.97784, 9.03252, 8.97138, 9.00004, 8.96382, 8.99058, 8.9759, 8.99922, 9.05588, 9.02954, 9.04244, 9.03304, 9.00062, 9.00456, 9.0393}]
And really, where is the fun? The discovery?

danshawen likes this.

5. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Stochastic Matrixes and Simulation
Instead of having sampling error, we could simulate the probability of outcomes. First we need to enumerate all possible states, then set up a stochastic matrix which has the property that each column sums to 1 and where each element is the probability of transition (or non-transition if the element is on the diagonal where the initial and final states are the same). Repeated left multiplication of this matrix on an initial state vector gives the simulation of stepwise evolution of the probabilities.

A big gotcha: we have $4^4 = 256$ possible states. That means our matrix of transitions is $256 \times 256$ and even though it is mostly zero (no column can have more than 12 non-zero entries) it is prohibitively large. This approach also ignores that the problem was given as a puzzle.

Partitions

Beginning to think about this, we don't actually care which color the fourth ball winds up as. In fact, we don't actually care what any of the colors are. So how can we abstract out the colors? Well, we start with one ball of each of 4 colors and wind up with four balls of one color. So thinking about the set of four balls, we can replace that with the number four and replace the $4^4$ possible color states of the set of four balls with partitions of the number 4 into integers.

4 can be partitioned into distinct sums only 5 different ways:
1+1+1+1, 2+1+1, 2+2, 3+1, and 4

This looks doable.

Code:
GraphPlot[ {{"1+1+1+1"->"2+1+1",1}, {"2+1+1" -> "2+1+1", 1/2}, {"2+1+1"->"2+2", 1/6}, {"2+1+1" ->"3+1", 1/3}, {"2+2"->"2+2", 1/3}, {"2+2"->"3+1", 2/3}, {"3+1"->"2+2", 1/4}, {"3+1"->"3+1",1/2}, {"3+1"->"4", 1/4}, {"4"->"4", 1}}, EdgeRenderingFunction -> ({RGBColor[0.8,0.4, 0.4], Arrow[#1, 0.25], RGBColor[0.4, 0.8, 0.4], Inset[#3, Mean[#1] , Alignment->Center, Background->RGBColor[1,1,1,0.5]]}&), MultiedgeStyle->.3, VertexRenderingFunction ->({ Blue, Inset[#2, #1]} &), Method->SpringElectricalEmbedding ]

$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & \frac{1}{6} & \frac{1}{3} & \frac{1}{4} & 0 \\ 0 & \frac{1}{3} & \frac{2}{3} & \frac{1}{2} & 0 \\ 0 & 0 & 0 & \frac{1}{4} & 1 \end{pmatrix}$

Last edited: May 5, 2017
danshawen likes this.

7. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Tricks with Matrices
In addition to the five state transition matrix, lets have it automatically add 1 to a counter for the portion of our state vectors not in the final state. Schematically:
$S_n = M^n S_0 \quad \longrightarrow \quad \begin{pmatrix} S_n \\ \textrm{count}_n \end{pmatrix} = \begin{pmatrix} M & 0 \\ \textrm{1s or 0} & 1 \end{pmatrix}^n \begin{pmatrix} S_0 \\ 0 \end{pmatrix}$
It's clear that since the state vector sums to 1, a property preserved by stochastic matrices, the count grows by no more than 1 per iteration. In fact it grows by exactly the amount of the state vector not in the final state.
Our augmented matrix looks like this:
$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & \frac{1}{2} & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{6} & \frac{1}{3} & \frac{1}{4} & 0 & 0 \\ 0 & \frac{1}{3} & \frac{2}{3} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{4} & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 & 1\end{pmatrix}$

And our augmented state vector evolves like this:
$\begin{pmatrix}1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix}0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix}0 \\ \frac{1}{2} \\ \frac{1}{6} \\ \frac{1}{3} \\ 0 \\ 2 \end{pmatrix}, \begin{pmatrix}0 \\ \frac{1}{4} \\ \frac{2}{9} \\ \frac{4}{9} \\ \frac{1}{12} \\ 3 \end{pmatrix}, \begin{pmatrix}0 \\ \frac{1}{8} \\ \frac{49}{216} \\ \frac{49}{108} \\ \frac{7}{36} \\ \frac{47}{12} \end{pmatrix}, \begin{pmatrix}0 \\ \frac{1}{16} \\ \frac{17}{81} \\ \frac{34}{81} \\ \frac{133}{432} \\ \frac{85}{18} \end{pmatrix}, \dots$

Last edited: May 5, 2017
danshawen likes this.
8. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Uh oh, where in simulation, each simulation individually came to an end, the state vector always has a non-zero component not yet in the final state.
What we want is (schematically) :
$\begin{pmatrix} S_{\infty} \\ \textrm{count}_{\infty} \end{pmatrix} = \lim_{n\to\infty} \begin{pmatrix} M & 0 \\ \textrm{1s or 0} & 1 \end{pmatrix}^n \begin{pmatrix} S_0 \\ 0 \end{pmatrix}$
But the rational numbers get increasingly unwieldy. Is it really possible to prove such a limit exist?

Enter Linear Algebra
In this case, we have a friend in eigenvalues. This matrix oddly enough has rational eigenvectors and eigenvalues. So we can quickly learn that all eigenvalues lie between 0 and 1. This is a diagonal matrix which we can raise to arbitrary (non-negative) powers by
$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & \frac{1}{2} & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{6} & \frac{1}{3} & \frac{1}{4} & 0 & 0 \\ 0 & \frac{1}{3} & \frac{2}{3} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{4} & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 & 1\end{pmatrix}^n = \begin{pmatrix} -3 & 1 & 0 & 0 & 0 & 0 \\ 6 & -2 & 1 & 0 & 0 & 0 \\ 0 & 1 & -\frac{1}{2} & -\frac{1}{18} & 0 & 0 \\ -4 & 0 & -1 & -\frac{1}{9} & 0 & 0 \\ 1 & 0 & \frac{1}{2} & \frac{1}{6} & 0 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0\end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{5}{6} & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}^n \begin{pmatrix} -\frac{1}{5} & 0 & \frac{1}{5} & -\frac{1}{10} & 0 & 0 \\ \frac{2}{5} & 0 & \frac{3}{5} & -\frac{3}{10} & 0 & 0 \\ 2 & 1 & 0 & 0 & 0 & 0 \\ -\frac{54}{5} & -9 & -\frac{36}{5} & -\frac{27}{5} & 0 & 0 \\ 9 & 8 & 7 & \frac{11}{2} & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 0 \end{pmatrix}$

Thus
$\lim_{n\to\infty} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & \frac{1}{2} & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{6} & \frac{1}{3} & \frac{1}{4} & 0 & 0 \\ 0 & \frac{1}{3} & \frac{2}{3} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{4} & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 & 1\end{pmatrix}^n \begin{pmatrix}1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$
$= \begin{pmatrix} -3 & 1 & 0 & 0 & 0 & 0 \\ 6 & -2 & 1 & 0 & 0 & 0 \\ 0 & 1 & -\frac{1}{2} & -\frac{1}{18} & 0 & 0 \\ -4 & 0 & -1 & -\frac{1}{9} & 0 & 0 \\ 1 & 0 & \frac{1}{2} & \frac{1}{6} & 0 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0\end{pmatrix} \left( \lim_{n\to\infty} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{5}{6} & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}^n \right) \begin{pmatrix} -\frac{1}{5} & 0 & \frac{1}{5} & -\frac{1}{10} & 0 & 0 \\ \frac{2}{5} & 0 & \frac{3}{5} & -\frac{3}{10} & 0 & 0 \\ 2 & 1 & 0 & 0 & 0 & 0 \\ -\frac{54}{5} & -9 & -\frac{36}{5} & -\frac{27}{5} & 0 & 0 \\ 9 & 8 & 7 & \frac{11}{2} & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 0 \end{pmatrix} \begin{pmatrix}1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$
$= \begin{pmatrix} -3 & 1 & 0 & 0 & 0 & 0 \\ 6 & -2 & 1 & 0 & 0 & 0 \\ 0 & 1 & -\frac{1}{2} & -\frac{1}{18} & 0 & 0 \\ -4 & 0 & -1 & -\frac{1}{9} & 0 & 0 \\ 1 & 0 & \frac{1}{2} & \frac{1}{6} & 0 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0\end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -\frac{1}{5} & 0 & \frac{1}{5} & -\frac{1}{10} & 0 & 0 \\ \frac{2}{5} & 0 & \frac{3}{5} & -\frac{3}{10} & 0 & 0 \\ 2 & 1 & 0 & 0 & 0 & 0 \\ -\frac{54}{5} & -9 & -\frac{36}{5} & -\frac{27}{5} & 0 & 0 \\ 9 & 8 & 7 & \frac{11}{2} & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 0 \end{pmatrix} \begin{pmatrix}1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$
$= \begin{pmatrix} -3 & 1 & 0 & 0 & 0 & 0 \\ 6 & -2 & 1 & 0 & 0 & 0 \\ 0 & 1 & -\frac{1}{2} & -\frac{1}{18} & 0 & 0 \\ -4 & 0 & -1 & -\frac{1}{9} & 0 & 0 \\ 1 & 0 & \frac{1}{2} & \frac{1}{6} & 0 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0\end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -\frac{1}{5} \\ \frac{2}{5} \\ 2 \\ -\frac{54}{5} \\ 9 \\ 1 \end{pmatrix}$
$= \begin{pmatrix} -3 & 1 & 0 & 0 & 0 & 0 \\ 6 & -2 & 1 & 0 & 0 & 0 \\ 0 & 1 & -\frac{1}{2} & -\frac{1}{18} & 0 & 0 \\ -4 & 0 & -1 & -\frac{1}{9} & 0 & 0 \\ 1 & 0 & \frac{1}{2} & \frac{1}{6} & 0 & 1 \\ 1 & 0 & 1 & 1 & 1 & 0\end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 9 \\ 1 \end{pmatrix}$
$= \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 9\end{pmatrix}$

Thus, we have the expectation of 9 steps are needed for all balls to be painted the same color, in good agreement with the simulation-derived estimate.

danshawen and iceaura like this.
9. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Automation

All of the above can be coded and run for different numbers of balls from 1..15. Maybe higher if you don't use a free-to-use cloud computing resource.
Dare we hope that mere humans be able to spot a pattern in this sequence?
Code:
2 -> 1
3 -> 4
4 -> 9
5 -> 16
6 -> 25
7 -> 36
8 -> 49
9 -> 64
10 -> 81
11 -> 100
12 -> 121
13 -> 144
14 -> 169
15 -> 196 
That was a lot of work for a very simple pattern. But the matrices get larger and larger, as the number of partitions of a integer grows rather quickly. Yet investigation shows that the eigenvalues remain rational. Something is going on.

Code:
nColors = 15 ;
nParts = PartitionsP[nColors] ;
parts = Reverse[IntegerPartitions[nColors]] ;
partToPaint[partition_] := Flatten[MapIndexed[ ( ConstantArray[#2[[1]],#1] ) &, partition ]]
indexToPaint = Map[ partToPaint, parts] ;
paintToIndex[list_] := FirstPosition[parts, Reverse[ Sort [ Map[ ( #1[[2]] ) & , Tally[list]]]]][[1]] ;
repaint[ painted_ , index_ ] := Module [ { i, j, first, repainted, newIndex },
i = Mod[index, nColors  ] ;
j = Mod[(index -  i)/nColors , nColors - 1 ] ;
first = painted[[1 + i]] ;
repainted = Flatten[ { Drop[ Drop[ painted, { 1 + i } ], {1 + j } ], { first, first }  }] ;
newIndex = paintToIndex[ repainted ] ;
newIndex ] ;
factor = nColors * (nColors - 1) ;
transitions = Normal[SparseArray[Normal[Counts[Flatten[Table[{repaint[indexToPaint[[m]], n], m}, {n, 0, factor -1}, {m, 1, nParts}], {1,2}]]]]] / factor ;
composite = PadRight[Append[ transitions,  Table[ Boole[k != nParts], {k, 1, nParts + 1}] ]] ;
eigensystem = JordanDecomposition[composite] ;
limitEig = MapAt[ (Boole[ #1 == 1]) & , eigensystem[[2]], {All, All} ] ;
initialVector = Table[ {Boole[k==1] }, {k, 1, nParts+1}] ;
expectation = ( eigensystem[[1]] . limitEig . Inverse[eigensystem[[1]]] . initialVector )[[ nParts+1,1]] 
Code:
Tally[Diagonal[eigensystem[[2]]]]

<|0 -> 41, 14/105 -> 34, 27/105 -> 24, 39/105 -> 21, 50/105 -> 14, 60/105 -> 12, 69/105 -> 8, 77/105 -> 7,  84/105 -> 4, 90/105 -> 4, 95/105 -> 2, 99/105 -> 2, 102/105 -> 1, 104/105 -> 1, 105/105 -> 2|>
All the eigenvalues are of the form $\frac{ { 15 \choose 2 } - { m \choose 2 } }{ { 15 \choose 2 } }$ (for m between 1 an 15 ) and are in large blocks.

This reflects the structure we haven't captured in our model, namely the irreversible steps are those transitions where the total number of colors goes down.

Could we have a simpler model?

danshawen likes this.
10. ### rpennerFully WiredValued Senior Member

Messages:
4,833
A Simpler Model
Code:
 nColors=15;
transitions = DiagonalMatrix[1 - Table[Binomial[nColors + 1 - k,2]/Binomial[nColors,2],{k,1,nColors}]] + DiagonalMatrix[Table[Binomial[nColors + 1- k,2]/Binomial[nColors,2],{k,1,nColors-1}], -1, nColors] ;
composite = PadRight[Append[ transitions, Table[ Boole[k != nColors], {k, 1, nColors + 1}] ]] ;
eigensystem = JordanDecomposition[composite] ;
limitEig = MapAt[ (Boole[ #1 == 1]) & , eigensystem[[2]], {All, All} ] ;
initialVector = Table[ {Boole[k==1] }, {k, 1, nColors+1}] ;
expectation = ( eigensystem[[1]] . limitEig . Inverse[eigensystem[[1]]] . initialVector )[[ nColors+1,1]] 
All that math with partitions and figuring out how the transitions work, gone.

$\lim_{n\to\infty} \begin{pmatrix} 0&0&0&0&0\\ 1&\frac{1}{2}&0&0&0\\ 0&\frac{1}{2}&\frac{5}{6}&0&0\\ 0&0&\frac{1}{6}&1&0\\ 1&1&1&0&1 \end{pmatrix} \begin{pmatrix} 1\\ 0\\ 0\\ 0\\ 0 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0\\ 1\\ 9 \end{pmatrix}$

But, wait, with a model this simple, we can go one step further. Each little step adds to the expectation value by its expection time of eliminating one of its colors. This is the reciprocal of the elements of the subdiagonal of the new transition matrix.

Thus for n=4, we have $\frac{1}{1} + \frac{1}{ \frac{1}{2} } + \frac{1}{ \frac{1}{6} } = 1 + 2 + 6 = 9$ Finally we have math simple enough to be a HUMAN-facing puzzle.

So for the general form we have the expectation value of:

$\sum_{k=2}^{n} \frac{ { n \choose 2 } }{ { k \choose 2} } = \sum_{k=2}^{n} \frac{n^2 -n }{k^2 - k } = (n^2-n) \sum_{k=2}^n \frac{1 }{k^2 - k } = (n-1)^2$

Last edited: May 5, 2017
danshawen likes this.
11. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Excellent blog post by Laurent Lessard on this: http://www.laurentlessard.com/bookproofs/colorful-balls-puzzle/
A PDF by Tim Black: http://math.uchicago.edu/~timblack/blog/Tim_Black_Riddler_April_28,_2017.pdf
A numeric evaluation of iterated transition matrices in MatLab by Taylor Malone: https://drive.google.com/file/d/0B2hVezMBHHMVM2tPandCUmRYN3c/view
Victor Bible calculated the distribution of terms for the n=4 case: https://vbible.github.io/2017/04/30/2017-04-30-the-riddler-painting-balls/
The probability (based on Victor Bible's formula) the game ends on term m is $P_m = 2^{-m} \left( \left( \frac{3}{5} \right)^{2-m} - 1 \right) ; \quad m \geq 3$
This is the same distribution one gets from $\begin{pmatrix} 0&0&0&1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 &0 \\ 1 & \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{2} & \frac{5}{6} & 0 \\ 0 & 0 & \frac{1}{6} & 0 \end{pmatrix}^m \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$ where the lower right element is 0 to capture only transitions to the final state, not a sum of all such transitions.

Code:
nColors=4;
maimedTransitions = DiagonalMatrix[1- Table[Binomial[nColors + Boole[k!=nColors]*(1 - k),2]/Binomial[nColors,2],{k,1,nColors}]] + DiagonalMatrix[Table[Binomial[nColors + 1- k,2]/Binomial[nColors,2],{k,1,nColors-1}], -1, nColors] ;
FullSimplify[MatrixPower[maimedTransitions, m][[nColors, 1]], Assumptions->m>= nColors-1 ] // InputForm

Open question: Is there a nice form for the distribution for n balls?

Last edited: May 5, 2017
danshawen likes this.
12. ### danshawenValued Senior Member

Messages:
3,950
https://math.stackexchange.com/ques...s-in-a-box-how-many-times-do-i-need-to-pick-s

This is known as the collector's problem.

"The average number of tries needed to see each of the 4 colors is:

4/4 + 4/3 + 4/2 + 4/1 = 8-1/3

The terms <of the solution> are:

4/4 for the time to take one ball; then
4/3 for the average time it takes after the first ball until you see another code;
4/2 for the average time it takes after the first time you see the second color until you see a third, and finally
4/1 for the time you then have to wait until you see the last color.

13. ### rpennerFully WiredValued Senior Member

Messages:
4,833
That's not actually the same problem. Here the balls are actually changing colors and nothing prevents any of the remaining colors from overtaking the current leader color.

So the coupon collector's problem on n items has an expectation of $E_n = n \sum_{k=1}^{n} \frac{1}{n} = n H_n \approx \gamma n + n \ln n + \frac{1}{2} - \frac{1}{12 n} + \dots$
While Tim Black's painting puzzle has an expectation of $E_n = (n-1)^2$ which is strictly larger for n >= 4.

While ultimately linearity of expectations applies, the problem is seeing how to format the problem so that it applies.

https://en.wikipedia.org/wiki/Expected_value#Linearity

danshawen likes this.
14. ### danshawenValued Senior Member

Messages:
3,950
Kind of interesting that changing the colors only adds an average 2/3 of an extra turn to complete the puzzle.

15. ### danshawenValued Senior Member

Messages:
3,950
On the first draw, you might as well go ahead and take two colored balls out of the box (because you are going need to do that anyway). Whichever color you choose to paint the unmatched color on the first draw is therefore arbitary, and does not affect the outcome.

That being the case, I have just proposed a procedure for subracting one move from your answer of the average number of moves necessary to a solution, which is to say, the correct answer for an optimized solution would be 8, not 9. For the optimized procedure, En = (n-1)^2 - 1, which is to say, it is ALWAYS possible to shave one move off of the average number of moves calculated.

Overthinking? This is probably why the original collector's problem is really a better defined one.

16. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Wrong. It's not a problem about how to paint the balls the same color. It's a question about how long a given description of the process is expected to take.
That's both the definition and the question and you aren't allowed to change them.

danshawen likes this.
17. ### danshawenValued Senior Member

Messages:
3,950
Selecting balls is a distinctly different process from painting them, numbering them, or otherwise changing their designations, not to mention comparing them to whatever has already been selected, the only part of the original problem which remains intact.

It makes the answer you gave ambiguous, because you can't compare anything until you have selected a second ball, no matter which turn you are on.

What if you wished to parse out the different processes (not count them all as an integer increment of the same process as a simple select/compare)? I'm looking to simplify thinking about the problem to the extent it may be possible?

Last edited: May 6, 2017
18. ### danshawenValued Senior Member

Messages:
3,950
Got it.

Because painting the balls effectively decomposes the problem into operations of lower order, it breaks down the procedure like this:

En = (n-1) ^2 for n=3, the first time a ball is painted, = 4,
En = (n-1) ^2 for n=2, the next time a ball is painted, = 1,

En = (n-1) ^2 for n= 3, = 4,
the number of times comparisons of painted balls were made with balls that were of the same color (what I have already said mangles the problem so that additivity of expectations cannot be used).

4 + 1 + 4 = 9

What a cool problem! Thank you yet again, rpenner!
You will note, it even has symmetry.

Last edited: May 7, 2017
19. ### rpennerFully WiredValued Senior Member

Messages:
4,833
Not quite.
The essential thing different with n>2 is that it is possible that the lead color changes. So as the number of colors goes down, so does the chances of picking a ball in the minority as the second pick.

This is a puzzle which at the heart of it has the gradual extinction of paint colors as more and more repainting happen.

So, the most obvious thing is that the time (T) to go from n colors on n balls to n-1 colors has expectation value, ET =1.

For n>2, the time to go from n-1 colors on n balls to n-2 colors has ET = n/(n-2).
That's because the chance of first picking a majority ball and then a minority ball is (2/n)((n-2)/(n-1))
While the chance of picking two minority balls is ( (n-2)/n)((n-3)/(n-1))
This give the chance of the number of the colors going down is (n-2)/n from which we get the expected time.

In general, the time to go from k colors on n balls to k-1 colors has ET = (n^2 -n)/((n+2-k)(n+1-k)) for k >= 2 (subject to the simplifying assumption we are using the natural distribution of states with k colors)

The n=4 case is interesting because the k=2 colors case breaks down into the (3+1) state and the (2+2) state, and the first is twice as likely as the second, but only the (3,1) state is capable of transitioning to the k=1 state.
Starting from the previous (2+ 1+ 1) state we see there are 4 ways to initialize the (3+1) state and 2 ways to initialize the (2+2) state.
Starting from a mix of (12x) of the (3+1) state and (6x) of the (2+2) state we see that the next step has (10x) of the (3+1) state and (5x) of the (2+2) state.
So the chance of transitioning is (2/3) times (3/4)(1/3) = 6/36 = 1/6 so the ET to go from 2 colors to 1 is 6.

So the total time to go from n colors n balls to 1 color is the sum of these terms (linearity of expectation value).

For n= 2, ET = 1 because the first painting step is the last.
For n=3, ET = 1 + 3 because the second painting only diminishes the number of colors if the first ball is the majority color and the second is the minority color.
For n=4, ET = 1 + 2 + 6 = 9
For n=5, ET = 1 + 5/3 + 10/3 + 10 = 16
For n=6, ET = 1 + 3/2 + 5/2 + 5 + 15 = 25

Last edited: May 7, 2017
danshawen likes this.

Messages:
3,950
Excellent.