Again, for 20 flips, there are 17 sets of 4 consecutive flips (1-4, 2-5,...,17-20), each of which is dependent on the set before it.

That leaves 1 set that is independent(1-4), 1 set that is dependent on one other set(2-5), and 15 sets that are dependent on two other sets.

In order for the first 2 sets to be different, the first 5 flips can have 13 of the 16 possible configurations (same is S, D is different)

SSDS

SSDD

SDSS

SDSD

SDDS

SDDD

DSSD

DSDS

DSDD

DDSS

DDSD

DDDS

DDDD

of these combinations, only the last 2 flips influence the next groups, thus there are four possible endings:

SS = 2/16 with no set the same, 2/16 with at least one set the same

SD = 3/16 with no set the same, 1/16 with at least one set the same

DS = 4/16 with no set the same

DD = 4/16 with no set the same

This tells us there are six unique groups of first 5 flips:

1) first two sets are different, ending with SS = 2/16

2) first two sets are different, ending with SD = 3/16

3) first two sets are different, ending with DS = 4/16

4) first two sets are different, ending with DD = 4/16

5) at least one of first two sets are same, ending with SS = 2/16

6) at least one of first two sets are same, ending with SD = 1/16

by choosing between these 4 endings, we can see what probabilities develop in the remaining 15 flips.

EDIT: continuing the last thought,

in each of the remaining 15 flips, we operate off one of 4 endings:

SS:

A) next flip is the same, resulting in a set that is the same, with the ending SS = 1/2

B) next flip is different, resulting in a set that is different, with the ending SD = 1/2

SD:

A) next flip is the same, resulting in a set that is the different, with the ending DS = 1/2

B) next flip is different, resulting in a set that is different, with the ending DD = 1/2

DS:

A) next flip is the same, resulting in a set that is the different, with the ending SS = 1/2

B) next flip is different, resulting in a set that is different, with the ending SD = 1/2

DD:

A) next flip is the same, resulting in a set that is the different, with the ending DS = 1/2

B) next flip is different, resulting in a set that is different, with the ending DD = 1/2

With a 50% chance of A and B happening for each scenario.

So what we have now is a tree with six top level nodes where every node has exactly two child nodes with each node having a 50% probability (A and B for each ending). There are 15 sets of two child nodes, so for every level down the tree, the probability decreases by a factor of 1/2, until at the bottom of the tree, the total factor is (1/2)^15. This tree is used to solve for the probability for the condition that at least 4 flips are the same (SSS, or an SS node that is a child of another SS node).

Let's first revisit the 7 flips scenario, where the tree has 2 layers below the top level, looking something like this (XX) is a node without the condition, {XX} is a node with the condition:

(SS) (SD) (DS) (DD) {SS} {SD}

{SS} (SD) (DS) (DD) (SS) (SD) (DS) (DD) {SS} {SD} {DS} {DD}

{SS} {SD} (DS) (DD) (SS) (SD) (DS) (DD) {SS} (SD) (DS) (DD) (SS) (SD) (DS) (DD) {SS} {SD} {DS} {DD} {SS} {SD} {DS} {DD}

There are some definitive patterns, but the probabilities depend on how many layers are below the top level. For example, let's follow the (SS) node down 4 layers instead of 2:

(SS)

{SS} (SD)

{SS} {SD} (DS) (DD)

{SS} {SD} {DS} {DD} (SS) (SD) (DS) (DD)

{SS} {SD} {DS} {DD} {SS} {SD} {DS} {DD} {SS} (SD) (DS) (DD) (SS) (SD) (DS) (DD)

at row 1, 1/2 of the nodes have met the condition.

at row 2, 2/4 of the nodes have met the condition.

at row 3, 4/8 of the nodes have met the condition.

at row 4, 9/16 of the nodes have met the condition.

how many nodes will have met the condition at row 5? The answer should be 19/32.

how many nodes will have met the condition at row 6? The answer should be 40/64.

This is looking like a matrix problem, I just have to find a way to structure it correctly. Hopefully I figure it out tomorrow!

To represent this problem, a 5x5 matrix will be constructed for the dependency of each of the endings and the condition, such that multiplying this matrix by a vector containing the current amount of each type of node will give the amount of each type of node at the next level.

Here is the format:

Matrix A Vector x

0 0 1 0 0 ^ n (SS)

1 0 1 0 0 (SD)

0 1 0 1 0 * (DS)

0 1 0 1 0 (DD)

1 0 0 0 2 {XX}

now let's check for the (SS) case worked out above, where x = [1 0 0 0 0]':

row 1 = A*x = [0 1 0 0 1]'

row 2 = A*A*x = A^2*x = [0 0 1 1 2]'

row 15 = A^15*x = [927 1431 1705 1705 27000]'

thus in row 15, 27000/32768, or 84.20% of the nodes meet the condition.

So now this powerful matrix has been discovered, the problem can be approached slightly differently:

After the first two coin flips, we end up with 4 top level nodes of equal probability, one for each of the 4 endings, and the condition hasn't been met at any of them. That leaves 17 more nodes. Using the matrix with n=17 and x = [1 1 1 1 0]':

A^17*x = [19513 30122 35890 35890 402873]

Thus at the last row, 402873/524288, or **76.84%** of the nodes meet the condition.

Just for fun, to see what the probability is after 50 flips, we would use n=47 and the same x as before:

A&47*x = 10^14*[0.0170 0.0262 0.0312 0.0312 5.5239]

meaning that the probability of the condition being met after 50 flips is 98.12%