Posted by Mathemagical in the poker forum:
I've been reading through some old threads
on the subject of skewing in Full Tilt Poker, and I saw a disturbing
mix of solid information and misinformation about random number
generation and statistics, so I thought I'd take a moment to address
some of the issues raised and correct a few errors.
Why should you listen to me? I have a degree in Pure Mathematics
and Computer Science, and one of my primary research interests is
number theory, so I know (more or less) what I'm talking about.
I'll start out talking about random number generation in general,
and then move on to how it applies to card shuffling algorithms.
1. What is a random number?
When we talk about generating 'random numbers' in a computer, we
generally want to generate very large (say, 32- or 64-bit) random
numbers. However, from a theoretical standpoint the problem is no
different from randomly generating a 0 or 1 -- i.e., a coinflip -- so
we'll talk about unbiased coinflips instead of numbers in this section,
just until we get a grip of exactly what it is we mean when we talk
about 'random numbers.'
What is an 'random coinflip generator,' then? Think of it as a
black box (let's call it HAL) that sits idly in the corner of your room
until you ask it to perform a coinflip. Then it replies with either
'HEADS' or 'TAILS' and nothing else. HAL will give you a response each
and every time you query it, until the end of the time. However, here's
the kicker: without opening HAL and looking at its internal states, you cannot predict what its next answer will be based on its previous answers.
You may know how it is programmed, and what its thought process is WHEN
it is in a certain state (i.e., you may know its algorithm) but you do
not know what all of its internal 'variables' are set to at any point
in time.
This is a tough criteria to impose on our machine. In reality, we
don't have any true random coinflip generators, because given enough
processing time we CAN predict what result will be generated. However,
it would take roughly the lifespan of the universe to break our best
random coinflip generators, so we still say its results are
"impossible" to predict without knowing its internal states.
2. Can HAL output "HEADS" ten times in a row? A thousand?
Yes! Think about what would happen if it were impossible for HAL to
answer "HEADS" ten times in a row. Then if it ever answered "HEADS"
nine times in a row (which is bound to happen even with a physical
coin, eventually) then you would know
that its next answer has to be "TAILS", which means you've predicted
HAL's next answer without looking at its internal states. This is a
no-no, if HAL really is a random coinflip generator.
3. After a thousand answers, is HAL guaranteed to have given 'close to' 500 TAILS replies and 'close to' 500 HEADS replies?
No. Again, see above. If we knew that HAL would output roughly 500
TAILS and 500 HEADS, perhaps with some allowed error, then again we'd
be able to predict to some certainty what its outputs would be as we
get close to its thousandth answer.
(Here's the rub: if, over a large enough sample size, we
CONSISTENTLY observe that a machine is answering TAILS 51% of the time
and HEADS 49% of the time, then we can become reasonably certain that
this machine is not truly a random coinflip generator.)
4. Is Full Tilt Poker's random number generation biased?
According to the email quoted in BBB's thread, Full Tilt Poker uses
three, independent random number generators and combines their results
to create a 32-bit random number.
Using XOR to combine the three numbers simply ensures that an
attacker (or bored mathematician) would need to break all three
generators in order to predict the next number to pop out of the
machine. There's not much more to be said about this.
So, what about the three methods themselves?
i) ISAAC
The ISAAC method is very fast, well-known, and relatively well-researched RNG.
It is known to be imperfect: if you initialize ISAAC with certain
starting states, it will provide biased outputs. However, this flaw has
been known for several years and it is quite rare to begin with, so I'm
willing to bet that it is taken into consideration.
ii) OpenSSL
OpenSSL has been around for over a decade and is incredibly
well-researched. Used properly, it is incredibly secure, and its random
number generators are the basis for much of your privacy online. In
short, literally thousands of mathematicians and computer scientists
have an interest in its reliability, and any significant flaws have
gone over all of their heads.
If there's a weakness in Full-Tilt's RNG, it doesn't lie here.
iii) Hardware RNG
Implemented properly, the output from one of these things is truly random. Wikipedia has a decent article on how they work. I can't say much else about this, since Full-Tilt didn't disclose exactly what kind of RNG they use.
The beauty of XORing the three results together is that ALL THREE
methods need to be biased in order for a bias to show in the result.
The chances of this seem slim.
5. But how do we KNOW?
Here's the thing: a biased RNG is easy to detect.
There are several algorithms for testing RNGs and determining, to any
desired degree of accuracy, whether or not their outputs are unbiased.
These tests aren't arbitrary. They work and are mathematically sound.
The question is whether Full-Tilt tests its RNG results, and to what accuracy. That's obviously something I don't know.
6. Is Full Tilt Poker's card shuffling skewed?
Okay... let's say we have a reliable, unbiased RNG and wish to use it for shuffling a virtual deck of cards.
There are two, classic algorithms for doing so, both of which
produce a random shuffling as long as the underlying RNG is unbiased:
i) The simplest way is to assign every card in the deck a random
number, and sort the cards based on their assigned numbers. This method
is simple, but not particularly efficient, as sorting lists of numbers
isn't particularly quick.
ii) A more efficient method is to move through the deck one card at
a time, and swap each card with a random card from the remainder of the
deck.
The problem with the second method is that it is very easy to screw
up by an amateur programmer. It requires a fair bit of care to avoid
errors that make the shuffling biased. However, if you're a
multi-million dollar company and can afford to hire competent
programmers, you'll probably do so. But who knows.
If Full-Tilt's card shuffling is indeed skewed (which I think is unlikely), I wouldn't be surprised if the error is here.
However, it is at best unclear and at worst impossible (I suspect the
latter) for such a skewing to consistently give underdogs an edge in
poker games, so it seems irrelevant to the discussion.
7. Are decks skewed to fit 'typical' distributions?
JP says the following:
| Quote: |
| For
algorithms to run, it's mathematical, poker deal is "fixed" based on the
program to deal specific "typical" statistical averages over time. The
program knows what it's dealt and what should be dealt. How it evolves
from that I don't have a clue. Perhaps the program picked out that its
deal was missing some statistic point, switched to maintain "XY" cards
in "X" hands dealt. Once point reached, it went to it's normal
"routine". |
This should ABSOLUTELY NEVER BE DONE and NEVER would be in a setting
like this. EVER. If any one mathematician of the dozens surely employed
by Full-Tilt did this, they would be fired on the spot. Skewing random
results to make them look more 'typical' is incredibly dangerous to any
random number generator, as it actually GIVES AWAY information about
the results. (See questions 2 and 3.)
There is a story in Cryptonomicon by Neal Stephenson about how
during World War II, middle-aged women in top secret facilities would
pick out lotto balls from a chamber, day-in-day-out, in order to
generate the random keys the Allies to use to encrypt their
transmissions. Often, the ladies would notice what seemed to be
anomalies: they'd pick out a Q or an X 'more often than they should'
and would put back 'unlikely' letters like these that seemed to appear
too often.
The Germans realized this, and were able to crack the Allies'
transmissions, using the knowledge that certain letters in the
encryption keys would always appear at a certain frequency.
8. What possibility are we left with?
The only remaining possibility is that Full-Tilt deals fixed decks on purpose, and reshuffles them during a hand,
to let the n00bs suck out on you. But why would they? Why risk such
massive lawsuits to skew the game in certain players' favour? If you
have a theory, feel free to let me know. I can't think of one.
8. Any conclusions?
Not really. I just hope this was informative to someone, and helped to clear up some misconceptions about this subject.