> In spite of the diversity of strategies one can design, it is important to remark that the heads-up limit Texas hold'em variation has been claimed to be "essentially weakly solved" in January 2015 by the Cepheus poker-playing bot.[1] This means that on average the program is so good that a human would have no chance of ever edging ahead of it, even if the two played 60 million hands.[2] The bot can be played online at poker.srv.ualberta.ca, and users can even query strategies from the software
I watched that, and I think his understanding of what a "statistical tie" is is extremely dodgy (https://youtu.be/gz9FJfe2YGE?t=10m43s). What's up with that?
Edit: he does make a good point later on that playing thousands of games is exhausting, which would affect the reliability of the comparison between the AI and the pros. I don't know anything about poker, but it sounds plausible. Presumably this means they should make the AI so good that it can demonstrate its greatness quicker than a human gets tired.
Chalk it up to poker being something you can be skilled at without really having much mathematical understanding, and throw in a little "common sense" (i.e. anti-intellectualism), and you get comments like that.
The charitable interpretation of his argument is that the outcome is only a statistical 'tie' if you accept the arbitrary significance threshold of p=5%. Even relatively weak evidence is still evidence.
Unfortunately, some of the examples which he claims don't pass this significance threshold actually do. This does not help his credibility.
FWIW, the win rate was -91mbb/g, and so they reported the probability of being equal in skill somewhere between 5% and 10%, hence the "tie". The introduction to the other poker paper on HN now (https://arxiv.org/pdf/1701.01724v1.pdf), by a different team, from Alberta, is rather less gracious and says it was a "huge margin of victory". I think that whatever the interpretation, his video makes it clear he didn't understand any of this.
Significance works the other way round than you described it -- if two players were equal in skill, then a winrate of -91mbb/g or larger would happen 5-10% of the time.
I don't know if Doug Polk understands that or not, but I agree with his criticism and I think his analogy with sports reporting is sound. While "statistical tie" is true in a technical sense, it's not usually how matches are reported, and it's somewhat disingenuous and self-serving to use that language. It would be more honest to say that it's very unlikely that the bot is better than humans -- given the advantages the bot had (a gruelling 2-week schedule, and pressure on the humans to get through N hands per day) it still lost by a significant margin.
You're right, I would have liked them to report posterior probabilities, but those were p-values instead.
> It would be more honest to say that it's very unlikely that the bot is better than humans
I don't think the analogy is sound. Consider the alternative: over a long match, the humans failed to beat the bot convincingly. But in any case, the examples he gives are so wide of the mark that it's clear this isn't the sort of argument he's making—he talks about "spin" and stuff.
The notion of statistical significance as applied to research papers doesn't map well to the poker domain. He keeps repeating that they won 9 big blinds per hundred hands over an 80k sample because anyone familiar with poker at all knows that's an absolute spanking.
Does that mean no-limit poker is volatile enough that results considered by insiders to be long-term don't achieve 95% confidence? Idk probably, I haven't looked at the math myself. Is 73 buy-ins over 80k hands a huge win? No doubt.
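The question above can be turned into arithmetic without looking up any match data. A rough sketch, assuming hands are independent and the total result is approximately normal: it computes how large the per-100-hands standard deviation could be before a 9 bb/100 win over 80k hands stops clearing the usual two-sided 5% threshold. The function name and the 1.96 cutoff are illustrative choices, not anything taken from the match report.

```python
import math

def max_sd_for_significance(winrate_bb_per_100, hands, z_crit=1.96):
    """Largest standard deviation (bb per 100 hands) at which the observed
    win rate still reaches z_crit, under a normal approximation with
    independent hands (an assumption, not a property of the real match)."""
    blocks = hands / 100  # number of 100-hand blocks played
    return winrate_bb_per_100 * math.sqrt(blocks) / z_crit

# Figures quoted in the comment: 9 bb/100 over an 80k-hand sample.
threshold = max_sd_for_significance(9, 80_000)
print(f"significant at 5% only if sd < {threshold:.1f} bb/100")
```

Under these assumptions the cutoff comes out near 130 bb per 100 hands, so whether the win counts as "significant" depends entirely on how swingy the format actually is.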
Personally, I think it would be much more interesting to see an AI compete in a multi-player game with, say, ten human pros at the table. Not just heads-up.
There's so much variance in the game that it would be hard to know how good the AI really is. If they're doing 120K hands heads-up, they'd need to play far more in a full table. Getting that many pros to play that many hands would be difficult.
And what's wrong with heads-up, anyway? It's not as popular as multiway, but it's a fascinating game. In a full table game, you usually only end up playing 20-25% of your hands. Heads up you play 90%+. You end up in leveling wars with your opponent more often, too ("he knows that I know that he knows I usually wouldn't play a premium pair this way").
Maybe you are right, but in Poker championships you see many of the same names in the top from one year to the next, right? (I could be wrong about this)
That could suggest the variance is manageable.
I'd like to see more players because we have seen so many two-player AI contests before (chess, go). It would for example be interesting to see MCTS in this setting.
The variance is manageable if you play enough hands. This is one reason why a lot of pros prefer to play online - by playing many tables at once, you get more hands in and can better overcome the variance.
As for tournaments, the variance is even greater, since it is closer to all or nothing. To mitigate this, pros "buy a piece" of each other. Or swap equity. If either wins they pay the other a percentage. If you do this with enough other pros, you even out your swings...
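The smoothing effect of swapping pieces can be sketched with a little variance arithmetic. This is a simplified model, assuming every player's result is independent with the same variance and that everyone keeps the same fraction of their own action; the function and parameter names are made up for illustration.

```python
def variance_after_swap(own_variance, kept_fraction, n_players):
    """Variance of one player's result when n_players swap equal pieces.

    Each player keeps kept_fraction of their own result and receives an
    equal slice of every other player's result. Assumes independent
    results with identical variance (a simplification).
    """
    if n_players < 2:
        raise ValueError("need at least two players to swap")
    others_share = (1 - kept_fraction) / (n_players - 1)
    return own_variance * (kept_fraction**2 + (n_players - 1) * others_share**2)

# No swap: variance unchanged. Five players splitting evenly: one fifth.
print(variance_after_swap(1.0, 1.0, 5))
print(variance_after_swap(1.0, 0.2, 5))
```

In this model the expected value is unchanged as long as the players are equally skilled; only the swings shrink.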
I don't know if you've actually played poker, but heads up you still probably play only 20-30% of your hands (unless it's a tournament at least). A lot of full table games you can play less than 10% of your hands (just my estimation. A pro can probably tell you better numbers).
If each player follows your strategy, 40-60% of hands are won by nobody.
Playing 10% of your hands is very tight if you're up against other good players. At a loose table it might be very profitable though, especially if the other players aren't paying attention to how tight you are, and therefore don't give you enough credit for having a strong hand when you do play. That kind of table is pretty common at low limits.
LOL if it can figure out how to communicate, negotiate a share of the reward money, and pay off the other players ... but should the other players trust it to keep up its end of the deal? On one hand it's a prisoner's dilemma, on the other hand what's the computer going to do with the money, throw a party in Vegas?
All you'd discover is that poker is a very very volatile game and the number of tournaments it would have to play in to prove anything would be beyond time available.
It's worth noting that most poker pros will constantly re-buy in early rounds in big tournaments. It's a general recognition that it takes money, not skill to survive. Being at the later tables helps them maintain their fame, which is really their business.
Of course, if you're playing against amateurs or drunks, you can always win. But usually at the real tournaments they are few and far between.
> It's a general recognition that it takes money, not skill to survive.
This is complete bullshit. Rebuys in the first round only help to insulate you from the chaos of the first round, where there tend to be a lot of wild players who gamble. Tournaments go on for 20, 30, 40+ rounds. The fact that people have won multiple championships and that many people continue to reach the final tables is evidence against the notion that it's just about money.
> The fact that people have won multiple championships and that many people continue to reach the final tables is evidence against the notion that it's just about money.
In modern poker (not the old days where few were playing), the max bracelets anyone has won is 5. Assuming that past a basic threshold of skill it's a 50/50 crapshoot, there's a 1 in 32 chance you get that on luck. There are thousands of people above the basic threshold.
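The arithmetic behind that claim is easy to check. A minimal sketch that takes the comment's own coin-flip premise at face value (it is the commenter's simplification, not a realistic tournament model); the field size of 2,000 is a hypothetical stand-in for "thousands".

```python
def chance_of_streak(wins, p_win=0.5):
    """Probability of winning `wins` independent 50/50 trials in a row."""
    return p_win ** wins

def expected_players_with_streak(n_players, wins, p_win=0.5):
    """Expected number of players who hit the streak purely by luck."""
    return n_players * chance_of_streak(wins, p_win)

print(chance_of_streak(5))                    # 1 in 32
print(expected_players_with_streak(2000, 5))  # 2000 is a hypothetical field size
```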
"Professional" poker really is about tourists and drinkers. The grind is too slow, painful, and volatile. I know a couple of people who support themselves on only poker. That's their strategy year round between tournies. The reason they play the big tournies is to get lucky, and parlay their fame into sponsorships and such.
You're mixing your analysis of tournament and cash games. There is no rake in a tournament, at least not in the same fashion. Also, I'm not aware of a tournament that allows rebuys or top-ups that give a stack larger than the initial stack. I don't really play tournaments, though. Cash games only.
I play often enough. If the table is too tight, I go loose and take all the blinds. Sure, beating the rake can be tough, but it's not impossible.
> All you'd discover is that poker is a very very volatile game and the number of tournaments it would have to play in to prove anything would be beyond time available.
A pro in an adjacent seat once remarked "There's so much luck in tournament play you'd need a dozen lifetimes to know if you're actually beating them."
Oh, so playing online with access to statistical tools and analysis, so really the computer advantage for knowing the odds is completely nil. It's really only then competing in the strategy element... very interesting indeed.
I'd consider it unlikely (especially with two-tabling, and that poker players tend to study game theory and understanding of odds offline, not "live" or during gameplay itself) that the players will be referencing or accessing any tools during play.
It's just that it's becoming more and more in the toolbox of poker players to gain understanding of Nash equilibrium (conceptually and practically) and other important concepts that they fold into their strategy (based on meta-game, familiarity with the style and strategies of opponents, etc.)
More sophisticated players are constantly calculating real and implied odds at all times, without relying on a tool to do so (often there is fuzzing or approximation involved, but it's typically "close enough"), but especially in heads up play it's the more esoteric parts of strategy that are more important. Bluffing, slowplays/trapping, reads and reaction (not always physical as in tells but often habits).
Poker HUDs are very common online that give stats for every player (rate they have been calling at, times raised, pots won) and put it next to their icon so it can be easily used on multiple tables. (For example: https://pokercopilot.com/)
Yes, but I would place HUDs as somewhat more of a facile level of meaningful analysis - they're often used as a baseline for people who are multitabling and unable to watch a broad range of hands for patterns and habits. I've never relied on a HUD for the full decision-making in a single, particular hand. It's meta-information that is not actionable purely on its own (except maybe in the most ideal set of circumstances, like weeding out a good table vs. a bad one).
Calculating odds against hand ranges is too complex to be done by a human in real time while playing, even using calculators made for the purpose. Computers definitely have an advantage there. Assigning hand ranges in the first place is the trickier part for an AI, as it is more rooted in psychology.
This is a great way of putting it, your last part especially - assigning hand ranges is one of the things many players would probably attribute to gut or feel. Some implicit decision making ability that causes them to prune away vast parts of the possible tree and then make quick decisions on the remaining ones. Often based on knowing opponent history, observing patterns, etc.
Agreed. I would say one difference in poker is that strategy can be (and is, constantly) adapted over the life of a competition (and over the span of several competitions). Tempo, adjustments, stack and blind management/inflection points, even opponent avoidance in certain circumstances - I expect some of those to be things that humans may hold an edge to some degree. Those factor in a lot differently in other types of games (tournament play especially). I don't understand AI well enough to know what would be the hardest possible game for AI to conquer.
Imagine a human player playing solo against a table full of optimally designed AIs that compute perfect statistical and economic odds based on a huge library of prior games, including human games. What you are calling "advantages" for the human are no advantage at all. Now add a second human at that table: the advantages you cite work only against that other player.
In terms of initial hand evaluation based on "gut", this is similar to chess position evaluation: a key part of master and computer play, a part that's been hard to get right, a part that brute-force depth helps to get right, and a part that is now completely solved for computers in play against humans. There is nothing particularly challenging about poker that stands in the way here.
It really isn't that hard in Texas Hold'em most of the time. You are usually only calculating for a few hands, and can memorize odds in advance (since while the combinations of cards is gigantic, the number of actual categories of combinations isn't that many)
That's wrong. Figuring out your true odds against multiple players who can hold an unevenly distributed range of hands requires a Bayesian calculation that takes at minimum a few minutes for a human to perform, even using the best software tools for the purpose.
Strong players develop an intuition for this, making estimates on the spot, but they are frequently wrong and can't be very precise.
Just as the Naive Bayes algorithm is remarkably effective, a human approximation is often pretty good. Good enough that the strategy of the AI will be more important than its combinatorics.
Sibs have alluded to this but it bears repeating: it's not a matter of simply calculating your "odds". One of the first things a player learns when getting serious about poker is how to calculate their odds of making a given hand. It becomes second nature, and even being able to work with exact floating-point values rather than rounded approximations would not give the AI a significant advantage.
As one sib points out, calculating your odds of winning against a range of hands is significantly more difficult, but again raw computation would not give a significant advantage here as assigning an opponent an accurate range of hands without significant historical data on their play is more or less impossible.
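For contrast, the "easy" calculation mentioned above (the odds of making a given hand) fits in a few lines. A sketch for Texas hold'em on the flop, assuming you know only your two hole cards and the three board cards, so 47 cards are unseen; the function name is illustrative.

```python
from math import comb

def chance_to_hit_by_river(outs, unseen_cards=47):
    """Probability that at least one of `outs` cards arrives on the turn
    or river, given `unseen_cards` cards you have not seen (47 on the flop)."""
    miss_both = comb(unseen_cards - outs, 2) / comb(unseen_cards, 2)
    return 1 - miss_both

# A flush draw on the flop has 9 outs.
print(f"{chance_to_hit_by_river(9):.3f}")
```

This is the part players memorize. Equity against an opponent's whole range needs the same kind of counting repeated over every hand in the range, weighted by how likely each one is, which is where the difficulty comes from.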
Any online poker pro will tell you that 120,000 hands is nothing. You can win over the first 120,000 hands and lose over the next 120,000 hands or break even. This is because the margins are low and the variance is very high. Someone winning 2 big blinds per 100 hands (with a standard deviation of 80 bb per 100 hands) will win between 7901bb and -3141bb 95% of the time.
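That interval can be reproduced with a normal approximation. A rough sketch, assuming the 80 bb figure is a standard deviation per 100 hands and that 100-hand blocks are independent; the function name is illustrative.

```python
import math

def winnings_interval(winrate_bb_per_100, sd_bb_per_100, hands, z=1.96):
    """Approximate central interval for total winnings in big blinds,
    treating each 100-hand block as an independent draw."""
    blocks = hands / 100
    mean = winrate_bb_per_100 * blocks
    half_width = z * sd_bb_per_100 * math.sqrt(blocks)
    return mean - half_width, mean + half_width

low, high = winnings_interval(2, 80, 120_000)
print(round(low), round(high))
```

With z = 1.96 this gives roughly -3,030bb to 7,830bb around an expected 2,400bb, close to the figures quoted above. Either way, a solid winner can easily be down after 120,000 hands.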
Also, it appears they are playing a tournament, which increases the variance significantly. I actually think tournament play is less suited to measuring AI performance than a cash game, because you're basically playing a series of different games as the tournament progresses that may not be generalizable because the decision math changes based on blinds and stack size unlike a cash game.
> To ensure that the outcome of the competition is not due to luck, the four pros will be paired to play duplicate matches — Player A in each pair will receive the same cards as the computer receives against Player B, and vice versa. One of the players in each of these pairs will play on the floor of the casino, while his counterpart will be isolated in a separate room.
This technique won't work for the same reason mentioned above. Hands in tournament poker are necessarily not independent events.
I use the second hand on my watch to help with mixed strategies; sometimes using it to inform size of raise, or the direction of raise/fold, raise/call, and (rarely) call/fold decisions.
As such, if you happen to notice me look at my watch, you can deduce I might be raising, or calling, or folding, with a strong, weak, or medium hand that is already made, or on the come; or that I am interested in the time for a reason such as wondering how long i've taken, how long i've been at the table, if i need to leave, if a new dealer might sit and collect time soon, or something else.
I get that you're a pedant who wants to be right; but it's not a remotely useful tell unless the person does it only in one type of situation. but nobody who is concerned about how well they mix a mixed strategy is dumb enough to mix only _one_ strategy.
edit: and i'm not a pro. i play in mid-stake games with $1-5k buy-ins.
The point was that if you're going to look at the clock and use the second hand to help you make decisions about whether or not to bluff, look at it at other times too when you're not making a decision, so that looking at the clock doesn't become a tell.
You can compute perfect strategies for much simpler variants of Hold’em, e.g. Leduc Hold’em. Then you see that bluffing is absolutely required for optimal play.
Bluffing seems like a human tactic but in fact is completely mathematically justified and is absolutely required for optimal play. A bot that bluffs will destroy a bot that doesn't.
I'm guessing they're training the AI independently for each player. With that in mind, I wonder how well it would perform when matched against a different player. Knowing your opponent's playing style is one of the main things you have to master in poker. This is especially difficult because playing style will vary as the game goes on, sometimes solely by the player's will (to trick you) and not because of any information derived from the game. I can see a bot performing reasonably well on this task for a specific player despite all the difficulties, but for the general case it seems like a huge challenge (as it is for humans).
If it's just looking for equilibrium, it won't make any money. It also does live analysis of end-game situations, so I'm guessing it'll make use of U of Alberta's research into deviating from equilibrium to exploit leaks in the opponent's play.
It will if the human is not playing perfectly. Which of course they aren't because they are human. The question is if the computer is playing close enough to equilibrium/optimal compared to the humans.
That assumes that the Nash equilibrium is optimal. It isn't. Optimal play is to deviate from Nash equilibrium to take the most exploitative strategy against specific opponent weaknesses. This necessarily creates weakness in your own play. Optimal play sends misleading signals to the opponent. Basically, you signal "rock" so that the opponent plays "paper" but you actually play "scissors".
(In this metaphor, neither "rock" nor "paper" nor "scissors" are equilibrium strategies.)
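The trade-off in the rock-paper-scissors metaphor can be made concrete. A small sketch, using the standard win/lose/draw payoffs of +1/-1/0: it measures how much the best pure counter-strategy earns against a given mix. The uniform mix (the equilibrium) gives up nothing, while any lopsided mix, including one adopted to exploit someone else, can itself be exploited.

```python
# Payoff to the row player; rows and columns are rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def best_response_value(opponent_mix):
    """Highest expected payoff any pure strategy earns against opponent_mix."""
    return max(
        sum(PAYOFF[move][opp] * p for opp, p in enumerate(opponent_mix))
        for move in range(3)
    )

uniform = [1 / 3, 1 / 3, 1 / 3]
rock_heavy = [0.5, 0.25, 0.25]

print(best_response_value(uniform))     # nothing to exploit
print(best_response_value(rock_heavy))  # paper gains against the rock-heavy mix
```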
For example, I might order a couple beers, straddle a few times and start egging on the rest of the players for a round of straddling. To get it going I might throw in a few blind bets. Ask folks to go all-in blind once or twice, for fun. Do that a few times and no one expects you to have a good hand. Well, except for the folks that have seen that story play out before. Except for the beers and bending the rules, the computer can do the same thing.
Without having looked into the details of this particular bot, they're almost always trained against themselves. It would be very surprising to me if they bootstrapped with any hands from live players.
Nobody who's been paying attention in the poker community would be shocked by this. HULHE has already been an AI win for a few years. HUNLHE is basically just a matter of time.
Nash equilibrium strategies specifically do work better against thinking opponents - the usual complaint about them is that they don't maximally exploit "bad" opponents.
I feel that Poker is one of those games that an AI could develop a winning strategy for, but not the most winningest. If you deployed this AI on pokerstars, it would win you money. If you deployed it at the WSOP, it would not win a bracelet.
nit: as a former professional poker player, the skill level required to be a long-term winning player at a high-stakes cash game is considerably higher than the skill requirement to win a single high-stakes tournament. (I would not voluntarily sit at a high-stakes cash table without knowing there were a mark present, I would voluntarily play most WSOP tournaments)
As another former pro and high stakes player, I agree, and I don't think anyone thinks it's particularly close. I'd even go so far as to say that the skill it takes to be a long term winning player at cash games is higher than that to be a long term tournament winner.
Agreed, and that's speaking as a stubborn amateur. In tournaments, the escalating blinds point you toward pretty straightforward strategies at different junctures. You're in the shallow end of the pool. In a cash game, you're always in the deep end.
I think "fish" is more typical. Though I suppose in Rounders it was "sucker". Still the best poker movie, despite depicting the actual poker play badly. https://en.wikiquote.org/wiki/Rounders_(film)
One thing to support this is the idea that there doesn't seem to be much more to differentiate a winner and a loser at the WSOP than randomness. Among the best players, they are likely so close in skill that winning an event is determined by a slightly weighted coin toss.
The other side is that an AI could discover some elements of lie detection that even humans cannot observe. It may not have anything to do with the power of the AI's computing, but more that its hardware can observe outside stimuli that humans cannot, and the AI may be able to integrate it into its strategy (e.g. a facial twitch that cannot even be seen by the naked eye).
I think the latter point is the one that could give an AI the edge to bracelet-win at poker. The play itself is too human for an AI to develop the right strategy, but the AI can find sources of truth in the physical actions of human opponents that humans would not be able to see.
Dwan is something of a reclusive/shady figure these days. Also it's unlikely he's currently playing anywhere close to his skill peak/ the skill level of the four selected - once a poker pro has achieved a level of fame commensurate to his, they can live comfortably for the rest of their days bleeding wealthy amateurs who get their kicks from playing (and losing) high stakes cash games against pros.
Don't know about Blom - probably just insufficient action for him (couldn't tell from the article, but it seems likely they're playing play money cash games, with fixed prizes for players who end ahead).
They're heads-up pros. Ivey and many other well-known names are not heads up specialists like these competitors. They'd likely have an edge over almost any "recognizable" name player.
Jason Les and Dong Kim both played in the previous competition, with Doug Polk (arguably the best heads up player in the world). They're reasonably in the discussion, and other factors such as interest and availability possibly factored in.
The $200k pool of money for an actual 'celebrated' poker player is absolutely not worth the time and effort. Hence why you get a few tiers down in terms of player skill. Still valid tests though, you can't only compare it to the top 1%
It's not a few tiers down in skill, it's a few tiers down in fame. Each of the four players named is likely a favorite over Ivey in this format. Being a specialist counts for a lot here.
“Your favorite poker player almost surely wouldn't agree to play any of these guys for high stakes, and would lose a lot of money if they did,” Galfond added. “Each of the four would beat me decisively.”
All-time winnings is not the best measurement as it doesn't take into account things like years played/measured, single individual large wins, etc. It's a decent list for one mode of inspection (winnings) but shouldn't automatically be taken as a ranking of players by skill.
Ivey is a very good player, for sure. In the context of HU play (as being discussed in the article), there are better.
That looks much more like he quit playing online. Saying he 'just got lucky a few times' is very odd. Ivey is considered one of the top pros by virtually everyone.
Beyond that, he (like many top pros), prioritizes lucrative cash games over tournament play (or exploiting an edge at baccarat, if the opportunity arises) which we can't track.
The 90's called. They want their poker strategy back.
Poker is not about reading your opponents "tells". That is a myth perpetuated by Hollywood and ... professional poker players.
While there definitely could be some value in using psychological manipulation to best your "live" opponent, gaining some advantage via a physical tell is a crap shoot at best except against the absolute worst players or the biggest mark.
I did play with a guy for several sessions who would consistently shake his leg whenever he had a good hand. I felt so bad for him (after busting him a few times) that I told him about it so he'd fix the shaking.
https://en.wikipedia.org/wiki/Heads_up_poker