Exploiting the sit-and-go

posted on 2011-11-21

The key to winning at poker is finding a game you can exploit. As Matt Damon tells us in Rounders:

Listen, here’s the thing. If you can’t spot the sucker in the first half hour at the table, then you ARE the sucker.

To make money at poker, you must find and take advantage of your opponents’ weaknesses and minimize your own. This is true whether you are playing yourself or a bot is playing for you. In this section, I will describe why sit-and-go style games are particularly easy for a pokerbot to exploit.

If you need a refresher on basic poker hands or how to play Texas Hold’em, I recommend reading the Wizard of Odds’ page on Texas Hold’em before continuing.

Pokerbot Style

Everyone has a different style of playing poker, and each style is better suited to certain types of games. Before we look at exactly what types of games we can exploit, we should analyze our own strengths and weaknesses.

Strengths:

• Consistency of play

• Patience

• Won’t “go on tilt”

• Accurate odds calculations

Weaknesses:

• Does not respond well to change

• Can only handle preprogrammed situations

The “traditional” pokerbot plays small-stakes “cash” games. In these games, table conditions are relatively consistent: blind structures do not change; the number of players at the table is relatively constant; the amount of cash a player has plays only a small part in strategy. All this means that each hand, strategically speaking, should be approached with the same perspective. This plays right to the advantages of a pokerbot. If you can program your bot to handle this situation, you’ve hit the jackpot.

What makes poker a unique game is that a poker hand can have literally billions of variations. Some of these are easy to program for. For example, if your hole cards are 2 7 offsuit preflop, then fold. Most situations are much more difficult than this. Not only do we have to consider our own cards, but we must also consider how other people’s actions give clues as to what cards they may have. It turns out this is extremely difficult to program for. There are just too many factors for a human programmer to account for everything.

This is the case for the traditional pokerbot. The information overload dominates. That is why most pokerbots available today are not competitive. They chose the wrong game.

The Sit-and-Go Tournament

PokerPirate exploited the sit-and-go tournament. These tournaments have become very popular online. Essentially, they are single table tournaments. At Royal Vegas Poker, they worked like this:

• Everyone buys in for a set amount, say $5+$0.50. (This means that the house takes fifty cents, and five dollars go into the prize pool. The total cost to the player is $5.50.)

• The tournament starts once ten players have joined

• Everyone is given 1000 tournament chips

• Blinds increment like this:

Hand     RVP blinds
1-10     10/20
11-20    20/40
21-30    40/80
31-40    80/160
41-50    160/320
51-60    320/640
61+      640/1280

• Once a player reaches 0 chips, he is eliminated from the tournament

• The first place player receives 50% of the prize pool, second place 30%, and third place 20% (for the $5+$0.50 game above, this would be $25 to 1st, $15 to 2nd, and $10 to 3rd)
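The structure above is mechanical enough to write down directly. The following is a minimal sketch, not PokerPirate's actual code: the names `BLIND_LEVELS`, `blinds_for_hand`, and `payouts` are hypothetical, and the numbers are simply copied from the list above.

```python
# Blind schedule from the list above, as (last hand of level, small blind, big blind).
BLIND_LEVELS = [
    (10, 10, 20),
    (20, 20, 40),
    (30, 40, 80),
    (40, 80, 160),
    (50, 160, 320),
    (60, 320, 640),
]
FINAL_BLINDS = (640, 1280)  # hand 61 and later

# Share of the prize pool for 1st, 2nd, and 3rd place, in percent.
PAYOUT_PERCENTAGES = (50, 30, 20)


def blinds_for_hand(hand_number):
    """Return (small blind, big blind) for a 1-based hand number."""
    if hand_number < 1:
        raise ValueError("hand numbers start at 1")
    for last_hand, small, big in BLIND_LEVELS:
        if hand_number <= last_hand:
            return small, big
    return FINAL_BLINDS


def payouts(buy_in, players=10):
    """Return the prizes for 1st, 2nd, and 3rd place.

    buy_in is the part of the entry cost that goes into the prize pool,
    so it excludes the house's fee.
    """
    pool = buy_in * players
    return [pool * pct / 100 for pct in PAYOUT_PERCENTAGES]
```

For the $5+$0.50 game, `payouts(5)` gives the $25, $15, and $10 prizes listed above, and `blinds_for_hand(25)` gives the 40/80 level.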

I developed PokerPirate in 2005-2006 to work on the Royal Vegas Poker casino. Since then they have redesigned their game to use the new PokerTime software, and this has changed their tournament structures somewhat. It is still, however, very similar. Most other online poker rooms offer sit-and-go’s with similar structures.

The sit-and-go tournament, surprisingly, is a goldmine for pokerbots. At first, this seems ridiculous because things are changing all the time. It is almost the exact opposite of a cash game. Blind structures change frequently; the number of players at the table is constantly decreasing; and the amount of cash a player has plays a huge part in tournament strategy. This means the bot must be able to respond to many more situations. More situations mean more complexity, which makes the bot harder to program. This apparent contradiction is why there has been little work on sit-and-go bots compared to cash game bots.

Because the house charges an additional 10% to play a game, playing merely average will result in us losing money. We have to be at least 10% better than our opponents just to break even. In practice, a good player will make 10-15% profit over thousands of games after subtracting the house’s cut. Below I go into detail about how to measure success, but at this point it is important to understand that even the best poker players may win or lose an individual sit-and-go. We are only concerned about our net profits over thousands of hands.

Exploiting the Game

The secret of PokerPirate is that there are many more opportunities to exploit our opponents’ weaknesses in a sit-and-go tournament, and these exploits are relatively mechanical (i.e. easy to program). This will become apparent if we divide the tournament into 3 distinct phases: the early, mid, and end games.

Most of our advantage comes in the early game. The early game is made up of the first 20 hands. Blinds at this stage are either 10/20 or 20/40. This is very small compared to our initial chip count of 1000. If we went all 20 hands without playing once, we would lose 90 chips, leaving us with 910. This is still a very respectable chip count to enter the mid game with. Furthermore, most players are very impatient, and joined the game just for some action. During the first 20 hands, typically 1-3 players will go all in. This means that 1-3 players will be removed from the game, so only 7-9 will remain. Simply by waiting, we reduce our number of opponents and maintain roughly the same number of chips.
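The 90-chip figure is easy to verify. This sketch assumes the table stays at ten players for all 20 hands, so each 10-hand blind level is exactly one orbit and we post the small blind once and the big blind once per level. In a real game the table shrinks as players bust, so the blinds come around a little more often; the function name and defaults are illustrative only.

```python
# Blinds for hands 1-10 and 11-20, as (small blind, big blind).
EARLY_LEVELS = [(10, 20), (20, 40)]


def stack_after_folding(starting_stack=1000, levels=EARLY_LEVELS):
    """Chips left after folding every hand through the given blind levels.

    Assumes one full orbit per level: we pay one small blind and one
    big blind at each level and never win or lose any other chips.
    """
    stack = starting_stack
    for small, big in levels:
        stack -= small + big
    return stack


print(stack_after_folding())  # 1000 - (10 + 20) - (20 + 40) = 910
```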

With a normal stack of chips and ten players left, being average over the long term means losing the house’s cut, roughly 10%. With nine players left, being average roughly breaks even. With fewer than nine, being average generates a net profit.
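A rough calculation supports these break-even numbers. The sketch below assumes, as a simplification, that an average player with a normal stack can expect an equal share of the prize pool among the players still in the game. The function name and parameters are hypothetical, and the defaults are the $5+$0.50 game described earlier.

```python
def average_roi(players_left, buy_in=5.0, fee=0.50, starting_players=10):
    """Expected return for an average player, as a fraction of the total cost.

    Simplifying assumption: every remaining player has an equal expected
    share of the prize pool.
    """
    prize_pool = buy_in * starting_players
    average_share = prize_pool / players_left
    cost = buy_in + fee
    return (average_share - cost) / cost


for n in (10, 9, 8, 7):
    print(f"{n} players left: {average_roi(n):+.1%}")
```

Under this assumption, ten players gives a loss of about 9%, nine players is within about 1% of break-even, and each further elimination adds to the expected profit.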

The hardest part of the tournament is the midgame. The number of players is reduced from 7-9 to 4 or fewer, and blinds are typically 40/80 or 80/160, but can get as high as 160/320. We cannot afford to play passively anymore at this point, because blinds are big enough to take away a significant portion of our stack. Blinds are not yet high enough, however, that we are pot committed if we play a hand. This means the program will encounter “tricky” situations where it will have to decide what to do. It is these tricky situations that make developing a pokerbot difficult.

PokerPirate is at a disadvantage during the midgame, because the midgame so closely resembles a cash game. Cash games are difficult to program because there are a lot of different hand combinations and the AI has to make difficult bets and calls. Luckily, because people have already been removed from the tournament, playing average will provide us with long term profits. Therefore we have a much easier goal to meet than do the cash game bots.

Another advantage of the tournament structure is that deciding the amount to bet in a no-limit game becomes much easier. In an individual hand, we are no longer concerned with maximizing our expected value. Instead, we are concerned with not getting knocked out. This turns out to be much easier to handle because we go all-in much more frequently. In a typical game, we will have to win 1-2 major hands (all-in) or 3-4 minor hands (win blinds) in order to make it to the end game and have a shot at winning first place. A limit hold’em sit-and-go bot would probably be more successful because it wouldn’t have to deal with this particular challenge, but in practice there are very few limit sit-and-go tournaments.

We also have a great advantage during the endgame. During the endgame, there are typically 4 or fewer players and blinds are 160/320 or greater. There are two things we can exploit here. First, at this point, most players are desperate just to get “in the money.” That means they will play more conservatively than they should, hoping that someone else will get knocked out. First place prize, however, pays much more than 2nd or 3rd. Therefore, to maximize long-term profits, you should always play to win 1st and not settle to get just in the money. Second, because blinds are so large relative to stack size, you must play many more hands or else be blinded to death. Because blinds are so large, the best moves are typically to either go all-in or fold preflop. This avoids any tricky situations, and allows us to use the computer’s knowledge of preflop odds to greatly bolster our play.

In summary, a sit-and-go pokerbot has three advantages over the cash game bot:

1. Players are too aggressive too early, and will get kicked out without us doing anything

2. Players don’t realize that they should always be playing for first place

3. Preflop play is much more important, and computers are good at preflop play

For more information about beating sit-and-go’s, I recommend Sit ’n Go Strategy or How to Beat Sit ‘n’ Go Poker Tournaments.

Measuring Success

Professional poker players define success in terms of how much money they make; therefore, a pokerbot’s success should be determined in the same manner. The bot must be skilled enough not only to win money from the other players, but also to win enough to cover the house’s take. Also, it must be able to make money not just occasionally, but over a prolonged period of time.

Return on Investment (ROI) is a measure of a player’s ability at a given game. A negative ROI means the player will lose money over time; a zero ROI means he will neither gain nor lose money; a positive ROI means the player will make money over time, and this is our goal. It is calculated like this:

ROI = (Winnings - Investment)/Investment

It is important that we include the house’s fees in the investment portion of the calculation. For example, if we play 1000 games of $5+$0.50 sit-and-go’s, the investment would be $5500. If our total winnings over this period were $6000, then our ROI would be about 9%.
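The same calculation as a short sketch. The function name `roi` is just for illustration; the numbers are the ones from the example above.

```python
def roi(winnings, investment):
    """Return on investment as a fraction: (winnings - investment) / investment.

    The investment must include the house's fees, not just the buy-ins.
    """
    if investment <= 0:
        raise ValueError("investment must be positive")
    return (winnings - investment) / investment


games = 1000
investment = games * (5.00 + 0.50)  # buy-in plus house fee for every game
print(f"{roi(6000, investment):.1%}")  # prints 9.1%
```

Leaving the fee out would make the investment $5000 and the ROI look like 20%, which is why the fee has to be counted.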

Because poker has such a large amount of chance in any individual hand, ROI has a theoretical upper limit. This upper limit is determined by the skill of the other players, which in turn depends on the buy-in for the game. In practice, this has been determined to be about:

Buy in ($)   ROI (%)
11           20
22           15
33           10
55           8
109          7
215          6

Hourly rate is a measure of how quickly a player wins money. A higher ROI will increase hourly rate, but there are many other factors that contribute as well, such as playing several tables at the same time.

hourly rate = ROI * Buy in * # games per hour

For example, a 9% ROI at a $5+$0.50 game, playing 5 games per hour, yields an hourly rate of about $2.50. A human wouldn’t want to quit their day job for such a paltry income, but a pokerbot making this would create a nice additional paycheck.
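The hourly rate formula is just as short in code. In this sketch the function name is hypothetical, the buy-in is the full $5.50 cost including the fee (to match how the investment was counted for ROI), and `games_per_hour` is the total across all tables being played.

```python
def hourly_rate(roi, buy_in, games_per_hour):
    """Expected profit per hour: ROI * buy-in * games per hour.

    roi is a fraction (0.09 for 9%); buy_in includes the house's fee.
    """
    return roi * buy_in * games_per_hour


rate = hourly_rate(roi=0.09, buy_in=5.50, games_per_hour=5)
print(f"${rate:.2f} per hour")
```

The exact product is 2.475, which rounds to the $2.50 quoted above.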

For human players, ROI and hourly rate can be at odds with each other. A player might be able to have a 20% ROI playing just one table at a time, or a 10% ROI playing 4 tables. The latter results in twice the hourly rate with half the ROI. A pokerbot does not have this dilemma. It will play just as well at 1 table or 4. Therefore ROI and hourly rate can be treated almost synonymously when measuring the success of a pokerbot. Since human players use hourly rate, however, that will be our convention from here on out.