In the previous articles in this series, I covered the essentials of getting started and getting involved with voice programming, and some best practices. This time, I’m going to talk about an issue that comes up as you begin to create more complex command sets and grammars: grammar complexity and the dreaded BadGrammar error.

First, a little background. BadGrammar is an error that occurs in voice software that plugs into Dragon NaturallySpeaking, including Natlink, Vocola, and Dragonfly. For a long time, I thought it was caused by adding a sufficiently high number of commands to a grammar. Then the subject came up in the Github Caster chat, and I decided to create some tests.

A Bit of Vocabulary

Before I describe the tests, I need to make sure we’re speaking the same language. For the purposes of this article, a command is a pairing of a spec and an action. A spec is a set of spoken words used to trigger an action. The action, then, is some programmed behavior, such as automatic keypresses or executing Python code. Finally, a grammar is an object composed of one or more commands, which is passed to the speech engine.

The Tests

The first test I did was to create a series of grammars with increasingly large sets of commands in them. I used a random word generator and a text formatter to create simple one-word commands which did nothing else but print their specs on the screen. All tests were done with Dragonfly, so the commands looked like the following.
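As a rough illustration, a command set of this shape could be generated with a few lines of Python. This is a sketch rather than the script used for the tests: the helper name make_rule_source, the rule name, and the word list are all hypothetical, and the word list stands in for the random word generator. The generated text uses Dragonfly's MappingRule, Function, and Grammar.

```python
def make_rule_source(words, rule_name="GeneratedRule"):
    """Return Python source for a Dragonfly MappingRule with one command per word.

    Each command's action only prints its own spec. The words are assumed to be
    plain words without quotes or backslashes.
    """
    lines = [
        "from dragonfly import Function, Grammar, MappingRule",
        "",
        "",
        "def show(word):",
        "    print(word)",
        "",
        "",
        "class %s(MappingRule):" % rule_name,
        "    mapping = {",
    ]
    for word in words:
        # One one-word spec per entry, mapped to an action that prints it.
        lines.append('        "%s": Function(show, word="%s"),' % (word, word))
    lines += [
        "    }",
        "",
        "",
        'grammar = Grammar("generated")',
        "grammar.add_rule(%s())" % rule_name,
        "grammar.load()",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_rule_source(["alpha", "bravo", "charlie"]))
```

Scaling the test up is then just a matter of passing in a longer word list and saving the output as a grammar file.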

After creating a (slow but usable) grammar with 3000 commands in it, my sufficiently-high-number-of-commands theory was shot. (Previously, I had gotten BadGrammar with about 500 commands.) Instead, it had to be, as Mark Lillibridge had speculated, complexity. So then, how much complexity was allowed before BadGrammar?

Fortunately, Dragonfly has a tool which measures the complexity of grammars. It returns its results in elements, which for our purposes here can be summarized as units of grammar complexity. There are many ways to increase the number of elements of a grammar, but the basic idea is, the more combinatorial possibility you introduce into your commands, the more elements there are (which should surprise no one). For example, the following rule with one Dragonfly Choice extra creates more elements than the above example, and adding either more choices (key/value pairs) to the Choice object or more Choice objects to the spec would create more still.

CCR grammars create exceptionally large numbers of elements because every command in a CCR grammar can be followed by itself and any other command in the grammar, up to the user-defined maximum number of times. The default maximum number of CCR repetitions (meaning the default number of commands you can speak in succession) in Dragonfly is 16.
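To get a feel for how quickly the possibilities pile up, here is a small sketch that counts distinct utterances. It is only an illustration: it counts what a user could say, which is not the same thing as Dragonfly's element count, and both function names are made up for this example.

```python
def phrasings_per_command(choice_sizes):
    """Distinct phrasings of one spec: the product of its Choice sizes."""
    total = 1
    for size in choice_sizes:
        total *= size
    return total


def ccr_sequences(phrasings, max_repetitions):
    """Distinct command sequences of length 1 up to max_repetitions."""
    return sum(phrasings ** length for length in range(1, max_repetitions + 1))


if __name__ == "__main__":
    # Ten commands, each with a single five-way Choice: 50 phrasings in total.
    phrasings = 10 * phrasings_per_command([5])
    for reps in (1, 2, 4, 16):
        print(reps, ccr_sequences(phrasings, reps))
```

Every extra Choice multiplies the phrasings, and every extra repetition multiplies the sequences again.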

With this in mind, I wrote a procedure which creates a series of increasingly complex grammars, scaling up first on number of Choices in a spec, then number of commands in a grammar, then max repetitions. (All tests were done with CCR grammars since they are the easiest to get BadGrammar with.)

The Results

The data turned up some interesting results. The most salient points which emerged were as follows:

  • The relationship between number of repetitions and number of elements which causes BadGrammar can be described by a formula. Roughly, if the number of max repetitions in a CCR grammar times the square root of the elements divided by 1000 is greater than 23, you get BadGrammar.
  • These results are consistent across at least Dragon NaturallySpeaking versions 12 through 14 and BestMatch recognition algorithms II-V.
  • Multiple grammars can work together to produce BadGrammar. That is, if you have two grammars active which both use 55% of your max complexity, you still get the error.
  • If you use Windows Speech Recognition as your speech recognition engine rather than Dragon NaturallySpeaking, you won’t get a BadGrammar error, but your complex grammars simply won’t work, and they will slow down any other active grammars.
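The rule of thumb can be turned into a rough pre-flight check. The sketch below rests on two assumptions of mine: that the score is the product of max repetitions and the square root of the element count, divided by 1000, and that the scores of simultaneously active grammars simply add up. The function names are hypothetical, and the threshold of 23 is the empirical figure from the tests.

```python
import math

BAD_GRAMMAR_THRESHOLD = 23  # empirical figure from the tests above


def complexity_score(max_repetitions, elements):
    """Assumed reading of the formula: repetitions * sqrt(elements) / 1000."""
    return max_repetitions * math.sqrt(elements) / 1000


def predicts_bad_grammar(grammars):
    """grammars is a list of (max_repetitions, elements) pairs, one per active grammar.

    Assumption: the scores of all active grammars are summed.
    """
    total = sum(complexity_score(reps, elements) for reps, elements in grammars)
    return total > BAD_GRAMMAR_THRESHOLD


if __name__ == "__main__":
    print(predicts_bad_grammar([(16, 1000000)]))
    # Two grammars at 55% of the limit each go over it together.
    print(predicts_bad_grammar([(10, 1600225), (10, 1600225)]))
```

A check like this is the kind of thing a complexity warning could be built on.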

Implications

So what does all of this mean for you, the voice programmer? At the very least, it means that if you get BadGrammar, you can sacrifice some max repetitions in order to maintain your current complexity. (Let’s be honest, who speaks 16 commands in a row?) It also exposes the parameters for solutions and workarounds such as Caster’s NodeRule. It gives us another benchmark by which to judge the next Dragon NaturallySpeaking. Finally, it enables features like complexity warnings both at the development and user levels.

Grammars do have a complexity limit, but it’s a known quantity and can be dealt with as such.

UPDATE 3/26/2017: Saltbot now has its own subreddit, /r/saltbot. Private message one of the mods to join.

UPDATE 8/14/2016: Since the last update, Reconman has taken over maintenance and upgrades of Saltbot. Because he has added a lot of features, I am updating this article again.

UPDATE 5/16/2015: Chrome updated their App policy, so if you want to install Saltbot, you have to do so from the App Store.

Due to a recent surge of interest in Saltbot, the betting bot I created for Saltybet.com, I’ve decided to write this guide detailing its use, give its interface a facelift, and make available on Github a substantial but dated chunk of data which I gathered and used to develop the bot.

I’m going to go through its features in the order in which they appear in the UI. First, then, are the four modes.

Betting Modes

Saltbot has four different modes which determine its basic behavior: Monk, Scientist, Cowboy, and Lunatic. The names are just for fun. Here’s how they work.

  1. Monk
    All four modes record match information after every match. Monk records information only, and doesn’t place bets.
  2. Scientist
    Scientist is the most accurate of the four modes. It uses all available information gathered from past matches to create a confidence score for each upcoming character. That score is then used to determine the selection and the betting amount. When determining its betting amount, it applies the confidence score to a flat amount which is itself determined by your total winnings meeting certain thresholds. Scientist requires about 5000 recorded matches to be usable.
    It also requires an evolved “chromosome” for its genetic algorithm to be effective. See the “Chromosome Management” section below.
  3. Cowboy
    Cowboy is a dumber (or more focused) version of Scientist. It only takes win percentage into account when making its selection. Also, unlike Scientist and Lunatic, it bets based on a percentage of your total winnings, not a flat amount based on a winnings threshold.
  4. Lunatic
    Lunatic doesn’t use stats at all. It flips a coin to determine its selection, then bets flat amounts, again, based on your winnings reaching certain thresholds.
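The differences between the four modes can be summed up in a short sketch. This is Python for illustration only, not Saltbot's actual JavaScript. The record keys "confidence" and "win_pct", the assumption that confidence runs from 0 to 1, and the 10% Cowboy fraction are all placeholders of mine.

```python
import random

COWBOY_FRACTION = 0.10  # hypothetical; the real percentage may differ


def decide(mode, red, blue, balance, flat_amount, rng=random):
    """Return (pick, wager) for one match, or None when no bet is placed.

    red and blue are dicts with hypothetical keys "confidence" and "win_pct".
    flat_amount is the threshold-based flat bet for the current balance.
    """
    if mode == "monk":
        return None  # record only, never bet
    if mode == "scientist":
        pick = "red" if red["confidence"] >= blue["confidence"] else "blue"
        best = max(red["confidence"], blue["confidence"])
        return pick, round(flat_amount * best)
    if mode == "cowboy":
        pick = "red" if red["win_pct"] >= blue["win_pct"] else "blue"
        return pick, round(balance * COWBOY_FRACTION)
    if mode == "lunatic":
        return rng.choice(("red", "blue")), flat_amount
    raise ValueError("unknown mode: %s" % mode)


if __name__ == "__main__":
    red = {"confidence": 0.5, "win_pct": 0.4}
    blue = {"confidence": 0.25, "win_pct": 0.7}
    for mode in ("monk", "scientist", "cowboy", "lunatic"):
        print(mode, decide(mode, red, blue, balance=2000, flat_amount=1000))
```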

For any of the modes to work, you have to be logged into a Salty Bet account. If you press F12, you can see their logic and messages in the developer console.

I should mention that the first match out of every hundred will be recorded with some information missing due to the auto refresh feature and the way in which the information is collected. This has little bearing on accuracy because the information which goes missing isn’t very important. However, if you close the Twitch window which the extension launches, lots of information will be missing. Leave it open.

Chromosome Management

In order to get started using Scientist, you first have to initialize the chromosome pool by clicking the “Reset Pool” button, and then set the chromosome evolution in motion by clicking the “Update Genetic Weights” button. While the genetic algorithm is running, Saltbot’s UI will freeze until you switch tabs or click off of it. For best results, you should let the genetic algorithm run for at least fifty generations.

Between rounds of evolution, the messages box will be updated with three pieces of information: “g”, the generation number for this round of evolution (closing the extension resets this counter but doesn’t reset the pool); the current best chromosome’s accuracy when applied to all recorded matches; and the current best chromosome’s approximate winnings when applied to all recorded matches.

If you like, while the genetic algorithm is running, you can watch the chromosomes evolve in the developer console of the extension’s background window. To open it, right-click any part of the extension UI and select “Inspect Element”. Maybe this is really nerdy, but during the development of this bot, I came to enjoy watching the chromosomes more than the matches.

Records

If you would like to make a copy of your database for backup or analysis, or share your records with your friends, you can use the import and export buttons to do so.

Options

Presently there is only one option: Toggle Video. This is intended for low-bandwidth users or users who wish to let Saltbot bet in a background tab and therefore don’t need the video panel consuming resources.

Betting Controls

In the two years since its creation, many Salty Bettors have turned up at the Github page and asked for more granular control of the automated betting. Reconman responded by adding the Betting Controls section and some options on the Configuration menu. (See below for Configuration menu details.)

  • The “aggressive betting up to” control allows you to multiply bets by 10 until the specified cash threshold is reached. (Not active during tournaments.)
  • The “stop betting at” control lets you stop bets after the specified cash threshold is reached.
  • The “betting multiplier” control lets you increase or decrease all bets by up to an order of magnitude. This feature stacks with the “aggressive betting up to” control, but does not stop at the “aggressive betting up to” threshold. (Not active during tournaments.)
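How the three controls interact is easier to see in code. The following is a sketch in Python, not the extension's real logic: the function and parameter names are hypothetical, and I am assuming that “stop betting at” is checked first and still applies during tournaments, since only the other two controls are described as inactive there.

```python
def apply_controls(wager, balance, aggressive_until=None, stop_at=None,
                   multiplier=1.0, tournament=False):
    """Adjust a proposed wager according to the three betting controls."""
    # "stop betting at": no bet once the cash threshold is reached.
    if stop_at is not None and balance >= stop_at:
        return 0
    # The other two controls are not active during tournaments.
    if tournament:
        return wager
    # "aggressive betting up to": bet ten times as much below the threshold.
    if aggressive_until is not None and balance < aggressive_until:
        wager *= 10
    # "betting multiplier": stacks with aggressive betting, ignores its threshold.
    return int(wager * multiplier)


if __name__ == "__main__":
    print(apply_controls(100, 500, aggressive_until=1000, multiplier=2))
    print(apply_controls(100, 5000, aggressive_until=1000, multiplier=2))
    print(apply_controls(100, 10000, stop_at=10000))
```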

New Features

Reconman has added a lot of new features since the original bot was written. They are as follows.

New: Character Database

The character database is accessed by clicking the grid icon at the top of the SaltBot UI. You can use it to view the raw data that Scientist and Cowboy modes use to make their decisions. You can also search by character name. The characters in the character database come from the character data that you collect each match, and any data you upload to the bot.

(Notes: The “strategy” column records which mode was active for that match: Monk = “obs”, Scientist = “cs”, Cowboy = “rc”, and Lunatic = “ipu”. In the “winner” column, 0 means red and 1 means blue.)
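If you export your records for analysis, a small lookup makes those codes readable. This is a sketch in Python: the codes come from the notes above, but the record layout (a dict with "strategy" and "winner" keys) is an assumption, so adjust it to match your export.

```python
STRATEGY_CODES = {"obs": "Monk", "cs": "Scientist", "rc": "Cowboy", "ipu": "Lunatic"}
WINNER_CODES = {0: "red", 1: "blue"}


def describe(record):
    """Turn one record's strategy and winner codes into readable text."""
    mode = STRATEGY_CODES[record["strategy"]]
    winner = WINNER_CODES[record["winner"]]
    return "%s mode, %s won" % (mode, winner)


if __name__ == "__main__":
    print(describe({"strategy": "cs", "winner": 1}))
```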

New: Configuration

There is only so much space on the SaltBot UI, and so some items have been moved to the Configuration menu. The Configuration menu can be accessed via the gear icon.

  • Exhibition Betting Toggle: Some players think that betting on Exhibitions mode is inherently too random or risky and would prefer that the bot not bet on those matches at all. Bets on Exhibitions mode matches can be toggled off via this menu option.
  • Tournament Options: There are settings to stop Saltbot from betting in tournaments after a certain cash threshold has been reached, and to choose whether or not it always goes all-in.
  • Player Rankings: This used to be displayed in the F12 developer console, but now has a much cleaner-looking display on the Configuration page. SaltBot tracks player data as well as character data. The most frequent bettors’ betting stats can be viewed via these buttons.

New: Help and Github links

The question mark and Github icons lead here and to SaltBot’s Github pages, respectively. If it’s not apparent from the comments below, I no longer maintain SaltBot and so questions and concerns should be directed to the Github page where Reconman and a few others work on it. Also, Reconman is very patient, but for his sanity, if you have a bug you’d like to report, please read the bug reporting guide. It’s short.

Getting Started

To install the available historical data, download “65k records without exhibitions June 2016.txt” from Github, or one of the other seed data files, and import it with the “Import Records” button. (Alternatively, you can let Monk mode gather your own data for you for a while.) From there, you can switch to Scientist mode and SaltBot will take over for you. Happy betting!

This article recounts the story of how I became one of the wealthiest 100 players on a virtual betting site with over 10,000 active users. I was looking to sharpen up my JavaScript skills when I came across a mention on Hacker News of what turned out to be the perfect learning project opportunity: Salty’s Dream Casino.

Salty’s Dream Casino

A little bit of background is in order. Salty’s Dream Casino, a.k.a. SaltyBet, is a website whose main feature is an embedded Twitch video window and chat. The video window shows a video game called M.U.G.E.N. running 24/7. The game is a fighting game, and if you sign up for an account on SaltyBet.com, you are given 400 “salty bucks” and can bet on who will win the fights. It’s all play money of course, and if you run out, you automatically get a minimum amount so that you can continue betting. There are no human players controlling the characters; they are computer-controlled, some with better or worse AI. (Some have laughably bad AI, but that’s part of the fun.) The characters are all player-created and there are over 5000 of them spanning five tiers: P, B, A, S, and X.

What I saw in SaltyBet was fast iteration for development (most matches are over in 1-2 minutes, the perfect amount of time to fix the logic and come back), a fun project, and data that would translate well into features for some of the machine learning algorithms that I’d been studying.

My Short Trip to the “0.1%”

As I would be needing to inject JavaScript into the site, I decided to go the route of a Google Chrome extension. I didn’t know anything about browser extensions at the time, so that would also be a great learning opportunity. The initial step was to create a basic runtime that ran in parallel to the SaltyBet fighting match cycle. I set it up: the very first version of the bot picked a side randomly and bet 10% of total winnings.

Happy that I’d gotten the extension working, I left it on overnight. When I woke up, my ranking was #512 out of 400,000 total accounts, with $367,000. As I suspected, this was just a great stroke of luck. When I got home that day, the bot had pissed away most of the money.

The Progression of Strategies

In order to create strategies any more advanced than a coin toss, I would need to collect data. I implemented some basic stat collection using Chrome Storage, as well as records import and export, then got to work on the first real strategy, “More Wins”. As its name implies, it simply compared wins and losses in order to determine which character to bet on. If there was a tie, it resorted to a coin toss again.

I plugged in RGraph to see just how much better “More Wins” was doing than a coin toss. I was dismayed to see that although the coin toss strategy had the expected 50% accuracy rate, “More Wins” was at 40%! After modifying the bot to print out its decision-making logic at betting time, I realized that (A) lots of matches were being decided by coin tosses, and more importantly, (B) lots of bad calls were being made because I didn’t have enough data to effectively compare wins and losses. For example, if there were a matchup between a very strong character whom I didn’t have any data on, and a relatively weak character with one win and a bunch of losses, the loser’s single recorded win would trump the zero recorded wins of the champ.

(At this point, of course I could have signed up for a premium account and gotten full access to character statistics, but where’s the fun in that? Besides, I wanted my bot to be able to work with limited data, since premium accounts really just had a larger amount of limited data.)

My solution was to create “More Wins Cautious”, which would only make a bet if it had at least three recorded matches for each character. While MWC did do a few percentage points better, it almost never bet. Not a good solution.

I had a bit more data by this point and had also started to realize that comparing wins and losses both rewarded and penalized popular characters more than it should have. For example, consider the following two character records. The parts in parentheses represent data that my bot has recorded.

As you can see, “BenJ” is being rewarded for being popular rather than effective. My next strategy, “Ratio Balance” compared win percentage rather than number of wins. This yielded a fairly significant improvement: 55% accuracy.
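The difference between these early strategies fits in a few lines. This is a Python sketch rather than the bot's JavaScript, and the two records at the bottom are made-up numbers for illustration, not the records from my data. In all three functions, a result of None means no decision was reached.

```python
def more_wins(red, blue):
    """Pick the side with more recorded wins; None means a tie (coin toss)."""
    if red["wins"] == blue["wins"]:
        return None
    return "red" if red["wins"] > blue["wins"] else "blue"


def more_wins_cautious(red, blue, min_matches=3):
    """Like more_wins, but None (no bet) unless both sides have enough matches."""
    for record in (red, blue):
        if record["wins"] + record["losses"] < min_matches:
            return None
    return more_wins(red, blue)


def win_pct(record):
    total = record["wins"] + record["losses"]
    return record["wins"] / total if total else 0.0


def ratio_balance(red, blue):
    """Pick the side with the higher win percentage; None means a tie."""
    if win_pct(red) == win_pct(blue):
        return None
    return "red" if win_pct(red) > win_pct(blue) else "blue"


if __name__ == "__main__":
    popular = {"wins": 12, "losses": 18}  # hypothetical: seen often, wins 40%
    rare = {"wins": 3, "losses": 1}       # hypothetical: seen rarely, wins 75%
    print(more_wins(popular, rare))       # rewards popularity
    print(ratio_balance(popular, rare))   # rewards effectiveness
```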

Enter Machine Learning

Wanting to apply some of the machine learning material I’d been studying recently, I upgraded the bot to collect more information than just wins and losses. Now, for each match, it recorded match odds, match time (for faster wins and slower losses), the favorite of bettors with premium accounts (who constitute about 5% of total active accounts), and the crowd favorite. The next version of the bot, “Confidence Score” combined all of those features in order to make its decision.

But how to weigh the different features? The problem was a good fit for a genetic algorithm, so I put all of those weights on a chromosome and created a simulator which would go back through all the recorded matches and try out different weighting combinations, selecting for accuracy. The accuracy immediately leapt to 65%! In the days that followed, the chromosome class underwent a lot of changes, but its final form looked like the following.
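To make the idea concrete, here is a minimal sketch of that kind of search, written in Python rather than the bot's JavaScript. It is not the real chromosome or simulator: the feature names, the match layout, the pool size, and the mutation step are all assumptions chosen to keep the example short.

```python
import random

# Hypothetical feature names; the real chromosome had more weights than this.
FEATURES = ("win_pct", "odds", "time", "crowd", "premium")


def confidence(weights, features):
    """Weighted sum of one character's feature values."""
    return sum(weights[name] * features[name] for name in FEATURES)


def pick(weights, match):
    """Return 0 for red or 1 for blue, whichever side scores higher."""
    red = confidence(weights, match["red"])
    blue = confidence(weights, match["blue"])
    return 0 if red >= blue else 1


def accuracy(weights, matches):
    """Fraction of recorded matches this chromosome would have called correctly."""
    correct = sum(1 for m in matches if pick(weights, m) == m["winner"])
    return correct / len(matches)


def evolve(matches, generations=50, pool_size=20, seed=0):
    """Evolve a pool of weightings, selecting for accuracy (pool_size >= 4)."""
    rng = random.Random(seed)
    pool = [{n: rng.random() for n in FEATURES} for _ in range(pool_size)]
    for _ in range(generations):
        # Keep the more accurate half, then refill the pool with children.
        pool.sort(key=lambda w: accuracy(w, matches), reverse=True)
        survivors = pool[: pool_size // 2]
        children = []
        while len(survivors) + len(children) < pool_size:
            a, b = rng.sample(survivors, 2)
            # Crossover: each weight comes from one parent or the other.
            child = {n: rng.choice((a[n], b[n])) for n in FEATURES}
            # Mutation: nudge one weight, never below zero.
            gene = rng.choice(FEATURES)
            child[gene] = max(0.0, child[gene] + rng.uniform(-0.1, 0.1))
            children.append(child)
        pool = survivors + children
    return max(pool, key=lambda w: accuracy(w, matches))
```

Swapping the accuracy function for a different fitness measure changes what the pool evolves toward, without touching the rest of the loop.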

I also had the bot change its betting amount based on its confidence in its choice. This, more than any of the features of the data, turned out to be really important. It caused my winnings to stop fluctuating around the $20,000 mark and to instead fluctuate around the $300,000 mark. Then, I had the simulator select for (money * accuracy) instead of just accuracy, and my virtual wealth moved up into the $450,000 range despite the accuracy decreasing slightly. Of course by this time, I’d realized that most of the 400,000 accounts on SaltyBet were inactive, so I wasn’t really in the 1% yet.

Analysis

Before we move on, let’s take another look at that chromosome. There are a number of interesting facts which emerge, which aren’t intuitively obvious. For this reason, I’ve come to enjoy watching the chromosomes evolve more than watching the actual matches.

  • Win percentage dominates everything else. I actually put an anti-domination measure in the simulator, which penalized by 95% any chromosome that had one weight worth more than all of the others combined. (Doing so minimized the damage when the bot guessed wrong. Surprisingly, this didn’t hurt the accuracy much: only small fractions of a percent.)
  • Crowd favor and premium account favor are completely worthless.
  • Though success and failure do count differently in different tiers, the distribution is far from uniform. For example, wins and odds in X tier count an order of magnitude more than almost everything else, but match times in X tier aren’t that important.
  • Not shown here is that the chromosome formerly included confidence nerfs, conditions like “both characters are winners/losers” or “not enough information” which would decrease the betting amount if triggered, and also switches to turn the nerfs on and off (like epigenetic DNA). The simulator consistently turned off all of my nerfs, so I got rid of most of them. The true face of non-risk-aversion.
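The anti-domination measure from the first point can be sketched as a wrapper around whatever fitness the simulator computes. This is illustrative Python with hypothetical names, and it assumes that “penalized by 95%” means a dominated chromosome keeps 5% of its fitness.

```python
KEPT_FRACTION = 0.05  # assumption: a 95% penalty leaves 5% of the fitness


def is_dominated(weights):
    """True if one weight is worth more than all of the others combined."""
    values = list(weights.values())
    largest = max(values)
    return largest > sum(values) - largest


def penalized_fitness(fitness, weights):
    """Scale down the fitness of any chromosome ruled by a single weight."""
    return fitness * KEPT_FRACTION if is_dominated(weights) else fitness


if __name__ == "__main__":
    print(penalized_fitness(100.0, {"win_pct": 5.0, "odds": 1.0, "time": 1.0}))
    print(penalized_fitness(100.0, {"win_pct": 2.0, "odds": 1.0, "time": 1.0}))
```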

You’ll also notice that there’s a tier, “U”, on the chromosome which I didn’t mention before. Due to some quirks of the site, my information gathering isn’t perfect. So, “U” stands for Unknown.

Why Not Also Track Humans?

I had learned a ton about JavaScript (like closures and hoisting!) and browser extensions, and was pretty happy with the project. My bot swung wildly between $300,000 and $600,000 with 63% accuracy. I started to wonder how accurate the other players were, and realized I could also track them. So I did. I collected accuracy statistics on players for 30 days. This unearthed a few more interesting facts.

  • At 63% accuracy, my bot was in the 95th percentile. The most accurate bettor on the site bets at about 80%.
  • Players with premium accounts bet 6% more accurately than free players, on average.
  • Judging from the number of bets made, there were obviously other bots on the site.
  • Some players who were significantly richer than I was bet with much lower accuracy. One of them bet with 33% accuracy.

That last item in particular interested me. How could this be? … Upsets! I went back to the simulator and pulled out more statistics. By this time, I had quite a bit of data.

  • The average odds on an upset were 3:1.
  • The average odds on a non-upset were 14:1.
  • Upsets constituted 23% of all matches.
  • My bot was able to call 41% of all upsets correctly.
  • My bot was able to call 73% of all non-upsets correctly.

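Weighting each outcome by how often it occurs gives the expected return per dollar on a flat bet, where a correctly called upset pays about 3, a correctly called non-upset pays about 0.07 (that is, 1/14), and a wrong call loses the dollar:

(3 * 0.41 * 0.23) + (-1 * 0.59 * 0.23) + (0.07 * 0.73 * 0.77) + (-1 * 0.27 * 0.77) = -0.02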
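The same arithmetic as a small Python function, in case you want to plug in your own numbers. The function and parameter names are mine; the values in the example are the ones listed above.

```python
def expected_return(upset_rate, upset_payout, upset_accuracy,
                    favorite_payout, favorite_accuracy):
    """Expected profit per dollar on a flat bet.

    A correct call wins the payout; a wrong call loses the dollar.
    """
    favorite_rate = 1 - upset_rate
    return (
        upset_payout * upset_accuracy * upset_rate
        - (1 - upset_accuracy) * upset_rate
        + favorite_payout * favorite_accuracy * favorite_rate
        - (1 - favorite_accuracy) * favorite_rate
    )


if __name__ == "__main__":
    value = expected_return(0.23, 3, 0.41, 0.07, 0.73)
    print(round(value, 2))
```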

If I switched the bot to pure flat bet amounts, it would take a loss, but it would almost break even! With just a little bit of tweaking, it might be able to get into the black in a stable, linear way, rather than all the wild swings around a threshold. (I was still betting 10% of total winnings at this time.) It also occurred to me that, since my bot was on 24/7, there were lower traffic times during which it could actually move the odds far enough that it would hurt itself. That too would be minimized by flat bets.

I switched the bot over to flat betting amounts based on total winnings. (Meaning, it was allowed to bet $100 until it passed $10,000, then $1000 until it passed $100,000, and so forth.) I watched for a while and experimented with different things. What finally seemed to work was applying the confidence score to flat amounts, rather than the original 10% of totals. (So, the amount to bet was now flat_amount * confidence.) That did the trick: my losses were instantly cut by 3%, which meant I was getting a penny back on every dollar bet, on average. My rank has been steadily rising ever since. No more wild swings or caps, just slow wealth accumulation.
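Here is the betting rule as a Python sketch. The first two thresholds are the ones given above; continuing the pattern by a factor of ten at each step is my reading of “and so forth”, and treating confidence as a number between 0 and 1 is also an assumption.

```python
def flat_amount(balance):
    """$100 until the balance passes $10,000, $1000 until $100,000, and so on."""
    amount = 100
    ceiling = 10000
    while balance > ceiling:
        amount *= 10
        ceiling *= 10
    return amount


def wager(balance, confidence):
    """Scale the flat amount by the confidence score (assumed to be 0 to 1)."""
    return int(flat_amount(balance) * confidence)


if __name__ == "__main__":
    for balance in (5000, 50000, 450000):
        print(balance, flat_amount(balance), wager(balance, 0.5))
```

Because the flat amount only changes when a threshold is crossed, a single bad bet costs a small, bounded share of the bankroll.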

I don’t work on the bot anymore, but I leave it on, 24/7. I come home from work and see another $100,000 accrued, and smile. Sometimes it drops $100,000 instead, but the dips are always temporary. Since I started writing this article, it has accumulated $40,000. If you care to try it out yourself, or perhaps improve it somehow, please fork it. It’s on Github.