# A (Quick) Breakdown of SSCAIT Win Rates by AI Race.

Introduction:
Overall, I find that after controlling for individual bot skill level, bots tend to under-perform against Protoss (P) bots and over-perform against Terran (T) bots, and possibly against Zerg (Z) bots as well. Disclaimers, methodology, robustness checks, and conclusion follow.

Disclaimers:
This does not mean that Protoss is a superior race or Terran an inferior one. It only means bots perform worse against P opponents, and better against T opponents, than their past performance would suggest. It is entirely possible that exploits against P bots are harder to identify, or that the bot meta favors strategies played by P (e.g., late-game death-ball style play). These estimates are also anchored to the average skill level; at the tails of the skill distribution, the effects vary in size.

Methodology:

First, let’s dump the data from here into Excel -> CSV -> R. I had to add the races manually, though. I’m not really convinced that data formats other than CSV ought to even exist, but that’s just me.

Here’s the meat of it:

1. Get each bot’s overall win rate for each match. (Here a match result is the average of its games, so it can be 100%, 50%, or 0%.)
2. Assume this win percentage is a good proxy for the bot’s overall skill level. This needs a bit of explanation. It is already known that the Locutus family is strong, for example, and those bots are likely to win most match-ups regardless of race (90%+). We also know they always play Protoss. So let’s decompose the win rates into the portion approximately due to bot skill and the portion due to race match-ups. We need some proxy for overall skill because we are trying to compare win rates against each race after excluding any effect of skill.
3. Estimate the following:

$MatchWinPct = \beta_p * ProtossOpponent + \beta_t * TerranOpponent + \beta_z * ZergOpponent + \beta_s*SkillProxy + \epsilon$

Where the coefficients of interest are $\beta_p$, $\beta_t$, and $\beta_z$. Each of these three coefficients represents how much better or worse than its average win percentage a given bot is expected to perform against an opponent of the corresponding race. Here are the results:

```
Call:
lm(formula = match_pct ~ skill_level + factor(opp_race) - 1,
    data = melted_data)

Residuals:
    Min      1Q  Median      3Q     Max
-90.911 -27.440   1.096  29.510  93.088

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
skill_level        0.99908    0.03827  26.107   <2e-16 ***
factor(opp_race)P -5.57628    2.37514  -2.348   0.0190 *
factor(opp_race)T  4.62700    2.38244   1.942   0.0523 .
factor(opp_race)Z  1.32735    2.47510   0.536   0.5918
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 37.52 on 1976 degrees of freedom
Multiple R-squared:  0.6814,    Adjusted R-squared:  0.6808
F-statistic:  1057 on 4 and 1976 DF,  p-value: < 2.2e-16
```

First, we see skill level is strongly significant and almost exactly 1: if a bot’s overall win percentage goes up by 1 point, its expected score in each match goes up by about 1 point. This is to be expected.

Next, the coefficient on P is -5.6. That suggests that when a given bot (A) plays a Protoss opponent, bot A’s match score is about 5.6 points lower than its typical skill level would suggest, and this difference is significantly different from 0 (p ≈ 0.019).

Next, the coefficient on T is +4.6. That suggests that when bot A plays a Terran opponent, its match score is about 4.6 points higher than its typical skill level would suggest. This difference is significant at the 10% level (p ≈ 0.052), just short of the conventional 5% cutoff.

The coefficient on Z is positive but small and insignificant: bots tend to do about average, maybe slightly better than average, against Z opponents. (It may still be significantly different from the P or T coefficient, but since I dropped the constant, comparing coefficients is outside the scope of this quick post.)
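As a worked example using the estimates above, consider a hypothetical bot with an overall skill level of 60%:

$\widehat{MatchWinPct}_{vs\,P} = 0.999 \times 60 - 5.58 \approx 54.4\%$

$\widehat{MatchWinPct}_{vs\,T} = 0.999 \times 60 + 4.63 \approx 64.6\%$

So the same bot is predicted to score roughly 10 points higher against a Terran opponent than against a Protoss one.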

A note about R²: R-squared only measures predictive power, and we do not care about predictive power in this context, only the estimated coefficients. As long as the coefficients are accurate to the real world, a good deal of predictive accuracy can be sacrificed.

Robustness:
I have also redone this with a logit model, which is slightly more accurate at the far tails. (In a journal article on this subject, we’d probably focus more on this result.) But here we can simply give some examples. The estimated race effect is larger near the bottom of the skill distribution and smaller near the top. Example: given a bot with the skill level of PurpleWave (90.09%), the expected win rate moves about +5% when switching from a P opponent to a T opponent. Given a bot with the skill level of SijiaXu (40.91%), its expected win rate moves about +12% when switching from a P to a T opponent.
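To see why the effect shrinks at the top of the skill distribution, write the logit prediction explicitly (using $\Lambda$ for the logistic function and the same coefficient names as above):

$\Delta = \Lambda(\hat\beta_s \cdot Skill + \hat\beta_t) - \Lambda(\hat\beta_s \cdot Skill + \hat\beta_p), \qquad \Lambda(x) = \frac{1}{1 + e^{-x}}$

Because $\Lambda$ is steepest near 50% and flattens out near 0% and 100%, the same gap $\hat\beta_t - \hat\beta_p$ on the log-odds scale translates into a larger probability shift for a mid-skill bot like SijiaXu than for a top bot like PurpleWave.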

Conclusion:

After controlling for the skill level of individual bots (which is the overwhelmingly dominant factor in game wins), we find some modest evidence that race is a relevant factor in match-up performance. In general bots under-perform against P opponents and over-perform against T opponents, but the effect is small at the higher end of the skill distribution.

Code:

```r
library(reshape)

# Get the data - copy/paste from the online table into Excel, then save as CSV.
# Note: spaces in column names cause some rudeness in RStudio with autocomplete,
# so I did a find-and-replace in the CSV beforehand.
# (Filename assumed here; load the CSV before the lines below.)
SSCAITdata <- read.csv("SSCAITdata.csv", stringsAsFactors = FALSE)
SSCAITdata <- SSCAITdata[, -c(1, 4, 5)]
# View(SSCAITdata)
melted_data <- reshape::melt(SSCAITdata, id.vars = names(SSCAITdata)[1:3])

# Separate the wins and losses, stored as "W-L" strings
melted_data$wins   <- as.numeric(sapply(strsplit(as.character(melted_data$value), "\\-"), "[", 1))
melted_data$losses <- as.numeric(sapply(strsplit(as.character(melted_data$value), "\\-"), "[", 2))

melted_data <- melted_data[!is.na(melted_data$Race), ]
present_elements <- !is.na(SSCAITdata$Race)
melted_data$opp_race <- rep(SSCAITdata$Race[present_elements], each = sum(present_elements)) # make sure the races line up
names(melted_data)[1] <- "own_race"
melted_data <- melted_data[!is.na(melted_data$value), ] # drop games with no opponent

melted_data$match_pct <- melted_data$wins / (melted_data$wins + melted_data$losses) * 100
skill_level <- aggregate(match_pct ~ Bot, data = melted_data, mean)
names(skill_level)[2] <- "skill_level"
melted_data <- merge(melted_data, skill_level, all.x = TRUE)
melted_data$outsized_performance <- melted_data$match_pct - melted_data$skill_level
melted_data$matchup_type <- paste0(melted_data$own_race, "v", melted_data$opp_race)
# View(melted_data)
aggregate(outsized_performance ~ matchup_type, data = melted_data, mean)
# Compare to below:
summary(lm(match_pct ~ skill_level + factor(own_race):factor(opp_race) - 1, data = melted_data))

# With intercept terms, like most people expect
summary(lm(match_pct ~ skill_level + factor(opp_race), data = melted_data))
summary(lm(match_pct ~ skill_level + I(skill_level^2) + I(skill_level^3) + factor(opp_race), data = melted_data))
summary(glm(I(match_pct / 100) ~ skill_level + factor(opp_race), data = melted_data, family = binomial(link = "logit")))

# Without intercept terms, for clean presentation
summary(lm(match_pct ~ skill_level + factor(opp_race) - 1, data = melted_data))
summary(lm(match_pct ~ skill_level + I(skill_level^2) + I(skill_level^3) + factor(opp_race) - 1, data = melted_data))
summary(glm(I(match_pct / 100) ~ skill_level + factor(opp_race) - 1, data = melted_data, family = binomial(link = "logit")))

melted_data$prediction <- predict(glm(I(match_pct / 100) ~ skill_level + factor(opp_race) - 1,
                                      data = melted_data, family = binomial(link = "logit")),
                                  melted_data, type = "response")
View(melted_data)
```