KBO Model Day 9: Home Field Advantage Completed

I’m still in a productivity lull on the KBO model itself, but I’ve done more background work to wrap my head around the game. Things like:

  • Some BaseRuns calculations plus Pythagorean expectation for win expectancy. I did this by manually grabbing data from Baseball Reference, but I can automate it because…
  • I built a scraper for Baseball Reference that grabs all KBO team stats (batting, pitching, and defense) for 2019 and 2020, which will feed my bottom-up win expectancy
  • I’ve been reviewing my old MLB model more closely to game-plan my approach for KBO Model 2.0
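For reference, here is a minimal sketch of the two calculations from the first bullet. It assumes one simple published BaseRuns variant (A * B / (B + C) + D) and a Pythagorean exponent of 1.83; neither is confirmed as the exact form used for this model, and the function and parameter names are made up for illustration.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate runs scored with a simple BaseRuns variant (an assumption)."""
    a = h + bb - hr                                        # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02    # advancement factor
    c = ab - h                                             # outs
    d = hr                                                 # runs that always score
    return a * b / (b + c) + d


def pythag_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Pythagorean win expectancy; the exponent is a tunable assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)
```

A bottom-up estimate would feed `base_runs` output for offense and for opponents into `pythag_win_pct`; the exponent is worth refitting for the KBO run environment rather than taking the default on faith.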

One thing that has been bugging me is all the missing games in my home field advantage modeling. I did some digging and discovered one thing I expected and another that I did not.

As expected, the scraper I used as the core of my system couldn’t handle doubleheaders, though it failed in a different way than I anticipated. The scraper takes the league’s daily schedule, grabs some attributes, then builds the game recap IDs from those attributes. There were two issues with this:

  1. The daily schedule page didn’t even list the second game of a doubleheader.
  2. The factory that built the game IDs assumed one component of the ID was always 0. However, this value was 1 for the first game of a doubleheader and 2 for the second, so the scraper didn’t pick up either game.

I resolved this by grabbing results from each team’s individual results page, then deduping.
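The dedupe step is needed because every game shows up twice when you pull per-team results pages: once on the home team’s page and once on the visitor’s. A minimal sketch, assuming each scraped row has already been normalized into a record like the hypothetical `GameResult` below (the field names and the `game_no` convention are illustrative, not the scraper’s actual schema):

```python
from typing import Iterable, List, NamedTuple


class GameResult(NamedTuple):
    date: str       # ISO date string, e.g. "2019-05-27"
    home: str       # home team name
    away: str       # visiting team name
    game_no: int    # 0 = single game, 1 or 2 = doubleheader game
    home_runs: int
    away_runs: int


def dedupe_games(rows: Iterable[GameResult]) -> List[GameResult]:
    """Keep one record per (date, home, away, game_no)."""
    seen = {}
    for row in rows:
        key = (row.date, row.home, row.away, row.game_no)
        seen.setdefault(key, row)  # first occurrence wins
    return sorted(seen.values(), key=lambda g: (g.date, g.home, g.game_no))
```

Including `game_no` in the key is the important part: without it, the two games of a doubleheader would collapse into one.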

Unexpectedly, the scraper keeps its list of teams as a constant, and one of the 2019 teams was missing from it: the Kiwoom Heroes, who were the Nexen Heroes prior to 2019. I fixed this by adding the team’s current name in both Korean and English to the constants module.
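One way to guard against this kind of silent drop is to map every name a franchise has used to a single key and fail loudly on anything unrecognized. This is only a sketch: `TEAM_ALIASES` and `canonical_team` are hypothetical names, not the scraper’s real constants, and only the Heroes entries are shown.

```python
# Hypothetical constants: each known name (English or Korean) maps to one franchise key.
TEAM_ALIASES = {
    "Kiwoom Heroes": "heroes",
    "키움 히어로즈": "heroes",
    "Nexen Heroes": "heroes",
    "넥센 히어로즈": "heroes",
}


def canonical_team(name: str) -> str:
    """Return the franchise key, raising instead of silently skipping the team."""
    try:
        return TEAM_ALIASES[name.strip()]
    except KeyError:
        raise ValueError(f"Unknown team name: {name!r}") from None
```

Raising on an unknown name turns a rename like Nexen to Kiwoom into an immediate error rather than a season of quietly missing games.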

I’ve re-run the data and now have 410 more games for the decade than before. Overall, it pushes the KBO home field advantage to around 5% for the decade (bringing up the numbers that were lower than that and bringing down those that were higher). I’ve also included 2020’s numbers through today’s games.

KBO Home Field Advantage, 2010-2020 (through 5/27)


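| Season | Home Team Wins | Visiting Team Wins | Total Games | Home Field Advantage |
|--------|----------------|--------------------|-------------|----------------------|
| 2010   | 264            | 268                | 532         | -0.75%               |
| 2011   | 276            | 248                | 524         | 5.34%                |
| 2012   | 271            | 261                | 532         | 1.88%                |
| 2013   | 318            | 309                | 627         | 1.44%                |
| 2014   | 304            | 279                | 583         | 4.29%                |
| 2015   | 411            | 373                | 784         | 4.85%                |
| 2016   | 422            | 379                | 801         | 5.37%                |
| 2017   | 410            | 363                | 773         | 6.08%                |
| 2018   | 412            | 347                | 759         | 8.56%                |
| 2019   | 420            | 339                | 759         | 10.67%               |
| 2020   | 56             | 46                 | 102         | 9.80%                |
| Totals | 3,564          | 3,212              | 6,776       | 5.19%                |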
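
The percentages above are consistent with home field advantage defined as (home wins minus visiting wins) divided by total games, where total games is just the sum of the two win columns (so ties appear to be excluded). That definition is inferred from the numbers rather than stated anywhere, so treat this as a sketch; the function name is made up for illustration.

```python
def home_field_advantage(home_wins: int, away_wins: int) -> float:
    """Return (home wins - away wins) / decided games as a fraction."""
    total = home_wins + away_wins
    if total == 0:
        raise ValueError("no decided games")
    return (home_wins - away_wins) / total


# Spot-check against two rows of the table.
print(f"2010:   {home_field_advantage(264, 268):.2%}")
print(f"Totals: {home_field_advantage(3564, 3212):.2%}")
```

Under this definition the value is the gap between home and visiting win percentages, not the home win percentage above 50%, so a 5% figure corresponds to the home team winning roughly 52.5% of decided games.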