Yesterday I discussed the first simplistic step in building my KBO model. Given my reservations after its first performance, I skipped action today, but for completeness: it again recommended all five dogs, three home and two visitors. The results: 2-3. Progress!
The most important part of my MLB model was the adjustments made to win expectancy based on who was slated to start each game. Given that I still don’t have a path forward on that, I wanted to revisit home field advantage. As a recap, I could not find anything online about an established KBO home field advantage, so I figured I’d try to calculate it myself.
I found the KBOPrediction Python repo on GitHub. While the stated objective of the code is to “employ the notion of Deep Learning to predict” individual game results, the only part I care about here is the scraping and the legwork on translating to English. It's true that I too want to predict the results of individual games, but this model's approach just relies on the previous k games for each team, which isn't as sophisticated as I need.
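The "previous k games" idea is simple enough to sketch. Here's a minimal, hypothetical version of that kind of feature, assuming the scraped data arrives as a chronological list of per-game stat dicts (the field names here are my invention, not the repo's):

```python
from collections import deque

def rolling_stats(game_results, k=5):
    """Average a team's stats over its previous k games.

    `game_results` is a hypothetical list of per-game dicts in
    chronological order, e.g. {"runs_scored": 5, "runs_allowed": 3}.
    Returns one feature dict per game, or None until k games of
    history exist.
    """
    window = deque(maxlen=k)
    features = []
    for game in game_results:
        if len(window) == k:
            # Average each stat over the trailing window.
            features.append({
                stat: sum(g[stat] for g in window) / k
                for stat in window[0]
            })
        else:
            features.append(None)  # not enough history yet
        window.append(game)
    return features
```

That's the whole signal: recent form, with no notion of who's pitching, which is exactly why it falls short for my purposes.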
After doing some hacking, I was able to run the scraper through a number of seasons to get the record of home and away teams. The results:
| Season | Home Team Wins | Visiting Team Wins | Total Games | Home Field Advantage |
|--------|----------------|--------------------|-------------|----------------------|
Well, shoot. I used a flat 8% for home field advantage in the model, and that’s what it’s been in the KBO on average over the last four seasons. I was trying to go back 10 seasons, but the scraper keeps breaking in 2015. I’ll see if I can fix that later.
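For concreteness, here's how I'd turn those season win counts into a home field advantage number. The formula is my assumption (home win share minus visitor win share, ignoring ties), and the counts below are made-up illustrations, not the scraped totals:

```python
def home_field_advantage(home_wins, away_wins):
    """Home win share minus visiting win share, ignoring ties.

    If home teams win 54% of decided games, HFA is
    0.54 - 0.46 = 0.08, i.e. the flat 8% used in the model.
    This definition is an assumption; pick whichever one your
    model's adjustment actually expects.
    """
    total = home_wins + away_wins
    return (home_wins - away_wins) / total

# Hypothetical season totals, not the actual scraped numbers:
print(home_field_advantage(378, 322))  # → 0.08
```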
Another thing to address later is the number of games in my sample. For some reason, I'm short on games in 2019. It looks like I have roughly the right number of games for 2018, and it seems the playoffs are included in the 2016-2017 numbers. The scraper takes years and months as input, so maybe I need to be more rigorous about which games to grab.
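One way to be more rigorous than year/month granularity: keep a per-season date window and filter scraped games against it. A minimal sketch, where the date ranges are placeholders (the actual KBO regular-season windows vary by year and would need to be looked up):

```python
from datetime import date

# Hypothetical per-season windows (start, end) of the regular season.
# These dates are illustrative, not verified.
REGULAR_SEASON = {
    2019: (date(2019, 3, 23), date(2019, 10, 1)),
}

def in_regular_season(game_date, season_windows=REGULAR_SEASON):
    """True if game_date falls inside its year's regular-season window."""
    start, end = season_windows[game_date.year]
    return start <= game_date <= end
```

Filtering each scraped game through a check like this would keep playoff games out of the 2016-2017 counts without having to guess at month boundaries.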
Previously: KBO Model Day 1