How SaberSim’s MLB Model Works

Transcript

Jordan:
All right, what is going on everybody? Welcome back to another edition of SaberSim’s strategy sessions here. I’m joined by Matt and Wil today. We’re going to be talking a little bit about our baseball model, really kicking into full gear here in the baseball season as we’re getting into the heat of the summer. So I know there’s been a lot of questions, a lot of interest on the baseball side of things lately, especially as it relates to the model. So wanted to put this together for you guys to talk about a little bit.

Jordan:
But if you’re new to these strategy sessions, we host these live every Thursday here at 2:00 PM Eastern, and we are doing these live for a reason. So we want engagement from you guys. We want questions. If we say something that’s kind of confusing or you’re not totally clear on, or you just have something to add to the conversation, throw it into YouTube chat, or throw it into our Office hours channel on Slack and we will get right to it. So without much further ado here, I know we got a lot to talk about, but, Matt, Wil, how are you guys?

Matt:
Doing good.

Wil:
Yeah, doing pretty good.

Jordan:
Good, right on. Well, I’ll go ahead and just kick this off here, Matt. Do you want to just kind of start with maybe a high-level overview of how our model works and how our simulator is working for baseball?

Matt:
Yeah, for sure. So kind of a background, basically I built SaberSim with just baseball. So that was like 2015, I think. I started writing a baseball simulator. So baseball is kind of our home base. I always considered that like my baby with SaberSim. And I think we’ve gotten really strong in a lot of the other models, but it’s exciting to talk about MLB right now because that’s where it started. And I think we have a really strong model and even stronger now.

Matt:
But yeah, basically how it works for those, especially for those that are new to SaberSim, that Sim part of SaberSim is not just the name. We actually simulate all of the games for the day thousands of times. And the way that we do that is we basically analyze the past 10 years of play-by-play data from the MLB, from AAA, AA, all the Minor Leagues, we analyze all of that data and we kind of put it into this really complex model that takes all the factors, from ballpark to umpire, to batter and pitcher, to all this other stuff. Then we come up with basically probabilities for all of these different factors that go into the game. And we have all these probabilities for the factors and then we plug that into the simulator. And so we’d say, “Let’s simulate this game play-by-play thousands of times.” So we’re not just doing this theoretical Sim, we’re actually simulating the game. We’ll take these probabilities and we’ll say, “All right, let’s choose a random event based on these probabilities. It’s going to be a single. Now there’s a guy in first, let’s do the next play.”

Matt:
And so we’re really getting the most accurate results possible. And in terms of not only the average projection, but the full distribution of events. So yeah, we analyze all that play-by-play. We simulate the games and then the result, the projections that you see on SaberSim are basically just the average of all of those simulations. But when you run builds, it takes into account all of the distributions, all of the different results that can happen in the simulations. So we’re not just looking at the mean projection. When you’re running lineups, you’re looking at all of the different simulated outcomes.
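
The play-by-play idea Matt describes can be sketched in a few lines. This is only a toy illustration, not SaberSim's simulator: the event probabilities, the `simulate_half_inning` and `simulate_team_runs` names, and the base-running rule are all assumptions made up for the example.

```python
import random

# Hypothetical plate-appearance probabilities for one lineup. In the process
# described above these would come from the model (park, umpire, batter, pitcher...).
EVENT_PROBS = {
    "out": 0.68, "walk": 0.08, "single": 0.15,
    "double": 0.05, "triple": 0.005, "home_run": 0.035,
}
BASES_PER_HIT = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def simulate_half_inning(rng, probs=EVENT_PROBS):
    """Play one half-inning event by event and return the runs scored."""
    events, weights = zip(*probs.items())
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    while outs < 3:
        event = rng.choices(events, weights=weights)[0]
        if event == "out":
            outs += 1
        elif event == "walk":
            # Only forced runners advance on a walk.
            if all(bases):
                runs += 1
            elif bases[0] and bases[1]:
                bases[2] = True
            elif bases[0]:
                bases[1] = True
            bases[0] = True
        else:
            # Simplification: every runner advances as many bases as the batter.
            n = BASES_PER_HIT[event]
            advanced = [False, False, False]
            for i, occupied in enumerate(bases):
                if occupied and i + n >= 3:
                    runs += 1
                elif occupied:
                    advanced[i + n] = True
            if n == 4:
                runs += 1
            else:
                advanced[n - 1] = True
            bases = advanced
    return runs


def simulate_team_runs(rng, innings=9):
    """Runs scored by one team over a full game."""
    return sum(simulate_half_inning(rng) for _ in range(innings))


rng = random.Random(42)
sims = [simulate_team_runs(rng) for _ in range(2000)]
mean_projection = sum(sims) / len(sims)                 # the single "projection"
blowout_rate = sum(r >= 10 for r in sims) / len(sims)   # one slice of the full distribution
```

The list `sims` is the interesting part: the mean is what shows up as a projection, while the individual entries are the range of outcomes a lineup build can draw on.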

Jordan:
Right yeah. And I think that’s part of really is what is so powerful here. And we talk a lot on Office Hours about looking at some of those ranges of outcomes when you click the player name. Can you talk a little bit more about how we leverage those using Smart Diversity, for people that maybe aren’t so familiar?

Matt:
Yeah, for sure. So basically how Smart Diversity and how the builder works is, we essentially bin the simulations. So we kind of group them randomly. So we’re taking random samples of different groups of simulations, and how big those groups or bins are changes based on the Smart Diversity setting. So a simple way of explaining it is, when you have Smart Diversity all the way at zero, then every lineup is using the full range of simulations to build an individual lineup. So we’re just taking the average of all the Sims, which is just the same as the projection.

Matt:
When you have Smart Diversity at the very other end at 10, at the maximum setting, then every single lineup is looking at one simulation. So those are the ranges of that setting. And that’s obviously a huge range because one of them is just taking the average, that’s just the mean projection. The other one is taking every single lineup is looking at one simulation. And so in that sense, when you run a build with maximum Smart Diversity, what you’re doing is saying, “Give me the optimal lineup from, however many, 1,500 different simulations of this slate.” And that’s a really cool way of thinking about it, because especially for something like showdown or small slates, where you really need to get the optimal in order to win, in order to get first, that’s what you want to do. It’s like you’re saying, “I want the optimal lineup in a particular outcome of these games.”

Matt:
All those middle values between zero and 10, or between one and nine, are just different amounts of diversity, different ways of adding what we call Smart Diversity. It’s adding this smart randomness where it’s not just random variance between lineups, it’s actually the true variance of the players. And that variance is also correlated. If you’re taking bins of say five simulations at a time, if in those five simulations the Yankees score an average of 10 runs, then all of the Yankees are going to have higher point values in that Sim. So we’re the only builder that does this, where our randomness is correlated and it’s using real distributions, not just a random integer added or a normal distribution. It’s like the true distribution of outcomes. So, that’s how our lineup builder really leverages the simulator to kind of create lineups in a way that no one else does.
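
The binning idea can be illustrated with a small sketch. Everything here is an assumption for illustration: the function names are hypothetical, the bin size is passed in directly because the mapping from the 0 to 10 setting to a bin size is not described, and the "lineup" is just the top players by points, ignoring salary and positions.

```python
import random
from statistics import mean


def binned_projections(sims, bin_size, rng):
    """Randomly group simulations into bins and average each player's points per bin.

    sims: list of dicts mapping player name -> fantasy points in one simulated slate.
    bin_size == len(sims) reproduces the plain mean projection;
    bin_size == 1 makes every bin a single simulated outcome.
    """
    order = list(range(len(sims)))
    rng.shuffle(order)
    bins = [order[i:i + bin_size] for i in range(0, len(order), bin_size)]
    return [{p: mean(sims[i][p] for i in b) for p in sims[0]} for b in bins]


def top_k_lineup(projection, k):
    """Stand-in for a real optimizer: take the k highest-scoring players."""
    return sorted(projection, key=projection.get, reverse=True)[:k]


# Two toy simulations: the Yankees hitters go off together in the first one.
toy_sims = [
    {"NYY_1": 14.0, "NYY_2": 12.0, "BOS_1": 3.0},
    {"NYY_1": 2.0, "NYY_2": 4.0, "BOS_1": 11.0},
]
rng = random.Random(7)
lineups = [top_k_lineup(proj, 2) for proj in binned_projections(toy_sims, 1, rng)]
```

Because each bin keeps whole simulations together, teammates who scored together in a simulated game stay high together in that bin, which is the correlated randomness described above.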

Jordan:
Right, cool. Yeah, no, I mean, it’s obvious, it’s a big advantage to actually look at real possible outcomes instead of applying this randomness to your projections. I think it’s clear that the strength of Smart Diversity rests on the strength of the model itself then in terms of actually being accurate there. And I know that you’ve been doing a lot of work on kind of digging in this season to test that accuracy. And can you talk a little bit maybe about some possible biases we’ve uncovered in the past and what that looks like and what our manual review process looks like for finding those?

Matt:
Yeah, so a big reason we wanted to jump on this session today is to really talk about how Wil and I have been working really closely this whole season, and before the season, on improving the MLB model. And one of the main issues coming into the season, or biases that I wanted to address, was how we handled Minor Leaguers. So for many years we’ve included this Minor League data in our model, but I’ve noticed, including in some analysis, that we were not really accounting for Minor League statistics in the best way. Sometimes we would overrate players that had really, really excellent Minor League numbers. Or we would underrate players that had really bad Minor League numbers. I think the reason is that it’s very difficult to make that comparison when a player goes from the minors to the majors, because when they’re in the minors, they’re mostly just facing Minor League pitchers, and we don’t really know how good their opposition is. We only know their stats in the minors.

Matt:
So what I really wanted to do was kind of work on a better way of translating Minor League statistics to the majors. And it’s not as simple as just taking their Minor League stats and saying, “All right, let’s multiply by 0.9, and that’s our new stats.” Because the way the model works is it’s taking all of this play-by-play data, and it’s going one play at a time through every single game. We have tens of thousands of games, maybe hundreds of thousands. So, Wil and I really worked a lot on talking high-level theory about how to fix this bias of kind of overrating good Minor Leaguers and underrating bad Minor Leaguers, and Wil can talk a little bit more about how we did this. But that was sort of the bias that we found and we wanted to address.

Matt:
The other thing kind of separate from that is just, for the past few years before this year, pretty much everything about SaberSim was automated and there was very little manual input into the model. And that’s what we wanted. That’s what I intended. Because whenever you add a manual intervention, there’s the possibility of adding your own personal bias, right? And adjusting things when they shouldn’t be adjusted. But one thing that we’re doing a lot differently is we’re trying to identify spots where, okay, the model isn’t perfect, no model is perfect, so it’s going to have some biases and we want to identify those and make those adjustments so that we’re getting the most accurate results, but we’re doing it in an objective way.

Matt:
So we know like, “Hey, this park moved their fences back 10 feet this year, we’re going to apply a park adjustment.” Or we know that the sticky substances ban for pitchers is now going to be affecting strikeout rates. We’re going to apply an adjustment to the model to account for that, because that’s not something that any model is really going to be able to do on its own.

Matt:
So, and then stuff like pitch counts, where a pitcher might be coming back from injury on a pitch limit. We’re doing a lot more of that little manual intervention that’s still based on objective data, but it’s really improving the Sim where the model isn’t necessarily perfect. But yeah, I mean, maybe Wil can talk more about the Minor League stuff that I was talking about and more of the specifics of how we improved the model based on that.

Wil:
Yeah. So our biggest challenge, like Matt said, was basically, how do you take a Minor Leaguer who has a really great home run rate in AAA or something like that, and how do you adjust that for them facing better pitchers, but also pitchers that are throwing harder, and different parks, and how the fences are different, closer or farther back, and everything like that? So our approach ultimately kind of came down to analyzing how the leagues have interacted before. So looking at players that have both AA and AAA experience and how their stats have shifted between levels. And figuring out basically how we can project and how we can adjust each of those stats to a different league.

Wil:
So if we have a AA prospect, we can, maybe not confidently, but we can estimate what his stats would look like in AAA. And if we can do that, then we can estimate what the stats would look like in the Major Leagues. So, for example, Clayton Kershaw in AA would be unhittable. But with someone like Wander Franco, who just came up, we might have been a little bit more bearish on him than others might have been looking just at his Minor League stats, because there’s an adjustment. There’s more home runs hit in the Major Leagues, but the contact rate is going to decrease for batters that are coming up from a league where they’re facing worse pitchers.

Wil:
And their strikeout rate is probably going to increase because they’re facing harder-to-hit stuff. So, that’s the bulk of what we did. We tested a lot of different methods and I’m really happy with not only the implementation that we have, but also the results that it’s gotten.
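
The chaining Wil describes (estimate AA to AAA, then AAA to the majors) can be shown with a deliberately simplified sketch. Matt is clear that the real adjustment is not a flat multiplier and runs through play-by-play data, so treat this only as an illustration of the chaining: the step factors, stat names, and function names below are all made up.

```python
# Hypothetical odds multipliers for moving up ONE level. These numbers are
# invented for illustration and are not SaberSim's values.
STEP_FACTORS = {
    ("AA", "AAA"): {"strikeout": 1.10, "contact": 0.95},
    ("AAA", "MLB"): {"strikeout": 1.15, "contact": 0.92},
}
LEVELS = ["AA", "AAA", "MLB"]


def adjust_rate(rate, odds_multiplier):
    """Scale a probability on the odds scale so the result stays between 0 and 1."""
    odds = rate / (1 - rate) * odds_multiplier
    return odds / (1 + odds)


def translate(rates, from_level, to_level="MLB"):
    """Apply each one-level step in turn, e.g. AA -> AAA -> MLB."""
    start, end = LEVELS.index(from_level), LEVELS.index(to_level)
    out = dict(rates)
    for lower, upper in zip(LEVELS[start:end], LEVELS[start + 1:end + 1]):
        for stat, multiplier in STEP_FACTORS[(lower, upper)].items():
            if stat in out:
                out[stat] = adjust_rate(out[stat], multiplier)
    return out


aa_prospect = {"strikeout": 0.20, "contact": 0.80}
mlb_estimate = translate(aa_prospect, "AA")
```

In this toy version the strikeout rate rises and the contact rate falls at each step up, matching the direction described above; the size of each step is the part that would have to be learned from players who appeared at both levels.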

Jordan:
Yeah, speaking on those results. I know we’re not really just applying these fixes and waiting for future results to come in or looking just at anecdotal data. We’re actually doing some rigorous backtesting here to test our findings. Do you want to talk a little bit about what that looks like?

Wil:
Matt, do you want to take that one, or?

Matt:
Yeah, so basically just last night and this morning, we kind of finished up a really big round of improvements. We’re just doing some last tinkers on the model. Not that this is the final thing, we’re still working on stuff. But we pushed out kind of a new version of the model a couple of days ago. And then last night and this morning we ran basically a backtest of this entire season. So we re-simulated every single game of the season using the updated model improvements that we’ve been talking about. And so then we just ran some results comparing our game projections to Vegas lines, and it’s a pretty remarkable improvement. So before, with kind of the current simulations, we were actually slightly negative comparing our moneyline performance to Vegas closing lines, which honestly is like, that’s kind of expected, right?

Matt:
So Vegas closing lines are basically as accurate as you get in terms of predicting the outcome of a game. It’s very difficult to be more accurate than closing lines because they’re accounting for all of these really sharp bettors that are putting money on these lines. So being slightly negative, I think is fine. And I wouldn’t have been worried about that, but we reran all these Sims. And we’re now way, way up looking at our game projections versus closing lines, which is amazing. It’s just really, really thrilling. I think we saw these results and Wil and I were, kind of our jaws dropped. So it’s really cool. We’re still kind of running some more analysis, not just on moneyline, but on the total bets, on run line bets. And then we’re going to really look into individual statistics and kind of look at, do we still have any biases? It seems like we still kind of have a bias towards the under for a lot of games.

Matt:
And so we’re going to look into, well, do we need to adjust certain statistics to get a little bit closer to the total and not be under? But there also might be a legitimate reason to be under, I think. I know I’ve talked to Andy, our CEO, about this before, but there’s actually just a bias in general for the lines that unders tend to be better bets than overs overall, because people like betting overs. And so more money tends to be on the overs, and that kind of moves the lines up a little bit. So I think there’s a little bit of a natural edge in betting unders. We might still be a little bit too far under, so we’re still kind of continually looking at the results, but even so, we’re way up on even under bets, on over bets, on moneyline. All of these backtests are looking amazing based on all of these improvements we’ve done.
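
One way to picture a moneyline backtest against closing lines is below. This is a sketch under stated assumptions, not the analysis Matt ran: the flat one-unit staking rule, the `backtest_roi` name, and the sample games are invented, and the vig is not removed from the implied probabilities. The American-odds conversions themselves are standard.

```python
def implied_prob(american_odds):
    """Break-even win probability implied by American odds (vig not removed)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def profit_per_unit(american_odds, won):
    """Profit on a 1-unit stake at the given American odds."""
    if not won:
        return -1.0
    return 100 / -american_odds if american_odds < 0 else american_odds / 100


def backtest_roi(games):
    """games: iterable of (model_home_win_prob, home_close, away_close, home_won).

    Bet one unit on whichever side the model rates above the closing price.
    """
    staked = profit = 0.0
    for p_home, home_ml, away_ml, home_won in games:
        if p_home > implied_prob(home_ml):
            staked += 1
            profit += profit_per_unit(home_ml, home_won)
        elif 1 - p_home > implied_prob(away_ml):
            staked += 1
            profit += profit_per_unit(away_ml, not home_won)
    return profit / staked if staked else 0.0


# Made-up games purely to exercise the function.
sample_games = [
    (0.65, -150, 130, True),    # model likes the home favorite, and it wins
    (0.40, -120, 100, True),    # model likes the away side, and it loses
    (0.58, -150, 130, False),   # no edge on either side, so no bet
]
roi = backtest_roi(sample_games)
```

Being "slightly negative" in a test like this is roughly what betting into the vig with no edge looks like, which is why a clearly positive result against closing lines is notable.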

Jordan:
Yeah, that’s awesome. I know we had Max here on the show for our first strategy session actually, where we were talking about sports betting and he mentioned the same thing with betting unders and how he found that those were some of his best bets overall in baseball in particular. So I am interested, I know you had mentioned about this manual review process and how, when you first started putting together SaberSim that you wanted to avoid manual intervention to avoid personal biases from leaking in, can you maybe expand a little bit on what the manual review process is looking like and how we’re avoiding kind of leaking some of that personal bias in? Or what kind of things are we using for indicators that maybe something does need a little bit of intervention?

Matt:
Yeah. So the main things that we’re looking at right now for that manual intervention, so first thing is pitch counts. So what we’re doing now is we’re kind of taking the pitch counts in the Sim, like in the morning runs of the Sims, and we’re comparing that to basically the industry. And then just looking for places where our pitch counts are off from the rest of the industry or off from prop bets. So we’ll look at a lot of sportsbooks that have prop bets for number of outs recorded. And that’s a good sense of the innings pitched for pitchers. And not that we want to be exactly matching Vegas props, because they’re pretty inefficient.

Matt:
But if there’s a really big difference, then that’s something where we can look into pitcher game logs and see, “Looks like this pitcher hasn’t gone in over a month, he’s coming back from an injury.” We might look into a beat writer article that says, “Hey, this pitcher threw 50 pitches in their last rehab outing.” That’s not something that’s going to be easily available statistically. You kind of have to have a little bit of a manual review that just looks at those pitchers and says, “This is somebody that’s not going to go as long as it says.” Or on the other side of things, maybe last time the pitcher was just coming back from injury, but now they’re fully stretched out and we want to bump up their pitch count because we know maybe the manager said, “Hey, they’re good to go. They’re not on the limit.”

Matt:
So the pitch count thing is really big. I think, especially for the DFS projections, that’s going to really help the accuracy of that. It’ll help with the betting as well. But with the projections it’ll really just dial those in. The other thing is, just looking at Vegas lines and kind of seeing where our projections differ from Vegas, and not just that, but how Vegas is moving. So, a really good indicator of edge I guess in a line is which direction the line is moving. So if the moneyline for a team starts at minus 150, and then later in the day it moves to minus 120, that probably means that there’s a lot of sharp money on the other side of that bet.

Matt:
And so if we’re moving in the opposite direction of how the line is moving, that generally means, “Hey, maybe we’re missing something about this game. Maybe there’s some key injury for a player, or a pitcher’s velocity has dropped a lot, or there’s this sticky substances thing where he’s actually expected to perform worse. Or there’s just some factor that we’re missing,” because the Sim and the models are not perfect. We’re going to miss something sometimes. And so having that indicator that’s like, “Hey, maybe there’s something off here,” and we can kind of make some adjustments to account for that, that’s how that part of it works.
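
The minus 150 to minus 120 example works out to the market's implied win probability for that team dropping from 60% to about 54.5%. A small sketch of turning that into a review flag follows; the `needs_review` rule and its name are assumptions for illustration, not a description of SaberSim's internal process, and the vig is ignored.

```python
def implied_prob(american_odds):
    """Break-even win probability implied by American odds (vig not removed)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def needs_review(open_ml, current_ml, model_prob):
    """Flag a team when the model sits on the opposite side of the line move.

    market_move < 0 means money has come in against the team since the open;
    model_gap > 0 means the model still rates the team above the current price.
    Opposite signs mean the model and the move disagree.
    """
    market_move = implied_prob(current_ml) - implied_prob(open_ml)
    model_gap = model_prob - implied_prob(current_ml)
    return market_move * model_gap < 0


open_p = implied_prob(-150)      # 0.60
current_p = implied_prob(-120)   # about 0.545
flagged = needs_review(-150, -120, model_prob=0.62)
```

A flag like this does not say who is right. It only marks a game where it is worth checking for an injury, a velocity drop, or some other factor the model has not seen.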

Jordan:
Gotcha. Yeah, it makes a lot of sense. And the goal here ultimately is that we are providing projections that are as accurate as possible for you as the user when you’re building your DFS lineups. But I guess while we’re thinking about this side of things, speaking as a SaberSim user, are there opportunities where I could do a little bit of this on my own or possibly use some of these signals on my own to add some value to my DFS process?

Matt:
Yeah, Wil, you want to take that one?

Wil:
Yeah. So I mean, basically Matt’s covered most of the events that would trigger us to manually intervene. And so I think most of the time, in those situations, we’re usually just going to move towards Vegas. That’s typically if we’re fairly sure that this is likely a result of, say, decreasing pitcher strikeout rates or something to that effect. And so if you disagree with that assumption or something to that effect, I think that’s a great place to make your own adjustment. So specifically, I remember the Giants changed their park recently. And so you may have a different view on how that affects maybe certain handed batters. So we may apply a global home run decrease or increase there, but you may want to boost right-handed batters or something to that effect, where you might be able to dial in more if you have sort of a different view than we do there.

Jordan:
Gotcha. And assuming I was tackling a slate, maybe it’s a big 14 or 15 games slate, and I only have so much time to prepare and research for the slate. Are there indicators that I could use to pick certain games or teams that maybe require a little bit of extra research? Would it be separation from Vegas or would you look more at the trends of the way the lines have moved? Or how would you recommend somebody go about finding maybe a couple of different spots to make some changes on a big slate?

Wil:
Matt, you want to go?

Matt:
Yeah. I mean, it’s definitely tough when you have a huge slate, because there’s so many teams and so many pitchers that you don’t want to spend all day. And really, I mean, all the stuff that we’re talking about is like, we’re doing that work for you. So I want to start off with, to be clear, that you don’t have to do that research. You can use the work that we’re putting in and you’ll have good results. That said, there are ways that you can add value. I think one thing, yes, is looking at the spots where we differ from Vegas, or especially where you notice that the line has really moved in one direction and we’re in the opposite direction. So say the line has moved way towards the Red Sox and our team total for the Red Sox is way under the implied.

Matt:
Maybe there’s something we’re missing there and we didn’t pick it up in our manual intervention, or maybe we decided, “Hey, actually I think we’re right here.” But you might think, “Hey, Vegas is probably right.” So you can kind of adjust the team totals on the main page to get closer to what Vegas is. In terms of pitchers, I think one way that you can really find some differentiation and find some edge is looking at some more detailed Statcast data, looking at pitchers that have maybe added new pitches or their velocity has changed a lot. Or maybe their spin rate. With someone like Gerrit Cole, where it’s like, “Oh, his spin rate’s a lot lower without using the Spider Tack or whatever it’s called.” Maybe that’s someone to target, to stack against for a contrarian play. I think there’s lots of stuff like that, and we’re doing our best to account for that kind of thing. But there’s always room to add to the model by looking at those detailed stats.

Matt:
I wouldn’t look big picture at like, “Oh, well, this guy has a high ERA, or this batter makes a lot of contact or he’s hit a lot of home runs recently.” Because that’s something that the model is really good at doing objectively. And I’m not going to be able to add value there any more than anybody else, because I’m not a computer. I built SaberSim a long time ago and Wil’s added a ton of value to it. And it’s this very complex algorithm, and the things that it does account for, it does really, really well. But when there’s other factors that might not be part of that model, I think that’s something where you can add some value. But yeah, I mean, on a big slate, maybe you just take a look at a few pitchers where you know that something might be up. Or you look at a few teams where there’s a big difference and you try to add some value there. You don’t have to go through every single player on the slate or every single team on the slate.

Jordan:
Right. Yeah, it makes a lot of sense. And I think overall it’s kind of reassuring to know that this manual review or at least a second look is taking place here for these things that are outside of the control of the simulator itself. So definitely good to know that one level of review here is kind of already taking place anytime you open up the app and pull up a slate that day, so. But there’s definitely some additional value that you can find there. I think there are some questions that have rolled in here now that we maybe can get to. I think this one came in, Matt when you were kind of talking about your high-level overview here.

Jordan:
This came in from [Materio 00:26:21] on Slack. And he asks, “Are your simulations pitcher-based or batter-based? The problem with creating an event, for instance, a strikeout pitch is both the pitcher and specific hitter probably influenced the probability of this pitching outcome.” I know, kind of maybe getting in a little bit deeper into the details here. But yeah, I mean, looking at a pitcher that has their own strikeout rates and a hitter that has their own strikeout rates, how are we going about kind of parsing that and coming up with probabilities?

Matt:
Yeah, that’s a really good question. And a very … It’s not a simple problem to solve, especially because it’s not just pitcher and batter, right? Those are not the only two factors when you’re determining what’s the probability of this plate appearance ending in a strikeout. You have the pitcher and the batter, you have the umpire, you have how hot is it out? Which park are they in? You have, what does the wind look like? What’s the temperature, I already said temperature, but there’s all these different factors that influence how the event happens. So it’s not pitcher-based or batter-based. What we do is we just have a formula that takes in all of these different factors. It can take anywhere from one to a really unlimited number of factors. And they all have their own probabilities of this event occurring.

Matt:
And those probabilities are based on all of this past data and how high-variance the stat is. So, some factors are going to be very stable where it’s like, we know that this is the probability for this certain factor. For other ones we don’t really know, so it would be really regressed towards the mean, but we take all of those factors and we basically just plug it into this formula. And it outputs the probability that this event will occur based on all of those factors and the league average. So the simple answer is, it’s not pitcher or batter-based, it’s based on all of the factors that go into the game and into a specific plate appearance. And so that’s where, I mean, I think that’s what makes it so powerful is that it’s not just any one thing and you can’t really get to the same conclusion just by looking with your eyes, because there’s just so many different steps that go into it.
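
Matt does not give the formula, so the sketch below swaps in a common, well-known approach: shrink each factor's observed rate toward the league average, then combine the factors as log-odds offsets from the league average. The `regress` and `combine` names, the stabilizer values, and the sample numbers are all assumptions for illustration.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


def regress(observed_rate, sample_size, league_avg, stabilizer):
    """Shrink a noisy observed rate toward the league average.

    Small samples (relative to `stabilizer`) end up close to the league average;
    large samples stay close to what was observed.
    """
    return (observed_rate * sample_size + league_avg * stabilizer) / (sample_size + stabilizer)


def combine(league_avg, factor_rates):
    """Combine any number of factor rates as log-odds offsets from league average."""
    offset = sum(logit(rate) - logit(league_avg) for rate in factor_rates)
    return inv_logit(logit(league_avg) + offset)


LEAGUE_K_RATE = 0.22  # hypothetical league strikeout rate per plate appearance
pitcher = regress(0.31, sample_size=600, league_avg=LEAGUE_K_RATE, stabilizer=150)
batter = regress(0.29, sample_size=200, league_avg=LEAGUE_K_RATE, stabilizer=150)
umpire = regress(0.24, sample_size=3000, league_avg=LEAGUE_K_RATE, stabilizer=5000)
k_prob = combine(LEAGUE_K_RATE, [pitcher, batter, umpire])
```

With this form a factor sitting exactly at league average contributes nothing, and a high-strikeout pitcher facing a high-strikeout batter lands above either individual rate, which is the kind of interaction the question was asking about.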

Jordan:
Yeah. It makes a lot of sense. I mean, when you’re watching the baseball game, literally with your eyes exactly as you said you think you’re watching this binary relationship between the pitcher and the batter and who’s going to win. In reality we have all these other factors that are coming into play and we’re taking into account, which I think is really a lot of the strength of the model and part of what makes it so cool here. I did see another question here that came in also from Materio, and this is an interesting one. He asked, “Can we backtest the new simulation model within the client ourselves for previous slates, or is the new model only applied to upcoming events?” I’m actually genuinely curious about that too. If you were to go back to a previous slate just for review and simulate it and build lineups, are you using our new model or is that retroactively?

Matt:
Yeah, it is using the old model, so we didn’t overwrite the results. And the reason is just, we don’t want to mess with previous builds. If you do a build on a previous slate using specific projections, we don’t want to overwrite those and make it seem like we’re changing our projections after the fact. So I think it’s just, we don’t want to change the projections and seem kind of sketchy that way. So it’s only on upcoming slates. We have run the backtest for all the previous Sims, those are just not live on the site, but this is live right now. The projections for tonight are using the new model.

Matt:
And really we’ve been using, it’s not like this is one new and improved model that we suddenly put into place. We’ve been iterating and improving on this for the past few weeks, or really just throughout the entire season. But I think especially the past few weeks we’ve really made the big gains. And so really today is just when we’ve finished this whole backtest, but the model has been improving and we’ve been seeing a lot of those changes throughout the past week or two especially.

Jordan:
Gotcha. Yeah. And definitely want to stress that this is 100% an iterative thing, right? This isn’t just our one big splash here where the model is what it is, we want to continue to improve on this over time. You mentioned here at the start that we are looking at maybe a potential minor under bias here, are there any other things that we’re looking at for some possible near-future improvements as we continue to get a little bit better here?

Matt:
Yeah. Wil, you want to talk about the decay rate stuff a little bit? Because I think that’s one area that we’ve been working-

Wil:
Yeah, so the couple of things that we’re really focused on currently after this big update of Minor League adjustments is the decay rate first of all, which is basically, how strongly are we going to weight recent stats versus their historical stats? So, if a batter has been in the league for 10 years, how much do we want to weight their 10-year sample size versus how much do we want to weight their last year or their last couple of weeks? And so that’s a problem where there’s no single answer. So that’s just something that we have to explore, test, and improve on.
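
A decay rate of this kind is often expressed as an exponential weight with a half-life. The sketch below assumes that form, which Wil does not specify; the `decayed_rate` name, the half-life values, and the sample data are hypothetical.

```python
def decayed_rate(observations, half_life_days):
    """Weighted success rate where older observations count for less.

    observations: iterable of (days_ago, successes, opportunities).
    An observation half_life_days old gets half the weight of one from today.
    """
    weighted_successes = weighted_opportunities = 0.0
    for days_ago, successes, opportunities in observations:
        weight = 0.5 ** (days_ago / half_life_days)
        weighted_successes += weight * successes
        weighted_opportunities += weight * opportunities
    if weighted_opportunities == 0:
        return 0.0
    return weighted_successes / weighted_opportunities


# Hypothetical hitter: hot over the last two weeks, ordinary a year ago.
history = [(7, 9, 30), (365, 120, 600)]
slow_decay = decayed_rate(history, half_life_days=730)   # leans on the long sample
fast_decay = decayed_rate(history, half_life_days=30)    # leans on recent form
```

Choosing the half-life is the open question: a long one trusts the 10-year sample, a short one chases the last couple of weeks, and the right value can only be found by testing against results.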

Wil:
And I think some of the other things that we’re doing is looking at any park-specific biases as another route that we’re looking to improve on, to see if there’s something that we’re missing. I think a big part of that could just be with parks changing. And so analyzing both recent [inaudible 00:32:11] and long-term historical performances, how can we improve our accuracy there?

Jordan:
Gotcha. And when is the big BVP update coming in to start incorporating that heavily?

Wil:
Never.

Jordan:
Oh, okay. Gotcha. Cool. Let’s see. I see another question came in from Andrew here on YouTube. He asks, “When the score of a team is six, is Smart Diversity optimizing to find the best lineup that achieves that specific score while simultaneously finding the best co-stack?” And then he followed that up by saying, “I’m mostly wondering if a team score caps the focus of games, Smart Diversity pulls from at the nine or 10 setting? I don’t want to completely disregard outlier games where a team goes off.”

Matt:
Yeah, I can, do you want to answer that, Wil?

Wil:
Yeah, it sounds like Graham is-

Matt:
Yeah.

Wil:
So basically when you adjust, I think what you’re saying is that adjusting a team total, like setting it to a six. Yeah, so that’s not going to remove the outliers of Smart Diversity. What it’s going to do is essentially sort of shift the mean of your distribution. So, if we were projecting the team as a mean of five runs and you set it to six, you essentially shift the distribution more to the right, but you won’t completely get rid of the outliers.
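
Wil's point, that changing a team total moves the mean without removing the outliers, can be seen in a small sketch. How SaberSim actually applies the shift is not stated; the proportional scaling used here, and the `shift_team_total` name, are assumptions chosen only to illustrate the idea.

```python
from statistics import mean


def shift_team_total(sim_runs, target_mean):
    """Rescale simulated run totals so their average equals target_mean.

    Every simulated game is kept, so low and high outliers are still there;
    they are just moved along with the rest of the distribution.
    """
    current_mean = mean(sim_runs)
    if current_mean == 0:
        return list(sim_runs)
    scale = target_mean / current_mean
    return [runs * scale for runs in sim_runs]


# Toy set of simulated team run totals averaging 5, including one blowout.
team_sims = [1, 3, 4, 5, 12]
shifted = shift_team_total(team_sims, target_mean=6)
```

After the shift the average is 6 and the blowout game is still the largest outcome in the set, so a build at a high Smart Diversity setting can still land on it.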

Jordan:
Gotcha. Cool. And that’s a good question. That’s one that I was curious about too. So great question, Andrew. Let’s see, saw another question come in here. I don’t know if this is as uniquely related to the model here. Giant Cobra asked, “One question I have, can we add DK and fan to a contest to see who took down the contests in the game centers?” Maybe a future feature there?

Matt:
I think we’re thinking about that. I think it’s not going to be anytime soon, but we’re definitely looking at ways to add a little bit more review of specific contests rather than just looking at your lineups, kind of looking at, “Well, how did my lineups do in this contest?” We could do some analysis of actual contest results, but it would probably be pretty far off. But it’s definitely in our minds.

Jordan:
Gotcha. Yeah, I guess another question I have here too, I don’t see any others from our users here in YouTube or Slack for the time being. So if you guys have questions, feel free to fire them off. But I mean, an interesting question I have is, we’re obviously very focused around baseball right now, it’s right in the middle of baseball season. But as we start to look forward to football and NBA coming back in the fall, what have you guys kind of learned throughout this process here in the past couple of months? Are there some lessons that we’ve learned that maybe could be applied to other sports and improve the product overall? Or following baseball, what are some things that we want to take from this and apply it to our football model and our basketball model and so on?

Matt:
Yeah. I mean, I’ll let Wil answer after me if he has any thoughts there, but one cool thing is all of the other sports use very similar models as baseball does in terms of how that works. So we’re, for all of the different sports we’re taking all of this play-by-play data from all of the different leagues and we’re taking it from the past 10 years or however long we have, and putting it into this model where we come up with the probabilities and then sticking that into a simulator.

Matt:
So a lot of the lessons we’ve learned will directly apply. So if we have football stats from, right now we’re not accounting for, we’re not really pulling in much college stats, but we definitely, those stats are available. We can pull those in and use similar methodology as we do for the Minor League stats for football and even just stuff like the decay rate that Wil’s talking about, we’re looking at how do we incorporate recent stats versus historical stats. A lot of those lessons, a lot of the stuff that we’ve been improving on in baseball is totally directly applicable because all of the models are using a very similar methodology. There’s obviously differences because the sports are so different, but the kind of high-level way that they work is very similar.

Wil:
Yeah, I’m definitely excited to take a look at, trying to control for college football and new drafts and lots of very cool things in my mind about that, that I’d love to dig into. I think a big part of what we’ve done really and specifically in the last couple of weeks with this final push to get this backtest going and everything like that is, we’ve really sort of deconstructed the whole model and put it back together. We’ve just completely taken it apart, looked at all the pieces, figured it all out. So I think that we’re in a really good place to make improvements on it and we know exactly, at least for me, still kind of new, really figuring out how it all works together in my head and everything like that. So I know it’s like, if I want to control for this or isolate these out, it’s all really interesting stuff that I think directly applies to the other models.

Jordan:
Yeah, it’s really exciting. It’s been awesome hearing about the work you guys have been doing on baseball recently. And I can’t wait to see how it translates into some of the other sports. Another question I had that I wanted to talk about, the user pulls up the SaberSim app and they see not just one set of projections, but two in the form of ownership. Can either of you guys talk a little bit about what that ownership model looks like, how those numbers are being generated, and maybe even some ideas for what we have going on behind the scenes for improvement there?

Matt:
Yeah. So ownership is really cool. I think it’s different from how a lot of places do it. So our ownership projections are actually using a simulator as well in a way, it’s not the play-by-play sport simulator. But what we do is we take our projections and then other factors from just the industry. We take all our projections and all these other factors and we essentially simulate the contests. So we’ll simulate kind of a big GPP contest using these projections and build actual lineups. And the ownership projections are essentially just the exposures of the simulated contest. So we’ll build thousands of lineups and then look at, okay, what is each player owned in these lineups that we build? And that’s what the ownership projections are.
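A minimal sketch of the "exposures of the simulated contest" idea follows. It is an illustration only: the function name and the toy two-player lineups are assumptions, and how the simulated field's lineups actually get built is not shown.

```python
from collections import Counter


def ownership_from_lineups(lineups):
    """Projected ownership = share of simulated lineups that contain each player."""
    counts = Counter(player for lineup in lineups for player in set(lineup))
    total = len(lineups)
    return {player: count / total for player, count in counts.items()}


# Hypothetical simulated field of four two-player lineups.
field = [["A", "B"], ["A", "C"], ["B", "C"], ["A", "D"]]
ownership = ownership_from_lineups(field)
# Player A appears in 3 of 4 lineups, so A's projected ownership is 0.75.
```

Because every lineup fills the same number of roster spots, the ownership values computed this way sum to the roster size (2.0 in this toy case).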

Matt:
And so one, it’s nice because you get ownership projections that they all add up to the right amount and they make sense with each other. We’re looking at the actual lineup construction. So there’s a position where there’s only two viable players or there’s a really expensive pitcher on the slate that everybody’s going to play. That means the rest of the batters that are expensive might be lower owned. We’re going to kind of account for all of that because we’re making actual lineups, but there’s really a lot to improve upon there. So, there’s way more factors that we can incorporate into the ownership projections. We’re still doing it fairly simply, because I think it works pretty well, but there’s a lot of different other factors that we can pull in and kind of incorporate into that same process where we’re simulating the contest essentially.

Matt:
And then I don’t know if you specifically said what users can do to add to them, but I’ll just touch on that as well. I think, one thing is looking at because our actual projections are kind of part of what go into ownership because we think, we have really good projections that kind of mirror a lot of what the industry … When we project a player really high it’s likely that other people are going to see that as well and other projection sources are going to be similar.

Matt:
But if there’s somewhere where you think that we’re off, we may be over projecting ownership in areas where we’re way higher on a team or a player than the rest, than other sources are. So one source of value is you can maybe increase ownership where we’re really low on a team or a pitcher compared to the field and vice versa if we’re really high. And just when there’s kind of this hype factor that is maybe beyond projections, where someone like Wander Franco, who’s min price batting second.

Matt:
And is like the top prospect of the past five years, well not get the … I mean million flat, but he’s a very top prospect, expected to do very well. I think we had him below 10% projected ownership. Other sources that I was looking at also had him below 10%, he came in at like 22. And that’s a place where it was pretty obvious I think if you looked at it like, “Hey, this guy is going to be owned because he’s known, he’s this hyped prospect and people are going to want to play him.”

Matt:
So, that’s definitely a way to add value where you can say, “Hey, I think,” I mean, hopefully you didn’t increase his ownership because he ended up doing really well and then you would have had less of him at the higher ownership, but that’s a good place to add value where I think intuition actually can play a stronger role in ownership projections than it should in normal projections. Because when you’re doing ownership, you’re really predicting what other people are going to do. And just using intuition about that, I think can often just based on experience of like, “Hey, I’ve played DFS before. I’ve seen these contests. I think this guy is going to be high-owned.” For me I think that often ends up helping just based on like, “Hey, I know what other people are going to do based on my experience.”

Wil:
Yeah, I think that’s a big part of it where it’s a double-edged sword in that predicting human behavior is harder for a model, but easier for a human, because it’s exactly like you said. I mean, if I go to play a contest and it’s like, I look at the prospect that I’ve seen 47 articles written about on Twitter, I know he’s probably going to get some ownership just from virtue of people hand building or people that just, he’s been touted. People want to have him in their lineups, and I’ll just go ahead and increase his ownership percentage. And that’s just something that an automated model probably won’t get without incorporating a ton more factors.

Jordan:
Yeah. It’s almost as if the ownership model is trying to estimate what people should do given the inputs, rather than what they actually will do. I see, I mean, pretty often I think ownership condenses a little bit more than the model might expect. But I mean, the way it is set up right now, part of what makes it really cool and really useful, at least for me, is that because we’re doing this dynamically and actually creating real lineups we have ownership projections for smaller slates, things like turbos and night slates and showdowns in a way that actually mimics the way people are going to have to build lineups for that contest, where maybe some other models that are out there have heavy manual components to that. And they can’t possibly have ownership projections for all those different slates and contest types and things like that, so.

Matt:
Yeah, I would say that there’s probably, I don’t play too many of those really small slates, but I would guess that there’s a lot more edge in using those ownership projections for those smaller turbo night slates, because they’re not as readily available elsewhere and we’re really mimicking, like you said, we’re kind of mimicking how people actually have to build those lineups. So I bet you can kind of get more, even more value out of the ownership projections, looking at them for those smaller slates.

Jordan:
Yeah, it’s been working out for me. Those are some of my best contests, some of those other smaller slates, so. Let’s see, another question came in here through YouTube says, “I’m still not sure how I can add value to the model that’s not already being considered in the model?” And this is kind of like the question that comes in almost every day here on Office Hours. And Nancy Drew Guy, I don’t know if you joined in here a little later or just caught some of that beginning section. We did talk about this in a little bit more detail, but really just maybe for like a quick and dirty answer to this. And I don’t know what the right word is here, as not mathematical as possible. What’s one thing maybe you guys would both say that somebody can do that can step in where they’ve only got limited time and impact their lineups in a way that is positive EV for that slate?

Wil:
Yeah, I think for me it would just be comparing our run totals to implied run totals from Vegas or taking a look at the betting page and seeing what teams we have strong unit bets on. And if it feels wrong, or if I just, I don’t want that much exposure to the team, just bringing their run total closer to Vegas. There’s nothing, like a stand doesn’t need to be 100% or 0% on a player or a team. There’s really nothing wrong with going just a bit over or a bit under a team, that’s still taking a stand, that’s still generating a difference. So I think that’s the, if I was just going in there with five minutes before a build, that’s where I would go to.
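One way to picture "bringing a run total closer to Vegas" is a simple weighted blend. This is only a sketch: the function name, the `weight` parameter, and the example totals are assumptions, not anything built into the app.

```python
def nudge_toward_vegas(model_total, vegas_total, weight=0.5):
    """Move a projected team run total part of the way toward the Vegas implied total.

    weight=0 keeps the model's number, weight=1 adopts the Vegas number outright.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return model_total + weight * (vegas_total - model_total)


# Hypothetical numbers: the model projects 6.0 runs, Vegas implies 5.0.
adjusted = nudge_toward_vegas(6.0, 5.0, weight=0.5)  # 5.5
```

At 5.5 the adjusted total still sits above the Vegas number, so it is still a stand on the team, just a smaller one.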

Matt:
Yeah. For me, I think what I mentioned before was getting into the Statcast data and stuff like that. I think while that’s true that you can add value there, it is hard for someone that is new to this, or doesn’t really know how to interpret that sort of data. And honestly, I don’t really do much of that myself.

Matt:
I really trust the model a lot. And I essentially use what it tells me. I will say how I, this isn’t exactly adding input into the model itself, but where I add my most manual intervention is in the step three exposures page after the build runs, where I’m looking at where my highest exposures are, both in individual players and in team stacks. And I’m kind of adjusting those partly based on intuition where I just see, if I see a player that’s really high, especially if they’re way higher than the ownership projection I might want to … Sometimes I want to just take a stand there. Other times I think I can get a lot of leverage out of playing 50% of a 10% owned player without having to have 100% of them.

Matt:
Other times I just want to diversify my stacks a little bit, or I look and I see a play that looks, a team or a pitcher that looks, I have less exposure to that than I want. Maybe it’s someone like, say like Fernando Tatis against Trevor Bauer last night, where if you’re just running a build, you might not get much of him because he’ll be low projected, because he’s going against a top tier pitcher. But I might look at my exposures. See, I have, maybe I have 2% of him and think, “He’s got really high upside.” Especially if I’m going to be fading Trevor Bauer, or I’m going to be having less of that pitcher, then I’ll try to bump up maybe a star player that goes against them because I’m trying to take advantage of the leverage from fading that pitcher, something like that.

Matt:
So long story short, I think that I’m adding a lot of my value and my manual intervention really in managing my exposures after the fact, rather than necessarily altering projections beforehand. And I know that’s different than what Max and Danny Steinberg have talked a lot about, adjusting their projections beforehand. And so there’s a lot of different ways to add your own input, but that’s just the way that I prefer to do it or how I tend to do it.

Jordan:
Yeah, I do a lot of my work on the exposures too, rather than the projections as well. So these questions are always kind of tough because a lot of it’s just going to hinge so much on a slate-to-slate basis. So, I mean, I guess the last thing to just kind of wrap this up is, you mentioned it earlier, Matt. I mean, we don’t want to give the indication that anyone should be hitting the app on any day, pulling up the slate, and feeling like they must make a change to be effective, right? We’re already putting a lot of time and our own review process into these. There are opportunities to add value at times, especially on different slates or certain situations, but no one should feel like they need to go in and start moving numbers around to make a difference.

Jordan:
It seems like there was some interest here in that conversation about ownership. We had a couple of questions trickle in here. So I’m going to pull some of these in. This was another question from Slack here. Is there a chance that SaberSim could determine ownership projections for different entry fees in the future? I know this question does come in sometimes. It also, we hear this question in the form of, could you project single entry ownership versus 150 max ownership or cash games ownership versus GPP ownership? I have a few thoughts on this too, but anything that you guys want to mention here?

Matt:
Yeah, I mean, I think it’s a good idea. I think it would be a cool new addition to ownership. And we have a lot of cool features that we’re working on right now. So that’s probably not something that we’re going to be able to do in the coming months or weeks. But I think I would love to have different ownership projections for different size contests and different entry fee contests so that you’re really able to differentiate. But yeah, I mean, in the meantime, feel free to change the ownership projections as well, based on the type of contests that you’re entering. In the high stakes, a 100-man contest or whatever, there’s just going to be way more concentrated ownership. Whereas I think our ownership projections probably lean towards more of the low stakes, big contests, where they’re not quite as sharp as the bigger ones, because they’re a little bit more spread out than sometimes those big contests tend to be. So yeah, long story short, we should absolutely do that and I think it’s a good idea.

Jordan:
Yeah, I was going to echo kind of what you just said. I think particularly, I know less maybe about this question from the form of an entry fee standpoint, but we do talk in Office Hours pretty often for people that are playing single entry or three max or things like that. You see a lot of that ownership condense on what the field perceives to be the best plays on any given slate. I always recommend, if you’re a single-entry player and you play a lot of single entry and you’re familiar with how ownership kind of condenses, that’s definitely a spot you can add value by just making adjustments to the ownership model, because yeah, I think we are kind of mimicking a large-field GPP with ownership getting pretty spread out, so.

Jordan:
Let’s see, another question here came in, pull this up. This is an ownership question as well. “Since the ownership model is based on a slate sim in the builder … Wait, but since the ownership model is based on a slate sim, does the builder function as a slate sim? Or does it only build based on the outcomes of the games and ignore the potential contest the lineups are being played in? I think I know the answer, but just wanted to ask.”

Matt:
Yeah. So the builder just, well, it’s a little bit complicated to answer, right? If you have ownership fade at zero, so you’re not incorporating ownership at all, then it is just looking at the outcomes of the games and is not accounting for ownership at all. If you have the ownership slider on, we are trying to mimic, I would say that what we’re trying to do is similar to mimicking a slate sim, we’re trying to build lineups that have the highest expected value given that they’re being played in these tournament structures, right? So the entire point of accounting for ownership and fading high-owned players and boosting low-owned players, all that stuff that the ownership fade slider does, the whole point of that is to build lineups that have a better chance of being in the top 1%, the top 0.1% of a contest.

Matt:
We’re not literally doing a slate sim where we’re trying to find the lineups that actually placed the best in the slate, just because it’s very, very resource-intensive. I think Wil and I have both looked at that on our kind of personal use of like, “How can we analyze this contest? How can we find lineups that have the highest likelihood of placing well in contests?”

Matt:
But it’s been more of like an analytic thing, but not really something that’s possible to incorporate into the builder, just in terms of how much processing power it would take. We have to build all these contest lineups, which we can kind of do with the ownership, but then we have to simulate all of them and rank them and then decide how our lineups place within these lineups. So it’s just, it’s a very complicated problem. And I think the purpose of the ownership fade slider is to solve that problem in a faster, simpler way, if that makes sense? So really that’s what we’re trying to do is essentially build lineups in a way that mimics the results of what a slate or contest simulator would do.
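To give a feel for why a literal contest sim is expensive, here is a toy sketch of the ranking step described above. Everything in it is an assumption for illustration (the function name, the data shapes, the tiny three-player slate); it is not how the builder works.

```python
def top_finish_rate(my_lineup, field_lineups, sims, top_n=1):
    """Fraction of simulated slates in which my_lineup finishes in the top_n.

    sims is a list of dicts mapping player -> fantasy points in one simulated slate.
    """
    hits = 0
    for scores in sims:
        my_score = sum(scores[p] for p in my_lineup)
        # Count how many field lineups outscore ours in this simulated slate.
        beaten_by = sum(
            1 for lineup in field_lineups
            if sum(scores[p] for p in lineup) > my_score
        )
        if beaten_by < top_n:
            hits += 1
    return hits / len(sims)


# Hypothetical three-player slate, two field lineups, three simulated outcomes.
sims = [
    {"A": 10, "B": 5, "C": 1},
    {"A": 2, "B": 5, "C": 9},
    {"A": 8, "B": 5, "C": 7},
]
field = [["B", "C"], ["A", "C"]]
win_rate = top_finish_rate(["A", "B"], field, sims, top_n=1)
```

The work grows with the number of simulations times the number of field lineups, and that is for scoring a single candidate lineup, which is why doing this for every lineup in a build gets resource-intensive.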

Wil:
Yeah, and I want to add to that, that the default settings for each of those tournaments, if you input what kind of tournament you’re playing and how those sites generate, those are generated from an analysis of simming a contest like that and looking at sort of what level of ownership is required. Like is typical, or how much upside do we need to try and capture? And so like-

Matt:
That’s a really good point, yeah.

Wil:
Yeah, so it’s like, I can run a slate sim and I’ll get back to you in nine hours when it finishes about what the actual optimals are, whereas this can build all the lineups in one or two minutes and gets you almost all the way there. So that’s-

Matt:
Yeah, to expand on that, we’ve actually done, I think, Wil, you kind of mentioned, but we’ve literally done backtests. We created these defaults by running hours and hours. One of our developers literally would kick this off every single night. It would run all night and it would be testing every single possible combination of slider settings on all of these different contests and came up with these defaults that are back-tested as kind of the highest EV.
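The overnight backtest described here is, in spirit, an exhaustive search over slider combinations. The sketch below shows that shape only: the slider names, the candidate values, and the stand-in scoring function are all hypothetical, and the real backtest's scoring is not described in this session.

```python
from itertools import product


def best_settings(grid, evaluate):
    """Try every combination of slider values and keep the highest-scoring one."""
    names = list(grid)
    best, best_ev = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        settings = dict(zip(names, values))
        ev = evaluate(settings)  # in a real backtest this would be the slow part
        if ev > best_ev:
            best, best_ev = settings, ev
    return best, best_ev


# Hypothetical slider grid and a made-up scoring function that peaks at (25, 50).
grid = {"ownership_fade": [0, 25, 50], "smart_diversity": [0, 50, 100]}


def toy_ev(settings):
    return -(settings["ownership_fade"] - 25) ** 2 - (settings["smart_diversity"] - 50) ** 2


defaults, ev = best_settings(grid, toy_ev)
```

With three values per slider this is only nine evaluations, but the count multiplies with every added slider and value, which fits with the job needing to run all night.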

Matt:
And that doesn’t mean they’re necessarily always perfect for every single contest. And we’re still working, iterating those. And they still depend on what the field is doing. If the field changes and everybody starts not stacking, or everybody starts 100% stacking, things are going to change. And if people start playing way more of the [inaudible 00:56:09] players, things are going to change. So it always changes. But yeah, I do want to clarify, the defaults are created from that sort of slate sim. It’s just the lineup builder itself is not a slate sim or a contest sim.

Jordan:
Yeah, that’s a great point. That’s really interesting. It’s a great question too. Thank you for asking that. Let’s see, another question here, getting close to the end here. Another feature request it looks like. “Can you all please add a study lineup to see how the top players construct their lineup?” Definitely another feature I’ve gotten requested here on Office Hours before of kind of maybe a review of past slates. I can definitely see how that would be useful. Do you guys have any thoughts there?

Matt:
I mean, I think I mentioned this already earlier when somebody asked something similar, that’s definitely something we’re thinking about. I would caution you, just from a theoretical perspective it can be dangerous to, you can run into small sample size issues where you’re studying a single slate or even a week of slates. And you see, “Hey, this player did really well. They won $100,000. They won a million dollars by playing these players and playing these stack constructions.” And there’s a lot of danger in kind of overfitting and taking signal from these results when there’s just noise.

Matt:
The other problem is that I think a lot of, especially, I mean, even pros have these heuristics that they use where they’re going to always set 5-3 stacks, or they’re always going to play batters that are next to each other in the batting order. And a lot of the reason that they do this is because they don’t have the tools that do it for them. These are heuristics that they know work and create winning lineups, but they’re not necessarily the optimal, perfect way of playing. And so I think you can get into dangerous territory of like, “Oh, I’m going to only play 5-3 stacks because that’s what I see the pros doing.” When really it’s, they’re only doing that because that’s the easiest way to kind of fit stacks into their optimizer. And it’s not necessarily the best lineups to be playing.

Jordan:
Yeah, that’s a great point. I completely agree. I mean, I think sometimes another question I get here on the show sometimes is, somebody runs a build and they say, “Why am I not getting all five stacks?” Almost the heuristic or the rule that exists, because it’s hard to account for correlation with the traditional optimizer, has now become the de facto only way to play the contest. And now when given an optimizer that kind of takes into account correlation and ownership and upside dynamically and builds lineups on its own, it gets flipped on its head. Like why isn’t this doing the thing that is the heuristic? So yeah, I think that’s a great point.

Jordan:
I don’t see any other questions roll in, in here. We’re getting close to the end of time. Thank you, everybody, who took the time to watch this, either the folks watching live or everybody that’s watching the recording back of this later. I will get this recording up this afternoon and will have it timestamped if you want to come back and review a certain question or anything like that again. Thank you, Matt and Wil for being on here today. Did either of you guys have any other final thoughts before we hop off?

Wil:
Nothing else from me. Thanks for having us as well.

Matt:
Yes. Thanks, Jordan.

Jordan:
Cool, yeah. I’ll be right back here tomorrow for Office Hours, 2:00 PM Eastern. We’ll pick this right back up again next week on Thursday with another strategy session. If you guys have any thoughts at all of something that you think would be interesting for us to dive a little deeper into on a Thursday, always feel free to shout it out in Slack or in email, whatever works for you guys. And we’ll see you soon.

Matt:
Thanks, everyone.

Wil:
Catch you later.
