Thursday, May 23, 2013

Baseball Calculus: Starling Marte and the Hot Start Bias

I have Starling Marte on my fantasy baseball team. I'm happy that I do - he's had a great start to the season, putting up fantastic fantasy baseball numbers: a .304 average, 5 homers, 33 runs, 17 RBI and 10 steals. Over a full season, that projects to a .304/118/18/61/36 line - terrific stuff. The sort of performance you'd expect from a second- or third-round player. Marte is perceived by many as an emerging star.
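The pace arithmetic is simple enough to sketch in a few lines of Python. The 45 games played is an assumption, inferred from the 117 remaining games mentioned later in this post, and the function name is purely illustrative. Rounding down happens to reproduce the line quoted above.

```python
SEASON_GAMES = 162
GAMES_PLAYED = 45  # assumption: 162 minus the 117 games said to remain


def full_season_pace(total_so_far, games_played=GAMES_PLAYED):
    """Scale a counting stat to a 162-game season, rounding down."""
    return total_so_far * SEASON_GAMES // games_played


# Counting stats to date: runs, homers, RBI, steals
so_far = {"R": 33, "HR": 5, "RBI": 17, "SB": 10}
pace = {stat: full_season_pace(n) for stat, n in so_far.items()}
print(pace)  # {'R': 118, 'HR': 18, 'RBI': 61, 'SB': 36}
```

Batting average is a rate stat, so it carries over unchanged rather than being scaled.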

Here's the problem: Starling Marte is, at this point, probably one of the most overrated players in fantasy baseball.
Don't believe me? I'm convinced Marte is in for a serious reckoning, which may already have begun. Most telling is his 46/9 strikeout-to-walk ratio, which is dangerously close to Jeff Francoeur territory (35/5 this year). Pitch selection is important in baseball. If you're too free a swinger, pitchers will get you to chase more pitches, and you'll get fewer good pitches to hit. When you do make contact on pitches outside the zone, you're more likely to hit weak grounders and pop-ups. Marte's .304 batting average is an illusion created by a .385 BABIP, which ranks 8th among MLB regulars. He's been lucky.
Marte is overrated because of something I'd call a 'Hot Start Bias'. We tend to form opinions about players over time. If a rookie has a hot week, we'll keep an eye on him, but we're slow to commit to him. If he has two good weeks, we look more closely. If, after a full month, he's hitting .327 (as Marte was) and among the MLB leaders in runs, steals and OBP, we will look at him and say "maybe this kid really is going to be one of the best leadoff hitters in baseball."
Then, we start keeping an eye on him. Every time we check his stats, that batting average is still over .300; he's still among the MLB leaders in runs, and he's got good numbers across the board. And because he's a fun, exciting, emerging star, we check his numbers often. Every check is another data point in the formation of our opinion.
Here's the problem, though: we make the mistake of simply checking his season-to-date stats every time. We don't check his recent numbers, which would show that since the start of May, he's hitting .270/.333/.432 with a 20/2 K/BB.
This problem can be illustrated graphically with some concepts borrowed from basic calculus.
Remember that thing from Calculus I? The 'area under the graph'? I'm suggesting that our opinion of a player like Marte is formed over time. Every time we check his batting average, it's a data point in our minds. In early April, we see those high averages (he stays over .330 until April 18) and think 'excellent, but it's still early; I'd like to see him sustain it'. Then, for a few more weeks, he maintains an average around .320. Now he's gradually tailing off towards the .300 mark.

You might note above that the graph starts only on April 10, cutting out Marte's first week. This is a bit of a cheat on my part; he hit poorly in his first couple games, keeping his average low until he got hot a few games in. I've left those games out purely for visual effect. While it's statistically dishonest, I don't think it changes the issue.

Back to my argument: every data point further adds to our perception that Starling Marte is a .300 hitter. That he's a good player. That he's one of the league's better leadoff men and an excellent fantasy baseball player.
Here's the amazing thing: what if he is actually only, say, a .260 hitter from here on out? Let's say he hits a steady 1.04 hits in 4 ABs for another 117 games (bringing him to 162 total). Here's how that batting average chart will look:

So here, we see the hot start is a bit compressed, followed by a steady tail-off down to a .272 average at season's end. As you can see, I've kept the original blue shading for days where Marte's average is over .300, and used red shading once he falls below .280. The blue area represents our perception that Starling Marte is an excellent hitter; the red is our perception that he's actually pretty average: a sub-.280 hitter, which is still decent for an MLB regular but not a star. Everything in between is kind of neutral.
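For anyone who wants to check the .272 figure, here is a rough sketch of the projection in Python. It assumes Marte has played 45 games (162 minus the 117 remaining) at exactly 4 at-bats per game with a .304 average so far. The at-bat count is my simplification, not his real total, and the function names are made up for illustration. Exact fractions are used so that threshold comparisons aren't thrown off by floating-point rounding.

```python
from fractions import Fraction

GAMES_PLAYED = 45                  # assumption: 162 total minus 117 remaining
AB_PER_GAME = 4                    # assumption: applied to past games as well
CURRENT_AVG = Fraction(304, 1000)  # average to date
FUTURE_AVG = Fraction(260, 1000)   # 1.04 hits per 4 at-bats from here on


def average_after(extra_games):
    """Season average after `extra_games` more games at the .260 pace."""
    at_bats_so_far = GAMES_PLAYED * AB_PER_GAME
    hits_so_far = CURRENT_AVG * at_bats_so_far
    extra_at_bats = extra_games * AB_PER_GAME
    extra_hits = FUTURE_AVG * extra_at_bats
    return (hits_so_far + extra_hits) / (at_bats_so_far + extra_at_bats)


def first_game_below(threshold):
    """Season game number at which the average first drops below threshold."""
    for extra in range(1, 162 - GAMES_PLAYED + 1):
        if average_after(extra) < threshold:
            return GAMES_PLAYED + extra
    return None


final_avg = average_after(117)
print(f"{float(final_avg):.3f}")            # 0.272
print(first_game_below(Fraction(300, 1000)))  # 50
print(first_game_below(Fraction(280, 1000)))  # 100
```

Under these assumptions the average doesn't dip below .280 until game 100 of 162, so the season-to-date number sits at or above the "neutral" line for most of the year even though every game from here on is played at a .260 clip.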
As you can see, though, the size of the blue area far exceeds that of the red area; the cumulative effect of this perception will be that, at the end of the season, plenty of people will still be saying things like:
  • "Starling Marte can easily hit .300 in 2014."
  • "Starling Marte is going to win a batting title someday." (don't get me started on this line...)
  • "Starling Marte is going to be a perennial all-star."

All this when he's really a .260 hitter. He's a lot closer to Chris Young than Andrew McCutchen, yet his hot start will continue to skew perceptions of his ability level because of this bias.

And that's why I'm trying to sell high on him in my fantasy baseball league. People think he's this awesome, budding superstar, based on seven weeks of BABIP-fuelled good luck. He's not. He's going to be overrated and over-drafted by fantasy players for some time before we all clue in to the fact that he's actually pretty average. We fall prey to this Hot Start Bias all the time, yet seem to be surprised every time we get caught by it. It also runs in the opposite direction: if a talented player starts badly in April, we'll miss it when he tears it up in May and June because we've got it in our heads that he's "not having a good year." There's no reason to get caught in this trap if we think carefully about how our perceptions operate.

Follow Rory Johnston (@rnfjohnston) on twitter:

Wednesday, May 22, 2013

Mark Buehrle: now with more groundouts!

Mark Buehrle put up another strong start for the Blue Jays today (7 IP, 4 H, 2 ER, 6/2 K/BB), and though he didn't get the win, it's another positive sign that he's getting back on track after a tough beginning to the season.

His improving ERA and K/BB are easy to notice, but what you may have missed is that he now hasn't given up a home run in three straight starts. This isn't just luck; he's done it by getting more ground outs. He's gotten more ground outs than air outs in each of those three starts (11/5 today) after being on the wrong side of that ledger for the three preceding starts, where he allowed 8 HR in 18 innings.

Buehrle's been a pitch-to-contact guy for some time, and has always given up a fair number of home runs. As his stuff ages, he's going to have to keep the ball down to stay out of trouble, and it looks like he's been able to do that lately. Getting ground outs will speed up his innings and allow him to be the innings-eater the team needs (especially after days like yesterday, where Ramon Ortiz got knocked out in the third inning). He'll likely be facing Atlanta's high-power offence on Monday so he'll definitely want to make sure he keeps the ball on the ground.

Wednesday, May 15, 2013

Jose Bautista is batting 2nd, and that's good news

Batting your best hitter second in the lineup has been a pet theory among sabermetricians for a number of years, but it's getting increasing interest around baseball this year, and the Blue Jays are joining the bandwagon, moving Jose Bautista into the second spot in their batting order. Does it make sense? Or is Jose's power wasted there?

The injury to Jose Reyes robbed the Jays of a natural leadoff man, and they've struggled to find an appropriate fill-in. Emilio Bonifacio and Rajai Davis got long looks there, as their speed looks good atop the order. Also getting time were guys like Brett Lawrie and Munenori Kawasaki. That whole group was a failure, though; your leadoff hitter must be able to get on base, and all those players feature below-average OBPs. The natural solution, I felt, was Melky Cabrera, and indeed the Jays have finally put him there. Melky had been batting 2nd or 5th most of April, as the Jays really wanted to mix his switch-hitting abilities among all the right-handed power bats. Now that Adam Lind has turned it on, they've been able to use Melky to lead off. He's not especially fast - indeed, he remains in the leadoff spot right now despite hamstring issues - but he's one of the team's best OBP guys.

When your lineup features as much power as the Jays have (it's not just Bautista and Edwin Encarnacion; JP Arencibia and Colby Rasmus are piling up HRs too), you need to get as many guys on base in front of them as possible. Sending Davis and Bonifacio up there to make outs did nothing to set the table for them - it just burned outs and stopped the offence before it could get started.

Further, batting Melky 5th was similarly wasteful. Bautista and Encarnacion are two of the team's better OBP guys. They'll often get on via a walk, and if they're on base that often, you need a guy with some power to bring them in. Arencibia and Rasmus are perfectly suited to do that; their OBP isn't great, but they can cash in men on base. Melky, on the other hand, has a good chance to hit a single, but can't drive in runners en masse.

So with Melky in the leadoff spot, who bats second? As I've said above, the Jays have plenty of power guys but few OBP men. In that regard, Bautista is a perfect solution. With Edwin hitting 3rd and Arencibia 4th, the Jays have enough power in the middle of the lineup that Bautista can be deployed elsewhere.

A further advantage is that Bautista is less likely to come to the plate with 2 out. When he batted 3rd - especially behind Davis and Bonifacio - teams could pitch around him, knowing that a walk with 2 out was a lesser risk. With none or one out, walking Bautista is a greater risk, because the Jays will have more chances to drive him in. So teams have to pitch to him, and he's able to get fastballs that can become home runs. How much better is Jose with fewer than 2 outs? His career BA/OBP/SLG splits:
  • none out: .279/.366/.559
  • one out: .250/.358/.459
  • two out: .229/.363/.443
The difference is undeniable. With two out, Jose gets a lot of walks, but doesn't get pitches to hit. The fewer outs, the more pressure teams face to pitch to him. You could easily make an argument, based on these splits, that Jose should be leading off. Yes, he'd be hitting a lot of solo home runs, but he'll do that anyway (just with 2 outs) if the Jays can't put decent OBP men on in front of him. Batting him 2nd behind Cabrera is a good compromise, and will continue to help the Jays. Bravo to John Gibbons for being willing to work outside the conventional wisdom.
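One way to make the splits above easier to compare is to boil each line down to a couple of derived numbers. The sketch below computes OPS (OBP plus SLG) and isolated power (SLG minus BA) for each out state. Those two stats are my addition, not part of the original splits, and the values are kept in thousandths so the arithmetic stays exact.

```python
# Career splits from the list above, in thousandths: (BA, OBP, SLG)
splits = {
    "none out": (279, 366, 559),
    "one out": (250, 358, 459),
    "two out": (229, 363, 443),
}


def derived(ba, obp, slg):
    """Return OPS and isolated power (ISO), both in thousandths."""
    return {"OPS": obp + slg, "ISO": slg - ba}


summary = {outs: derived(*line) for outs, line in splits.items()}
for outs, stats in summary.items():
    print(f"{outs}: OPS {stats['OPS'] / 1000:.3f}, ISO {stats['ISO'] / 1000:.3f}")
```

The OBP column barely moves (.366, .358, .363), while isolated power falls from .280 with none out to .209 and .214 with one and two out. That fits the idea that he still gets on base with two out, just without the pitches to drive.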
