Thursday, June 5, 2014

Appetite for Distraction

As often happens, I was using the Baseball Reference Play Index for one particular thing and I ended up somewhere else.  Based on the early season success of the Marlins hitters with runners in scoring position, especially at home, some people (perhaps those wearing tinfoil hats) accused them of stealing signs.  In order to check the validity of this statement, I checked each team's batting splits with a runner on 2nd base, since this is the most obvious situation that would allow the runner to steal signs.  When I saw the results, I thought I was on to something.  The Marlins have a team batting average of .301 with a runner on second (and no other runners), compared to a .260 overall average.  This difference is the largest for any team in the majors in 2014.  When I did a search for greatest difference with runners on 2nd and 3rd, I was expecting similar results.  However, I noticed that the Marlins are only hitting .192 in this situation, which is 68 points lower than their overall average (6th worst difference in MLB).  My thesis was shattered, but at least I didn't prove the conspiracy theorists correct.  It seems as if taking any of these splits seriously over such a small sample size can lead to poor conclusions...

After this search, I wanted to investigate larger sample sizes to see if I could find anything interesting.  I no longer wanted to look at particular teams, but rather at MLB as a whole.  Do batters hit better in particular situations?  I used the Play Index to check the league-wide batting average since 1960 (more than 50 years of data, a huge sample size) for each combination of bases occupied.

Description             State  BA
No runners on           000    0.256
Runner on 1st           100    0.276
Runner on 2nd           020    0.247
Runner on 3rd           003    0.277
Runners on 1st and 2nd  120    0.254
Runners on 1st and 3rd  103    0.294
Runners on 2nd and 3rd  023    0.270
Bases Loaded            123    0.279
All states              ---    0.261

Intuitively, having more runners on base should lead to a higher batting average.  Why?  Well, mostly due to selection bias.  Simply put, selecting states in which a pitcher may be struggling (more runners on base) should result in better performance by the batter, even if the batter isn't actually any better.  In addition, based on traditional lineup construction, good hitters should get more opportunities with runners on base than average or poor hitters.  For the most part, the table above supports this argument.  With no runners on base, the batting average is .256, and if we combine the other 7 states, the batting average with at least one runner on base is .268, a 12-point increase.  However, breaking down the individual states paints a different picture.  If more runners correlate with a higher batting average, then why is the batting average with runners on 1st and 3rd (.294) significantly better than the batting average with the bases loaded (.279)?

As it turns out, adding a runner to 2nd base always decreases batting average.  To isolate this effect, we can arrange the 8 baserunner combinations into pairs.  Each pair includes an initial state without a runner on 2nd base and the same state with a runner on 2nd base.  In all four cases, the batting average is worse with the extra runner on 2nd.

Initial State           State  BA     Added runner on 2nd     State  BA     Difference
No runners on           000    0.256  Runner on 2nd           020    0.247  -0.009
Runner on 1st           100    0.276  Runners on 1st and 2nd  120    0.254  -0.022
Runner on 3rd           003    0.277  Runners on 2nd and 3rd  023    0.270  -0.007
Runners on 1st and 3rd  103    0.294  Bases Loaded            123    0.279  -0.015
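
For anyone who wants to reproduce the pairing, here is a minimal Python sketch.  The batting averages are copied from the first table; the tuple encoding of each state and the helper name added_runner_effect are my own choices for illustration and have nothing to do with the Play Index itself.

```python
# League-wide batting averages since 1960, copied from the first table.
# Keys are (runner on 1st, runner on 2nd, runner on 3rd); this encoding
# is an assumption made for the sketch, not a Play Index format.
BA = {
    (0, 0, 0): 0.256,
    (1, 0, 0): 0.276,
    (0, 1, 0): 0.247,
    (0, 0, 1): 0.277,
    (1, 1, 0): 0.254,
    (1, 0, 1): 0.294,
    (0, 1, 1): 0.270,
    (1, 1, 1): 0.279,
}


def added_runner_effect(base):
    """Map each initial state to the BA change (in points) from adding a runner to `base`."""
    i = base - 1
    effects = {}
    for state, avg in BA.items():
        if state[i]:
            continue  # base already occupied, so this is not an initial state
        with_runner = state[:i] + (1,) + state[i + 1:]
        # Round to whole points to hide floating-point noise.
        effects[state] = round((BA[with_runner] - avg) * 1000)
    return effects


if __name__ == "__main__":
    for base in (1, 2, 3):
        print(f"adding a runner to base {base}: {added_runner_effect(base)}")
```

Calling the helper with base 2 pairs each state that has 2nd base open with the same state plus a runner on 2nd.  The same helper works for any base.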

Conversely, adding a runner to either 1st base or 3rd base always increases batting average.  Similar to the table above, we can isolate the effect in each case.  Adding a runner to first base increases batting average by 7 to 20 points:

Initial State           State  BA     Added runner on 1st     State  BA     Difference
No runners on           000    0.256  Runner on 1st           100    0.276  +0.020
Runner on 2nd           020    0.247  Runners on 1st and 2nd  120    0.254  +0.007
Runner on 3rd           003    0.277  Runners on 1st and 3rd  103    0.294  +0.017
Runners on 2nd and 3rd  023    0.270  Bases Loaded            123    0.279  +0.009

And adding a runner to 3rd base increases batting average by 18 to 25 points:

Initial State           State  BA     Added runner on 3rd     State  BA     Difference
No runners on           000    0.256  Runner on 3rd           003    0.277  +0.021
Runner on 1st           100    0.276  Runners on 1st and 3rd  103    0.294  +0.018
Runner on 2nd           020    0.247  Runners on 2nd and 3rd  023    0.270  +0.023
Runners on 1st and 2nd  120    0.254  Bases Loaded            123    0.279  +0.025

The point of these last two tables is not to prove, somehow, that a batter suddenly becomes a better hitter with runners on base.  As discussed earlier, a large part of this increase is probably due to selection bias.  But it does seem logical that adding a runner to 2nd base should also exhibit this effect, and that the magnitude of the increase should be somewhere in between the effects shown for adding a runner to 1st and adding a runner to 3rd (maybe about 10 to 15 points in batting average).  In reality, the effect is quite the opposite (a 7 to 22 point decrease).
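
To put a rough number on that gap, here is a small Python sketch.  The point differences are read straight from the three pairing tables, and the 10 to 15 point expectation is my guess from the paragraph above.  One big caveat: these are unweighted means across the four pairs, since I don't have the at-bat counts for each state in front of me, so treat the result as a ballpark figure and not a proper league-wide estimate.

```python
# Point changes in batting average, read from the three pairing tables.
added_1st = [20, 7, 17, 9]
added_2nd = [-9, -22, -7, -15]
added_3rd = [21, 18, 23, 25]


def mean(values):
    """Plain unweighted mean (at-bat counts per state are not available here)."""
    return sum(values) / len(values)


# Assumption from the text: a runner on 2nd "should" add about 10 to 15 points.
expected_low, expected_high = 10, 15

observed = mean(added_2nd)
shortfall_low = expected_low - observed
shortfall_high = expected_high - observed

if __name__ == "__main__":
    print(f"mean change: 1st {mean(added_1st):+.2f}, "
          f"2nd {observed:+.2f}, 3rd {mean(added_3rd):+.2f}")
    print(f"shortfall vs. expectation: {shortfall_low:.2f} to {shortfall_high:.2f} points")
```

With these inputs the unweighted mean change for adding a runner to 2nd is -13.25 points, which puts the shortfall against a 10 to 15 point expectation at roughly 23 to 28 points.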

Is there any logical explanation for the data?  Conservatively, hitters are about 20 to 25 points worse in terms of batting average with a runner on 2nd than what we might expect.  I believe that this effect is almost entirely due to the distraction of having a runner directly in the batter's line of sight.  While the batter is trying to concentrate on the pitcher's delivery and release and on the trajectory and spin of the ball, he has to cope with a teammate dancing around next to second base behind the pitcher.  Even if the batter tries to "block out" the runner, the movement almost certainly has an effect that cannot be muted.  The effect seems to be worse if a runner is on 2nd base and 3rd base is open, as the two worst batting averages are with a runner on 2nd (.247) and with runners on 1st and 2nd (.254).  In these cases, the baserunner at 2nd probably moves around more since he is the lead runner and can possibly steal 3rd or take off early to score on a single.  Is there anything that can be done to mitigate this "distraction" effect?  Maybe if you're David Ortiz and you just hit a leadoff double, you should think about taking a lead and then standing still until the ball is hit.

2 comments:

  1. This has me wondering what the effect is when second base is stolen. For instance, if it is stolen late in an at-bat, is the decreased AVG effect smaller than if it's stolen early?

    Conversely it seems there is an extra reward potential in attempting to steal third because the batter's chance for a hit may have just increased.

    --Lance

  2. What I'd really like to know is how much of the gain from runners on 3rd comes from sacrifice flies saving batting average. I think it's a relatively small effect, perhaps adding 20% extra value, which doesn't make it particularly interesting. Still, the sac fly effect unfairly biases the numbers towards runners on 3rd.

    --Lance (again)
