Friday, 25 July 2008

Opportunity bunts!

In connection with a Diamond Mind Baseball league I play in, I was looking at Jerry Royster's brief major-league managerial career. (He's now big in Korea.) The scuttlebutt had him as an old school manager who valued the bunt and the little things - taking the extra base, moving the runner over, etc.

Obviously, the simple way to look at such things is to count up all the sacrifice hits and see who bunted the most. By this measure, Royster stacks up pretty well, being in the top 25 percent of NL managers that season. (The DH-rule means comparisons across leagues don't apply as oftentimes the bunt is the default move when the pitcher is at bat.)

Team           SacHit  SB+CS   OBP
MON               108    182  .334
CIN                95    168  .330
StL                83    128  .338
MIL (Royster)      79    144  .320
CHI                78     84  .321
NY                 75    129  .322
SF                 68     95  .344
PIT                68    135  .319
ATL                67    115  .331
PHI                67    147  .339
LA                 67    133  .320
HOU                64     98  .338
ARI                62    138  .346
FLA                59    250  .337
COL                49    186  .337
SD                 45    115  .321

(Royster didn't particularly steal much, so let's put that question to one side for today.) Royster was fourth overall, but there was some bunching so some of that might be down to opportunities.

Exactly! Not only do we need to know how many times Royster bunted, to get an idea whether he has any tendency to bunt, but we also need to know how many opportunities he had to order a bunt. To give you some guide already, I've listed the team OBP above. You can see that the Brewers were near the bottom in that category, so Royster's high bunt total and low OBP probably mean we're going to see a further adjustment upwards.

To do this, I subtracted from the number of hits, the number of extra-base hits, stolen bases and caught stealings; then I added walks and hit batsmen. (This is not perfect. One also would like to know how many wild pitches and balks were involved in moving the runner over.) After that, you can divide the number of sacrifice hits by the number we'll call MO1 (Men on 1st). I'll call the resulting number the bunt percentage.
Team           SacHit   MO1  BuntPct
MON               108  1373     .079
CIN                95  1380     .069
MIL (Royster)      79  1343     .059
StL                83  1470     .056
CHI                78  1408     .055
NY                 75  1409     .053
PIT                68  1350     .050
LA                 67  1342     .050
PHI                67  1443     .046
ATL                67  1456     .046
SF                 68  1518     .045
FLA                59  1381     .043
HOU                64  1501     .043
ARI                62  1537     .040
COL                49  1429     .034
SD                 45  1447     .031
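
For anyone who wants to check the table, here is a minimal Python sketch of the calculation. The SacHit and MO1 figures are copied from the table; the function names and the `men_on_first` helper are only illustrative, and the raw hit, walk and steal totals that went into MO1 are not reproduced here.

```python
# SacHit and MO1 (men on first) per team, copied from the table above.
TEAMS = {
    "MON": (108, 1373), "CIN": (95, 1380), "MIL": (79, 1343), "StL": (83, 1470),
    "CHI": (78, 1408), "NY": (75, 1409), "PIT": (68, 1350), "LA": (67, 1342),
    "PHI": (67, 1443), "ATL": (67, 1456), "SF": (68, 1518), "FLA": (59, 1381),
    "HOU": (64, 1501), "ARI": (62, 1537), "COL": (49, 1429), "SD": (45, 1447),
}


def men_on_first(hits, extra_base_hits, steals, caught, walks, hit_batsmen):
    """Rough count of runners on first: singles, less steal attempts, plus free passes."""
    return hits - extra_base_hits - steals - caught + walks + hit_batsmen


def bunt_pct(sac_hits, mo1):
    """Sacrifice hits per estimated bunt opportunity."""
    return sac_hits / mo1


# Rank teams from most to least bunt-happy.
ranked = sorted(TEAMS, key=lambda team: bunt_pct(*TEAMS[team]), reverse=True)
for team in ranked:
    print(f"{team:4s} {bunt_pct(*TEAMS[team]):.3f}")
```

Sorting on the ratio rather than the raw total is what moves Milwaukee from fourth to third.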

It changes things a little, although you can see from this that the simple total of sacrifice hits is a pretty good guide to how much a manager calls for the bunt. The main effect is to show how much Dusty Baker (with the Giants), Jimy Williams (with the Astros) and Bob Brenly (with the Diamondbacks) didn't bunt, relative to their opportunities. However, we're still not done with what these statistics can show us.

Using the standard deviation (roughly, the typical amount by which the bunt percentages differ from the mean), we can get an idea of whether Royster's .059 was particularly extreme behaviour. The standard deviation for this list is .012. The average is about .050. Therefore, if the difference between a team's bunt percentage and the league average is more than .012, that's an example of extreme behaviour.
Team           Deviation  Number of StDev
MON                 .029                3
CIN                 .019                2
SD                 -.019                2
COL                -.015                2
MIL (Royster)       .009                1
ARI                -.009                1
StL                 .007                1
HOU                -.007                1
FLA                -.007                1
CHI                 .006                1
SF                 -.005                1
ATL                -.004                1
NY                  .003                1
PHI                -.003                1
PIT                 .001                1
LA                  .000                1
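
The same sums can be sketched in a few lines of Python. This assumes the rounded bunt percentages from the earlier table and a simple average of the sixteen teams, so individual deviations can differ from the table in the third decimal place. It also assumes the "Number of StDev" column counts which standard-deviation band a team falls in, rounding up. The names are illustrative only.

```python
import math
from statistics import mean, pstdev

# Bunt percentages as listed (rounded to three places) in the earlier table.
BUNT_PCT = {
    "MON": .079, "CIN": .069, "MIL": .059, "StL": .056, "CHI": .055, "NY": .053,
    "PIT": .050, "LA": .050, "PHI": .046, "ATL": .046, "SF": .045, "FLA": .043,
    "HOU": .043, "ARI": .040, "COL": .034, "SD": .031,
}

values = list(BUNT_PCT.values())
league_avg = mean(values)
league_sd = pstdev(values)  # population SD; the sample SD rounds to the same .012


def deviation(team):
    """Distance of a team's bunt percentage from the league average."""
    return BUNT_PCT[team] - league_avg


def stdev_band(team):
    """1 = within one SD of the average, 2 = within two, and so on."""
    return max(1, math.ceil(abs(deviation(team)) / league_sd))


print(f"average {league_avg:.3f}, standard deviation {league_sd:.3f}")
for team in sorted(BUNT_PCT, key=lambda t: abs(deviation(t)), reverse=True):
    print(f"{team:4s} {deviation(team):+.3f} {stdev_band(team)}")
```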

So, Royster didn't really deviate all that much from the average NL manager in 2002. He was headed in that direction, but didn't quite make it. The real extremists when it came to the bunt were Frank Robinson, who bunted an awful lot (into the third standard deviation!), and Bruce Bochy and Bob Boone, at opposite ends of the spectrum.

There's another issue one should bear in mind when reading this sort of stuff. Royster's team wasn't particularly good, finishing with over 100 losses (94 down to Royster's tenure) and having the worst offense in the league, bar one (Pittsburgh). In such circumstances, a manager may call more bunts, steal more and go with the hit and run in order to make something happen, or light a spark under the team. It rarely works, but we tend to fidget more when we're unhappy than when we're happy. Because Royster's major-league career is so short, we'd really like to know more about his minor-league one. Perhaps, one day, when I assemble the statistics, we'll be able to shed some more light on whether, for Jerry Royster, opportunity doesn't knock but bunts.

Friday, 18 July 2008

Defensive Winning Percentage - Rivals

There are a couple of other well-known defensive metrics that attempt to estimate zone-based fielding for seasons without a Zone Rating. One is TotalZone, developed by Sean Smith (AROM at Baseball Think Factory), and the other is Dan Agonistes' Simple Fielding Runs.

Both of these estimate a Zone Rating using the Retrosheet play-by-play data, which only goes back as far as 1953. That's fine, but the current rate at which new seasons are added means we'll have to wait another thirty years or so to get all Major League history covered. I don't think I can wait that long, as I'll be near 80 by the time we get there. And I don't expect to live to such a ripe age.

In the meantime, we need another system, which is where I think my Defensive Winning Percentage wins out. With a few tweaks, I think I can improve it for the pre-1954 period, when it is hampered by not knowing the number of doubles and triples a pitching staff has given up.

Wednesday, 16 July 2008

A Defensive Metric 4

So far, I've identified these flaws with my Defensive Winning Percentage:

1) The statistics will, by their nature, include plays made on balls out of a fielder's zone. This is not necessarily a good thing.

2) The Positional Adjustment, being based on traditional Zone Rating, doesn't accommodate flaw (1). Thus, one ends up with estimated Zone Ratings in excess of 1, which is a logical nonsense. What I should do is adjust the positional factor by the distribution of plays on each team, but that opens up a nightmare world of a lot of work. Another option would be to cap the estimated Zone Rating. Or you could leave it, and bear it in mind when looking at some of the data.

3) There's no adjustment for groundball tendencies of pitchers, nor handedness of batters. This is quite conscious, in part because I don't see the point of doing this for a metric trying to measure what has happened, as opposed to talent level. In other words, I think a player whose pitching staff is pitching to his strengths, deserves credit. I may yet incorporate these factors, but at this stage I get the impression that on the whole the measurements even out.

4) It really is only what I'd call reasonably reliable as far back as detailed statistics from Retrosheet go, including the all-important doubles and triples given up by a team's pitchers. Without those, I'm more likely to believe conventional wisdom where my system challenges it.

I intend to stick with this unsatisfactory metric for the time being. I think it does a good job at giving us an idea of the relative ability of a team's fielders at a given position for an era before modern play-by-play metrics. That's all I ask of it, and that's all I ask you to ask of it.

Tuesday, 15 July 2008

A Defensive Metric 3

Having reached a total for Runs Saved Above Average for Washington Senators shortstops in 1970, we can use this to calculate a Defensive Winning Percentage.

Offensive Winning Percentage is a tried and tested sabermetric stat that uses a player's Runs Created to make an estimate of how many games a lineup of nine players with that Runs Created would win. You can find one set of Offensive Winning Percentages at each player's page on baseball-reference.com. Ed Brinkman's page, under the Special Batting heading, shows his 1970 OWP was .394. That means a team of Ed Brinkmans would win about 63 games in a 162-game season. Let's look at whether his glove made up for that.

The first step is to calculate how many runs an average team in that league scores. In 1970, 12 teams scored 8109 runs, so that means the average team scored 675.7.

Subtract the number of Runs Saved Above Average. That was 24.7 as we saw in the last post. (If it had been negative, as was the case with the Cleveland Indians' shortstops in 1970, we would add it.)

675.7 - 24.7 = 651

Now, with Offensive Winning Percentage, you'd take the runs created for a team of Ed Brinkmans, and divide it by the sum of Ed Brinkmans plus the league average, all raised to the exponent 0.83. That would represent the runs scored divided by the runs scored plus the runs allowed. We do the same thing, but in this case it is the runs allowed that is the variable. Eg,

Average Team's Runs Scored = 675.7
Team with Washington SS Runs Allowed = 651

675.7^0.83/(675.7^0.83 + 651^0.83) = .508
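
As a quick check on that arithmetic, here is a minimal Python sketch. The exponent of 0.83 and the run totals are the ones quoted above; the function name is just for illustration.

```python
def defensive_winning_pct(league_avg_runs, runs_saved_above_avg, exponent=0.83):
    """Winning percentage of an average offense backed by this defense."""
    # Runs saved above average come straight off the average runs allowed.
    runs_allowed = league_avg_runs - runs_saved_above_avg
    scored = league_avg_runs ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


dwp = defensive_winning_pct(675.7, 24.7)
print(f"DWP {dwp:.3f}, about {round(dwp * 162)} wins in 162 games")
```

A defense that saves exactly zero runs above average comes out at .500, which is the sanity check one would expect from the formula.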

Therefore, the Washington shortstops, backing up a league average offense, would win about 82 games. If you added 82 and 63, and divided by two, you'd get a reasonable estimate that a team of 1970 Washington Senators' shortstops, batting like Ed Brinkman, would win about 72 games. No way is Ed Brinkman going to carry you to the pennant, but he was a better player than his team, which could only manage 70 wins.

However, when you compare the Senators' shortstops to the rest of the league, you can see that, starting from the mean point of DWP's, .501, Washington, with Minnesota, had the best fielding shortstops in the league. The Twins' Leo Cardenas was probably the proper Gold Glove winner that season, as his fielding percentage was slightly better, .978 to Brinkman's .974. A DWP of .508 is excellent, but it isn't historically remarkable. You really need to push your DWP past .510 to get into remarkable territory.

Oh, who won the AL Gold Glove that year? It was Hall of Fame shortstop Luis Aparicio, then playing for the White Sox. Their shortstops managed a DWP of only .498, so Cardenas and Twins fans have a good case for feeling aggrieved.

I'm going to look next at a couple of problems with DWP, which I think is good, but which has a couple of flaws that always need to be borne in mind when using it.

EDIT: Edited to correct the omission of the Exponents!

Monday, 14 July 2008

A Defensive Metric 2

I promised an example of how to work out Defensive Winning Percentage, so let's use Eddie Brinkman, Washington Senators shortstop for 1970, his last season before the Detroit Tigers got him in a trade.

The first part of the process is derived from Charlie Saeger's system of Context-Adjusted Defense. This allocates outs and hits to the infield and the outfield.

The first step is to estimate how many groundball and flyball outs the team allowed for the season.

Groundball Outs = Team Assists - Catcher Assists - 1b DPs - OF Assists.
Flyball Outs = Team Putouts - Team Strikeouts - Team Assists.

For 1970 Washington, we get the following
GB outs = 1942 - 80 - 145 - 28 = 1689
FB outs = 4369 - 823 - 1942 = 1604
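
The two formulas are simple enough to sketch in Python, using the 1970 Washington totals quoted above. The function and parameter names are illustrative only.

```python
def groundball_outs(team_assists, catcher_assists, first_base_dps, outfield_assists):
    """Team assists with the non-groundball assists stripped out."""
    return team_assists - catcher_assists - first_base_dps - outfield_assists


def flyball_outs(team_putouts, team_strikeouts, team_assists):
    """Putouts that were neither strikeouts nor the end of an assisted play."""
    return team_putouts - team_strikeouts - team_assists


gb_outs = groundball_outs(1942, 80, 145, 28)
fb_outs = flyball_outs(4369, 823, 1942)
print(gb_outs, fb_outs)
```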

Then, you need to estimate the number of hits held against infielders, and then those against the outfielders.

IF Hits = Subtract from the total Team Hits Allowed the total of (home runs + doubles + triples). Take 70 percent of the difference. Multiply this by the ratio of GB outs to (GB outs + FB outs).
OF Hits = Subtract from the total Team Hits Allowed the number of home runs and the total of IF hits. Take 70 percent of this number.

For 1970 Washington, we get:
IF Hits = ((1375-(139+232+47))*0.7)*(1689/3293) = 343.6, or about 344
OF Hits = (1375-139-343.6)*0.7 = 625
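
Here is a hedged Python sketch of the hit allocation. It assumes the infield share is taken by multiplying by GB outs / (GB outs + FB outs), since that reading reproduces the 344 and 625 totals for Washington; treat that as an assumption rather than a statement of Charlie Saeger's exact method. The names are illustrative.

```python
def infield_hits(hits, home_runs, doubles, triples, gb_outs, fb_outs):
    """70 percent of the non-extra-base hits, scaled by the groundball share of outs."""
    singles = hits - (home_runs + doubles + triples)
    groundball_share = gb_outs / (gb_outs + fb_outs)  # assumed reading of the ratio
    return 0.7 * singles * groundball_share


def outfield_hits(hits, home_runs, if_hits):
    """70 percent of the hits left after removing home runs and infield hits."""
    return 0.7 * (hits - home_runs - if_hits)


if_hits = infield_hits(1375, 139, 232, 47, gb_outs=1689, fb_outs=1604)
of_hits = outfield_hits(1375, 139, if_hits)
print(round(if_hits), round(of_hits))
```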

Now you are ready to estimate the Zone Rating.

Subtract from the number of GB outs the number of Team pitcher assists. Add the number of IF hits. Take 30 percent of this number (this is a positional adjustment, like the one used in Win Shares) and add it to the number of errors made by Senators' shortstops. Take the result and use it to divide the number of shortstop assists. The result is your estimated Zone Rating for all Washington shortstops in 1970.

((1689-220)+344)*0.3 = 543
543 + 23 = 566
593/566 = 1.05
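
In Python, a minimal sketch of this step might look like the following. The 0.3 share for shortstop, the 220 pitcher assists, 23 errors and 593 assists are the figures used above; the function name is illustrative, and the sketch does not round the intermediate total.

```python
def estimated_zone_rating(gb_outs, pitcher_assists, if_hits,
                          position_share, errors, assists):
    """Assists divided by the estimated chances at the position."""
    # Balls the infield had to deal with, apportioned to this position.
    chances = (gb_outs - pitcher_assists + if_hits) * position_share + errors
    return assists / chances


zone_rating = estimated_zone_rating(
    gb_outs=1689, pitcher_assists=220, if_hits=344,
    position_share=0.3, errors=23, assists=593,
)
print(f"{zone_rating:.2f}")
```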

Multiply the estimated Zone Rating by the average number of plays made at the shortstop position, as calculated by Chris Dial in his Dr Strangeglove piece on Baseball Think Factory, and then by the Run Value he has calculated for plays made at each position. This is the Runs Saved by all Senators' shortstops in 1970.

1.05 * 532 *.753 = 420.63
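
The conversion is a single multiplication, sketched here in Python with the 532 plays and .753 run value quoted for shortstop. The function name and default arguments are illustrative only.

```python
def runs_saved(zone_rating, average_plays=532, run_value=0.753):
    """Runs saved at a position for a given estimated Zone Rating."""
    return zone_rating * average_plays * run_value


print(f"{runs_saved(1.05):.2f}")  # Washington shortstops, 1970
print(f"{runs_saved(1.00):.1f}")  # a Zone Rating of exactly 1
```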

Following the same procedures, we get a total for the American League in 1970 of 394.39 Runs Saved by all AL shortstops. Subtract the league runs saved from the Washington runs saved. This gives you how much better a Washington shortstop was than an average AL shortstop at saving runs.

420.63-394.39 = 24.7

Brinkman started 157 games at shortstop for Washington in 1970, so at worst we can credit a substantial portion of that quality to him. It's worth noting that in 1968, when Brinkman only appeared in 77 games at short, Washington shortstops managed to save 0.7 runs, so it seems likely that Brinkman is worth even more than 24.7.

You're now ready to calculate the Defensive Winning Percentage.

Saturday, 12 July 2008

A Defensive Metric

Last night, I was trying to explain, for the first time in public, to a couple of guys I know, my fielding metric for pre-Zone Rating players. I don't think it went very well, so I'll try again here.

You need to go back to Chris Dial's method of converting Zone Rating into Runs Saved, which is the foundation for what I do. In the modern era of Play-By-Play accounts, it's possible to create a Zone Rating to show how many balls hit into a player's fielding zone are turned into outs. However, before 1987 we don't have this information, so we have to figure out a way of estimating it.

Chris's research revealed that there is an average number of fielding chances at a position. For example, at shortstop the average team has 532 chances per season of 1440 innings. This number, I'm told, is apparently relatively static over the seasons. Chris also established a Run Value for each chance. Thus, an average team's shortstops save about 400.6 runs per season.

Since we have total values for all those who fielded at shortstop for Major League teams going way back in the history of Organized Baseball, we can, with a degree of confidence based on the recording of other information, estimate a Zone Rating for each position on a given team in a given season. And from that, we can work out how many runs the whole season's worth of players at a given position had saved.

I'll give a detailed example in the next post.

Monday, 7 July 2008

How do you follow that?

Follow what? Mickey Lolich's amazing 1971 season, to be exact.

GS   W   L  CG  ShO     IP    H  HR  HB   BB  IBB   SO   ERA  DIPS ERA
45  25  14  29    4    376  336  36   7   92    2  308  2.92      3.08

You don't see that sort of pitching any more. The last pitcher prior to Lolich to pitch 376 or more innings was Pete Alexander, pitching for the 1917 Philadelphia Phillies, who hurled 388. What's more, it's not like there was someone out there every two or three years coming close to Lolich. There's but a handful of pitchers who managed to get within hailing distance of around 340 innings. You'd think Lolich's arm would have fallen off, the way pitchers are babied nowadays. (Think of last season's Joba rules.)

Let's see what a DIPS projection would have made of Lolich going into the 1972 season.
IP       H  HR  HB   BB  IBB   SO  DIPS ERA
302.3  281  31   8  105    4  258      3.34

Not bad, but not the Cy Young award material that 1971 was. However, look at what really happened.
GS   W   L  CG  ShO     IP    H  HR  HB   BB  IBB   SO   ERA  DIPS ERA
41  22  14  23    4  327.3  282  29  11   74    5  250  2.50      3.10

Wow! And remember this season lost about two weeks to the strike, so Lolich was on pace more or less to amass the same number of starts. However, he was dominant early and began to fade as the season wore on. In late May there was talk of Lolich winning 30, talk he didn't do much to encourage.

The big thing to notice is the tremendous difference in walks. 105 expected dwindled to 74 for real. In other words, he improved on his 1971 walk rate, which went from 2.2 per 9IP to 2.03 per 9IP. He also improved on his projected hit rate, which may have been a product of the Tigers' defense, generally at the time respected for its quality. (We'll have a look at whether that was indeed the case another time.)
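
The walk rates are easy to verify with a small Python sketch. It treats the innings totals in the tables as plain decimals, which is an assumption about how they were entered; the function name is illustrative.

```python
def per_nine(count, innings_pitched):
    """Events per nine innings pitched."""
    return 9 * count / innings_pitched


print(f"1971 actual:    {per_nine(92, 376):.2f} walks per 9 IP")
print(f"1972 projected: {per_nine(105, 302.3):.2f} walks per 9 IP")
print(f"1972 actual:    {per_nine(74, 327.3):.2f} walks per 9 IP")
```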

During spring training, Tigers' manager Billy Martin had him working on his slow curve. Lolich himself admitted, according to a June issue of the Sporting News, that he threw maybe 80 percent most of the time, so that "when I reach back for that 100 percent, it's there." He also used two fastballs, a rising one for early in the game, and more of a sinker at the end, to go with his curve.

Lolich felt he hadn't been given his just deserts by the Cy Young voters in 1971, who gave the award to the league's 2d place pitcher in wins, complete games, strikeouts and 3d place in innings pitched (all behind Lolich's 1sts in those categories). Vida Blue did lead the league in ERA though. Maybe Mickey had a point, though, as Blue won the prize fairly solidly. Anyway, he had a fire in his big belly when he came out for the start of the season. Never rile an easygoing chubby man who rides motorcycles.

When it came to the 1972 Cy Young award, Lolich fell to 3d place behind Gaylord Perry (who started as hot for the Indians as Lolich did for the Tigers) and Wilbur Wood, the White Sox' equally rubber-armed ace who had a career year and pitched 376.66 innings.