
# Runs created

Runs created (RC) is a baseball statistic invented by Bill James to estimate the number of runs a hitter contributes to his team.

## Purpose

James explains in his book, The Bill James Historical Baseball Abstract, why he believes runs created is an essential thing to measure:
With regard to an offensive player, the first key question is how many runs have resulted from what he has done with the bat and on the basepaths. Willie McCovey hit .270 in his career, with 353 doubles, 46 triples, 521 home runs and 1,345 walks -- but his job was not to hit doubles, nor to hit singles, nor to hit triples, nor to draw walks or even hit home runs, but rather to put runs on the scoreboard. How many runs resulted from all of these things?[1]

Runs created attempts to answer this bedrock question. The conceptual framework of the "runs created" stat is:

$RC = \frac{A \times B}{C}$

where

• A = On-base factor
• B = Advancement factor
• C = Opportunity factor

## Formulae

### Basic runs created

In the most basic runs created formula:

$RC = \frac{(H + BB) \times TB}{AB + BB}$

where H is hits, BB is base on balls, TB is total bases and AB is at-bats.

This can also be expressed as:

OBP × SLG × AB
or,
OBP × TB

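where OBP is on-base percentage, taken here in the simplified form (H + BB) / (AB + BB) so that it matches the formula above, and SLG is slugging average (TB / AB).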
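The equivalence of these forms can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name `basic_runs_created` and the season line are hypothetical, and the zero-opportunity guard is an added assumption.

```python
def basic_runs_created(h: int, bb: int, tb: int, ab: int) -> float:
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    opportunities = ab + bb
    if opportunities == 0:
        # Assumption: a player with no opportunities is credited with zero runs.
        return 0.0
    return (h + bb) * tb / opportunities


# Hypothetical season line (not a real player's statistics).
h, bb, tb, ab = 150, 60, 250, 500
rc = basic_runs_created(h, bb, tb, ab)

# Simplified OBP and SLG, as used by the basic formula.
obp = (h + bb) / (ab + bb)
slg = tb / ab

# All three expressions give the same value.
assert abs(rc - obp * tb) < 1e-9
assert abs(rc - obp * slg * ab) < 1e-9
print(rc)  # 93.75
```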

### "Stolen base" version of runs created

This formula expands on the basic formula by accounting for a player's basestealing ability.

$RC = \frac{(H + BB - CS) \times (TB + (0.55 \times SB))}{AB + BB}$

where H is hits, BB is base on balls, CS is caught stealing, TB is total bases, SB is stolen bases, and AB is at-bats.
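As a rough sketch of how this version might be computed, the function below splits the formula into its A, B and C factors. The function name and the sample numbers are hypothetical, and returning zero when there are no opportunities is an added assumption.

```python
def stolen_base_runs_created(*, h: int, bb: int, cs: int, tb: int, sb: int, ab: int) -> float:
    """'Stolen base' runs created: (H + BB - CS) * (TB + 0.55 * SB) / (AB + BB)."""
    a = h + bb - cs        # on-base factor, reduced by times caught stealing
    b = tb + 0.55 * sb     # advancement factor, credited for stolen bases
    c = ab + bb            # opportunity factor
    if c == 0:
        return 0.0         # assumption: no opportunities means no runs created
    return a * b / c


# Hypothetical season line (not a real player's statistics).
rc = stolen_base_runs_created(h=150, bb=60, cs=5, tb=250, sb=20, ab=500)
print(f"{rc:.1f} runs created")
```

Keyword-only arguments are used here so that the many similar integer inputs cannot be passed in the wrong order.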

### "Technical" version of runs created

This formula accounts for all basic, easily available offensive statistics.

$RC = \frac{(H + BB - CS + HBP - GIDP) \times (TB + (0.26 \times (BB - IBB + HBP)) + (0.52 \times (SH + SF + SB)))}{AB + BB + HBP + SH + SF}$

where H is hits, BB is base on balls, CS is caught stealing, HBP is hit by pitch, GIDP is grounded into double play, TB is total bases, IBB is intentional base on balls, SH is sacrifice hit, SF is sacrifice fly, SB is stolen bases, and AB is at-bats.
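The technical version follows the same A × B / C pattern with more inputs. A minimal sketch, assuming a hypothetical function name and made-up sample statistics, with a zero guard that is not part of the formula itself:

```python
def technical_runs_created(*, h: int, bb: int, cs: int, hbp: int, gidp: int,
                           tb: int, ibb: int, sh: int, sf: int, sb: int,
                           ab: int) -> float:
    """'Technical' runs created, built from the A, B and C factors."""
    a = h + bb - cs + hbp - gidp                               # on-base factor
    b = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)   # advancement factor
    c = ab + bb + hbp + sh + sf                                # opportunity factor
    if c == 0:
        return 0.0  # assumption: no opportunities means no runs created
    return a * b / c


# Hypothetical season line (not a real player's statistics).
rc = technical_runs_created(h=150, bb=60, cs=5, hbp=5, gidp=10, tb=250,
                            ibb=5, sh=2, sf=5, sb=20, ab=500)
print(f"{rc:.1f} runs created")
```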

### 2002 version of runs created

Earlier versions of runs created overestimated the number of runs created by players with extremely high A and B factors (on-base and slugging), such as Babe Ruth, Ted Williams and Barry Bonds. This is because these formulas placed a player in an offensive context of players equal to himself; it is as if the player is assumed to be on base for himself when he hits home runs. Of course, this is impossible, and in reality, a great player is interacting with offensive players whose contributions are inferior to his. The 2002 version corrects this by placing the player in the context of his real-life team. This 2002 version also takes into account performance in "clutch" situations.

• A: $H + BB - CS + HBP - GIDP$
• B: $(1.125 \times Singles) + (1.69 \times Doubles) + (3.02 \times Triples) + (3.73 \times HR) + 0.29 \times (BB - IW + HBP) + 0.492 \times (SH + SF + SB) - (0.04 \times K)$
• C: $AB + BB + HBP + SH + SF$

where K is strikeouts and IW is intentional walks (the same quantity written IBB in the technical version).

The initial individual runs created estimate is then:

$RC = \left(\frac{(2.4C + A)(3C + B)}{9C}\right) - 0.9C$
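The A, B and C factors and the initial estimate can be sketched as follows. The function names are hypothetical, hits are derived as singles plus doubles plus triples plus home runs, and the zero guard is an added assumption rather than part of the formula.

```python
def rc_2002_factors(*, singles: int, doubles: int, triples: int, hr: int,
                    bb: int, iw: int, hbp: int, cs: int, gidp: int,
                    sh: int, sf: int, sb: int, k: int, ab: int):
    """Return the (A, B, C) factors of the 2002 version."""
    h = singles + doubles + triples + hr
    a = h + bb - cs + hbp - gidp
    b = (1.125 * singles + 1.69 * doubles + 3.02 * triples + 3.73 * hr
         + 0.29 * (bb - iw + hbp) + 0.492 * (sh + sf + sb) - 0.04 * k)
    c = ab + bb + hbp + sh + sf
    return a, b, c


def rc_2002_initial(a: float, b: float, c: float) -> float:
    """Initial individual estimate: (2.4C + A)(3C + B) / (9C) - 0.9C."""
    if c == 0:
        return 0.0  # assumption: no opportunities means no runs created
    return (2.4 * c + a) * (3 * c + b) / (9 * c) - 0.9 * c


# Extreme hypothetical case: one at-bat, one home run, nothing else.
a, b, c = rc_2002_factors(singles=0, doubles=0, triples=0, hr=1, bb=0, iw=0,
                          hbp=0, cs=0, gidp=0, sh=0, sf=0, sb=0, k=0, ab=1)
print(f"{rc_2002_initial(a, b, c):.2f}")  # 1.64
```

For comparison, the basic formula credits the same one-at-bat line with (1 × 4) / 1 = 4 runs, which illustrates how much the 2002 structure damps extreme A and B factors.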

If situational hitting information is available, the following should be added to the above total:

$H_{RISP} - (AB_{RISP} \times BA) + HR_{ROB} - \frac{AB_{ROB} \times HR}{AB}$

where RISP is runners in scoring position, BA is batting average, HR is home run, and ROB is runners on base. The subscripts indicate the required condition for the formula. For example, $H_{RISP}$ means "hits while runners are in scoring position."
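The adjustment compares what a hitter actually did in these situations with what his overall rates would predict. A minimal sketch, with a hypothetical function name and invented numbers; batting average is passed in directly, and returning zero when AB is zero is an added assumption:

```python
def situational_adjustment(*, h_risp: int, ab_risp: int, ba: float,
                           hr_rob: int, ab_rob: int, hr: int, ab: int) -> float:
    """H_RISP - AB_RISP * BA + HR_ROB - AB_ROB * HR / AB."""
    if ab == 0:
        return 0.0  # assumption: no at-bats means no adjustment
    expected_hits_risp = ab_risp * ba      # hits expected from overall average
    expected_hr_rob = ab_rob * hr / ab     # home runs expected from overall rate
    return (h_risp - expected_hits_risp) + (hr_rob - expected_hr_rob)


# Hypothetical hitter: one more hit with runners in scoring position than his
# .300 average predicts, and exactly the expected home runs with runners on.
adj = situational_adjustment(h_risp=40, ab_risp=130, ba=0.300,
                             hr_rob=15, ab_rob=250, hr=30, ab=500)
print(f"{adj:+.2f} runs")
```

A positive result raises the initial estimate and a negative result lowers it.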

This is then figured for every member of the team, and the individual estimates are added up to give an estimate of total team runs scored. The actual total of team runs scored is then divided by this estimated total, yielding a ratio of real to estimated team runs scored. Each individual estimate is then multiplied by this ratio to yield the final runs created figure for the individual.[2]
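The team-level scaling step described above can be sketched as follows. The function name, the player labels and the numbers are all hypothetical, and the handling of a zero estimated total is an added assumption.

```python
def scale_to_team_runs(estimates: dict[str, float],
                       actual_team_runs: float) -> dict[str, float]:
    """Scale individual estimates so that they sum to the team's actual runs."""
    estimated_total = sum(estimates.values())
    if estimated_total == 0:
        # Assumption: with nothing estimated, there is nothing to scale.
        return {name: 0.0 for name in estimates}
    ratio = actual_team_runs / estimated_total  # real / estimated team runs
    return {name: rc * ratio for name, rc in estimates.items()}


# Hypothetical three-player "team" whose estimates add up to 200 runs,
# while the team actually scored 210.
initial = {"player_a": 100.0, "player_b": 60.0, "player_c": 40.0}
final = scale_to_team_runs(initial, actual_team_runs=210.0)
for name, rc in final.items():
    print(f"{name}: {rc:.1f}")
```

After scaling, the individual figures add up to the team's actual run total, so each player keeps his share of the estimate while estimation error is spread proportionally across the roster.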

### Other expressions of runs created

The same information provided by runs created can be expressed as a rate stat, rather than a raw number of runs contributed. This is usually expressed as runs created per some number of outs, e.g. runs created per 27 outs, commonly written RC/27 (27 being the number of outs per team in a standard nine-inning baseball game).
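A rate of this kind might be computed as below. This is only a sketch: the function name is hypothetical, and because the text does not say how a hitter's outs are counted, the number of outs is taken as an input rather than derived from other statistics.

```python
def runs_created_per_outs(rc: float, outs: int, per: int = 27) -> float:
    """Runs created per `per` outs (27 by default, one team's outs in nine innings)."""
    if outs == 0:
        # Assumption: the rate is left undefined when no outs have been made.
        raise ValueError("rate is undefined when no outs have been made")
    return rc * per / outs


# Hypothetical hitter: 93.75 runs created while making 375 outs.
print(runs_created_per_outs(93.75, 375))  # 6.75
```

Read this way, the figure estimates how many runs per game a lineup made up entirely of this hitter would score.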

## Accuracy

Runs created is believed to be an accurate measure of an individual's offensive contribution because, when used on whole teams, the formula normally closely approximates how many runs the team actually scores. Even the basic version of runs created usually predicts a team's run total within a 5% margin of error.[3] Other, more advanced versions are even more accurate.

## Problems with runs created

An ecological fallacy objection sometimes comes up when discussing the validity of runs created (or any run estimator, for that matter): just because a formula does a good job of estimating team runs, so the argument goes, does not mean it can be accurately applied to individual production. However, this may be a misuse of the ecological fallacy. That fallacy does deal with incorrect assumptions being made about individuals based on group data, but the actions it usually applies to are purely individual in nature; voting distribution in a community is a common example. Since each person makes their choice independently of everyone else, voting data at the community level is a very different function from voting at the individual level. Run scoring, by contrast, is by nature a team process, and individual run creation is essentially the same function as team run scoring on a smaller scale, because all of the things that make teams score runs (getting on base and driving runners in) are done by individuals. Because team runs and player runs created are essentially measuring the same thing, the ecological fallacy is generally seen as not applicable to run estimation formulas. That aside, there are still other significant problems with runs created.

While even the simplest version of runs created estimates team runs with reasonable accuracy, the multiplicative (A × B)/C structure of the formula is fundamentally flawed when estimating the runs produced by an individual hitter, particularly a hitter with an extremely high on-base percentage and slugging average. The reason is that a player cannot get on base and then drive himself in; his on-base and slugging averages must interact with those of his teammates. Yet the simple OBP × TB form assumes that a player's own slugging interacts with his own on-base percentage, which artificially inflates RC for players who score well in both categories.
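The self-interaction can be made explicit by dividing the basic formula by its opportunity factor, $AB + BB$. This is only a rearrangement of the basic formula, not a separate formula from James, and the rates used below are hypothetical:

$\frac{RC}{AB + BB} = \frac{H + BB}{AB + BB} \times \frac{TB}{AB + BB}$

Runs created per opportunity is therefore the product of the player's own on-base rate and his own total bases per opportunity, so high values in both compound. A hitter with rates of .350 and .450 is credited with $0.350 \times 0.450 = 0.1575$ runs per opportunity, while one with rates of .500 and .800 is credited with $0.500 \times 0.800 = 0.400$, about two and a half times as much.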

Take an example: in isolation, Ryan Howard's on-base percentage and slugging average each have a real, discrete effect on the Philadelphia Phillies' offense, but when combined they overstate Howard's contribution by treating it as though he is both driving in players whose on-base ability equals his own and being driven in by players whose slugging ability equals his own. This model would be appropriate for a theoretical lineup of nine Ryan Howards, whose on-base and slugging abilities would interact in precisely this manner. However, Howard is in a lineup with players of lesser on-base and slugging abilities. His actual contribution to the Phillies in terms of runs is reduced because some of his on-base skill is wasted by teammates who lack his slugging ability, and some of his slugging skill is wasted by teammates who lack his on-base ability. Therefore, Howard's RC production must be adjusted downward to reflect this reality.

This is generally not a major issue for most players, as their OBPs and SLGs are not high enough to significantly distort their runs created; however, superstars who put up impressive OBPs and SLGs will frequently see their RC artificially inflated by this phenomenon. In recent years, James has modified the runs created formula to correct this error, effectively placing a player in a lineup of average players rather than assuming that a player's own slugging interacts with his own on-base percentage.

Runs created does not take into account the stadiums in which a player hits. Certain stadiums, such as Denver's Coors Field prior to the introduction of the baseball humidor, generally increase offensive production in games played there. Since each run scored in such stadiums is less valuable, the same number of runs created will translate into fewer wins in a stadium like Coors than it would elsewhere.

Runs created also does not take into account the era in which a player played. Due to various factors, some eras of baseball history have had lower or higher average levels of offensive production.

## Related statistics

• OPS (On-base Plus Slugging) is similar conceptually to runs created, except that it adds the A (on-base) and B (advancement) factors together, rather than multiplying them. This makes the statistic less accurate than runs created. However, OPS is easier for many fans to accept and embrace because they are already familiar with the individual OBP and SLG statistics that make it up, and because it is simple to figure out.
• Win Shares is James' attempt to summarize, in one stat, a player's contributions on both offense and defense.