# Thread: Strength Runs: strERA and SOS

1. ## Strength Runs: strERA and SOS

For a while, I've been toying with two new metrics. They are very simple in premise, but difficult to compute because you have to go through so many box scores. I heartily accept criticism, constructive or not, and I believe I am prepared to defend against most arguments. I am still learning the ropes of creating new statistics, but I think this one is at the very least worth glancing at.

Before I begin, I'm going to say that I've made up a lot of data for the sake of ease. I don't have the patience to sift through a hundred box scores for a statistic that may be blasted to pieces by the end of this thread. I will tell you where I've fabricated data. Plus, hypothetical numbers never hurt in the explanation of a metric. Wish me luck:

In 2006, Brewers' Ben Sheets pitched on April 21st against the Reds. Milwaukee won due to his endeavors. The Reds scored only two runs (both earned) against Sheets in seven innings. What I will use is something I call "strength runs" to determine how strong this start was, because everyone knows 2 ER allowed against a strong team isn't equal to 2 allowed against a weak team.

First, we need to see how strong the Reds' offense was that season up to the date of Sheets' start.

Code:
Reds	Opponent	IP	ER scored
Apr 3, 2006	CHC	9	6
Apr 5, 2006	CHC	8	6
Apr 6, 2006	PIT	8	6
Apr 7, 2006	PIT	8	7
Apr 8, 2006	PIT	8	8
Apr 9, 2006	PIT	9	3
Apr 11, 2006	CHC	9	9
Apr 12, 2006	CHC	9	1
Apr 13, 2006	CHC	9	7
Apr 14, 2006	STL	9	1
Apr 15, 2006	STL	9	3
Apr 16, 2006	STL	9	6
Apr 17, 2006	FLA	8	8
Apr 18, 2006	FLA	9	5
Apr 19, 2006	FLA	8.333	7
Apr 20, 2006	MIL	9	10
Where IP represents the innings played by the Reds offense, and ER means the earned runs they scored against their opponent.

Now I'm going to add another column labeled "STR" to represent strength runs (this is where I'm making up data)

Code:
Reds	Opponent	IP	ER scored	Strength Runs
Apr 3, 2006	CHC	9	6	-
Apr 5, 2006	CHC	8	6	-
Apr 6, 2006	PIT	8	6	-
Apr 7, 2006	PIT	8	7	-
Apr 8, 2006	PIT	8	8	-
Apr 9, 2006	PIT	9	3	-
Apr 11, 2006	CHC	9	9	-
Apr 12, 2006	CHC	9	1	-
Apr 13, 2006	CHC	9	7	8.4
Apr 14, 2006	STL	9	1	1.5
Apr 15, 2006	STL	9	3	4.5
Apr 16, 2006	STL	9	6	9
Apr 17, 2006	FLA	8	8	8
Apr 18, 2006	FLA	9	5	5
Apr 19, 2006	FLA	8.333	7	7
Apr 20, 2006	MIL	9	10	12
A strength run is awarded based on tiers. For example, the Reds scored 7 ER against the Cubs on Apr. 13th. Up until that date, the Cubs' pitching was allowing the 7th most earned runs in the NL. Thus, they fall in the "subpar" tier.

1st-2nd most ER allowed = bottom tier (1.0 strength runs, or SR)
3rd-5th most ER allowed = lower tier (1.1 SR)
6th-8th = subpar tier (1.2 SR)
9th-11th = surpar tier (1.3 SR)
12th-14th = upper tier (1.4 SR)
15th-16th = top tier (1.5 SR)

The Reds' offense scored 7 ER against a subpar earned run-allowing team, so every ER they scored was worth 1.2 strength runs. 7 x 1.2= 8.4. The next day, they scored only one run, but it was against a team in the top tier, so every run was worth 1.5 strength runs. Etc.

(I do not calculate SR for the first few days of the season because teams are only starting to show how strong/weak they are, hence the dashes).
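For anyone who wants to try this on real box scores, here is a rough Python sketch of the tier lookup described above. The function names are my own for illustration, the league is assumed to have 16 teams, and the Cardinals' exact rank (15th) is an assumption, since the post only says they were in the top tier.

```python
# Sketch of the pitching tiers listed above for a 16-team NL.
# Rank 1 = most earned runs allowed (weakest staff); rank 16 = fewest.
PITCHING_TIERS = [
    (2, 1.0),   # ranks 1-2: bottom tier
    (5, 1.1),   # ranks 3-5: lower tier
    (8, 1.2),   # ranks 6-8: subpar tier
    (11, 1.3),  # ranks 9-11: surpar tier
    (14, 1.4),  # ranks 12-14: upper tier
    (16, 1.5),  # ranks 15-16: top tier
]


def tier_multiplier(rank):
    """Return the strength-run value of one earned run for a given rank."""
    if not 1 <= rank <= 16:
        raise ValueError(f"rank {rank} is outside a 16-team league")
    for highest_rank, multiplier in PITCHING_TIERS:
        if rank <= highest_rank:
            return multiplier


def strength_runs(earned_runs, opponent_rank):
    """Earned runs scored, weighted by the opposing staff's tier."""
    return round(earned_runs * tier_multiplier(opponent_rank), 1)


print(strength_runs(7, 7))   # Apr 13 vs CHC (7th most ER allowed) -> 8.4
print(strength_runs(1, 15))  # Apr 14 vs STL (top tier, rank assumed) -> 1.5
```

The offensive tiers further down use the same cut-offs with the labels reversed, so the same lookup would work there with the rank taken from ostrERA instead.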

From April 13th to April 20th, the Reds' offense accumulated 55.4 strength runs over 70.1 innings played (70 1/3 in box-score notation).
Back to Sheets' start: Sheets was facing an offense with a 7.09 offensive strength ERA (ostrERA). That is, the Reds offense scored 7.09 strength runs for every nine innings they played. From the 13th to the 20th, of all NL teams, this places the Reds at the third highest ostrERA (I say "oh-stra"). Thus, they're in the upper tier. We go back to the same formatted tiers but with different labels:

1st-2nd highest ostrERA = top tier (1.0 SR)
3rd-5th = upper tier (1.1 SR)
6th-8th = surpar tier (1.2 SR)
9th-11th = subpar tier (1.3 SR)
12th-14th = lower tier (1.4 SR)
15th-16th = bottom tier (1.5 SR)

For every earned run Sheets allowed to the Reds, he was only charged with surrendering 1.1 strength runs because he faced a difficult offense. In 7 IP, his two ER amount to 2.2 strength runs allowed (SRA). His strERA for this start was 2.83, as opposed to his regular ERA of 2.57. To determine the season's strERA, you simply add up his total SRA, divide by IP, and multiply by nine, just as with ERA. However, day-by-day data is required, as you can see, so you would need a powerful entity like BBRef to provide it.
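The arithmetic for ostrERA and strERA can be checked with a few lines of Python. This is only a sketch: the helper name `per_nine` is mine, the strength-run values are the made-up ones from the table, and innings are entered as real numbers (8 1/3 rather than 8.1).

```python
def per_nine(runs, innings):
    """Scale a run total to a nine-inning rate, the same way ERA does."""
    return runs * 9 / innings


# Reds offense, Apr 13-20: strength runs and innings from the table above.
reds_sr = [8.4, 1.5, 4.5, 9, 8, 5, 7, 12]
reds_ip = [9, 9, 9, 9, 8, 9, 8 + 1 / 3, 9]

ostr_era = per_nine(sum(reds_sr), sum(reds_ip))
print(round(ostr_era, 2))  # 7.09

# Sheets on Apr 21: 2 ER in 7 IP against an upper-tier offense (1.1 SR per ER).
sra = 2 * 1.1
print(round(per_nine(sra, 7), 2))  # strERA for the start: 2.83
print(round(per_nine(2, 7), 2))    # plain ERA for the start: 2.57
```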

The second portion of this thread is dedicated to the pitchers' strength of start (SOS). Very simple. Start with five points and subtract the SRA. Multiply this total by 1.0 if the starting pitcher goes fewer than 7 innings. Multiply by 1.1 for a pitcher who goes at least 7 but fewer than 8 IP. Multiply by 1.2 for at least 8 but fewer than 9 IP, by 1.3 for at least 9 but fewer than 10 IP, and by 1.4 for 10 or more innings pitched. Sheets' SOS on 4/21/06 was 3.08 because he allowed 2.2 strength runs (5 - 2.2 = 2.8) and pitched exactly seven innings (2.8 x 1.1 = 3.08). Season totals are determined by adding everything up and/or dividing by games started.
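A minimal Python sketch of the SOS calculation, assuming each innings bracket includes its lower bound (exactly seven innings earns the 1.1 multiplier, as in the Sheets example) and that innings are given as real numbers. The function names are illustrative only. Applied to the Sheets start, (5 - 2.2) x 1.1 works out to 3.08.

```python
def innings_multiplier(ip):
    """Bonus multiplier for going deeper into the game (brackets assumed
    to include their lower bound)."""
    if ip < 7:
        return 1.0
    if ip < 8:
        return 1.1
    if ip < 9:
        return 1.2
    if ip < 10:
        return 1.3
    return 1.4


def strength_of_start(sra, ip):
    """Five points minus strength runs allowed, scaled by innings pitched."""
    return round((5 - sra) * innings_multiplier(ip), 2)


print(strength_of_start(2.2, 7))  # Sheets, 4/21/06 -> 3.08
```

One thing the sketch makes easy to see: when SRA is above five, the base goes negative, so a longer outing makes the score worse rather than better.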

My metric has no regard for parks or defense. Not only can both be subjectively adjusted by a fan, but I want to be sure this is something actually worthwhile before pursuing that complicated stuff.

Awarding strength runs based on team rank isn't as useful as tiers. I use tiers to compensate for reality. Many things can happen to make a team one run scored above or below another. It's like saying a guy who has 201 hits was better than the guy with 200 hits. Grouping the teams allows us to do what we would if we were simply eyeballing the numbers all at once. Oh, these teams are very good at scoring/preventing runs, these next few aren't as good, the next few were rather poor, and the last few were dreadful. That sort of thing.

That's it. I await BBF responses. I'm suited for all incoming missiles, bombs, torpedoes, dirty bombs, ICBMs, and explosive packages.
Last edited by Tyrus4189Cobb; 09-04-2012 at 03:50 PM.

2. No dirt bombs, suspiciously wrapped packages or other ominous attacks from me; but I have a suggestion that I admit may be too simplistic.

This approach lends itself to in-season evaluations or career evaluative summaries:

1. Batters Faced, Pitcher [BFP] tells us a great deal about relative pitcher performance, if our minds are open to the concept.
2. BFP can be interpreted as capital invested in each start [or the sum of collective starts].
3. BFP can be reviewed, as updated, from opening day through the end of a season in pitching records.
4. Conversely, we can deduce BFP from the offensive team hitting strengths by using Runs/PA as a corollary.

Example: In all of MLB history [1876-yesterday] average runs scored = 4.53, with average BF ranging from 36.4 [in very poor batting climates] to 38.4 [in heavy run-scoring climates]. In the mid-range, we can sensibly relate a run climate < 4.5 to 37 PA and >= 4.5 to 38 PA.

4.53/38 = .1192

4.25/37 = .1149

4.00/37 = .1081

Say a pitcher goes 7.67 innings in a start, yielding 3 runs, all earned, facing 26 batters in the process. His line would look like this: 3/26 = .1154.
If the opposing team was a relative powerhouse, having seasonal numbers of 388 runs scored in 2985 PA, their offense status rank would be 388/2985 = .1300.
The pitcher's relative performance looks excellent.

However, say this same outing took place against a cellar-dwelling club with weak hitting and a line of 254 runs in a similar 2985 PA = .0851, the performance looks mediocre at best.
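The comparison above can be reproduced with a short Python sketch. The function and variable names are only illustrative; the figures are the ones from the example.

```python
def runs_per_batter(runs, batters):
    """Runs divided by batters faced (or plate appearances, for an offense)."""
    return runs / batters


pitcher = runs_per_batter(3, 26)            # the 7.67-inning start
powerhouse = runs_per_batter(388, 2985)     # strong opposing offense
cellar_dweller = runs_per_batter(254, 2985)  # weak opposing offense

print(f"{pitcher:.4f} {powerhouse:.4f} {cellar_dweller:.4f}")
# 0.1154 0.1300 0.0851

# Below the opponent's usual rate looks good; above it looks mediocre.
print(pitcher < powerhouse)      # True
print(pitcher < cellar_dweller)  # False
```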

If I read your intent correctly, would this achieve your goal more directly and accessibly?

3. Take cover!!!
9-anti-submarine-missile.jpg
Oops, that's just a leftover missile that was meant for the ER/GS thread. I will signal it to self-destruct. Sorry about that.

4. Originally Posted by leewileyfan
If I read your intent correctly, would this achieve your goal more directly and accessibly?
I think I'm reading your intent incorrectly, but if I understand your post, it wouldn't achieve my goal. You use runs according to batters faced. My metric uses innings played, not BF, because I quantify the first portion, ostrERA, into a number that acts like ERA. I am looking to see how many runs are scored by an offense given the amount of outs they have to play with in a game. Your BF approach differs because it is measuring a pitcher's performance in terms of how he fared against the number of batters he faced, something that changes even if he pitched exactly six innings every game. It is runs relative to the amount of opponents, whereas my approach is runs relative to innings, which in itself has to do with how the game has progressed (6 innings is always 18 outs but not always 22 BF), similar to ERA.

I also specify the use of earned runs and not runs. Runs win the game, but I'm measuring the strength of a pitcher's start, not his impact on the team's record. As a whole, earned runs are the only objective way to determine how much damage the pitcher is allowing or preventing. Unless one wants to take the time to sift through gamelogs, earned runs are the fault of a pitcher (in general). It isn't practical to go through each error/wild pitch/passed ball of every game and determine which runs still should or shouldn't have scored.

Awarding strength runs based on team rank isn't as useful as tiers. I use tiers to compensate for reality. Many things can happen to make a team one run scored above or below another. It's like saying a guy who has 201 hits was better than the guy with 200 hits. Grouping the teams allows us to do what we would if we were simply eyeballing the numbers all at once. Oh, these teams are very good at scoring/preventing runs, these next few aren't as good, the next few were rather poor, and the last few were dreadful. That sort of thing.

5. I used "runs" as a convenience. I am sure earned runs can be extrapolated from "runs"; but even at that, I sometimes believe just plain old "runs" may better betray how essentially vulnerable a pitcher is to surrendering runs to the opposition.

It's a kind of RC/PA for pitchers, a side door approach to how pitchers fare against degrees of offense[s] they face. It would toss out the 3.45 format of ERA, FIP ERA, etc. and replace it with a four decimal place pitcher ROPE [return of pitching effort, per batter faced].

6. I think we'll agree to disagree.

Kinda hoping I'd get feedback from more than one person...

7. I like the concept, but can you take it further and, instead of looking at how the Reds fared against each team they played to determine strength runs, look at how they fared against the individual pitchers of that team? The results can be very different if you miss the heart of the rotation, or if you don't have to face a top closer.

Also wondering why you are only looking at performance year-to-date. I would think it would be more accurate looking at the opponent's performance before and after the event. Also, you are implying that you are using season-to-date, even if it is toward the end of the season. Teams play very differently throughout the season. You may want to consider limiting it to the 20 games before and 20 games after, or something like that. Of course you run into the difficulty of the beginning and end of year, which I don't have a good solution for.

In theory the lineups can also be dissected to see if all the "A" starters were in the game, and adjust accordingly, but this is probably beyond the scope of a simple metric.

8. Originally Posted by Brooklyn
I like the concept, but can you take it further and, instead of looking at how the Reds fared against each team they played to determine strength runs, look at how they fared against the individual pitchers of that team? The results can be very different if you miss the heart of the rotation, or if you don't have to face a top closer.
Certainly an option

Originally Posted by Brooklyn
Also wondering why you are only looking at performance year-to-date. I would think it would be more accurate looking at the opponent's performance before and after the event. Also, you are implying that you are using season-to-date, even if it is toward the end of the season. Teams play very differently throughout the season. You may want to consider limiting it to the 20 games before and 20 games after, or something like that. Of course you run into the difficulty of the beginning and end of year, which I don't have a good solution for.
Another option. I use YTD because it is the team's strength up until facing the selected starter. Teams rise and fall throughout the season, so a team may be strong prior to playing June 30th then falter midway through July. Yet the starting pitcher on June 30th is facing the strong lineup that has been playing up to that date. Everything after is irrelevant. But I see what you mean about limiting the scope, because a team off to a poor start can become stronger but still seem average when the numbers balance out. Perhaps the previous 20 games are more appropriate.
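If the previous-20-games idea were adopted, the window could be sketched in Python like this. It is only an illustration: `trailing_ostr_era` is a made-up name, the game list reuses the made-up Reds strength runs from the first post, and the three-game window is there only because the sample has just eight games.

```python
def trailing_ostr_era(games, window=20):
    """ostrERA over the last `window` games before a start.

    `games` is a chronological list of (strength_runs, innings) pairs,
    all played before the start being evaluated.
    """
    recent = games[-window:]
    total_ip = sum(ip for _, ip in recent)
    if total_ip == 0:
        raise ValueError("no innings in the selected window")
    total_sr = sum(sr for sr, _ in recent)
    return total_sr * 9 / total_ip


# Reds, Apr 13-20 (strength runs, innings played).
reds = [(8.4, 9), (1.5, 9), (4.5, 9), (9, 9),
        (8, 8), (5, 9), (7, 8 + 1 / 3), (12, 9)]

print(round(trailing_ostr_era(reds), 2))            # all eight games
print(round(trailing_ostr_era(reds, window=3), 2))  # last three games only
```

Early in the season the window simply holds fewer games, which is the same small-sample problem that led to the dashes in the first table.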

Originally Posted by Brooklyn
In theory the lineups can also be dissected to see if all the "A" starters were in the game, and adjust accordingly, but this is probably beyond the scope of a simple metric.
That's why a powerful engine like Baseball-Reference or Fangraphs would have to use programs to supply such info. I am but one man. Perhaps there would be a way to determine individual strength runs then see how much the lineup adds up to, but I prefer my way for the sake of simplicity. After all, the metric is concerned with pitcher vs. team offenses.

9. At the end of this season, I hope to have the strERAs of all starting pitchers with at least 12 games started.

10. Two things:

1. I'm not a fan of just making up coefficients out of thin air.

2. Ostrera in Spanish means "Oyster-wench, oyster-woman"

11. Originally Posted by JDanger
Two things:

1. I'm not a fan of just making up coefficients out of thin air.

2. Ostrera in Spanish means "Oyster-wench, oyster-woman"
1. It wouldn't matter if everyone is being judged on the same criteria and can be awarded the same coefficient

2. Oh well

12. Originally Posted by JDanger
Two things:

1. I'm not a fan of just making up coefficients out of thin air.
Agreed. People typically weight their numbers in ways that favor their purpose. Remember when everybody would multiply ERA+ by IP to gauge pitcher greatness? The question is...why multiply by 1.1/1.2, etc. Why not 1.5/1.6/1.7, etc.? Why not 2.1/2.2/2.3?

13. Originally Posted by Matthew C.
Agreed. People typically weight their numbers in ways that favor their purpose. Remember when everybody would multiply ERA+ by IP to gauge pitcher greatness? The question is...why multiply by 1.1/1.2, etc. Why not 1.5/1.6/1.7, etc.? Why not 2.1/2.2/2.3?
I could multiply by 2.1. I could multiply by 2.18897. I could multiply by 4,534.928. It doesn't matter. I'm assigning ascending values to ascending tiers. Everyone is subject to the same factors.

I started with 1.0 because a run is a run, so it seemed foolish to say that allowing a run against a weak team is worth less than a run. Then I went up by increments of 0.1 because they are small yet effective. It makes strERA resemble ERA because it is, in essence, a metric that resembles the statistic. If I went with 1.0, 2.0, etc. it would make the differences between pitchers too vast to really understand how good/bad they were.
Last edited by Tyrus4189Cobb; 09-08-2012 at 09:52 AM.

14. Originally Posted by Tyrus4189Cobb
I could multiply by 2.1. I could multiply by 2.18897. I could multiply by 4,534.928. It doesn't matter. I'm assigning ascending values to ascending tiers. Everyone is subject to the same factors.

I started with 1.0 because a run is a run, so it seemed foolish to say that allowing a run against a weak team is worth less than a run. Then I went up by increments of 0.1 because they are small yet effective. It makes strERA resemble ERA because it is, in essence, a metric that resembles the statistic.
How did you come up with the increments that you did?

And it "does matter" when you multiply it by a "tiered" range. There is a big difference between 1.2 x 3 and 1.2 x 5 vs. 4,534 x 3 and 4,534 x 5.
Last edited by Matthew C.; 09-08-2012 at 09:55 AM.

15. Originally Posted by Matthew C.
How did you come up with the increments that you did?

And it "does matter" when you multiply it by a "tiered" range. There is a big difference between 1.2 x 3 and 1.2 x 5 vs. 4,534 x 3 and 4,534 x 5.
I already explained how I got my increments.

The difference between 1.2 x 3 and 120 x 3 is 356.4. You could do that, if you want. It's the same mathematical result of bigger vs. smaller. It just looks messier.

Tier 1- multiply by 1.2
tier 2- multiply by 3.8.

Bob allows 4 ER to a tier 1 team in six innings. Gary allows 4 ER to a tier 2 team in six innings. Bob's strERA is 7.20. Gary's strERA is 22.80. Gary has a larger strERA because he allowed ER to a lesser team.

OR, I can make things easier and say tier 2's factor is 1.3. Now Gary's strERA is 7.80. Still larger, but now in a neater package. 7.20 vs 7.80 looks better and is more mentally comparable than 7.20 vs 22.80.
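The Bob and Gary numbers can be checked with a quick Python sketch (the function name `str_era` is just for illustration). It shows that the ordering survives either choice of factor, while the size of the gap depends entirely on which factor is picked.

```python
def str_era(earned_runs, multiplier, innings):
    """Earned runs weighted by a tier multiplier, scaled to nine innings."""
    return round(earned_runs * multiplier * 9 / innings, 2)


bob = str_era(4, 1.2, 6)          # tier 1 opponent
gary_wide = str_era(4, 3.8, 6)    # tier 2 opponent, factor 3.8
gary_narrow = str_era(4, 1.3, 6)  # tier 2 opponent, factor 1.3

print(bob, gary_wide, gary_narrow)  # 7.2 22.8 7.8

# Gary ranks behind Bob either way; only the gap changes.
print(bob < gary_wide and bob < gary_narrow)  # True
```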
