A Collection of Data Science Projects by Patrick DeAngelis

Men’s Baseball League Heroics!
This first project takes a swing at bringing the wonders of ‘advanced analytics’ to my summer baseball team. I used FanGraphs formulas and modified them where I had to. The biggest problem I ran into was data collection: it's hard to get guys to keep a pitching chart for a Sunday morning men’s league. To do the calculations, I used Python and a Jupyter notebook.
In [ ]:
import pandas as pd
import numpy as np
In [ ]: Gathering all of the separate league stats into individual data frames
df1 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=973090&teamid=6819796')[0]
df1['Team'] = str('Hurricanes')
df1['division'] = str('west')
df2 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=973090&teamid=6834815')[0]
df2['Team'] = str('Four Seam Falcons')
df2['division'] = str('west')
df3 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=973090&teamid=6839726')[0]
df3['Team'] = str('Kings')
df3['division'] = str('west')
df4 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=973090&teamid=6819004')[0]
df4['Team'] = str('Royals')
df4['division'] = str('west')
df5 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=973090&teamid=6818906')[0]
df5['Team'] = str('Huskies')
df5['division'] = str('west')
df6 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6809038')[0]
df6['Team'] = str('SUFFOLK YANKEES')
df6['division'] = str('east')
df7 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6825321')[0]
df7['Team'] = str("EASTERN A's")
df7['division'] = str('east')
df8 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6837495')[0]
df8['Team'] = str('HUNTINGTON SAINTS')
df8['division'] = str('east')
df9 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6834814')[0]
df9['Team'] = str('LI COBRAS')
df9['division'] = str('east')
df10 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6830526')[0]
df10['Team'] = str('LI CRUSH')
df10['division'] = str('east')
df11 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6845760')[0]
df11['Team'] = str('ORDER 66')
df11['division'] = str('east')
df12 = pd.read_html('https://www.leaguelineup.com/teams_baseball.asp?url=lism21&divisionid=962033&teamid=6862702')[0]
df12['Team'] = str('TRASH PANDAS')
df12['division'] = str('east')
In [ ]:
combining them into one big data frame
stats = [df1, df2, df3, df4, df5, df6, df7, df8, df9, df10, df11, df12]
df = pd.concat(stats)
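The twelve near-identical blocks above could also be driven from a single lookup table. The following is a minimal sketch, not the code that produced the results here: the `load_teams` helper, the `TEAMS` mapping, and the injectable `reader` argument are hypothetical names introduced for illustration, and only two of the twelve teams are listed.

```python
import pandas as pd

BASE_URL = "https://www.leaguelineup.com/teams_baseball.asp?url=lism21"

# Hypothetical lookup table: team name -> (division, divisionid, teamid).
# Only two teams are shown; the IDs are copied from the URLs above.
TEAMS = {
    "Hurricanes": ("west", 973090, 6819796),
    "SUFFOLK YANKEES": ("east", 962033, 6809038),
}


def load_teams(teams, reader=pd.read_html):
    """Read the first table on each team page and tag it with team and division."""
    frames = []
    for name, (division, division_id, team_id) in teams.items():
        url = f"{BASE_URL}&divisionid={division_id}&teamid={team_id}"
        frame = reader(url)[0].copy()
        frame["Team"] = name
        frame["division"] = division
        frames.append(frame)
    # Like the cell above, this keeps each team's original row index.
    return pd.concat(frames)
```

Calling `load_teams(TEAMS)` would fetch the pages over the network. Passing a different `reader` makes it possible to try the function on saved HTML or a stub instead.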
In [ ]: creating variables for use in formulas
#batting variables
tb = df.loc[:, 'TB']
sh = df.loc[:, 'SH']
cs = df.loc[:, 'CS']
sb = df.loc[:, 'SB']
k = df.loc[:, 'K']
bb = df.loc[:, 'BB']
hbp = df.loc[:, 'HBP']
h = df.loc[:, 'H']
dub = df.loc[:, '2B']
trip = df.loc[:, '3B']
hr = df.loc[:, 'HR']
rbi = df.loc[:, 'RBI']
ab = df.loc[:, 'AB']
ibb = df.loc[:, 'IBB']
sf = df.loc[:, 'SF']
pa = df.loc[:, 'PA']
In [ ]:
#calculating weighted on base average
wo = ((0.693 * bb) + (0.723 * hbp) + (0.877 * h) + (1.232 * dub) + (1.552 * trip) + (1.980 * hr))
ba = (ab + bb - ibb + sf + hbp)
woba = round(wo/ba, 3)
df['wOBA'] = woba
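One thing worth checking in this cell is that the 0.877 weight is applied to all hits (`h`), so doubles, triples, and home runs are counted once at 0.877 and again at their own weight. The FanGraphs formula applies that weight to singles only. Here is a small sketch of the difference, using Vinny Grassano's line from the Hurricanes table. The function name `woba` and its `singles_only` flag are hypothetical, introduced just for this comparison, and everything else is kept as in the cell above.

```python
def woba(ab, h, dub, trip, hr, bb, ibb, hbp, sf, singles_only=True):
    """wOBA with the same weights as the notebook cell.

    singles_only=True applies 0.877 to singles (h - dub - trip - hr);
    singles_only=False reproduces the notebook, which applies it to all hits.
    """
    base_hits = h - dub - trip - hr if singles_only else h
    numerator = (0.693 * bb + 0.723 * hbp + 0.877 * base_hits
                 + 1.232 * dub + 1.552 * trip + 1.980 * hr)
    denominator = ab + bb - ibb + sf + hbp
    return round(numerator / denominator, 3)


# Vinny Grassano: 34 AB, 19 H, 11 2B, 0 3B, 0 HR, 3 BB, 0 IBB, 0 HBP, 1 SF
vinny = dict(ab=34, h=19, dub=11, trip=0, hr=0, bb=3, ibb=0, hbp=0, sf=1)
print(woba(**vinny, singles_only=False))  # 0.85, matching the leaderboard
print(woba(**vinny))                      # 0.596 with singles only
```

Either version ranks hitters in a broadly similar way, but the singles-only form is the one that stays on the familiar wOBA scale.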
In [ ]:
#calculating weighted runs above average
# Weighted Runs Above Average (wRAA) measures the number of offensive runs a player contributes to their team compared to the average player.
# A wRAA of zero is league-average, so a positive wRAA value denotes above-average performance and a negative wRAA denotes below-average performance.
# This is also a counting statistic (like RBIs), so players accrue more (or fewer) runs as they play.
wOBA_scale = 1.2
lgwOBA = df['wOBA'].mean()
wRAA = ((woba - lgwOBA)/wOBA_scale) * pa
wRAA = round(wRAA, 3)
df['wRAA'] = wRAA
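Note that `df['wOBA'].mean()` gives every player with a wOBA equal weight, so a three-PA cameo moves the league average as much as a full season does. A possible alternative, sketched here as an assumption rather than the method used for the tables in this post, is to weight each player's wOBA by plate appearances. The helper names `wraa` and `pa_weighted_mean` are hypothetical.

```python
def wraa(woba, league_woba, pa, woba_scale=1.2):
    """Runs above average: wOBA gap to the league, rescaled, times plate appearances."""
    return round((woba - league_woba) / woba_scale * pa, 3)


def pa_weighted_mean(wobas, pas):
    """League wOBA where each player counts in proportion to plate appearances."""
    return sum(w * p for w, p in zip(wobas, pas)) / sum(pas)


# Two Hurricanes rows from the team table: Anthony Neal and Pat DeAngelis.
wobas = [0.523, 0.258]
pas = [3, 59]

plain = sum(wobas) / len(wobas)          # what .mean() would give for these two
weighted = pa_weighted_mean(wobas, pas)  # dominated by the 59-PA season
print(plain, weighted)

# A hitter exactly at the league average contributes zero runs above average.
print(wraa(0.3, 0.3, 50))  # 0.0
```

With only these two rows, the plain mean sits halfway between the two players, while the weighted mean lands close to the 59-PA line.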
In [ ]:
#SecA = Secondary average *accounts for power, discipline and speed
seca = (bb + (tb - h) + (sb - cs))/ab
seca = round(seca, 3)
df['SecA'] = seca
In [ ]:
#rc = runs created
rc = ((h + bb - cs + hbp) * (tb + (.26*(bb - ibb + hbp)) + (.52 * (sh + sf + sb))))/(ab + bb + hbp + sh + sf)
rc = round(rc, 3)
df['RC'] = rc
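To make the runs created formula easier to check by hand, here is a sketch that splits it into its three usual parts (on-base factor, advancement factor, opportunities) and runs it on two rows from the Hurricanes table. The function `runs_created` is a hypothetical helper. Its optional `gidp` argument reflects the common "technical" version of the formula, which also subtracts double plays grounded into; the cell above leaves that term out, so it defaults to zero here.

```python
def runs_created(h, bb, cs, hbp, tb, ibb, sh, sf, sb, ab, gidp=0):
    """Runs created, laid out as (A * B) / C."""
    on_base = h + bb - cs + hbp - gidp                             # A
    advancement = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)  # B
    opportunities = ab + bb + hbp + sh + sf                        # C
    return round(on_base * advancement / opportunities, 3)


# Vinny Grassano: 19 H, 3 BB, 30 TB, 1 SB, 1 SF, 34 AB
print(runs_created(h=19, bb=3, cs=0, hbp=0, tb=30, ibb=0,
                   sh=0, sf=1, sb=1, ab=34))  # 18.422

# Pat DeAngelis: 10 H, 4 BB, 13 TB, 1 SF, 54 AB
print(runs_created(h=10, bb=4, cs=0, hbp=0, tb=13, ibb=0,
                   sh=0, sf=1, sb=0, ab=54))  # 3.455
```

Both values match the RC column in the tables, which is a quick way to confirm the column order of the scraped data.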
In [ ]:
df = df.sort_values(by='wOBA', ascending=False)
In [ ]:
#Batting leaders
leaders = df[['Name','AVG', 'OPS', 'wOBA', 'SecA', 'RC', 'wRAA', 'Team', 'division']]
leaders = leaders.loc[leaders['RC'] > 5]
leaders.head(15)
Out[ ]: Here are our league leaders from about 15 teams across Nassau County. Vinny is my shortstop, and as you can see, he had a pretty good season. That .850 wOBA is really something.
| | Name | AVG | OPS | wOBA | SecA | RC | wRAA | Team | division |
|---|---|---|---|---|---|---|---|---|---|
| 14 | Vinny Grassano | 0.559 | 1.461 | 0.850 | 0.441 | 18.422 | 15.965 | Hurricanes | west |
| 17 | Brett Maier | 0.437 | 1.464 | 0.772 | 0.625 | 8.305 | 6.747 | HUNTINGTON SAINTS | east |
| 13 | Nick Isernia | 0.511 | 1.399 | 0.764 | 0.422 | 22.396 | 18.120 | HUNTINGTON SAINTS | east |
| 1 | Chris Burns | 0.488 | 1.300 | 0.693 | 0.341 | 18.685 | 14.754 | LI CRUSH | east |
| 0 | Alberto Argotte | 0.435 | 1.243 | 0.670 | 0.419 | 25.547 | 20.259 | LI COBRAS | east |
| 11 | Connor McHale | 0.467 | 1.214 | 0.661 | 0.822 | 22.717 | 13.919 | Kings | west |
| 13 | Mike Grassano | 0.476 | 1.249 | 0.661 | 0.333 | 16.243 | 11.818 | Hurricanes | west |
| 2 | Corey Elowsky | 0.444 | 1.213 | 0.625 | 0.722 | 32.116 | 19.308 | Four Seam Falcons | west |
| 1 | Ken Danielsen | 0.426 | 1.199 | 0.612 | 0.404 | 17.300 | 12.420 | Royals | west |
| 7 | JT Cullen | 0.444 | 1.135 | 0.611 | 0.250 | 12.341 | 9.280 | HUNTINGTON SAINTS | east |
| 22 | Liam Shannon | 0.447 | 1.201 | 0.607 | 0.526 | 15.621 | 10.011 | Four Seam Falcons | west |
| 18 | Zach Hood | 0.435 | 1.147 | 0.607 | 0.478 | 8.022 | 5.658 | EASTERN A’s | east |
| 20 | Chris Olberding | 0.475 | 1.133 | 0.576 | 0.361 | 23.071 | 13.425 | HUNTINGTON SAINTS | east |
| 18 | Jesse Matos | 0.439 | 1.134 | 0.574 | 0.366 | 14.889 | 9.696 | HUNTINGTON SAINTS | east |
| 14 | Jose Lebron | 0.419 | 1.168 | 0.573 | 0.903 | 15.754 | 8.707 | Four Seam Falcons | west |
In [ ]:
# Pat stats
pat_df = df.loc[df['Name'] == 'Pat DeAngelis']
pat_df
Out[ ]: These are my batting stats from last season. They were not very good; this performance prompted a visit to the eye doctor for an updated prescription.
| | # | Name | AVG | GP | GS | PA | AB | R | H | 2B | 3B | HR | RBI | BB | K | HBP | IBB | SB | CS | SH | SF | DP | ROE | FC | LOB | TB | OBP | SLG | OPS | Team | division | wOBA | wRAA | SecA | RC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 54.0 | Pat DeAngelis | 0.185 | 19.0 | 19.0 | 59.0 | 54.0 | 5.0 | 10.0 | 3.0 | 0.0 | 0.0 | 9.0 | 4.0 | 18.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 | 0.237 | 0.241 | 0.478 | Hurricanes | west | 0.258 | -4.32 | 0.13 | 3.455 |
In [ ]:
df.loc[df['Team'] == 'Hurricanes']
Out[ ]: Here are the stats for my entire team.
| | # | Name | AVG | GP | GS | PA | AB | R | H | 2B | 3B | HR | RBI | BB | K | HBP | IBB | SB | CS | SH | SF | DP | ROE | FC | LOB | TB | OBP | SLG | OPS | Team | division | wOBA | wRAA | SecA | RC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32.0 | Mike Aniano | 0.286 | 9.0 | 9.0 | 23.0 | 21.0 | 5.0 | 6.0 | 1.0 | 0.0 | 0.0 | 2.0 | 2.0 | 8.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7.0 | 0.348 | 0.333 | 0.681 | Hurricanes | west | 0.343 | -0.055 | 0.143 | 2.616 |
| 1 | 13.0 | Joe Bitetto | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 2 | 77.0 | Sebastian Buttafuoco | 0.200 | 4.0 | 4.0 | 6.0 | 5.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.333 | 0.200 | 0.533 | Hurricanes | west | 0.262 | -0.419 | 0.200 | 0.420 |
| 3 | 9.0 | Scott Clark | 0.296 | 10.0 | 9.0 | 27.0 | 27.0 | 5.0 | 8.0 | 3.0 | 0.0 | 0.0 | 1.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 11.0 | 0.296 | 0.407 | 0.704 | Hurricanes | west | 0.397 | 1.151 | 0.111 | 3.259 |
| 4 | 33.0 | Steve Corea | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 5 | 24.0 | Thad Cosentino | 0.194 | 17.0 | 17.0 | 45.0 | 31.0 | 10.0 | 6.0 | 1.0 | 0.0 | 0.0 | 3.0 | 10.0 | 10.0 | 3.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7.0 | 0.422 | 0.226 | 0.648 | Hurricanes | west | 0.347 | 0.043 | 0.387 | 4.822 |
| 6 | 0.0 | Niko Davanzo | 0.000 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.000 | 0.000 | 0.000 | Hurricanes | west | 0.000 | -0.288 | 0.000 | 0.000 |
| 7 | 54.0 | Pat DeAngelis | 0.185 | 19.0 | 19.0 | 59.0 | 54.0 | 5.0 | 10.0 | 3.0 | 0.0 | 0.0 | 9.0 | 4.0 | 18.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 | 0.237 | 0.241 | 0.478 | Hurricanes | west | 0.258 | -4.320 | 0.130 | 3.455 |
| 8 | 57.0 | Andrew Deckel | 0.040 | 12.0 | 12.0 | 27.0 | 25.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 10.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.111 | 0.040 | 0.151 | Hurricanes | west | 0.084 | -5.892 | 0.080 | 0.169 |
| 9 | 25.0 | Rob Deckel | 0.250 | 12.0 | 12.0 | 28.0 | 24.0 | 5.0 | 6.0 | 0.0 | 0.0 | 0.0 | 5.0 | 2.0 | 12.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6.0 | 0.296 | 0.250 | 0.546 | Hurricanes | west | 0.246 | -2.330 | 0.083 | 2.160 |
| 10 | 17.0 | Sam Dipetro | 0.234 | 21.0 | 21.0 | 56.0 | 47.0 | 5.0 | 11.0 | 4.0 | 0.0 | 0.0 | 3.0 | 7.0 | 16.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 15.0 | 0.357 | 0.319 | 0.676 | Hurricanes | west | 0.373 | 1.267 | 0.234 | 6.193 |
| 11 | 44.0 | Rocco DiPietro | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 12 | 5.0 | Kenny Grassano | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 13 | 27.0 | Mike Grassano | 0.476 | 13.0 | 13.0 | 45.0 | 42.0 | 6.0 | 20.0 | 5.0 | 0.0 | 2.0 | 18.0 | 3.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 31.0 | 0.511 | 0.738 | 1.249 | Hurricanes | west | 0.661 | 11.818 | 0.333 | 16.243 |
| 14 | 29.0 | Vinny Grassano | 0.559 | 11.0 | 11.0 | 38.0 | 34.0 | 6.0 | 19.0 | 11.0 | 0.0 | 0.0 | 17.0 | 3.0 | 2.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 30.0 | 0.579 | 0.882 | 1.461 | Hurricanes | west | 0.850 | 15.965 | 0.441 | 18.422 |
| 15 | 22.0 | James Hall | 0.308 | 21.0 | 19.0 | 63.0 | 52.0 | 9.0 | 16.0 | 0.0 | 2.0 | 1.0 | 13.0 | 10.0 | 9.0 | 1.0 | 0.0 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 23.0 | 0.429 | 0.442 | 0.871 | Hurricanes | west | 0.425 | 4.155 | 0.404 | 11.974 |
| 16 | 2.0 | Chris Kornfeld | 0.250 | 2.0 | 2.0 | 4.0 | 4.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.250 | 0.250 | 0.500 | Hurricanes | west | 0.219 | -0.423 | 0.000 | 0.250 |
| 17 | 46.0 | Vinny Mavaro | 0.345 | 12.0 | 11.0 | 30.0 | 29.0 | 3.0 | 10.0 | 0.0 | 0.0 | 0.0 | 2.0 | 1.0 | 7.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 | 0.367 | 0.345 | 0.711 | Hurricanes | west | 0.315 | -0.771 | 0.103 | 4.143 |
| 18 | 4.0 | Chris McGrane | 0.187 | 7.0 | 6.0 | 18.0 | 16.0 | 1.0 | 3.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | 8.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.278 | 0.250 | 0.528 | Hurricanes | west | 0.292 | -0.808 | 0.250 | 1.400 |
| 19 | 7.0 | Tom Moffatt | 0.250 | 4.0 | 4.0 | 10.0 | 8.0 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.400 | 0.250 | 0.650 | Hurricanes | west | 0.314 | -0.265 | 0.375 | 1.216 |
| 20 | 96.0 | Anthony Neal | 0.500 | 1.0 | 1.0 | 3.0 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.667 | 0.500 | 1.167 | Hurricanes | west | 0.523 | 0.443 | 0.500 | 0.840 |
| 21 | 12.0 | Bobby Negron | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 22 | 99.0 | Frankie Petrocelli | 0.300 | 6.0 | 6.0 | 15.0 | 10.0 | 4.0 | 3.0 | 0.0 | 0.0 | 0.0 | 3.0 | 4.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 0.533 | 0.300 | 0.833 | Hurricanes | west | 0.408 | 0.777 | 0.400 | 2.293 |
| 23 | 14.0 | Anthony Pranzo | 0.195 | 17.0 | 17.0 | 50.0 | 41.0 | 8.0 | 8.0 | 2.0 | 0.0 | 0.0 | 2.0 | 9.0 | 5.0 | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 | 0.340 | 0.244 | 0.584 | Hurricanes | west | 0.314 | -1.327 | 0.341 | 4.726 |
| 24 | 3.0 | Steve Quiroz | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 25 | 6.0 | Jordan Rorher | 0.219 | 17.0 | 17.0 | 45.0 | 32.0 | 5.0 | 7.0 | 2.0 | 0.0 | 0.0 | 1.0 | 11.0 | 7.0 | 2.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 9.0 | 0.444 | 0.281 | 0.726 | Hurricanes | west | 0.393 | 1.768 | 0.500 | 6.196 |
| 26 | 1.0 | Ben Sherman | 0.242 | 12.0 | 12.0 | 36.0 | 33.0 | 6.0 | 8.0 | 0.0 | 0.0 | 0.0 | 4.0 | 2.0 | 2.0 | 0.0 | 0.0 | 6.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8.0 | 0.278 | 0.242 | 0.520 | Hurricanes | west | 0.233 | -3.386 | 0.242 | 3.378 |
| 27 | 15.0 | Mike Silva Jr | 0.000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.000 | 0.000 | 0.000 | Hurricanes | west | NaN | NaN | NaN | NaN |
| 28 | 18.0 | Ben Theiss | 0.267 | 15.0 | 15.0 | 36.0 | 30.0 | 5.0 | 8.0 | 3.0 | 0.0 | 0.0 | 9.0 | 5.0 | 5.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 11.0 | 0.389 | 0.367 | 0.756 | Hurricanes | west | 0.414 | 2.044 | 0.300 | 5.087 |