import pandas as pd
import numpy as np
Exploring NFL Play-by-Play Data with SQL and Python
Background and Goals
It’s been 5 years since I last explored the NFL’s play-by-play data. It’s also been 5 years since my Eagles won the Super Bowl, and this year’s game will be played in less than 24 hours. Go Birds.
It’s been so long since I’ve blogged that fastpages, the blogging library I use, has been deprecated.
I have thoroughly enjoyed some of the statistical analyses put forth by fans of the NFL this year. My favorite analyst is Deniz Selman, a fellow Eagles fan who makes these beautiful data presentations.
I also appreciate Deniz’ critique of analysis-without-context that often negates the brilliance of Jalen Hurts:
As I’ve been trying to say all year, EPA/dropback is not nearly as valuable a metric when the offense lets the QB decide whether it’s a “dropback” or not during the play by reading the defense, and that QB is the absolute best at making that decision. #FlyEaglesFly
— Deniz Selman (@denizselman33) February 11, 2023
My second favorite analyst is Ben Baldwin, AKA Computer Cowboy, especially for his real-time 4th down analysis during games.
There has been an onslaught of statistical advances in the NFL since I last explored play-by-play data and I’m excited to learn as much as I can. In particular, I’d like to get the hang of the metrics EPA (Expected Points Added) and DVOA (Defense-adjusted Value Over Average), which may not necessarily intersect with my play-by-play analysis (I believe Football Outsiders is the proprietor of that formula).
I’d also like to use this project to practice more advanced SQL queries than I’m used to. Given the complexity of the play-by-play dataset (by team, down, field position, etc.) I’m hoping I can get those reps in.
Lastly, I’d like to explore data presentation with these statistics using R, Python, Adobe Illustrator and Photoshop. I’ve been inspired by simple, elegant graphics like those made by Peter Gorman in Barely Maps and bold, picturesque statistics posted by PFF on Twitter:
The most clutch pass rushers face off in the Super Bowl pic.twitter.com/o50lV9Bkgk
— PFF (@PFF) February 12, 2023
I’ll work on this project in this post throughout the year, and maybe beyond if it gives me enough material, or it’ll fork off into something entirely new.
Next, I’ll explore the schema of the play-by-play dataset.
Documenting the NFL Play-by-Play Dataset Fields
In this section, I describe the fields in the 2022 NFL Play-by-Play Dataset. Not all of the fields are intuitive or immediately useful, so not all 372 column descriptions will be listed.
# load the data
fpath = "../../../nfl_pbp_data/play_by_play_2022.csv"
pbp_2022 = pd.read_csv(fpath, low_memory=False)
pbp_2022.head()
| | play_id | game_id | old_game_id | home_team | away_team | season_type | week | posteam | posteam_type | defteam | ... | out_of_bounds | home_opening_kickoff | qb_epa | xyac_epa | xyac_mean_yardage | xyac_median_yardage | xyac_success | xyac_fd | xpass | pass_oe |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2022_01_BAL_NYJ | 2022091107 | NYJ | BAL | REG | 1 | NaN | NaN | NaN | ... | 0 | 1 | 0.000000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | 43 | 2022_01_BAL_NYJ | 2022091107 | NYJ | BAL | REG | 1 | NYJ | home | BAL | ... | 0 | 1 | -0.443521 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
2 | 68 | 2022_01_BAL_NYJ | 2022091107 | NYJ | BAL | REG | 1 | NYJ | home | BAL | ... | 0 | 1 | 1.468819 | NaN | NaN | NaN | NaN | NaN | 0.440373 | -44.037291 |
3 | 89 | 2022_01_BAL_NYJ | 2022091107 | NYJ | BAL | REG | 1 | NYJ | home | BAL | ... | 0 | 1 | -0.492192 | 0.727261 | 6.988125 | 6.0 | 0.60693 | 0.227598 | 0.389904 | 61.009598 |
4 | 115 | 2022_01_BAL_NYJ | 2022091107 | NYJ | BAL | REG | 1 | NYJ | home | BAL | ... | 0 | 1 | -0.325931 | NaN | NaN | NaN | NaN | NaN | 0.443575 | -44.357494 |
5 rows × 372 columns
The 2022 NFL Play-by-Play dataset has 50147 rows (plays) and 372 columns.
pbp_2022.shape
(50147, 372)
play_id
is an identifier for each play in each game. It is not a unique identifier on its own, since the same values recur across games. There are 4597 unique play_id
values in this dataset.
len(pbp_2022.play_id.unique())
4597
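Since play_id repeats across games, a play presumably needs game_id alongside it to be identified. Here is a minimal sketch of that check on a small made-up DataFrame (the toy rows are an assumption for illustration; only the column names come from the dataset), which would be worth running on the full pbp_2022 before relying on it:

```python
import pandas as pd

# Toy stand-in for pbp_2022: two games that reuse the same play_id values.
toy = pd.DataFrame({
    "game_id": ["2022_01_BAL_NYJ", "2022_01_BAL_NYJ", "2022_01_BUF_LA", "2022_01_BUF_LA"],
    "play_id": [1, 43, 1, 43],
})

# play_id on its own contains duplicates...
play_id_is_unique = toy["play_id"].is_unique

# ...while no (game_id, play_id) pair appears more than once.
pair_is_unique = not toy.duplicated(subset=["game_id", "play_id"]).any()

print(play_id_is_unique, pair_is_unique)
```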
game_id
is an identifier for each game in the dataset in the format of {year}_{week}_{away_team}_{home_team}
. There are 284 unique games in this dataset.
len(pbp_2022.game_id.unique()), pbp_2022.game_id[1]
(284, '2022_01_BAL_NYJ')
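The game_id format can be unpacked with a plain string split. This is a small sketch; parse_game_id is a hypothetical helper name, not something from the dataset or a library:

```python
def parse_game_id(game_id):
    """Split a game_id like '2022_01_BAL_NYJ' into its four parts."""
    year, week, away_team, home_team = game_id.split("_")
    return {
        "year": int(year),
        "week": int(week),
        "away_team": away_team,
        "home_team": home_team,
    }

parsed = parse_game_id("2022_01_BAL_NYJ")
print(parsed)
```

The away team comes first, which matches the first rows of the dataset where home_team is NYJ and away_team is BAL.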
There are 32 unique home_team
s and away_team
s.
len(pbp_2022.home_team.unique()), len(pbp_2022.away_team.unique())
(32, 32)
There are two season_type
values: 'REG'
for regular season and 'POST'
for postseason.
pbp_2022.season_type.unique()
array(['REG', 'POST'], dtype=object)
There are 22 week
values:
- 18 regular season weeks (17 games + 1 bye)
- 4 postseason weeks
  - Wild Card Weekend
  - Divisional Playoffs
  - Conference Championships
  - Super Bowl
pbp_2022.week.unique()
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22])
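To make the week numbers easier to read in later queries, they can be mapped to labels. This sketch assumes weeks 19 through 22 correspond to the four postseason rounds in the order listed above; week_label and POSTSEASON_ROUNDS are hypothetical names:

```python
# Assumption: weeks 19-22 are the postseason rounds, in chronological order.
POSTSEASON_ROUNDS = {
    19: "Wild Card Weekend",
    20: "Divisional Playoffs",
    21: "Conference Championships",
    22: "Super Bowl",
}

def week_label(week):
    """Return a readable label for a 2022 week number."""
    if week in range(1, 19):
        return f"Regular season week {week}"
    if week in POSTSEASON_ROUNDS:
        return POSTSEASON_ROUNDS[week]
    raise ValueError(f"unexpected week value: {week}")

labels = [week_label(w) for w in (1, 18, 19, 22)]
print(labels)
```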
I believe posteam
stands for the team that has possession of the ball. There are 32 unique teams that can have possession of the ball in a game, and in some cases the posteam
is nan
.
len(pbp_2022.posteam.unique()), pbp_2022.posteam.unique()
(33,
array([nan, 'NYJ', 'BAL', 'BUF', 'LA', 'CAR', 'CLE', 'SEA', 'DEN', 'MIN',
'GB', 'IND', 'HOU', 'JAX', 'WAS', 'KC', 'ARI', 'LAC', 'LV', 'NE',
'MIA', 'ATL', 'NO', 'NYG', 'TEN', 'DET', 'PHI', 'PIT', 'CIN',
'CHI', 'SF', 'DAL', 'TB'], dtype=object))
posteam_type
has values 'home'
, 'away'
and nan
.
len(pbp_2022.posteam_type.unique()), pbp_2022.posteam_type.unique()
(3, array([nan, 'home', 'away'], dtype=object))
defteam
lists any of the 32 teams on defense on a given play. It can also have the value nan
.
len(pbp_2022.defteam.unique()), pbp_2022.defteam.unique()
(33,
array([nan, 'BAL', 'NYJ', 'LA', 'BUF', 'CLE', 'CAR', 'DEN', 'SEA', 'GB',
'MIN', 'HOU', 'IND', 'WAS', 'JAX', 'ARI', 'KC', 'LV', 'LAC', 'MIA',
'NE', 'NO', 'ATL', 'TEN', 'NYG', 'PHI', 'DET', 'CIN', 'PIT', 'SF',
'CHI', 'TB', 'DAL'], dtype=object))
side_of_field
can be nan
, any of the 32 team abbreviations, or 50
(midfield).
len(pbp_2022.side_of_field.unique()), pbp_2022.side_of_field.unique()
(34,
array([nan, 'BAL', 'NYJ', 'LA', 'BUF', '50', 'CLE', 'CAR', 'DEN', 'SEA',
'GB', 'MIN', 'HOU', 'IND', 'WAS', 'JAX', 'ARI', 'KC', 'LV', 'LAC',
'MIA', 'NE', 'NO', 'ATL', 'TEN', 'NYG', 'PHI', 'DET', 'CIN', 'PIT',
'SF', 'CHI', 'TB', 'DAL'], dtype=object))
yardline_100
can be nan
or between 1
and 99
.
len(pbp_2022.yardline_100.unique()), np.nanmin(pbp_2022.yardline_100), np.nanmax(pbp_2022.yardline_100)
(100, 1.0, 99.0)
There are 61 game_date
values.
len(pbp_2022.game_date.unique()), pbp_2022.game_date[0]
(61, '2022-09-11')
quarter_seconds_remaining
is between 0
and 900
(15 minutes).
pbp_2022.quarter_seconds_remaining.min(), pbp_2022.quarter_seconds_remaining.max()
(0, 900)
half_seconds_remaining
is between 0
and 1800
(30 minutes).
pbp_2022.half_seconds_remaining.min(), pbp_2022.half_seconds_remaining.max()
(0, 1800)
game_seconds_remaining
is between 0
and 3600
(60 minutes).
pbp_2022.game_seconds_remaining.min(), pbp_2022.game_seconds_remaining.max()
(0, 3600)
game_half
is either Half1
(first half), Half2
(second half), or Overtime
.
pbp_2022.game_half.unique()
array(['Half1', 'Half2', 'Overtime'], dtype=object)
quarter_end
is either 1
(True) or 0
(False).
pbp_2022.quarter_end.unique(), pbp_2022.query('quarter_end == 1').desc[41]
(array([0, 1]), 'END QUARTER 1')
drive
is the running count of drives in the game so far (counting both teams). The column also contains nan
values.
pbp_2022.drive.unique()
array([nan, 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.,
13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
26., 27., 28., 29., 30., 31., 32., 33., 34., 35.])
sp
most likely stands for “scoring play” rather than Special Teams: the nflfastR field descriptions define it as a binary indicator for whether a score occurred on the play, either 1
(True) or 0
(False).
pbp_2022.sp.unique(), pbp_2022.query('sp == 1').desc[32]
(array([0, 1]),
'(3:19) 9-J.Tucker 24 yard field goal is GOOD, Center-46-N.Moore, Holder-11-J.Stout.')
qtr
indicates the current quarter of the play. qtr == 5
represents Overtime.
pbp_2022.qtr.unique()
array([1, 2, 3, 4, 5])
down
represents the current down of the play (nan
, 1st, 2nd, 3rd or 4th).
pbp_2022.down.unique()
array([nan, 1., 2., 3., 4.])
goal_to_go
indicates whether this play is 1st & Goal, 2nd & Goal, 3rd & Goal or 4th & Goal, either 1
(True) or 0
(False).
pbp_2022.goal_to_go.unique()
array([0, 1])
time
is the minutes:seconds
formatted time left in the current quarter.
pbp_2022.head().time.unique()
array(['15:00', '14:56', '14:29', '14:25'], dtype=object)
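For arithmetic on the clock, the minutes:seconds string can be converted to seconds. A minimal sketch follows; clock_to_seconds is a hypothetical helper, and I would expect its result to line up with quarter_seconds_remaining, though I have not verified that on the full data:

```python
def clock_to_seconds(clock):
    """Convert a 'minutes:seconds' string such as '14:56' to total seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)

remaining = [clock_to_seconds(t) for t in ["15:00", "14:56", "14:29", "14:25"]]
print(remaining)  # [900, 896, 869, 865]
```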
yrdln
is a formatted string of team abbreviation and yard number.
pbp_2022.yrdln.unique()
array(['BAL 35', 'NYJ 22', 'NYJ 41', ..., 'NYJ 3', 'CIN 6', 'MIN 12'],
dtype=object)
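The yrdln string can be split back into its team and yard number. This is a sketch only: parse_yrdln is a hypothetical helper, and the handling of a bare number for midfield is an assumption based on side_of_field containing '50', not something confirmed against the yrdln values:

```python
def parse_yrdln(yrdln):
    """Split a yrdln string such as 'BAL 35' into (team, yard)."""
    parts = yrdln.split()
    if len(parts) == 1:
        # Assumption: midfield may appear as a bare number with no team.
        return None, int(parts[0])
    team, yard = parts
    return team, int(yard)

examples = [parse_yrdln(s) for s in ["BAL 35", "NYJ 3", "50"]]
print(examples)
```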
ydstogo
is the number of yards before the next first down.
pbp_2022.ydstogo.unique()
array([ 0, 10, 5, 15, 6, 2, 1, 12, 9, 19, 11, 3, 8, 4, 16, 17, 7,
20, 14, 18, 13, 22, 26, 24, 21, 25, 23, 28, 30, 27, 31, 38, 36, 29,
34, 35, 32, 33])
ydsnet
is the net yards (yards gained - yards lost) of the current drive.
pbp_2022.ydsnet.unique()
array([ nan, 14., 21., 7., 1., 15., 9., 16., 44., 18., 62.,
48., 3., 11., 4., 88., 75., 23., 43., -2., 38., 0.,
45., 60., 13., 6., -1., 58., 25., 89., 59., 19., 66.,
29., -4., 24., 2., 12., 42., 78., 52., 57., 64., 35.,
-3., 70., 77., 72., 50., 37., 31., -6., 32., -5., 20.,
79., 74., 34., 65., 8., 47., 5., 69., 53., 33., 76.,
80., -16., 71., 68., 55., 27., 90., 86., 17., 30., 67.,
63., 73., 61., -13., 92., 40., 22., -7., 39., 41., 28.,
82., 49., 10., 36., 46., 84., 54., -23., -11., 83., 26.,
94., 87., -10., 85., 51., -14., 56., -8., 81., -9., 93.,
-12., -15., -17., 91., 99., 98., -19., 96., 95., 97., -20.,
-25.])
desc
is a narrative description of the current play.
pbp_2022.head().desc[1]
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).'
play_type
is either nan
or one of 9 different play types, including no_play
.
len(pbp_2022.play_type.unique()), pbp_2022.play_type.unique()
(10,
array([nan, 'kickoff', 'run', 'pass', 'punt', 'no_play', 'field_goal',
'extra_point', 'qb_kneel', 'qb_spike'], dtype=object))
yards_gained
is the number of yards gained (positive) or lost (negative) on the current play. It does not capture yards gained or lost due to a penalty.
pbp_2022.head().yards_gained, pbp_2022.yards_gained.min()
(0 NaN
1 0.0
2 19.0
3 0.0
4 5.0
Name: yards_gained, dtype: float64,
-26.0)
shotgun
indicates whether the quarterback was in shotgun position, either 1
(True) or 0
(False).
pbp_2022.shotgun.unique(), pbp_2022.query('shotgun == 1').desc[3]
(array([0, 1]),
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
no_huddle
indicates whether the team ran the play without huddling before the snap, either 1
(True) or 0
(False).
pbp_2022.no_huddle.unique(), pbp_2022.query('no_huddle == 1').desc[3]
(array([0, 1]),
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
qb_dropback
indicates whether the quarterback drops back on the play, either 1
(True), 0
(False) or nan
.
pbp_2022.qb_dropback.unique(), pbp_2022.query('qb_dropback == 1').desc[3]
(array([nan, 0., 1.]),
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
qb_kneel
indicates whether the quarterback kneels on the play, either 1
(True) or 0
(False).
pbp_2022.qb_kneel.unique(), pbp_2022.query('qb_kneel == 1').desc[176]
(array([0, 1]), '(:59) 8-L.Jackson kneels to NYJ 43 for -1 yards.')
qb_spike
indicates whether the quarterback spikes the ball on the play, either 1
(True) or 0
(False).
pbp_2022.qb_spike.unique(), pbp_2022.query('qb_spike == 1').desc[520]
(array([0, 1]),
'(:29) (No Huddle) 7-J.Brissett spiked the ball to stop the clock.')
qb_scramble
indicates whether the quarterback scrambles on the play, either 1
(True) or 0
(False). It looks like a scramble is not the same as a designed quarterback run, so I’ll dig deeper into this before using this field in analyses.
pbp_2022.qb_scramble.unique()
array([0, 1])
pass_length
is either nan
, 'short'
or 'deep'
. I’ll first understand what distance (in yards) corresponds to these designations before I use this field in analyses.
pbp_2022.pass_length.unique()
array([nan, 'short', 'deep'], dtype=object)
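One way to find the distance behind each designation is to look at the range of air_yards within each pass_length label. The sketch below uses made-up rows, so the numbers say nothing about the real cutoff; the same groupby would need to be run on pbp_2022:

```python
import pandas as pd

# Made-up rows for illustration only; these are not values from the dataset.
toy = pd.DataFrame({
    "pass_length": ["short", "short", "deep", "deep", None],
    "air_yards": [-2.0, 9.0, 16.0, 31.0, None],
})

# Min and max air_yards per label; rows with a missing label are dropped by groupby.
ranges = toy.groupby("pass_length")["air_yards"].agg(["min", "max"])
print(ranges)
```

If the max for 'short' sits just below the min for 'deep' on the real data, that gap would suggest where the boundary is.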
pass_location
is either nan
, 'left'
, 'right'
, or 'middle'
.
pbp_2022.pass_location.unique()
array([nan, 'left', 'right', 'middle'], dtype=object)
air_yards
is the number of yards a quarterback’s pass traveled in the air. It can be positive, zero or negative.
pbp_2022.air_yards.unique()
array([ nan, 0., -4., 3., 2., 16., 11., 5., 21., 14., -1.,
1., 7., 6., 15., -3., 8., 10., 50., 27., 25., -5.,
31., -6., 17., 51., 13., 4., 12., 36., 9., 32., 18.,
22., -2., 23., 45., 40., 52., -7., 26., 29., 20., 47.,
24., 30., 28., 37., 39., -8., 19., 41., 38., -12., 42.,
-10., 46., 35., 33., -9., 34., 44., 43., 53., 57., 48.,
49., 54., 58., 56., 59., 55., 61., -18., -54., -13., 62.,
65., -20., -16.])
yards_after_catch
is the number of yards the receiver gains or loses after catching the ball.
pbp_2022.yards_after_catch.unique()
array([ nan, 8., 1., 6., 0., 3., 5., 4., 12., 9., 10.,
-4., 18., 7., 15., 2., 11., 13., -1., 29., 30., 27.,
28., 16., 26., 24., 25., -5., 41., 14., 22., 19., 17.,
21., 32., 20., -2., 35., -3., 51., 66., 38., 46., 23.,
31., 37., 68., -6., 33., 52., 75., 34., 71., 44., 61.,
60., 58., 48., 50., 53., 39., 62., 47., -7., 42., 40.,
36., 49., 70., 45., 65., 43., 74., -10., -9.])
run_location
is either nan
, 'left'
, 'right'
, or 'middle'
.
pbp_2022.run_location.unique()
array([nan, 'left', 'right', 'middle'], dtype=object)
run_gap
represents which offensive line gap the runner ran through. It is either nan
, 'end'
, 'tackle'
or 'guard'
. I’ll have to dig a bit deeper (look at some video corresponding to the run plays) to understand if 'guard'
represents the A (gap between center and guard) or B gap (gap between guard and tackle), if 'tackle'
represents the B or C gap (gap between tackle and end), and if 'end'
represents the C or D (gap outside the end) gap.
pbp_2022.run_gap.unique()
array([nan, 'end', 'tackle', 'guard'], dtype=object)
field_goal_result
is either nan
, 'made'
, 'missed'
, or 'blocked'
.
pbp_2022.field_goal_result.unique()
array([nan, 'made', 'missed', 'blocked'], dtype=object)
kick_distance
is the distance of the kick in yards for the following play_type
values: 'punt'
, 'field_goal'
, 'extra_point'
, and 'kickoff'
. Looking through the data, not all 'kickoff'
s have a kick_distance
value.
pbp_2022.kick_distance.unique(), pbp_2022.query('kick_distance.notnull()').play_type.unique()
(array([nan, 45., 40., 48., 24., 50., 56., 41., 33., 20., 49., 43., 7.,
36., 57., 25., 39., 60., 62., 61., 44., 46., 58., 26., 34., 64.,
30., 47., 54., 28., 53., 38., 29., 70., 37., 27., 52., 42., 63.,
51., 23., 55., 59., 69., 66., 14., 32., 35., 0., 31., 67., 74.,
19., 10., 22., 12., 8., 5., -1., 73., 65., 3., 21., 9., 16.,
15., 13., 18., 17., 6., 77., 68., 11., 71., 79.]),
array(['punt', 'field_goal', 'extra_point', 'kickoff'], dtype=object))
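To quantify how often kick_distance is missing for each kicking play type, the share of null values can be grouped by play_type. This is a sketch on made-up rows (the toy values are assumptions, not dataset results):

```python
import pandas as pd

# Made-up rows for illustration; one kickoff is missing its kick_distance.
toy = pd.DataFrame({
    "play_type": ["kickoff", "kickoff", "punt", "field_goal", "run"],
    "kick_distance": [None, 65.0, 45.0, 24.0, None],
})

# Keep only the play types that are expected to carry a kick_distance.
kicks = toy[toy["play_type"].isin(["punt", "field_goal", "extra_point", "kickoff"])]

# Fraction of rows with a missing kick_distance, per play type.
missing_share = kicks["kick_distance"].isna().groupby(kicks["play_type"]).mean()
print(missing_share)
```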
extra_point_result
is either nan
, 'good'
, 'failed'
or 'blocked'
.
pbp_2022.extra_point_result.unique()
array([nan, 'good', 'failed', 'blocked'], dtype=object)
two_point_conv_result
, the result of a two-point conversion, is either nan
, 'failure'
or 'success'
.
pbp_2022.two_point_conv_result.unique()
array([nan, 'failure', 'success'], dtype=object)
home_timeouts_remaining
is the number of timeouts the home team has left. It is either 3
, 2
, 1
, or 0
.
pbp_2022.home_timeouts_remaining.unique()
array([3, 2, 1, 0])
away_timeouts_remaining
is the number of timeouts the away team has left. It is either 3
, 2
, 1
, or 0
.
pbp_2022.away_timeouts_remaining.unique()
array([3, 2, 1, 0])
timeout
indicates if a team calls a timeout. It is either nan
, 1
(True) or 0
(False).
pbp_2022.timeout.unique(), pbp_2022.query('timeout == 1').desc[13]
(array([nan, 0., 1.]), 'Timeout #1 by BAL at 09:56.')
timeout_team
indicates which team called the timeout, and has 33 unique values—1 nan
and 32 team abbreviations.
(pbp_2022.timeout_team.unique(),
pbp_2022.query('timeout == 1').desc[13],
pbp_2022.query('timeout == 1').timeout_team[13])
(array([nan, 'BAL', 'NYJ', 'LA', 'BUF', 'CLE', 'CAR', 'DEN', 'SEA', 'GB',
'MIN', 'IND', 'HOU', 'WAS', 'JAX', 'KC', 'ARI', 'LAC', 'LV', 'NE',
'MIA', 'ATL', 'NO', 'NYG', 'TEN', 'PHI', 'DET', 'PIT', 'CIN', 'SF',
'CHI', 'DAL', 'TB'], dtype=object),
'Timeout #1 by BAL at 09:56.',
'BAL')
td_team
indicates which team scored the touchdown. It is nan
or one of 32 team abbreviations.
(pbp_2022.td_team.unique(),
pbp_2022.query('td_team.notnull()').td_team[68],
pbp_2022.query('td_team.notnull()').desc[68])
(array([nan, 'BAL', 'NYJ', 'BUF', 'LA', 'CLE', 'CAR', 'SEA', 'DEN', 'MIN',
'GB', 'HOU', 'IND', 'WAS', 'JAX', 'KC', 'ARI', 'LAC', 'LV', 'MIA',
'NE', 'NO', 'ATL', 'TEN', 'NYG', 'DET', 'PHI', 'PIT', 'CIN', 'SF',
'CHI', 'TB', 'DAL'], dtype=object),
'BAL',
'(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
td_player_name
indicates which player scored the touchdown. It is nan
or one of 416 players who scored a touchdown in the 2022 season.
(pbp_2022.td_player_name.unique()[:5],
len(pbp_2022.td_player_name.unique()),
pbp_2022.query('td_team.notnull()').td_player_name[68],
pbp_2022.query('td_team.notnull()').desc[68])
(array([nan, 'D.Duvernay', 'R.Bateman', 'T.Conklin', 'G.Davis'],
dtype=object),
417,
'D.Duvernay',
'(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
td_player_id
indicates the id
of the player who scored the touchdown. There are 422 unique player IDs, 6 more than the 416 unique player names. Later on, I’ll look into why the two counts differ.
(pbp_2022.td_player_id.unique()[:5],
len(pbp_2022.td_player_id.unique()),
pbp_2022.query('td_team.notnull()').td_player_name[68],
pbp_2022.query('td_team.notnull()').td_player_id[68],
pbp_2022.query('td_team.notnull()').desc[68])
(array([nan, '00-0036331', '00-0036550', '00-0034270', '00-0036196'],
dtype=object),
423,
'D.Duvernay',
'00-0036331',
'(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
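One possible explanation for the mismatch between the name and ID counts (an assumption I have not confirmed) is that an abbreviated name is shared by more than one player. A minimal sketch of how to look for that, using made-up rows where 'A.Example' and its two IDs are hypothetical:

```python
import pandas as pd

# Made-up rows; 'A.Example' and its IDs are hypothetical, not from the dataset.
toy = pd.DataFrame({
    "td_player_name": ["D.Duvernay", "A.Example", "A.Example", None],
    "td_player_id": ["00-0036331", "00-0000001", "00-0000002", None],
})

# Count distinct IDs behind each abbreviated name, then keep names with more than one.
ids_per_name = toy.groupby("td_player_name")["td_player_id"].nunique()
shared_names = ids_per_name[ids_per_name > 1]
print(shared_names)
```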
posteam_timeouts_remaining
is the number of timeouts remaining for the team with ball possession. It can be nan
, 3
, 2
, 1
, or 0
.
pbp_2022.posteam_timeouts_remaining.unique()
array([nan, 3., 2., 0., 1.])
defteam_timeouts_remaining
is the number of timeouts remaining for the team on defense. It can be nan
, 3
, 2
, 1
, or 0
.
pbp_2022.defteam_timeouts_remaining.unique()
array([nan, 3., 2., 1., 0.])
total_home_score
is the total number of points scored by the home team.
pbp_2022.total_home_score.unique()[:5]
array([0, 3, 9, 6, 7])
total_away_score
is the total number of points scored by the away team.
pbp_2022.total_away_score.unique()[:5]
array([ 0, 3, 9, 10, 16])
posteam_score
is the total number of points scored by the team with ball possession on the current play.
pbp_2022.posteam_score.unique()[:5]
array([nan, 0., 3., 9., 10.])
defteam_score
is the total number of points scored by the team on defense on the current play.
pbp_2022.defteam_score.unique()[:5]
array([nan, 0., 3., 10., 17.])
score_differential
is the difference between posteam_score
and defteam_score
.
pbp_2022.score_differential.unique()[:5]
array([nan, 0., -3., 3., 9.])
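The definition of score_differential can be checked directly against the two score columns. This is a sketch on made-up rows; on pbp_2022 the same comparison would confirm or refute the definition above:

```python
import pandas as pd

# Made-up rows for illustration, including one all-NaN row like the dataset has.
toy = pd.DataFrame({
    "posteam_score": [0.0, 3.0, 10.0, None],
    "defteam_score": [0.0, 0.0, 17.0, None],
    "score_differential": [0.0, 3.0, -7.0, None],
})

# Drop rows where any of the three values is missing before comparing.
scored = toy.dropna(subset=["posteam_score", "defteam_score", "score_differential"])
difference = scored["posteam_score"] - scored["defteam_score"]
matches = bool((difference == scored["score_differential"]).all())
print(matches)
```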
punt_blocked
indicates if the punt was blocked. It is either nan
, 1
(True) or 0
(False).
pbp_2022.punt_blocked.unique(), pbp_2022.query('punt_blocked == 1').desc[3236]
(array([nan, 0., 1.]),
'(5:06) 11-R.Dixon punt is BLOCKED by 44-T.Andersen, Center-42-M.Orzech, RECOVERED by ATL-9-L.Carter at LA 26. 9-L.Carter for 26 yards, TOUCHDOWN.')
first_down_rush
indicates whether a first down was achieved by a rushing play. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.first_down_rush.unique(),
pbp_2022.query('first_down_rush == 1').desc[2],
pbp_2022.query('first_down_rush == 1').play_type[2])
(array([nan, 0., 1.]),
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).',
'run',
nan)
first_down_pass
indicates whether a first down was achieved by a passing play. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.first_down_pass.unique(),
pbp_2022.query('first_down_pass == 1').desc[26],
pbp_2022.query('first_down_pass == 1').play_type[26])
(array([nan, 0., 1.]),
'(6:01) 19-J.Flacco pass deep left to 8-E.Moore to NYJ 41 for 24 yards (32-M.Williams).',
'pass')
first_down_penalty
indicates whether a first down was achieved by a penalty. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.first_down_penalty.unique(),
pbp_2022.query('first_down_penalty == 1').desc[17],
pbp_2022.query('first_down_penalty == 1').play_type[17])
(array([nan, 0., 1.]),
'(8:31) (Shotgun) 19-J.Flacco pass incomplete deep left to 8-E.Moore. PENALTY on BAL-44-M.Humphrey, Illegal Contact, 5 yards, enforced at NYJ 12 - No Play.',
'no_play')
third_down_converted
indicates if the team with ball possession on third down got a first down on the play. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.third_down_converted.unique(),
pbp_2022.query('third_down_converted == 1').down[9],
pbp_2022.query('third_down_converted == 1').ydstogo[9],
pbp_2022.query('third_down_converted == 1').desc[9],
pbp_2022.query('third_down_converted == 1').yards_gained[9])
(array([nan, 0., 1.]),
3.0,
2,
'(12:41) (Shotgun) 8-L.Jackson right tackle to BAL 40 for 4 yards (57-C.Mosley, 3-J.Whitehead).',
4.0)
third_down_failed
indicates if the team with ball possession on third down did not get a first down on the play. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.third_down_failed.unique(),
pbp_2022.query('third_down_failed == 1').down[5],
pbp_2022.query('third_down_failed == 1').ydstogo[5],
pbp_2022.query('third_down_failed == 1').desc[5],
pbp_2022.query('third_down_failed == 1').yards_gained[5])
(array([nan, 0., 1.]),
3.0,
5,
'(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.',
0.0)
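With third_down_converted and third_down_failed, a third-down conversion rate per team falls out of a groupby. This is a minimal sketch on made-up rows; it treats every play flagged by either field as an attempt, which is my assumption, and plays with penalties (like the intentional grounding example above) may deserve separate handling:

```python
import pandas as pd

# Made-up third-down outcomes for illustration; not results from the dataset.
toy = pd.DataFrame({
    "posteam": ["NYJ", "NYJ", "BAL", "BAL", "BAL"],
    "third_down_converted": [0.0, 1.0, 1.0, 1.0, 0.0],
    "third_down_failed": [1.0, 0.0, 0.0, 0.0, 1.0],
})

# Assumption: an attempt is any play flagged as either converted or failed.
attempts = toy[(toy["third_down_converted"] == 1) | (toy["third_down_failed"] == 1)]

# The mean of a 0/1 flag is the conversion rate.
rate = attempts.groupby("posteam")["third_down_converted"].mean()
print(rate)
```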
fourth_down_converted
indicates if the team with ball possession on fourth down got a first down on the play. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.fourth_down_converted.unique(),
pbp_2022.query('fourth_down_converted == 1').down[145],
pbp_2022.query('fourth_down_converted == 1').ydstogo[145],
pbp_2022.query('fourth_down_converted == 1').desc[145],
pbp_2022.query('fourth_down_converted == 1').yards_gained[145])
(array([nan, 0., 1.]),
4.0,
1,
'(7:32) 19-J.Flacco pass short right to 84-C.Davis to BAL 21 for 7 yards (23-K.Fuller).',
7.0)
fourth_down_failed
indicates if the team with ball possession on fourth down did not get a first down on the play. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.fourth_down_failed.unique(),
pbp_2022.query('fourth_down_failed == 1').down[154],
pbp_2022.query('fourth_down_failed == 1').ydstogo[154],
pbp_2022.query('fourth_down_failed == 1').desc[154],
pbp_2022.query('fourth_down_failed == 1').yards_gained[154])
(array([nan, 0., 1.]),
4.0,
6,
'(4:22) (Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.',
0.0)
incomplete_pass
indicates if the pass was incomplete. It is either nan
, 1
(True) or 0
(False).
(pbp_2022.incomplete_pass.unique(),
pbp_2022.query('incomplete_pass == 1').desc[3])
(array([nan, 0., 1.]),
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
touchback
indicates if the kickoff or punt either went past the back of the end zone or was fair-caught in the end zone. It is either 1 (True) or 0 (False).
(pbp_2022.touchback.unique(),
pbp_2022.query('touchback == 1').desc[33])
(array([0, 1]),
'9-J.Tucker kicks 65 yards from BAL 35 to end zone, Touchback.')
interception
indicates if the quarterback’s pass was intercepted by a defender. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.interception.unique(),
pbp_2022.query('interception == 1').desc[28])
(array([nan, 0., 1.]),
'(5:07) (Shotgun) 19-J.Flacco pass short middle intended for 81-L.Cager INTERCEPTED by 32-M.Williams at NYJ 46. 32-M.Williams to NYJ 13 for 33 yards (19-J.Flacco).')
fumble_forced
indicates if a fumble was forced on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.fumble_forced.unique(),
pbp_2022.query('fumble_forced == 1').desc[80])
(array([nan, 0., 1.]),
'(1:16) (Shotgun) 19-J.Flacco pass short right to 83-T.Conklin to BAL 21 for 6 yards (32-M.Williams, 58-M.Pierce). FUMBLES (58-M.Pierce), touched at BAL 25, recovered by NYJ-17-G.Wilson at BAL 27. 17-G.Wilson to BAL 27 for no gain (14-K.Hamilton).')
fumble_not_forced
indicates if a fumble occurred on the play but was not forced by another player. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.fumble_not_forced.unique(),
pbp_2022.query('fumble_not_forced == 1').desc[264])
(array([nan, 0., 1.]),
'(13:46) (Shotgun) 9-M.Stafford to LA 11 for -6 yards. FUMBLES, and recovers at LA 11. 9-M.Stafford sacked at LA 10 for -7 yards (50-G.Rousseau).')
fumble_out_of_bounds
indicates if a fumbled ball went out of bounds. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.fumble_out_of_bounds.unique(),
pbp_2022.query('fumble_out_of_bounds == 1').desc[1160])
(array([nan, 0., 1.]),
'(:32) (Shotgun) 16-T.Lawrence pass short right to 1-T.Etienne to WAS 11 for 3 yards (22-D.Forrest). FUMBLES (22-D.Forrest), ball out of bounds at WAS 19. The Replay Official reviewed the pass completion ruling, and the play was Upheld. The ruling on the field stands.')
solo_tackle
indicates if a player made a solo tackle on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.solo_tackle.unique(),
pbp_2022.query('solo_tackle == 1').desc[1])
(array([nan, 1., 0.]),
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
safety
indicates if a defensive player scored a safety on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.safety.unique(),
pbp_2022.query('safety == 1').desc[3255])
(array([nan, 0., 1.]),
'(:13) (Run formation) 19-B.Powell right end ran ob in End Zone for -26 yards, SAFETY (37-D.Alford).')
penalty
indicates if there was a penalty on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.penalty.unique(),
pbp_2022.query('penalty == 1').desc[5])
(array([nan, 0., 1.]),
'(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')
tackled_for_loss
indicates if a player was tackled for a loss of yards. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.tackled_for_loss.unique(),
pbp_2022.query('tackled_for_loss == 1').desc[15])
(array([nan, 0., 1.]),
'(9:49) 20-Br.Hall right end to NYJ 9 for -2 yards (92-J.Madubuike).')
fumble_lost
indicates if a player lost a fumble to the other team. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.fumble_lost.unique(),
pbp_2022.query('fumble_lost == 1').desc[129])
(array([nan, 0., 1.]),
'(14:13) (No Huddle, Shotgun) 19-J.Flacco pass short middle to 20-Br.Hall to BAL 16 for 6 yards (36-C.Clark). FUMBLES (36-C.Clark), RECOVERED by BAL-44-M.Humphrey at BAL 15.')
qb_hit
indicates if the quarterback was hit on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.qb_hit.unique(),
pbp_2022.query('qb_hit == 1').desc[5])
(array([nan, 0., 1.]),
'(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')
rush_attempt
indicates if the play was a rushing play. It is either nan
, 1
(True), or 0
(False). A QB scramble is considered a rush attempt.
(pbp_2022.rush_attempt.unique(),
pbp_2022.query('rush_attempt == 1').desc[9],
pbp_2022.query('rush_attempt == 1 and qb_scramble == 1').desc[89])
(array([nan, 0., 1.]),
'(12:41) (Shotgun) 8-L.Jackson right tackle to BAL 40 for 4 yards (57-C.Mosley, 3-J.Whitehead).',
'(14:15) (Shotgun) 8-L.Jackson scrambles left end ran ob at BAL 35 for 8 yards (3-J.Whitehead).')
pass_attempt
indicates if the play was a passing play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.pass_attempt.unique(),
pbp_2022.query('pass_attempt == 1').desc[3])
(array([nan, 0., 1.]),
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
sack
indicates if the quarterback was sacked on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.sack.unique(),
pbp_2022.query('sack == 1').desc[54])
(array([nan, 0., 1.]),
'(9:43) (Shotgun) 8-L.Jackson sacked ob at NYJ 49 for 0 yards (56-Qu.Williams).')
touchdown
indicates if a player scored a touchdown on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.touchdown.unique(),
pbp_2022.query('touchdown == 1').desc[68])
(array([nan, 0., 1.]),
'(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
pass_touchdown
, rush_touchdown
, and return_touchdown
indicate if the touchdown was a result of a pass, rush or kickoff/punt/fumble/interception return play, respectively. Their value is either nan
, 1
(True), or 0
(False).
(pbp_2022.pass_touchdown.unique(),
pbp_2022.query('pass_touchdown == 1').desc[68])
(array([nan, 0., 1.]),
'(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
(pbp_2022.rush_touchdown.unique(),
pbp_2022.query('rush_touchdown == 1').desc[298])
(array([nan, 0., 1.]),
'(13:34) (Shotgun) 17-J.Allen scrambles right end for 4 yards, TOUCHDOWN.')
(pbp_2022.return_touchdown.unique(),
pbp_2022.query('return_touchdown == 1').desc[1651],
pbp_2022.query('return_touchdown == 1').desc[2197],
pbp_2022.query('return_touchdown == 1').desc[47094])
(array([nan, 0., 1.]),
'(7:40) (Shotgun) 10-M.Jones sacked at NE 6 for -9 yards (29-Br.Jones). FUMBLES (29-Br.Jones) [29-Br.Jones], RECOVERED by MIA-6-M.Ingram at NE 2. 6-M.Ingram for 2 yards, TOUCHDOWN.',
'(6:36) (Shotgun) 16-J.Goff pass short left intended for 88-T.Hockenson INTERCEPTED by 24-J.Bradberry (43-K.White) [95-M.Tuipulotu] at DET 27. 24-J.Bradberry for 27 yards, TOUCHDOWN.',
'6-N.Folk kicks 66 yards from NE 35 to BUF -1. 20-N.Hines for 101 yards, TOUCHDOWN.')
The following fields indicate if the play involved an attempt at an Extra Point, Two Point Conversion, Field Goal, Kickoff, or Punt, respectively:
- extra_point_attempt
- two_point_attempt
- field_goal_attempt
- kickoff_attempt
- punt_attempt
Their value is either nan
, 1
(True), or 0
(False).
(pbp_2022.extra_point_attempt.unique(),
pbp_2022.query('extra_point_attempt == 1').desc[69])
(array([nan, 0., 1.]),
'9-J.Tucker extra point is GOOD, Center-46-N.Moore, Holder-11-J.Stout.')
(pbp_2022.two_point_attempt.unique(),
pbp_2022.query('two_point_attempt == 1').desc[1179])
(array([nan, 0., 1.]),
'TWO-POINT CONVERSION ATTEMPT. 16-T.Lawrence pass to 17-E.Engram is incomplete. ATTEMPT FAILS.')
(pbp_2022.field_goal_attempt.unique(),
pbp_2022.query('field_goal_attempt == 1').desc[32])
(array([nan, 0., 1.]),
'(3:19) 9-J.Tucker 24 yard field goal is GOOD, Center-46-N.Moore, Holder-11-J.Stout.')
(pbp_2022.kickoff_attempt.unique(),
pbp_2022.query('kickoff_attempt == 1').desc[1])
(array([nan, 1., 0.]),
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
(pbp_2022.punt_attempt.unique(),
pbp_2022.query('punt_attempt == 1').desc[6])
(array([nan, 0., 1.]),
'(13:53) 7-B.Mann punts 45 yards to BAL 19, Center-42-T.Hennessy. 13-D.Duvernay pushed ob at BAL 28 for 9 yards (42-T.Hennessy).')
fumble
indicates if a player fumbled the ball on the play. It is either nan
, 1
(True), or 0
(False).
(pbp_2022.fumble.unique(),
pbp_2022.query('fumble == 1').desc[80])
(array([nan, 0., 1.]),
'(1:16) (Shotgun) 19-J.Flacco pass short right to 83-T.Conklin to BAL 21 for 6 yards (32-M.Williams, 58-M.Pierce). FUMBLES (58-M.Pierce), touched at BAL 25, recovered by NYJ-17-G.Wilson at BAL 27. 17-G.Wilson to BAL 27 for no gain (14-K.Hamilton).')
complete_pass indicates if a player completed a pass on the play. It is either nan, 1 (True), or 0 (False).
(pbp_2022.complete_pass.unique(),
pbp_2022.query('complete_pass == 1').desc[7])
(array([nan, 0., 1.]),
'(13:42) 8-L.Jackson pass short right to 7-R.Bateman pushed ob at BAL 32 for 4 yards (3-J.Whitehead).')
assist_tackle indicates if a player assisted on the tackle on the play. It is either nan, 1 (True), or 0 (False).
(pbp_2022.assist_tackle.unique(),
pbp_2022.query('assist_tackle == 1').desc[2])
(array([nan, 0., 1.]),
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
The following fields provide the player_id (string), player_name (string) and yards gained (integer) for the passer, receiver or rusher on the play, respectively.
- passer_player_id
- passer_player_name
- passing_yards
- receiver_player_id
- receiver_player_name
- receiving_yards
- rusher_player_id
- rusher_player_name
- rushing_yards
(pbp_2022.passer_player_id[3],
pbp_2022.passer_player_name[3],
pbp_2022.passing_yards[3],
pbp_2022.desc[3])
('00-0026158',
'J.Flacco',
nan,
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
(pbp_2022.receiver_player_id[3],
pbp_2022.receiver_player_name[3],
pbp_2022.receiving_yards[3],
pbp_2022.desc[3])
('00-0036924',
'Mi.Carter',
nan,
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
(pbp_2022.rusher_player_id[2],
pbp_2022.rusher_player_name[2],
pbp_2022.rushing_yards[2],
pbp_2022.desc[2])
('00-0036924',
'Mi.Carter',
19.0,
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
The following fields provide the player_id (string) and player_name (string) for players who intercepted the ball, returned a punt, returned a kickoff, punted the ball, kicked off the ball, recovered their own kickoff, or blocked the kick, respectively:
- interception_player_id
- interception_player_name
- punt_returner_player_id
- punt_returner_player_name
- kickoff_returner_player_name
- kickoff_returner_player_id
- punter_player_id
- punter_player_name
- kicker_player_name
- kicker_player_id
- own_kickoff_recovery_player_id
- own_kickoff_recovery_player_name
- blocked_player_id
- blocked_player_name
(pbp_2022.interception_player_id[28],
pbp_2022.interception_player_name[28],
pbp_2022.desc[28])
('00-0033894',
'M.Williams',
'(5:07) (Shotgun) 19-J.Flacco pass short middle intended for 81-L.Cager INTERCEPTED by 32-M.Williams at NYJ 46. 32-M.Williams to NYJ 13 for 33 yards (19-J.Flacco).')
(pbp_2022.punt_returner_player_id[6],
pbp_2022.punt_returner_player_name[6],
pbp_2022.desc[6])
('00-0036331',
'D.Duvernay',
'(13:53) 7-B.Mann punts 45 yards to BAL 19, Center-42-T.Hennessy. 13-D.Duvernay pushed ob at BAL 28 for 9 yards (42-T.Hennessy).')
(pbp_2022.kickoff_returner_player_id[1],
pbp_2022.kickoff_returner_player_name[1],
pbp_2022.desc[1])
('00-0034419',
'B.Berrios',
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
(pbp_2022.punter_player_id[6],
pbp_2022.punter_player_name[6],
pbp_2022.desc[6])
('00-0036313',
'B.Mann',
'(13:53) 7-B.Mann punts 45 yards to BAL 19, Center-42-T.Hennessy. 13-D.Duvernay pushed ob at BAL 28 for 9 yards (42-T.Hennessy).')
(pbp_2022.kicker_player_id[1],
pbp_2022.kicker_player_name[1],
pbp_2022.desc[1])
('00-0029597',
'J.Tucker',
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
(pbp_2022.own_kickoff_recovery_player_id[4964],
pbp_2022.own_kickoff_recovery_player_name[4964],
pbp_2022.desc[4964])
('00-0033770',
'J.Hardee',
'7-B.Mann kicks onside 12 yards from NYJ 35 to NYJ 47. RECOVERED by NYJ-34-J.Hardee.')
(pbp_2022.blocked_player_id[1947],
pbp_2022.blocked_player_name[1947],
pbp_2022.desc[1947])
('00-0036926',
'P.Turner',
'(:02) 7-Y.Koo 63 yard field goal is BLOCKED (98-P.Turner), Center-48-L.McCullough, Holder-13-B.Pinion, recovered by ATL-13-B.Pinion at ATL 49. 13-B.Pinion to 50 for 1 yard (53-Z.Baun, 48-J.Gray).')
The following fields show player_id (string), player_name (string) or team (string) for a variety of defensive plays such as tackle for loss, quarterback hit, solo tackle, assist tackle and so on.
- tackle_for_loss_1_player_id
- tackle_for_loss_1_player_name
- tackle_for_loss_2_player_id
- tackle_for_loss_2_player_name
- qb_hit_1_player_id
- qb_hit_1_player_name
- qb_hit_2_player_id
- qb_hit_2_player_name
- solo_tackle_1_team
- solo_tackle_2_team
- solo_tackle_1_player_id
- solo_tackle_2_player_id
- solo_tackle_1_player_name
- solo_tackle_2_player_name
- assist_tackle_1_player_id
- assist_tackle_1_player_name
- assist_tackle_1_team
- assist_tackle_2_player_id
- assist_tackle_2_player_name
- assist_tackle_2_team
- assist_tackle_3_player_id
- assist_tackle_3_player_name
- assist_tackle_3_team
- assist_tackle_4_player_id
- assist_tackle_4_player_name
- assist_tackle_4_team
- tackle_with_assist
- tackle_with_assist_1_player_id
- tackle_with_assist_1_player_name
- tackle_with_assist_1_team
- tackle_with_assist_2_player_id
- tackle_with_assist_2_player_name
- tackle_with_assist_2_team
- pass_defense_1_player_id
- pass_defense_1_player_name
- pass_defense_2_player_id
- pass_defense_2_player_name
- sack_player_id
- sack_player_name
- half_sack_1_player_id
- half_sack_1_player_name
- half_sack_2_player_id
- half_sack_2_player_name
(pbp_2022.tackled_for_loss[15],
pbp_2022.tackle_for_loss_1_player_id[15],
pbp_2022.tackle_for_loss_1_player_name[15],
pbp_2022.desc[15])
(1.0,
'00-0036130',
'J.Madubuike',
'(9:49) 20-Br.Hall right end to NYJ 9 for -2 yards (92-J.Madubuike).')
There are no plays where tackle_for_loss_2_player_id has a value.
pbp_2022.tackle_for_loss_2_player_id.unique()
array([nan])
(pbp_2022.qb_hit[5],
pbp_2022.qb_hit_1_player_id[5],
pbp_2022.qb_hit_1_player_name[5],
pbp_2022.desc[5])
(1.0,
'00-0026190',
'C.Campbell',
'(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')
(pbp_2022.qb_hit[55],
pbp_2022.qb_hit_1_player_id[55],
pbp_2022.qb_hit_1_player_name[55],
pbp_2022.qb_hit_2_player_id[55],
pbp_2022.qb_hit_2_player_name[55],
pbp_2022.desc[55])
(1.0,
'00-0034163',
'J.Johnson',
'00-0034163',
'J.Martin',
'(8:59) (Shotgun) 8-L.Jackson sacked at BAL 49 for -2 yards (sack split by 52-J.Johnson and 54-J.Martin).')
(pbp_2022.solo_tackle[777],
pbp_2022.solo_tackle_1_team[777],
pbp_2022.solo_tackle_1_player_id[777],
pbp_2022.solo_tackle_1_player_name[777],
pbp_2022.solo_tackle_2_team[777],
pbp_2022.solo_tackle_2_player_id[777],
pbp_2022.solo_tackle_2_player_name[777],
pbp_2022.desc[777])
(1.0,
'MIN',
'00-0032129',
'J.Hicks',
'GB',
'00-0036631',
'R.Newman',
'(12:21) 12-A.Rodgers sacked at GB 35 for -9 yards (58-J.Hicks). FUMBLES (58-J.Hicks) [58-J.Hicks], RECOVERED by MIN-94-D.Tomlinson at GB 33. 94-D.Tomlinson to GB 33 for no gain (70-R.Newman).')
(pbp_2022.assist_tackle[2],
pbp_2022.assist_tackle_1_team[2],
pbp_2022.assist_tackle_1_player_id[2],
pbp_2022.assist_tackle_1_player_name[2],
pbp_2022.assist_tackle_2_team[2],
pbp_2022.assist_tackle_2_player_id[2],
pbp_2022.assist_tackle_2_player_name[2],
pbp_2022.desc[2])
(1.0,
'BAL',
'00-0033894',
'M.Williams',
'BAL',
'00-0033294',
'C.Clark',
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
There are no plays where assist_tackle_3_player_id or assist_tackle_4_player_id has a value.
pbp_2022.assist_tackle_3_player_id.unique(), pbp_2022.assist_tackle_4_player_id.unique()
(array([nan]), array([nan]))
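Rather than checking fields like tackle_for_loss_2_player_id and assist_tackle_3_player_id one at a time, the same question can be asked of every column at once. This is a minimal sketch, assuming a DataFrame shaped like pbp_2022; the helper name all_nan_columns and the two-row toy frame are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

def all_nan_columns(df):
    """Return the names of columns that contain no non-missing values."""
    return [col for col in df.columns if df[col].isna().all()]

# Toy stand-in for pbp_2022: one populated column, one empty column.
toy = pd.DataFrame({
    "tackle_for_loss_1_player_name": [np.nan, "J.Madubuike"],
    "tackle_for_loss_2_player_name": [np.nan, np.nan],
})
empty_cols = all_nan_columns(toy)
print(empty_cols)
```

Any column this returns for the full season can probably be dropped from an analysis table without losing information.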
tackle_with_assist is not the same as assist_tackle.
(pbp_2022.tackle_with_assist[2],
pbp_2022.tackle_with_assist_1_team[2],
pbp_2022.tackle_with_assist_1_player_id[2],
pbp_2022.tackle_with_assist_1_player_name[2],
pbp_2022.tackle_with_assist_2_team[2],
pbp_2022.tackle_with_assist_2_player_id[2],
pbp_2022.tackle_with_assist_2_player_name[2],
pbp_2022.desc[2])
(0.0,
nan,
nan,
nan,
nan,
nan,
nan,
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
(pbp_2022.tackle_with_assist[22659],
pbp_2022.tackle_with_assist_1_team[22659],
pbp_2022.tackle_with_assist_1_player_id[22659],
pbp_2022.tackle_with_assist_1_player_name[22659],
pbp_2022.tackle_with_assist_2_team[22659],
pbp_2022.tackle_with_assist_2_player_id[22659],
pbp_2022.tackle_with_assist_2_player_name[22659],
pbp_2022.desc[22659])
(1.0,
'LAC',
'00-0031040',
'K.Mack',
'ATL',
'00-0035208',
'O.Zaccheaus',
'(9:31) (No Huddle, Shotgun) 1-M.Mariota pass short left to 5-D.London to LAC 6 for 5 yards (52-K.Mack, 43-M.Davis). FUMBLES (52-K.Mack), RECOVERED by LAC-52-K.Mack at LAC 6. 52-K.Mack pushed ob at 50 for 44 yards (17-O.Zaccheaus, 5-D.London).')
I’ll explore this more before using these fields in analyses, but it seems like the assist_tackle fields provide information on players who assisted with the tackle, while the tackle_with_assist fields list information on the “main” player who was assisted on the tackle.
(pbp_2022.assist_tackle[22659],
pbp_2022.assist_tackle_1_team[22659],
pbp_2022.assist_tackle_1_player_id[22659],
pbp_2022.assist_tackle_1_player_name[22659],
pbp_2022.assist_tackle_2_team[22659],
pbp_2022.assist_tackle_2_player_id[22659],
pbp_2022.assist_tackle_2_player_name[22659],
pbp_2022.desc[22659])
(1.0,
'LAC',
'00-0033697',
'M.Davis',
'ATL',
'00-0037238',
'D.London',
'(9:31) (No Huddle, Shotgun) 1-M.Mariota pass short left to 5-D.London to LAC 6 for 5 yards (52-K.Mack, 43-M.Davis). FUMBLES (52-K.Mack), RECOVERED by LAC-52-K.Mack at LAC 6. 52-K.Mack pushed ob at 50 for 44 yards (17-O.Zaccheaus, 5-D.London).')
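One way to see how often assist_tackle and tackle_with_assist agree is to cross-tabulate the two flags. The sketch below uses a toy frame: the first two rows copy the flag values shown above for plays 2 and 22659, and the third row is made up. With the real data the same pd.crosstab call would run on pbp_2022.

```python
import pandas as pd

# Toy stand-in for pbp_2022. Rows 0 and 1 mirror plays 2 and 22659 above;
# row 2 is a made-up play with no assisted tackle at all.
toy = pd.DataFrame({
    "assist_tackle": [1.0, 1.0, 0.0],
    "tackle_with_assist": [0.0, 1.0, 0.0],
})

# Rows are assist_tackle values, columns are tackle_with_assist values.
overlap = pd.crosstab(toy["assist_tackle"], toy["tackle_with_assist"])
print(overlap)
```

Note that pd.crosstab drops rows where either flag is nan, so the cell counts only cover plays where both flags are populated.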
(pbp_2022.pass_defense_1_player_id[1613],
pbp_2022.pass_defense_1_player_name[1613],
pbp_2022.pass_defense_2_player_id[1613],
pbp_2022.pass_defense_2_player_name[1613],
pbp_2022.desc[1613])
('00-0033050',
'X.Howard',
'00-0036998',
'J.Holland',
'(10:05) (Shotgun) 10-M.Jones pass deep right intended for 1-D.Parker INTERCEPTED by 8-J.Holland (25-X.Howard) at MIA -3. 8-J.Holland to MIA 28 for 31 yards (76-I.Wynn).')
The following fields show player_id (string), player_name (string) or team (string) for a variety of fumble-related plays:
- forced_fumble_player_1_team
- forced_fumble_player_1_player_id
- forced_fumble_player_1_player_name
- forced_fumble_player_2_team
- forced_fumble_player_2_player_id
- forced_fumble_player_2_player_name
- fumbled_1_team
- fumbled_1_player_id
- fumbled_1_player_name
- fumbled_2_player_id
- fumbled_2_player_name
- fumbled_2_team
- fumble_recovery_1_team
- fumble_recovery_1_yards
- fumble_recovery_1_player_id
- fumble_recovery_1_player_name
- fumble_recovery_2_team
- fumble_recovery_2_yards
- fumble_recovery_2_player_id
- fumble_recovery_2_player_name
(pbp_2022.fumble_forced[9041],
pbp_2022.forced_fumble_player_1_team[9041],
pbp_2022.forced_fumble_player_1_player_id[9041],
pbp_2022.forced_fumble_player_1_player_name[9041],
pbp_2022.forced_fumble_player_2_team[9041],
pbp_2022.forced_fumble_player_2_player_id[9041],
pbp_2022.forced_fumble_player_2_player_name[9041],
pbp_2022.desc[9041])
(1.0,
'NYG',
'00-0033046',
'J.Ward',
'NYG',
'00-0036167',
'T.Crowder',
'(:03) (Shotgun) 1-J.Fields pass short right to 25-T.Ebner to CHI 35 for 2 yards. Lateral to 19-E.St. Brown to CHI 44 for 9 yards. FUMBLES, touched at CHI 44, recovered by CHI-1-J.Fields at CHI 39. 1-J.Fields to CHI 36 for -3 yards. Lateral to 19-E.St. Brown to CHI 44 for 8 yards. Lateral to 25-T.Ebner to NYG 44 for 12 yards (55-J.Ward). FUMBLES (55-J.Ward), recovered by CHI-62-L.Patrick at NYG 46. 62-L.Patrick to CHI 48 for -6 yards. Lateral to 1-J.Fields to CHI 49 for 1 yard. Lateral to 76-T.Jenkins to CHI 46 for -3 yards (48-T.Crowder). FUMBLES (48-T.Crowder), touched at CHI 45, recovered by CHI-25-T.Ebner at CHI 41. 25-T.Ebner to CHI 32 for -9 yards. FUMBLES, touched at CHI 32, RECOVERED by NYG-24-D.Belton at CHI 28.')
(pbp_2022.fumbled_1_team[9041],
pbp_2022.fumbled_1_player_id[9041],
pbp_2022.fumbled_1_player_name[9041],
pbp_2022.fumbled_2_team[9041],
pbp_2022.fumbled_2_player_id[9041],
pbp_2022.fumbled_2_player_name[9041],
pbp_2022.desc[9041])
('CHI',
'00-0034279',
'E.St. Brown',
'CHI',
'00-0036953',
'T.Ebner',
'(:03) (Shotgun) 1-J.Fields pass short right to 25-T.Ebner to CHI 35 for 2 yards. Lateral to 19-E.St. Brown to CHI 44 for 9 yards. FUMBLES, touched at CHI 44, recovered by CHI-1-J.Fields at CHI 39. 1-J.Fields to CHI 36 for -3 yards. Lateral to 19-E.St. Brown to CHI 44 for 8 yards. Lateral to 25-T.Ebner to NYG 44 for 12 yards (55-J.Ward). FUMBLES (55-J.Ward), recovered by CHI-62-L.Patrick at NYG 46. 62-L.Patrick to CHI 48 for -6 yards. Lateral to 1-J.Fields to CHI 49 for 1 yard. Lateral to 76-T.Jenkins to CHI 46 for -3 yards (48-T.Crowder). FUMBLES (48-T.Crowder), touched at CHI 45, recovered by CHI-25-T.Ebner at CHI 41. 25-T.Ebner to CHI 32 for -9 yards. FUMBLES, touched at CHI 32, RECOVERED by NYG-24-D.Belton at CHI 28.')
(pbp_2022.fumble_recovery_1_team[9041],
pbp_2022.fumble_recovery_1_player_id[9041],
pbp_2022.fumble_recovery_1_player_name[9041],
pbp_2022.fumble_recovery_1_yards[9041],
pbp_2022.fumble_recovery_2_team[9041],
pbp_2022.fumble_recovery_2_player_id[9041],
pbp_2022.fumble_recovery_2_player_name[9041],
pbp_2022.fumble_recovery_2_yards[9041],
pbp_2022.desc[9041])
('CHI',
'00-0036945',
'J.Fields',
-3.0,
'CHI',
'00-0033082',
'L.Patrick',
-6.0,
'(:03) (Shotgun) 1-J.Fields pass short right to 25-T.Ebner to CHI 35 for 2 yards. Lateral to 19-E.St. Brown to CHI 44 for 9 yards. FUMBLES, touched at CHI 44, recovered by CHI-1-J.Fields at CHI 39. 1-J.Fields to CHI 36 for -3 yards. Lateral to 19-E.St. Brown to CHI 44 for 8 yards. Lateral to 25-T.Ebner to NYG 44 for 12 yards (55-J.Ward). FUMBLES (55-J.Ward), recovered by CHI-62-L.Patrick at NYG 46. 62-L.Patrick to CHI 48 for -6 yards. Lateral to 1-J.Fields to CHI 49 for 1 yard. Lateral to 76-T.Jenkins to CHI 46 for -3 yards (48-T.Crowder). FUMBLES (48-T.Crowder), touched at CHI 45, recovered by CHI-25-T.Ebner at CHI 41. 25-T.Ebner to CHI 32 for -9 yards. FUMBLES, touched at CHI 32, RECOVERED by NYG-24-D.Belton at CHI 28.')
(pbp_2022.sack[54],
pbp_2022.sack_player_name[54],
pbp_2022.sack_player_id[54],
pbp_2022.desc[54])
(1.0,
'Qu.Williams',
'00-0035680',
'(9:43) (Shotgun) 8-L.Jackson sacked ob at NYJ 49 for 0 yards (56-Qu.Williams).')
When a sack is split, sack == 1 but sack_player_name and sack_player_id are nan.
(pbp_2022.sack[55],
pbp_2022.sack_player_name[55],
pbp_2022.sack_player_id[55],
pbp_2022.half_sack_1_player_id[55],
pbp_2022.half_sack_1_player_name[55],
pbp_2022.half_sack_2_player_id[55],
pbp_2022.half_sack_2_player_name[55],
pbp_2022.desc[55])
(1.0,
nan,
nan,
'00-0034163',
'J.Johnson',
'00-0034163',
'J.Martin',
'(8:59) (Shotgun) 8-L.Jackson sacked at BAL 49 for -2 yards (sack split by 52-J.Johnson and 54-J.Martin).')
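Because split sacks leave sack_player_name empty, a per-player sack total has to combine the full-sack and half-sack fields. Here is a minimal sketch of one way to do that, giving 1.0 credit for a solo sack and 0.5 for each half. The helper name sack_credits is hypothetical, and the toy frame only copies the player names from plays 54 and 55 above. I use names for readability; a real analysis would likely key on the player_id fields instead.

```python
import numpy as np
import pandas as pd

def sack_credits(df):
    """Return one row per credited player: 1.0 for a solo sack, 0.5 for a half."""
    full = df.loc[df["sack_player_name"].notna(), ["sack_player_name"]]
    full = full.rename(columns={"sack_player_name": "player"}).assign(credit=1.0)
    halves = []
    for col in ["half_sack_1_player_name", "half_sack_2_player_name"]:
        part = df.loc[df[col].notna(), [col]].rename(columns={col: "player"})
        halves.append(part.assign(credit=0.5))
    return pd.concat([full, *halves], ignore_index=True)

# Toy stand-in for pbp_2022 using the two sack plays shown above.
toy = pd.DataFrame({
    "sack": [1.0, 1.0],
    "sack_player_name": ["Qu.Williams", np.nan],
    "half_sack_1_player_name": [np.nan, "J.Johnson"],
    "half_sack_2_player_name": [np.nan, "J.Martin"],
})
credits = sack_credits(toy)
totals = credits.groupby("player")["credit"].sum()
print(totals)
```

A quick sanity check is that the credits sum to the number of rows where sack == 1.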
return_team (string) and return_yards (integer) are the abbreviation of the team that returned the kickoff or punt and the yards gained on the return. I’ll look into whether fumble returns are included before I use these fields for analyses.
(pbp_2022.return_team[1],
pbp_2022.return_yards[1],
pbp_2022.desc[1])
('NYJ',
25.0,
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
The following fields hold information about penalties.
- penalty_team (string)
- penalty_player_id (string)
- penalty_player_name (string)
- penalty_yards (integer)
- penalty_type (string)
(pbp_2022.penalty[5],
pbp_2022.penalty_team[5],
pbp_2022.penalty_player_id[5],
pbp_2022.penalty_player_name[5],
pbp_2022.penalty_yards[5],
pbp_2022.penalty_type[5],
pbp_2022.desc[5])
(1.0,
'NYJ',
'00-0026158',
'J.Flacco',
10.0,
'Intentional Grounding',
'(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')
pbp_2022.penalty_type.unique()
array([nan, 'Intentional Grounding', 'Illegal Contact',
'Offensive Holding', 'Defensive Pass Interference',
'Defensive Holding', 'Offensive Pass Interference', 'False Start',
'Horse Collar Tackle', 'Defensive Too Many Men on Field',
'Taunting', 'Delay of Game', 'Roughing the Passer',
'Unsportsmanlike Conduct', 'Low Block', 'Illegal Formation',
'Ineligible Downfield Pass', 'Unnecessary Roughness',
'Neutral Zone Infraction', 'Running Into the Kicker',
'Illegal Shift', 'Defensive Offside', 'Illegal Use of Hands',
'Illegal Block Above the Waist', 'Offensive Too Many Men on Field',
'Encroachment', 'Disqualification', 'Ineligible Downfield Kick',
'Face Mask', 'Player Out of Bounds on Kick',
'Illegal Forward Pass', 'Chop Block', 'Delay of Kickoff',
'Tripping', 'Illegal Substitution', 'Offensive Offside',
'Illegal Blindside Block', 'Illegal Touch Pass',
'Offside on Free Kick', 'Roughing the Kicker',
'Fair Catch Interference', 'Leverage', 'Illegal Motion',
'Defensive Delay of Game', 'Illegal Bat', 'Illegal Touch Kick',
'Illegal Double-Team Block', 'Invalid Fair Catch Signal',
'Illegal Crackback', 'Illegal Kick/Kicking Loose Ball'],
dtype=object)
replay_or_challenge (1 for True and 0 for False) and replay_or_challenge_result (nan, 'upheld', or 'reversed') show whether a replay or challenge occurred on the play and what its outcome was.
(pbp_2022.replay_or_challenge[621],
pbp_2022.replay_or_challenge_result[621],
pbp_2022.desc[621])
(1,
'upheld',
'(7:42) (Shotgun) 25-M.Gordon right tackle to SEA 1 for no gain (6-Q.Diggs, 10-U.Nwosu). FUMBLES (6-Q.Diggs), RECOVERED by SEA-30-M.Jackson at SEA 2. 30-M.Jackson to SEA 10 for 8 yards (14-C.Sutton). The Replay Official reviewed the fumble ruling, and the play was Upheld. The ruling on the field stands.')
safety_player_name and safety_player_id have information about the player who caused the safety.
(pbp_2022.safety[3255],
pbp_2022.safety_player_name[3255],
pbp_2022.safety_player_id[3255],
pbp_2022.desc[3255])
(1.0,
'D.Alford',
'00-0037034',
'(:13) (Run formation) 19-B.Powell right end ran ob in End Zone for -26 yards, SAFETY (37-D.Alford).')
series_result is the result of the offensive series.
pbp_2022.series_result.unique()
array(['First down', 'Punt', 'Turnover', 'Field goal',
'Missed field goal', 'Touchdown', 'End of half',
'Turnover on downs', 'QB kneel', 'Opp touchdown', 'Safety', nan],
dtype=object)
play_type_nfl shows slightly different play type categories.
pbp_2022.play_type_nfl.unique()
array(['GAME_START', 'KICK_OFF', 'RUSH', 'PASS', 'PUNT', 'TIMEOUT',
'PENALTY', 'FIELD_GOAL', 'END_QUARTER', 'SACK', 'XP_KICK',
'END_GAME', 'PAT2', nan, 'FREE_KICK'], dtype=object)
drive_play_count shows how many plays the drive had. It doesn’t always seem to match the number of plays on the drive, so I need to understand how this value is calculated before using it for analyses.
pbp_2022.drive_play_count.unique()
array([nan, 4., 6., 5., 3., 8., 1., 9., 16., 11., 2., 13., 7.,
14., 10., 15., 12., 0., 18., 19., 20., 17., 21.])
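To pin down where drive_play_count disagrees with the data, one option is to count the rows in each drive and compare. This is only a sketch: it assumes the dataset has game_id and drive columns that identify a drive (those names are my assumption here), the helper compare_drive_counts is hypothetical, and the toy values are made up.

```python
import pandas as pd

def compare_drive_counts(df):
    """Compare the reported drive_play_count with the number of rows per drive."""
    per_drive = (
        df.groupby(["game_id", "drive"])
        .agg(rows=("play_id", "size"), reported=("drive_play_count", "first"))
        .reset_index()
    )
    # Positive diff: more rows in the data than the reported play count.
    per_drive["diff"] = per_drive["rows"] - per_drive["reported"]
    return per_drive

# Toy stand-in for pbp_2022 (made-up values): drive 1 has 3 rows but reports 2.
toy = pd.DataFrame({
    "game_id": ["g1"] * 5,
    "drive": [1, 1, 1, 2, 2],
    "play_id": [43, 68, 89, 111, 134],
    "drive_play_count": [2.0, 2.0, 2.0, 2.0, 2.0],
})
drive_check = compare_drive_counts(toy)
print(drive_check)
```

The drives with a nonzero diff are the ones worth reading by hand to see which rows are not being counted as plays.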
drive_time_of_possession is a formatted string of minutes:seconds the drive took.
pbp_2022.drive_time_of_possession.unique()[:5]
array([nan, '1:18', '3:53', '2:44', '1:04'], dtype=object)
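Since drive_time_of_possession is a string, it needs converting before it can be averaged or summed. A minimal sketch, using the five values shown above; the helper name possession_seconds is hypothetical.

```python
import numpy as np
import pandas as pd

def possession_seconds(series):
    """Convert 'M:SS' strings to seconds, keeping missing values as NaN."""
    parts = series.str.split(":", expand=True).astype(float)
    return parts[0] * 60 + parts[1]

# The first five unique values from the output above.
top = pd.Series([np.nan, "1:18", "3:53", "2:44", "1:04"])
seconds = possession_seconds(top)
print(seconds)
```

Splitting on the colon rather than parsing as a timestamp avoids any ambiguity about whether '1:18' means hours and minutes or minutes and seconds.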
drive_first_downs is the number of first downs achieved on the drive.
pbp_2022.drive_first_downs.unique()[:5]
array([nan, 1., 0., 3., 2.])
drive_inside20 is either nan, 1 (True) or 0 (False) and indicates if a drive ended inside of the red zone (within 20 yards of the opponent’s end zone).
pbp_2022.drive_inside20.unique()
array([nan, 0., 1.])
drive_ended_with_score indicates if a drive ended with the offensive team scoring. It is either nan, 1 (True) or 0 (False).
pbp_2022.drive_ended_with_score.unique()
array([nan, 0., 1.])
I’ll have to look into it more before using it for analyses, but I believe drive_yards_penalized is the total number of offensive penalty yards on the drive.
pbp_2022.drive_yards_penalized.unique()[:5]
array([ nan, -10., 0., 5., 32.])
drive_play_id_started and drive_play_id_ended indicate the start and end play_id of the drive. Note that play_id values are not consecutive and don’t start at 1.
(pbp_2022.drive_play_id_started[1],
pbp_2022.drive_play_id_ended[1])
(43.0, 172.0)
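Because play_id has gaps, "the next play" can't be found by adding 1 to it. A sketch of one workaround is to sort by play_id within each game and number the rows. The game_id column name is an assumption on my part, and apart from the starting play_id of 43 shown above, the toy values are made up.

```python
import pandas as pd

# Toy stand-in for pbp_2022: play_id 43 comes from the output above,
# the other values are made up to illustrate the gaps.
toy = pd.DataFrame({
    "game_id": ["g1", "g1", "g1"],
    "play_id": [43, 68, 89],
})
toy = toy.sort_values(["game_id", "play_id"])

# Consecutive 1-based position of each row within its game.
toy["play_number"] = toy.groupby("game_id").cumcount() + 1
# Size of the jump in play_id from the previous row of the same game.
toy["id_gap"] = toy.groupby("game_id")["play_id"].diff()
print(toy)
```

With play_number in place, previous and next plays can be lined up with a grouped shift instead of arithmetic on play_id.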
away_score and home_score are the final scores of the away team and home team.
(pbp_2022.away_team[1],
pbp_2022.away_score[1],
pbp_2022.home_team[1],
pbp_2022.home_score[1])
('BAL', 24, 'NYJ', 9)
result is the home team’s final score minus the away team’s (I think; I’ll look into it more). For this game, 9 - 24 = -15.
pbp_2022.result[1]
-15
total is the total number of points scored by both teams.
pbp_2022.total[1]
33
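The definitions of result and total above can be checked against the score columns directly. This sketch uses a single row holding the BAL at NYJ values shown above; on the real data the same two comparisons would run across every row of pbp_2022.

```python
import pandas as pd

# One row holding the values shown above for the BAL at NYJ game.
games = pd.DataFrame({
    "away_score": [24],
    "home_score": [9],
    "result": [-15],
    "total": [33],
})

# result should equal home minus away; total should equal home plus away.
result_ok = (games["home_score"] - games["away_score"]).eq(games["result"])
total_ok = (games["home_score"] + games["away_score"]).eq(games["total"])
print(result_ok.all(), total_ok.all())
```

If either check fails for some games in the full season, those rows would be the place to start looking for a different definition.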
div_game indicates if the game is between teams in the same division. It is either 1 (True) or 0 (False).
pbp_2022.div_game.unique(), pbp_2022.div_game[1]
(array([0, 1]), 0)
away_coach and home_coach are the names of the away team and home team coaches, respectively.
pbp_2022.away_coach[1], pbp_2022.home_coach[1]
('John Harbaugh', 'Robert Saleh')
The following fields give the name, player ID and jersey number of the passer, rusher or receiver on the play:
- passer
- passer_id
- passer_jersey_number
- rusher
- rusher_id
- rusher_jersey_number
- receiver
- receiver_id
- receiver_jersey_number
(pbp_2022.passer[3],
pbp_2022.passer_id[3],
pbp_2022.passer_jersey_number[3])
('J.Flacco', '00-0026158', 19.0)
(pbp_2022.rusher[2],
pbp_2022.rusher_id[2],
pbp_2022.rusher_jersey_number[2])
('Mi.Carter', '00-0036924', 32.0)
(pbp_2022.receiver[3],
pbp_2022.receiver_id[3],
pbp_2022.receiver_jersey_number[3])
('Mi.Carter', '00-0036924', 32.0)
The following fields indicate if the play is a pass, rush, first down, or special teams, respectively. Their value is nan, 1 (True) or 0 (False):
- pass
- rush
- first_down
- special
pbp_2022['pass'][3], pbp_2022.desc[3]
(1,
'(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
pbp_2022.rush[2], pbp_2022.desc[2]
(1,
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
pbp_2022.first_down[2], pbp_2022.desc[2]
(1.0,
'(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
pbp_2022.special[1], pbp_2022.desc[1]
(1,
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')