Exploring NFL Play-by-Play Data with SQL and Python

data analysis
SQL
python
An update on my analysis and visualization of NFL play-by-play data.
Author

Vishal Bakshi

Published

February 11, 2023

Side view of Jalen Hurts walking on the Eagles sideline with Kansas City Chiefs-colored confetti falling around him

Background and Goals

It’s been 5 years since I last explored the NFL’s play-by-play data. It’s also been 5 years since my Eagles won the Super Bowl, and this year’s Super Bowl will be played in less than 24 hours. Go Birds.

It’s been so long since I’ve blogged that fastpages, the blogging library I use, has been deprecated.

I have thoroughly enjoyed some of the statistical analyses put forth by fans of the NFL this year. My favorite analyst is Deniz Selman, a fellow Eagles fan who makes these beautiful data presentations.

I also appreciate Deniz’ critique of analysis-without-context that often negates the brilliance of Jalen Hurts:

My second favorite analyst is Ben Baldwin, AKA Computer Cowboy, especially for his real-time 4th down analysis during games.

There has been an onslaught of statistical advances in the NFL since I last explored play-by-play data and I’m excited to learn as much as I can. In particular, I’d like to get the hang of the metrics EPA (Expected Points Added) and DVOA (Defense-adjusted Value Over Average), which may not necessarily intersect with my play-by-play analysis (I believe Football Outsiders is the proprietor of the DVOA formula).

I’d also like to use this project to practice more advanced SQL queries than I’m used to. Given the complexity of the play-by-play dataset (by team, down, field position, etc.) I’m hoping I can get those reps in.

Lastly, I’d like to explore data presentation with these statistics using R, Python, Adobe Illustrator and Photoshop. I’ve been inspired by simple, elegant graphics like those made by Peter Gorman in Barely Maps and bold, picturesque statistics posted by PFF on Twitter:

I’ll work on this project in this post throughout this year–and maybe beyond if it fuels me with enough material–or it’ll fork off into something entirely new or different.

I’ll start by exploring the schema of the play-by-play dataset.

Documenting the NFL Play-by-Play Dataset Fields

In this section, I describe the fields in the 2022 NFL Play-by-Play Dataset. Not all of the fields are intuitive or immediately useful, so not all 372 column descriptions will be listed.

import pandas as pd
import numpy as np
# load the data
fpath = "../../../nfl_pbp_data/play_by_play_2022.csv"
pbp_2022 = pd.read_csv(fpath, low_memory=False)

pbp_2022.head()
play_id game_id old_game_id home_team away_team season_type week posteam posteam_type defteam ... out_of_bounds home_opening_kickoff qb_epa xyac_epa xyac_mean_yardage xyac_median_yardage xyac_success xyac_fd xpass pass_oe
0 1 2022_01_BAL_NYJ 2022091107 NYJ BAL REG 1 NaN NaN NaN ... 0 1 0.000000 NaN NaN NaN NaN NaN NaN NaN
1 43 2022_01_BAL_NYJ 2022091107 NYJ BAL REG 1 NYJ home BAL ... 0 1 -0.443521 NaN NaN NaN NaN NaN NaN NaN
2 68 2022_01_BAL_NYJ 2022091107 NYJ BAL REG 1 NYJ home BAL ... 0 1 1.468819 NaN NaN NaN NaN NaN 0.440373 -44.037291
3 89 2022_01_BAL_NYJ 2022091107 NYJ BAL REG 1 NYJ home BAL ... 0 1 -0.492192 0.727261 6.988125 6.0 0.60693 0.227598 0.389904 61.009598
4 115 2022_01_BAL_NYJ 2022091107 NYJ BAL REG 1 NYJ home BAL ... 0 1 -0.325931 NaN NaN NaN NaN NaN 0.443575 -44.357494

5 rows × 372 columns

The 2022 NFL Play-by-Play dataset has 50147 rows (plays) and 372 columns.

pbp_2022.shape
(50147, 372)

play_id is an identifier for each play within a game. It is not a unique identifier across the dataset, as the same values repeat from game to game. There are 4597 unique play_id values in this dataset.

len(pbp_2022.play_id.unique())
4597
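
Since play_id repeats across games, one way to identify a play is to pair it with game_id. Here is a minimal sketch on a made-up three-row frame (`toy` and its values are illustrative, not rows from the dataset). Whether the pair is truly unique in the full data is an assumption worth checking the same way.

```python
import pandas as pd

# Toy stand-in for pbp_2022: the same play_id appears in two different games
toy = pd.DataFrame({
    "game_id": ["2022_01_BAL_NYJ", "2022_01_BAL_NYJ", "2022_01_BUF_LA"],
    "play_id": [1, 43, 1],
})

# play_id alone repeats across games
play_id_is_unique = toy.play_id.is_unique

# the (game_id, play_id) pair does not repeat in this toy frame
pair_is_unique = not toy.duplicated(subset=["game_id", "play_id"]).any()

print(play_id_is_unique, pair_is_unique)
```

Running the same two checks against pbp_2022 would confirm whether the composite key holds for the real data.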

game_id is an identifier for each game in the dataset in the format of {year}_{week}_{away_team}_{home_team}. There are 284 unique games in this dataset.

len(pbp_2022.game_id.unique()), pbp_2022.game_id[1]
(284, '2022_01_BAL_NYJ')
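
The game_id format can be unpacked with a simple split. This is a small sketch, and `parse_game_id` is a hypothetical helper name rather than anything from the dataset's tooling. It assumes every game_id follows the four-part pattern described above.

```python
def parse_game_id(game_id):
    """Split a game_id such as '2022_01_BAL_NYJ' into its four parts."""
    year, week, away_team, home_team = game_id.split("_")
    return {
        "year": int(year),
        "week": int(week),
        "away_team": away_team,
        "home_team": home_team,
    }

parsed = parse_game_id("2022_01_BAL_NYJ")
print(parsed)
```

The parsed away and home teams agree with the away_team (BAL) and home_team (NYJ) columns shown in the first rows of the dataset.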

There are 32 unique home_teams and away_teams.

len(pbp_2022.home_team.unique()), len(pbp_2022.away_team.unique())
(32, 32)

There are two season_type values: 'REG' for regular season and 'POST' for postseason.

pbp_2022.season_type.unique()
array(['REG', 'POST'], dtype=object)

There are 22 week values:

  • 18 regular season weeks (17 games + 1 bye)
  • 4 postseason weeks: Wild Card Weekend, Divisional Playoffs, Conference Championships and the Super Bowl

pbp_2022.week.unique()
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22])
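
To make the week numbers easier to read in later analyses, they could be mapped to labels. This sketch assumes weeks 19 through 22 correspond to the postseason rounds in the order listed above; `POSTSEASON_ROUNDS` and `week_label` are hypothetical names.

```python
# Assumed mapping: postseason weeks follow the order of the rounds
POSTSEASON_ROUNDS = {
    19: "Wild Card Weekend",
    20: "Divisional Playoffs",
    21: "Conference Championships",
    22: "Super Bowl",
}

def week_label(week):
    """Return a readable label for a 2022 week number (1-22)."""
    if 1 <= week <= 18:
        return f"Regular season week {week}"
    return POSTSEASON_ROUNDS[week]

print(week_label(1), "|", week_label(22))
```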

I believe posteam stands for the team that has possession of the ball. There are 32 unique teams that can have possession of the ball in a game, and in some cases the posteam is nan.

len(pbp_2022.posteam.unique()), pbp_2022.posteam.unique()
(33,
 array([nan, 'NYJ', 'BAL', 'BUF', 'LA', 'CAR', 'CLE', 'SEA', 'DEN', 'MIN',
        'GB', 'IND', 'HOU', 'JAX', 'WAS', 'KC', 'ARI', 'LAC', 'LV', 'NE',
        'MIA', 'ATL', 'NO', 'NYG', 'TEN', 'DET', 'PHI', 'PIT', 'CIN',
        'CHI', 'SF', 'DAL', 'TB'], dtype=object))
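
The count of 33 above includes nan, because `unique()` keeps missing values. A short sketch on a toy Series (illustrative values only) shows how `nunique()` counts teams without the nan:

```python
import numpy as np
import pandas as pd

# Toy stand-in for pbp_2022.posteam
posteam = pd.Series([np.nan, "NYJ", "BAL", "NYJ"])

n_with_nan = len(posteam.unique())  # nan counts as its own value
n_teams = posteam.nunique()         # nan is excluded by default
n_missing = posteam.isna().sum()    # number of rows with no posteam

print(n_with_nan, n_teams, n_missing)
```

The same distinction applies to the other fields in this post where nan shows up in the `unique()` output.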

posteam_type has values 'home', 'away' and nan.

len(pbp_2022.posteam_type.unique()), pbp_2022.posteam_type.unique()
(3, array([nan, 'home', 'away'], dtype=object))

defteam lists any of the 32 teams on defense on a given play. It can also have the value nan.

len(pbp_2022.defteam.unique()), pbp_2022.defteam.unique()
(33,
 array([nan, 'BAL', 'NYJ', 'LA', 'BUF', 'CLE', 'CAR', 'DEN', 'SEA', 'GB',
        'MIN', 'HOU', 'IND', 'WAS', 'JAX', 'ARI', 'KC', 'LV', 'LAC', 'MIA',
        'NE', 'NO', 'ATL', 'TEN', 'NYG', 'PHI', 'DET', 'CIN', 'PIT', 'SF',
        'CHI', 'TB', 'DAL'], dtype=object))

side_of_field can be nan, any of the 32 team abbreviations, or 50 (midfield).

len(pbp_2022.side_of_field.unique()), pbp_2022.side_of_field.unique()
(34,
 array([nan, 'BAL', 'NYJ', 'LA', 'BUF', '50', 'CLE', 'CAR', 'DEN', 'SEA',
        'GB', 'MIN', 'HOU', 'IND', 'WAS', 'JAX', 'ARI', 'KC', 'LV', 'LAC',
        'MIA', 'NE', 'NO', 'ATL', 'TEN', 'NYG', 'PHI', 'DET', 'CIN', 'PIT',
        'SF', 'CHI', 'TB', 'DAL'], dtype=object))

yardline_100 can be nan or between 1 and 99.

len(pbp_2022.yardline_100.unique()), np.nanmin(pbp_2022.yardline_100), np.nanmax(pbp_2022.yardline_100)
(100, 1.0, 99.0)

There are 61 game_date values.

len(pbp_2022.game_date.unique()), pbp_2022.game_date[0]
(61, '2022-09-11')

quarter_seconds_remaining is between 0 and 900 (15 minutes).

pbp_2022.quarter_seconds_remaining.min(), pbp_2022.quarter_seconds_remaining.max()
(0, 900)

half_seconds_remaining is between 0 and 1800 (30 minutes).

pbp_2022.half_seconds_remaining.min(), pbp_2022.half_seconds_remaining.max()
(0, 1800)

game_seconds_remaining is between 0 and 3600 (60 minutes).

pbp_2022.game_seconds_remaining.min(), pbp_2022.game_seconds_remaining.max()
(0, 3600)
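
The three clock fields should be related by simple arithmetic during regulation. The sketch below is my assumption about that relationship (four 900-second quarters, two per half), not something taken from the dataset documentation, and it does not handle overtime. The function names are hypothetical.

```python
QUARTER_SECONDS = 900  # 15 minutes

def regulation_game_seconds_remaining(qtr, quarter_seconds_remaining):
    """Seconds left in regulation, given the quarter (1-4) and its clock."""
    if qtr not in (1, 2, 3, 4):
        raise ValueError("only regulation quarters 1-4 are handled")
    return (4 - qtr) * QUARTER_SECONDS + quarter_seconds_remaining

def regulation_half_seconds_remaining(qtr, quarter_seconds_remaining):
    """Seconds left in the current half, given the quarter (1-4) and its clock."""
    if qtr not in (1, 2, 3, 4):
        raise ValueError("only regulation quarters 1-4 are handled")
    # Quarters 1 and 3 still have one full quarter left in their half
    full_quarters_left = 1 if qtr in (1, 3) else 0
    return full_quarters_left * QUARTER_SECONDS + quarter_seconds_remaining

print(regulation_game_seconds_remaining(1, 900))  # opening kickoff
print(regulation_half_seconds_remaining(2, 30))   # 30 seconds before halftime
```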

game_half is either Half1 (first half), Half2 (second half), or Overtime.

pbp_2022.game_half.unique()
array(['Half1', 'Half2', 'Overtime'], dtype=object)

quarter_end is either 1 (True) or 0 (False).

pbp_2022.quarter_end.unique(), pbp_2022.query('quarter_end == 1').desc[41]
(array([0, 1]), 'END QUARTER 1')
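
A note on the `.desc[41]` pattern used here and in later cells: `query` keeps the original row labels, so `[41]` works only because the first matching row happens to carry the label 41. A sketch on toy data (illustrative values) shows that `.iloc[0]` returns the first match without needing to know its label:

```python
import pandas as pd

toy = pd.DataFrame({
    "quarter_end": [0, 0, 1, 0, 1],
    "desc": ["kickoff", "run", "END QUARTER 1", "pass", "END QUARTER 2"],
})

quarter_ends = toy.query("quarter_end == 1")

kept_labels = list(quarter_ends.index)          # original labels survive the filter
first_by_position = quarter_ends.desc.iloc[0]   # first matching row, whatever its label
first_by_label = quarter_ends.desc[2]           # works only if you know the label is 2

print(kept_labels, first_by_position)
```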

drive is the running count of drives in the game (across both teams). It can also be nan.

pbp_2022.drive.unique()
array([nan,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
       13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25.,
       26., 27., 28., 29., 30., 31., 32., 33., 34., 35.])

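sp indicates whether a score occurred on the play, either 1 (True) or 0 (False). The name reads like shorthand for Special Teams, but the nflfastR field descriptions define it as a scoring-play indicator, which fits the made field goal in the example below.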

pbp_2022.sp.unique(), pbp_2022.query('sp == 1').desc[32]
(array([0, 1]),
 '(3:19) 9-J.Tucker 24 yard field goal is GOOD, Center-46-N.Moore, Holder-11-J.Stout.')

qtr indicates the current quarter of the play. qtr == 5 represents Overtime.

pbp_2022.qtr.unique()
array([1, 2, 3, 4, 5])

down represents the current down of the play (nan, 1st, 2nd, 3rd or 4th).

pbp_2022.down.unique()
array([nan,  1.,  2.,  3.,  4.])

goal_to_go indicates whether this play is 1st & Goal, 2nd & Goal, 3rd & Goal or 4th & Goal, either 1 (True) or 0 (False).

pbp_2022.goal_to_go.unique()
array([0, 1])

time is the minutes:seconds formatted time left in the current quarter.

pbp_2022.head().time.unique()
array(['15:00', '14:56', '14:29', '14:25'], dtype=object)
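
Because time is stored as a string, it needs converting before any arithmetic. This is a minimal sketch; `clock_to_seconds` is a hypothetical helper. The tolerance for a missing minutes part (as in the "(:59)" style seen in play descriptions) is an extra I added, not something the time column is known to need.

```python
def clock_to_seconds(clock):
    """Convert a 'MM:SS' clock string into seconds."""
    minutes, seconds = clock.split(":")
    # `minutes or 0` tolerates strings such as ':59'
    return int(minutes or 0) * 60 + int(seconds)

print(clock_to_seconds("15:00"), clock_to_seconds("14:56"))
```

For the time column, the result should line up with quarter_seconds_remaining, which is worth verifying on the real data.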

yrdln is a formatted string of team abbreviation and yard number.

pbp_2022.yrdln.unique()
array(['BAL 35', 'NYJ 22', 'NYJ 41', ..., 'NYJ 3', 'CIN 6', 'MIN 12'],
      dtype=object)
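
The yrdln string can be split into its team and yard parts. In this sketch, `parse_yrdln` is a hypothetical helper, and I am assuming values are either "TEAM NN" or a bare number; I have not confirmed how midfield is written in this column.

```python
def parse_yrdln(yrdln):
    """Split a yrdln string such as 'BAL 35' into (team, yard)."""
    parts = yrdln.split()
    yard = int(parts[-1])
    # A bare number (no team prefix) yields team=None
    team = parts[0] if len(parts) == 2 else None
    return team, yard

print(parse_yrdln("BAL 35"), parse_yrdln("50"))
```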

ydstogo is the number of yards before the next first down.

pbp_2022.ydstogo.unique()
array([ 0, 10,  5, 15,  6,  2,  1, 12,  9, 19, 11,  3,  8,  4, 16, 17,  7,
       20, 14, 18, 13, 22, 26, 24, 21, 25, 23, 28, 30, 27, 31, 38, 36, 29,
       34, 35, 32, 33])

ydsnet is the net yards (yards gained - yards lost) of the current drive.

pbp_2022.ydsnet.unique()
array([ nan,  14.,  21.,   7.,   1.,  15.,   9.,  16.,  44.,  18.,  62.,
        48.,   3.,  11.,   4.,  88.,  75.,  23.,  43.,  -2.,  38.,   0.,
        45.,  60.,  13.,   6.,  -1.,  58.,  25.,  89.,  59.,  19.,  66.,
        29.,  -4.,  24.,   2.,  12.,  42.,  78.,  52.,  57.,  64.,  35.,
        -3.,  70.,  77.,  72.,  50.,  37.,  31.,  -6.,  32.,  -5.,  20.,
        79.,  74.,  34.,  65.,   8.,  47.,   5.,  69.,  53.,  33.,  76.,
        80., -16.,  71.,  68.,  55.,  27.,  90.,  86.,  17.,  30.,  67.,
        63.,  73.,  61., -13.,  92.,  40.,  22.,  -7.,  39.,  41.,  28.,
        82.,  49.,  10.,  36.,  46.,  84.,  54., -23., -11.,  83.,  26.,
        94.,  87., -10.,  85.,  51., -14.,  56.,  -8.,  81.,  -9.,  93.,
       -12., -15., -17.,  91.,  99.,  98., -19.,  96.,  95.,  97., -20.,
       -25.])

desc is a narrative description of the current play.

pbp_2022.head().desc[1]
'9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).'

play_type is either nan or one of 9 different play types, including no_play.

len(pbp_2022.play_type.unique()), pbp_2022.play_type.unique()
(10,
 array([nan, 'kickoff', 'run', 'pass', 'punt', 'no_play', 'field_goal',
        'extra_point', 'qb_kneel', 'qb_spike'], dtype=object))
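
Beyond listing the play types, it is useful to see how often each occurs, including the rows with no play type. A sketch on a toy Series (illustrative values, not real counts):

```python
import numpy as np
import pandas as pd

# Toy stand-in for pbp_2022.play_type
play_type = pd.Series([np.nan, "kickoff", "run", "pass", "pass", "no_play"])

counts_with_nan = play_type.value_counts(dropna=False)  # keeps the nan bucket
counts_without_nan = play_type.value_counts()           # drops nan by default

print(counts_with_nan)
```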

yards_gained is the number of yards gained (positive) or lost (negative) on the current play. It does not capture yards gained or lost due to a penalty.

pbp_2022.head().yards_gained, pbp_2022.yards_gained.min()
(0     NaN
 1     0.0
 2    19.0
 3     0.0
 4     5.0
 Name: yards_gained, dtype: float64,
 -26.0)

shotgun indicates whether the quarterback was in shotgun position, either 1 (True) or 0 (False).

pbp_2022.shotgun.unique(), pbp_2022.query('shotgun == 1').desc[3]
(array([0, 1]),
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')

no_huddle indicates whether the offense ran the play without huddling before the snap, either 1 (True, no huddle) or 0 (False).

pbp_2022.no_huddle.unique(), pbp_2022.query('no_huddle == 1').desc[3]
(array([0, 1]),
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')

qb_dropback indicates whether the quarterback drops back on the play, either 1 (True), 0 (False) or nan.

pbp_2022.qb_dropback.unique(), pbp_2022.query('qb_dropback == 1').desc[3]
(array([nan,  0.,  1.]),
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')

qb_kneel indicates whether the quarterback kneels on the play, either 1 (True) or 0 (False).

pbp_2022.qb_kneel.unique(), pbp_2022.query('qb_kneel == 1').desc[176]
(array([0, 1]), '(:59) 8-L.Jackson kneels to NYJ 43 for -1 yards.')

qb_spike indicates whether the quarterback spikes the ball on the play, either 1 (True) or 0 (False).

pbp_2022.qb_spike.unique(), pbp_2022.query('qb_spike == 1').desc[520]
(array([0, 1]),
 '(:29) (No Huddle) 7-J.Brissett spiked the ball to stop the clock.')

qb_scramble indicates whether the quarterback scrambles on the play, either 1 (True) or 0 (False). It looks like a scramble is not the same as a designed quarterback run, so I’ll dig deeper into this before using this field in analyses.

pbp_2022.qb_scramble.unique()
array([0, 1])

pass_length is either nan, 'short' or 'deep'. I’ll first understand what distance (in yards) corresponds to these designations before I use this field in analyses.

pbp_2022.pass_length.unique()
array([nan, 'short', 'deep'], dtype=object)

pass_location is either nan, 'left', 'right', or 'middle'.

pbp_2022.pass_location.unique()
array([nan, 'left', 'right', 'middle'], dtype=object)

air_yards is the number of yards a quarterback’s pass traveled in the air. It can be positive, zero or negative.

pbp_2022.air_yards.unique()
array([ nan,   0.,  -4.,   3.,   2.,  16.,  11.,   5.,  21.,  14.,  -1.,
         1.,   7.,   6.,  15.,  -3.,   8.,  10.,  50.,  27.,  25.,  -5.,
        31.,  -6.,  17.,  51.,  13.,   4.,  12.,  36.,   9.,  32.,  18.,
        22.,  -2.,  23.,  45.,  40.,  52.,  -7.,  26.,  29.,  20.,  47.,
        24.,  30.,  28.,  37.,  39.,  -8.,  19.,  41.,  38., -12.,  42.,
       -10.,  46.,  35.,  33.,  -9.,  34.,  44.,  43.,  53.,  57.,  48.,
        49.,  54.,  58.,  56.,  59.,  55.,  61., -18., -54., -13.,  62.,
        65., -20., -16.])

yards_after_catch is the number of yards the receiver gains or loses after catching the ball.

pbp_2022.yards_after_catch.unique()
array([ nan,   8.,   1.,   6.,   0.,   3.,   5.,   4.,  12.,   9.,  10.,
        -4.,  18.,   7.,  15.,   2.,  11.,  13.,  -1.,  29.,  30.,  27.,
        28.,  16.,  26.,  24.,  25.,  -5.,  41.,  14.,  22.,  19.,  17.,
        21.,  32.,  20.,  -2.,  35.,  -3.,  51.,  66.,  38.,  46.,  23.,
        31.,  37.,  68.,  -6.,  33.,  52.,  75.,  34.,  71.,  44.,  61.,
        60.,  58.,  48.,  50.,  53.,  39.,  62.,  47.,  -7.,  42.,  40.,
        36.,  49.,  70.,  45.,  65.,  43.,  74., -10.,  -9.])
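
On a completed pass, I would expect air_yards and yards_after_catch to add up to yards_gained. That is my assumption rather than a documented rule, so here is a sketch of how to test it, using made-up rows in a toy frame:

```python
import numpy as np
import pandas as pd

# Toy rows: two completions and one incompletion (values are illustrative)
toy = pd.DataFrame({
    "complete_pass": [1, 0, 1],
    "air_yards": [3.0, 16.0, -4.0],
    "yards_after_catch": [1.0, np.nan, 9.0],
    "yards_gained": [4.0, 0.0, 5.0],
})

completions = toy.query("complete_pass == 1")
reconstructed = completions.air_yards + completions.yards_after_catch

# True only if every completion satisfies air_yards + YAC == yards_gained
all_match = bool((reconstructed == completions.yards_gained).all())
print(all_match)
```

If the check fails on pbp_2022, the mismatched rows would be a good place to look for laterals, fumbles or other edge cases.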

run_location is either nan, 'left', 'right', or 'middle'.

pbp_2022.run_location.unique()
array([nan, 'left', 'right', 'middle'], dtype=object)

run_gap represents which offensive line gap the runner ran through. It is either nan, 'end', 'tackle' or 'guard'. I’ll have to dig a bit deeper (look at some video corresponding to the run plays) to understand if 'guard' represents the A (gap between center and guard) or B gap (gap between guard and tackle), if 'tackle' represents the B or C gap (gap between tackle and end), and if 'end' represents the C or D (gap outside the end) gap.

pbp_2022.run_gap.unique()
array([nan, 'end', 'tackle', 'guard'], dtype=object)

field_goal_result is either nan, 'made', 'missed', or 'blocked'.

pbp_2022.field_goal_result.unique()
array([nan, 'made', 'missed', 'blocked'], dtype=object)

kick_distance is the distance of the kick in yards for the following play_type values: 'punt', 'field_goal', 'extra_point', and 'kickoff'. Looking through the data, not all 'kickoff's have a kick_distance value.

pbp_2022.kick_distance.unique(), pbp_2022.query('kick_distance.notnull()').play_type.unique()
(array([nan, 45., 40., 48., 24., 50., 56., 41., 33., 20., 49., 43.,  7.,
        36., 57., 25., 39., 60., 62., 61., 44., 46., 58., 26., 34., 64.,
        30., 47., 54., 28., 53., 38., 29., 70., 37., 27., 52., 42., 63.,
        51., 23., 55., 59., 69., 66., 14., 32., 35.,  0., 31., 67., 74.,
        19., 10., 22., 12.,  8.,  5., -1., 73., 65.,  3., 21.,  9., 16.,
        15., 13., 18., 17.,  6., 77., 68., 11., 71., 79.]),
 array(['punt', 'field_goal', 'extra_point', 'kickoff'], dtype=object))
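
To quantify how many kickoffs lack a kick_distance, the share of missing values can be computed per play type. A sketch on toy data (the rows are made up for illustration):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "play_type": ["kickoff", "kickoff", "punt", "field_goal", "run"],
    "kick_distance": [np.nan, 65.0, 45.0, 24.0, np.nan],
})

kicking_types = ["punt", "field_goal", "extra_point", "kickoff"]
kicks = toy[toy.play_type.isin(kicking_types)]

# Mean of a boolean Series = fraction of rows where kick_distance is missing
missing_share = kicks.kick_distance.isna().groupby(kicks.play_type).mean()
print(missing_share)
```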

extra_point_result is either nan, 'good', 'failed' or 'blocked'.

pbp_2022.extra_point_result.unique()
array([nan, 'good', 'failed', 'blocked'], dtype=object)

two_point_conv_result, the result of a two-point conversion, is either nan, 'failure' or 'success'.

pbp_2022.two_point_conv_result.unique()
array([nan, 'failure', 'success'], dtype=object)

home_timeouts_remaining is the number of timeouts the home team has left. It is either 3, 2, 1, or 0.

pbp_2022.home_timeouts_remaining.unique()
array([3, 2, 1, 0])

away_timeouts_remaining is the number of timeouts the away team has left. It is either 3, 2, 1, or 0.

pbp_2022.away_timeouts_remaining.unique()
array([3, 2, 1, 0])

timeout indicates if a team calls a timeout. It is either nan, 1 (True) or 0 (False).

pbp_2022.timeout.unique(), pbp_2022.query('timeout == 1').desc[13]
(array([nan,  0.,  1.]), 'Timeout #1 by BAL at 09:56.')

timeout_team indicates which team called the timeout, and has 33 unique values—1 nan and 32 team abbreviations.

(pbp_2022.timeout_team.unique(), 
pbp_2022.query('timeout == 1').desc[13], 
pbp_2022.query('timeout == 1').timeout_team[13])
(array([nan, 'BAL', 'NYJ', 'LA', 'BUF', 'CLE', 'CAR', 'DEN', 'SEA', 'GB',
        'MIN', 'IND', 'HOU', 'WAS', 'JAX', 'KC', 'ARI', 'LAC', 'LV', 'NE',
        'MIA', 'ATL', 'NO', 'NYG', 'TEN', 'PHI', 'DET', 'PIT', 'CIN', 'SF',
        'CHI', 'DAL', 'TB'], dtype=object),
 'Timeout #1 by BAL at 09:56.',
 'BAL')

td_team indicates which team scored the touchdown. It is nan or one of 32 team abbreviations.

(pbp_2022.td_team.unique(),
pbp_2022.query('td_team.notnull()').td_team[68],
pbp_2022.query('td_team.notnull()').desc[68])
(array([nan, 'BAL', 'NYJ', 'BUF', 'LA', 'CLE', 'CAR', 'SEA', 'DEN', 'MIN',
        'GB', 'HOU', 'IND', 'WAS', 'JAX', 'KC', 'ARI', 'LAC', 'LV', 'MIA',
        'NE', 'NO', 'ATL', 'TEN', 'NYG', 'DET', 'PHI', 'PIT', 'CIN', 'SF',
        'CHI', 'TB', 'DAL'], dtype=object),
 'BAL',
 '(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')

td_player_name indicates which player scored the touchdown. It is nan or one of 416 players who scored a touchdown in the 2022 season.

(pbp_2022.td_player_name.unique()[:5],
len(pbp_2022.td_player_name.unique()),
pbp_2022.query('td_team.notnull()').td_player_name[68],
pbp_2022.query('td_team.notnull()').desc[68])
(array([nan, 'D.Duvernay', 'R.Bateman', 'T.Conklin', 'G.Davis'],
       dtype=object),
 417,
 'D.Duvernay',
 '(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')

td_player_id indicates the id of the player who scored the touchdown. There are 422 unique player IDs. Later on, I’ll look into why there are 6 more player IDs (422) than player names (416); one possible cause is different players sharing the same abbreviated name.

(pbp_2022.td_player_id.unique()[:5],
len(pbp_2022.td_player_id.unique()),
pbp_2022.query('td_team.notnull()').td_player_name[68],
 pbp_2022.query('td_team.notnull()').td_player_id[68],
pbp_2022.query('td_team.notnull()').desc[68])
(array([nan, '00-0036331', '00-0036550', '00-0034270', '00-0036196'],
       dtype=object),
 423,
 'D.Duvernay',
 '00-0036331',
 '(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
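
One way to investigate the mismatch between name and ID counts is to count distinct IDs per name. The sketch below uses a toy frame; the "J.Smith" rows and their IDs are invented to illustrate two players sharing an abbreviated name.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "td_player_name": ["D.Duvernay", "D.Duvernay", "J.Smith", "J.Smith", np.nan],
    "td_player_id": ["00-0036331", "00-0036331", "00-0000001", "00-0000002", np.nan],
})

# Number of distinct player IDs behind each abbreviated name
ids_per_name = (
    toy.dropna(subset=["td_player_name"])
       .groupby("td_player_name")
       .td_player_id
       .nunique()
)

# Names that map to more than one ID
shared_names = ids_per_name[ids_per_name > 1]
print(shared_names)
```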

posteam_timeouts_remaining is the number of timeouts remaining for the team with ball possession. It can be nan, 3, 2, 1, or 0.

pbp_2022.posteam_timeouts_remaining.unique()
array([nan,  3.,  2.,  0.,  1.])

defteam_timeouts_remaining is the number of timeouts remaining for the team on defense. It can be nan, 3, 2, 1, or 0.

pbp_2022.defteam_timeouts_remaining.unique()
array([nan,  3.,  2.,  1.,  0.])

total_home_score is the total number of points scored by the home team.

pbp_2022.total_home_score.unique()[:5]
array([0, 3, 9, 6, 7])

total_away_score is the total number of points scored by the away team.

pbp_2022.total_away_score.unique()[:5]
array([ 0,  3,  9, 10, 16])

posteam_score is the total number of points scored by the team with ball possession on the current play.

pbp_2022.posteam_score.unique()[:5]
array([nan,  0.,  3.,  9., 10.])

defteam_score is the total number of points scored by the team on defense on the current play.

pbp_2022.defteam_score.unique()[:5]
array([nan,  0.,  3., 10., 17.])

score_differential is the difference between posteam_score and defteam_score.

pbp_2022.score_differential.unique()[:5]
array([nan,  0., -3.,  3.,  9.])
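
The direction of the difference matters when filtering for teams that are trailing or leading. This sketch assumes the convention is posteam_score minus defteam_score (so a negative value means the offense is behind). The toy values are illustrative, and the assumption should be checked against the real columns.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "posteam_score": [np.nan, 0.0, 3.0, 0.0],
    "defteam_score": [np.nan, 0.0, 0.0, 3.0],
})

# Assumed convention: positive when the team with the ball is ahead
toy["score_differential"] = toy.posteam_score - toy.defteam_score
print(toy)
```

On pbp_2022, comparing this recomputed column to the existing score_differential would confirm or refute the assumed sign.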

punt_blocked indicates if the punt was blocked. It is either nan, 1 (True) or 0 (False).

pbp_2022.punt_blocked.unique(),pbp_2022.query('punt_blocked == 1').desc[3236]
(array([nan,  0.,  1.]),
 '(5:06) 11-R.Dixon punt is BLOCKED by 44-T.Andersen, Center-42-M.Orzech, RECOVERED by ATL-9-L.Carter at LA 26. 9-L.Carter for 26 yards, TOUCHDOWN.')

first_down_rush indicates whether a first down was achieved by a rushing play. It is either nan, 1 (True) or 0 (False).

(pbp_2022.first_down_rush.unique(), 
 pbp_2022.query('first_down_rush == 1').desc[2],
 pbp_2022.query('first_down_rush == 1').play_type[2])
(array([nan,  0.,  1.]),
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).',
 'run')

first_down_pass indicates whether a first down was achieved by a passing play. It is either nan, 1 (True) or 0 (False).

(pbp_2022.first_down_pass.unique(), 
 pbp_2022.query('first_down_pass == 1').desc[26],
 pbp_2022.query('first_down_pass == 1').play_type[26])
(array([nan,  0.,  1.]),
 '(6:01) 19-J.Flacco pass deep left to 8-E.Moore to NYJ 41 for 24 yards (32-M.Williams).',
 'pass')

first_down_penalty indicates whether a first down was achieved by a penalty. It is either nan, 1 (True) or 0 (False).

(pbp_2022.first_down_penalty.unique(), 
 pbp_2022.query('first_down_penalty == 1').desc[17],
 pbp_2022.query('first_down_penalty == 1').play_type[17])
(array([nan,  0.,  1.]),
 '(8:31) (Shotgun) 19-J.Flacco pass incomplete deep left to 8-E.Moore. PENALTY on BAL-44-M.Humphrey, Illegal Contact, 5 yards, enforced at NYJ 12 - No Play.',
 'no_play')

third_down_converted indicates if the team with ball possession on third down got a first down on the play. It is either nan, 1 (True) or 0 (False).

(pbp_2022.third_down_converted.unique(), 
 pbp_2022.query('third_down_converted == 1').down[9],
 pbp_2022.query('third_down_converted == 1').ydstogo[9],
 pbp_2022.query('third_down_converted == 1').desc[9],
pbp_2022.query('third_down_converted == 1').yards_gained[9])
(array([nan,  0.,  1.]),
 3.0,
 2,
 '(12:41) (Shotgun) 8-L.Jackson right tackle to BAL 40 for 4 yards (57-C.Mosley, 3-J.Whitehead).',
 4.0)

third_down_failed indicates if the team with ball possession on third down did not get a first down on the play. It is either nan, 1 (True) or 0 (False).

(pbp_2022.third_down_failed.unique(), 
 pbp_2022.query('third_down_failed == 1').down[5],
 pbp_2022.query('third_down_failed == 1').ydstogo[5],
 pbp_2022.query('third_down_failed == 1').desc[5],
pbp_2022.query('third_down_failed == 1').yards_gained[5])
(array([nan,  0.,  1.]),
 3.0,
 5,
 '(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.',
 0.0)
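
With both third-down flags available, a conversion rate per team is a short aggregation. This is a sketch on a toy frame (made-up rows), assuming each third-down play is flagged in exactly one of the two columns:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "posteam": ["NYJ", "NYJ", "BAL", "BAL", "BAL", "NYJ"],
    "third_down_converted": [0, 1, 1, 1, 0, np.nan],
    "third_down_failed": [1, 0, 0, 0, 1, np.nan],
})

# sum() skips nan, so rows without a flag do not affect the totals
totals = toy.groupby("posteam")[["third_down_converted", "third_down_failed"]].sum()

third_down_rate = totals.third_down_converted / (
    totals.third_down_converted + totals.third_down_failed
)
print(third_down_rate)
```

The same pattern should work for fourth_down_converted and fourth_down_failed.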

fourth_down_converted indicates if the team with ball possession on fourth down got a first down on the play. It is either nan, 1 (True) or 0 (False).

(pbp_2022.fourth_down_converted.unique(), 
 pbp_2022.query('fourth_down_converted == 1').down[145],
 pbp_2022.query('fourth_down_converted == 1').ydstogo[145],
 pbp_2022.query('fourth_down_converted == 1').desc[145],
pbp_2022.query('fourth_down_converted == 1').yards_gained[145])
(array([nan,  0.,  1.]),
 4.0,
 1,
 '(7:32) 19-J.Flacco pass short right to 84-C.Davis to BAL 21 for 7 yards (23-K.Fuller).',
 7.0)

fourth_down_failed indicates if the team with ball possession on fourth down did not get a first down on the play. It is either nan, 1 (True) or 0 (False).

(pbp_2022.fourth_down_failed.unique(), 
 pbp_2022.query('fourth_down_failed == 1').down[154],
 pbp_2022.query('fourth_down_failed == 1').ydstogo[154],
 pbp_2022.query('fourth_down_failed == 1').desc[154],
pbp_2022.query('fourth_down_failed == 1').yards_gained[154])
(array([nan,  0.,  1.]),
 4.0,
 6,
 '(4:22) (Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.',
 0.0)

incomplete_pass indicates if the pass was incomplete. It is either nan, 1 (True) or 0 (False).

(pbp_2022.incomplete_pass.unique(),
 pbp_2022.query('incomplete_pass == 1').desc[3])
(array([nan,  0.,  1.]),
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')

touchback indicates if the kickoff or punt resulted in a touchback, for example by going past the back of the end zone or being downed in the end zone. It is either 1 (True) or 0 (False).

(pbp_2022.touchback.unique(),
 pbp_2022.query('touchback == 1').desc[33])
(array([0, 1]),
 '9-J.Tucker kicks 65 yards from BAL 35 to end zone, Touchback.')

interception indicates if the quarterback’s pass was intercepted by a defender. It is either nan, 1 (True), or 0 (False).

(pbp_2022.interception.unique(),
 pbp_2022.query('interception == 1').desc[28])
(array([nan,  0.,  1.]),
 '(5:07) (Shotgun) 19-J.Flacco pass short middle intended for 81-L.Cager INTERCEPTED by 32-M.Williams at NYJ 46. 32-M.Williams to NYJ 13 for 33 yards (19-J.Flacco).')

fumble_forced indicates if a fumble was forced on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.fumble_forced.unique(),
 pbp_2022.query('fumble_forced == 1').desc[80])
(array([nan,  0.,  1.]),
 '(1:16) (Shotgun) 19-J.Flacco pass short right to 83-T.Conklin to BAL 21 for 6 yards (32-M.Williams, 58-M.Pierce). FUMBLES (58-M.Pierce), touched at BAL 25, recovered by NYJ-17-G.Wilson at BAL 27. 17-G.Wilson to BAL 27 for no gain (14-K.Hamilton).')

fumble_not_forced indicates if a fumble occurred on the play but was not forced by another player. It is either nan, 1 (True), or 0 (False).

(pbp_2022.fumble_not_forced.unique(),
 pbp_2022.query('fumble_not_forced == 1').desc[264])
(array([nan,  0.,  1.]),
 '(13:46) (Shotgun) 9-M.Stafford to LA 11 for -6 yards. FUMBLES, and recovers at LA 11. 9-M.Stafford sacked at LA 10 for -7 yards (50-G.Rousseau).')

fumble_out_of_bounds indicates if a fumbled ball went out of bounds. It is either nan, 1 (True), or 0 (False).

(pbp_2022.fumble_out_of_bounds.unique(),
 pbp_2022.query('fumble_out_of_bounds == 1').desc[1160])
(array([nan,  0.,  1.]),
 '(:32) (Shotgun) 16-T.Lawrence pass short right to 1-T.Etienne to WAS 11 for 3 yards (22-D.Forrest). FUMBLES (22-D.Forrest), ball out of bounds at WAS 19. The Replay Official reviewed the pass completion ruling, and the play was Upheld. The ruling on the field stands.')

solo_tackle indicates if a player made a solo tackle on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.solo_tackle.unique(),
 pbp_2022.query('solo_tackle == 1').desc[1])
(array([nan,  1.,  0.]),
 '9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')

safety indicates if a defensive player scored a safety on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.safety.unique(),
 pbp_2022.query('safety == 1').desc[3255])
(array([nan,  0.,  1.]),
 '(:13) (Run formation) 19-B.Powell right end ran ob in End Zone for -26 yards, SAFETY (37-D.Alford).')

penalty indicates if there was a penalty on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.penalty.unique(),
 pbp_2022.query('penalty == 1').desc[5])
(array([nan,  0.,  1.]),
 '(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')

tackled_for_loss indicates if a player was tackled for a loss of yards. It is either nan, 1 (True), or 0 (False).

(pbp_2022.tackled_for_loss.unique(),
 pbp_2022.query('tackled_for_loss == 1').desc[15])
(array([nan,  0.,  1.]),
 '(9:49) 20-Br.Hall right end to NYJ 9 for -2 yards (92-J.Madubuike).')

fumble_lost indicates if a player lost a fumble to the other team. It is either nan, 1 (True), or 0 (False).

(pbp_2022.fumble_lost.unique(),
 pbp_2022.query('fumble_lost == 1').desc[129])
(array([nan,  0.,  1.]),
 '(14:13) (No Huddle, Shotgun) 19-J.Flacco pass short middle to 20-Br.Hall to BAL 16 for 6 yards (36-C.Clark). FUMBLES (36-C.Clark), RECOVERED by BAL-44-M.Humphrey at BAL 15.')

qb_hit indicates if the quarterback was hit on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.qb_hit.unique(),
 pbp_2022.query('qb_hit == 1').desc[5])
(array([nan,  0.,  1.]),
 '(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')

rush_attempt indicates if the play was a rushing play. It is either nan, 1 (True), or 0 (False). A QB scramble is considered a rush attempt.

(pbp_2022.rush_attempt.unique(),
 pbp_2022.query('rush_attempt == 1').desc[9],
 pbp_2022.query('rush_attempt == 1 and qb_scramble == 1').desc[89])
(array([nan,  0.,  1.]),
 '(12:41) (Shotgun) 8-L.Jackson right tackle to BAL 40 for 4 yards (57-C.Mosley, 3-J.Whitehead).',
 '(14:15) (Shotgun) 8-L.Jackson scrambles left end ran ob at BAL 35 for 8 yards (3-J.Whitehead).')
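
Since scrambles are counted as rush attempts, separating them out is a matter of combining the two flags. This is a sketch on toy rows (illustrative values). Note that the non-scramble group still mixes designed quarterback runs with ordinary carries by other players.

```python
import pandas as pd

toy = pd.DataFrame({
    "rush_attempt": [1, 1, 0, 1],
    "qb_scramble": [0, 1, 0, 0],
    "yards_gained": [4, 8, 0, -2],
})

rushes = toy.query("rush_attempt == 1")
scrambles = rushes.query("qb_scramble == 1")
non_scramble_rushes = rushes.query("qb_scramble == 0")

print(len(scrambles), len(non_scramble_rushes))
```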

pass_attempt indicates if the play was a passing play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.pass_attempt.unique(),
 pbp_2022.query('pass_attempt == 1').desc[3])
(array([nan,  0.,  1.]),
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')

sack indicates if the quarterback was sacked on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.sack.unique(),
 pbp_2022.query('sack == 1').desc[54])
(array([nan,  0.,  1.]),
 '(9:43) (Shotgun) 8-L.Jackson sacked ob at NYJ 49 for 0 yards (56-Qu.Williams).')

touchdown indicates if a player scored a touchdown on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.touchdown.unique(),
 pbp_2022.query('touchdown == 1').desc[68])
(array([nan,  0.,  1.]),
 '(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')

pass_touchdown, rush_touchdown, and return_touchdown indicate if the touchdown was a result of a pass, rush or kickoff/punt/fumble/interception return play, respectively. Their value is either nan, 1 (True), or 0 (False).

(pbp_2022.pass_touchdown.unique(),
 pbp_2022.query('pass_touchdown == 1').desc[68])
(array([nan,  0.,  1.]),
 '(3:51) (Shotgun) 8-L.Jackson pass deep right to 13-D.Duvernay for 25 yards, TOUCHDOWN.')
(pbp_2022.rush_touchdown.unique(),
 pbp_2022.query('rush_touchdown == 1').desc[298])
(array([nan,  0.,  1.]),
 '(13:34) (Shotgun) 17-J.Allen scrambles right end for 4 yards, TOUCHDOWN.')
(pbp_2022.return_touchdown.unique(),
 pbp_2022.query('return_touchdown == 1').desc[1651],
 pbp_2022.query('return_touchdown == 1').desc[2197],
 pbp_2022.query('return_touchdown == 1').desc[47094])
(array([nan,  0.,  1.]),
 '(7:40) (Shotgun) 10-M.Jones sacked at NE 6 for -9 yards (29-Br.Jones). FUMBLES (29-Br.Jones) [29-Br.Jones], RECOVERED by MIA-6-M.Ingram at NE 2. 6-M.Ingram for 2 yards, TOUCHDOWN.',
 '(6:36) (Shotgun) 16-J.Goff pass short left intended for 88-T.Hockenson INTERCEPTED by 24-J.Bradberry (43-K.White) [95-M.Tuipulotu] at DET 27. 24-J.Bradberry for 27 yards, TOUCHDOWN.',
 '6-N.Folk kicks 66 yards from NE 35 to BUF -1. 20-N.Hines for 101 yards, TOUCHDOWN.')

The following fields indicate if the play involved an attempt at an Extra Point, Two Point Conversion, Field Goal, Kickoff, or Punt, respectively:

  • extra_point_attempt
  • two_point_attempt
  • field_goal_attempt
  • kickoff_attempt
  • punt_attempt

Their value is either nan, 1 (True), or 0 (False).

(pbp_2022.extra_point_attempt.unique(),
 pbp_2022.query('extra_point_attempt == 1').desc[69])
(array([nan,  0.,  1.]),
 '9-J.Tucker extra point is GOOD, Center-46-N.Moore, Holder-11-J.Stout.')
(pbp_2022.two_point_attempt.unique(),
 pbp_2022.query('two_point_attempt == 1').desc[1179])
(array([nan,  0.,  1.]),
 'TWO-POINT CONVERSION ATTEMPT. 16-T.Lawrence pass to 17-E.Engram is incomplete. ATTEMPT FAILS.')
(pbp_2022.field_goal_attempt.unique(),
 pbp_2022.query('field_goal_attempt == 1').desc[32])
(array([nan,  0.,  1.]),
 '(3:19) 9-J.Tucker 24 yard field goal is GOOD, Center-46-N.Moore, Holder-11-J.Stout.')
(pbp_2022.kickoff_attempt.unique(),
 pbp_2022.query('kickoff_attempt == 1').desc[1])
(array([nan,  1.,  0.]),
 '9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
(pbp_2022.punt_attempt.unique(),
 pbp_2022.query('punt_attempt == 1').desc[6])
(array([nan,  0.,  1.]),
 '(13:53) 7-B.Mann punts 45 yards to BAL 19, Center-42-T.Hennessy. 13-D.Duvernay pushed ob at BAL 28 for 9 yards (42-T.Hennessy).')

fumble indicates if a player fumbled the ball on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.fumble.unique(),
 pbp_2022.query('fumble == 1').desc[80])
(array([nan,  0.,  1.]),
 '(1:16) (Shotgun) 19-J.Flacco pass short right to 83-T.Conklin to BAL 21 for 6 yards (32-M.Williams, 58-M.Pierce). FUMBLES (58-M.Pierce), touched at BAL 25, recovered by NYJ-17-G.Wilson at BAL 27. 17-G.Wilson to BAL 27 for no gain (14-K.Hamilton).')

complete_pass indicates if a player completed a pass on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.complete_pass.unique(),
 pbp_2022.query('complete_pass == 1').desc[7])
(array([nan,  0.,  1.]),
 '(13:42) 8-L.Jackson pass short right to 7-R.Bateman pushed ob at BAL 32 for 4 yards (3-J.Whitehead).')

assist_tackle indicates if a player assisted on the tackle on the play. It is either nan, 1 (True), or 0 (False).

(pbp_2022.assist_tackle.unique(),
 pbp_2022.query('assist_tackle == 1').desc[2])
(array([nan,  0.,  1.]),
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')

The following fields provide the player_id (string), player_name (string) and yards gained (numeric; nan when no yardage is recorded, as on the incomplete pass below) for the passer, receiver or rusher on the play, respectively.

  • passer_player_id
  • passer_player_name
  • passing_yards
  • receiver_player_id
  • receiver_player_name
  • receiving_yards
  • rusher_player_id
  • rusher_player_name
  • rushing_yards
(pbp_2022.passer_player_id[3],
 pbp_2022.passer_player_name[3],
 pbp_2022.passing_yards[3],
 pbp_2022.desc[3])
('00-0026158',
 'J.Flacco',
 nan,
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
(pbp_2022.receiver_player_id[3],
 pbp_2022.receiver_player_name[3],
 pbp_2022.receiving_yards[3],
 pbp_2022.desc[3])
('00-0036924',
 'Mi.Carter',
 nan,
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
(pbp_2022.rusher_player_id[2],
 pbp_2022.rusher_player_name[2],
 pbp_2022.rushing_yards[2],
 pbp_2022.desc[2])
('00-0036924',
 'Mi.Carter',
 19.0,
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
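These id, name and yards fields lend themselves to per-player totals. A rough sketch of how I might total rushing yards by rusher, using an invented frame (the ids, names and yardages in `toy` are hypothetical, not real players):

```python
import numpy as np
import pandas as pd

# Hypothetical plays; the second row mimics a non-rushing play (all missing).
toy = pd.DataFrame({
    "rusher_player_id":   ["R-1", None, "R-1", "R-2"],
    "rusher_player_name": ["A.Back", None, "A.Back", "B.Runner"],
    "rushing_yards":      [19.0, np.nan, -2.0, 4.0],
})

rushing_totals = (
    toy.dropna(subset=["rusher_player_id"])          # keep rushing plays only
       .groupby(["rusher_player_id", "rusher_player_name"], as_index=False)["rushing_yards"]
       .sum()                                        # total yards per rusher
       .sort_values("rushing_yards", ascending=False)
)
print(rushing_totals)
```

Grouping on the id as well as the name is a deliberate choice: abbreviated names like these could plausibly collide between players, while ids should not.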

The following fields provide the player_id (string) and player_name (string) for players who intercepted the ball, returned a punt, returned a kickoff, punted the ball, kicked off the ball, recovered their own kickoff, or blocked the kick, respectively:

  • interception_player_id
  • interception_player_name
  • punt_returner_player_id
  • punt_returner_player_name
  • kickoff_returner_player_name
  • kickoff_returner_player_id
  • punter_player_id
  • punter_player_name
  • kicker_player_name
  • kicker_player_id
  • own_kickoff_recovery_player_id
  • own_kickoff_recovery_player_name
  • blocked_player_id
  • blocked_player_name
(pbp_2022.interception_player_id[28],
 pbp_2022.interception_player_name[28],
 pbp_2022.desc[28])
('00-0033894',
 'M.Williams',
 '(5:07) (Shotgun) 19-J.Flacco pass short middle intended for 81-L.Cager INTERCEPTED by 32-M.Williams at NYJ 46. 32-M.Williams to NYJ 13 for 33 yards (19-J.Flacco).')
(pbp_2022.punt_returner_player_id[6],
 pbp_2022.punt_returner_player_name[6],
 pbp_2022.desc[6])
('00-0036331',
 'D.Duvernay',
 '(13:53) 7-B.Mann punts 45 yards to BAL 19, Center-42-T.Hennessy. 13-D.Duvernay pushed ob at BAL 28 for 9 yards (42-T.Hennessy).')
(pbp_2022.kickoff_returner_player_id[1],
 pbp_2022.kickoff_returner_player_name[1],
 pbp_2022.desc[1])
('00-0034419',
 'B.Berrios',
 '9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
(pbp_2022.punter_player_id[6],
 pbp_2022.punter_player_name[6],
 pbp_2022.desc[6])
('00-0036313',
 'B.Mann',
 '(13:53) 7-B.Mann punts 45 yards to BAL 19, Center-42-T.Hennessy. 13-D.Duvernay pushed ob at BAL 28 for 9 yards (42-T.Hennessy).')
(pbp_2022.kicker_player_id[1],
 pbp_2022.kicker_player_name[1],
 pbp_2022.desc[1])
('00-0029597',
 'J.Tucker',
 '9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
(pbp_2022.own_kickoff_recovery_player_id[4964],
 pbp_2022.own_kickoff_recovery_player_name[4964],
 pbp_2022.desc[4964])
('00-0033770',
 'J.Hardee',
 '7-B.Mann kicks onside 12 yards from NYJ 35 to NYJ 47. RECOVERED by NYJ-34-J.Hardee.')
(pbp_2022.blocked_player_id[1947],
 pbp_2022.blocked_player_name[1947],
 pbp_2022.desc[1947])
('00-0036926',
 'P.Turner',
 '(:02) 7-Y.Koo 63 yard field goal is BLOCKED (98-P.Turner), Center-48-L.McCullough, Holder-13-B.Pinion, recovered by ATL-13-B.Pinion at ATL 49. 13-B.Pinion to 50 for 1 yard (53-Z.Baun, 48-J.Gray).')

The following fields show player_id (string), player_name (string) or team (string) for a variety of defensive plays such as tackle for loss, quarterback hit, solo tackle, assist tackle and so on.

  • tackle_for_loss_1_player_id
  • tackle_for_loss_1_player_name
  • tackle_for_loss_2_player_id
  • tackle_for_loss_2_player_name
  • qb_hit_1_player_id
  • qb_hit_1_player_name
  • qb_hit_2_player_id
  • qb_hit_2_player_name
  • solo_tackle_1_team
  • solo_tackle_2_team
  • solo_tackle_1_player_id
  • solo_tackle_2_player_id
  • solo_tackle_1_player_name
  • solo_tackle_2_player_name
  • assist_tackle_1_player_id
  • assist_tackle_1_player_name
  • assist_tackle_1_team
  • assist_tackle_2_player_id
  • assist_tackle_2_player_name
  • assist_tackle_2_team
  • assist_tackle_3_player_id
  • assist_tackle_3_player_name
  • assist_tackle_3_team
  • assist_tackle_4_player_id
  • assist_tackle_4_player_name
  • assist_tackle_4_team
  • tackle_with_assist
  • tackle_with_assist_1_player_id
  • tackle_with_assist_1_player_name
  • tackle_with_assist_1_team
  • tackle_with_assist_2_player_id
  • tackle_with_assist_2_player_name
  • tackle_with_assist_2_team
  • pass_defense_1_player_id
  • pass_defense_1_player_name
  • pass_defense_2_player_id
  • pass_defense_2_player_name
  • sack_player_id
  • sack_player_name
  • half_sack_1_player_id
  • half_sack_1_player_name
  • half_sack_2_player_id
  • half_sack_2_player_name
(pbp_2022.tackled_for_loss[15],
 pbp_2022.tackle_for_loss_1_player_id[15],
 pbp_2022.tackle_for_loss_1_player_name[15],
 pbp_2022.desc[15])
(1.0,
 '00-0036130',
 'J.Madubuike',
 '(9:49) 20-Br.Hall right end to NYJ 9 for -2 yards (92-J.Madubuike).')

There are no plays where tackle_for_loss_2_player_id has a value.

pbp_2022.tackle_for_loss_2_player_id.unique()
array([nan])
(pbp_2022.qb_hit[5],
 pbp_2022.qb_hit_1_player_id[5],
 pbp_2022.qb_hit_1_player_name[5],
 pbp_2022.desc[5])
(1.0,
 '00-0026190',
 'C.Campbell',
 '(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')
(pbp_2022.qb_hit[55],
 pbp_2022.qb_hit_1_player_id[55],
 pbp_2022.qb_hit_1_player_name[55],
 pbp_2022.qb_hit_2_player_id[55],
 pbp_2022.qb_hit_2_player_name[55],
 pbp_2022.desc[55])
(1.0,
 '00-0034163',
 'J.Johnson',
 '00-0034163',
 'J.Martin',
 '(8:59) (Shotgun) 8-L.Jackson sacked at BAL 49 for -2 yards (sack split by 52-J.Johnson and 54-J.Martin).')
(pbp_2022.solo_tackle[777],
 pbp_2022.solo_tackle_1_team[777],
 pbp_2022.solo_tackle_1_player_id[777],
 pbp_2022.solo_tackle_1_player_name[777],
 pbp_2022.solo_tackle_2_team[777],
 pbp_2022.solo_tackle_2_player_id[777],
 pbp_2022.solo_tackle_2_player_name[777],
 pbp_2022.desc[777])
(1.0,
 'MIN',
 '00-0032129',
 'J.Hicks',
 'GB',
 '00-0036631',
 'R.Newman',
 '(12:21) 12-A.Rodgers sacked at GB 35 for -9 yards (58-J.Hicks). FUMBLES (58-J.Hicks) [58-J.Hicks], RECOVERED by MIN-94-D.Tomlinson at GB 33. 94-D.Tomlinson to GB 33 for no gain (70-R.Newman).')
(pbp_2022.assist_tackle[2],
 pbp_2022.assist_tackle_1_team[2],
 pbp_2022.assist_tackle_1_player_id[2],
 pbp_2022.assist_tackle_1_player_name[2],
 pbp_2022.assist_tackle_2_team[2],
 pbp_2022.assist_tackle_2_player_id[2],
 pbp_2022.assist_tackle_2_player_name[2],
 pbp_2022.desc[2])
(1.0,
 'BAL',
 '00-0033894',
 'M.Williams',
 'BAL',
 '00-0033294',
 'C.Clark',
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')

There are no plays where assist_tackle_3_player_id or assist_tackle_4_player_id has a value.

pbp_2022.assist_tackle_3_player_id.unique(), pbp_2022.assist_tackle_4_player_id.unique()
(array([nan]), array([nan]))
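Rather than checking columns one at a time with unique(), all entirely empty columns can be listed in one pass. A small sketch on a hypothetical three-column frame (the ids are invented):

```python
import numpy as np
import pandas as pd

# Hypothetical slice of the play-by-play columns.
toy = pd.DataFrame({
    "assist_tackle_1_player_id": ["D-1", np.nan, "D-2"],
    "assist_tackle_3_player_id": [np.nan, np.nan, np.nan],
    "assist_tackle_4_player_id": [np.nan, np.nan, np.nan],
})

# True for each column in which every value is missing.
all_nan = toy.isna().all()
empty_cols = all_nan[all_nan].index.tolist()
print(empty_cols)
```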

tackle_with_assist is not the same as assist_tackle.

(pbp_2022.tackle_with_assist[2],
 pbp_2022.tackle_with_assist_1_team[2],
 pbp_2022.tackle_with_assist_1_player_id[2],
 pbp_2022.tackle_with_assist_1_player_name[2],
 pbp_2022.tackle_with_assist_2_team[2],
 pbp_2022.tackle_with_assist_2_player_id[2],
 pbp_2022.tackle_with_assist_2_player_name[2],
 pbp_2022.desc[2])
(0.0,
 nan,
 nan,
 nan,
 nan,
 nan,
 nan,
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
(pbp_2022.tackle_with_assist[22659],
 pbp_2022.tackle_with_assist_1_team[22659],
 pbp_2022.tackle_with_assist_1_player_id[22659],
 pbp_2022.tackle_with_assist_1_player_name[22659],
 pbp_2022.tackle_with_assist_2_team[22659],
 pbp_2022.tackle_with_assist_2_player_id[22659],
 pbp_2022.tackle_with_assist_2_player_name[22659],
 pbp_2022.desc[22659])
(1.0,
 'LAC',
 '00-0031040',
 'K.Mack',
 'ATL',
 '00-0035208',
 'O.Zaccheaus',
 '(9:31) (No Huddle, Shotgun) 1-M.Mariota pass short left to 5-D.London to LAC 6 for 5 yards (52-K.Mack, 43-M.Davis). FUMBLES (52-K.Mack), RECOVERED by LAC-52-K.Mack at LAC 6. 52-K.Mack pushed ob at 50 for 44 yards (17-O.Zaccheaus, 5-D.London).')

I’ll explore this more before using these fields in analyses, but it seems like the assist_tackle fields identify the players who assisted on the tackle, while the tackle_with_assist fields identify the “main” tackler who was assisted.

(pbp_2022.assist_tackle[22659],
 pbp_2022.assist_tackle_1_team[22659],
 pbp_2022.assist_tackle_1_player_id[22659],
 pbp_2022.assist_tackle_1_player_name[22659],
 pbp_2022.assist_tackle_2_team[22659],
 pbp_2022.assist_tackle_2_player_id[22659],
 pbp_2022.assist_tackle_2_player_name[22659],
 pbp_2022.desc[22659])
(1.0,
 'LAC',
 '00-0033697',
 'M.Davis',
 'ATL',
 '00-0037238',
 'D.London',
 '(9:31) (No Huddle, Shotgun) 1-M.Mariota pass short left to 5-D.London to LAC 6 for 5 yards (52-K.Mack, 43-M.Davis). FUMBLES (52-K.Mack), RECOVERED by LAC-52-K.Mack at LAC 6. 52-K.Mack pushed ob at 50 for 44 yards (17-O.Zaccheaus, 5-D.London).')
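To compare the two sets of fields side by side, the name columns can be reshaped into one row per player with a role label. This sketch uses only the four names shown above for play 22659; the `role` column and the regex that derives it are my own convention, not part of the dataset.

```python
import pandas as pd

# Names copied from the outputs above for play 22659.
play = pd.DataFrame({
    "tackle_with_assist_1_player_name": ["K.Mack"],
    "tackle_with_assist_2_player_name": ["O.Zaccheaus"],
    "assist_tackle_1_player_name": ["M.Davis"],
    "assist_tackle_2_player_name": ["D.London"],
})

# One row per (field, player), dropping slots that are empty.
long = play.melt(var_name="field", value_name="player").dropna(subset=["player"])

# Strip the "_<n>_player_name" suffix to leave just the role.
long["role"] = long["field"].str.replace(r"_\d+_player_name$", "", regex=True)
print(long[["player", "role"]])
```

In this play the first name in each parenthesized pair of the description lands under tackle_with_assist and the second under assist_tackle, which is consistent with the reading above.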
(pbp_2022.pass_defense_1_player_id[1613],
 pbp_2022.pass_defense_1_player_name[1613],
 pbp_2022.pass_defense_2_player_id[1613],
 pbp_2022.pass_defense_2_player_name[1613],
 pbp_2022.desc[1613])
('00-0033050',
 'X.Howard',
 '00-0036998',
 'J.Holland',
 '(10:05) (Shotgun) 10-M.Jones pass deep right intended for 1-D.Parker INTERCEPTED by 8-J.Holland (25-X.Howard) at MIA -3. 8-J.Holland to MIA 28 for 31 yards (76-I.Wynn).')

The following fields show player_id (string), player_name (string) or team (string) for a variety of fumble-related plays:

  • forced_fumble_player_1_team
  • forced_fumble_player_1_player_id
  • forced_fumble_player_1_player_name
  • forced_fumble_player_2_team
  • forced_fumble_player_2_player_id
  • forced_fumble_player_2_player_name
  • fumbled_1_team
  • fumbled_1_player_id
  • fumbled_1_player_name
  • fumbled_2_player_id
  • fumbled_2_player_name
  • fumbled_2_team
  • fumble_recovery_1_team
  • fumble_recovery_1_yards
  • fumble_recovery_1_player_id
  • fumble_recovery_1_player_name
  • fumble_recovery_2_team
  • fumble_recovery_2_yards
  • fumble_recovery_2_player_id
  • fumble_recovery_2_player_name
(pbp_2022.fumble_forced[9041],
 pbp_2022.forced_fumble_player_1_team[9041],
 pbp_2022.forced_fumble_player_1_player_id[9041],
 pbp_2022.forced_fumble_player_1_player_name[9041],
 pbp_2022.forced_fumble_player_2_team[9041],
 pbp_2022.forced_fumble_player_2_player_id[9041],
 pbp_2022.forced_fumble_player_2_player_name[9041],
 pbp_2022.desc[9041])
(1.0,
 'NYG',
 '00-0033046',
 'J.Ward',
 'NYG',
 '00-0036167',
 'T.Crowder',
 '(:03) (Shotgun) 1-J.Fields pass short right to 25-T.Ebner to CHI 35 for 2 yards. Lateral to 19-E.St. Brown to CHI 44 for 9 yards. FUMBLES, touched at CHI 44, recovered by CHI-1-J.Fields at CHI 39. 1-J.Fields to CHI 36 for -3 yards. Lateral to 19-E.St. Brown to CHI 44 for 8 yards. Lateral to 25-T.Ebner to NYG 44 for 12 yards (55-J.Ward). FUMBLES (55-J.Ward), recovered by CHI-62-L.Patrick at NYG 46. 62-L.Patrick to CHI 48 for -6 yards. Lateral to 1-J.Fields to CHI 49 for 1 yard. Lateral to 76-T.Jenkins to CHI 46 for -3 yards (48-T.Crowder). FUMBLES (48-T.Crowder), touched at CHI 45, recovered by CHI-25-T.Ebner at CHI 41. 25-T.Ebner to CHI 32 for -9 yards. FUMBLES, touched at CHI 32, RECOVERED by NYG-24-D.Belton at CHI 28.')
(pbp_2022.fumbled_1_team[9041],
 pbp_2022.fumbled_1_player_id[9041],
 pbp_2022.fumbled_1_player_name[9041],
 pbp_2022.fumbled_2_team[9041],
 pbp_2022.fumbled_2_player_id[9041],
 pbp_2022.fumbled_2_player_name[9041],
 pbp_2022.desc[9041])
('CHI',
 '00-0034279',
 'E.St. Brown',
 'CHI',
 '00-0036953',
 'T.Ebner',
 '(:03) (Shotgun) 1-J.Fields pass short right to 25-T.Ebner to CHI 35 for 2 yards. Lateral to 19-E.St. Brown to CHI 44 for 9 yards. FUMBLES, touched at CHI 44, recovered by CHI-1-J.Fields at CHI 39. 1-J.Fields to CHI 36 for -3 yards. Lateral to 19-E.St. Brown to CHI 44 for 8 yards. Lateral to 25-T.Ebner to NYG 44 for 12 yards (55-J.Ward). FUMBLES (55-J.Ward), recovered by CHI-62-L.Patrick at NYG 46. 62-L.Patrick to CHI 48 for -6 yards. Lateral to 1-J.Fields to CHI 49 for 1 yard. Lateral to 76-T.Jenkins to CHI 46 for -3 yards (48-T.Crowder). FUMBLES (48-T.Crowder), touched at CHI 45, recovered by CHI-25-T.Ebner at CHI 41. 25-T.Ebner to CHI 32 for -9 yards. FUMBLES, touched at CHI 32, RECOVERED by NYG-24-D.Belton at CHI 28.')
(pbp_2022.fumble_recovery_1_team[9041],
 pbp_2022.fumble_recovery_1_player_id[9041],
 pbp_2022.fumble_recovery_1_player_name[9041],
 pbp_2022.fumble_recovery_1_yards[9041],
 pbp_2022.fumble_recovery_2_team[9041],
 pbp_2022.fumble_recovery_2_player_id[9041],
 pbp_2022.fumble_recovery_2_player_name[9041],
 pbp_2022.fumble_recovery_2_yards[9041],
 pbp_2022.desc[9041])
('CHI',
 '00-0036945',
 'J.Fields',
 -3.0,
 'CHI',
 '00-0033082',
 'L.Patrick',
 -6.0,
 '(:03) (Shotgun) 1-J.Fields pass short right to 25-T.Ebner to CHI 35 for 2 yards. Lateral to 19-E.St. Brown to CHI 44 for 9 yards. FUMBLES, touched at CHI 44, recovered by CHI-1-J.Fields at CHI 39. 1-J.Fields to CHI 36 for -3 yards. Lateral to 19-E.St. Brown to CHI 44 for 8 yards. Lateral to 25-T.Ebner to NYG 44 for 12 yards (55-J.Ward). FUMBLES (55-J.Ward), recovered by CHI-62-L.Patrick at NYG 46. 62-L.Patrick to CHI 48 for -6 yards. Lateral to 1-J.Fields to CHI 49 for 1 yard. Lateral to 76-T.Jenkins to CHI 46 for -3 yards (48-T.Crowder). FUMBLES (48-T.Crowder), touched at CHI 45, recovered by CHI-25-T.Ebner at CHI 41. 25-T.Ebner to CHI 32 for -9 yards. FUMBLES, touched at CHI 32, RECOVERED by NYG-24-D.Belton at CHI 28.')
(pbp_2022.sack[54],
 pbp_2022.sack_player_name[54],
 pbp_2022.sack_player_id[54],
 pbp_2022.desc[54])
(1.0,
 'Qu.Williams',
 '00-0035680',
 '(9:43) (Shotgun) 8-L.Jackson sacked ob at NYJ 49 for 0 yards (56-Qu.Williams).')

When a sack is split, sack == 1 but sack_player_name and sack_player_id are nan; the two players are listed in the half_sack fields instead.

(pbp_2022.sack[55],
 pbp_2022.sack_player_name[55],
 pbp_2022.sack_player_id[55],
 pbp_2022.half_sack_1_player_id[55],
 pbp_2022.half_sack_1_player_name[55],
 pbp_2022.half_sack_2_player_id[55],
 pbp_2022.half_sack_2_player_name[55],
 pbp_2022.desc[55])
(1.0,
 nan,
 nan,
 '00-0034163',
 'J.Johnson',
 '00-0034163',
 'J.Martin',
 '(8:59) (Shotgun) 8-L.Jackson sacked at BAL 49 for -2 yards (sack split by 52-J.Johnson and 54-J.Martin).')
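Because split sacks live in separate fields, a per-player sack total needs to combine both. A minimal sketch, assuming a full sack counts 1 and each half sack counts 0.5; the frame and player names are invented, and names are used instead of ids only for readability.

```python
import numpy as np
import pandas as pd

# Hypothetical plays: one solo sack, one split sack, one non-sack play.
toy = pd.DataFrame({
    "sack":                    [1.0, 1.0, 0.0],
    "sack_player_name":        ["A.Rusher", np.nan, np.nan],
    "half_sack_1_player_name": [np.nan, "B.Edge", np.nan],
    "half_sack_2_player_name": [np.nan, "C.Tackle", np.nan],
})

sacks = toy[toy["sack"] == 1]

# Full sacks: one credit per play for the named player.
full = sacks["sack_player_name"].dropna().value_counts()

# Split sacks: half a credit for each of the two listed players.
half = pd.concat(
    [sacks["half_sack_1_player_name"], sacks["half_sack_2_player_name"]]
).dropna().value_counts() * 0.5

# Align on player name; players missing from one side contribute 0.
sack_totals = full.add(half, fill_value=0).sort_values(ascending=False)
print(sack_totals)
```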

return_team (string) is the abbreviation of the team that returned the kickoff or punt, and return_yards (numeric) is the length of the return. I’ll check whether fumble returns are included before I use these fields for analyses.

(pbp_2022.return_team[1], 
 pbp_2022.return_yards[1],
 pbp_2022.desc[1])
('NYJ',
 25.0,
 '9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')

The following fields hold information about penalties.

  • penalty_team (string)
  • penalty_player_id (string)
  • penalty_player_name (string)
  • penalty_yards (integer)
  • penalty_type (string)
(pbp_2022.penalty[5],
 pbp_2022.penalty_team[5],
 pbp_2022.penalty_player_id[5],
 pbp_2022.penalty_player_name[5],
 pbp_2022.penalty_yards[5],
 pbp_2022.penalty_type[5],
 pbp_2022.desc[5])
(1.0,
 'NYJ',
 '00-0026158',
 'J.Flacco',
 10.0,
 'Intentional Grounding',
 '(14:01) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short right [93-C.Campbell]. PENALTY on NYJ-19-J.Flacco, Intentional Grounding, 10 yards, enforced at NYJ 46.')
pbp_2022.penalty_type.unique()
array([nan, 'Intentional Grounding', 'Illegal Contact',
       'Offensive Holding', 'Defensive Pass Interference',
       'Defensive Holding', 'Offensive Pass Interference', 'False Start',
       'Horse Collar Tackle', 'Defensive Too Many Men on Field',
       'Taunting', 'Delay of Game', 'Roughing the Passer',
       'Unsportsmanlike Conduct', 'Low Block', 'Illegal Formation',
       'Ineligible Downfield Pass', 'Unnecessary Roughness',
       'Neutral Zone Infraction', 'Running Into the Kicker',
       'Illegal Shift', 'Defensive Offside', 'Illegal Use of Hands',
       'Illegal Block Above the Waist', 'Offensive Too Many Men on Field',
       'Encroachment', 'Disqualification', 'Ineligible Downfield Kick',
       'Face Mask', 'Player Out of Bounds on Kick',
       'Illegal Forward Pass', 'Chop Block', 'Delay of Kickoff',
       'Tripping', 'Illegal Substitution', 'Offensive Offside',
       'Illegal Blindside Block', 'Illegal Touch Pass',
       'Offside on Free Kick', 'Roughing the Kicker',
       'Fair Catch Interference', 'Leverage', 'Illegal Motion',
       'Defensive Delay of Game', 'Illegal Bat', 'Illegal Touch Kick',
       'Illegal Double-Team Block', 'Invalid Fair Catch Signal',
       'Illegal Crackback', 'Illegal Kick/Kicking Loose Ball'],
      dtype=object)
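With penalty_type and penalty_yards together, it is straightforward to tally how often each penalty is called and how many yards it costs. A sketch on invented rows (the penalty names come from the list above, but the plays themselves are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical plays; the second row has no penalty.
toy = pd.DataFrame({
    "penalty":       [1.0, 0.0, 1.0, 1.0],
    "penalty_team":  ["NYJ", None, "BAL", "NYJ"],
    "penalty_type":  ["Intentional Grounding", None, "False Start", "False Start"],
    "penalty_yards": [10.0, np.nan, 5.0, 5.0],
})

flags = toy[toy["penalty"] == 1]

# Number of calls and total yards for each penalty type.
by_type = (
    flags.groupby("penalty_type")["penalty_yards"]
         .agg(["count", "sum"])
         .sort_values("count", ascending=False)
)
print(by_type)
```

Swapping `"penalty_type"` for `"penalty_team"` in the groupby would give the same tally per team.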

replay_or_challenge (1 for True and 0 for False) and replay_or_challenge_result (nan, 'upheld', or 'reversed') show information about whether a replay or challenge occurred on the play.

(pbp_2022.replay_or_challenge[621],
 pbp_2022.replay_or_challenge_result[621],
 pbp_2022.desc[621])
(1,
 'upheld',
 '(7:42) (Shotgun) 25-M.Gordon right tackle to SEA 1 for no gain (6-Q.Diggs, 10-U.Nwosu). FUMBLES (6-Q.Diggs), RECOVERED by SEA-30-M.Jackson at SEA 2. 30-M.Jackson to SEA 10 for 8 yards (14-C.Sutton). The Replay Official reviewed the fumble ruling, and the play was Upheld. The ruling on the field stands.')
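One question these two fields can answer is how often reviewed plays are upheld versus reversed. A small sketch with invented rows standing in for `pbp_2022`:

```python
import numpy as np
import pandas as pd

# Hypothetical plays: three were reviewed, two were not.
toy = pd.DataFrame({
    "replay_or_challenge":        [0, 1, 1, 1, 0],
    "replay_or_challenge_result": [np.nan, "upheld", "reversed", "upheld", np.nan],
})

reviewed = toy[toy["replay_or_challenge"] == 1]

# Share of reviewed plays by outcome.
outcome_share = reviewed["replay_or_challenge_result"].value_counts(normalize=True)
print(outcome_share)
```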

safety_player_name and safety_player_id have information about the player who caused the safety.

(pbp_2022.safety[3255],
 pbp_2022.safety_player_name[3255],
 pbp_2022.safety_player_id[3255],
 pbp_2022.desc[3255])
(1.0,
 'D.Alford',
 '00-0037034',
 '(:13) (Run formation) 19-B.Powell right end ran ob in End Zone for -26 yards, SAFETY (37-D.Alford).')

series_result is the result of the offensive series.

pbp_2022.series_result.unique()
array(['First down', 'Punt', 'Turnover', 'Field goal',
       'Missed field goal', 'Touchdown', 'End of half',
       'Turnover on downs', 'QB kneel', 'Opp touchdown', 'Safety', nan],
      dtype=object)

play_type_nfl shows slightly different play type categories.

pbp_2022.play_type_nfl.unique()
array(['GAME_START', 'KICK_OFF', 'RUSH', 'PASS', 'PUNT', 'TIMEOUT',
       'PENALTY', 'FIELD_GOAL', 'END_QUARTER', 'SACK', 'XP_KICK',
       'END_GAME', 'PAT2', nan, 'FREE_KICK'], dtype=object)

drive_play_count shows how many plays the drive had. It doesn’t always seem to match the number of plays I count on the drive, so I need to understand how this value is calculated before using it for analyses.

pbp_2022.drive_play_count.unique()
array([nan,  4.,  6.,  5.,  3.,  8.,  1.,  9., 16., 11.,  2., 13.,  7.,
       14., 10., 15., 12.,  0., 18., 19., 20., 17., 21.])

drive_time_of_possession is a formatted string of minutes:seconds the drive took.

pbp_2022.drive_time_of_possession.unique()[:5]
array([nan, '1:18', '3:53', '2:44', '1:04'], dtype=object)
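Since this is a string, it needs converting before any arithmetic. A sketch of one way to turn minutes:seconds into total seconds, using the five values shown above; the regex assumes every non-missing value has the form m:ss, which I haven’t verified for the whole column.

```python
import numpy as np
import pandas as pd

# The first five unique values from the output above.
top = pd.Series([np.nan, "1:18", "3:53", "2:44", "1:04"],
                name="drive_time_of_possession")

# Pull out minutes and seconds; rows that are nan stay nan.
parts = top.str.extract(r"^(?P<minutes>\d+):(?P<seconds>\d{2})$")

top_seconds = (
    pd.to_numeric(parts["minutes"]) * 60 + pd.to_numeric(parts["seconds"])
)
print(top_seconds.tolist())
```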

drive_first_downs is the number of first downs achieved on the drive.

pbp_2022.drive_first_downs.unique()[:5]
array([nan,  1.,  0.,  3.,  2.])

drive_inside20 is either nan, 1 (True) or 0 (False). As far as I can tell, it indicates whether the drive reached the red zone (inside the opponent’s 20-yard line).

pbp_2022.drive_inside20.unique()
array([nan,  0.,  1.])

drive_ended_with_score indicates if a drive ended with the offensive team scoring. It is either nan, 1 (True) or 0 (False).

pbp_2022.drive_ended_with_score.unique()
array([nan,  0.,  1.])
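Drive-level fields like this one repeat on every play of the drive, so they should be collapsed to one row per drive before averaging. A sketch of a scoring-drive rate per team; the `game_id`, `drive` and `posteam` columns are assumptions about how drives and the possessing team are identified, and all values are invented.

```python
import pandas as pd

# Hypothetical plays from one game: three drives, values repeated per play.
toy = pd.DataFrame({
    "game_id": ["g1"] * 5,
    "posteam": ["BAL", "BAL", "NYJ", "NYJ", "BAL"],
    "drive":   [1, 1, 2, 2, 3],
    "drive_ended_with_score": [1, 1, 0, 0, 0],
})

# Keep one row per drive so long drives are not over-weighted.
drives = toy.drop_duplicates(subset=["game_id", "drive"])

# Fraction of each team's drives that ended in a score.
scoring_rate = drives.groupby("posteam")["drive_ended_with_score"].mean()
print(scoring_rate)
```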

I’ll have to look into it more before using it for analyses, but I believe drive_yards_penalized is the total number of offensive penalty yards on the drive.

pbp_2022.drive_yards_penalized.unique()[:5]
array([ nan, -10.,   0.,   5.,  32.])

drive_play_id_started and drive_play_id_ended indicate the start and end play_id of the drive. Note that play_id values are not consecutive and don’t start at 1.

(pbp_2022.drive_play_id_started[1],
pbp_2022.drive_play_id_ended[1])
(43.0, 172.0)
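Because play_id values have gaps, subtracting the start id from the end id won’t give the number of plays on a drive; the plays have to be selected by range and counted. In this sketch only the endpoints 43 and 172 come from the output above, and the ids in between are invented. On the real data I would also filter on the game first, since as I understand it play_id is only unique within a game.

```python
import pandas as pd

# Hypothetical play ids for one game; only 43 and 172 come from the data above.
play_ids = pd.Series([1, 43, 68, 89, 115, 172, 201], name="play_id")

start, end = 43, 172

# Select the plays whose id falls within the drive's range (inclusive).
drive_plays = play_ids[play_ids.between(start, end)]

n_plays = len(drive_plays)   # rows actually in the drive
id_span = end - start + 1    # what naive subtraction would suggest
print(n_plays, id_span)
```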

away_score and home_score are the final scores of the away team and home team.

(pbp_2022.away_team[1],
 pbp_2022.away_score[1],
 pbp_2022.home_team[1],
 pbp_2022.home_score[1])
('BAL', 24, 'NYJ', 9)

result appears to be the home team’s final score minus the away team’s (here, 9 - 24 = -15). I’ll look into it more to confirm.

pbp_2022.result[1]
-15

total is the total number of points scored by both teams.

pbp_2022.total[1]
33
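As a quick sanity check on result and total, the relationships can be tested directly. This sketch uses only the BAL at NYJ values shown above; running the same comparison across all of `pbp_2022` would be the real confirmation.

```python
import pandas as pd

# Values copied from the outputs above for the BAL at NYJ game.
game = pd.DataFrame({
    "away_team": ["BAL"], "away_score": [24],
    "home_team": ["NYJ"], "home_score": [9],
    "result": [-15], "total": [33],
})

# result should equal home minus away; total should equal home plus away.
result_matches = (game["home_score"] - game["away_score"]).eq(game["result"]).all()
total_matches = (game["home_score"] + game["away_score"]).eq(game["total"]).all()
print(result_matches, total_matches)
```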

div_game indicates if the game is between teams in the same division. It is either 1 (True) or 0 (False).

pbp_2022.div_game.unique(), pbp_2022.div_game[1]
(array([0, 1]), 0)

away_coach and home_coach are the names of the away team and home team coaches, respectively.

pbp_2022.away_coach[1], pbp_2022.home_coach[1]
('John Harbaugh', 'Robert Saleh')

The following fields give the name, player id and jersey number of the passer, rusher or receiver on the play:

  • passer
  • passer_id
  • passer_jersey_number
  • rusher
  • rusher_id
  • rusher_jersey_number
  • receiver
  • receiver_id
  • receiver_jersey_number
(pbp_2022.passer[3], 
 pbp_2022.passer_id[3],
 pbp_2022.passer_jersey_number[3])
('J.Flacco', '00-0026158', 19.0)
(pbp_2022.rusher[2], 
 pbp_2022.rusher_id[2],
 pbp_2022.rusher_jersey_number[2])
('Mi.Carter', '00-0036924', 32.0)
(pbp_2022.receiver[3], 
 pbp_2022.receiver_id[3],
 pbp_2022.receiver_jersey_number[3])
('Mi.Carter', '00-0036924', 32.0)

The following fields indicate if the play is a pass, rush, first down, or special teams, respectively. Their value is nan, 1 (True) or 0 (False):

  • pass
  • rush
  • first_down
  • special
pbp_2022['pass'][3], pbp_2022.desc[3]
(1,
 '(14:29) (No Huddle, Shotgun) 19-J.Flacco pass incomplete short left to 32-Mi.Carter.')
pbp_2022.rush[2], pbp_2022.desc[2]
(1,
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
pbp_2022.first_down[2], pbp_2022.desc[2]
(1.0,
 '(14:56) 32-Mi.Carter left end to NYJ 41 for 19 yards (32-M.Williams; 36-C.Clark).')
pbp_2022.special[1], pbp_2022.desc[1]
(1,
 '9-J.Tucker kicks 68 yards from BAL 35 to NYJ -3. 10-B.Berrios to NYJ 22 for 25 yards (51-J.Ross).')
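Note that `pass` is a reserved word in Python, so `pbp_2022.pass` would be a syntax error and the column has to be accessed with brackets, as above. A sketch of using these flags to compute a pass rate on plays flagged as either pass or rush; the rows are invented.

```python
import pandas as pd

# Hypothetical plays: one kickoff, two rushes, three passes.
toy = pd.DataFrame({
    "pass":    [0, 0, 1, 1, 1, 0],
    "rush":    [0, 1, 0, 0, 0, 1],
    "special": [1, 0, 0, 0, 0, 0],
})

# Bracket access is required because `pass` is a Python keyword.
is_pass = toy["pass"] == 1
is_rush = toy["rush"] == 1

# Keep only plays flagged as a pass or a rush.
scrimmage = toy[is_pass | is_rush]

# Mean of a 0/1 flag is the share of plays flagged 1.
pass_rate = scrimmage["pass"].mean()
print(pass_rate)
```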