Playing with NBA Stats

stats.nba.com is one of the coolest places on the web. If you've ever wanted to have fun with "Big" data and happen to be a big fan of the Association, check it out. I'll show you how to get started.

Step 1

Download Python. There are many ways to get Python. I suggest installing Continuum Analytics' Anaconda distribution, which is popular in the Python community. It also comes with a couple of the Python modules that we'll use to scrape and contextualize the data.

Step 2

Download my Python scripts from my GitHub. These scripts scrape stats.nba.com and do some very, very basic data contextualization. The goal? We want to plot each team's all-time field goal shooting percentage (FG%). This requires obtaining each team's FG% for every game it has ever played. That's what the file get_teams.py is for! The other script, plot_teams.py, generates the plot we're after.
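If FG% is new to you, it is simply field goals made divided by field goals attempted. Here is a minimal sketch of that arithmetic. It is not taken from get_teams.py; the function name and the box-score numbers are made up for illustration.

```python
def field_goal_pct(made, attempted):
    """Return field goal percentage as a fraction between 0 and 1.

    Returns None when no shots were attempted, to avoid dividing by zero.
    (Hypothetical helper, not part of the scripts described in this post.)
    """
    if attempted == 0:
        return None
    return made / attempted


# Toy box-score line: illustrative numbers, not real game data.
made, attempted = 41, 88
print(f"FG%: {field_goal_pct(made, attempted):.3f}")
```

stats.nba.com reports this value as a fraction rather than a percentage, so 0.466 means 46.6%.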

Step 3

Run the Python scripts. The modules each script needs are imported at the top of the file. You will likely need to install nba_py. To do this, simply issue the following command:

$ pip install nba_py

You can execute get_teams.py from the command prompt as follows:

$ python get_teams.py

Provided you have Internet access, you'll see several .csv files (one per NBA team) appear in your working directory. To start processing this data, execute plot_teams.py (in the same directory) as follows:
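If you'd like to peek at the downloaded files before plotting, something like the sketch below may help. It assumes each file is named after its team and has a win/loss column called WL and a field goal percentage column called FG_PCT. Those column names are my assumption about the layout, so check the header row of your own files and adjust. The function name load_fg_pct is hypothetical as well.

```python
import csv
import glob
import os


def load_fg_pct(directory="."):
    """Map each CSV's file stem to a list of (outcome, fg_pct) tuples.

    Assumes (hypothetically) one CSV per team with 'WL' and 'FG_PCT'
    columns. Files or rows that do not fit that layout are skipped.
    """
    teams = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        team = os.path.splitext(os.path.basename(path))[0]
        games = []
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                try:
                    games.append((row["WL"], float(row["FG_PCT"])))
                except (KeyError, TypeError, ValueError):
                    continue  # missing column or non-numeric value
        if games:
            teams[team] = games
    return teams


if __name__ == "__main__":
    # Summarize whatever CSV files sit in the current directory.
    for team, games in load_fg_pct().items():
        print(f"{team}: {len(games)} games loaded")
```

Only the standard library is used here, so it runs without pandas. The real scripts may well read the files differently.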

$ python plot_teams.py

This will generate a violin plot showing the distribution of each team's field goal percentage over its history, split into wins and losses.

The result is fairly intuitive. It makes sense that, on average, a team's field goal percentage is higher when it wins than when it loses. This is why the green (winner) kernel density estimates (KDEs) are shifted above their blue (loser) counterparts.
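You can check the same idea numerically by comparing the average FG% in wins against the average in losses. The sketch below does that for a list of (outcome, FG%) pairs. The helper name is hypothetical and the numbers are toy values, not real game data; swap in your own team's games to run the actual comparison.

```python
from statistics import mean


def mean_fg_pct_by_outcome(games):
    """Return {'W': mean FG% in wins, 'L': mean FG% in losses}.

    `games` is an iterable of (outcome, fg_pct) pairs. An outcome with
    no games maps to None. (Hypothetical helper for illustration.)
    """
    by_outcome = {"W": [], "L": []}
    for outcome, fg_pct in games:
        if outcome in by_outcome:
            by_outcome[outcome].append(fg_pct)
    return {key: (mean(vals) if vals else None) for key, vals in by_outcome.items()}


# Toy values for illustration only, not real game data.
toy_games = [("W", 0.50), ("W", 0.48), ("L", 0.42), ("L", 0.44)]
averages = mean_fg_pct_by_outcome(toy_games)
print(averages["W"] > averages["L"])
```

A difference in means is a much cruder summary than the full distributions in the violin plot, but it is a quick sanity check on the direction of the shift.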