stats.nba.com is one of the coolest places on the web. If you've ever wanted to have fun with "Big" data and happen to be a big fan of the Association, check it out. I'll show you how to get started.
Download Python. There are hundreds of ways to get Python. I suggest installing Continuum Analytics' Anaconda. It's popular in the Python community, and it comes with a couple of the Python modules that we'll use to scrape and contextualize the data.
Download my Python scripts from my GitHub. These scripts scrape stats.nba.com and do some very, very basic data contextualization. The goal? We want to plot each team's all-time field goal shooting percentage (FG%). This requires obtaining each team's FG% for every game it has ever played. That's what the file get_teams.py is for! The other script, plot_teams.py, generates our targeted plot.
Run the Python scripts. The required Python modules are at the top of each Python script. You will likely need to install nba_py. To do this, simply issue the following command:
$ pip install nba_py
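If you're not sure which modules you already have, a quick check like the one below can tell you before you run anything. It's only a sketch: the names in REQUIRED are my assumption about what the scripts import, so swap in whatever actually appears at the top of get_teams.py and plot_teams.py.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list for illustration; copy the real one from the top of each script.
REQUIRED = ["nba_py", "pandas", "matplotlib"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Install these with pip first:", ", ".join(missing))
    else:
        print("All listed modules are available.")
```

Anything it reports as missing can be installed with pip in the same way as nba_py.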
You can execute get_teams.py from the command prompt as follows:
$ python get_teams.py
Provided you have Internet access, you'll see several .csv files (one for each NBA team) appear in your working directory. To start processing this data, execute plot_teams.py (in the same directory) as follows:
$ python plot_teams.py
This will generate a violin plot conveying each team's field goal percentage distribution throughout its history.
This is a fairly intuitive result. It makes sense that, on average, a team's field goal percentage is higher when it wins than when it loses. This is why the green kernel density estimates (KDEs) for wins are shifted above their blue counterparts for losses.
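To make the idea behind those KDEs concrete, here is a minimal sketch using only the standard library. It is not what plot_teams.py does. The column names WL and FG_PCT, the bandwidth, and the handful of rows are all assumptions made up for illustration, so check them against one of your own .csv files.

```python
import csv
import io
import math
from statistics import mean

# Made-up rows for illustration only; real data comes from get_teams.py.
# The column names WL and FG_PCT are assumptions about the CSV layout.
SAMPLE_CSV = """WL,FG_PCT
W,0.512
W,0.478
W,0.495
L,0.431
L,0.452
L,0.409
"""


def split_by_outcome(csv_file):
    """Group per-game FG% values by outcome ('W' or 'L')."""
    groups = {"W": [], "L": []}
    for row in csv.DictReader(csv_file):
        if row["WL"] in groups:
            groups[row["WL"]].append(float(row["FG_PCT"]))
    return groups


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


groups = split_by_outcome(io.StringIO(SAMPLE_CSV))

# The bandwidth of 0.02 is an arbitrary choice for this sketch.
win_kde = gaussian_kde(groups["W"], bandwidth=0.02)
loss_kde = gaussian_kde(groups["L"], bandwidth=0.02)

# Evaluate both curves on a grid of FG% values from 0.350 to 0.600.
grid = [i / 1000 for i in range(350, 601)]
win_peak = max(grid, key=win_kde)
loss_peak = max(grid, key=loss_kde)

if __name__ == "__main__":
    print("Mean FG% in wins:  ", round(mean(groups["W"]), 3))
    print("Mean FG% in losses:", round(mean(groups["L"]), 3))
    print("KDE peak in wins:  ", win_peak)
    print("KDE peak in losses:", loss_peak)
```

Each half of a violin is a smoothed curve of this kind, so a win curve that peaks at a higher FG% than the loss curve is the shift you see in the plot.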