Network Science Based Basketball Analytics

Network science based basketball analytics comprises various recent attempts to apply the perspective of networks to the analysis of basketball.

Overview
Traditional basketball statistics analyze individuals independently of their teammates or competitors, and traditional player positions are determined by individual attributes. In contrast, network based analytics are obtained by constructing team or league level player networks, in which individual players are nodes connected by ball movement or by some measure of similarity. The metrics are then obtained by calculating network properties such as degree, density, centrality, clustering, and distance. This approach enriches the analysis of basketball with new individual and team level statistics and offers a new way of assigning a position to a player.

Team level statistics
A major contribution to team level metrics came from Arizona State University researchers led by Jennifer H. Fewell. Using data from the first round of the 2010 NBA playoffs, they constructed a network for each team, with players as nodes and ball movement between them as links. They identify a trade-off between division of labor and a team's unpredictability, which are not necessarily mutually exclusive and are measured by Uphill downhill flux and Team entropy, respectively.

Team entropy - A measure of unpredictability and variation in a team's offense, with higher entropy meaning more variation. It is calculated by aggregating individual Shannon entropies, where unpredictability is measured as the uncertainty of the ball movement between any two nodes.
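As a rough illustration of the idea, the following Python sketch computes a Shannon entropy for each player's outgoing passes and then aggregates them. The aggregation rule (weighting each player by his share of all passes), the function names, and the pass counts are assumptions made for this example; the text does not specify how Fewell et al. aggregate the individual entropies.

```python
import math

def player_entropy(pass_counts):
    """Shannon entropy (in bits) of one player's outgoing pass distribution."""
    total = sum(pass_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in pass_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

def team_entropy(passes):
    """Aggregate player entropies, weighting each player by his share of passes.

    The weighting scheme is an assumption of this sketch, not a documented rule.
    """
    grand_total = sum(sum(targets.values()) for targets in passes.values())
    if grand_total == 0:
        return 0.0
    return sum(
        (sum(targets.values()) / grand_total) * player_entropy(targets)
        for targets in passes.values()
    )

# Hypothetical pass counts: passes[passer][receiver] = number of passes.
passes = {
    "PG": {"SG": 10, "SF": 10, "PF": 10, "C": 10},  # spreads the ball evenly
    "SG": {"PG": 20},                               # fully predictable
    "C": {"PG": 5, "SG": 5},
}
print(team_entropy(passes))
```

A player who always passes to the same teammate contributes zero entropy, while a player who distributes the ball evenly among four teammates contributes two bits.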

Uphill downhill flux - Measures the division of labor, or the expertise in moving the ball to the player with the best shooting percentage. According to Fewell et al., it can be interpreted as the average change in potential shooting percentage per pass. The metric is calculated as the sum, over all edges, of the difference between the shooting percentages of the nodes at the ends of the edge, weighted by the probability of that edge:



$$ F = \sum_{i \neq j} p_{ij} \, (x_i - x_j) $$,

where $$ p_{ij} $$ is the probability of the link between players i and j, and $$ x_i $$ and $$ x_j $$ are their shooting percentages.
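The flux formula can be sketched in a few lines of Python. The function name, the dictionary layout, and the numbers are hypothetical, and the sign convention simply follows the formula as written here.

```python
def uphill_downhill_flux(link_prob, shooting_pct):
    """Sum p_ij * (x_i - x_j) over all links with i != j."""
    flux = 0.0
    for (i, j), p_ij in link_prob.items():
        if i != j:
            flux += p_ij * (shooting_pct[i] - shooting_pct[j])
    return flux

# Hypothetical shooting percentages and link probabilities.
shooting_pct = {"A": 0.40, "B": 0.50, "C": 0.60}
link_prob = {("A", "B"): 0.5, ("B", "C"): 0.3, ("C", "A"): 0.2}

print(uphill_downhill_flux(link_prob, shooting_pct))
```

In this toy network most of the probability mass sits on links that run from a weaker shooter to a better one, so under this sign convention the result is negative.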

Other measures include:

Team clustering coefficient - A direct application of the clustering coefficient. It measures how interconnected the players are: whether the ball moves through one node or in many ways between all the players.
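To make the contrast concrete, the sketch below computes the standard average local clustering coefficient for an undirected, unweighted graph. This textbook variant is an assumption; the original study may have used a weighted or directed version, and the two example networks are invented.

```python
from itertools import combinations

def average_clustering(adjacency):
    """Mean local clustering coefficient of an undirected, unweighted graph."""
    coefficients = []
    for node, neighbors in adjacency.items():
        k = len(neighbors)
        if k < 2:
            # Fewer than two neighbors: no neighbor pairs, coefficient is 0.
            coefficients.append(0.0)
            continue
        # Count pairs of neighbors that are themselves connected.
        linked = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
        coefficients.append(2 * linked / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)

players = ["PG", "SG", "SF", "PF", "C"]

# Hypothetical team where every pass goes through the point guard.
star = {
    "PG": {"SG", "SF", "PF", "C"},
    "SG": {"PG"}, "SF": {"PG"}, "PF": {"PG"}, "C": {"PG"},
}

# Hypothetical team where every player passes to every other player.
complete = {p: {q for q in players if q != p} for p in players}

print(average_clustering(star))      # 0.0
print(average_clustering(complete))  # 1.0
```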

Team degree centrality - Like the previous metric, it measures whether there is one dominant player on the team. It is calculated by the formula

$$ C_D = \sum_{v \in V} \frac{deg(v^*) - deg(v)} {\mid V \mid - 1} $$

where deg(v) is the degree of node v, deg(v*) is the degree of the highest-degree node, and |V| is the number of nodes.
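A minimal sketch of this formula, assuming the node degrees have already been computed; the function name and the degree values are illustrative only.

```python
def team_degree_centrality(degrees):
    """Sum of (max degree - deg(v)) over all nodes, divided by |V| - 1."""
    n = len(degrees)
    if n < 2:
        return 0.0
    top = max(degrees.values())
    return sum(top - d for d in degrees.values()) / (n - 1)

# Hypothetical degrees: one dominant ball handler versus a balanced team.
dominant = {"PG": 4, "SG": 1, "SF": 1, "PF": 1, "C": 1}
balanced = {"PG": 4, "SG": 4, "SF": 4, "PF": 4, "C": 4}

print(team_degree_centrality(dominant))  # 3.0
print(team_degree_centrality(balanced))  # 0.0
```

The value is zero when all players have the same degree and grows as one player's degree pulls away from the rest.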

A combination of low clustering and high degree centrality means that the defense can double-team the dominant player, since without him the team has trouble moving the ball.

Average path length - Number of passes per play.

Path flow rate - Number of passes per unit time. It measures how quickly the team moves the ball.
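Both pass-based measures can be illustrated with a small sketch. The record layout (`passes`, `seconds`) is hypothetical, and computing the flow rate as total passes over total time, rather than averaging per-play rates, is an assumption of this example.

```python
def average_path_length(plays):
    """Mean number of passes per play."""
    return sum(play["passes"] for play in plays) / len(plays)

def path_flow_rate(plays):
    """Total passes divided by total elapsed time (passes per second)."""
    total_time = sum(play["seconds"] for play in plays)
    return sum(play["passes"] for play in plays) / total_time

# Hypothetical plays: number of passes and duration of each possession.
plays = [
    {"passes": 3, "seconds": 12.0},
    {"passes": 5, "seconds": 20.0},
    {"passes": 1, "seconds": 8.0},
]

print(average_path_length(plays))  # 3.0
print(path_flow_rate(plays))       # 0.225
```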

Deviation from maximum operating potential - Using players as nodes, ball movement as links, and true shooting percentage as efficiency, an analogy can be made to a traffic network. Each individual is assumed to have a skill curve f(x) that declines in the number of shots taken. Individual maximization of efficiency yields $$ f(x_1)=f(x_2)=\dots=f(x_i) $$, whereas maximum team efficiency is achieved by solving $$ \frac {dF}{dx_1} = \frac {dF}{dx_2} = \dots = \frac {dF}{dx_i} $$, where $$ F = x_1f(x_1) + x_2f(x_2) + \dots + x_if(x_i) $$. The difference between the two constitutes the team's deviation from its maximum potential.
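A toy two-player case may help show why the two conditions differ. The sketch assumes linear skill curves f_k(x) = a_k - b_k x with b_k > 0, shot shares that sum to one, and invented coefficients; none of these assumptions come from the original work.

```python
def equal_efficiency_share(a1, b1, a2, b2):
    """Share x1 at which f1(x1) == f2(1 - x1), clamped to [0, 1]."""
    x1 = (a1 - a2 + b2) / (b1 + b2)
    return min(1.0, max(0.0, x1))

def optimal_share(a1, b1, a2, b2):
    """Share x1 maximizing F = x1*f1(x1) + x2*f2(x2) with x2 = 1 - x1."""
    # Setting dF/dx1 = 0 gives a1 - 2*b1*x1 = a2 - 2*b2*(1 - x1).
    x1 = (a1 - a2 + 2 * b2) / (2 * (b1 + b2))
    return min(1.0, max(0.0, x1))

def team_efficiency(x1, a1, b1, a2, b2):
    """Team efficiency F for shot shares x1 and x2 = 1 - x1."""
    x2 = 1.0 - x1
    return x1 * (a1 - b1 * x1) + x2 * (a2 - b2 * x2)

# Hypothetical skill curves: player 1 is the better shooter.
curves = dict(a1=0.6, b1=0.2, a2=0.5, b2=0.2)

x_equal = equal_efficiency_share(**curves)
x_best = optimal_share(**curves)
deviation = team_efficiency(x_best, **curves) - team_efficiency(x_equal, **curves)

print(round(x_equal, 4), round(x_best, 4), round(deviation, 5))
```

In this example the equal-efficiency allocation gives the better shooter a larger share of shots than the team optimum does, and the gap in F between the two allocations is the deviation.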

Individual statistics
Success/Failure Ratio - The number of times the player (node) was involved in a successful play divided by the number of times the player was involved in an unsuccessful play. The metric is obtained from the team's play-by-play network.
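The ratio can be sketched as follows, assuming each play is recorded as the list of players involved plus a success flag. The record layout, the handling of players with no unsuccessful plays (reported as infinity), and the data are assumptions of this example.

```python
from collections import Counter

def success_failure_ratio(plays):
    """Per-player ratio of successful to unsuccessful plays they took part in."""
    successes, failures = Counter(), Counter()
    for involved, scored in plays:
        target = successes if scored else failures
        # Count each player once per play, even if he touched the ball twice.
        for player in set(involved):
            target[player] += 1
    ratios = {}
    for player in set(successes) | set(failures):
        if failures[player] == 0:
            ratios[player] = float("inf")  # never involved in a failed play
        else:
            ratios[player] = successes[player] / failures[player]
    return ratios

# Hypothetical plays: (players involved, whether the play was successful).
plays = [
    (["A", "B"], True),
    (["A", "C"], False),
    (["A", "B", "C"], True),
    (["B"], False),
]

print(success_failure_ratio(plays))
```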

Under/over performance - The metric is calculated by mapping the bipartite player network. Players are connected if they were part of the same team. The links are weighted by how successful the team was when the players played together. Node centrality measures are then compared with reference centrality distributions for each node, obtained by bootstrap-based randomization procedures, and p-values are calculated. For example, the p-value of player i is given by:

$$ p_i = \frac {\sum_{k=1}^{J} I(\pi^*_{ik} \ge \pi_i^0)}{J} $$, where $$ \pi^*_{ik} $$ is the reference centrality score of player i in iteration k, $$ \pi_i^0 $$ is the calculated centrality score, I is the indicator function, and J is the number of iterations. High p-values indicate under-performance, and low p-values indicate over-performance.
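Given a set of reference scores, the p-value itself is a simple proportion. The sketch below assumes the reference centrality scores have already been produced by some randomization procedure, which is not shown; the function name and the numbers are hypothetical.

```python
def empirical_p_value(observed, reference_scores):
    """Fraction of reference centrality scores that are >= the observed score."""
    hits = sum(1 for score in reference_scores if score >= observed)
    return hits / len(reference_scores)

# Hypothetical observed centrality and J = 5 reference scores for one player.
observed = 0.8
reference_scores = [0.5, 0.6, 0.7, 0.9, 1.0]

print(empirical_p_value(observed, reference_scores))  # 0.4
```

In practice J would be far larger than five, so that the proportion is a stable estimate.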

Under-utilization - A player is under-utilized by the team if he has a low degree centrality but is over-performing.

Player positions
New basketball positions were classified by Stanford University student Muthu Alagappan, who, while working for the data visualization company Ayasdi, mapped the network of NBA players from one season, linking them by the similarity of their statistics. Based on the resulting node clusters, players were grouped into 13 positions.

Offensive Ball-Handler - A player who specializes in scoring and ball handling, but has low averages of steals and blocks.

Defensive Ball-Handler - A player who specializes in assisting and stealing the ball, but is average in scoring and shooting.

Combo Ball-Handler - A player who is above average in both offense and defense, but does not excel in either.

Shooting Ball-Handler - A player who is above average in shot attempts and points scored per game.

Role-Playing Ball-Handler - Players who play few minutes and do not have a large impact on the team.

3-Point Rebounder - A big man and ball handler with above average rebounds and three-point shots attempted and made.

Scoring Rebounder - A player with high scoring and rebound averages.

Paint Protector - Players valued for blocking and rebounding, but with low average points scored.

Scoring Paint Protector - Players who are good at both offense and defense in the paint.

NBA 1st-Team - Players who are above average in most of the statistical categories.

NBA 2nd-Team - Similar to NBA 1st-Team players, but a bit worse.

Role Player - Similar to NBA 2nd-Team players, but worse.

One-of-a-Kind - Players so exceptional that they could not be categorized.
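The general idea of grouping players by statistical similarity can be illustrated with a deliberately simple stand-in: link two players when the Euclidean distance between their normalized statistics falls below a threshold, then treat each connected component as a group. This is not the method used by Alagappan or Ayasdi, whose details are not described here; the player names, statistics, and threshold are all invented.

```python
import math
from collections import deque

def similarity_clusters(stats, threshold):
    """Group players into connected components of a similarity network."""
    players = list(stats)
    neighbors = {p: set() for p in players}
    # Link every pair of players whose stat vectors are close enough.
    for idx, p in enumerate(players):
        for q in players[idx + 1:]:
            if math.dist(stats[p], stats[q]) <= threshold:
                neighbors[p].add(q)
                neighbors[q].add(p)
    # Breadth-first search to collect each connected component.
    clusters, seen = [], set()
    for start in players:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        clusters.append(component)
    return clusters

# Hypothetical normalized statistics: (points, rebounds, assists).
stats = {
    "Guard1": (0.9, 0.1, 0.8),
    "Guard2": (0.8, 0.2, 0.9),
    "Big1": (0.3, 0.9, 0.1),
    "Big2": (0.2, 0.8, 0.2),
    "Unique": (1.0, 1.0, 1.0),
}

clusters = similarity_clusters(stats, threshold=0.3)
print(clusters)
```

A player who is far from everyone else ends up alone in his own component, which loosely mirrors the One-of-a-Kind category.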