Sports analytics

Sports analytics are collections of relevant historical statistics that can provide a competitive advantage to a team or individual. By informing players, coaches and other staff, they facilitate decision-making both during and prior to sporting events. The term "sports analytics" was popularized in mainstream sports culture following the release of the 2011 film Moneyball. In the film, Oakland Athletics general manager Billy Beane (played by Brad Pitt) relies heavily on baseball analytics to build a competitive team on a minimal budget, building upon and extending the established practice of sabermetrics.

There are two key aspects of sports analytics: on-field and off-field analytics. On-field analytics deals with improving the on-field performance of teams and players, addressing questions such as "which player on the Red Sox contributed most to the team's offense?" or "who is the best wing player in the NBA?" Off-field analytics deals with the business side of sports. It helps a sports organization or governing body surface patterns and insights in data that could increase ticket and merchandise sales or improve fan engagement. In essence, off-field analytics uses data to help rights-holders make decisions that lead to higher growth and increased profitability.

As technology has advanced in recent years, data collection has become more in-depth and can be conducted with relative ease. These advances have allowed sports analytics to grow as well, leading to the development of advanced statistics and machine learning, as well as sport-specific technologies that allow teams to run game simulations prior to play, improve fan acquisition and marketing strategies, and even understand the impact of sponsorship on each team and its fans.

Sports analytics has also had a significant impact on professional sports through sports betting. In-depth analytics has taken sports gambling to new levels: whether in fantasy sports leagues or nightly wagers, bettors now have more information at their disposal to aid decision-making than ever before. A number of companies and websites have been developed to provide fans with up-to-date information for their betting needs.

Early history
Baseball was one of the first sports to embrace sports analytics, with Earnshaw Cook publishing Percentage Baseball in 1964. This was the first publication citing sports analytics to garner national media attention. In 1981, Bill James helped bring SABR (Society for American Baseball Research), one of the leading sports analytical organizations for baseball, into national prominence when Sports Illustrated featured James in Daniel Okrent's article "He Does It By The Numbers".

In 1984, New York Mets manager Davey Johnson became the first known member of a professional sports organization to advocate for the use of sports analytics. During his time with the Baltimore Orioles, Johnson had tried to convince the organization to use his FORTRAN baseball computer simulation to determine the team's optimal starting lineup. As manager of the Mets, Johnson tasked a team employee with writing a dBASE II application to run sophisticated statistical models in order to better understand the capabilities and tendencies of the team's opponents. By the close of the twentieth century, sports analytics had gained significant acceptance among the management of many Major League Baseball clubs, notably the Oakland A's, Boston Red Sox and Cleveland Indians.

At the same time, baseball fans and sports media had begun to adopt sports analytics as a way to understand and report the game. In 1996, Baseball Prospectus sought to build upon Bill James' work when it launched the Baseball Prospectus website to present sabermetric research and related findings and to publish advanced metrics such as EqA, the Davenport Translations (DTs), and VORP. Baseball Prospectus has grown into a multi-channel sports media organization employing a team of statisticians and writers who publish New York Times best-selling books and host weekly radio shows and podcasts.

Recent developments
MLB has set the benchmark in sports analytics for a number of years, and some of the game's brightest minds have never played in a major or minor league baseball game. Theo Epstein of the Chicago Cubs is one of them: rather than drawing on professional playing experience, Epstein relies on his Yale University education and the numbers behind the game to make many of his decisions. Epstein is known for his role in ending two of baseball's most famous championship droughts: the Boston Red Sox's Curse of the Bambino in 2004 and, in the 2016 World Series, the Chicago Cubs' 108-year wait between titles. He is a member of a growing community in Major League Baseball who do not rely on years of major league playing experience. This community has been able to grow thanks to the in-depth collection of statistics that has existed in baseball for decades. With analytics now relatively common in MLB, a breadth of statistics and tactics has become vital in the analysis of the game, including:


 * Batting average is one of the most commonly discussed statistics in baseball. A player's batting average is determined by dividing the number of hits by the number of at bats for that player. Broken down by pitch type, it can show a player's tendencies, such as which pitches most often get them out, and can help them identify pitches they struggle with at the plate.
 * On-base percentage is the percentage of times a player reaches base on a hit, a walk, or by being hit by a pitch. This is a significant offensive stat, as it looks beyond hits and, more importantly, illustrates how often a batter can avoid being put out at the plate. It is a more in-depth offensive statistic than batting average because it takes into account walks and being hit by a pitch, both of which are indicators of how a player handles an at bat. Sabermetrics can help a player change their approach in order to raise their on-base percentage, increasing productivity and, ultimately, their overall worth as a player.
 * Slugging average measures the number of bases a player earns on hits per at bat. To determine this stat, the total number of bases earned on hits is divided by the number of at bats. This is a good measure of a batter's power: the higher their slugging average, the more likely they are to hit for extra bases (i.e. a double, triple or home run). For sluggers, analytics can help improve decision-making at the plate: hitters can now study the tendencies of the pitchers they are going to face, familiarizing themselves before they come up to bat.
 * WHIP stands for Walks plus Hits allowed per Inning Pitched and tends to be viewed as a strong way to measure the success of a pitcher, as it illustrates how many baserunners the pitcher allows on both hits and walks. It is also a way of looking at a pitcher's efficiency. Pitchers can now study the upcoming lineup they are going to face and focus on the tendencies of each batter, such as where they stand at the plate, what pitches they tend to chase, and what part of the field they like to hit to.
 * Shifting is a defensive realignment from the standard positions to blanket one side of the field or another. The use of shifts began as a result of hitters routinely getting base hits into certain gaps between fielders. With the use of analytics, managers and players can be aware of a hitter's tendencies and implement a shift accordingly.
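
As a minimal sketch, the four rate statistics described above can be computed from a player's counting stats. The class and function names are illustrative and not taken from any official scoring software, and the sample numbers are invented. The on-base percentage denominator (at bats plus walks, hit-by-pitches and sacrifice flies) follows the standard definition rather than anything stated above.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one hitter (hypothetical container)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def total_bases(self) -> int:
        # A single is worth 1 base, a double 2, a triple 3, a home run 4.
        return (self.singles + 2 * self.doubles
                + 3 * self.triples + 4 * self.home_runs)

    def batting_average(self) -> float:
        # Hits divided by at bats.
        return self.hits / self.at_bats

    def on_base_percentage(self) -> float:
        # Times on base via hit, walk or hit-by-pitch, over the standard
        # denominator (assumption: AB + BB + HBP + SF).
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks
                   + self.hit_by_pitch + self.sacrifice_flies)
        return reached / chances

    def slugging_average(self) -> float:
        # Total bases earned on hits divided by at bats.
        return self.total_bases / self.at_bats


def whip(walks_allowed: int, hits_allowed: int, outs_recorded: int) -> float:
    """Walks plus hits per inning pitched; innings are outs divided by 3."""
    innings = outs_recorded / 3
    return (walks_allowed + hits_allowed) / innings


# Invented sample season, for illustration only.
hitter = BattingLine(at_bats=500, singles=100, doubles=30, triples=5,
                     home_runs=15, walks=60, hit_by_pitch=5, sacrifice_flies=5)
print(f"AVG {hitter.batting_average():.3f}")
print(f"OBP {hitter.on_base_percentage():.3f}")
print(f"SLG {hitter.slugging_average():.3f}")
print(f"WHIP {whip(walks_allowed=50, hits_allowed=150, outs_recorded=600):.2f}")
```

The sketch does not guard against zero at bats or zero innings, which a real tool would need to handle.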

National Basketball Association (NBA)
Houston Rockets' Daryl Morey was the first NBA general manager to implement advanced metrics as a key aspect of player evaluation. In the years that followed Morey's hiring, the NBA moved quickly to adopt advanced metrics-based player evaluation practices. In 2012, John Hollinger left ESPN to become VP of Basketball Operations for the Memphis Grizzlies.

Beyond professional basketball front offices, major sports media websites such as Basketball Reference are dedicated to the collection, synthesis, and dissemination of advanced metrics to pro and college basketball organizations, sports media members, and fans.

NCAA college basketball
North Carolina, under coach Frank McGuire, was the first known basketball organization to utilize advanced possession metrics to gain a competitive advantage. Since then, sports analytics enthusiasts in basketball have created weighted statistics that measure each player's and each team's on-court efficiency. Most basketball-specific advanced metrics feature a per-minute measurement so that a player's incremental contributions to the team are measured irrespective of usage volume.
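
The per-minute idea can be illustrated with a short sketch. It assumes a "per 36 minutes" scaling, a common convention that the text above does not specify, and uses invented season totals.

```python
def per_36(stat_total: float, minutes_played: float) -> float:
    """Scale a counting stat to a 36-minute basis (assumed convention)."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return stat_total * 36 / minutes_played


# Invented totals: a starter with heavy minutes and a reserve with few.
starter_rate = per_36(stat_total=900, minutes_played=2400)
reserve_rate = per_36(stat_total=300, minutes_played=600)

print(f"starter: {starter_rate:.1f} points per 36 minutes")
print(f"reserve: {reserve_rate:.1f} points per 36 minutes")
```

In this made-up case the reserve scores fewer total points but at a higher rate per minute, which is the kind of difference a raw total hides.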

National Football League (NFL)
In 2003, the sports analytics-focused website Football Outsiders pioneered football's first comprehensive advanced metric, DVOA (defense-adjusted value over average), which compares a player's success on each play to the league average based on a number of variables including down, distance, location on field, current score gap, quarter, and strength of opponent. Football Outsiders' work has since been widely cited by analytical members of the sports media establishment. A few years later, Pro Football Focus launched a comprehensive statistical database, which soon featured a sophisticated player grading system. Advanced Football Analytics (originally Advanced NFL Stats) publishes EPA (expected points added) and WPA (win probability added) for NFL players.

Grantland lead football writer Bill Barnwell created the first metric focused on predicting the future performance of an individual player, the Speed Score, which he referenced in a piece written for Pro Football Prospectus. After analyzing data pertaining to running back success, Barnwell discovered that the most successful running backs at the NFL level were both fast and heavy. Speed Score therefore weights 40-yard dash times by assigning a premium to bigger, often stronger, running backs.
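
The weighting can be sketched as follows. The formula used here, weight times 200 divided by the fourth power of the 40-yard dash time, is the form in which Speed Score is commonly cited; it is not given in the text above and should be treated as an assumption, as should the sample players.

```python
def speed_score(weight_lb: float, forty_time_s: float) -> float:
    """Speed Score in its commonly cited form (assumption, see lead-in)."""
    if forty_time_s <= 0:
        raise ValueError("forty_time_s must be positive")
    return (weight_lb * 200) / forty_time_s ** 4


# Two invented running backs with the same 40-yard dash time.
heavier = speed_score(weight_lb=220, forty_time_s=4.50)
lighter = speed_score(weight_lb=200, forty_time_s=4.50)

print(f"220 lb at 4.50 s: {heavier:.1f}")
print(f"200 lb at 4.50 s: {lighter:.1f}")
```

At equal dash times the heavier back receives the higher score, which reflects the premium on size described above.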

One of the driving forces for the use of sports analytics in the NFL has been the growth of fantasy football. Fantasy sports writer C. D. Carter and peers at XN Sports, NumberFire, and the long-form fantasy football analysis site, Rotoviz.com, have established an informal subculture of fantasy football sports writers who refer to themselves as "degens". The degen movement is responsible for the creation of numerous American football efficiency metrics that better explain past football performances and attempt to predict future player production. Height-adjusted Speed Score, College Dominator Rating, Target Premium, Catch Radius, Net Expected Points (NEP), and Production Premium were recently created and disseminated by degen writers and mathematicians. Building on the work of these writers, sites such as PlayerProfiler.com distill a wide variety of established advanced metrics into a single player snapshot designed to be palatable to the casual sports fan.

National Hockey League (NHL)
The NHL has kept statistics since its inception, yet it is a relatively new adopter of analytics-based decision making. The Toronto Maple Leafs were the first team in the NHL to hire a member of management with a largely analytical background when they hired assistant general manager Kyle Dubas in 2014. Dubas, similar to Theo Epstein in MLB, has never suited up in a professional game and relies on the numbers generated by players on a nightly basis both now and in the past to make decisions.


 * Corsi is an advanced statistic that has been widely adopted throughout the NHL, with teams, fans and media alike relying on it to track shot attempt differential. It has been described as the most informative single statistic in hockey, as it can provide insight into both the offensive and defensive play of a team as well as the amount of time a team has possession of the puck.
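
A shot attempt differential can be sketched as below. The sketch assumes the usual definition of a shot attempt (shots on goal plus missed shots plus blocked shots), which the text above does not spell out; the names and game totals are invented.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """Shot attempts by one team in a game (hypothetical container)."""
    on_goal: int
    missed: int
    blocked: int

    @property
    def total(self) -> int:
        # Assumed definition: every attempt counts, whatever its outcome.
        return self.on_goal + self.missed + self.blocked


def corsi_differential(attempts_for: ShotAttempts,
                       attempts_against: ShotAttempts) -> int:
    """Attempts for minus attempts against."""
    return attempts_for.total - attempts_against.total


def corsi_for_pct(attempts_for: ShotAttempts,
                  attempts_against: ShotAttempts) -> float:
    """Share of all shot attempts taken by the team, from 0 to 1."""
    return attempts_for.total / (attempts_for.total + attempts_against.total)


# Invented single-game totals.
team = ShotAttempts(on_goal=30, missed=12, blocked=10)
opponent = ShotAttempts(on_goal=25, missed=10, blocked=8)

print(corsi_differential(team, opponent))
print(f"{corsi_for_pct(team, opponent):.1%}")
```

A share above 50% means the team generated more attempts than it allowed, which is why the figure is often read as a proxy for puck possession.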

Professional Golfers' Association (PGA) Tour
The PGA Tour collects vast amounts of data throughout the season. These statistics track each shot a player takes in tournament play, collecting information on how far the ball travels and exactly where each shot is played from and where it finishes. These data have been used for a number of years by players and their coaches during practice sessions as well as during tournament preparation, highlighting the areas in which that player needs to improve before teeing it up in tournament play.


 * ShotLink has revolutionized the way that data is collected in the game of golf. Introduced on a full-time basis in 2003, ShotLink relies on a number of strategically placed on-course laser rangefinders and cameras to collect precise data from every shot that is struck on the PGA Tour. With these data, players are able to see the areas of their game that need improving, and on a broader year-to-year basis, players can review course statistics from previous years to allow for relevant tournament preparation. In addition to the year-to-year statistics, players and fans can easily access these statistics on an up-to-the-minute basis. ShotLink has also made its mark on golf course design, as designers have constant access to up-to-the-minute statistics of professional golfers, allowing them to create courses that challenge the world's best players.

Soccer
Soccer teams use tracking data, such as the positional data of the players and the ball, to obtain information about players' conditioning. These data have also been used with artificial intelligence to evaluate attacking performance by estimating goals scored. Other approaches have evaluated dribbling and passing. Research is also under way at Nagoya University into the potential of defender-oriented metrics based on ball recovery and being attacked; these have been applied successfully to data from the Japanese J1 League to predict the strategies used by teams.

History
Many statisticians attribute the popularization of sports analytics to Oakland Athletics general manager Billy Beane. Working with a minimal budget, Beane relied on sabermetrics, a form of sports analytics, to evaluate players and make personnel decisions.

Understanding the importance of getting runners on base, Beane focused on acquiring players with a high on-base percentage, on the logic that teams with a higher on-base percentage are more likely to score runs. He was also able to achieve success on a shoestring budget by acquiring overlooked starting pitchers, often for a fraction of the price that a big-name pitcher would command. When Beane's Athletics began to achieve success, other major league teams took notice.

The second team to adopt a similar approach was the Boston Red Sox, who in 2003 made Theo Epstein the interim general manager. Epstein, who was then the youngest general manager ever hired in MLB, came into the position without any professional playing experience, which was highly irregular at the time. Using an approach similar to Beane's, Epstein built a Red Sox team that in 2004 won the organization's first World Series in 86 years, breaking the alleged Curse of the Bambino. Many experts attribute some of Epstein's success to Red Sox owner John W. Henry, who achieved significant success in the investment industry by using data-based decision making. As owner, Henry gave Epstein significant leeway in data-based decision making and the use of sabermetrics, as he knew the impact that such tools can have in both sports and business. After his success in Boston, Epstein moved on to Chicago, where in 2016 he led the Chicago Cubs to their first World Series title in 108 years.

More recently, teams like the Houston Rockets of the NBA have put a heavy focus on analytics to guide front office and on-court decisions. Daryl Morey, the general manager of the Rockets, decided to emphasize three-point shots and used analytics to support his argument. As a result, the Rockets began shooting many more three-point shots and even traded their budding big man, Clint Capela.

The success of analytics-based strategies and decision making in baseball was noted by executives in other professional sports leagues. Today, almost every professional organization has at least one analytical expert on staff, if not an entire department dedicated to analytics.

Houston Astros (MLB)
The Astros rely heavily on analytics when making decisions. The team has employees with titles such as director of decision sciences, medical risk manager and mathematical modeler. Unlike other professional teams, which typically use analytics solely for player transactions and signings, the Astros have begun to use analytics to decide how they will play on the field, "applying the defensive shift more than any other team in the MLB last season." Using this approach, the Houston Astros captured the first World Series victory in franchise history in 2017.

San Antonio Spurs (NBA)
One of the early adopters of SportVU, the San Antonio Spurs have been using analytics to gain a competitive advantage over opponents for a number of years. As a team, the Spurs have homed in on the importance of the three-pointer and as a result consistently rank among the league leaders in three-point attempts. The team's understanding of the importance of the "three" extends beyond the offensive side of the court, as they are relentless in defending the three-pointer at the defensive end.

Chicago Blackhawks (NHL)
In 2009, the Chicago Blackhawks turned to an outside company to produce analytical assessments for them. Subsequently, the Blackhawks achieved great success in the NHL, winning three Stanley Cups in six seasons. This success has brought a number of difficult decisions for Blackhawks management, who are often able to retain only a core group of players after each Cup run, while other key players receive offers that the Blackhawks cannot match under the NHL's salary cap. By using this analytics-based system, however, the team has repeatedly been able to fill these gaps by finding players who are undervalued by other teams but fit well with the Blackhawks' style of play. A team assembled this way will often seem underwhelming on paper but exceed expectations. The same strategy could be adopted by teams with limited financial freedom to put together a competitive roster. The Blackhawks have refined this process and provide another example of the longevity that can be associated with analytics-based decision making.

Gambling
Sports analytics have had a significant impact on the field of play, but they have also contributed to the growing industry of sports gambling, which accounts for approximately 13% of the global gambling industry. Valued at somewhere between $700 billion and $1 trillion, sports gambling is extremely popular among groups of all kinds, from avid sports fans to recreational gamblers; it is hard to find a professional sporting event with nothing riding on the result. Many gamblers are attracted to sports gambling because of the plethora of information and analytics at their disposal when making decisions. One gambler, Bob Stoll, has been ahead of the analytics curve for a number of years, successfully betting against the line 56% of the time (575–453) in college football, a significant rate given that a winning percentage above 52.4% is considered profitable. With so many statistics openly available to fans, Stoll combines measures such as home and away records, records against divisional and non-divisional teams, and rushing yards per rush to make educated picks that have paid off more than half of the time.
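
The 52.4% threshold can be reproduced with a short calculation. The sketch assumes standard point-spread pricing in which a bettor risks 110 units to win 100, an assumption not stated above, and it ignores pushes; the function names are illustrative.

```python
def break_even_rate(risk: float, win: float) -> float:
    """Win rate at which expected profit is zero for a risk/win price."""
    return risk / (risk + win)


def win_rate(wins: int, losses: int) -> float:
    """Share of decided bets that were won (pushes ignored)."""
    return wins / (wins + losses)


def net_units(wins: int, losses: int,
              risk: float = 110.0, win: float = 100.0) -> float:
    """Net result when every bet risks `risk` units to win `win` units."""
    return wins * win - losses * risk


# Assumed pricing: risk 110 to win 100.
threshold = break_even_rate(risk=110, win=100)

# Record quoted in the text: 575 wins, 453 losses.
rate = win_rate(575, 453)
profit = net_units(575, 453)

print(f"break-even win rate: {threshold:.1%}")
print(f"quoted record:       {rate:.1%}")
print(f"net units at 110/100 pricing: {profit:.0f}")
```

Under these assumptions the quoted record sits a few percentage points above break-even, which is the sense in which it is profitable.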

Results from academic research show evidence that Twitter contains enough information to be useful for predicting outcomes in football games.

With the popularity of sports gambling came the development of a number of sports betting services. "Sports betting services are provided by companies such as William Hill, Ladbrokes, bet365, bwin, Paddy Power, betfair, Unibet and many more through their websites and in many cases betting shops. In 2012, William Hill generated around 2 billion U.S. dollars in revenue with about 30 billion U.S. dollars in total being staked / wagered with the company."

Baseball
In baseball's early days, hitters had no insight into pitchers' pitch-sequence tendencies or spin rates. In today's game, artificial intelligence (AI) and analytics give hitters spin rate and pitch sequence information before the game.

The same tools can also benefit the pitcher. AI and analytics can show which pitches are weaknesses for certain hitters and which parts of the strike zone those hitters struggle with, so the pitcher can aim for those spots to gain an advantage.

Basketball
In basketball's early days, the majority of shots were taken close to the basket. The NBA and other leagues have since implemented a three-point line, allowing players to score 3 points instead of 2 by shooting from farther away. As a result, players have become multi-dimensional and more difficult to defend. AI and analytics can show defenders how to guard a given player based on how well that player shoots from three-point range: if the player does not shoot well from there, the defender can back off and allow the shot.
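
The reasoning behind the three-point trade-off can be sketched with expected points per shot. This is a simplified illustration that ignores fouls, rebounds and turnovers, and the shooting percentages are invented.

```python
def expected_points(make_probability: float, shot_value: int) -> float:
    """Average points per attempt for a shot worth `shot_value` points."""
    return make_probability * shot_value


def break_even_three_pct(two_point_pct: float) -> float:
    """Three-point percentage that matches a given two-point percentage."""
    return two_point_pct * 2 / 3


# Invented shooter: 36% from three-point range, 50% on two-point shots.
from_three = expected_points(0.36, 3)
from_two = expected_points(0.50, 2)

print(f"three-point attempt: {from_three:.2f} points per shot")
print(f"two-point attempt:   {from_two:.2f} points per shot")
print(f"break-even three-point rate: {break_even_three_pct(0.50):.1%}")
```

A defender might apply the same comparison in reverse, conceding the three-point shot only to players whose percentage falls below the break-even rate.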

AI and analytics have also had a big impact on coaching, for example in late-game scenarios, timeout usage, defensive strategy, and the assessment of player impact. Certain NBA teams have coaches whose primary focus is on data and analytics, assisting the head coach in making in-game adjustments.