Challenges of Sports Data Management
In this series of blog posts, we will touch on the technical and procedural challenges the engineering team at Frontrunner has faced as we expand and develop our product.
First, we will look into a core input to our system that drives our product: sports data. This is one of our driving factors for increasing our coverage and providing transparent, liquid, accurate markets.
A Multi-Provider Solution
As the sports betting industry continues to grow, there are more opportunities for improved and innovative data feeds. Besides classic providers, both old and new, such as Sportradar and SportsDataIO, more companies are creating off-shoots or other forms of monetization by selling sports game data, betting odds, and predictive data. Some even specialize in certain leagues or sporting categories. But while each of these has its advantages, each organizes and delivers its data in significantly different ways, even from sport to sport within the same provider.
Given this, it was vital for us to be able to stay flexible from a data parsing perspective to handle different formats, endpoints, and API limits.
For static data — for example, basic team info — we sync the information with our providers and also store our own unique identifiers. This allows us to maintain our own internal IDs for internal logic that will be unaffected if we switch providers. It also makes a multi-provider approach just as seamless, giving us the flexibility to collate stats, odds, and other game info across providers.
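As a rough illustration of the internal ID idea, here is a minimal sketch in Python. The class name, the provider names ("provider_a", "provider_b"), and the provider-side IDs are all hypothetical and are not taken from our codebase; the point is only that internal logic keys off an ID we own, while each provider's ID is a lookup into it.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class TeamRegistry:
    """Maps provider-specific team IDs to IDs we control (hypothetical sketch)."""

    # (provider name, provider's team ID) -> internal team ID
    _by_provider: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # internal team ID -> team name
    _names: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str) -> str:
        """Create a new internal ID for a team and return it."""
        internal_id = str(uuid.uuid4())
        self._names[internal_id] = name
        return internal_id

    def link(self, internal_id: str, provider: str, provider_id: str) -> None:
        """Attach one provider's ID to an existing internal ID."""
        if internal_id not in self._names:
            raise KeyError(f"unknown internal team ID: {internal_id}")
        self._by_provider[(provider, provider_id)] = internal_id

    def resolve(self, provider: str, provider_id: str) -> Optional[str]:
        """Return the internal ID for a provider's ID, or None if unlinked."""
        return self._by_provider.get((provider, provider_id))


registry = TeamRegistry()
team_id = registry.register("Arsenal")
registry.link(team_id, "provider_a", "1234")  # made-up provider IDs
registry.link(team_id, "provider_b", "ARS")
```

With a mapping like this, switching or adding a provider means adding links rather than rewriting anything that depends on the internal ID.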
The other main decision we focused on was separating our parsing and data retrieval logic from our processing and business logic. By keeping retrieval separate, we can extend those services more easily, maintain unique rules and use cases across leagues and providers, and react to third-party changes quickly and effectively. By sanitizing the data at the end of this retrieval process, we give our processing services consistent data to use regardless of its source.
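The separation described above might look something like the following sketch. The payload shapes, field names, and status strings for "provider_a" and "provider_b" are invented for illustration and do not reflect any real provider's API; only the parser functions know about them, and everything downstream sees a single normalized type.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class NormalizedGame:
    """The one shape our processing code sees, whatever the source."""
    provider: str
    provider_game_id: str
    home_team_id: str
    away_team_id: str
    status: str  # one of "scheduled", "in_progress", "final"


# Hypothetical status vocabularies, one per provider.
_STATUS_A = {"Scheduled": "scheduled", "InProgress": "in_progress", "Final": "final"}
_STATUS_B = {"pre": "scheduled", "live": "in_progress", "closed": "final"}


def parse_provider_a(raw: Dict[str, Any]) -> NormalizedGame:
    # Assumed flat payload with numeric IDs.
    return NormalizedGame(
        provider="provider_a",
        provider_game_id=str(raw["GameID"]),
        home_team_id=str(raw["HomeTeamID"]),
        away_team_id=str(raw["AwayTeamID"]),
        status=_STATUS_A[raw["Status"]],
    )


def parse_provider_b(raw: Dict[str, Any]) -> NormalizedGame:
    # Assumed nested payload with string IDs.
    return NormalizedGame(
        provider="provider_b",
        provider_game_id=str(raw["id"]),
        home_team_id=str(raw["competitors"]["home"]),
        away_team_id=str(raw["competitors"]["away"]),
        status=_STATUS_B[raw["state"]],
    )


PARSERS: Dict[str, Callable[[Dict[str, Any]], NormalizedGame]] = {
    "provider_a": parse_provider_a,
    "provider_b": parse_provider_b,
}


def retrieve(provider: str, raw: Dict[str, Any]) -> NormalizedGame:
    """Retrieval boundary: provider-specific data in, sanitized data out."""
    return PARSERS[provider](raw)


def is_live(game: NormalizedGame) -> bool:
    """Example of processing logic that never looks at provider formats."""
    return game.status == "in_progress"
```

In a layout like this, a provider changing its format touches one parser function and nothing else.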
Onboarding New Leagues
In order to grow our user base and follow the sports calendar, it has been very important for us to add markets and features for various sports leagues. Since the summer, we have introduced MLB, soccer (English Premier League), NFL, and NBA markets as the new seasons have kicked off.
With our data retrieval service already organized separately from the rest of our code, we were able to focus on the similarities across sports and leagues. Leagues within the same sport are quite similar. Even across sports, basic game status and scheduling are pretty similar, but different sports have vastly different types of stats.
We landed on organizing by sport, with league-specific handling where needed. For stats, we found key similarities across all team sports: home and away team scores, team records, and an overall concept of a clock. Each sport and league handles these slightly differently, but our frontend applications are able to accept this information in common formats. Splitting by sport also gave us the extra benefit of isolating each league from the others in case of any issues.
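To make the "common format" idea concrete, here is one possible shape for a shared scoreboard, sketched in Python. The type names, fields, and period labels are assumptions for illustration, not our actual schema. The clock is the part that varies most: some sports count down, some count up, and some have no timed clock at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clock:
    period: str                    # e.g. "Q3", "2nd Half", "Top 7th"
    seconds: Optional[int] = None  # None when the sport has no timed clock
    counts_down: bool = True       # False for clocks that count up


@dataclass(frozen=True)
class TeamLine:
    score: int
    wins: int
    losses: int
    ties: int = 0


@dataclass(frozen=True)
class Scoreboard:
    sport: str
    league: str
    home: TeamLine
    away: TeamLine
    clock: Clock


def clock_label(clock: Clock) -> str:
    """Render any sport's clock as one display string."""
    if clock.seconds is None:
        return clock.period
    minutes, seconds = divmod(clock.seconds, 60)
    return f"{clock.period} {minutes}:{seconds:02d}"


board = Scoreboard(
    sport="basketball",
    league="NBA",
    home=TeamLine(score=88, wins=10, losses=4),
    away=TeamLine(score=91, wins=7, losses=7),
    clock=Clock(period="Q4", seconds=125),
)
```

A frontend consuming this shape can render a score line and a clock without knowing which sport it is looking at.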
Another interesting problem we faced was the standard display for different sports. American sports commonly follow an ‘Away Team @ Home Team’ setup, whereas the EPL follows a ‘Home Team v Away Team’ format. Note the difference in both ordering and delimiter. Creating these display rules helped us stay generic in assigning statistics (such as the home team’s score) and specific in representation.
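A display rule of this kind can be expressed as a small lookup, as in the sketch below. The rule table and function name are hypothetical; the two formats themselves are the ones described above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DisplayRule:
    home_first: bool  # True: home team is listed first
    delimiter: str    # text placed between the two team names


# Hypothetical per-league rule table.
DISPLAY_RULES: Dict[str, DisplayRule] = {
    "MLB": DisplayRule(home_first=False, delimiter="@"),
    "NFL": DisplayRule(home_first=False, delimiter="@"),
    "NBA": DisplayRule(home_first=False, delimiter="@"),
    "EPL": DisplayRule(home_first=True, delimiter="v"),
}


def matchup_title(league: str, home: str, away: str) -> str:
    """Build the display title; callers always pass home and away by role."""
    rule = DISPLAY_RULES[league]
    first, second = (home, away) if rule.home_first else (away, home)
    return f"{first} {rule.delimiter} {second}"
```

Because callers always pass teams by role, stats stay attached to "home" and "away" everywhere else in the system, and only this one function decides the order on screen.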
Games vs Futures Markets
As we continue to expand the types of markets we provide, from single-game winners to season-long futures to per-game props and beyond, a large focus is on obtaining and providing relevant stats to our users and market-maker algorithms. Sportsbooks inherently give little information about the events people bet on, but we want to help our users succeed without having to go to external sources, such as ESPN.
However, keeping up to date with all the stat changes can be expensive on the wallet and on the system. Using various timers and frequencies for retrieving data based on the state of games and seasons at large helps keep our system from constantly sending and processing requests.
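One way to picture state-based polling is a function that picks an interval from a game's status and start time. The statuses, thresholds, and intervals below are illustrative assumptions only, not the values we run in production.

```python
from datetime import datetime, timedelta, timezone


def poll_interval(status: str, starts_at: datetime, now: datetime) -> timedelta:
    """Choose how long to wait before the next data request for a game."""
    if status == "in_progress":
        # Live games change constantly, so poll often.
        return timedelta(seconds=10)
    if status == "final":
        # Finished games rarely change, apart from occasional stat corrections.
        return timedelta(hours=6)

    # Scheduled games: poll more often as the start time approaches.
    until_start = starts_at - now
    if until_start <= timedelta(hours=1):
        return timedelta(minutes=1)
    if until_start <= timedelta(days=1):
        return timedelta(minutes=30)
    return timedelta(hours=12)


now = datetime(2022, 11, 1, 18, 0, tzinfo=timezone.utc)
next_check = now + poll_interval("scheduled", now + timedelta(minutes=30), now)
```

The same idea extends to whole seasons, for example by polling an offseason league far less often than one in the middle of its schedule.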
Along with this, organizing our data into infrequently and frequently updated buckets is on our roadmap. Showing a team’s season-long stats and trends year over year can be really valuable in assessing whether they might win the championship in six months, while per-game stats are vital for player game props.
In addition to some of the next steps already discussed, a few more data-driven features include being able to tie a sharp change in price in a market to a specific sporting event or being able to receive notifications and suggestions based on player performance or injury news.
As more services develop with creative integrations and supplemental statistics, we’ll want to be ready to onboard or migrate efficiently. As much as we have ideas of which sports, leagues, and propositions we want to add to Frontrunner, we need to first make sure we have the data to support them as quickly and easily as possible.
Sign up at https://getfrontrunner.com and make your first trade today.
Frontrunner is a decentralized sports prediction market where users can buy shares of sports propositions and trade them like they would stocks. Unlike traditional sportsbooks where users place a bet and wait, Frontrunner gives users full control over their portfolios, allowing them to dynamically buy and sell positions as the odds change. By leveraging the power of free markets and the blockchain, we create transparent markets and liquid positions to reduce counterparty risk and fundamentally change the way that people invest in their sports knowledge and beliefs.