Drafting a fantasy football team

Published

September 21, 2023

Modified

January 31, 2024

Cartola is a fantasy football league that follows the Brazilian Série A championship.

Cartola offers a public API to access data for the current round. A couple of years ago, I created a script to automate data retrieval to a repository, which now hosts comprehensive historical data since 2022.

In this post, I will delve into the data for the 2022 season, formulate a mixed integer linear program to draft the optimal team, and present initial concepts for forecasting player scores using mixed effects linear models.

The game

We begin the season with a budget of C$ 100, the game’s paper currency.

Each round is preceded by a market session, where players are assigned a value. We are tasked with forming a team of 11 players plus a coach, all within our budget and adhering to a valid formation. A captain must be chosen from among the players, excluding the coach.

The market is available until the round starts. Players then earn scores based on their real-life match performances. Our team’s score is the aggregate of our players’ scores, with our captain’s score doubled in the 2022 season.
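As a quick illustration of the scoring rule, a minimal sketch (the function and sample scores are hypothetical, not part of Cartola’s API):

```python
def team_score(scores: dict[int, float], captain: int) -> float:
    """Team score for 2022: sum of player scores, with the captain counted twice."""
    return sum(scores.values()) + scores[captain]

# Three players identified by id, player 42 as captain
print(team_score({42: 8.0, 7: 3.5, 10: -1.0}, captain=42))  # 18.5
```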

Following the conclusion of the round, player values are recalibrated based on performance: they increase for scores above the player’s average and decrease for below-average performances. Our budget for the next round is our previous budget plus the sum of our players’ value variations.

Data wrangling

Let’s talk about data structures: each round has a market, and each market is a list of players. A player is a structure like this:

Player(round=0, player=42234, team=264, position=1, games=0, average=0.0, value=10.0, score=0.0, appreciation=0.0, minimum=4.53)

Let’s get the list of markets for 2022 and flatten it into a single DataFrame:

+-------------------------------------------------------------------------------+
| round   player   team   position   …   value   score   appreciation   minimum |
+===============================================================================+
| 1       37424    1371   6          …   3.0     0.0     0.0            0.0     |
| 1       37646    314    3          …   5.0     0.0     0.0            2.3     |
| 1       37656    266    1          …   9.0     0.0     0.0            4.08    |
| …       …        …      …          …   …       …       …              …       |
| 38      121398   354    4          …   1.0     0.0     0.0            0.0     |
| 38      121399   354    4          …   1.0     0.0     0.0            0.0     |
| 38      121400   354    5          …   1.0     0.0     0.0            0.0     |
+-------------------------------------------------------------------------------+
shape: (30_063, 10)

Now, let’s focus on a specific player to illustrate our data while we wrangle it:

+-------------------------------------------------------------------------------+
| round   player   team   position   …   value   score   appreciation   minimum |
+===============================================================================+
| 1       42234    264    1          …   10.0    0.0     0.0            4.53    |
| 2       42234    264    1          …   7.93    2.0     -2.07          5.52    |
| 3       42234    264    1          …   10.44   11.0    2.51           4.75    |
| …       …        …      …          …   …       …       …              …       |
| 36      42234    264    1          …   11.51   0.0     0.03           3.63    |
| 37      42234    264    1          …   12.68   0.0     1.17           9.29    |
| 38      42234    264    1          …   11.06   0.0     -1.62          1.37    |
+-------------------------------------------------------------------------------+
shape: (38, 10)

Filtering participation

Players show up in the market even in rounds they do not play in. However, for our analysis, we are only interested in players who actually played a game in the round.

Each player has a status field intended to indicate their participation in the round. However, this field is often inaccurate, likely due to the API data being updated before the round.

One solution is to keep only rows where there is an increase in the number of games the player has played:

+------------------------+
| round   player   games |
+========================+
| 1       42234    0     |
| 2       42234    1     |
| 3       42234    2     |
| …       …        …     |
| 36      42234    28    |
| 37      42234    29    |
| 38      42234    30    |
+------------------------+
shape: (31, 3)

Imputing scores

Similarly, the player score field is often inaccurate, likely for the same reasons as the status field. Fortunately, the average field is reliable, allowing us to recover the score:

\[ \begin{align*} \mathrm{Average}(\mathbf{s}_{1:t}) &= \frac{\mathrm{Average}(\mathbf{s}_{1:(t-1)}) + s_t}{2} \\ s_t &= 2 \mathrm{Average}(\mathbf{s}_{1:t}) - \mathrm{Average}(\mathbf{s}_{1:(t-1)}), \end{align*} \]

where \(\mathbf{s}\) is the vector of scores for a given player across all rounds.

+----------------------------------+
| round   player   score   average |
+==================================+
| 1       42234    2.0     2.0     |
| 2       42234    11.0    6.5     |
| 3       42234    9.5     8.0     |
| …       …        …       …       |
| 36      42234    5.1     4.96    |
| 37      42234    4.62    4.79    |
| 38      42234    4.79    4.79    |
+----------------------------------+
shape: (31, 4)

Adding fixtures

Let’s fetch the list of fixtures to enrich our dataset. A fixture is an object like:

Fixture(round=1, home=282, away=285)

Let’s consolidate these fixtures into a single DataFrame and then pivot them into a long format:

+------------------------------+
| round   team   versus   home |
+==============================+
| 1       282    285      1    |
| 1       266    277      1    |
| 1       276    293      1    |
| …       …      …        …    |
| 38      276    290      0    |
| 38      294    1371     0    |
| 38      263    293      0    |
+------------------------------+
shape: (760, 4)

Finally, let’s join this data to our dataset:

+---------------------------------------+
| round   player   team   versus   home |
+=======================================+
| 1       42234    264    263      0    |
| 2       42234    264    314      1    |
| 3       42234    264    275      0    |
| …       …        …      …        …    |
| 36      42234    264    354      1    |
| 37      42234    264    294      0    |
| 38      42234    264    282      1    |
+---------------------------------------+
shape: (31, 5)

Aligning variables

In our subsequent analysis, the average field will exclude the score from the given round, so it only contains information available before the round starts. Additionally, the appreciation field will be shifted so it reflects the value change caused by the round’s score.

+---------------------------------------------------------+
| round   player   average   value   score   appreciation |
+=========================================================+
| 1       42234    0.0       10.0    2.0     -2.07        |
| 2       42234    2.0       7.93    11.0    2.51         |
| 3       42234    6.5       10.44   9.5     1.25         |
| …       …        …         …       …       …            |
| 36      42234    4.82      11.51   5.1     1.17         |
| 37      42234    4.96      12.68   4.62    -1.62        |
| 38      42234    4.79      11.06   4.79    0.0          |
+---------------------------------------------------------+
shape: (31, 6)

Team picking

Now let’s solve the problem of picking the best team in a given market. Let \(\mathcal{F}\) be the set of valid formations; then, for each formation \(f \in \mathcal{F}\), solve:

\[ \begin{equation*} \begin{array}{ll@{}ll} \text{maximize} & \displaystyle \hat{\mathbf{s}}^T \mathbf{x}, & \mathbf{x} \in \{0, 1\}^n \\ \text{subject to} & \displaystyle \mathbf{v}^T \mathbf{x} \leq b \\ & \displaystyle \mathbf{P}^T \mathbf{x} = f, \\ \end{array} \end{equation*} \]

where

  • \(\mathbf{x}\) is the binary vector of player picks in the market;
  • \(\hat{\mathbf{s}}\) is the vector of predicted player scores in the market;
  • \(\mathbf{v}\) is the vector of player values in the market;
  • \(b\) is our available budget for that round;
  • \(\mathbf{P}\) is the matrix of dummy-encoded player positions in the market.

Finally, take the solution with the highest objective.

from typing import List

import numpy as np
import pulp
from pydantic import BaseModel, Field


class Formation(BaseModel):
    goalkeeper: int = Field(alias="gol")
    defender: int = Field(alias="zag")
    winger: int = Field(alias="lat")
    midfielder: int = Field(alias="mei")
    forward: int = Field(alias="ata")
    coach: int = Field(alias="tec")


class Problem(BaseModel):
    scores: List[float]
    values: List[float]
    budget: float
    positions: List[List[int]]
    formations: List[Formation]

    def solve(self) -> np.ndarray:
        # Solve one problem per valid formation and keep the best objective
        formations = [list(f.model_dump().values()) for f in self.formations]
        problems = [self.construct(f) for f in formations]
        for problem in problems:
            problem.solve(pulp.PULP_CBC_CMD(msg=False))
        objectives = [p.objective.value() for p in problems]
        best = int(np.argmax(objectives))
        picks = np.array([v.value() for v in problems[best].variables()])
        return picks

    def construct(self, formation: List[int]) -> pulp.LpProblem:
        n = len(self.scores)
        m = len(formation)
        problem = pulp.LpProblem("team_picking", pulp.LpMaximize)
        # One binary pick variable per player, zero-padded for stable ordering
        indexes = ["pick_" + str(i).zfill(len(str(n))) for i in range(n)]
        picks = [pulp.LpVariable(i, cat=pulp.const.LpBinary) for i in indexes]
        # Objective: maximize the total predicted score
        problem += pulp.lpDot(picks, self.scores)
        # Budget constraint
        problem += pulp.lpDot(picks, self.values) <= self.budget
        # Exactly the required number of players in each position
        for i in range(m):
            problem += pulp.lpDot(picks, self.positions[i]) == formation[i]
        return problem

Backtesting

By solving the team picking problem for all rounds, we can backtest our performance in the season. Before backtesting, let’s get the set of valid formations \(\mathcal{F}\):

[Formation(goalkeeper=1, defender=3, winger=0, midfielder=4, forward=3, coach=1),
 Formation(goalkeeper=1, defender=3, winger=0, midfielder=5, forward=2, coach=1),
 Formation(goalkeeper=1, defender=2, winger=2, midfielder=3, forward=3, coach=1),
 Formation(goalkeeper=1, defender=2, winger=2, midfielder=4, forward=2, coach=1),
 Formation(goalkeeper=1, defender=2, winger=2, midfielder=5, forward=1, coach=1),
 Formation(goalkeeper=1, defender=3, winger=2, midfielder=3, forward=2, coach=1),
 Formation(goalkeeper=1, defender=3, winger=2, midfielder=4, forward=1, coach=1)]

Knowing our formation constraints, we’re ready to backtest. Starting with a budget of C$ 100, for each round let’s:

  1. Predict each player’s score based on their performance on previous rounds;
  2. Pick the team with the best total score;
  3. Add the sum of the team players’ appreciation to our budget.

from typing import Callable

import polars as pl


def backtest(
    players: pl.DataFrame, predict: Callable, initial_budget: float = 100.0
) -> pl.DataFrame:
    rounds = players.get_column("round").max()
    budget = [None] * rounds
    teams = [None] * rounds
    budget[0] = initial_budget
    appreciation = 0.0
    for round in range(rounds):
        if round > 0:
            # Budget carries over, adjusted by last round's appreciation
            budget[round] = budget[round - 1] + appreciation
        # Train on rounds before the current one, predict for the current market
        data = players.filter(pl.col("round") < round + 1)
        candidates = players.filter(pl.col("round") == round + 1)
        candidates = predict(data, candidates)
        problem = Problem(
            scores=candidates.get_column("prediction"),
            values=candidates.get_column("value"),
            positions=candidates.get_column("position").to_dummies(),
            budget=budget[round],
            formations=formations,  # the global list of valid formations above
        )
        picks = problem.solve()
        team = candidates.filter(pl.Series(picks == 1))
        teams[round] = team
        appreciation = team.get_column("appreciation").sum()
    return pl.concat(teams)

Before exploring predictions, we’ll begin with a few hypothetical backtests that use the actual observed scores for team selection. Backtesting this strategy, this is our team in the first round:

+-----------------------------------------------------------------------------+
| round   player   team   position   …   minimum   versus   home   prediction |
+=============================================================================+
| 1       71571    356    1          …   3.19      1371     1      11.0       |
| 1       42145    294    2          …   2.75      290      1      15.8       |
| 1       105584   264    2          …   2.75      263      0      10.5       |
| …       …        …      …          …   …         …        …      …          |
| 1       89840    276    5          …   5.42      293      1      27.1       |
| 1       104530   294    5          …   2.3       290      1      11.0       |
| 1       97341    276    6          …   0.0       293      1      9.52       |
+-----------------------------------------------------------------------------+
shape: (12, 13)

And we can plot our cumulative performance during the season:

This might seem like a perfect campaign at first, but it’s possible that, early in the season, we didn’t have enough budget to pick the best scoring teams. To test this hypothesis, we backtest the same strategy with unlimited budget from the start:

Both runs are nearly identical, which is evidence that focusing on appreciation matters little if we have accurate score predictions. If we predict scores perfectly, we get a near-perfect run.

To put our backtests into perspective, the 2022 season champion had a total score of 3434.37. This is very impressive, and not far from the near-perfect run.

Score prediction

For each round, we must predict \(\hat{\mathbf{s}}\), the vector of score predictions, using data from previous rounds.

However, in the first round, we don’t have any previous data to train our model, so we need to inject prior information. One way would be to use data from previous seasons. However, there is a variable where this information is already encoded: the player value. Each season starts with players valued according to their past performance. Knowing this, all our models start with \(\hat{\mathbf{s}} = \mathbf{v}\) in the first round.

Let’s use Bambi (Capretto et al. 2022) and its default priors to fit our models. We won’t delve into convergence diagnostics, since we are more interested in the average of the posterior predictive distributions, and the backtest itself is a measure of prediction quality.

Capretto, Tomás, Camen Piho, Ravin Kumar, Jacob Westfall, Tal Yarkoni, and Osvaldo A Martin. 2022. “Bambi: A Simple Interface for Fitting Bayesian Linear Models in Python.” Journal of Statistical Software 103 (15): 1–29. https://doi.org/10.18637/jss.v103.i15.
One question that arises here is: why not use non-parametric models such as gradient-boosted trees or neural nets? After some experimentation, I concluded they are not a good fit for this problem, either because they assume independence between observations or because they are too data hungry. Also, tuning these models against backtests might lead us down a rabbit hole (Bailey et al. 2013).

Bailey, David H., Jonathan M. Borwein, Marcos Lopez de Prado, and Qiji Jim Zhu. 2013. “The Probability of Back-Test over-Fitting.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2326253.

Player average

\[ \begin{align*} \mathbf{\hat{s}} = \mathbf{Z} \mathbf{\beta} \\ \mathbf{s} \sim N(\mathbf{\hat{s}}, \sigma), \end{align*} \]

where \(\mathbf{Z}\) is a dummy-encoded matrix of players; \(\mathbf{\beta}\) is a vector of parameters for each player.

In this model, \(\mathbf{\beta}\) is simply a vector of player averages. Let’s also assume that players who show up in the middle of the season have an average of zero before their first round. This will be our baseline model.

Player random effects

\[ \begin{align*} \mathbf{\hat{s}} = \alpha + \mathbf{Z} \mathbf{b} \\ \mathbf{b} \sim N(0, \sigma_b), \end{align*} \]

where \(\alpha\) is an intercept and \(\mathbf{b}\) is a vector of player random effects.

This model performs significantly better than the average model, possibly because of the partial pooling between the random effects, which pulls large effects toward the overall mean (Clark 2019). In our dataset, it’s common for players who played only one or two games to have large averages by chance.

Clark, Michael. 2019. “Michael Clark: Shrinkage in Mixed Effects Models.” https://m-clark.github.io/posts/2019-05-14-shrinkage-in-mixed-models/.

Fixture mixed effects

\[ \mathbf{\hat{s}} = \alpha + \mathbf{X} \mathbf{\beta} + \mathbf{Z} \mathbf{b}, \]

where \(\mathbf{X}\) is a matrix of the dummy-encoded fixture variables: the player team, whether they are playing at home, and their adversary team variables; \(\mathbf{\beta}\) is a vector of fixed effects.

This model brings more context to our predictions. It also provides a reasonable way to predict a new player, by setting their \(b = 0\) (the mean of the random effects). However, it does not improve significantly over our random effects model.

Conclusion

We developed a comprehensive framework for the fantasy football team picking problem. There are more ideas we could explore to improve our chances of winning:

  • enriching our data and models with player scouts;
  • including more information in our priors;
  • testing strategies that balance predicted score and appreciation;
  • further model diagnostics.

However, I suspect expert human predictions still hold an edge over those of hobbyist statistical models in fantasy leagues, since all sorts of relevant data are unavailable in public datasets.

At least, this seems to be the case for Brazilian soccer, also known as “a little box of surprises”.

Citation

BibTeX citation:
@online{assunção2023,
  author = {Assunção, Luís},
  title = {Drafting a Fantasy Football Team},
  date = {2023-09-21},
  url = {https://assuncaolfi.github.io/site/blog/fantasy-football},
  langid = {en}
}
For attribution, please cite this work as:
Assunção, Luís. 2023. “Drafting a Fantasy Football Team.” September 21, 2023. https://assuncaolfi.github.io/site/blog/fantasy-football.