footix.utils package

Submodules

footix.utils.decorators module

footix.utils.decorators.verify_required_column(column_names)[source]

Decorator that validates the presence of required columns in a pandas DataFrame.

The decorator inspects both positional and keyword arguments. If a pd.DataFrame is supplied under the name df (positionally or by keyword), it checks that all names in column_names are present. If any columns are missing, it raises a ValueError with a clear message.

Parameters

column_names : Iterable[str]

An iterable of column names that must exist in the DataFrame.

Returns

Callable[[Callable[P, R]], Callable[P, R]]

A decorator that wraps the target function with the column check.

Parameters:

column_names (Iterable[str])

Return type:

Callable[[Callable[[~P], R]], Callable[[~P], R]]
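
The behaviour described above can be sketched as follows. This is a minimal illustration, not the footix implementation: require_columns, FakeFrame and count_columns are hypothetical names, and the check accepts any object exposing a .columns attribute (rather than testing for pd.DataFrame) so that the example runs without pandas installed.

```python
import functools
import inspect
from typing import Callable, Iterable


def require_columns(column_names: Iterable[str]) -> Callable:
    """Hypothetical stand-in for verify_required_column (not footix's code)."""
    required = list(column_names)

    def decorator(func: Callable) -> Callable:
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind positional and keyword arguments to parameter names so that
            # `df` is found however the caller passed it.
            bound = signature.bind_partial(*args, **kwargs)
            df = bound.arguments.get("df")
            if df is not None and hasattr(df, "columns"):
                missing = [name for name in required if name not in df.columns]
                if missing:
                    raise ValueError(f"Missing required columns: {missing}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


class FakeFrame:
    """Tiny stand-in exposing .columns, so the sketch runs without pandas."""

    def __init__(self, columns):
        self.columns = list(columns)


@require_columns(["HomeTeam", "AwayTeam"])
def count_columns(df):
    return len(df.columns)
```

Calling count_columns(FakeFrame(["HomeTeam"])) raises a ValueError that mentions AwayTeam, whether the frame is passed positionally or as df=....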

footix.utils.team_name_resolver module

Robust team name resolver for matching calendar names to training names.

Combines a static YAML mapping (hand-curated, per-league) with rapidfuzz fuzzy matching (WRatio). When the fuzzy confidence is insufficient and the resolver is in interactive mode, the user is prompted to confirm or provide the correct training name. Confirmed mappings are persisted back to the YAML so subsequent runs require no interaction for the same team names.

Usage:

>>> from footix.utils.team_name_resolver import TeamNameResolver
>>> resolver = TeamNameResolver(league="ligue_1", interactive=True)
>>> mapping = resolver.resolve(calendar_names=[...], training_names=[...])

exception footix.utils.team_name_resolver.UnresolvedTeamNameError(message, team_name, candidates)[source]

Bases: ValueError

Raised when a team name cannot be resolved in non-interactive mode.

Return type:

None

team_name

The unresolved calendar team name.

candidates

Top fuzzy candidates as (name, score) tuples.

class footix.utils.team_name_resolver.TeamNameResolver(league, mapping_dir=None, interactive=True, auto_threshold=90.0, confirm_threshold=70.0)[source]

Bases: object

Map calendar-format team names to model-training team names.

Resolution priority:
  1. Static YAML mapping (case-insensitive exact match).

  2. Exact case-insensitive match in training names.

  3. rapidfuzz WRatio ≥ auto_threshold → auto-accept.

  4. rapidfuzz WRatio ≥ confirm_threshold → interactive confirm (if interactive=True) or UnresolvedTeamNameError otherwise.

  5. Below confirm_threshold → interactive selection from top-5 candidates or UnresolvedTeamNameError in non-interactive mode.

New mappings discovered during a run are persisted to the YAML file so subsequent runs skip the interactive step for the same team names.

Parameters:
  • league (str) – League YAML key, e.g. "ligue_1", "ligue_2", "bundesliga_1". Can also be a full competition string such as "FRA Ligue 1" — it will be converted automatically.

  • mapping_dir (Path | None) – Directory containing <league>.yaml mapping files. Defaults to <repo_root>/data/team_name_mappings/.

  • interactive (bool) – When True, ambiguous matches trigger a CLI prompt. Set to False for CI / non-interactive pipelines; UnresolvedTeamNameError is raised instead.

  • auto_threshold (float) – rapidfuzz WRatio score (0–100) at or above which a match is accepted automatically. Default 90.

  • confirm_threshold (float) – rapidfuzz WRatio score (0–100) at or above which a match is proposed for interactive confirmation. Default 70.

resolve(calendar_names, training_names)[source]

Build a full mapping from calendar_names to training_names.

Each calendar name goes through the resolution priority chain (static → exact → fuzzy auto → interactive / error) and the result is returned as a dictionary. Any newly confirmed mappings are persisted to the YAML before this method returns.

Parameters:
  • calendar_names (list[str]) – Team names as they appear in the fixture/calendar CSV.

  • training_names (list[str]) – Team names as they appear in the training dataset (football-data.co.uk convention).

Returns:

Dict mapping every calendar name to its resolved training name. Skipped names are mapped to themselves.

Raises:

UnresolvedTeamNameError – If interactive=False and a name has no static or exact match and no fuzzy match scoring at least auto_threshold.

Return type:

dict[str, str]
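
The five-step resolution priority can be illustrated with a small classifier that reports which tier a calendar name would fall into. This is a sketch under stated assumptions, not the resolver's actual code: classify and score are hypothetical helpers, the team lists are made-up sample data, and the standard library's difflib.SequenceMatcher stands in for rapidfuzz WRatio, so real scores will differ.

```python
from difflib import SequenceMatcher


def score(a: str, b: str) -> float:
    """Similarity on a 0-100 scale; difflib stands in for rapidfuzz WRatio."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(name, training_names, static_map,
             auto_threshold=90.0, confirm_threshold=70.0):
    """Return (tier, candidate) for one calendar name (hypothetical helper)."""
    # 1. Static mapping, compared case-insensitively.
    static_lower = {key.lower(): value for key, value in static_map.items()}
    if name.lower() in static_lower:
        return "static", static_lower[name.lower()]
    # 2. Exact case-insensitive match among the training names.
    for candidate in training_names:
        if candidate.lower() == name.lower():
            return "exact", candidate
    # 3-5. Fuzzy tiers, decided by the best candidate's score.
    best = max(training_names, key=lambda candidate: score(name, candidate))
    best_score = score(name, best)
    if best_score >= auto_threshold:
        return "auto", best      # accepted without asking
    if best_score >= confirm_threshold:
        return "confirm", best   # prompt, or error when non-interactive
    return "select", best        # pick from top candidates, or error


training = ["Paris SG", "Marseille", "Lyon", "St Etienne"]
static = {"Paris Saint-Germain": "Paris SG"}

print(classify("paris saint-germain", training, static))  # ('static', 'Paris SG')
print(classify("LYON", training, static))                 # ('exact', 'Lyon')
print(classify("Marseilles", training, static))           # ('auto', 'Marseille')
print(classify("Saint Etienne", training, static))        # ('confirm', 'St Etienne')
```

In this sketch only the "static", "exact" and "auto" tiers would resolve silently; the "confirm" and "select" tiers correspond to the cases where the documented resolver prompts the user or raises UnresolvedTeamNameError.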

footix.utils.typing module

class footix.utils.typing.ProtoModel(*args, **kwargs)[source]

Bases: Protocol

fit(*args, **kwargs)[source]
Return type:

Any

predict(HomeTeam, AwayTeam)[source]
Parameters:
  • HomeTeam (str)

  • AwayTeam (str)

Return type:

Any
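
Because ProtoModel is a Protocol, a class can satisfy it structurally, by providing matching fit and predict methods, without inheriting from it. The sketch below mirrors the two documented methods in a local protocol so that it runs on its own; ModelLike, UniformModel and predict_fixture are hypothetical names, and the tuple returned by predict is an arbitrary choice, since the documented return type is Any.

```python
from typing import Any, Protocol


class ModelLike(Protocol):
    """Local mirror of the two documented methods (hypothetical name)."""

    def fit(self, *args: Any, **kwargs: Any) -> Any: ...

    def predict(self, HomeTeam: str, AwayTeam: str) -> Any: ...


class UniformModel:
    """Toy model: no inheritance from ModelLike, only matching methods."""

    def fit(self, *args: Any, **kwargs: Any) -> "UniformModel":
        return self

    def predict(self, HomeTeam: str, AwayTeam: str) -> Any:
        # Same probability for home win, draw and away win.
        return (1 / 3, 1 / 3, 1 / 3)


def predict_fixture(model: ModelLike, home: str, away: str) -> Any:
    # A static type checker accepts UniformModel here because its methods match.
    return model.predict(home, away)
```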

class footix.utils.typing.RPSResult(z_score, mean, std_dev)[source]

Bases: NamedTuple

Named tuple for Ranked Probability Score statistics.

Parameters:
z_score: float

Alias for field number 0

mean: float

Alias for field number 1

std_dev: float

Alias for field number 2

class footix.utils.typing.ProbaResult(proba_home, proba_draw, proba_away)[source]

Bases: NamedTuple

Named tuple for match outcome probabilities (home win, draw, away win).

Parameters:
proba_home: float

Alias for field number 0

proba_draw: float

Alias for field number 1

proba_away: float

Alias for field number 2
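
"Alias for field number N" means each field can be read either by name or by position. The following sketch shows this on a local NamedTuple with the same three fields; OutcomeProbabilities is a hypothetical stand-in used so the example runs without footix installed, and the probabilities are made-up values.

```python
from typing import NamedTuple


class OutcomeProbabilities(NamedTuple):
    """Local mirror of the documented fields (hypothetical name)."""

    proba_home: float
    proba_draw: float
    proba_away: float


probs = OutcomeProbabilities(proba_home=0.5, proba_draw=0.25, proba_away=0.25)

# Access by name and by position refer to the same value.
assert probs.proba_home == probs[0]

# Unpacking follows the field order: home, draw, away.
home, draw, away = probs
```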

class footix.utils.typing.SampleProbaResult(proba_home, proba_draw, proba_away)[source]

Bases: NamedTuple

A NamedTuple representing the probability results for a match outcome.

Parameters:
proba_home

Array of probabilities for the home team winning.

Type:

np.ndarray

proba_draw

Array of probabilities for a draw.

Type:

np.ndarray

proba_away

Array of probabilities for the away team winning.

Type:

np.ndarray

proba_home: ndarray

Alias for field number 0

proba_draw: ndarray

Alias for field number 1

proba_away: ndarray

Alias for field number 2

footix.utils.utils module

footix.utils.utils.poisson_model_recap(home_team, away_team, model)[source]
Return type:

None

Module contents

Utility functions and helpers for Footix.

This module provides common utilities including type definitions, decorators, and helper functions used across the Footix package.

Submodules:
  • typing: Type definitions and aliases

  • team_name_resolver: Robust calendar-to-training team name resolver

class footix.utils.TeamNameResolver(league, mapping_dir=None, interactive=True, auto_threshold=90.0, confirm_threshold=70.0)[source]

Bases: object

Re-exported at package level. The documentation is identical to that of footix.utils.team_name_resolver.TeamNameResolver above, including the resolution priority, the constructor parameters, and resolve(calendar_names, training_names).

exception footix.utils.UnresolvedTeamNameError(message, team_name, candidates)[source]

Bases: ValueError

Re-exported at package level. The documentation is identical to that of footix.utils.team_name_resolver.UnresolvedTeamNameError above, including the team_name and candidates attributes.