code

Data Modeling: SQLite + Python - Part 3

Robin Michael

06 Aug 2025 • 6 min read

This is a continuation of Sqlite in memory data structures used for general problem solving. As this is a continuation I am going to skip over some of the background details on why to use in memory sqlite and how to create and get to the harder problem that this solves elegantly.

Introduction

In Australian Football playoff teams are determined by wins and tiebreaker condition of match competitiveness. Match competitiveness is calculated by the formula of 100 * (points scored / points allowed) For example a win counts as 4 points, a tie 2 points and a loss 0 points. If multiple teams have the same match points then teams are ranked by the tiebreaker condition listed above.

Say two teams Team A, Team B with identical match points of 6 but one Team A has scored 100 points and allowed 50 in those matches while Team B scored 100 and allowed 99. In this situation the quality of Team A matches have been more in their favor than Team B by nature of scoring more points than allowing. This could happen as Team B luckily won or tied a match that there were losing up to a point whereas Team A won convincingly.

All teams will play each other with 1 team marked as home and the other as away. Therefore if you have 4 teams they will all play 6 (3 as home and 3 as away) matches to play against every other team in this format.

At the end of the season you could have a scenario like below. In this case the top 2 teams are advancing to the playoffs which requires no tiebreaker East Coast, Swans advance

team_name	opponent_name	score_for	score_against	home_result	away_result
east coast	essendon	102	42	east coast - win	essendon - loss
east coast	swans	112	81	east coast - win	swans - loss
east coast	tigers	81	36	east coast - win	tigers - loss
essendon	east coast	37	55	essendon - loss	east coast - win
essendon	swans	44	50	essendon - loss	swans - win
essendon	tigers	111	88	essendon - win	tigers - loss
swans	east coast	38	64	swans - loss	east coast - win
swans	essendon	72	39	swans - win	essendon - loss
swans	tigers	57	53	swans - win	tigers - loss
tigers	east coast	37	73	tigers - loss	east coast - win
tigers	essendon	46	65	tigers - loss	essendon - win
tigers	swans	95	62	tigers - win	swans - loss

However if you take the match listed "d:b" Which is Tigers:East Coast and change the score from "37:73" from a East Coast win to a Tigers win instead of say "74:73" and you take the top 3 teams then it would go to a tiebreaker between Essendon, Tigers with Tigers advancing by having a higher competitiveness score

team_name	point_scored	points_allowed	total_points
east coast	487	308	20
swans	360	407	12
tigers	392	449	8
essendon	338	413	8

338/413 = 81.84 to 392/449 = 87.30 for the Tigers

Again the data is presented as a series of string data that is broken apart into the following.

Team Identifiers

A = ["a:Essendon", "b:East Coast", "c:Swans", "d:Tigers"]

Match Opponents

B = ["a:b", "a:c", "a:d", "b:a", "b:c", "b:d", "c:a", "c:b", "c:d", "d:a", "d:b", "d:c"]

Match Scores

C = ["37:55", "44:50", "111:88", "102:42", "112:81", "81:36", "72:39", "38:64", "57:53", "46:65", "37:73", "95:62"]

Sqlite Database Plan

There is a need to create a table that holds the MatchDetail from the perspective of one team

From the above table I can calculate a running sum of the scored points for/against in each match as the first team listed and as opponent along with capturing the win score as the field points.

With this table you can find the match summaries with a query like this to see all the matches played (each team plays 3 matches and the opponent, points for and points against and with the final result)

SELECT team_name,
       opponent_name,
       score_for,
       score_against,
       concat(team_name, ' - ', CASE points_team WHEN 4 THEN "win" WHEN 2 THEN "tie" ELSE "loss" END) as home_result,
       concat(opponent_name, ' - ', CASE points_opponent WHEN 4 THEN "win" WHEN 2 THEN "tie" ELSE "loss" END) as away_result
FROM MatchDetail
order by team_name, opponent_name;

team_name	opponent_name	score_for	score_against	home_result	away_result
east coast	essendon	102	42	east coast - win	essendon - loss
east coast	swans	112	81	east coast - win	swans - loss
east coast	tigers	81	36	east coast - win	tigers - loss
essendon	east coast	37	55	essendon - loss	east coast - win
essendon	swans	44	50	essendon - loss	swans - win
essendon	tigers	111	88	essendon - win	tigers - loss
swans	east coast	38	64	swans - loss	east coast - win
swans	essendon	72	39	swans - win	essendon - loss
swans	tigers	57	53	swans - win	tigers - loss
tigers	east coast	37	73	tigers - loss	east coast - win
tigers	essendon	46	65	tigers - loss	essendon - win
tigers	swans	95	62	tigers - win	swans - loss

Furthermore to get the playoff ranking you need to sum each record as home and away (team_name, opponent_name) to build up the competitiveness tiebreaker value

In this case using Common Table Expressions you can sum the points for a team and then join the two sets together to get that ranking like so

WITH points_for as (
    SELECT team_name, SUM(score_for) score_1, SUM(score_against) score_2, SUM(points_team) win_score
    FROM MatchDetail
    group by team_name
),
points_against as (
    SELECT opponent_name, SUM(score_against) score_1, SUM(score_for) score_2, SUM(points_opponent) win_score
    FROM MatchDetail
    group by opponent_name
)
SELECT pf.team_name, pf.score_1 + pa.score_1 point_scored, pf.score_2 + pa.score_2 points_allowed, pf.win_score + pa.win_score total_points
from points_for pf
inner join points_against pa
on pf.team_name = pa.opponent_name
ORDER BY pf.win_score + pa.win_score DESC, (100 * (CAST((pf.score_1 + pa.score_1) as float)) / CAST((pf.score_2 + pa.score_2) as float)) DESC
LIMIT 2;

The key point here is to group by team the perspective of the scores as the home team team_name and as the away team opponent_name to get the full picture. You need to consider the win_score from each teams perspective on the match to calculate the winning points.

Implementation

I need to do the following

Parse the first array of strings as the Team id and names
Parse the second array of strings as the team matches by their ids
Parse and match the match scores through the team matches back to their ids

In this case the method signature will look like this def get_playoff_teams(teams:list[str], matches:list[str], match_scores:list[str]) -> list[str]:

Returning the list of team names in order of playoff rankings

Method details

def get_playoff_teams(teams:list[str], matches:list[str], match_scores:list[str]) -> list[str]:
    try:
        all_teams = get_teams(teams)
        con = sqlite3.connect(':memory:')
        con.execute(
            "CREATE TABLE MatchDetail(team_name text, opponent_name text, score_for int, score_against int, points_team int, points_opponent int);")
        [insert_match_data(all_teams, con, match, score) for match, score in zip(matches, match_scores)]
        max_points = con.execute("""
                    WITH points_for as (
                        SELECT team_name, SUM(score_for) score_1, SUM(score_against) score_2, SUM(points_team) win_score
                        FROM MatchDetail
                        group by team_name
                    ),
                    points_against as (
                        SELECT opponent_name, SUM(score_against) score_1, SUM(score_for) score_2, SUM(points_opponent) win_score
                        FROM MatchDetail
                        group by opponent_name
                    )
                    SELECT pf.team_name, pf.score_1 + pa.score_1 point_scored, pf.score_2 + pa.score_2 points_allowed, pf.win_score + pa.win_score total_points
                    from points_for pf
                    inner join points_against pa
                    on pf.team_name = pa.opponent_name
                    ORDER BY pf.win_score + pa.win_score DESC, (100 * (CAST((pf.score_1 + pa.score_1) as float)) / CAST((pf.score_2 + pa.score_2) as float)) DESC
                    LIMIT 2;
                                 """)
        playoff_teams = [result[0] for result in max_points]
        return playoff_teams
    except Exception as e:
        ic(e)
        return ["", ""]

Details below of inner method calls

Parse the array of team ids and names

def get_teams(teams) -> dict[str, str]:
    all_teams = {}
    for input_team in teams:
        team_details = input_team.split(":")
        all_teams[team_details[0].lower().strip()] = team_details[1].lower().strip()
    return all_teams

Parse the team match score win,tie,loss

def match_points(score_against_team2: int, score_for_team1: int) -> int:
    team2_points = 2
    if score_against_team2 > score_for_team1:
        team2_points = 4
    if score_against_team2 < score_for_team1:
        team2_points = 0
    return team2_points

Parse and insert the team match points into the MatchDetails table

def insert_match_data(all_teams: dict[str, str], con: sqlite3.Connection, match: str, score: str) -> None:
    match_details = match.split(":")
    score_details = score.split(":")
    team1_name = all_teams[match_details[0].lower().strip()]
    team2_name = all_teams[match_details[1].lower().strip()]
    score_for_team1 = int(score_details[0])
    score_against_team2 = int(score_details[1])
    team1_points = match_points(score_for_team1, score_against_team2)
    team2_points = match_points(score_against_team2, score_for_team1)
    con.execute(
        "INSERT INTO MatchDetail (team_name, opponent_name, score_for, score_against, points_team, points_opponent) VALUES (?, ?, ?, ?, ?, ?)",
        ic((team1_name, team2_name, score_for_team1, score_against_team2, team1_points, team2_points)))

When called as such

if __name__ == '__main__':
    A = ["a:Essendon", "b:East Coast", "c:Swans", "d:Tigers"]
    B = ["a:b", "a:c", "a:d", "b:a", "b:c", "b:d", "c:a", "c:b", "c:d", "d:a", "d:b", "d:c"]
    C = ["37:55", "44:50", "111:88", "102:42", "112:81", "81:36", "72:39", "38:64", "57:53", "46:65", "37:73", "95:62"]
    ic(get_playoff_teams(A,B,C))

Returns the following ['east coast', 'swans']

Conclusion

Again the use of structured table data, here allows for introspection and manipulation of the data in multiple flows and approaches. The two way nature of each single match having a perspective of home/away team with points scored/against is deeply relational and avoids several complicated and unwieldy dictionary and list structures and customized key lookups. This solution is both performant, clear, and readily manipulatable according to SQL rules and set transformations. 💯to me

💡

Sqlite in memory tables should be your first choice when working with connected relational data.

Introduction

Sqlite Database Plan

Implementation

Conclusion

Sign up for more like this.