Step 06

Text insights report

Use pandas to make your scout talk: winrates, streaks, and trends in a cozy markdown report.

Run

Run after each small change. Tiny loops win.

uv run python -m src.scout

You will touch

src/scout/ (report generation)
data/derived/matches.csv
reports/summary.md

Time

60–120 minutes

Do this (suggested order)

Install pandas: uv add pandas.
Read data/derived/matches.csv into a dataframe.
Do your three sanity prints (shape, dtypes, head) before any math.
Compute a few stats (overall, champ pool with min games, streaks, last 10 vs last 30).
Write reports/summary.md and rerun until it reads like a human wrote it.
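The read-and-check steps above might look like the sketch below. It assumes game_start holds ISO-style timestamps that pandas can parse, and load_matches is a hypothetical helper name. The inline CSV only stands in for data/derived/matches.csv so the snippet runs on its own; your real columns may differ.

```python
import io

import pandas as pd


def load_matches(source="data/derived/matches.csv"):
    """Read the matches CSV and return it sorted oldest-first.

    Assumption: a `game_start` column with parseable timestamps.
    """
    df = pd.read_csv(source, parse_dates=["game_start"])
    return df.sort_values("game_start").reset_index(drop=True)


# Stand-in data so the sketch runs without the real file.
sample = io.StringIO(
    "game_start,champion,win,kills,deaths,assists\n"
    "2024-01-02T20:00:00,Lux,False,2,5,9\n"
    "2024-01-01T20:00:00,Ahri,True,7,0,4\n"
)
df = load_matches(sample)

# Sanity prints before any math.
print(df.shape)
print(df.dtypes)
print(df.head(3).to_string(index=False))
```

In the real script, call load_matches() with no argument to read the default path.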

You’ll practice

pandas basics: read, groupby, sort
Compute rates and small-sample rules
Write human-readable markdown

Explainers (for context, not homework)

Pandas: the minimum you need — The prints + groupbys you’ll reuse
uv + dependencies — Add pandas without tears

Build

Dependency

pandas

Read + write

Read data/derived/matches.csv
Write reports/summary.md

Content

Overall stats (games, winrate, average/median K/D/A)
Champion pool with min games filter
Longest win/loss streaks
Recent trend (last 10 vs last 30)
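One possible way to compute the overall, champion pool, and trend numbers listed above is sketched here. The champion column name and the helper names (winrate_pct, champion_pool, last_n_winrate) are assumptions for illustration, win is assumed to already be boolean, and the six-row dataframe is made-up stand-in data.

```python
import pandas as pd

# Stand-in for the real dataframe; `champion` is an assumed column name.
df = pd.DataFrame({
    "game_start": pd.date_range("2024-01-01", periods=6, freq="D"),
    "champion": ["Ahri", "Ahri", "Ahri", "Lux", "Lux", "Zed"],
    "win": [True, False, True, True, False, True],
    "kills": [7, 2, 9, 1, 0, 12],
    "deaths": [1, 6, 3, 4, 5, 0],
    "assists": [4, 3, 8, 11, 6, 5],
})


def winrate_pct(wins):
    """Percentage of True values in a boolean Series."""
    return wins.mean() * 100


def champion_pool(df, min_games=3):
    """Games and winrate per champion, dropping small samples."""
    pool = df.groupby("champion")["win"].agg(games="size", winrate="mean")
    pool["winrate"] *= 100
    pool = pool[pool["games"] >= min_games]
    return pool.sort_values("games", ascending=False)


def last_n_winrate(df, n):
    """Winrate over the n most recent games (sorts by time first)."""
    return winrate_pct(df.sort_values("game_start").tail(n)["win"])


overall = winrate_pct(df["win"])
kda_summary = df[["kills", "deaths", "assists"]].agg(["mean", "median"])
pool = champion_pool(df, min_games=3)
last_2 = last_n_winrate(df, 2)  # use 10 and 30 on real data

print(f"Winrate: {overall:.1f}%")
print(kda_summary)
print(pool)
```

The min_games filter is the small-sample rule: with only six stand-in games, Lux and Zed drop out of the pool because neither reaches three games.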

Check yourself

reports/summary.md reads like a scout report, not a ransom note.

If it breaks

win stored as text, not boolean
Division by zero (deaths = 0)
Forgetting to sort by time before trends
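For the first pitfall, a defensive conversion could look like this sketch. Non-empty strings are truthy in Python, so a naive conversion can quietly turn "False" into True. The to_bool helper and the accepted spellings are assumptions; adjust the mapping to whatever your CSV actually contains.

```python
import pandas as pd


def to_bool(series):
    """Map common text spellings to real booleans; fail loudly on surprises."""
    if series.dtype == bool:
        return series
    # Assumed spellings; extend if your data uses e.g. "win"/"loss".
    mapping = {"true": True, "false": False, "1": True, "0": False}
    cleaned = series.astype(str).str.strip().str.lower()
    result = cleaned.map(mapping)
    if result.isna().any():
        bad = sorted(cleaned[result.isna()].unique())
        raise ValueError(f"unrecognized win values: {bad}")
    return result.astype(bool)


raw = pd.Series(["True", "false", " TRUE ", "0"], name="win")
win = to_bool(raw)
print(win.tolist())
```

Raising on unknown values is a deliberate choice here: a crash is easier to notice than a winrate that is silently wrong.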

Hints (spoilers)

Hint: sanity prints (do them before the math)

Before doing any math: print df.shape, df.dtypes, df.head(3). If those are wrong, your “insights” will be fan fiction.

The three sanity prints

print(df.shape)
print(df.dtypes)
print(df.head(3).to_string(index=False))

Bigger hint: streaks need sorted games

Streaks only make sense if the games are in order. Sort by time first, then count runs of wins/losses.

Tiny streak idea (conceptual)

Sort first. Then compute runs.

df = df.sort_values('game_start')
# one way: compare win to previous win and count run lengths
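One way to turn that idea into running code is sketched below. It assumes win is already boolean (or 0/1), and longest_streaks is a hypothetical helper name. The sample games are made up and shuffled on purpose to show that the sort matters.

```python
import pandas as pd


def longest_streaks(df):
    """Return (longest_win_streak, longest_loss_streak) in game order."""
    ordered = df.sort_values("game_start")
    win = ordered["win"].astype(int)  # 1 = win, 0 = loss
    # A new run starts wherever the result differs from the previous game.
    run_id = win.diff().ne(0).cumsum()
    runs = win.groupby(run_id).agg(outcome="first", length="size")

    def longest(outcome):
        lengths = runs.loc[runs["outcome"] == outcome, "length"]
        return int(lengths.max()) if not lengths.empty else 0

    return longest(1), longest(0)


# Results in time order: W W L W W W L L, then shuffled.
games = pd.DataFrame({
    "game_start": pd.date_range("2024-01-01", periods=8, freq="D"),
    "win": [True, True, False, True, True, True, False, False],
}).sample(frac=1, random_state=0)

print(longest_streaks(games))  # (3, 2)
```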

Unblock-me: deaths = 0 (KDA math explodes)

If you compute (kills + assists) / deaths, some game will have deaths == 0 and the division will blow up. Decide your rule: cap the result, divide by max(deaths, 1), or show the game as “perfect”.

A boring fix

df['kda'] = (df['kills'] + df['assists']) / df['deaths'].clip(lower=1)
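If you prefer the “perfect” rule over clipping, a small formatter is one option. This is only a sketch; format_kda is a hypothetical name, and the two-decimal format is an arbitrary choice.

```python
def format_kda(kills, deaths, assists):
    """Human-readable KDA; a deathless game is labeled instead of divided."""
    if deaths == 0:
        return "perfect"
    return f"{(kills + assists) / deaths:.2f}"


print(format_kda(7, 0, 4))  # perfect
print(format_kda(2, 5, 9))  # 2.20
```

The trade-off: clipping keeps the column numeric so you can average it, while a label reads better in the report but has to be applied at display time.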

Expected report file

reports/
  summary.md

Suggested report skeleton

Short, readable, and easy to compare between runs.

# Scout Report — <Name#TAG>

## Overall
- Games: 42
- Winrate: 52.4%

## Champion pool (min 3 games)
- Ahri: 8 games, 62.5% win

## Streaks
- Longest win streak: 5
- Longest loss streak: 4

## Trend
- Last 10: 70%
- Last 30: 46%
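A skeleton like the one above could be rendered from already computed numbers roughly as follows. The stats dictionary layout and the helper names render_report and write_report are assumptions for illustration, and the values are the placeholder figures from the skeleton, not real results.

```python
from pathlib import Path


def render_report(name, stats):
    """Build the markdown text from a dict of precomputed stats."""
    lines = [
        f"# Scout Report - {name}",
        "",
        "## Overall",
        f"- Games: {stats['games']}",
        f"- Winrate: {stats['winrate']:.1f}%",
        "",
        f"## Champion pool (min {stats['min_games']} games)",
    ]
    for champ, games, winrate in stats["pool"]:
        lines.append(f"- {champ}: {games} games, {winrate:.1f}% win")
    lines += [
        "",
        "## Streaks",
        f"- Longest win streak: {stats['win_streak']}",
        f"- Longest loss streak: {stats['loss_streak']}",
        "",
        "## Trend",
        f"- Last 10: {stats['last_10']:.0f}%",
        f"- Last 30: {stats['last_30']:.0f}%",
    ]
    return "\n".join(lines) + "\n"


def write_report(text, path="reports/summary.md"):
    """Write the report, creating the reports/ folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


# Placeholder numbers taken from the skeleton.
stats = {
    "games": 42, "winrate": 52.4, "min_games": 3,
    "pool": [("Ahri", 8, 62.5)],
    "win_streak": 5, "loss_streak": 4,
    "last_10": 70, "last_30": 46,
}
text = render_report("Name#TAG", stats)
print(text)
```

Call write_report(text) at the end of the run to produce reports/summary.md. Keeping rendering separate from writing makes it easy to print the report while iterating.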
