Step 04

Download match dataset

Fetch recent match IDs, fetch each match JSON, and make reruns resume-friendly.

Run

Run after each small change. Tiny loops win.

uv run python -m src.scout

Time

60–120 minutes

Load your puuid (from Step 02’s saved JSON).
Fetch recent match IDs (match-v5) and save them to data/raw/match_ids_<puuid>.json.
Loop through IDs and download match details into data/raw/matches/<matchId>.json.
Make the run resumable: if the file exists, skip it.
Print counts: ids fetched, files found, new downloaded.

Match IDs
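
A minimal sketch of the match ID step, assuming the public match-v5 "ids by puuid" endpoint on a regional routing host. The helper names (match_ids_url, save_match_ids) and the default count are made up for illustration; they are not part of src.scout. The HTTP call itself is left out: send a GET to the URL with your API key in the X-Riot-Token header, using whichever client you already have.

```python
import json
from pathlib import Path
from urllib.parse import urlencode


def match_ids_url(routing: str, puuid: str, start: int = 0, count: int = 20) -> str:
    """Build the match-v5 URL that lists recent match IDs for one puuid.

    Assumption: `routing` is a regional value such as "americas", "asia",
    "europe" or "sea".
    """
    base = f"https://{routing}.api.riotgames.com"
    path = f"/lol/match/v5/matches/by-puuid/{puuid}/ids"
    query = urlencode({"start": start, "count": count})
    return f"{base}{path}?{query}"


def save_match_ids(ids: list, puuid: str, raw_dir: Path = Path("data/raw")) -> Path:
    """Write the ID list to <raw_dir>/match_ids_<puuid>.json and return the path."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    out = raw_dir / f"match_ids_{puuid}.json"
    out.write_text(json.dumps(ids, indent=2), encoding="utf-8")
    return out
```

Keeping the URL builder separate from the request makes it easy to print exactly what you are about to call.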

Match details
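
One way to handle a single match, sketched under the same assumptions (regional routing host, match-v5 "match by ID" endpoint). match_url and save_match are hypothetical helpers. The save goes through a temporary file first, so a crash mid-write never leaves a half-finished <matchId>.json that a later "file exists" check would mistake for a complete download.

```python
import json
from pathlib import Path


def match_url(routing: str, match_id: str) -> str:
    """Build the match-v5 URL for one match's details."""
    return f"https://{routing}.api.riotgames.com/lol/match/v5/matches/{match_id}"


def save_match(match: dict, match_id: str,
               matches_dir: Path = Path("data/raw/matches")) -> Path:
    """Write one match to <matches_dir>/<match_id>.json via a temp file."""
    matches_dir.mkdir(parents=True, exist_ok=True)
    final_path = matches_dir / f"{match_id}.json"
    tmp_path = matches_dir / f"{match_id}.json.tmp"
    tmp_path.write_text(json.dumps(match), encoding="utf-8")
    # Rename only after the full file is on disk.
    tmp_path.replace(final_path)
    return final_path
```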

Resumable runs
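
The skip-if-exists rule can be sketched as one small loop. download_missing is a hypothetical name, and fetch_match stands in for whatever function you wrote to call the API; passing it in as an argument lets you try the loop with a fake before spending real requests.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def download_missing(match_ids: Iterable[str],
                     fetch_match: Callable[[str], dict],
                     matches_dir: Path) -> dict:
    """Download only the matches that are not on disk yet; return the counts."""
    matches_dir.mkdir(parents=True, exist_ok=True)
    new = skipped = 0
    for match_id in match_ids:
        path = matches_dir / f"{match_id}.json"
        if path.exists():
            # Already cached by an earlier run: no request needed.
            print(f"SKIP (exists) {match_id}")
            skipped += 1
            continue
        print(f"DOWNLOAD     {match_id}")
        path.write_text(json.dumps(fetch_match(match_id)), encoding="utf-8")
        new += 1
    print(f"DONE: new={new}, skipped={skipped}")
    return {"new": new, "skipped": skipped}
```

Running it twice with the same IDs should report everything as new the first time and everything as skipped the second time.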

Hint: peek safely (JSON inspection ladder)

Inspect JSON like stairs, not a dive: print top-level keys → print info keys → print participant count. Stop there and decide your next question.

The ladder

print(match.keys())
print(match['info'].keys())
print('participants:', len(match['info']['participants']))

Bigger hint: find yourself (don’t guess the participant)

Don’t guess which participant is “you”. A match's JSON lists every participant (10 in a standard 5v5 game); you want the one whose puuid matches yours.

A tiny find-the-index move

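parts = match['info']['participants']
# Default to None so a missing puuid doesn't surface as a bare StopIteration.
idx = next((i for i, p in enumerate(parts) if p.get('puuid') == my_puuid), None)
if idx is None:
    raise ValueError('my_puuid not found in this match')
me = parts[idx]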

Unblock-me: empty match ID list (print the host + params)

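If you get [], don’t spiral. Print the base URL (host included) and the params. Most of the time the cause is a wrong routing value (match-v5 expects a regional host such as americas, asia, europe, or sea, not a platform host like na1) or filters that exclude every match.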

Two prints that answer 80% of questions

print('HOST:', base_url)
print('PARAMS:', params)
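
If the two prints don't make the problem jump out, a small checker can read them for you. This is a sketch: empty_list_hints is a hypothetical helper, and it assumes base_url looks like https://<routing>.api.riotgames.com and that params uses the match-v5 filter names startTime, endTime, queue, and type.

```python
from urllib.parse import urlparse

# Regional routing values accepted by match-v5.
REGIONAL_ROUTES = {"americas", "asia", "europe", "sea"}
# Optional filters that can narrow the result down to nothing.
FILTER_KEYS = {"startTime", "endTime", "queue", "type"}


def empty_list_hints(base_url: str, params: dict) -> list:
    """Return human-readable guesses about why the ID list came back empty."""
    hints = []
    host = urlparse(base_url).hostname or ""
    routing = host.split(".")[0]
    if routing not in REGIONAL_ROUTES:
        hints.append(f"routing '{routing}' is not a regional value")
    active = sorted(k for k in FILTER_KEYS if params.get(k) is not None)
    if active:
        hints.append("filters set: " + ", ".join(active))
    return hints
```

An empty list of hints means the usual suspects look fine, so the next thing to check is the puuid itself.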

Unblock-me: rate limits (429 = you’re too fast, not too dumb)

If you hit 429, you’re not failing—you’re speedrunning. Fetch fewer matches, add a small sleep + retry, and rely on your cache.

The calming move

limit to 20–50 matches
sleep 0.5–1.0s between requests
cache everything
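
The sleep + retry part might look like the sketch below. get_with_retry is a hypothetical helper, and send is a stand-in for your own request function, assumed here to return a (status, headers, body) tuple. The sketch waits for the number of seconds in the Retry-After header when a 429 response carries one, and falls back to a fixed pause otherwise.

```python
import time
from typing import Callable, Mapping, Tuple


def get_with_retry(send: Callable[[], Tuple[int, Mapping[str, str], object]],
                   max_tries: int = 5,
                   default_wait: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep):
    """Call send() until it stops answering 429, pausing between tries."""
    for attempt in range(1, max_tries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_tries:
            break
        try:
            # Assumption: Retry-After, when present, is a number of seconds.
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise RuntimeError(f"still rate limited after {max_tries} tries")
```

The sleep argument exists so you can swap in a fake during testing instead of actually waiting.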

Expected raw files

data/
  raw/
    match_ids_<puuid>.json
    matches/
      <matchId>.json

Resume-friendly behavior (what you want to see)

SKIP (exists) <matchId>
DOWNLOAD     <matchId>
DONE: new=12, skipped=38