Mack's LoL Scout
Step 03

Disk cache + update mode

Befriend the rate limit dragon: reuse local JSON, only fetch when you mean it.

Run

Run after each small change. Tiny loops win.

uv run python -m src.scout

You will touch

  • src/scout/ (your fetch_json helper)
  • data/cache/

Time

60–90 minutes

Do this (suggested order)

  1. Create data/cache/.
  2. Design a deterministic cache key from url + params (sorted).
  3. Write fetch_json(url, params, *, update=False): reuse cached files by default, fetch on update.
  4. Save metadata with each response (url, params, fetched_at, status_code).
  5. Add tiny logs so you can see behavior: CACHE HIT / CACHE MISS / NETWORK.

You’ll practice

  • Deterministic cache keys
  • Store metadata (fetched_at, params, status)
  • Design an “update mode” switch

Explainers (for context, not homework)

Build

Helper
  • fetch_json(url, params) checks data/cache first
  • Cache files store url, params, fetched_at, status_code, json/text
Update mode
  • OFF → never touch the network if a cache file exists
  • ON → fetch fresh and merge the result into the local dataset

Check yourself

  • Run twice: the first run shows CACHE MISS lines; the second shows only CACHE HIT lines and makes no network calls.

If it breaks

  • Filename collisions
  • Caching error responses forever
  • Forgetting query params in the cache key

Hints (spoilers)

Hint: safe filenames (hash it and move on with your life)

Cache key should include URL + sorted params. If you hash, life is easier. If you don’t hash, be very polite to filenames.

A simple key idea

Not required, but hashing prevents “filename too long” sadness.

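import hashlib
import json

# url and params are the arguments passed to fetch_json.
# sort_keys=True makes the key independent of param order.
payload = {'url': url, 'params': params or {}}
raw = json.dumps(payload, sort_keys=True, separators=(',', ':')).encode('utf-8')
cache_key = hashlib.sha1(raw).hexdigest()  # 40 hex chars, safe as a filename

Bigger hint: params must be part of the key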

The same endpoint with different params is a different request. If params aren’t in the key, you will cache the wrong thing and your code will “randomly” lie later.
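
A quick way to convince yourself: the sketch below wraps the hashing idea in a hypothetical make_cache_key helper and compares a few keys. The URL and param names are invented for the demo and do not come from any real API.

```python
import hashlib
import json


def make_cache_key(url, params=None):
    """Hash url + params into a stable hex string (hypothetical helper name)."""
    payload = {"url": url, "params": params or {}}
    raw = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha1(raw).hexdigest()


# Made-up URL and params, for illustration only.
url = "https://example.com/api/matches"
key_a = make_cache_key(url, {"region": "na1", "count": 20})
key_b = make_cache_key(url, {"count": 20, "region": "na1"})   # same params, other order
key_c = make_cache_key(url, {"region": "euw1", "count": 20})  # different params

print(key_a == key_b)  # True: sort_keys makes param order irrelevant
print(key_a == key_c)  # False: different params get a different cache file
```

If the second comparison ever prints True in your own version, params are probably missing from the key.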

Bigger hint: error caching (save receipts, but don’t treat them as a win)

Save error responses (status + body text) for debugging, but don’t treat them as “good cached data” forever. Decide what counts as a real hit (usually: status_code == 200).
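
One way to write that decision down is a tiny predicate. This is a sketch, assuming the cached record is a dict loaded from a cache file (or None when no file exists); is_real_hit is a made-up name, and "only 200 counts" is the rule of thumb from above, not a law.

```python
def is_real_hit(cached):
    """Return True only for a cached record worth reusing.

    Assumes `cached` is the dict loaded from a cache file,
    or None when there is no file on disk.
    """
    return cached is not None and cached.get("status_code") == 200


print(is_real_hit({"status_code": 200, "json": {}}))           # True
print(is_real_hit({"status_code": 429, "text": "slow down"}))  # False
print(is_real_hit(None))                                       # False
```

The error record still sits on disk for debugging; it just never gets served as good data.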

A tiny rule you can log

if cached.status_code != 200: treat as MISS (and maybe refetch in update mode)

Unblock-me: cache lives on disk, so paths matter

If your program writes cache files into a weird place, you’re probably running from the wrong directory. Run from the project root (the folder that contains data/).
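
If you would rather not depend on the current directory at all, one optional approach is to search upward for the folder that contains data/. This is only a sketch: find_project_root is a hypothetical helper, and it assumes data/ already exists somewhere above the starting folder.

```python
from pathlib import Path


def find_project_root(start):
    """Walk upward from `start` until a folder containing data/ is found."""
    start = Path(start).resolve()
    for folder in (start, *start.parents):
        if (folder / "data").is_dir():
            return folder
    raise FileNotFoundError(f"no data/ folder found at or above {start}")
```

A typical use would be find_project_root(Path.cwd()) / "data" / "cache", which fails loudly instead of quietly writing cache files somewhere weird.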

Cache folder (expected)

data/
  cache/
    <cache_key>.json

Cache file shape (suggested)

You can change names, but keep the idea: metadata + payload.

{
  "url": "...",
  "params": {"...": "..."},
  "fetched_at": "2025-12-15T00:00:00Z",
  "status_code": 200,
  "json": {"...": "..."}
}
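
A minimal sketch of writing and reading that shape follows. The helper names (build_record, save_record, load_record) are invented for illustration, and the timestamp format simply mirrors the example above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_record(url, params, status_code, body):
    """Wrap a response body with the metadata fields suggested above."""
    return {
        "url": url,
        "params": params,
        "fetched_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status_code": status_code,
        "json": body,
    }


def save_record(cache_dir, cache_key, record):
    """Write the record to <cache_dir>/<cache_key>.json and return the path."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def load_record(cache_dir, cache_key):
    """Return the saved record, or None when no cache file exists."""
    path = Path(cache_dir) / f"{cache_key}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Keeping metadata and payload in one file means a single read tells you what was fetched, when, and whether it worked.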

Tiny log lines that make debugging easy

CACHE HIT  <cache_key>
CACHE MISS <cache_key>
NETWORK    <url>
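
Putting the pieces together, here is one possible shape for the helper that prints those log lines. Treat it as a sketch under assumptions, not the official answer. The fetch argument is a hypothetical callable (url, params) -> (status_code, body) standing in for whatever HTTP client you use, and cache_dir is an extra keyword added so the function is easy to try in a scratch folder. It also assumes a cached non-200 response counts as a miss and gets refetched.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def fetch_json(url, params=None, *, update=False, fetch, cache_dir="data/cache"):
    """Return a cache record for url + params, using the network only when needed.

    `fetch` is a hypothetical callable: (url, params) -> (status_code, body).
    """
    params = params or {}
    raw = json.dumps({"url": url, "params": params}, sort_keys=True,
                     separators=(",", ":")).encode("utf-8")
    cache_key = hashlib.sha1(raw).hexdigest()
    path = Path(cache_dir) / f"{cache_key}.json"

    # Load whatever is on disk; only a 200 counts as a real hit (assumption).
    cached = json.loads(path.read_text(encoding="utf-8")) if path.exists() else None
    is_hit = cached is not None and cached.get("status_code") == 200

    if is_hit and not update:
        print(f"CACHE HIT  {cache_key}")
        return cached
    if not is_hit:
        print(f"CACHE MISS {cache_key}")

    print(f"NETWORK    {url}")
    status_code, body = fetch(url, params)
    record = {
        "url": url,
        "params": params,
        "fetched_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status_code": status_code,
        "json": body,  # for error responses this may just be the body text
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

Passing the fetcher in as an argument also makes the "run twice" check easy to automate: hand it a fake fetch that counts its calls and confirm the second run never invokes it.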