My Web This Week Backfill Design
Date: 2026-03-26
Issue: #101
Goal
Backfill historical My web this week roundup posts starting from a fixed start date of 2025-01-01.
The change should generate and commit the actual historical files under _posts/ for qualifying Sundays. It should not generate or update _snippets/ as part of the backfill.
Scope
In scope:
- extend the existing weekly generator to support a historical backfill mode
- generate _posts/YYYY-MM-DD-my-web-this-week.md for qualifying Sundays
- refresh existing historical weekly posts in place when rerun
- keep the existing weekly post format, title shape, categories, tags, and body layout
Out of scope:
- backfilling _snippets/
- changing the existing weekly automation behavior unless needed for shared generator code
- changing source-selection rules beyond what the current generator already supports
Architecture
Keep scripts/generate_my_web_this_week.py as the single source of truth for roundup generation.
Add a backfill execution path that:
- derives the first Sunday on or after the configured start date
- iterates Sunday by Sunday through an explicit end date
- reuses the current Monday-through-Saturday source selection logic for each Sunday
- writes a weekly roundup post only when that week has at least one qualifying source item
- overwrites an existing historical weekly post if the generator is rerun
This keeps one implementation for extraction and rendering instead of splitting live generation and backfill into separate scripts.
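The date arithmetic behind the first two bullets can be sketched as below. This is a minimal illustration, not the generator's actual code: the helper names first_sunday_on_or_after and iter_sundays are hypothetical and would need to be matched to whatever scripts/generate_my_web_this_week.py already exposes.

```python
from datetime import date, timedelta
from typing import Iterator

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def first_sunday_on_or_after(day: date) -> date:
    """Return `day` itself if it is a Sunday, otherwise the next Sunday."""
    return day + timedelta(days=(SUNDAY - day.weekday()) % 7)


def iter_sundays(start: date, end: date) -> Iterator[date]:
    """Yield every Sunday in the closed range [start, end]."""
    current = first_sunday_on_or_after(start)
    while current <= end:
        yield current
        current += timedelta(days=7)
```

With the fixed start date of 2025-01-01 (a Wednesday), the first Sunday produced is 2025-01-05. If the first Sunday already falls after the end date, the iterator yields nothing.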
Backfill Rules
Range rules:
- start from the first Sunday on or after 2025-01-01
- end at an explicit command-line date
- include only Sundays within that closed range
Generation rules:
- for each Sunday, reuse the existing qualifying-source selection logic
- skip empty weeks without failing the overall run
- overwrite an existing weekly post for that Sunday with regenerated content
- do not create or update _snippets/ in backfill mode
- preserve the current weekly post shape and metadata
Reporting rules:
- print one result line per Sunday: generated, updated, or skipped
- print final totals for generated, updated, skipped, and skipped-source warnings
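One possible shape for the generation and reporting rules is sketched below. It assumes a hypothetical render_post callable that wraps the existing rendering logic and raises EmptyRoundupError for an empty week. The EmptyRoundupError class here is a local stand-in for the generator's own exception. The skipped-source warning count is left out because its source is not described in this document.

```python
from collections import Counter
from datetime import date
from pathlib import Path
from typing import Callable, Iterable


class EmptyRoundupError(Exception):
    """Stand-in for the generator's error when a week has no qualifying items."""


def run_backfill(
    sundays: Iterable[date],
    render_post: Callable[[date], str],  # hypothetical hook into the existing renderer
    posts_dir: Path,
    write: bool = False,
) -> Counter:
    """Process each Sunday, print one result line per Sunday, and return totals."""
    totals = Counter()
    for sunday in sundays:
        path = posts_dir / f"{sunday.isoformat()}-my-web-this-week.md"
        try:
            content = render_post(sunday)
        except EmptyRoundupError:
            status = "skipped"  # an empty week never aborts the run
        else:
            # In a dry run the status reports what a write would have done.
            status = "updated" if path.exists() else "generated"
            if write:
                path.write_text(content, encoding="utf-8")
        totals[status] += 1
        print(f"{sunday.isoformat()} {status}")
    print(
        f"generated={totals['generated']} "
        f"updated={totals['updated']} "
        f"skipped={totals['skipped']}"
    )
    return totals
```

The function only ever touches paths under posts_dir, so nothing in this sketch can write to _snippets/. Rerunning it over the same range turns earlier generated results into updated ones without changing the skipped weeks.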
CLI Shape
Keep the current single-date behavior and add an explicit backfill mode.
Single Sunday:
python3 scripts/generate_my_web_this_week.py --date YYYY-MM-DD --write
Backfill:
python3 scripts/generate_my_web_this_week.py \
  --backfill-start YYYY-MM-DD \
  --backfill-end YYYY-MM-DD \
  --write
Safety rules:
- exactly one mode must be chosen: single-date or backfill
- backfill start and end must be provided together
- all date inputs must parse cleanly
- the first Sunday on or after the backfill start must not fall after the backfill end
- dry run remains the default
- backfill mode must never write _snippets/
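The argument checks above could be enforced with argparse roughly as follows. This is a sketch under assumptions: the flag names come from the CLI shape in this section, but the function names, error messages, and the way the real script structures its parser are illustrative only.

```python
import argparse
from datetime import date, timedelta


def parse_iso_date(value: str) -> date:
    """Parse a date argument, turning bad input into an argparse error."""
    try:
        return date.fromisoformat(value)
    except ValueError as exc:
        raise argparse.ArgumentTypeError(f"invalid date: {value!r}") from exc


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="generate_my_web_this_week.py")
    parser.add_argument("--date", type=parse_iso_date)
    parser.add_argument("--backfill-start", type=parse_iso_date)
    parser.add_argument("--backfill-end", type=parse_iso_date)
    # Dry run stays the default: files are written only when --write is passed.
    parser.add_argument("--write", action="store_true")
    args = parser.parse_args(argv)

    has_start = args.backfill_start is not None
    has_end = args.backfill_end is not None
    if has_start != has_end:
        parser.error("--backfill-start and --backfill-end must be provided together")
    if (args.date is not None) == has_start:
        parser.error("choose exactly one mode: --date or --backfill-start/--backfill-end")
    if has_start:
        # First Sunday on or after the start date (weekday() is 6 for Sunday).
        offset = (6 - args.backfill_start.weekday()) % 7
        first_sunday = args.backfill_start + timedelta(days=offset)
        if first_sunday > args.backfill_end:
            parser.error("no Sunday falls between --backfill-start and --backfill-end")
    return args
```

parser.error prints a usage message and exits with status 2, so an invalid combination stops the run before any file is read or written.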
Testing And Validation
Add tests for:
- deriving the first Sunday on or after a given date
- iterating Sundays across a range
- skipping empty weeks without raising EmptyRoundupError
- writing or refreshing _posts/ files only in backfill mode
- confirming _snippets/ are untouched in backfill mode
- preserving current single-date behavior
Validation steps:
python3 -m unittest tests/test_generate_my_web_this_week.py tests/test_validate_posts.py
python3 scripts/generate_my_web_this_week.py --backfill-start 2025-01-01 --backfill-end 2026-03-29
python3 scripts/generate_my_web_this_week.py --backfill-start 2025-01-01 --backfill-end 2026-03-29 --write
python3 scripts/validate_posts.py --today 2026-03-26
Risks
- historical source posts may still fail current extraction rules and leave some weeks empty
- reruns must preserve idempotence even when a subset of historical weekly posts already exists
- validator behavior must stay compatible with historical weekly posts that have no snippet companion