@gurupanguji

Source Citation Rendering Design

Goal

Fix link-post source citations that Kramdown currently folds into <blockquote><cite>...</cite></blockquote> by accident, and bulk-normalize the archive to a markdown-only source-link shape that stays portable and renders consistently.

Problem

Issue #115 documents two live failures:

Both posts use this authored shape:

> quoted excerpt

Source: [Link title](https://example.com)

Kramdown treats the trailing Source: line as attribution for the preceding blockquote and emits a <cite> inside the quote block. That behavior is parser-driven, not intentional site markup, so the rendered result is brittle and visually inconsistent.

The problem extends beyond those two posts: a repo scan found roughly 205 posts where a Source: line appears near quoted content in the same general pattern.

Design

1. Keep a markdown-only source link as the canonical shape

The canonical archive shape should remain plain markdown.

Canonical output:

> Quoted text.

[Source: Publication or story title](https://example.com)

This keeps the files portable and simple while avoiding Kramdown’s quote-attribution behavior.
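
As a rough illustration of the canonical shape, the sketch below converts one authored Source: line into the standalone link form. The function name and the regular expression are assumptions made for this example, not part of the existing codebase, and the pattern deliberately accepts only a single markdown link with no parentheses in the URL.

```python
import re
from typing import Optional

# Assumed authored shape: "Source: [title](url)" alone on its line.
# Anything else (extra prose, multiple links, parentheses in the URL) does not match.
SOURCE_LINE = re.compile(
    r"^Source:\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)\s*$"
)


def to_canonical_source(line: str) -> Optional[str]:
    """Return the canonical "[Source: title](url)" link, or None if the line is not a clean match."""
    match = SOURCE_LINE.match(line)
    if match is None:
        return None
    return f"[Source: {match.group('title')}]({match.group('url')})"
```

Returning None for anything that is not a clean match leaves the decision about unusual lines to the caller instead of guessing.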

2. Normalize content, do not patch rendered HTML

This should be a content rewrite, not a render-time transform.

Reason:

3. Add a narrow bulk normalizer for the risky markdown pattern

Add a script, likely scripts/normalize_source_citations.py, that:

4. Preserve authored prose and only rewrite the quote-source boundary

The converter should be mechanical and narrow.

It should preserve:

It should not:

5. Support common safe variants but skip ambiguous cases

Safe candidates should include:

Skip cases should include:

When the shape is ambiguous, the script should skip and report the reason rather than guess.
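
One way the skip-and-report behavior could look is sketched below. It is a minimal sketch, not the planned scripts/normalize_source_citations.py: the function name, the reason string, and the rule that a Source: line counts as "near a quote" when the closest non-blank line above it starts with > are all assumptions for illustration.

```python
import re
from typing import List, Tuple

# Assumed safe shape: a Source: line holding exactly one markdown link.
SOURCE_LINE = re.compile(
    r"^Source:\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)\s*$"
)


def normalize_post(text: str) -> Tuple[str, List[str]]:
    """Rewrite quote-then-Source boundaries; return (new_text, skip_reasons)."""
    out: List[str] = []
    skipped: List[str] = []
    for index, line in enumerate(text.split("\n")):
        if not line.startswith("Source:"):
            out.append(line)
            continue
        # Look back past blank lines to see what the Source: line follows.
        j = len(out) - 1
        while j >= 0 and out[j].strip() == "":
            j -= 1
        if j < 0 or not out[j].lstrip().startswith(">"):
            out.append(line)  # not attached to a blockquote, leave untouched
            continue
        match = SOURCE_LINE.match(line)
        if match is None:
            # Ambiguous shape: report it and keep the original line.
            skipped.append(f"line {index + 1}: Source line is not a single markdown link")
            out.append(line)
            continue
        # Rewrite only the boundary: one blank line, then the standalone link.
        del out[j + 1:]
        out.append("")
        out.append(f"[Source: {match.group('title')}]({match.group('url')})")
    return "\n".join(out), skipped
```

Every other line passes through unchanged, which keeps the rewrite confined to the quote-source boundary.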

6. Add a validator guardrail for future posts

Extend scripts/validate_posts.py to reject the risky authored pattern in posts dated on or after the policy date for this fix.

The validator should flag:

The safe allowed shape is the standalone markdown link:

[Source: Example Story](https://example.com/story)

This keeps future posts from reintroducing the bug after the archive cleanup while preserving markdown portability.
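
The guardrail might be sketched as follows. This is a standalone illustration under stated assumptions, not the existing API of scripts/validate_posts.py: the function name is hypothetical, the policy date is passed in because the design does not state it, and the check treats blank lines between the quote and the Source: line as still risky, which could be tightened if it over-matches.

```python
from datetime import date
from typing import List


def find_risky_source_lines(text: str, post_date: date, policy_date: date) -> List[int]:
    """Return 1-based line numbers of bare Source: lines that trail a markdown blockquote."""
    if post_date < policy_date:
        return []  # historical posts fall outside the enforcement window
    lines = text.split("\n")
    hits: List[int] = []
    for i, line in enumerate(lines):
        if not line.startswith("Source:"):
            continue  # the allowed "[Source: ...](...)" link starts with "[" and passes
        # Find the nearest non-blank line above the Source: line.
        j = i - 1
        while j >= 0 and lines[j].strip() == "":
            j -= 1
        if j >= 0 and lines[j].lstrip().startswith(">"):
            hits.append(i + 1)
    return hits
```

Because only raw lines beginning with > count as a quote, HTML quote markup is not matched by this particular check.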

File Changes

New

Modify

Verification

Automated

Add test coverage for:

Manual

Spot-check:

Confirm that quotes remain markdown-only, the source link renders separately, and no surrounding prose drifted.

Risks

Archive shape varies more than expected

The archive likely contains close cousins of the target pattern. The script should stay narrow and skip anything it cannot rewrite safely.

Bulk diffs can hide subtle drift

This cleanup can touch many posts. The rewrite must stay mechanical so the diff remains reviewable.
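
A dry-run preview is one way to keep a bulk change reviewable. The sketch below uses the standard library's difflib to render a unified diff for a single post before anything is written; the helper name and the a/ and b/ path prefixes are illustrative assumptions.

```python
import difflib


def preview_diff(path: str, before: str, after: str) -> str:
    """Return a unified diff of one post's rewrite, or an empty string if nothing changed."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)
```

For a mechanical rewrite, each hunk should show only the Source: line changing and, at most, one added blank line; anything larger is a signal to inspect that post by hand.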

Validator could over-match

The new guardrail should only flag the precise risky raw markdown pattern. It should not reject intentional HTML quote markup or historical posts that predate the enforcement window.

Acceptance Criteria

  1. The two reported posts no longer rely on accidental Kramdown <cite> generation.
  2. A bulk normalization script converts the safe archive pattern to the canonical markdown shape: a blockquote followed by a standalone [Source: ...](...) link.
  3. The archive cleanup runs across the safe candidate set and leaves ambiguous posts untouched.
  4. Validation fails for newly-authored raw markdown blockquote plus Source: patterns that would regress this bug.
  5. Validation also rejects temporary gp-quote HTML markup so the repository stays markdown-first.