khairold

Singapore Legal SEO

Scraped 10,000+ court judgments, built an AI pipeline, and shipped a full SSR legal information site — in days.

February 2026
Astro, Cloudflare D1, Drizzle, Workers, AI

The Problem

Singapore’s court judgments are buried in eLitigation, a government system operated by CrimsonLogic for the Singapore Courts. It works, but it’s built for lawyers filing cases — not for browsing, researching, or discovering legal information casually.

The search is clunky. There’s no way to browse by court or topic. Individual judgments are walls of unformatted text. For law firms, legal researchers, and anyone trying to understand Singapore case law, the experience is friction from start to finish.

I saw an opportunity: take this public data, structure it properly, and build a fast, searchable, SEO-optimized site that makes Singapore legal information actually accessible.

The Approach

Phase 1 — Scraping the data

eLitigation’s listing endpoint is straightforward — paginated results filtered by court and year. I wrote scrapers to pull case metadata (name, citation, decision date, court, case numbers) and full judgment text from every available case.
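As a rough illustration of that pagination loop (not the project's actual scraper), the sketch below walks one court and year until a page comes back empty. The base URL, the `court`/`year`/`page` parameter names, and the `fetchPage` helper are all placeholders I am assuming for the example; the real eLitigation URL scheme is not shown here.

```typescript
// One listing row; fields mirror the metadata named above.
interface CaseMeta {
  name: string;
  citation: string;
  decisionDate: string;
  court: string;
  caseNumbers: string[];
}

// Placeholder endpoint and parameter names (assumptions, NOT the real eLitigation URL scheme).
const LISTING_BASE = "https://listing.example.invalid/judgments";

function listingUrl(court: string, year: number, page: number): string {
  const url = new URL(LISTING_BASE);
  url.searchParams.set("court", court);
  url.searchParams.set("year", String(year));
  url.searchParams.set("page", String(page));
  return url.toString();
}

// Walk one court/year until a page comes back empty.
// `fetchPage` (hypothetical) downloads and parses a single listing page.
async function scrapeListing(
  court: string,
  year: number,
  fetchPage: (url: string) => Promise<CaseMeta[]>,
  maxPages = 200, // safety cap so a parsing bug cannot loop forever
): Promise<CaseMeta[]> {
  const all: CaseMeta[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const rows = await fetchPage(listingUrl(court, year, page));
    if (rows.length === 0) break;
    all.push(...rows);
  }
  return all;
}
```

Injecting `fetchPage` keeps the loop testable without network access and leaves rate limiting to the caller.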

The corpus: 10,470+ judgments spanning 2008–2026, across multiple courts, including the High Court (SGHC), the Court of Appeal (SGCA), the Family Division of the High Court (SGHCF), the Family Courts (SGFC), and more. That works out to roughly 430–560 decisions per year. Each judgment is structured HTML with CSS classes like Judg-1, Judg-2, and Judg-Quote-0, which made parsing reliable.
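To show why those class names help, here is a minimal sketch that maps them to typed blocks. Only Judg-1, Judg-2, and Judg-Quote-0 are known from the source; the general `Judg-<n>` / `Judg-Quote-<n>` pattern, the `Block` type, and the regex-based extraction are assumptions for illustration. A real parser would more likely use a proper HTML parser.

```typescript
interface Block {
  kind: "paragraph" | "quote";
  level: number; // the trailing number in the class name
  text: string;
}

// Map a class such as "Judg-1" or "Judg-Quote-0" to a kind and level.
function classify(className: string): { kind: Block["kind"]; level: number } | null {
  const m = /^Judg-(Quote-)?(\d+)$/.exec(className);
  if (!m) return null;
  return { kind: m[1] ? "quote" : "paragraph", level: Number(m[2]) };
}

// Naive extraction: flat <p>/<div> elements whose class attribute comes first.
// Nested elements of the same tag would confuse this regex.
function extractBlocks(html: string): Block[] {
  const re = /<(p|div)\s+class="([^"]+)"[^>]*>([\s\S]*?)<\/\1>/g;
  const blocks: Block[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    const info = classify(m[2] ?? "");
    if (!info) continue; // unknown class: skip rather than guess
    // Strip inline tags and collapse whitespace.
    const text = (m[3] ?? "").replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    if (text) blocks.push({ kind: info.kind, level: info.level, text });
  }
  return blocks;
}
```

Keeping the kind and level separate lets the site render quotes and numbered paragraphs differently without re-parsing the source HTML.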

Phase 2 — Processing and storage

Each scraped judgment goes through an AI processing pipeline:

  • Summarization — Claude generates a concise summary of the judgment
  • Catchword extraction — Key legal topics and areas of law
  • Categorization — Court type, area of law, outcome
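The three steps above produce structured fields per judgment, so one useful guard is validating the model's reply before it is stored. The sketch below assumes the model is asked to return a JSON object; the field names (`summary`, `catchwords`, `courtType`, `areaOfLaw`, `outcome`) and the lowercase normalization of catchwords are my assumptions, not the project's actual schema.

```typescript
interface CaseAnalysis {
  summary: string;
  catchwords: string[];
  courtType: string;
  areaOfLaw: string;
  outcome: string;
}

// Parse and validate a model reply; throws so the caller can retry or log the case.
function parseAnalysis(raw: string): CaseAnalysis {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const obj = data as Record<string, unknown>;

  const requireString = (key: string): string => {
    const value = obj[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error('missing or empty "' + key + '"');
    }
    return value.trim();
  };

  const rawCatchwords = obj["catchwords"];
  if (!Array.isArray(rawCatchwords)) throw new Error('"catchwords" must be an array');
  // Normalize and dedupe so "Contract" and "contract " land on the same tag page.
  const cleaned = rawCatchwords
    .filter((c): c is string => typeof c === "string")
    .map((c) => c.trim().toLowerCase())
    .filter((c) => c.length > 0);
  const catchwords = Array.from(new Set(cleaned));

  return {
    summary: requireString("summary"),
    catchwords,
    courtType: requireString("courtType"),
    areaOfLaw: requireString("areaOfLaw"),
    outcome: requireString("outcome"),
  };
}
```

Failing loudly on a malformed reply keeps half-filled rows out of the database.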

Everything lands in Cloudflare D1 with full-text search powered by FTS5. Drizzle ORM handles the data layer, giving type-safe queries with very little runtime overhead.
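For readers unfamiliar with FTS5, a storage layout along these lines would work. This is a hedged sketch: the table names (`cases`, `cases_fts`), the columns, and the trigger are hypothetical, since the actual schema is not shown. It uses plain SQL strings rather than Drizzle so the example stays self-contained.

```typescript
// Hypothetical migration statements; table and column names are assumptions.
const MIGRATION: string[] = [
  `CREATE TABLE cases (
     id INTEGER PRIMARY KEY,
     slug TEXT NOT NULL UNIQUE,
     name TEXT NOT NULL,
     citation TEXT NOT NULL UNIQUE,
     court TEXT NOT NULL,
     decision_date TEXT NOT NULL,
     summary TEXT,
     body TEXT NOT NULL
   )`,
  // External-content FTS5 index: the index points back at rows in `cases`
  // instead of storing a second copy of the judgment text.
  `CREATE VIRTUAL TABLE cases_fts USING fts5(
     name, summary, body,
     content='cases', content_rowid='id'
   )`,
  // Keep the index in sync on insert; updates and deletes need matching triggers.
  `CREATE TRIGGER cases_ai AFTER INSERT ON cases BEGIN
     INSERT INTO cases_fts(rowid, name, summary, body)
     VALUES (new.id, new.name, new.summary, new.body);
   END`,
];
```

The UNIQUE constraint on `citation` doubles as a cheap deduplication guard when the scraper is re-run.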

Phase 3 — The site

Built a full SSR site on Astro 5 with Tailwind v4, deployed to Cloudflare Pages:

  • Homepage with aggregate stats (cases indexed, courts covered, year range)
  • Case listing with pagination (25 cases per page)
  • Case detail pages with formatted judgment text, AI summaries, catchwords, and Schema.org LegalCase structured data
  • Court index — browse all courts, click into filtered case lists per court
  • Catchword tag pages — browse cases by legal topic
  • Full-text search with FTS5 snippet highlighting
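The search and pagination items above can be sketched roughly as follows. The table names (`cases`, `cases_fts`) and the column order (name, summary, body) are assumptions; `snippet()`, `MATCH`, and `ORDER BY rank` are standard FTS5 features. User input is quoted term by term so that FTS5 operators typed into the search box are treated as plain words.

```typescript
const PAGE_SIZE = 25; // matches the listing page size mentioned above

// Convert a 1-based page number into a SQL OFFSET; bad input falls back to page 1.
function pageOffset(page: number): number {
  const p = Number.isFinite(page) && page >= 1 ? Math.floor(page) : 1;
  return (p - 1) * PAGE_SIZE;
}

// Quote every term (doubling embedded quotes) so the result is a safe MATCH expression.
// Space-separated phrases are combined with an implicit AND in FTS5.
function toFtsQuery(input: string): string {
  return input
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => '"' + term.replace(/"/g, '""') + '"')
    .join(" ");
}

// Column index 2 is assumed to be the judgment body in the hypothetical cases_fts table.
// snippet() does not HTML-escape the text, so escape it before rendering.
const SEARCH_SQL = `
  SELECT c.slug, c.name,
         snippet(cases_fts, 2, '<mark>', '</mark>', '...', 24) AS excerpt
  FROM cases_fts
  JOIN cases c ON c.id = cases_fts.rowid
  WHERE cases_fts MATCH ?1
  ORDER BY rank
  LIMIT ?2 OFFSET ?3
`;
```

With a D1 binding this would be run as `db.prepare(SEARCH_SQL).bind(toFtsQuery(q), PAGE_SIZE, pageOffset(page)).all()`. An empty query should skip the search entirely, because `MATCH` with an empty string is an error.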

Every page is server-rendered on Cloudflare Workers, hitting D1 directly. No client-side JavaScript framework — pure Astro components. (React had a MessageChannel issue on Workers, so I went framework-free and never looked back.)
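A server-rendered case page boils down to one D1 lookup per request. The sketch below is an assumption-heavy illustration: `D1Like` is a structural subset of the D1 binding API (`prepare`, `bind`, `first`), and the `cases` table, its columns, and `loadCasePage` are hypothetical names. In an Astro 5 page on the Cloudflare adapter, the binding would typically be read from `Astro.locals.runtime.env`, which I am also assuming here.

```typescript
// Minimal structural view of the D1 calls used below.
interface D1Like {
  prepare(sql: string): {
    bind(...values: unknown[]): { first<T>(): Promise<T | null> };
  };
}

interface CaseRow {
  slug: string;
  name: string;
  citation: string;
  summary: string | null;
}

interface PageResult {
  status: number;
  title: string;
  row: CaseRow | null;
}

// Look up one case by slug and derive what the page template needs.
async function loadCasePage(db: D1Like, slug: string): Promise<PageResult> {
  const row = await db
    .prepare("SELECT slug, name, citation, summary FROM cases WHERE slug = ?1")
    .bind(slug)
    .first<CaseRow>();
  if (row === null) {
    return { status: 404, title: "Case not found", row: null };
  }
  return { status: 200, title: row.name + " " + row.citation, row };
}
```

Because the query runs at request time, a newly processed judgment is visible as soon as its row is written, with no rebuild.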

SEO fundamentals are baked in: dynamic meta tags, canonical URLs, auto-generated sitemap, robots.txt, and structured data on every case page.
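As one example of those fundamentals, a sitemap can be generated straight from the list of case paths. This is a minimal sketch using the standard sitemaps.org format; the `SitemapEntry` shape, the function names, and the example origin are assumptions.

```typescript
interface SitemapEntry {
  path: string; // e.g. "/cases/2024-sghc-12" (hypothetical URL layout)
  lastmod?: string; // ISO date
}

// Escape the five characters that are special in XML text and attributes.
function xmlEscape(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a single sitemap file; the protocol allows up to 50,000 URLs per file.
function buildSitemap(origin: string, entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const loc = xmlEscape(new URL(entry.path, origin).toString());
    const lastmod = entry.lastmod ? "<lastmod>" + xmlEscape(entry.lastmod) + "</lastmod>" : "";
    return "  <url><loc>" + loc + "</loc>" + lastmod + "</url>";
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```

A corpus of around 10,000 cases fits in one file, so a sitemap index is not needed at this size.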

The Result

The site is live with 640+ cases indexed in the initial deployment, with the pipeline ready to process the full 10,000+ corpus. Pages load in under a second. Search returns highlighted snippets instantly. Every court judgment that was buried in eLitigation now has its own clean, fast, linkable page.

The programmatic SEO foundation is solid — thousands of unique pages, each targeting long-tail legal queries that nobody else is serving well. The content is real (government court judgments), the structure is semantic, and the internal linking is systematic.

Next up: Phase 4 of the project, entity extraction. That means pulling judges, lawyers, and law firms from judgment text to build profile pages and relationship graphs, which turns the site from a document repository into a legal intelligence platform.

What I Learned

  • Cloudflare’s stack is absurdly good for this. Workers + D1 + Pages gives you a globally distributed, server-rendered site with a built-in database for effectively $0/month. The DX is excellent.
  • SSR on Workers beats static generation for large catalogs. With 10,000+ pages, a static build would take a long time and would have to be rerun and redeployed on every data update. SSR means the site always reflects the latest data.
  • AI processing at scale is pipeline design, not prompt engineering. The prompts are simple. The hard part is building reliable scraping, deduplication, error handling, and incremental processing.
  • Skip React on Workers. The MessageChannel polyfill issue was a dead end. Pure Astro components render faster, ship less JavaScript, and work perfectly on the edge.
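The pipeline-design lesson above (deduplication, error handling, incremental processing) can be made concrete with a small sketch. Everything here is illustrative: keying on a normalized citation, the `Scraped` shape, the retry count, and the `processOne` callback are assumptions rather than the project's actual code.

```typescript
interface Scraped {
  citation: string;
  text: string;
}

// Normalize a citation so "[2024]  sghc 12" and "[2024] SGHC 12" share one key.
function citationKey(citation: string): string {
  return citation.replace(/\s+/g, " ").trim().toUpperCase();
}

// Incremental step: keep only cases that are neither already processed
// nor duplicated within this scrape.
function selectPending(scraped: Scraped[], done: Set<string>): Scraped[] {
  const seen = new Set<string>();
  const pending: Scraped[] = [];
  for (const item of scraped) {
    const key = citationKey(item.citation);
    if (done.has(key) || seen.has(key)) continue;
    seen.add(key);
    pending.push(item);
  }
  return pending;
}

// Process sequentially with bounded retries; one bad judgment never stops the batch.
async function runBatch(
  pending: Scraped[],
  processOne: (item: Scraped) => Promise<void>, // hypothetical: summarize and store
  maxAttempts = 3,
): Promise<{ ok: string[]; failed: string[] }> {
  const ok: string[] = [];
  const failed: string[] = [];
  for (const item of pending) {
    let success = false;
    for (let attempt = 1; attempt <= maxAttempts && !success; attempt++) {
      try {
        await processOne(item);
        success = true;
      } catch {
        // Swallow and retry; a real pipeline would log the error and back off.
      }
    }
    (success ? ok : failed).push(citationKey(item.citation));
  }
  return { ok, failed };
}
```

Recording the `ok` keys after each run is what makes the next run incremental: only new or previously failed judgments are sent back through the model.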