CribScore Docs

Replace State Portal Scraping

Tear down the scraping stack. CribScore's licensing + inspection + trust surface replaces hundreds of brittle parsers with one bearer-authed REST contract.

What you are replacing

If you maintain Playwright / Puppeteer scrapers across state childcare licensing portals, your pain is in three places: parser drift, schema cleanup, and trust attribution. CribScore moves all three behind one maintained surface.

  • Parser drift — portals change layout; your scrapers break weekly.
  • Schema cleanup — every state uses different column names, statuses, geocoding.
  • Trust attribution — your application code ends up encoding which states are reliable, when it should be a data property.

Migration order

Replace surfaces in the order that pays back fastest. Start with search + detail (highest scraper maintenance cost); end with evidence and risk (newest signals, no scraper equivalent).

  1. Replace `provider_search` scraper → `/v1/facilities`
  2. Replace `provider_detail` scraper → `/v1/facilities/{id}`
  3. Replace `state_coverage` lookup → `/v1/trust/jurisdictions`
  4. Add evidence + risk (no scraper equivalent; net-new capability)
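During a staged migration it can help to keep the scraper-to-endpoint mapping as data, so each legacy call site can be switched over one at a time. The following is a minimal sketch, not part of any CribScore client: `MIGRATION_PLAN` and `replacement_url` are hypothetical names, and the base URL is assumed from the example request on this page.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: host copied from the example request on this page.
BASE_URL = "https://api.cribscore.co"

# Hypothetical table: legacy scraper name -> replacement path, in migration order.
MIGRATION_PLAN = [
    ("provider_search", "/v1/facilities"),
    ("provider_detail", "/v1/facilities/{id}"),
    ("state_coverage", "/v1/trust/jurisdictions"),
]


def replacement_url(scraper: str, facility_id: Optional[str] = None) -> str:
    """Return the CribScore URL that replaces a given legacy scraper."""
    for name, template in MIGRATION_PLAN:
        if name != scraper:
            continue
        if "{id}" not in template:
            return BASE_URL + template
        if not facility_id:
            raise ValueError(f"{scraper} needs a facility_id")
        # Percent-encode the ID so it cannot alter the path.
        return BASE_URL + template.replace("{id}", quote(facility_id, safe=""))
    raise KeyError(f"no replacement mapped for scraper {scraper!r}")


# Print the plan in the order the surfaces should be replaced.
for name, _ in MIGRATION_PLAN:
    print(name, "->", replacement_url(name, facility_id="fac_01HZ8X9K2D7N3M5P0AYR4FTC2W"))
```

Step 4 is left out of the table on purpose: this page does not name the evidence and risk endpoints, so they are not guessed at here.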

Before / after

The migrated code is shorter and trust-tagged out of the box. Provenance, freshness, and trust tier come on every record — no extra plumbing.

After (Python)
# Before: brittle Playwright loop over 50 state portals
# After: one CribScore call with provenance + trust tier baked in
import httpx

response = httpx.get(
    "https://api.cribscore.co/v1/facilities/fac_01HZ8X9K2D7N3M5P0AYR4FTC2W",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30.0,
)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
record = response.json()

print(record["name"], record["license_status"])
print("source:", record["source_url"])
print("trust tier:", record["trust"]["tier"])

Validation

Run the new code against a known-good facility ID in your highest-value state. Confirm the response includes a `source_url` you can open and a `trust.tier` of `launch_ready`.
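The two checks above can be automated so they run in CI alongside the migration. This is a minimal sketch that works on an already-parsed record; `validation_problems` is a hypothetical helper, and it assumes only the `source_url` and `trust.tier` fields shown in the example on this page. It checks that `source_url` is a well-formed http(s) URL, not that the link actually resolves.

```python
from typing import List
from urllib.parse import urlparse


def validation_problems(record: dict, expected_tier: str = "launch_ready") -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []

    # Check 1: source_url must look like something a person can open.
    source_url = record.get("source_url")
    parsed = urlparse(source_url) if isinstance(source_url, str) else None
    if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("source_url is missing or not an http(s) URL")

    # Check 2: trust.tier must match the expected tier.
    trust = record.get("trust")
    tier = trust.get("tier") if isinstance(trust, dict) else None
    if tier != expected_tier:
        problems.append(f"trust.tier is {tier!r}, expected {expected_tier!r}")

    return problems


# Illustrative record only; the values are made up for this sketch.
sample = {
    "name": "Example Childcare Center",
    "source_url": "https://example.org/licensing/12345",
    "trust": {"tier": "launch_ready"},
}
print(validation_problems(sample) or "record passes validation")
```

In practice `record` would be the parsed JSON from the facility detail call, and a non-empty result would fail the build or block the cutover for that state.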