Methodology & Technical Overview

Records Request and Source Materials

In late 2025, a formal public records request was submitted to the Whitman-Hanson Regional School District (WHRSD) seeking a broad set of documents related to district finances and associated administrative records.

The district requested an extension to process the request and established a revised deadline of January 8, 2026. The responsive records were delivered on January 5, 2026.

The delivery consisted of a single PDF file approximately 500 megabytes in size containing 4,179 pages. The document appears to have been generated directly from Google Workspace email auditing and records export systems. All pages contain native, machine-readable text; none are scanned images.

Certain categories of records requested were not included in the delivery. These include, but may not be limited to:

While attachments were not included, many emails reference attachments or shared files by filename and context. These references provide concrete identifiers that may support targeted follow-up records requests.

Document Processing

Because of its size and structure, the original PDF was impractical to navigate, search by keyword, or distribute publicly.

The PDF was programmatically decomposed into individual, page-level PDF files using deterministic naming (records_page_0001.pdf through records_page_4179.pdf). Each file corresponds exactly to a page in the original document.
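The naming scheme can be sketched as a small helper. This is a minimal illustration and not the site's actual build code: the function name page_filename and the bounds check are assumptions, while the file name pattern and the page count come from the description above.

```python
def page_filename(page_number: int, total_pages: int = 4179) -> str:
    """Return the deterministic file name for a 1-based page number."""
    if not 1 <= page_number <= total_pages:
        raise ValueError(f"page {page_number} is outside 1..{total_pages}")
    # Zero-pad to four digits so that file names sort in page order.
    return f"records_page_{page_number:04d}.pdf"


if __name__ == "__main__":
    print(page_filename(1))
    print(page_filename(4179))
```

Because the name depends only on the page number, re-running the split produces identical file names, which keeps links to individual pages stable.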

In parallel, the text content of each page was extracted using PDF text extraction tools configured to preserve layout and spacing. Optical character recognition (OCR) was not required due to the text-based nature of the source.
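The specific extraction tool is not named here. As one possible approach, the sketch below builds a command line for pdftotext (part of Poppler), whose -layout option preserves the physical layout of the page. The choice of pdftotext, the helper name extraction_command, and the .txt output naming are all assumptions made for illustration.

```python
from pathlib import Path


def extraction_command(source_pdf: Path, page_number: int, out_dir: Path) -> list[str]:
    """Build a pdftotext command that extracts one page with layout preserved."""
    # Hypothetical output name, mirroring the page-level PDF naming.
    out_file = out_dir / f"records_page_{page_number:04d}.txt"
    return [
        "pdftotext",
        "-layout",               # keep the original physical layout and spacing
        "-f", str(page_number),  # first page to convert
        "-l", str(page_number),  # last page to convert (the same page)
        str(source_pdf),
        str(out_file),
    ]


if __name__ == "__main__":
    # Prints the argument list; pass it to subprocess.run(..., check=True) to execute.
    print(extraction_command(Path("records.pdf"), 42, Path("text")))
```

Building the argument list separately from running it makes the step easy to inspect and to repeat for every page.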

HTML Page Generation

Each page of the original document was rendered as a corresponding static HTML page. Every page includes:

The extracted text for each page is included in the HTML using visually hidden elements. This text is present in the document object model (DOM) for indexing and search purposes, but is not displayed to readers to avoid duplication or visual clutter.
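A hidden text element of this kind might be generated as follows. This is a sketch under stated assumptions: the class name visually-hidden, the CSS rules, and the helper hidden_text_block are illustrative, and the use of Pagefind's data-pagefind-body attribute to mark the indexed region is assumed rather than confirmed for this site.

```python
from html import escape

# A common "visually hidden" CSS pattern; the class name is illustrative.
VISUALLY_HIDDEN_CSS = """
.visually-hidden {
  position: absolute;
  width: 1px;
  height: 1px;
  overflow: hidden;
  clip-path: inset(50%);
  white-space: nowrap;
}
"""


def hidden_text_block(page_text: str) -> str:
    """Wrap extracted page text in an element that is in the DOM but not shown."""
    # Escape the text so characters such as < and & cannot become markup.
    return (
        '<div class="visually-hidden" data-pagefind-body>'
        f"{escape(page_text)}"
        "</div>"
    )


if __name__ == "__main__":
    print(hidden_text_block("Budget < forecast & notes"))
```

Escaping matters because email text can contain angle brackets and ampersands that would otherwise be parsed as HTML.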

Search Indexing

Full-text search functionality is provided using Pagefind, a static-site search indexing tool. Search indexes are generated at build time and served as static assets.

Only content explicitly marked for indexing is included in the search corpus. User interface elements and navigation controls are excluded.

All search queries are executed client-side within the user’s browser. No search requests are sent to a server or logged by the site itself.

Page Identification and Metadata

Where available, structured fields such as message date and subject line were extracted from the page text and used to generate descriptive page titles.

When such fields are unavailable, a neutral fallback identifier is used. These titles appear in browser metadata and search results and do not alter the underlying record content.
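The title logic can be sketched roughly as below. The header labels "Date:" and "Subject:", the " | " separator, and the "Page N" fallback format are assumptions chosen for illustration; the actual export layout and fallback wording may differ, and the sample text is invented.

```python
import re

# Assumed header labels; the real export format may use different ones.
DATE_RE = re.compile(r"^[ \t]*Date:[ \t]*(.+?)[ \t]*$", re.MULTILINE)
SUBJECT_RE = re.compile(r"^[ \t]*Subject:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def page_title(page_text: str, page_number: int) -> str:
    """Build a descriptive title from header fields, or fall back to a neutral one."""
    parts = []
    for pattern in (DATE_RE, SUBJECT_RE):
        match = pattern.search(page_text)
        if match:
            parts.append(match.group(1))
    if parts:
        return " | ".join(parts)
    # Neutral fallback when no structured fields are found on the page.
    return f"Page {page_number}"


if __name__ == "__main__":
    sample = "Date: Jan 5, 2026\nSubject: FY26 budget\n\nMessage body"
    print(page_title(sample, 12))
    print(page_title("No header fields here", 13))
```

The function only reads the page text and returns a string, so the title never modifies the record itself.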

Issue Reporting Mechanism

Each page includes a “Report this page” link. This link opens a pre-populated email addressed to 02382@whrsd-transparency.org and includes:

This mechanism exists because a manual review of all 4,179 pages prior to publication was not feasible. All reports require manual evaluation. No automated removal or moderation is performed.

Hosting and Infrastructure

The site is deployed as a fully static asset bundle and hosted on Cloudflare Pages behind the custom domain whrsd-transparency.org.

Cloudflare provides HTTPS termination, global content delivery, caching, and DNS management.

Cloudflare Web Analytics is enabled to provide aggregate traffic metrics.

No cookies are set by this site. No browser storage mechanisms (cookies, localStorage, sessionStorage, IndexedDB) are used for analytics, tracking, or user identification.

Cloudflare Web Analytics operates without embedding tracking scripts, pixels, or beacons into site pages. Metrics are derived from standard HTTP request metadata required to deliver content, such as request timestamps, URLs, and response codes, plus a coarse location inferred from the visitor’s IP address at Cloudflare’s edge.

No user profiles are created. No cross-site tracking is performed. No data is shared with third-party analytics platforms or advertising networks.

Search functionality is provided entirely client-side using static assets generated at build time. Search queries are executed locally in the visitor’s browser and are not transmitted to the server or logged.

Scope and Limitations

This site presents the records as provided by the Whitman-Hanson Regional School District. Apart from the processing steps described above, no content has been edited, summarized, or selectively omitted.

The site does not assert completeness of the records provided, nor does it offer interpretation or analysis. Its purpose is to improve accessibility, navigation, and searchability of the materials released.

All omissions or redactions originate from the source materials as delivered.