USPS CASS Certification Guidelines

As part of the Core Address Parsing & Standardization pipeline, the Coding Accuracy Support System (CASS) is the USPS compliance framework that transforms raw, inconsistent US address input into deterministic, deliverability-confirmed records. Where other standardization steps clean and parse, CASS is the authoritative validation layer: it assigns ZIP+4 extensions, Delivery Point Validation (DPV) codes, and Carrier Route identifiers that downstream mailing, logistics, and geocoding systems depend on.


[Figure: CASS address validation pipeline. Raw input (CRM / ERP / form) → Sanitize (UTF-8, whitespace) → Pre-normalize (Publication 28 abbreviations) → CASS engine (DPV, ZIP+4, Carrier Route) → Route by DPV (Y/D to output, S/M to queue) → Validated output.]

Prerequisites

Production Workflow

CASS compliance is a deterministic sequence. Skipping or reordering stages causes DPV mismatches or certification test failures.

Step 1 — Ingest and sanitize

Pull raw records from source systems (CRM, ERP, web forms, legacy databases). At the ingestion boundary:

  • Detect and transcode non-UTF-8 encodings (chardet or explicit codec declarations cover Windows-1252 and ISO-8859-1 export artifacts from legacy systems).
  • Strip non-printable characters and normalize whitespace to a single space.
  • Reject records missing a primary number and street name before they enter the normalization queue.

Use streaming parsers (polars scan, chunked pandas iteration, or generator-based CSV reads) for high-volume ingestion to avoid memory bottlenecks.
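The sanitization rules above can be sketched with the standard library alone. This is a minimal sketch, not a reference implementation: `sanitize_line`, `accept_record`, and the `_PRIMARY_RE` heuristic are hypothetical names introduced here, and the primary-number check assumes a street-style address (PO Box and Rural Route lines would need their own acceptance rule).

```python
import re
import unicodedata
from typing import Optional

# Collapse any run of whitespace (tabs, newlines, repeated spaces) to one space.
_WS_RE = re.compile(r"\s+")
# Hypothetical heuristic: a primary number followed by at least one street token.
_PRIMARY_RE = re.compile(r"^\d+[A-Za-z]?(?:-\d+)?\s+\S+")


def sanitize_line(raw: str) -> str:
    """Drop non-printable characters and normalize whitespace to single spaces."""
    # Turn whitespace-like control characters into spaces first so words do not fuse.
    spaced = "".join(" " if ch in "\t\r\n\x0b\x0c" else ch for ch in raw)
    # Unicode categories starting with "C" cover control and format characters.
    printable = "".join(
        ch for ch in spaced if not unicodedata.category(ch).startswith("C")
    )
    return _WS_RE.sub(" ", printable).strip()


def accept_record(address_line_1: str) -> Optional[str]:
    """Return the sanitized line, or None when no primary number and street are present."""
    line = sanitize_line(address_line_1)
    return line if _PRIMARY_RE.match(line) else None
```

Records for which `accept_record` returns `None` would be rejected before the normalization queue rather than passed downstream.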

Step 2 — Pre-normalize against Publication 28

Convert colloquial inputs into forms the CASS engine can match against its reference tables:

  • Expand directional abbreviations: N → NORTH, SW → SOUTHWEST
  • Standardize suffix variants: ST → STREET, AVE → AVENUE, BLVD → BOULEVARD
  • Enforce canonical secondary-unit designators: #123 → APT 123, 123B → APT B
  • Validate city–state–ZIP triads; flag cross-state ZIP discrepancies for the DPV routing queue
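A minimal sketch of the directional and suffix mappings listed above. The dictionaries are small illustrative subsets, `pre_normalize` is a hypothetical helper, and the positional rules are deliberately conservative assumptions: a directional is only rewritten directly after the primary number or at the end of the line, and a suffix only in the final street position. It also assumes any secondary unit has already been split off the line. If your vendor expects the abbreviated forms instead, the same structure works with the dictionaries inverted.

```python
# Illustrative subsets only; a production table would cover every Publication 28 entry.
DIRECTIONALS = {
    "N": "NORTH", "S": "SOUTH", "E": "EAST", "W": "WEST",
    "NE": "NORTHEAST", "NW": "NORTHWEST", "SE": "SOUTHEAST", "SW": "SOUTHWEST",
}
SUFFIXES = {"ST": "STREET", "AVE": "AVENUE", "BLVD": "BOULEVARD"}


def pre_normalize(line: str) -> str:
    """Uppercase, strip punctuation, and rewrite directionals and suffixes by position."""
    tokens = line.upper().replace(".", "").replace(",", " ").split()
    if len(tokens) < 3:
        return " ".join(tokens)

    suffix_idx = len(tokens) - 1
    # Post-directional: last token, e.g. "100 MAIN ST SW".
    if tokens[-1] in DIRECTIONALS:
        tokens[-1] = DIRECTIONALS[tokens[-1]]
        suffix_idx -= 1
    # Pre-directional: token right after the primary number, but only when a
    # street name still fits between it and the suffix ("100 N ST" keeps its N).
    if tokens[1] in DIRECTIONALS and suffix_idx > 2:
        tokens[1] = DIRECTIONALS[tokens[1]]
    # Suffix: only in the final street position, so "ST JAMES" is left alone.
    if suffix_idx >= 2 and tokens[suffix_idx] in SUFFIXES:
        tokens[suffix_idx] = SUFFIXES[tokens[suffix_idx]]
    return " ".join(tokens)
```

Ambiguous tokens are left untouched on purpose; a wrong rewrite is harder to detect downstream than an unexpanded abbreviation.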

CASS only processes domestic US records. If your pipeline receives mixed-country data, apply International Address Format Standardization to route non-US records to appropriate regional parsers before reaching this step.

For inputs that include PO Boxes or Rural Routes, enforce Publication 28 canonical forms before submission — PO BOX <n> and RR <n> BOX <n>. The Handling PO Boxes and Rural Routes guide covers the extraction patterns and edge cases specific to those address types.

Step 3 — Validate and append (DPV, ZIP+4, Carrier Route)

Route the standardized payload to your CASS vendor endpoint. The engine returns:

| Field | Meaning |
| --- | --- |
| dpv_code | Deliverability verdict: Y, D, S, or M (see table below) |
| zip4 | Four-digit ZIP extension (e.g. 1234) |
| carrier_route | Delivery route code (e.g. C001 city, R001 rural, B001 PO Box) |
| dpv_footnotes | Supplementary flags: vacant, seasonal, military, throwback |
| standardized_line1 | CASS-corrected primary address line |
| standardized_city | USPS-preferred city name |
| standardized_state | Two-letter state abbreviation |
| standardized_zip | Corrected five-digit ZIP |

Parse dpv_footnotes in addition to dpv_code. A dpv_code of S with a footnote of H (unit missing but building confirmed) has a different recovery path than S with footnote N (no match found at all).

Step 4 — Route by DPV code

| DPV Code | Meaning | Action |
| --- | --- | --- |
| Y | Exact match: primary and secondary confirmed deliverable | Write to output |
| D | Default: building confirmed, unit not verified | Write to output; flag for secondary-unit enrichment |
| S | Secondary missing: building exists, unit absent or ambiguous | Route to manual review queue |
| M | Primary missing: no match for primary number + street | Route to fallback or discard |

For S and M codes, consider routing through a multi-API fallback chain before discarding the record — a second geocoding provider may resolve ambiguous addresses that the CASS engine cannot confirm against its reference tables.
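The routing table above can be expressed as a small lookup. This is a sketch under the assumption that routing depends on `dpv_code` alone; the `Route` enum and `route_for` are hypothetical names, and a real pipeline would probably also consult `dpv_footnotes`.

```python
from enum import Enum


class Route(str, Enum):
    OUTPUT = "output"
    OUTPUT_FLAG_SECONDARY = "output_flag_secondary"
    MANUAL_REVIEW = "manual_review"
    FALLBACK = "fallback"


# One entry per DPV code in the routing table.
_ROUTES = {
    "Y": Route.OUTPUT,
    "D": Route.OUTPUT_FLAG_SECONDARY,
    "S": Route.MANUAL_REVIEW,
    "M": Route.FALLBACK,
}


def route_for(dpv_code: str) -> Route:
    """Map a DPV code to its queue; unknown codes fail loudly instead of being dropped."""
    try:
        return _ROUTES[dpv_code]
    except KeyError:
        raise ValueError(f"Unexpected DPV code: {dpv_code!r}") from None
```

Raising on unknown codes is a design choice: a new vendor code then surfaces as an error rather than as records silently missing from every queue.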

Step 5 — Write validated records

Enforce strict typing in the output schema:

  • zip4 as VARCHAR(4) — never numeric (leading zeros are valid)
  • dpv_code as a categorical enum
  • carrier_route as VARCHAR(4)

Store an idempotent hash of the original input alongside the CASS response for reconciliation. This enables efficient deduplication on re-runs and simplifies debugging when upstream schema changes produce unexpected DPV regressions.
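As an illustration of the typing rules above, the sketch below uses SQLite from the standard library. SQLite does not enforce VARCHAR lengths or native enums, so CHECK constraints stand in for both; the table name `cass_output` and the helper `write_record` are assumptions, not part of any vendor schema.

```python
import sqlite3

DDL = """
CREATE TABLE cass_output (
    input_hash    TEXT PRIMARY KEY,
    dpv_code      TEXT NOT NULL CHECK (dpv_code IN ('Y', 'D', 'S', 'M')),
    zip4          TEXT CHECK (
                      zip4 IS NULL
                      OR (length(zip4) = 4 AND zip4 NOT GLOB '*[^0-9]*')
                  ),
    carrier_route TEXT CHECK (carrier_route IS NULL OR length(carrier_route) = 4)
)
"""


def write_record(conn, input_hash, dpv_code, zip4, carrier_route):
    """Upsert keyed on the input hash, so re-runs overwrite instead of duplicating."""
    conn.execute(
        "INSERT OR REPLACE INTO cass_output VALUES (?, ?, ?, ?)",
        (input_hash, dpv_code, zip4, carrier_route),
    )


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
write_record(conn, "a1b2", "Y", "0042", "C001")
write_record(conn, "a1b2", "Y", "0042", "C001")  # re-run with the same key
conn.commit()

rows = conn.execute("SELECT input_hash, zip4 FROM cass_output").fetchall()
```

Because `zip4` is stored as text, the leading zeros survive the round trip, and a value that is not exactly four digits violates the CHECK constraint.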

The Step-by-Step Guide to CASS Address Validation provides the exact API call sequences and error-handling routines for each vendor endpoint.

Primary Code Implementation

"""
cass_pipeline.py — Production CASS validation with async batching and Pydantic v2 validation.
Requires: httpx>=0.27, pydantic>=2.0, tenacity>=8.0
"""

import hashlib
import logging
import uuid
from typing import Literal, Optional

import httpx
from pydantic import BaseModel, Field, ValidationError
from tenacity import retry, stop_after_attempt, wait_exponential

logger = logging.getLogger("cass_pipeline")


class RawAddress(BaseModel):
    address_line_1: str
    address_line_2: Optional[str] = None
    city: str
    state: str
    zip: str
    country: str = "US"

    def input_hash(self) -> str:
        """Stable idempotency key for deduplication and reconciliation."""
        raw = f"{self.address_line_1}|{self.address_line_2}|{self.city}|{self.state}|{self.zip}"
        return hashlib.sha256(raw.encode()).hexdigest()[:16]


class CASSResponse(BaseModel):
    dpv_code: Literal["Y", "D", "S", "M"]
    dpv_footnotes: Optional[str] = None
    zip4: Optional[str] = Field(default=None, min_length=4, max_length=4)
    carrier_route: Optional[str] = None
    standardized_line1: str
    standardized_city: str
    standardized_state: str
    standardized_zip: str

    @property
    def is_deliverable(self) -> bool:
        return self.dpv_code in ("Y", "D")


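# reraise=True surfaces the last underlying exception (e.g. httpx.HTTPStatusError)
# instead of tenacity.RetryError, so callers can catch it directly.
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=30),
    reraise=True,
)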
async def _post_batch(
    client: httpx.AsyncClient,
    api_endpoint: str,
    api_key: str,
    addresses: list[dict],
    request_id: str,
) -> list[dict]:
    """POST a single batch; raises httpx.HTTPStatusError on non-2xx responses."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "X-Request-ID": request_id,
    }
    payload = {"addresses": addresses, "options": {"cass": True, "dpv": True}}
    resp = await client.post(api_endpoint, json=payload, headers=headers, timeout=30.0)
    resp.raise_for_status()
    return resp.json().get("results", [])


async def validate_address_batch(
    client: httpx.AsyncClient,
    addresses: list[RawAddress],
    api_endpoint: str,
    api_key: str,
    chunk_size: int = 500,
) -> list[tuple[RawAddress, Optional[CASSResponse]]]:
    """
    Validate a list of RawAddress records via CASS and return paired results.

    Returns a list of (input, CASSResponse | None) tuples.
    None indicates a validation schema mismatch or non-retryable API error.
    """
    results: list[tuple[RawAddress, Optional[CASSResponse]]] = []

    for offset in range(0, len(addresses), chunk_size):
        chunk = addresses[offset : offset + chunk_size]
        request_id = str(uuid.uuid4())
        payload = [a.model_dump() for a in chunk]

        try:
            raw_results = await _post_batch(client, api_endpoint, api_key, payload, request_id)
        except httpx.HTTPStatusError as exc:
            logger.error(
                "CASS batch failed: status=%s request_id=%s chunk_offset=%d",
                exc.response.status_code,
                request_id,
                offset,
            )
            results.extend((addr, None) for addr in chunk)
            continue

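        # zip() would silently drop records if the vendor truncated the result set,
        # so treat a length mismatch as a failed batch.
        if len(raw_results) != len(chunk):
            logger.error(
                "CASS batch truncated: expected=%d got=%d request_id=%s",
                len(chunk),
                len(raw_results),
                request_id,
            )
            results.extend((addr, None) for addr in chunk)
            continue

        for addr, item in zip(chunk, raw_results):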
            try:
                cass = CASSResponse.model_validate(item)
                results.append((addr, cass))
                logger.info(
                    "dpv=%s zip4=%s route=%s hash=%s",
                    cass.dpv_code,
                    cass.zip4,
                    cass.carrier_route,
                    addr.input_hash(),
                )
            except ValidationError as exc:
                logger.warning("Schema mismatch for %s: %s", addr.input_hash(), exc)
                results.append((addr, None))

    return results

Vectorized pandas variant

import asyncio
import httpx
import pandas as pd
from cass_pipeline import RawAddress, CASSResponse, validate_address_batch


async def validate_dataframe(
    df: pd.DataFrame,
    api_endpoint: str,
    api_key: str,
) -> pd.DataFrame:
    """
    Accepts a DataFrame with columns matching RawAddress fields.
    Returns the original DataFrame with CASS result columns appended.
    """
    addresses = [RawAddress(**row) for row in df.to_dict(orient="records")]

    async with httpx.AsyncClient() as client:
        pairs = await validate_address_batch(client, addresses, api_endpoint, api_key)

    records = []
    for addr, cass in pairs:
        if cass:
            records.append({
                "dpv_code": cass.dpv_code,
                "dpv_footnotes": cass.dpv_footnotes,
                "zip4": cass.zip4,
                "carrier_route": cass.carrier_route,
                "standardized_line1": cass.standardized_line1,
                "standardized_city": cass.standardized_city,
                "standardized_state": cass.standardized_state,
                "standardized_zip": cass.standardized_zip,
                "cass_hash": addr.input_hash(),
            })
        else:
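            # Null the CASS columns, but keep the input hash so failed rows can
            # still be reconciled and retried.
            records.append({
                **dict.fromkeys([
                    "dpv_code", "dpv_footnotes", "zip4", "carrier_route",
                    "standardized_line1", "standardized_city", "standardized_state",
                    "standardized_zip",
                ]),
                "cass_hash": addr.input_hash(),
            })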

    cass_df = pd.DataFrame(records)
    return pd.concat([df.reset_index(drop=True), cass_df], axis=1)

USPS DPV Footnote Reference

The dpv_footnotes string carries one or more two-character codes. The most operationally significant:

| Code | Meaning | Recommended action |
| --- | --- | --- |
| AA | Input ZIP + city/state matched a valid ZIP | Proceed |
| A1 | ZIP not matched; ZIP correction applied | Log correction; audit upstream |
| BB | Entire address DPV confirmed (code Y) | Write to output |
| CC | Primary number invalid; corrected by engine | Log correction |
| N1 | Address missing secondary number (apt/unit) | Route to secondary-unit enrichment |
| M1 | Primary number missing | Manual review queue |
| M3 | Primary number invalid | Manual review queue |
| P1 | PO Box ZIP Code was assigned | Verify intent |
| RR | Confirmed rural route address | Proceed |
| R1 | Rural route default: RR found, box not confirmed | Verify box number |
| H# | Unit number confirmed (H3 = exact, H6 = only highrise default) | Proceed / flag |
| F1 | Military address (APO, FPO, DPO) | Route to military-mail path |
| G1 | General delivery address | Flag; not a standard residential/commercial delivery |
| U1 | Unique ZIP code (campus, firm, USPS facility) | Proceed; typically confirmed |
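A minimal sketch of splitting the footnote string into its two-character codes. It assumes the vendor returns the codes concatenated (for example "AABB"), possibly with spaces; check your vendor's actual wire format. `FOOTNOTE_ACTIONS` covers only a subset of the table above, and both helper names are hypothetical.

```python
from typing import Optional

# Subset of the reference table; action labels are illustrative queue names.
FOOTNOTE_ACTIONS = {
    "AA": "proceed",
    "A1": "log_correction",
    "BB": "write_output",
    "N1": "secondary_unit_enrichment",
    "M1": "manual_review",
    "M3": "manual_review",
    "F1": "military_mail_path",
}


def split_footnotes(raw: Optional[str]) -> list[str]:
    """Split a concatenated footnote string into two-character codes."""
    if not raw:
        return []
    compact = raw.replace(" ", "").upper()
    if len(compact) % 2:
        raise ValueError(f"Footnote string has odd length: {raw!r}")
    return [compact[i : i + 2] for i in range(0, len(compact), 2)]


def actions_for(raw: Optional[str]) -> list[str]:
    """Look up the recommended action per code; unknown codes are marked 'unmapped'."""
    return [FOOTNOTE_ACTIONS.get(code, "unmapped") for code in split_footnotes(raw)]
```

For example, `actions_for("AAN1")` yields one action per code, so a record can proceed on ZIP match while still being queued for secondary-unit enrichment.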

Edge Cases

Secondary unit ambiguity

CASS requires explicit, recognized unit designators (APT, STE, UNIT, FL, RM, BLDG). Inputs like #123, 123B, or Apt. 4 (with punctuation) fail to match despite referring to valid delivery points. Normalize secondary components to {DESIGNATOR} {VALUE} format before submission.

import re

_SECONDARY_RE = re.compile(
    # \b keeps "No" from matching inside a word (e.g. "RENO 5").
    r"(?P<pre>.*?)\s*(?:#|\bNo\.?)\s*(?P<num>\d+[A-Za-z]?)\s*$",
    re.IGNORECASE,
)

def normalize_secondary(line: str) -> str:
    """Convert '#123' or 'No. 4B' patterns to 'APT {num}' for CASS compatibility."""
    m = _SECONDARY_RE.match(line.strip())
    if m:
        return f"{m.group('pre').strip()} APT {m.group('num').upper()}".strip()
    return line

PO Box and Rural Route formatting

CASS processes these delivery types differently from street addresses. Submissions must strictly follow Publication 28 canonical forms. Refer to Handling PO Boxes and Rural Routes for extraction patterns covering colloquial variants like P.O. Box, Post Office Box, and Rt. 2 Bx 15.

State / ZIP mismatch

If a record’s state abbreviation does not correspond to the ZIP code prefix ranges, the CASS engine either rejects the record or silently overrides the state. Always log standardized_state alongside original_state and route any mismatch to an audit queue — these often reveal upstream data-entry errors or multi-state ZIP codes near state borders.
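The audit rule above reduces to a comparison plus an allow-list. In this sketch, `needs_state_audit` and `KNOWN_CROSS_BORDER_ZIPS` are hypothetical; the allow-list is left empty and would be populated from your own audit history rather than from any list implied here.

```python
# Hypothetical allow-list of ZIPs already reviewed as legitimate cross-border cases.
KNOWN_CROSS_BORDER_ZIPS: frozenset = frozenset()


def needs_state_audit(
    original_state: str,
    standardized_state: str,
    zip5: str,
    known_cross_border: frozenset = KNOWN_CROSS_BORDER_ZIPS,
) -> bool:
    """True when the engine changed the state and the ZIP is not a known exception."""
    changed = original_state.strip().upper() != standardized_state.strip().upper()
    return changed and zip5 not in known_cross_border
```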

Unicode and encoding drift

Legacy CRM or ERP exports frequently use Windows-1252 or ISO-8859-1. Characters like Ñ, smart quotes, or accented vowels corrupt silently if not transcoded at ingest. Apply chardet detection at the file-open boundary and encode explicitly to UTF-8 before any string operations. If the addresses contain non-ASCII characters that survived transcoding, also apply Unicode and character normalization (NFKC) before CASS submission to collapse ligatures and compatibility forms.
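A minimal sketch of the transcode-then-normalize step. To stay dependency-free it substitutes an explicit codec fallback (UTF-8 first, then Windows-1252) for chardet detection; that substitution and the helper names are assumptions made for illustration.

```python
import unicodedata


def decode_bytes(data: bytes) -> str:
    """Decode as UTF-8, falling back to Windows-1252 for legacy exports."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes undefined in cp1252 become U+FFFD instead of raising.
        return data.decode("cp1252", errors="replace")


def normalize_text(text: str) -> str:
    """Apply NFKC to collapse ligatures and compatibility forms."""
    return unicodedata.normalize("NFKC", text)
```

Trying UTF-8 first is safe because valid UTF-8 rarely decodes correctly by accident, whereas almost any byte sequence decodes under Windows-1252.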

Batch size and silent truncation

Vendors typically enforce payload caps between 1,000 and 5,000 records per request. Exceeding the cap silently truncates the result set in some implementations; others return HTTP 413. Always chunk defensively (500 per batch sits well under the typical caps) and check that len(results) == len(chunk) after each API call to detect truncation immediately.

Performance and Vectorization

| Approach | Throughput (records/sec) | Notes |
| --- | --- | --- |
| Synchronous requests loop | ~20–80 | Baseline; unsuitable for volumes above 10k |
| httpx async with asyncio.gather | ~800–2,000 | Saturates most vendor rate limits; add semaphore |
| httpx async + semaphore (50 concurrent) | ~400–600 | Respects typical vendor quotas; preferred default |
| Parallel processes (multiprocessing) | ~2,000–5,000 | Only worthwhile above 500k records/run |

Practical recommendations:

  • Use asyncio.Semaphore(50) to cap concurrent requests and avoid 429 Too Many Requests responses.
  • Prefer polars for pre/post-processing: polars lazy evaluation and Arrow-backed columns process 1M-row address frames in under 10 seconds on commodity hardware, versus 45–90 seconds with pandas.
  • Cache CASS results by input hash in Redis (TTL 30 days). Re-runs on incrementally updated CRM exports typically see 60–80% cache hit rates, cutting API costs proportionally. See API Quota Tracking and Cost Management for budget guardrails around per-call vendor costs.
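The semaphore recommendation above can be sketched as a generic bounded runner. `run_bounded` is a hypothetical helper, and the demo uses a stand-in worker that sleeps instead of calling a vendor, so the concurrency cap can be observed without network access.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_bounded(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrent: int = 50,
) -> list[R]:
    """Run worker(item) for every item with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: T) -> R:
        async with semaphore:
            return await worker(item)

    # gather preserves input order, so results line up with items.
    return await asyncio.gather(*(guarded(item) for item in items))


async def _demo() -> tuple[list[int], int]:
    """Stand-in worker that records peak concurrency instead of calling a vendor."""
    active = 0
    peak = 0

    async def fake_batch_call(batch: list[str]) -> int:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # simulated network latency
        active -= 1
        return len(batch)

    batches = [["a", "b"], ["c"], ["d", "e", "f"]] * 5
    sizes = await run_bounded(batches, fake_batch_call, max_concurrent=4)
    return sizes, peak


sizes, peak = asyncio.run(_demo())
```

In a real pipeline the worker would wrap one batch POST, and `max_concurrent` would be tuned against the vendor's documented rate limit.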

Certification Testing and Maintenance

Annual recertification cycle

The USPS releases updated test datasets each year containing newly constructed streets, retired delivery points, and edge cases added from real-world failure reports. To maintain certification:

  1. Download the official test suite from the USPS PostalPro portal.
  2. Run your engine against the full dataset in a staging environment that mirrors production configuration exactly.
  3. Achieve ≥ 98% accuracy on DPV matching and ZIP+4 assignment.
  4. Submit results via the vendor portal or direct USPS submission system before the certification deadline.

Monthly database updates

Address data decays rapidly — the USPS estimates 14–18% of addresses change annually. Automate monthly reference-table updates:

  • Schedule updates during low-traffic windows (02:00–04:00 UTC).
  • Use database transactions to swap reference tables atomically; partial-state queries during an update produce transient DPV failures that are difficult to distinguish from structural errors.
  • Tag versions (v2026.05, v2026.06) to enable rollback if a vendor release introduces regressions.

Continuous monitoring

Deploy dashboards tracking:

  • cass_api_latency_p99 — alert at > 2 s
  • dpv_match_rate — alert if drops below 95%; common causes: stale vendor data, upstream schema drift, or secondary-unit normalization regression
  • error_rate_by_code — broken down by S, M, and API failures
  • batch_throughput_records_per_second — baseline this at deploy and alert on sustained 20% drops

Correlate latency spikes with vendor status pages to distinguish internal bottlenecks from external outages before escalating.
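As a rough sketch of the dpv_match_rate alert, the function below treats Y and D as matches, mirroring the is_deliverable property earlier in this guide; that definition, the helper names, and the handling of empty windows are assumptions to adapt to your own metric definitions.

```python
from typing import Iterable, Optional


def dpv_match_rate(codes: Iterable[Optional[str]]) -> Optional[float]:
    """Share of records with DPV code Y or D; None when the window is empty."""
    window = list(codes)
    if not window:
        return None
    matched = sum(1 for code in window if code in ("Y", "D"))
    return matched / len(window)


def should_alert(rate: Optional[float], threshold: float = 0.95) -> bool:
    """Alert only on a measured rate below the threshold, never on an empty window."""
    return rate is not None and rate < threshold
```

Records that failed validation entirely (code `None`) count against the rate here, which keeps API failures from hiding a drop.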

Troubleshooting

DPV match rate drops suddenly

Root cause: Monthly vendor database update introduced a regression, or upstream data schema changed silently (new source system exporting state as full name rather than two-letter code).

Fix: Roll back to the previous vendor version tag. Run the recertification test suite against both versions to confirm the regression. If the upstream schema changed, update the pre-normalization layer and re-process the affected date range.

API returns HTTP 413 errors

Root cause: Batch payload exceeds vendor’s undocumented size limit (some vendors count bytes, not records).

Fix: Reduce chunk_size to 200 and add a payload-size guard that measures encoded bytes rather than characters: assert len(json.dumps(payload).encode("utf-8")) < 512_000. Dynamic chunking based on average record size is more robust than a fixed record count.
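One way to sketch byte-aware chunking: accumulate records until either a byte budget or a record cap would be exceeded. `chunk_by_bytes` is a hypothetical helper; the 512,000-byte budget and 200-record cap are the figures from the fix above, and the size estimate covers only the address list, so leave headroom for the request envelope.

```python
import json
from typing import Iterator


def chunk_by_bytes(
    records: list[dict],
    max_bytes: int = 512_000,
    max_records: int = 200,
) -> Iterator[list[dict]]:
    """Yield chunks whose JSON-encoded size stays within max_bytes."""
    chunk: list[dict] = []
    size = 2  # the enclosing "[" and "]"
    for rec in records:
        # +2 for the ", " separator json.dumps puts between items (slight overestimate).
        rec_size = len(json.dumps(rec).encode("utf-8")) + 2
        if chunk and (size + rec_size > max_bytes or len(chunk) >= max_records):
            yield chunk
            chunk, size = [], 2
        chunk.append(rec)
        size += rec_size
    if chunk:
        yield chunk


records = [{"address_line_1": f"{i} MAIN ST"} for i in range(450)]
chunks = list(chunk_by_bytes(records))
small_chunks = list(chunk_by_bytes(records[:10], max_bytes=100))
```

A single record larger than the budget is still emitted on its own, so oversize records surface as a vendor error rather than disappearing.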

zip4 field is None on confirmed deliverable records

Root cause: The address matched at the building level (DPV code D) but a unique ZIP+4 cannot be assigned without a confirmed unit. Common for new construction or recently subdivided parcels.

Fix: This is expected behavior. Store the record with zip4=None and flag it for periodic re-validation. Once the USPS adds the unit to its reference tables (typically within one to two monthly cycles), a re-submission will return a full ZIP+4.

Silent state override in output

Root cause: The CASS engine corrected a state–ZIP mismatch by trusting the ZIP code over the submitted state field.

Fix: Always log original_state alongside standardized_state. Route any mismatch to a data-quality queue. Multi-state ZIP codes (ZIP codes near state borders assigned to post offices in the neighboring state) are a common cause; maintain a lookup table of known cross-border ZIPs to reduce false-positive audit flags.

tenacity retries exhausted on large batch jobs

Root cause: Sustained 429 Too Many Requests responses indicate the concurrency cap needs to be lowered, or the vendor’s daily quota has been reached.

Fix: Reduce asyncio.Semaphore count to 20 and add a daily quota guard using the approach in API Quota Tracking and Cost Management. If the quota is genuinely exhausted, checkpoint the current offset and resume the next day; the input_hash idempotency key prevents double-billing previously processed records.

FAQ

Do I need to obtain CASS certification directly from USPS or can I use a vendor?

Most production teams use a CASS-certified vendor (Smarty, Melissa, Loqate, etc.) rather than pursuing direct USPS certification. Direct certification requires annual testing submissions and is typically reserved for large mailers or software vendors distributing certified engines. Using a certified vendor API is compliant for downstream pipelines.

What DPV code means an address is fully deliverable?

A DPV code of Y (confirmed deliverable to the exact unit) is the gold standard. D (default — building confirmed, unit not verified) is considered conditionally deliverable. S (secondary missing) and M (missing primary number) require human review or fallback routing before treating the record as deliverable.

How often does the USPS address database change?

USPS issues monthly address data updates (NCOALink, AMS, DPV reference files). New subdivisions, renamed streets, and retired delivery points appear within one to two update cycles. Vendors propagate these within days of the USPS release date. Pipelines that skip monthly updates can see DPV match rates drop by 1–3 percentage points per quarter as the reference data ages.

Can CASS validation handle PO Boxes and Rural Routes?

Yes, but only if they are formatted per USPS Publication 28 before submission. PO Box inputs must use the canonical form PO BOX <number>, and Rural Routes must follow the RR <n> BOX <n> convention. Colloquial variants like P.O. Box or Route 2 Box 15 trigger M or S DPV codes despite referring to valid delivery points.

What happens to non-US addresses sent to a CASS engine?

A CASS engine has no reference data for non-US addresses and will return a non-deliverable DPV code or an error. You must branch before the CASS call: detect country, route domestic US records to CASS, and send international records to a separate normalization path. Failing to branch contaminates DPV match-rate metrics with structurally unresolvable failures.
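The branch described above can be sketched as a simple partition. `split_by_country` and the alias set are hypothetical, and treating a missing country as US mirrors the RawAddress default earlier in this guide; pipelines with unreliable country fields may prefer to route missing values to review instead.

```python
# Hypothetical alias set; extend it with whatever spellings your sources emit.
US_COUNTRY_VALUES = frozenset(
    {"US", "USA", "UNITED STATES", "UNITED STATES OF AMERICA"}
)


def split_by_country(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Partition records into (domestic, international) before any CASS call."""
    domestic: list[dict] = []
    international: list[dict] = []
    for rec in records:
        # Missing or empty country defaults to US, as in the RawAddress model.
        country = str(rec.get("country") or "US").strip().upper().replace(".", "")
        (domestic if country in US_COUNTRY_VALUES else international).append(rec)
    return domestic, international
```

Only the domestic list would be submitted to the CASS endpoint, which keeps unresolvable international records out of the DPV match-rate metric.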