
How to Evaluate a Crime Data API: Coverage, Freshness, and What Actually Matters

📅 May 5, 2026 · ⏱ 11 min read · By SpotCrime

Most crime data APIs look the same from the documentation. Coverage claims are vague, freshness claims are optimistic, and taxonomy differences are buried in footnotes. Here is a seven-dimension framework for evaluating what you're actually buying — before you build production infrastructure on it.

When developers evaluate payment APIs, they look at uptime SLAs, latency distributions, error rate history, and documentation quality. The evaluation is systematic because the stakes are obvious — a flaky payment integration fails a checkout flow, and someone notices immediately.

Crime data APIs warrant the same rigor, but rarely get it. Most teams sign up for a trial, test a few endpoints, confirm that addresses return results, and proceed. The failure modes that matter — stale data, geographic gaps, misclassified incidents — show up later, in production, when a user notices that a “real-time” crime feed has not updated in 36 hours, or that a violent incident in their neighborhood is missing from the database entirely.

What follows is a framework for evaluating crime data APIs before you build. Seven dimensions, in rough order of importance, with specific questions to ask and benchmarks to compare against.

1. Geographic Coverage: What “We Cover the US” Actually Means

Every major crime data provider claims national coverage. Those claims are rarely false, but they are almost never equivalent.

The most important distinction is between coverage that comes from direct agency relationships and coverage that is aggregated, imputed, or scraped. A direct agency relationship means the provider has an established feed from a department's records management system or CAD (computer-aided dispatch) system. Aggregated coverage means the provider is pulling from secondary sources — other aggregators, public data portals, or FOIA-harvested datasets.

The distinction matters for three reasons:

  • Freshness scales with relationship type. Direct agency feeds can update in near-real-time. Scraped public portals update when the portal does — which may be weekly, monthly, or less often.
  • Gap behavior differs. When a direct feed fails, you know it. When a scraped source stops publishing, you may not find out for weeks.
  • Verification differs. Incident records from direct feeds often include internal case numbers and classification fields that aggregated data rarely preserves.

Ask the provider: How many direct agency relationships do you maintain? What percentage of the US population is covered by those direct relationships? What do you serve for agencies not in your network?

The FBI UCR historically covered roughly 85–90% of the US population through participating agencies. But the 2021 transition from the legacy Summary Reporting System to NIBRS caused that coverage to fall sharply: cities representing approximately 37% of the US population disappeared from national datasets for at least two years as agencies completed the transition. Any API still relying on legacy UCR data carries that gap — even if the provider's marketing copy predates it.

2. Data Freshness: What “Real-Time” Means in Practice

“Real-time” has become a marketing term. In crime data, it describes anything from sub-hour dispatch-sourced alerts to “we update our database daily from the previous day's published reports.” Providers at both ends of that range call it real-time. The latency difference is 23 hours.

The right benchmark is not absolute freshness — it is whether the freshness is adequate for your use case. Different applications have very different requirements:

Freshness Requirements by Application Type

Safety alerting and family location apps

Need incident data within 30–60 minutes of an event. Slower data creates false safety signals or triggers alerts after the window of relevance has passed.

Real estate platforms and insurance underwriting

Can tolerate 24–48 hour latency. Historical trend accuracy matters more than minute-level freshness.

Academic research and policy analysis

Typically works with quarterly or annual aggregates. The FBI UCR's 12–18 month publication lag is a major constraint for researchers, but irrelevant to a mortgage underwriting model calibrated on multi-year averages.

What to look for in provider documentation: average publication latency from event time, p95 latency (the tail matters more than the median for alerting products), update frequency by incident type, and whether the provider distinguishes between “report filed” time and “incident occurred” time. These are often different by hours or days, and the distinction is critical for time-of-day analysis.
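
During a trial you can check freshness claims yourself: pull a sample of recent records and compute the latency distribution from event time to publication time. Below is a minimal sketch under the assumption that each record carries incident_occurred_at and published_at fields as ISO 8601 strings — substitute whatever the provider's actual schema calls them.

```python
from datetime import datetime

def publication_latency_hours(incidents):
    """Hours between when an incident occurred and when it appeared in the API.
    Field names are hypothetical; substitute the provider's actual schema."""
    latencies = []
    for rec in incidents:
        occurred = datetime.fromisoformat(rec["incident_occurred_at"])
        published = datetime.fromisoformat(rec["published_at"])
        latencies.append((published - occurred).total_seconds() / 3600)
    return sorted(latencies)

def latency_summary(latencies):
    """Median and p95; for alerting products the tail matters more than the median."""
    median = latencies[len(latencies) // 2]
    p95 = latencies[int(len(latencies) * 0.95)]
    return {"median_hours": round(median, 1), "p95_hours": round(p95, 1)}
```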

3. Incident Taxonomy and Normalization

This is where most developers underestimate the complexity of crime data — and where the quality gap between providers is widest.

Police departments across the US use different crime classification systems. Some use FBI NIBRS offense codes — 52 Group A offenses and 10 Group B offense categories, with granular distinctions between, for example, aggravated assault with a firearm versus aggravated assault with a dangerous weapon. Others use internal codes that map imprecisely to national standards. Agencies still reporting under the legacy UCR summary system submitted aggregate counts, with no incident-level codes at all.

The practical consequence: a raw “theft” record from Chicago and a raw “theft” record from Los Angeles may not represent the same set of events. One may include motor vehicle theft, which the other categorizes separately. One may include shoplifting below a dollar threshold; the other may exclude it entirely.

Good normalization means the provider has mapped agency-specific codes to a consistent internal taxonomy, so that a query for “burglary” returns burglary incidents across all covered agencies — not whatever each agency called it internally. Poor normalization means your application logic needs to handle those discrepancies. It will not do so correctly for agencies it has never encountered before.
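
To make that concrete, here is a toy sketch of what a normalization layer does. The agencies, codes, and mappings below are invented for illustration; a real mapping table spans hundreds of agencies and thousands of code pairs, and needs ongoing maintenance as departments change their systems.

```python
# Illustrative mapping from agency-specific offense codes to a normalized
# taxonomy. The agencies and codes below are invented for the example;
# real mappings span hundreds of agencies and thousands of code pairs.
AGENCY_CODE_MAP = {
    ("agency_a", "0810"): "theft",
    ("agency_a", "0910"): "motor_vehicle_theft",
    ("agency_b", "510"): "motor_vehicle_theft",
    ("agency_b", "442"): "theft",  # shoplifting folded into theft by this agency
}

def normalize_offense(agency_id, raw_code):
    """Map an (agency, code) pair to the internal taxonomy, or flag it for
    review instead of silently guessing when the pair has never been seen."""
    return AGENCY_CODE_MAP.get((agency_id, raw_code), "unmapped")
```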

Specific questions to ask:

  • Does the provider publish their offense taxonomy in full?
  • How do they handle multi-offense incidents — where a single event involves both assault and weapons possession?
  • How do they handle incidents that get reclassified after the initial report is filed?
  • Are their categories aligned to NIBRS, and how do they handle legacy UCR data?

For context on how labor-intensive this step is: the crimede-coder.com Python crime data science guide describes normalization as one of the most time-consuming phases of building a local crime analytics pipeline — and that is for a single agency's data. At scale across hundreds of agencies, the problem compounds substantially.

4. Incident-Level vs. Aggregate Access

APIs differ in a fundamental structural way that documentation often obscures: whether they serve individual incident records or only aggregate counts.

Aggregate APIs return counts of incidents by geography and time period — for example, “237 thefts in ZIP code 90210 in Q1 2026.” Incident-level APIs return individual records: a location, a date and time, an incident type, and a case number.

The distinction matters because aggregate access cannot support:

  • Proximity queries — how many incidents within 500 meters of this address? You cannot compute this from aggregate counts by ZIP code.
  • Fine-grained time-series analysis — daily or hourly incident patterns require individual timestamps.
  • Real-time alerting — alerting requires individual incident records with event times, not periodic batch aggregates.
  • Safety score computation — weighting incidents by recency, distance, and severity requires incident-level data. Aggregate counts give you a number; they do not give you the ingredients to build a SpotScore™.

Some providers offer a graduated access model: aggregate at a lower tier, incident-level at higher tiers. Others are aggregate-only because their sourcing method — pulling from published city portals — does not yield individual records. Confirm which you are getting before integrating, not after.
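
To make the difference concrete, here is a rough sketch of the kind of computation only incident-level access supports: a 500-meter proximity filter with a simple recency-weighted score. The record fields are assumed names, and the weighting scheme is illustrative only — it is not SpotCrime's SpotScore™ methodology.

```python
import math
from datetime import datetime, timezone

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_score(incidents, lat, lon, radius_m=500, half_life_days=90):
    """Recency-weighted count of incidents within radius_m of a point.
    Each record is assumed to carry "lat", "lon", and a timezone-aware
    ISO 8601 "occurred_at" timestamp. This cannot be computed from
    ZIP-level aggregate counts."""
    now = datetime.now(timezone.utc)
    score = 0.0
    for rec in incidents:
        if haversine_m(lat, lon, rec["lat"], rec["lon"]) > radius_m:
            continue
        age_days = (now - datetime.fromisoformat(rec["occurred_at"])).days
        score += 0.5 ** (age_days / half_life_days)  # exponential recency decay
    return score
```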

5. SLA, Uptime, and Rate Limits

Research tools are evaluated on accuracy. Production infrastructure is evaluated on reliability. Crime data APIs are rarely positioned as production infrastructure, but that is exactly what they become the moment a user depends on them.

Most crime data providers do not publish SLA documentation with the specificity that cloud infrastructure providers do. This is a gap. For applications where data availability directly affects user experience — safety apps, corporate security dashboards, real estate listings — a provider that goes offline for 48 hours without notice is not an acceptable dependency.

Questions to ask before signing a contract:

  • What is the committed uptime SLA, and what remediation applies when it is missed?
  • What are the rate limits — requests per minute, requests per day, burst ceiling?
  • How are rate limit errors surfaced — 429 with Retry-After headers, or silent degradation?
  • Is there a status page with historical uptime data publicly accessible?
  • What is the incident response time when a data feed fails, and how are customers notified?

Rate limits deserve particular attention for high-traffic applications. A family safety platform with millions of active users cannot operate on a 1,000-request-per-minute cap without aggressive caching and query batching. Know the numbers before you architect around them — not after you hit the wall in production.
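
If the provider does surface rate limits as 429 responses with Retry-After headers, your client should honor them rather than hammering the endpoint. A minimal sketch:

```python
import time
import requests

def get_with_backoff(url, params=None, max_retries=5):
    """GET a rate-limited endpoint, honoring Retry-After on HTTP 429.
    Falls back to exponential backoff when the header is absent."""
    for attempt in range(max_retries):
        resp = requests.get(url, params=params, timeout=10)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        retry_after = resp.headers.get("Retry-After")
        wait = float(retry_after) if retry_after else 2 ** attempt
        time.sleep(wait)
    raise RuntimeError(f"Still rate-limited after {max_retries} attempts: {url}")
```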

6. Documentation and Developer Experience

API documentation quality is a leading indicator of how much integration pain you will experience — both initially and ongoing. Providers that have invested in developer experience have typically also invested in API design, error handling consistency, and support responsiveness.

What good documentation looks like:

  • An OpenAPI specification (Swagger-compatible) for every endpoint
  • Consistent, enumerated error codes with documented meanings
  • Code examples in at least two languages (Python and JavaScript are the baseline)
  • A sandbox environment with realistic test data
  • A changelog with versioning policy — so you know what breaking changes look like

What poor documentation looks like: a PDF from 2023 with endpoint descriptions but no examples, no error code reference, no changelog, and a support email with a five-business-day SLA.

A practical benchmark: a developer with no prior exposure to the API should be able to make a successful request within 15 minutes of reading the documentation. If they cannot, the documentation is below standard for a production dependency.
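
A concrete version of that benchmark: the first successful call should look roughly like the snippet below. The endpoint, parameters, and response fields here are placeholders, not any particular provider's API.

```python
import requests

# Hypothetical endpoint and parameters -- substitute the provider's documented values.
BASE_URL = "https://api.example-crime-provider.com/v1/incidents"
API_KEY = "your-api-key"

resp = requests.get(
    BASE_URL,
    params={"lat": 39.2904, "lon": -76.6122, "radius_m": 500, "since": "2026-04-01"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
resp.raise_for_status()
for incident in resp.json().get("incidents", []):
    print(incident.get("offense_category"), incident.get("occurred_at"))
```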

7. Privacy, Compliance, and Data Suppression Policy

Not all crime data should be published at incident level. A quality provider has a clear, documented policy about what they suppress and why — and that policy reflects genuine consideration of privacy harms, not just minimum legal compliance.

Appropriate suppressions include:

  • Exact victim home addresses. Incident location is sufficient for neighborhood safety analysis; publishing where a victim lives enables stalking and retaliation.
  • Records involving juvenile offenders. Most state statutes restrict public access to juvenile court and arrest records.
  • Ongoing investigations. Departments regularly withhold incident data that could compromise active investigations or witness safety.
  • Records below a reporting threshold. Some providers suppress individual incidents in low-density areas where a single record could de-anonymize a victim.

These suppressions directly affect application design. If your product needs exact address-level precision on victim-related incidents, a privacy-compliant provider will not give you that — and you should not want it. If you are building a real-time alerting product, understand whether “real-time” means dispatch-sourced immediately or published after the incident report has been reviewed and cleared for release. Those timelines differ by hours and sometimes days.
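
If you ever handle raw agency feeds directly rather than a provider's pre-filtered output, the same policy has to exist in your own pipeline. A rough sketch with invented field names — not a substitute for reviewing the relevant statutes and agency agreements:

```python
def should_suppress(rec):
    """Illustrative suppression rules with invented field names.
    A real policy is driven by statute, agency agreements, and review."""
    if rec.get("involves_juvenile_offender"):
        return True  # juvenile records are restricted in most states
    if rec.get("investigation_status") == "active":
        return True  # avoid compromising ongoing investigations
    if rec.get("location_type") == "victim_residence":
        return True  # never publish exact victim home addresses
    return False

def publishable(records):
    """Filter a batch of raw records down to those safe to expose."""
    return [rec for rec in records if not should_suppress(rec)]
```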

A note on CJIS: the Criminal Justice Information Services Security Policy governs access to certain law enforcement databases — arrest records, criminal histories, data accessed through restricted law enforcement networks. Most consumer-grade crime data APIs work with publicly reported crime data, which is generally not subject to CJIS controls. But if your product involves integration with law enforcement-facing systems or restricted records, the CJIS boundary matters and should be clarified explicitly with any prospective provider.

Putting It Together: A Practical Scorecard

The seven dimensions above translate into a concrete evaluation checklist. Run every prospective provider through it before a contract conversation:

Dimension | What to Ask | Minimum Bar
Geographic Coverage | Direct agency count? Population % from direct feeds? | Named agencies, stated coverage percentage
Data Freshness | Average and p95 latency from event to API? | Under 2 hours for alerting; 24h for underwriting
Incident Taxonomy | Published offense taxonomy? NIBRS-aligned? | Documented, consistent across agencies
Access Level | Incident-level or aggregate only? | Incident-level for proximity queries and alerting
SLA / Rate Limits | Uptime SLA? RPM ceiling? Status page? | Published SLA with historical uptime data
Documentation | OpenAPI spec? Sandbox? Code examples? | Working request in under 15 minutes
Privacy Policy | Suppression policy documented? PII handling? | Published suppression policy, CJIS boundary stated

For context: free public sources — the FBI Crime Data Explorer, city open data portals, and tools like the Real-Time Crime Index — are invaluable for research and benchmarking, but almost none satisfy all seven dimensions for production use. As we covered in our recent comparison of the RTCI and FBI UCR, the FBI CDE has excellent taxonomy documentation and reasonable geographic coverage, but an 18-month publication lag disqualifies it for real-time applications. The RTCI has better freshness but limited incident-level access. Municipal open data portals vary enormously — some publish daily; others have not been updated since 2022.

The practical implication is that most production applications end up evaluating commercial providers. The seven-dimension framework above is the right starting point for that evaluation — not the demo, not the pricing page, and not the marketing copy. Ask for the specifics. A provider that cannot answer questions about p95 latency, suppression policy, and agency relationship count probably cannot answer them because the answers are not good.

The field is maturing. The providers who survive in production will be the ones who can answer these questions precisely — and whose answers hold up over time.

Access Address-Level Crime Data

Real-time incidents · SpotScore™ safety ratings · 36-month trends · 22,000+ US cities. Normalized and verified — because raw data isn't enough.