arxiv:2605.21712

Broadening Access to Transportation Safety Data with Generative AI: A Schema-Grounded Framework for Spatial Natural Language Queries

Published on May 20

Submitted by Mahdi Azhdari on May 26

University of Massachusetts Amherst


Authors:

Mahdi Azhdari ,

Abstract

A natural language interface for transportation safety analysis uses large language models to translate user queries into structured spatial operations while maintaining deterministic database execution for reliable and reproducible results.

AI-generated summary

Transportation safety analysis requires integrating crash records, roadway attributes, and geospatial data through GIS-based workflows, but access remains uneven across agencies and community stakeholders. Technical prerequisites create a gap between analytical tools central to safety planning and the practitioners able to use them. Local agencies, school committees, and residents may have safety concerns but limited capacity to retrieve, filter, map, and analyze relevant data. Generative AI offers a way to narrow this divide, but its public-sector use raises questions about reliability, reproducibility, and governance. This paper presents a schema-grounded natural language interface for transportation safety analysis, using a large language model (LLM) to interpret user intent while preserving deterministic, reviewable execution against an authoritative database. User queries are translated into structured semantic frames, validated by a rule-based layer, compiled into a typed directed acyclic graph of spatial operations, and executed against a PostGIS database. This bounded design separates language interpretation from deterministic execution, keeping results reproducible and schema-grounded while removing access barriers. The framework is evaluated using a statewide Massachusetts transportation safety database integrating crash records, roadway attributes, and geospatial layers including schools, bus stops, crosswalks, and municipal boundaries. All queries executed successfully; the validation layer corrected errors in 29% of evaluation queries before execution, reflecting the gap between flexible natural language and strict schema-grounded requirements. The results suggest that combining natural language accessibility with deterministic execution is a practical direction for broadening access to transportation safety data, with implications for trustworthy AI in public-sector planning.
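To make the "deterministic execution against PostGIS" step concrete, here is a minimal sketch of how one spatial operation might be rendered as a parameterized query. The table names, column names (`crashes`, `municipalities`, `geom`, `crash_id`, `name`), and the function itself are assumptions for illustration and are not taken from the paper; only the PostGIS functions `ST_DWithin` and `ST_Within` are real. The key idea is that identifiers come from a fixed whitelist and values travel as bound parameters, so the same validated input always yields the same statement.

```python
# Hypothetical whitelist of anchor layers; the layer names mirror those listed
# in the abstract, but the table names themselves are assumed.
ALLOWED_ANCHORS = {"schools", "bus_stops", "crosswalks"}


def crashes_near_sql(anchor_table: str, meters: float, municipality: str):
    """Build a parameterized query: crashes within `meters` of an anchor layer,
    restricted to one municipality. Returns (sql, params)."""
    if anchor_table not in ALLOWED_ANCHORS:
        # Identifiers are never taken from free text, only from the whitelist.
        raise ValueError(f"unknown anchor layer: {anchor_table!r}")
    sql = (
        "SELECT DISTINCT c.crash_id "
        "FROM crashes AS c "
        f"JOIN {anchor_table} AS a "
        # Casting to geography makes ST_DWithin interpret the distance in meters.
        "ON ST_DWithin(c.geom::geography, a.geom::geography, %(meters)s) "
        "JOIN municipalities AS m ON ST_Within(c.geom, m.geom) "
        "WHERE m.name = %(municipality)s "
        # A fixed ordering keeps the result list reproducible across runs.
        "ORDER BY c.crash_id"
    )
    return sql, {"meters": float(meters), "municipality": municipality}


sql, params = crashes_near_sql("schools", 500, "Amherst")
```

The `%(name)s` placeholders follow the named-parameter style used by psycopg; a different driver would need a different placeholder syntax.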


Community

mazhdari (paper author, paper submitter)

We introduce a schema-grounded framework for natural language access to spatial safety analysis. An LLM interprets user queries into a structured semantic frame (entities with roles like primary/support/scope/anchor/filter, spatial constraints, attribute constraints, ranking), which a rule-based validation and repair layer then normalizes against a domain-specific schema. The validated frame compiles into a typed directed acyclic graph of PostGIS operations and executes deterministically. Language interpretation and analytical execution are fully separated, so the model cannot influence results beyond the frame.
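As a rough illustration of the frame-to-graph step described above, the sketch below models a semantic frame and compiles it into a small typed operation graph. Only the five role names come from the description; the class names, operation kinds (`load`, `buffer`, `intersect`), type labels, and the fixed "primary near anchor" plan are hypothetical simplifications, not the authors' implementation. Python's standard `graphlib.TopologicalSorter` supplies an execution order and rejects cycles.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

# Roles named in the description above; everything else here is assumed.
ROLES = {"primary", "support", "scope", "anchor", "filter"}
# Hypothetical typing rule: which input types each operation kind accepts.
EXPECTED_INPUTS = {"load": (), "buffer": ("points",), "intersect": ("points", "polygons")}


@dataclass(frozen=True)
class Entity:
    name: str  # e.g. "crashes", "schools"
    role: str  # one of ROLES


@dataclass
class SemanticFrame:
    entities: list
    spatial: dict = field(default_factory=dict)  # e.g. {"relation": "within", "meters": 500}


@dataclass
class Op:
    op_id: str
    kind: str           # key of EXPECTED_INPUTS
    out_type: str       # declared output type
    inputs: tuple = ()  # ids of upstream operations


def _one(frame: SemanticFrame, role: str) -> Entity:
    matches = [e for e in frame.entities if e.role == role]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one {role!r} entity")
    return matches[0]


def compile_frame(frame: SemanticFrame) -> list:
    """Compile a 'primary within distance of anchor' frame into ordered ops."""
    if any(e.role not in ROLES for e in frame.entities):
        raise ValueError("frame contains an unknown role")
    primary, anchor = _one(frame, "primary"), _one(frame, "anchor")
    ops = [
        Op(f"load_{primary.name}", "load", "points"),
        Op(f"load_{anchor.name}", "load", "points"),
        Op("buffer", "buffer", "polygons", (f"load_{anchor.name}",)),
        Op("intersect", "intersect", "points", (f"load_{primary.name}", "buffer")),
    ]
    by_id = {op.op_id: op for op in ops}
    for op in ops:  # type check: each op's inputs must match what its kind expects
        actual = tuple(by_id[i].out_type for i in op.inputs)
        if actual != EXPECTED_INPUTS[op.kind]:
            raise TypeError(f"{op.op_id}: got inputs {actual}")
    # Map each node to its predecessors; static_order() raises CycleError on a cycle.
    graph = {op.op_id: set(op.inputs) for op in ops}
    return [by_id[op_id] for op_id in TopologicalSorter(graph).static_order()]


frame = SemanticFrame(
    entities=[Entity("crashes", "primary"), Entity("schools", "anchor")],
    spatial={"relation": "within", "meters": 500},
)
plan = compile_frame(frame)
```

Because the plan is derived only from the validated frame, the language model has no channel into the result other than the frame's contents, which is the separation the description emphasizes.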
The framework was evaluated on a statewide Massachusetts transportation safety database (127K crash records, 504K road segments, plus schools, bus stops, crosswalks, and municipal boundaries) across 80 queries spanning entity retrieval, spatial scoping, attribute and temporal filtering, spatial relationships, and ranking. All 80 queries executed successfully; the validation layer corrected 29% of them before execution, mostly through normalization (e.g., "cyclists" → "Collision with cyclist", "1km" → 1000m). The design suggests a pattern for bounded NL interfaces in domains where reproducibility, auditability, and schema conformance matter. The framework supports Gemini 2.5 Flash and GPT-4o as configurable interpretation backends.
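The two normalization examples above can be sketched as a small rule-based repair pass. This is an assumption-laden illustration: the frame is represented as a plain dict, the keys `crash_type`, `distance`, and `distance_m` are hypothetical, and the synonym table contains only the one mapping quoted in the comment plus its singular form. The function returns both the repaired frame and a log of what changed, which is one way such a layer could stay auditable.

```python
import re

# Only the "cyclists" mapping comes from the text above; the singular is assumed.
CRASH_TYPE_SYNONYMS = {
    "cyclists": "Collision with cyclist",
    "cyclist": "Collision with cyclist",
}
METERS_PER_UNIT = {"m": 1.0, "km": 1000.0}
_DISTANCE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(km|m)\s*$", re.IGNORECASE)


def normalize_distance(text: str) -> float:
    """Convert a distance string such as '1km' or '250 m' to meters."""
    match = _DISTANCE.match(text)
    if match is None:
        raise ValueError(f"unrecognized distance: {text!r}")
    value, unit = match.groups()
    return float(value) * METERS_PER_UNIT[unit.lower()]


def repair_frame(frame: dict):
    """Return (repaired_frame, log) without mutating the input frame."""
    repaired = dict(frame)
    log = []

    crash_type = frame.get("crash_type")
    if isinstance(crash_type, str):
        canonical = CRASH_TYPE_SYNONYMS.get(crash_type.strip().lower(), crash_type)
        if canonical != crash_type:
            repaired["crash_type"] = canonical
            log.append(f"crash_type: {crash_type!r} -> {canonical!r}")

    distance = frame.get("distance")
    if isinstance(distance, str):
        meters = normalize_distance(distance)
        del repaired["distance"]
        repaired["distance_m"] = meters  # store a number with an explicit unit
        log.append(f"distance: {distance!r} -> {meters} m")

    return repaired, log


repaired, log = repair_frame({"crash_type": "cyclists", "distance": "1km"})
```

A fuller validator would also reject values that match nothing in the schema rather than passing them through unchanged; that check is omitted here for brevity.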


