New Paper: “Transparency and Trust in Collaborative Mapping: Concerns and Dilemmas in AI-Assisted Road Integration within OpenStreetMap”

Automatically generated map data has become far more common, reducing the amount of direct human involvement. AI-assisted mapping, where human validation refines machine-generated output, is increasingly used to update crowdsourced databases, such as OpenStreetMap (OSM). However, OSM contributors have expressed mixed sentiments toward this development, with ongoing debates about the trustworthiness, authenticity, and long-term implications of integrating AI-derived content into the platform.

The use of AI algorithms to generate geospatial data has spread beyond major technology corporations such as Google, Meta, and Microsoft to include smaller academic and private research laboratories. This trend raises new challenges, as AI-generated data may be imported into OSM either incrementally or in medium-sized batches without explicit attribution or declaration of origin. Such practices complicate provenance tracking and provoke questions about whether it remains possible to distinguish between AI- and human-generated features once they are integrated into the database.

The primary contribution of this paper is to expose the vulnerabilities and risks associated with the current approach to integrating AI-assisted road (AI-aR) data into OSM. The research team examines how these integration practices affect the reliability of crowdsourced mapping platforms and the integrity of their datasets.

To address these concerns, the researchers conducted a twofold investigation. First, they analyzed community discussions, identifying contributors' emotional concerns and the evolving role of human mappers. Second, they assessed whether AI-assisted roads (AI-aR) can be reliably detected within OSM, using machine learning models as diagnostic tools to reveal transparency limitations and advocate for improved tagging practices.

Community debates highlight persistent tensions surrounding inadequate tagging practices, loss of local context, and the growing influence of corporate actors. Experimental results indicate that machine learning models perform best on benchmark datasets containing purely human-generated or AI-generated data, yet their accuracy declines on mixed, real-world edits. Incorporating temporal features improves performance, suggesting that time-dependent patterns play an important role in identifying AI-generated content. Nevertheless, these patterns remain unstable and subject to rapid change, limiting their long-term reliability.
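The detection setup described above can be illustrated with a minimal sketch: a classifier trained on per-way features, with one time-dependent feature added alongside geometric ones. Everything here is a hypothetical toy example — the feature names (`node_count`, `mean_segment_length`, `inter_edit_gap`), the synthetic data, and the assumption that AI-assisted batch uploads show short, regular gaps between edits are illustrative assumptions, not the paper's actual features or results.

```python
# Hypothetical sketch, NOT the paper's method: synthetic data and invented
# features that illustrate adding a temporal signal to an edit classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000

# Geometric features any OSM way could have (synthetic values here).
node_count = rng.integers(2, 60, n)
mean_segment_length = rng.uniform(5.0, 120.0, n)  # metres, toy values

# Labels: 1 = AI-assisted edit, 0 = human edit (randomly assigned toy data).
labels = rng.integers(0, 2, n)

# Temporal feature: seconds between consecutive edits in a changeset.
# Assumption for this toy data only: AI-assisted batch uploads arrive in
# fast, regular bursts, while human edits are slower and more irregular.
inter_edit_gap = np.where(
    labels == 1,
    rng.normal(2.0, 0.5, n),    # fast, regular
    rng.normal(45.0, 20.0, n),  # slow, irregular
)

X = np.column_stack([node_count, mean_segment_length, inter_edit_gap])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"toy accuracy: {acc:.2f}")
```

Because the temporal feature cleanly separates the two synthetic classes, the toy classifier scores highly here; the paper's point is precisely that on real, mixed human/AI edits such separation degrades, and that temporal patterns shift over time.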

The research findings underscore the danger of data quality erosion driven by mechanisms such as validation-loop bias and the accountability sink effect. As human and AI mapping efforts become increasingly intertwined, distinguishing between AI-assisted and human-generated data will become progressively more difficult. Current models may be most effective when analyzing newly added AI content with minimal human modification, but their performance degrades as collaborative editing blurs data provenance over time.

To contribute to an open dialogue, Francis Andorful presented these findings at the State of the Map Europe conference, where the work served as an opportunity to communicate research insights back to the OSM community and to stimulate further discussion.

This research was supported by the Robert and Christine Danziger Foundation and partly funded by the Klaus Tschira Foundation and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES).

Reference: Andorful, F., Herfort, B., Melanda, E. A., Antonio, N. D., Zipf, A., & Camboim, S. P. (2025). Transparency and Trust in Collaborative Mapping: Concerns and Dilemmas in AI-Assisted Road Integration within OpenStreetMap. Annals of the American Association of Geographers, 1–22. https://doi.org/10.1080/24694452.2025.2589286