Participatory mapping lets local communities contribute spatial knowledge that is often missing from official datasets. In practice, participants draw on printed basemaps during fieldwork. The annotated maps are then scanned or photographed, producing sketch maps that may record hazards, infrastructure, places of interest, or other locally important features.
Digitizing this information remains difficult. Hand-drawn annotations vary in style, color, line thickness, and placement. Scanning or photographing the maps introduces noise, shadows, lighting changes, distortions, and clutter. The basemap itself can be visually complex, especially when it uses satellite imagery. Existing methods often rely on pixel-level segmentation or simple image comparison, and both tend to struggle with this real-world variability.
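To make the weakness of simple image comparison concrete, here is a minimal sketch of a pixel-differencing baseline. It is not taken from the paper: the function names `diff_mask` and `mask_bbox`, the grayscale list-of-lists image format, and the threshold of 40 are all assumptions chosen for illustration.

```python
def diff_mask(clean, annotated, threshold=40):
    """Mark pixels whose grayscale value differs by more than `threshold`.

    Both images are lists of rows of 0-255 integers and are assumed to be
    already aligned, which real scans and photos usually are not.
    """
    return [
        [1 if abs(a - c) > threshold else 0 for c, a in zip(clean_row, annot_row)]
        for clean_row, annot_row in zip(clean, annotated)
    ]


def mask_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of all marked pixels, or None."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# A flat, light basemap with a short dark pen stroke added by a participant.
clean = [[200] * 6 for _ in range(4)]
annotated = [row[:] for row in clean]
annotated[1][2] = annotated[1][3] = annotated[2][3] = 30

print(mask_bbox(diff_mask(clean, annotated)))  # (2, 1, 3, 2)

# The same map photographed under a uniform shadow (every pixel 60 darker).
shadowed = [[v - 60 for v in row] for row in annotated]
print(mask_bbox(diff_mask(clean, shadowed)))   # (0, 0, 5, 3): the whole image
```

Under ideal conditions the difference isolates the stroke, but a single lighting change is enough to flag every pixel. That fragility is the kind of real-world variability the paragraph above describes.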
The study by Langer et al. (2026) treats sketch-map annotation extraction as an object-level detection task. Rather than classifying each pixel as basemap or annotation, the method detects hand-drawn markings as objects using YOLOv9e-based detectors. The study compares single-image detection, where the model sees only the annotated map, with dual-image detection, where it sees both the clean basemap and the annotated map. In the dual-image setup, a Siamese YOLOv9e model compares features from both images to highlight annotation-specific changes and focus on what participants added.
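The Siamese idea can be illustrated with a toy example: the same encoder is applied to both images, and the two feature maps are then compared. The sketch below is not the paper's architecture. The average-pooling "encoder", the absolute-difference comparison, and the names `encode` and `change_features` are assumptions standing in for the shared YOLOv9e backbone and whatever fusion the authors actually use.

```python
def encode(image, cell=2):
    """Toy shared encoder: average-pool non-overlapping cell x cell blocks.

    Stands in for a learned backbone. The key property is that the SAME
    function (the same weights) processes both inputs.
    """
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx] for dy in range(cell) for dx in range(cell))
            / (cell * cell)
            for x in range(0, w - cell + 1, cell)
        ]
        for y in range(0, h - cell + 1, cell)
    ]


def change_features(clean, annotated, cell=2):
    """Compare the two branches; absolute difference is an assumed fusion."""
    f_clean = encode(clean, cell)
    f_annot = encode(annotated, cell)
    return [
        [abs(a - c) for c, a in zip(clean_row, annot_row)]
        for clean_row, annot_row in zip(f_clean, f_annot)
    ]


# A textured basemap (checkerboard) with one 2x2 region inked over.
clean = [[50 if (x + y) % 2 == 0 else 150 for x in range(4)] for y in range(4)]
annotated = [row[:] for row in clean]
for y in range(0, 2):
    for x in range(2, 4):
        annotated[y][x] = 0

print(change_features(clean, annotated))  # [[0.0, 100.0], [0.0, 0.0]]
```

Because both branches see the same background texture, it cancels out in the comparison and only the inked region produces a response. This is the intuition behind using the clean basemap as a reference, although a trained model learns far richer features than block averages.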
The evaluation uses about 2,300 real printed sketch maps and around 18,000 synthetic sketch-map samples generated over OpenStreetMap and Esri World Imagery basemaps. The real maps include both vector-rendered and satellite basemaps, cover different spatial scales, and show a range of annotation styles. The synthetic samples mimic hand-drawn markings and realistic photographic artifacts to support training when manually labeled data is limited.
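As a rough illustration of how a synthetic sample might be produced, the sketch below draws one jittery stroke on a copy of a basemap, adds pixel noise as a stand-in for photographic artifacts, and returns the bounding-box label alongside the image. The paper's actual generator is not described here, so the function `make_synthetic_sample`, its parameters, and the stroke model are hypothetical.

```python
import random


def make_synthetic_sample(basemap, rng, ink=20, noise=8):
    """Return (image, bbox) with one synthetic stroke drawn on a basemap copy.

    `basemap` is a grayscale list of rows, at least 8 pixels wide and
    3 pixels tall. `bbox` is (x_min, y_min, x_max, y_max) in pixel
    coordinates. All parameters are illustrative assumptions.
    """
    h, w = len(basemap), len(basemap[0])
    image = [row[:] for row in basemap]

    # Draw a roughly horizontal stroke with 1-pixel vertical jitter per step,
    # a crude imitation of an unsteady hand.
    x0 = rng.randrange(0, w // 2)
    length = rng.randrange(3, w // 2)
    y = rng.randrange(1, h - 1)
    xs, ys = [], []
    for x in range(x0, x0 + length):
        y = min(h - 1, max(0, y + rng.choice((-1, 0, 1))))
        image[y][x] = ink
        xs.append(x)
        ys.append(y)
    bbox = (min(xs), min(ys), max(xs), max(ys))

    # Add bounded random noise to every pixel to mimic capture artifacts.
    noisy = [
        [min(255, max(0, v + rng.randint(-noise, noise))) for v in row]
        for row in image
    ]
    return noisy, bbox


basemap = [[220] * 16 for _ in range(12)]
image, bbox = make_synthetic_sample(basemap, random.Random(0))
print(bbox)  # the label comes for free, with no manual annotation
```

The appeal of this approach is that the label is known by construction, so large training sets can be produced without manual labeling. A realistic generator would also vary color and thickness and simulate shadows and distortion.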
The results show strong performance across basemap types. A single-image YOLOv9e model with synthetic pretraining reaches mAP@50 scores of 91.5% on hand-drawn maps over satellite imagery and 97.3% on hand-drawn maps over OSM basemaps. Pairing the clean and annotated basemaps in the Siamese YOLOv9e model raises these scores to 97.4% on satellite imagery and 98.1% on OSM basemaps, gains of about 5.9 and 0.8 percentage points respectively. The improvement is largest for satellite imagery, where complex backgrounds make it harder to separate annotations from the underlying map content.
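For readers unfamiliar with the metric, the "50" in mAP@50 refers to an intersection-over-union (IoU) threshold of 0.5: a predicted box counts as a correct detection only if it overlaps a ground-truth box by at least that much. The sketch below shows just this matching criterion, not the full average-precision computation. The helper names and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box if IoU >= threshold."""
    return iou(pred, truth) >= threshold


truth = (0, 0, 10, 10)
print(is_true_positive((2, 0, 12, 10), truth))  # True  (IoU = 80 / 120)
print(is_true_positive((6, 0, 16, 10), truth))  # False (IoU = 40 / 160)
```

Average precision then summarizes precision across recall levels for detections ranked by confidence, and mAP averages that value over classes.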
Synthetic pretraining also improves results on real hand-drawn data, which suggests that simulated annotations are useful when few manually labeled examples are available. The authors note, however, that the detector is intended to support semi-automatic sketch map digitization rather than replace human review. Manual correction may still be needed where false positives or missed annotations would misrepresent participant input, especially for thin, overlapping, or poorly captured markings.
Overall, the paper shows that object-level detection, especially when combined with paired clean and annotated basemaps for change detection, can make the extraction of hand-drawn information from participatory sketch maps more robust and better suited to downstream GIS workflows.
Reference: Langer, C., Thomé, C., Fulman, N., Knoblauch, S., Zipf, A., & Grinblat, Y. (2026, June 10). Object-Level Detection of Hand-Drawn Annotations in Participatory Sketch Maps Using Paired Clean and Annotated Basemaps. AGILE GIScience Series, 7, 32.



