Student Research - Mission Earth
MOSAIC: Metadata Optimization and Statistical Anomaly Detection using Unsupervised Clustering for Geospatial Temperature Data
Organization(s): United States of America Citizen Science
Country: United States of America
Student(s): Neel Kansara, Darryl Tang, Jordan Tran
Grade Level: Secondary School (grades 9-12, ages 14-18)
Educator(s): Andrew Clark, Dr. Kevin Czajkowski, Brianna Lind, Dr. Russanne Low, Ian Maywar, Sara Mierzwiak, Pramila Paudyal
Report Type(s): Mission Earth Report
Protocols: Surface Temperature
Language(s): English
Date Submitted: 07/26/2025
This analysis produces a robust, interpretable flagging system for identifying anomalies in surface temperature data collected as part of the GLOBE mission. The system combines metadata augmentation with unsupervised clustering algorithms and statistical tests. Individual clustering models produced varied results, which justifies an ensemble score that combines multiple unsupervised models. The anomaly flags are only partially correlated with one another, showing that the flagging system captures multiple dimensions of abnormality. The metadata augmentation and flagging approach added the following columns to the GLOBE dataset:
➢ Metadata: Country, Country Code, Continent, Year, Month, Biome, Season
➢ Flags: latlon_range, high_elev_hot_temp, duplicated_coords, zscore_outlier, lof, mahalanobis, ensemble_score
This flagging system enhances the trustworthiness and usability of the data, providing future researchers with a scalable strategy to detect anomalies in Earth science datasets.
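To make the flag columns concrete, the following is a minimal sketch of how several of them could be computed. It is not the authors' implementation. The sample records, the thresholds (`HIGH_ELEV_M`, `HOT_TEMP_C`, `Z_THRESHOLD`, `MAHAL_THRESHOLD`), and the helper functions are all hypothetical. The `lof` flag is omitted for brevity, and the ensemble score is simplified here to the share of individual flags raised, which may differ from the scoring used in the report.

```python
import math
from collections import Counter
from statistics import mean, pstdev

# Hypothetical records: (latitude, longitude, elevation_m, surface_temp_c)
records = [
    (41.6, -83.6, 190.0, 22.0),
    (41.7, -83.5, 185.0, 24.0),
    (41.5, -83.7, 200.0, 21.0),
    (41.6, -83.6, 190.0, 23.0),   # same coordinates as the first record
    (41.8, -83.4, 210.0, 25.0),
    (41.4, -83.8, 195.0, 20.0),
    (95.0, -83.6, 180.0, 22.5),   # latitude outside [-90, 90]
    (27.9, 86.9, 5200.0, 38.0),   # very hot for a high-elevation site
]

# Assumed thresholds, chosen only for illustration
HIGH_ELEV_M = 3000.0
HOT_TEMP_C = 30.0
Z_THRESHOLD = 2.0
MAHAL_THRESHOLD = 2.0


def zscores(values):
    """Population z-score of each value."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def mahalanobis_2d(xs, ys):
    """Mahalanobis distance of each (x, y) pair from the sample centroid."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # d^2 = v^T * inverse(covariance) * v, written out for the 2x2 case
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances


coord_counts = Counter((lat, lon) for lat, lon, _, _ in records)
temp_z = zscores([r[3] for r in records])
mahal = mahalanobis_2d([r[2] for r in records], [r[3] for r in records])

flags = []
for (lat, lon, elev, temp), z, d in zip(records, temp_z, mahal):
    row = {
        "latlon_range": abs(lat) > 90 or abs(lon) > 180,
        "high_elev_hot_temp": elev > HIGH_ELEV_M and temp > HOT_TEMP_C,
        "duplicated_coords": coord_counts[(lat, lon)] > 1,
        "zscore_outlier": abs(z) > Z_THRESHOLD,
        "mahalanobis": d > MAHAL_THRESHOLD,
    }
    # Simplified ensemble: share of the individual flags that fired
    row["ensemble_score"] = sum(row.values()) / len(row)
    flags.append(row)

for rec, row in zip(records, flags):
    print(rec, row["ensemble_score"])
```

Keeping each flag as its own column, rather than collapsing everything into one verdict, lets a later user decide which kinds of abnormality matter for their analysis. For the `lof` flag, scikit-learn's `LocalOutlierFactor` would be a common choice.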