

Identifying Outliers in the GLOBE Surface Temperature Protocol

Student(s): Vin Bhat, Saanvi Shah
Grade Level: Secondary School (grades 9-12, ages 14-18)
Educator(s): Andrew Clark (IGES), Kevin Czajkowski (UToledo), Brianna Lind (IGES), Rusanne Low (IGES), Ian Maywar (CUNY), Sara Mierzwiak (UToledo), Pramila Paudyal (UToledo)
Contributors:
Report Type(s): International Virtual Science Symposium Report, Mission Earth Report
Protocols: Surface Temperature
Presentation Video: View Video
Language(s): English
Date Submitted: 2025-07-17
The GLOBE database is a vital resource for scientists researching Earth’s changing climate, so it is crucial that the data provided to researchers is accurate and filterable. Our research aims to identify outliers in the GLOBE Surface Temperature Protocol database using several data science techniques. We processed a dataset of 716,031 entries and 64 features, removing fully null columns and profiling the remaining values for completeness. We found 731 location outliers, either impossible coordinates or entries located at (0, 0) latitude and longitude, often linked to bulk submissions by single organizations such as Iksal ‘c’ Primary School and the University of Toledo. We flagged 1,024 entries with surface temperatures exceeding 60°C, likely due to Fahrenheit-Celsius entry confusion. We created interpretable Boolean flag columns to identify each outlier type; 60,031 rows (~8.4%) had at least one flag. After removing these rows, we applied unsupervised clustering techniques, such as K-Means and Gaussian Mixture Models, which flagged an additional 7,309 anomalies. Our final cleaned dataset improves interpretability while preserving the majority of valid records. These efforts culminated in a framework that provides real-time, interpretable feedback to citizen scientists and ensures higher data integrity for downstream research. Our proposed next steps include expanding condition-based error thresholds, integrating the pipeline with the GLOBE platform, and publishing insights to support environmental monitoring and education.
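The rule-based flagging described above might look roughly like the following sketch. It is not the authors' pipeline: the field names (`latitude`, `longitude`, `surface_temp_c`), the function names, and the choice to leave missing values unflagged are all assumptions made for illustration. Only the 60°C threshold and the two location rules come from the abstract.

```python
# Illustrative sketch only; field names are hypothetical, not GLOBE's schema.
MAX_SURFACE_TEMP_C = 60.0  # threshold stated in the abstract


def fahrenheit_to_celsius(temp_f):
    """Convert Fahrenheit to Celsius, to show how a unit mix-up inflates values."""
    return (temp_f - 32.0) * 5.0 / 9.0


def flag_record(record):
    """Return one Boolean flag per outlier type for a single observation."""
    lat = record.get("latitude")
    lon = record.get("longitude")
    temp = record.get("surface_temp_c")
    has_coords = lat is not None and lon is not None

    flags = {
        # Coordinates outside the valid latitude/longitude ranges.
        "impossible_coords": has_coords
        and not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0),
        # Exactly (0, 0), a common placeholder for unset locations.
        "zero_zero_coords": has_coords and lat == 0 and lon == 0,
        # Temperature above the plausibility threshold.
        "temp_above_60c": temp is not None and temp > MAX_SURFACE_TEMP_C,
    }
    # Summary column: True if any individual flag is set.
    flags["any_flag"] = any(flags.values())
    return flags


def remove_flagged(records):
    """Keep only records with no flags, as a step before any clustering."""
    return [r for r in records if not flag_record(r)["any_flag"]]
```

For example, a value of 95 entered in the Celsius field would be flagged, while the same reading converted with `fahrenheit_to_celsius(95)` gives 35°C and would pass. Keeping each rule as its own named flag, rather than a single pass/fail result, is what lets a reviewer see why a row was excluded.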


