Toward a Scientific Discovery Engine for Weather and Climate Data: A Visual Analytics Workbench for Embedding-Based Exploration
d041308
DOI: 10.5065/12ZJ-ZZ25
Weather and climate science is producing increasingly large, high-dimensional datasets from numerical simulations, Earth system models, and AI-based weather and climate models. Embedding-based representations can make these data searchable through similarity search and analog retrieval, but nearest neighbors in latent space are not automatically scientifically meaningful. Researchers need tools to inspect how embeddings organize meteorological data, compare representation models, develop retrieval strategies, and verify results against physical evidence. We present an open-source visual analytics workbench for inspectable, configurable, and scalable embedding-based search over weather and climate data. The system links embedding experiments to source data, metadata, spatial context, model configurations, and retrieval parameters, allowing users to explore latent spaces, construct global or localized queries, and inspect retrieved analogs through meteorological views. We demonstrate the workbench through tropical-cyclone retrieval using ERA5-derived embeddings and IBTrACS metadata, and evaluate its out-of-core retrieval backend to show that large embedding collections can be searched beyond in-memory limits on commodity workstation hardware.
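As a rough illustration of what searching an embedding collection "beyond in-memory limits" can look like, the sketch below runs an exact cosine-similarity top-k scan over a memory-mapped file, reading one chunk of rows at a time. It is not the workbench's retrieval backend, whose design is not described here. The function names, the flat float32 file layout, and the chunk size are all assumptions made for this example, and a brute-force scan stands in for whatever index the system actually uses.

```python
import os
import tempfile

import numpy as np


def topk_cosine_out_of_core(path, n_rows, dim, query, k=5, chunk_rows=1024):
    """Exact top-k cosine search over a flat float32 file (hypothetical layout).

    Only `chunk_rows` embeddings are materialized in memory at any time.
    Returns (row_indices, scores), sorted by descending similarity.
    """
    emb = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, dim))
    q = np.asarray(query, dtype=np.float32)
    q = q / np.linalg.norm(q)

    best_idx = np.empty(0, dtype=np.int64)
    best_score = np.empty(0, dtype=np.float32)

    for start in range(0, n_rows, chunk_rows):
        # Reading this slice pulls only one chunk of rows from disk.
        block = np.asarray(emb[start:start + chunk_rows])
        norms = np.linalg.norm(block, axis=1)
        norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
        scores = (block @ q) / norms
        idx = np.arange(start, start + block.shape[0], dtype=np.int64)

        # Merge this chunk's candidates with the running best and keep k.
        best_idx = np.concatenate([best_idx, idx])
        best_score = np.concatenate([best_score, scores])
        if best_score.size > k:
            keep = np.argpartition(-best_score, k - 1)[:k]
            best_idx, best_score = best_idx[keep], best_score[keep]

    order = np.argsort(-best_score)
    return best_idx[order], best_score[order]


def demo():
    """Write synthetic embeddings to disk and query with one of the stored rows."""
    rng = np.random.default_rng(0)
    n_rows, dim = 5000, 16
    data = rng.standard_normal((n_rows, dim)).astype(np.float32)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "embeddings.f32")
        data.tofile(path)
        idx, scores = topk_cosine_out_of_core(path, n_rows, dim, data[1234], k=3)
    return idx, scores


if __name__ == "__main__":
    print(demo())
```

In this sketch, peak memory depends on `chunk_rows` and `k` rather than on the size of the collection, which is the property an out-of-core backend relies on. A retrieved neighbor is still only a candidate analog and has to be checked against the source fields and metadata.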
Keywords: Tropical Cyclones
This work is licensed under a Creative Commons Attribution 4.0 International License.