Deep learning foundation models have transformed fields like protein structure prediction, drug discovery, computer vision, and natural language processing. They rely on pretraining to learn intricate patterns from diverse data, followed by fine-tuning to excel at specific tasks where data is limited. The Earth system, which comprises interconnected subsystems such as the atmosphere, oceans, land, and ice, must be modeled accurately in a changing climate. Foundation models have the potential to reshape the modeling of these subsystems and of the Earth as a whole.
The atmosphere is particularly data-rich, which makes it a natural candidate for pretraining a foundation model. Classical numerical weather prediction (NWP) models are computationally costly and make inefficient use of large datasets. Recent deep learning methods are cheaper and more flexible, and they show promise on specific prediction tasks where data is abundant. However, they struggle when data is scarce or heterogeneous, and they lack robustness when predicting extremes. By learning generalizable representations from diverse data, foundation models can potentially address these challenges, as they have in other domains.
Researchers from Microsoft Research AI for Science, Microsoft Corporation, JKU Linz, University of Cambridge, Poly Corporation, and the University of Amsterdam introduce Aurora, a foundation model for the atmosphere. Aurora can handle a variety of atmospheric forecasting tasks, including those involving limited data, heterogeneous variables, and extreme events. It produces operational forecasts for global air pollution and high-resolution weather patterns, outperforming state-of-the-art simulation tools at a much lower computational cost.
Aurora is a flexible 3D foundation model of the atmosphere. It ingests and predicts surface-level variables as well as meteorological variables at different pressure levels, across a range of resolutions and fidelities. The model has three parts: an encoder that converts heterogeneous inputs into a standard representation, a Vision Transformer processor that evolves that representation forward in time, and a decoder that translates it back into specific predictions. Aurora is pretrained on diverse datasets, including ERA5, CMCC, IFS-HR, HRES Forecasts, GFS Analysis, and GFS Forecasts, with the objective of minimizing the mean absolute error (MAE) of its next-time-step prediction.
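To make the encoder, processor, decoder pipeline and the mean-absolute-error objective concrete, here is a minimal toy sketch in plain Python. It is not Aurora's implementation: the function names, the fixed `LATENT_ORDER` layout, the zero-filling of missing variables, and the "persistence" processor (which simply copies its input) are all assumptions made for illustration, standing in for learned neural network components.

```python
from typing import Dict, List

# A toy "atmospheric state": variable name -> flattened grid values.
State = Dict[str, List[float]]

# Hypothetical fixed variable order for the shared internal layout.
LATENT_ORDER = ["2t", "10u", "10v"]


def encode(state: State) -> List[List[float]]:
    """Map whichever variables are present onto one fixed-order layout.

    Missing variables become zero-filled rows. This is a toy stand-in
    for a learned encoder, not the real one.
    """
    n = len(next(iter(state.values())))
    return [list(state.get(name, [0.0] * n)) for name in LATENT_ORDER]


def process(latent: List[List[float]]) -> List[List[float]]:
    """Placeholder for the Transformer processor: persistence (no change)."""
    return [row[:] for row in latent]


def decode(latent: List[List[float]], wanted: List[str]) -> State:
    """Read the requested variables back out of the fixed-order layout."""
    return {name: latent[LATENT_ORDER.index(name)] for name in wanted}


def forecast_step(state: State, wanted: List[str]) -> State:
    """One next-time-step prediction: encode, process, decode."""
    return decode(process(encode(state)), wanted)


def mean_absolute_error(pred: State, target: State) -> float:
    """Average |prediction - target| over every variable and grid point."""
    total, count = 0.0, 0
    for name, values in target.items():
        for p, t in zip(pred[name], values):
            total += abs(p - t)
            count += 1
    return total / count


# Made-up values, for illustration only.
current = {"2t": [280.0, 281.0], "10u": [3.0, -1.0]}
target = {"2t": [281.0, 281.0], "10u": [2.0, -1.0]}
prediction = forecast_step(current, ["2t", "10u"])
loss = mean_absolute_error(prediction, target)
```

In a real model the three stages would be trained jointly so that this loss decreases. Here the placeholder processor simply repeats the current state, so the loss measures how much the made-up target differs from it.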
For air pollution forecasting, Aurora is compared against the Copernicus Atmosphere Monitoring Service (CAMS). Aurora is competitive with CAMS, coming within 20% of its RMSE on 95% of targets and matching or surpassing it on 74% of targets. At a three-day lead time, Aurora matches or outperforms CAMS on 86% of variables. Aurora underperforms CAMS on ozone in the upper atmosphere and on short-term predictions in the lower atmosphere, where anthropogenic factors play a role. In a case study of a severe sandstorm in Iraq on June 13, 2022, Aurora successfully predicted the event a day in advance, demonstrating its effectiveness in extreme weather prediction.
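The two headline comparisons, "within 20% RMSE" and "matches or outperforms", can be read as simple tallies over per-target RMSE values. The sketch below shows one plausible way to compute them. It assumes that "within 20%" means the model's RMSE is at most 20% higher than the baseline's; the function names, the `(variable, lead time)` target keys, and every number in the example are hypothetical and are not results from the paper.

```python
import math
from typing import Dict, Sequence, Tuple

Target = Tuple[str, int]  # hypothetical key: (variable name, lead time in hours)


def rmse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


def scorecard(
    model_rmse: Dict[Target, float],
    baseline_rmse: Dict[Target, float],
    tolerance: float = 0.20,
) -> Tuple[float, float]:
    """Return (fraction of targets within tolerance, fraction matched or beaten).

    The relative difference is (model - baseline) / baseline, so negative
    values mean the model has the lower error on that target.
    """
    competitive = 0
    match_or_better = 0
    for target, base in baseline_rmse.items():
        relative = (model_rmse[target] - base) / base
        if relative <= tolerance:
            competitive += 1
        if relative <= 0.0:
            match_or_better += 1
    n = len(baseline_rmse)
    return competitive / n, match_or_better / n


# Made-up RMSE values, for illustration only.
baseline = {("PM10", 24): 10.0, ("O3", 24): 10.0, ("NO2", 24): 10.0, ("CO", 24): 10.0}
model = {("PM10", 24): 9.0, ("O3", 24): 11.0, ("NO2", 24): 15.0, ("CO", 24): 10.0}

within_20, matched = scorecard(model, baseline)
```

Under this reading, every target that is matched or beaten also counts as competitive, which is why the first fraction can never be smaller than the second.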
Aurora represents a major advancement in environmental prediction, using an AI foundation model to extract insights from extensive Earth system data. It improves predictive accuracy, resolution, and adaptability, demonstrating AI’s potential to improve operational weather forecasting and related fields. Continued investment in AI research is crucial for tackling complex Earth system modeling challenges. However, Aurora currently generates only deterministic forecasts. Future improvements include developing probabilistic forecasts, incorporating local high-resolution datasets, optimizing compute infrastructure, and strengthening model robustness and verification, with the aim of potentially replacing traditional NWP systems.
Check out the Paper. All credit for this research goes to the researchers of this project.