Introduction
Regional or local analyses of bio-data (e.g., species diversity) often involve geo-data as explanatory variables (e.g., climate). Species-diversity data are typically collected at fine spatial scales, but in many circumstances, in situ measurements of climate data are either unavailable or sparse, because collecting them can be impractical or prohibitively expensive. Remote-sensing data (e.g., observations collected by satellites) provide useful alternatives, but such data sets can require specialised technical expertise to process and are usually available only at relatively coarse spatial resolutions, often coarser than the resolution of the species-diversity data.
Nevertheless, when fine-resolution climate data are not available, a common workflow in ecological studies is to downscale coarse-resolution data produced by numerical models, including general circulation models and climate reanalyses (e.g., ERA5; Hersbach et al., 2020). Downscaling explanatory variables from coarse resolution to fine resolution produces local details and spatial patterns that may not actually be present, and hence the downscaled values come with uncertainty. What is certain is that the fine-resolution field must aggregate to the known coarse-resolution data. Quantifying this uncertainty is essential for drawing scientifically valid conclusions. Hence, we apply techniques from spatial statistics (Cressie, 1993; Ma et al., 2019; Zammit-Mangion et al., 2015) to statistically downscale data and provide valid uncertainty quantification; this is the first stage of a two-stage protocol for inference on fine-scale eco-processes from coarse-scale geo-data.
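To make the aggregation constraint concrete, the following is a minimal sketch in Python, not the method used in this paper. It assumes that aggregation is a simple block average over a regular grid, and the grid size, the block shape of 5 by 11 fine cells (chosen only to echo the approximate ratio between a $1\times1\thinspace{\rm km}^2$ grid and the ERA5 grid), and the names `aggregate`, `fine_a`, and `fine_b` are all hypothetical. It constructs two different fine-resolution fields that aggregate to the same coarse-resolution field, which is why the coarse data alone cannot identify the fine-scale pattern.

```python
import numpy as np


def aggregate(fine, block):
    """Block-average a 2-D fine-resolution field into coarse cells.

    Assumption: aggregation is an unweighted mean over rectangular blocks.
    """
    rows, cols = fine.shape
    br, bc = block
    if rows % br or cols % bc:
        raise ValueError("fine grid must divide evenly into blocks")
    # Split each axis into (coarse index, within-block index), then average
    # over the two within-block axes.
    return fine.reshape(rows // br, br, cols // bc, bc).mean(axis=(1, 3))


# Hypothetical set-up: each coarse cell covers 5 x 11 fine cells.
block = (5, 11)
rng = np.random.default_rng(0)

# A first candidate fine-resolution field (2 x 2 coarse cells in total).
fine_a = rng.normal(size=(10, 22))

# A second candidate: add fine-scale noise whose block means are removed,
# so that the coarse-resolution averages are unchanged.
noise = rng.normal(size=fine_a.shape)
noise_block_means = np.kron(aggregate(noise, block), np.ones(block))
fine_b = fine_a + (noise - noise_block_means)

coarse_a = aggregate(fine_a, block)
coarse_b = aggregate(fine_b, block)

# The two fine fields differ, yet they agree after aggregation.
same_coarse = np.allclose(coarse_a, coarse_b)
same_fine = np.allclose(fine_a, fine_b)
print("coarse fields agree:", same_coarse)
print("fine fields agree:", same_fine)
```

Under these assumptions, any fine-scale perturbation with zero mean in every coarse cell leaves the coarse data unchanged, so a downscaled field should be reported together with a measure of its uncertainty.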
Figure 1 illustrates various data sources commonly used for downscaling, and Figure 2 demonstrates the statistical challenges in downscaling coarse-resolution data. Figure 3 presents a real-world example, in which moss data were collected at a fine ($1\times1\thinspace{\rm km}^2$) resolution during two expeditions to Bunger Hills in East Antarctica (Gore et al., 2023). To predict moss presence from climate, data from the coarser-resolution (approximately $5\times11\thinspace{\rm km}^2$) ERA5 climate reanalysis product are used, and these data are then statistically downscaled.
Figure 1: Data sources commonly used for downscaling: (a) A General Circulation Model (GCM) providing boundary conditions for a Regional Climate Model (RCM) (Image credit: Ambrizzi et al. (2019)); (b) A climate reanalysis product that uses assimilation and numerical models (e.g., ECMWF) together, for example, the ERA5 reanalysis data (Image credit: ECMWF). In both cases, the resolution is usually too coarse for species-diversity studies.
Figure 2: (a) A true fine-resolution field, which is aggregated; (b) The coarse-resolution aggregated field; (c) Downscaling to recover (a) from (b).
Figure 3: (a) Moss-data sampling sites in Southern Bunger Hills (fine resolution); (b) ERA5 soil-temperature data (coarse resolution) with the sampling sites in the background.