Getting Started
- **Simulated data**: When running `basd`, you'll likely want to adjust a simulated climate dataset, so we refer to the data we want to adjust as the "simulated data".
- **Observational data**: When running `basd`, you need a reference dataset. This will usually be an observational dataset over a historical period, so we refer to the reference data as the "observational data".
- **Application period**: `basd` adjusts and downscales data over a given time period, which we call the "application period".
- **Target period**: Similarly, `basd` uses observational data over a given time period to compare with simulated data over that same period. The period where we make this comparison is called the "target period".
When starting with `basd`, we first need to decide what climate data we wish to adjust and over what period, and what observational dataset we'll use as reference. In the quickstarter notebook we make use of small example datasets provided here.
For example, let's say we want to bias adjust and downscale one run of the CanESM5 model from the CMIP6 experiments, and we want to start with precipitation (shorthand name `pr`). We can use the W5E5v2.0 observational dataset, created for ISIMIP3, as our reference for bias adjustment and statistical downscaling. The W5E5v2.0 data covers 1970-2014, so we'll use that as our target period, and the whole CMIP6 future period of 2015-2100 as our application period.
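Those choices can be written down as plain variables. The names below are illustrative assumptions (not part of the `basd` API), matching the period variables used in the snippets later on this page.

```python
# Illustrative period settings for the CanESM5 / W5E5v2.0 example.
# These variable names are assumptions used only for this walkthrough.
target_start_year, target_end_year = 1970, 2014            # W5E5v2.0 coverage
application_start_year, application_end_year = 2015, 2100  # CMIP6 future period
```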
Then we can outline the steps we need to take to get our desired output.
1. **Save the input data to your machine.** This includes the simulated data, which covers your application period and target period, and the observational data, which covers the target period.
2. **Choose key parameters.** Each climate variable uses slightly different steps/parameters during `basd` that best model the distributions for that variable. These include the distribution family to use, lower/upper bounds and thresholds, the method of trend preservation, etc. You can learn more about each of these parameters on the Bias Adjustment and Statistical Downscaling wiki pages, or especially from Lange's paper. That paper also gives default parameters for each climate variable that are good for nearly all use-cases.
3. **Apply `basd`.** At this point the important decisions are made; you just need to create a Python script and call the `basd` functions while supplying your data and settings/parameters. Of course, this is easier said than done, so the next section is dedicated to this step.
1. In a Python script, load the relevant packages:
   - `basd` – for bias adjustment and statistical downscaling
   - `xarray` – for accessing, manipulating, and writing datasets
   - `dask` – for parallelization and lazy evaluation
   - `LocalCluster` – to spread processes over computing resources
   - `Client` – for managing processes, threads, the local cluster, etc.
   - `os` – optional, for basic OS tasks

```python
import os

import basd
import xarray as xr
from dask.distributed import LocalCluster, Client
```

2. Read in your data, and extract the three key periods.
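The imports above bring in `LocalCluster` and `Client`, though the snippets below never show them in use. A minimal sketch of starting one (worker and thread counts are illustrative; tune them to your machine):

```python
from dask.distributed import Client, LocalCluster

# Illustrative: a small local cluster. processes=False keeps workers
# in threads within this process; drop it to use separate worker
# processes for real workloads.
cluster = LocalCluster(n_workers=2, threads_per_worker=2, processes=False)
client = Client(cluster)
```

Once a `Client` is active, chunked `xarray`/`dask` operations are scheduled on the cluster automatically; call `client.close()` and `cluster.close()` when finished.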
```python
# Read in data
observational_data = xr.open_mfdataset(obs_input_file, chunks={'time': 365})
simulation_target = xr.open_mfdataset(sim_input_file, chunks={'time': 365})
simulation_application = simulation_target.copy()

# Slice time periods
observational_data = observational_data.sel(time=slice(f'{target_start_year}', f'{target_end_year}'))
simulation_target = simulation_target.sel(time=slice(f'{target_start_year}', f'{target_end_year}'))
simulation_application = simulation_application.sel(time=slice(f'{application_start_year}', f'{application_end_year}'))
```
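One thing worth knowing about the slicing above: xarray's label-based slices with year strings include both endpoints. A tiny self-contained check on synthetic data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily dataset spanning more than the target period
time = pd.date_range('1969-01-01', '2016-12-31', freq='D')
ds = xr.Dataset({'pr': ('time', np.zeros(time.size))}, coords={'time': time})

# Year-string slices keep 1970-01-01 through 2014-12-31, endpoints included
target = ds.sel(time=slice('1970', '2014'))
```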
3. Set your settings and parameters with the `Parameters` object.

Wet-day precipitation in a grid cell over time follows a gamma distribution, which has a lower bound of 0. Thus we set this in our parameters, and we set a lower threshold of 0.0000011574 since we specified "wet days". That exact value is referenced in Lange 2019 and so used here. We also specify that we want a "mixed" additive and multiplicative trend preservation. Again, you can learn more about what all of these mean in Lange 2019, or on the other wiki pages. The `n_iterations` parameter sets how many times we perform the downscaling fitting step.

```python
# Default Precipitation Parameters
parameters = basd.Parameters(
    lower_bound=0,
    lower_threshold=0.0000011574,
    trend_preservation='mixed',
    distribution='gamma',
    n_iterations=10
)
```
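As a sanity check on that threshold value: assuming `pr` is in the standard CMIP unit of kg m⁻² s⁻¹, 0.0000011574 is simply a 0.1 mm/day wet-day cutoff converted to per-second units:

```python
# 0.1 mm/day of rain is 0.1 kg m-2 per day; divide by seconds per day
mm_per_day = 0.1
seconds_per_day = 86400
threshold = mm_per_day / seconds_per_day  # ~1.1574e-06 kg m-2 s-1
```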
4. Apply the `basd` functions.

**4.1. Bias Adjustment**

First we initialize the bias adjustment step by providing our data, parameters, and the name of the variable that we're adjusting.

```python
# Initialize Bias Adjustment
ba = basd.init_bias_adjustment(
    observational_data,
    simulation_target,
    simulation_application,
    'pr',
    parameters
)
```

Then we pass in the output of that initialization, as well as output paths and file names depending on whether we want daily or monthly data output, or both as in this case. This performs the bias adjustment algorithm and saves the output as NetCDF.

```python
# Perform Bias Adjustment
basd.adjust_bias(
    init_output=ba,                      # Initialization output
    output_dir=output_ba_path,           # Output directory path
    day_file=output_day_ba_file_name,    # Output daily data file name
    month_file=output_mon_ba_file_name   # Output monthly data file name
)
```
**4.2. Statistical Downscaling**

Again, we begin by initializing our downscaling object, which requires the input datasets, the name of the climate variable, and the parameter object. The input datasets are the observational data and the bias-adjusted data created in the previous step.
We can read in our bias-adjusted data:

```python
ba_file_name = os.path.join(output_ba_path, output_day_ba_file_name)
ba_simulation_data = xr.open_mfdataset(
    ba_file_name,
    chunks={'time': 100}
)
```
and then initialize our downscaling process:
```python
ds = basd.init_downscaling(
    observational_data,
    ba_simulation_data,
    'pr',
    parameters
)
```
Again, just like the bias adjustment, we can now run the downscaling by supplying the output of that initialization, as well as output paths and file names depending on whether we want daily or monthly data output, or both as in this case. This performs the downscaling algorithm and saves the output as NetCDF.
```python
basd.downscale(
    init_output=ds,
    output_dir=output_basd_path,
    day_file=output_day_basd_file,
    month_file=output_mon_basd_file,
    encoding={'pr': fine_encoding}
)
```
In this case we also supplied an encoding for the precipitation data as it is saved to NetCDF. It may look like the following:

```python
# NetCDF output encoding
fine_encoding = {
    'zlib': True,                 # Use zlib compression
    'shuffle': True,              # Byte-shuffle filter, often useful when lots of 0s are present
    'complevel': 5,               # Compression level
    'fletcher32': False,          # Optional checksums
    'contiguous': False,          # Storage option good for large chunked data
    'chunksizes': (1, 360, 720),  # 1 time step per chunk, 360 lat, 720 lon
    'dtype': 'float32',
    'missing_value': 1e+20,
    '_FillValue': 1e+20
}
```
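For scale, the `chunksizes` above put one full 360 × 720 field (a global 0.5° grid) in each chunk along time; a rough estimate of the per-chunk footprint at `float32`:

```python
# One chunk: 1 time step x 360 lat x 720 lon, 4 bytes per float32 value
n_values = 1 * 360 * 720
chunk_bytes = n_values * 4  # about 1 MB per chunk
```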