Make your own forcing data from ERA5
Introduction: Many forcing datasets can be used to drive the model, such as ERA5 and NCEP (https://psl.noaa.gov/data/gridded/data.ncep.reanalysis.html). Here, we recommend ERA5 for making your own single-point forcing, because NCEP and similar datasets are too large to download conveniently. ERA5 data can be requested for a small area, which keeps the files lightweight to download and store.
New updates: ERA5 is now also available as a time-series product, which is faster and more efficient for single-point requests than extracting forcing from the gridded ERA5 source.
We have updated the workflow to use these newer sources: pass source='gee' or source='era5-land-ts' to get the forcing data efficiently.
Notes: The forcing is calculated somewhat differently for source='era5-land-ts' and source='cds'. This means you can use this dataset for experimenting with CLMU, but for prediction purposes, you will need to validate the output. Ideally, you should use your own forcing inputs for the best results.
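To make the validation step concrete, here is a minimal sketch (not part of pyclmuapp) of two common agreement metrics, mean bias and root-mean-square error, between the same variable taken from two forcing sources. The function name `bias_and_rmse` and the sample values are illustrative assumptions, not real ERA5 data.

```python
import math


def bias_and_rmse(reference, candidate):
    """Return (mean bias, RMSE) of candidate relative to reference.

    Both inputs are equal-length sequences of numbers, for example the
    same forcing variable read from two different sources.
    """
    if len(reference) != len(candidate):
        raise ValueError("series must have the same length")
    if not reference:
        raise ValueError("series must not be empty")
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Illustrative values only (hypothetical, not taken from any forcing file).
reference_series = [290.0, 291.0, 292.0, 293.0]
candidate_series = [290.5, 291.5, 291.5, 293.5]
bias, rmse = bias_and_rmse(reference_series, candidate_series)
print(f"bias={bias:.3f}, rmse={rmse:.3f}")
```

With real files, the two series could be read from the NetCDF outputs (for example with xarray) after aligning them on time. The variable names depend on the forcing file and are not assumed here.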
1 Download the required variables from Google Earth Engine (GEE)
Before running with source='gee', make sure the Earth Engine API is installed: pip install earthengine-api
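A quick way to confirm the dependency is present before calling get_forcing is sketched below. The helper `missing_packages` is a hypothetical name used for illustration; the only fact relied on is that the earthengine-api package is imported as `ee`.

```python
import importlib.util


def missing_packages(import_names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


# "ee" is the import name of the earthengine-api package.
missing = missing_packages(["ee"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("Earth Engine API is installed.")
```

The Earth Engine API normally also needs an authenticated session (`ee.Authenticate()` followed by `ee.Initialize()`); whether pyclmuapp handles this step itself is not stated in this notebook.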
[1]:
%%time
import pyclmuapp
from pyclmuapp import get_forcing
print(f"pyclmuapp version: {pyclmuapp.__version__}")
lat=51.5
lon=0.12
zbot=30
start_year=2018
end_year=2018
start_month=7
end_month=9
get_forcing(
lat=lat, lon=lon, zbot=zbot,
start_year=start_year, end_year=end_year,
start_month=start_month, end_month=end_month,
source='gee')
pyclmuapp version: 0.0.2
Get ERA5 data from 2018-07-01 to 2018-10-01 for (51.5, 0.12)
- 2018-07-01 ~ 2018-08-01
- 2018-08-01 ~ 2018-09-01
- 2018-09-01 ~ 2018-10-01
CPU times: user 730 ms, sys: 181 ms, total: 911 ms
Wall time: 9.13 s
[1]:
'/home/junjie/github/pyclmuapp/docs/notebooks/own/era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc'
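The log above shows the request being split into one window per month, each ending on the first day of the following month. The sketch below reproduces that splitting from the same four parameters. It mirrors the printed log only; how pyclmuapp builds the windows internally is an assumption, and `month_windows` is a hypothetical helper name.

```python
from datetime import date


def month_windows(start_year, start_month, end_year, end_month):
    """Return (start, end) ISO-date pairs, one per month, end exclusive."""
    windows = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        # First day of the following month, rolling over in December.
        if month == 12:
            next_year, next_month = year + 1, 1
        else:
            next_year, next_month = year, month + 1
        windows.append((date(year, month, 1).isoformat(),
                        date(next_year, next_month, 1).isoformat()))
        year, month = next_year, next_month
    return windows


for start, end in month_windows(2018, 7, 2018, 9):
    print(f"- {start} ~ {end}")
```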
2 Download the required variables with cdsapi
First, we will use the cdsapi package to download the data. If you don’t have the package installed, you can install it using the following command:
pip install cdsapi
Then store your CDS API URL and key in ~/.cdsapirc:
cat <<EOF > ~/.cdsapirc
url: {api-url}
key: {uid}:{api-key}
EOF
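Before starting a long download, it can help to confirm that the credentials file is in place. The sketch below only checks that a file in the `name: value` layout shown above contains non-empty `url` and `key` entries; it does not contact the CDS. The helper names and the placeholder URL are illustrative assumptions.

```python
import tempfile
from pathlib import Path


def read_cdsapirc(path):
    """Parse 'name: value' lines from a cdsapirc-style file into a dict."""
    settings = {}
    for line in Path(path).read_text().splitlines():
        # Split on the first colon only, so URLs and keys keep their colons.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            settings[name.strip()] = value.strip()
    return settings


def check_cdsapirc(path=None):
    """Return a list of problems; an empty list means the file looks complete."""
    path = Path(path) if path is not None else Path.home() / ".cdsapirc"
    if not path.exists():
        return [f"{path} does not exist"]
    settings = read_cdsapirc(path)
    return [f"missing '{name}' entry" for name in ("url", "key")
            if not settings.get(name)]


# Demonstration on a throwaway file with placeholder values.
# Call check_cdsapirc() with no argument to check ~/.cdsapirc instead.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / ".cdsapirc"
    sample.write_text("url: https://example.invalid/api\nkey: \n")
    sample_problems = check_cdsapirc(sample)
print(sample_problems)
```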
Note: the cdsapi download takes a long time to run, so you may want to run it in the background and check the output file later.
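One way to run the download in the background is to start the pyclmuapp command line as a separate process and send its output to a log file. The sketch below uses the get_forcing flags that appear in the command-line cell of this notebook; the helper names and the log file name `get_forcing.log` are illustrative assumptions.

```python
import subprocess


def build_get_forcing_cmd(lat, lon, zbot, start_year, end_year,
                          start_month, end_month, source):
    """Build the pyclmuapp get_forcing command line as an argument list."""
    return [
        "pyclmuapp", "--pyclmuapp_mode", "get_forcing",
        "--lat", str(lat), "--lon", str(lon), "--zbot", str(zbot),
        "--start_year", str(start_year), "--end_year", str(end_year),
        "--start_month", str(start_month), "--end_month", str(end_month),
        "--source", source,
    ]


def launch_in_background(cmd, log_path):
    """Start cmd without waiting for it; stdout and stderr go to log_path."""
    with open(log_path, "w") as log_file:
        # The child process keeps its own handle to the log file.
        process = subprocess.Popen(cmd, stdout=log_file,
                                   stderr=subprocess.STDOUT)
    return process


cmd = build_get_forcing_cmd(51.5, 0.12, 30, 2018, 2018, 7, 9, "cds")
print(" ".join(cmd))
# To actually start the download (requires pyclmuapp on PATH):
# process = launch_in_background(cmd, "get_forcing.log")
# process.poll() returns None while the download is still running.
```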
We can also use the interface to download the data. The code below is an example of how to do so.
ref: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form
[2]:
import pyclmuapp
from pyclmuapp import get_forcing
print(f"pyclmuapp version: {pyclmuapp.__version__}")
pyclmuapp version: 0.0.2
[3]:
%%time
lat=51.5
lon=0.12
zbot=30
start_year=2018
end_year=2018
start_month=7
end_month=9
get_forcing(
lat=lat, lon=lon, zbot=zbot,
start_year=start_year, end_year=end_year,
start_month=start_month, end_month=end_month,
source='era5-land-ts')
Downloading data for 51.5, 0.1, 2018-07-01 to 2018-10-01...
2026-01-10 23:17:06,727 WARNING [2025-06-04T00:00:00] This dataset provides user-selected location timeseries of [ERA5 Land data](https://doi.org/10.24381/cds.e2161bac) for a limited set of variables. Its content may be undergo changes over time (e.g. file format, data file structure, deprecation etc) and is **not recommended for use in a production environment**.
For users interested in large regions, the original ERA5 Land catalogue entry remains the more efficient option.
We will make every effort to notify users of changes, either through this banner and/or the [Forum](https://forum.ecmwf.int/).
2026-01-10 23:17:06,727 INFO Request ID is cc48820b-f777-4f81-a0af-bca1178bb679
2026-01-10 23:17:07,735 INFO status has been updated to accepted
2026-01-10 23:17:22,599 INFO status has been updated to successful
CPU times: user 122 ms, sys: 13.7 ms, total: 135 ms
Wall time: 19.3 s
/home/junjie/github/pyclmuapp/pyclmuapp/era_forcing.py:68: FutureWarning: In a future version of xarray the default value for compat will change from compat='no_conflicts' to compat='override'. This is likely to lead to different results when combining overlapping variables with the same name. To opt in to new defaults and get rid of these warnings now use `set_options(use_new_combine_kwarg_defaults=True) or set compat explicitly.
ds = xr.merge(xr_ls)
[3]:
'/home/junjie/github/pyclmuapp/docs/notebooks/own/era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc'
[4]:
%%time
lat=51.5
lon=0.12
zbot=30
start_year=2018
end_year=2018
start_month=7
end_month=9
get_forcing(
lat=lat, lon=lon, zbot=zbot,
start_year=start_year, end_year=end_year,
start_month=start_month, end_month=end_month,
source='cds')
download: ./era5_data/era5_single/era5_single_2018_07_51.5_0.12.zip
2026-01-10 23:17:26,038 INFO [2025-12-11T00:00:00] Please note that a dedicated catalogue entry for this dataset, post-processed and stored in Analysis Ready Cloud Optimized (ARCO) format (Zarr), is available for optimised time-series retrievals (i.e. for retrieving data from selected variables for a single point over an extended period of time in an efficient way). You can discover it [here](https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels-timeseries?tab=overview)
2026-01-10 23:17:26,039 INFO Request ID is 02fa3302-0fe9-4714-9c46-e1ca94b40160
2026-01-10 23:17:27,086 INFO status has been updated to accepted
2026-01-10 23:19:27,994 INFO status has been updated to successful
/home/junjie/github/pyclmuapp/pyclmuapp/era5_forcing.py:88: FutureWarning: In a future version of xarray the default value for compat will change from compat='no_conflicts' to compat='override'. This is likely to lead to different results when combining overlapping variables with the same name. To opt in to new defaults and get rid of these warnings now use `set_options(use_new_combine_kwarg_defaults=True) or set compat explicitly.
single_ds = xr.merge(ds_ls)
download: ./era5_data/era5_single/era5_single_2018_08_51.5_0.12.zip
2026-01-10 23:19:29,600 INFO [2025-12-11T00:00:00] Please note that a dedicated catalogue entry for this dataset, post-processed and stored in Analysis Ready Cloud Optimized (ARCO) format (Zarr), is available for optimised time-series retrievals (i.e. for retrieving data from selected variables for a single point over an extended period of time in an efficient way). You can discover it [here](https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels-timeseries?tab=overview)
2026-01-10 23:19:29,601 INFO Request ID is 0d8ffb15-92dd-41b0-ad90-d7f80af30db8
2026-01-10 23:19:29,647 INFO status has been updated to accepted
2026-01-10 23:19:40,724 INFO status has been updated to running
2026-01-10 23:25:56,899 INFO status has been updated to successful
download: ./era5_data/era5_single/era5_single_2018_09_51.5_0.12.zip
2026-01-10 23:25:59,335 INFO [2025-12-11T00:00:00] Please note that a dedicated catalogue entry for this dataset, post-processed and stored in Analysis Ready Cloud Optimized (ARCO) format (Zarr), is available for optimised time-series retrievals (i.e. for retrieving data from selected variables for a single point over an extended period of time in an efficient way). You can discover it [here](https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels-timeseries?tab=overview)
2026-01-10 23:25:59,335 INFO Request ID is 935048d3-0a4e-4428-a733-f2acf2e46e01
2026-01-10 23:25:59,391 INFO status has been updated to accepted
2026-01-10 23:26:09,228 INFO status has been updated to running
2026-01-10 23:32:20,975 INFO status has been updated to successful
CPU times: user 375 ms, sys: 30.2 ms, total: 405 ms
Wall time: 14min 57s
[4]:
'/home/junjie/github/pyclmuapp/docs/notebooks/own/era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc'
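The wall times reported by %%time in the three download cells can be put side by side to see how the sources compare for this three-month request. The numbers below are copied from the outputs above and come from a single run, so they depend on network conditions and the CDS queue at the time.

```python
# Wall times reported by %%time in the three cells above, in seconds.
wall_times = {
    "gee": 9.13,
    "era5-land-ts": 19.3,
    "cds": 14 * 60 + 57,  # reported as 14min 57s
}

fastest = min(wall_times.values())
for source, seconds in sorted(wall_times.items(), key=lambda item: item[1]):
    ratio = seconds / fastest
    print(f"{source:>13}: {seconds:7.1f} s  ({ratio:5.1f}x the fastest)")
```

In this run the cds request took roughly 46 times as long as era5-land-ts and roughly 98 times as long as gee.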
Command line
This does the same as above, from the command line.
[5]:
! pyclmuapp --pyclmuapp_mode get_forcing \
--lat 51.5 --lon 0.12 --zbot 30 \
--start_year 2018 --end_year 2018 \
--start_month 7 --end_month 9 --source "era5-land-ts"
# downloads and saves to the default folder `./era5_data/`
# the output file will look like `./era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc`
Namespace(init=False, pwd='/home/junjie/github/pyclmuapp/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='get_forcing', has_container=True, usr_domain=None, usr_forcing=None, usr_surfdata=None, output_prefix='_clm.nc', case_name='usp_case', run_startdate=None, start_tod='00000', stop_option='ndays', stop_n='1', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean=False, surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=51.5, lon=0.12, outputname='surfdata.nc', zbot=30, start_year=2018, end_year=2018, start_month=7, end_month=9, source='era5-land-ts')
The forcing file era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc already exists.
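The message above shows that an existing forcing file is detected rather than downloaded again. If you want to check for a file yourself before submitting a request, the sketch below rebuilds the file name from the request parameters. The prefixes and the field order are inferred from the three output paths shown in this notebook (both years are 2018 there, so the order of the year and month fields is an assumption), and `expected_forcing_path` is a hypothetical helper, not a pyclmuapp function.

```python
from pathlib import Path

# File-name prefixes inferred from the outputs shown in this notebook.
PREFIXES = {
    "gee": "era5_gee_forcing",
    "era5-land-ts": "era5_land_ts_forcing",
    "cds": "era5_forcing",
}


def expected_forcing_path(source, lat, lon, zbot, start_year, start_month,
                          end_year, end_month, folder="era5_data"):
    """Guess the forcing file path for a request (naming is inferred)."""
    name = (f"{PREFIXES[source]}_{lat}_{lon}_{zbot}_"
            f"{start_year}_{start_month}_{end_year}_{end_month}.nc")
    return Path(folder) / name


path = expected_forcing_path("era5-land-ts", 51.5, 0.12, 30, 2018, 7, 2018, 9)
print(path, "exists" if path.exists() else "not downloaded yet")
```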
Test the forcing data
[6]:
! pyclmuapp \
--container_type docker \
--iflog True \
--logfile "pyclmuapp.log" \
--usr_forcing "era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc" \
--usr_surfdata "surfdata.nc" \
--RUN_STARTDATE "2018-07-01" --STOP_OPTION "ndays" --STOP_N "15" \
--RUN_TYPE "coldstart" \
--hist_mfilt "1000000000" \
--output_prefix "_clm.nc" \
--CASE_NAME "pyclmuapp" \
--clean True
Namespace(init=False, pwd='/home/junjie/github/pyclmuapp/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='usp', has_container=True, usr_domain=None, usr_forcing='era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc', usr_surfdata='surfdata.nc', output_prefix='_clm.nc', case_name='pyclmuapp', run_startdate='2018-07-01', start_tod='00000', stop_option='ndays', stop_n='15', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean='True', surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=None, lon=None, outputname='surfdata.nc', zbot=30, start_year=2012, end_year=2012, start_month=1, end_month=12, source='cds')
Copying the file era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc to the /home/junjie/github/pyclmuapp/docs/notebooks/own/workdir/inputfolder/usp
The domain file is not provided
The case is: pyclmuapp
The log file is: pyclmuapp.log
The output file is: {'original': ['/home/junjie/github/pyclmuapp/docs/notebooks/own/workdir/outputfolder/lnd/hist/pyclmuapp.clm2.h0.2018-07-01-00000.nc']}
[7]:
! pyclmuapp \
--container_type docker \
--iflog True \
--logfile "pyclmuapp.log" \
--usr_forcing "era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc" \
--usr_surfdata "surfdata.nc" \
--RUN_STARTDATE "2018-07-01" --STOP_OPTION "ndays" --STOP_N "15" \
--RUN_TYPE "coldstart" \
--hist_mfilt "1000000000" \
--output_prefix "_clm.nc" \
--CASE_NAME "pyclmuapp" \
--clean True
Namespace(init=False, pwd='/home/junjie/github/pyclmuapp/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='usp', has_container=True, usr_domain=None, usr_forcing='era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc', usr_surfdata='surfdata.nc', output_prefix='_clm.nc', case_name='pyclmuapp', run_startdate='2018-07-01', start_tod='00000', stop_option='ndays', stop_n='15', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean='True', surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=None, lon=None, outputname='surfdata.nc', zbot=30, start_year=2012, end_year=2012, start_month=1, end_month=12, source='cds')
Copying the file era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc to the /home/junjie/github/pyclmuapp/docs/notebooks/own/workdir/inputfolder/usp
The domain file is not provided
The case is: pyclmuapp
The log file is: pyclmuapp.log
The output file is: {'original': ['/home/junjie/github/pyclmuapp/docs/notebooks/own/workdir/outputfolder/lnd/hist/pyclmuapp.clm2.h0.2018-07-01-00000.nc']}
[8]:
! pyclmuapp \
--container_type docker \
--iflog True \
--logfile "pyclmuapp.log" \
--usr_forcing "era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc" \
--usr_surfdata "surfdata.nc" \
--RUN_STARTDATE "2018-07-01" --STOP_OPTION "ndays" --STOP_N "15" \
--CASE_NAME "pyclmuapp" \
--clean True
Namespace(init=False, pwd='/home/junjie/github/pyclmuapp/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='usp', has_container=True, usr_domain=None, usr_forcing='era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc', usr_surfdata='surfdata.nc', output_prefix='_clm.nc', case_name='pyclmuapp', run_startdate='2018-07-01', start_tod='00000', stop_option='ndays', stop_n='15', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean='True', surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=None, lon=None, outputname='surfdata.nc', zbot=30, start_year=2012, end_year=2012, start_month=1, end_month=12, source='cds')
Copying the file era5_forcing_51.5_0.12_30_2018_7_2018_9.nc to the /home/junjie/github/pyclmuapp/docs/notebooks/own/workdir/inputfolder/usp
The domain file is not provided
The case is: pyclmuapp
The log file is: pyclmuapp.log
The output file is: {'original': ['/home/junjie/github/pyclmuapp/docs/notebooks/own/workdir/outputfolder/lnd/hist/pyclmuapp.clm2.h0.2018-07-01-00000.nc']}
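All three test runs above use --CASE_NAME "pyclmuapp" and report the same output file path, so a later run may overwrite the history file written by an earlier one. The sketch below builds one command per forcing file with a distinct case name, using only flags that appear in the cells above. The case names (for example `pyclmuapp_gee`) are hypothetical, and the expectation that the history file name follows the case name is inferred from the output path, not confirmed.

```python
def build_test_run_cmd(forcing_file, case_name, start_date="2018-07-01",
                       n_days=15, surfdata="surfdata.nc"):
    """Build a pyclmuapp test-run command line as an argument list."""
    return [
        "pyclmuapp",
        "--container_type", "docker",
        "--iflog", "True",
        "--logfile", "pyclmuapp.log",
        "--usr_forcing", forcing_file,
        "--usr_surfdata", surfdata,
        "--RUN_STARTDATE", start_date,
        "--STOP_OPTION", "ndays", "--STOP_N", str(n_days),
        "--CASE_NAME", case_name,
        "--clean", "True",
    ]


# Forcing files produced earlier in this notebook, keyed by a short label.
forcing_files = {
    "gee": "era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc",
    "land_ts": "era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc",
    "cds": "era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc",
}

commands = {
    label: build_test_run_cmd(path, case_name=f"pyclmuapp_{label}")
    for label, path in forcing_files.items()
}
for label, cmd in commands.items():
    print(" ".join(cmd))
```

Each argument list can be passed to subprocess.run, or joined and pasted into a shell, to run the cases one after another.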