Make your own forcing data from ERA5

Introduction: Many forcing datasets can be used to drive the model, such as ERA5 and NCEP (https://psl.noaa.gov/data/gridded/data.ncep.reanalysis.html). Here we recommend ERA5 for making your own single-point forcing, because NCEP and similar datasets are too large to download conveniently. ERA5 data can be requested for a small area, which keeps the download and the stored files lightweight.

New updates: ERA5 now provides a time-series source, which is faster and more efficient than getting the forcing from the gridded ERA5 source.

We have updated the workflow to support the new ERA5 sources. Use source='gee' or source='era5-land-ts' to get the forcing data efficiently.

Notes: The era5-land-ts and cds sources calculate the forcing somewhat differently. This means you can use the era5-land-ts dataset for experimenting with CLMU, but for prediction purposes you will need to validate the output. Ideally, you should use your own forcing inputs for the best results.
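
One simple way to start validating one source against another is to compare the same variable from both forcing files with a mean bias and a root-mean-square error. The sketch below is only an illustration: the function name `bias_and_rmse` and the sample values are hypothetical, and it assumes you have already extracted the two series (for example with xarray) on the same time steps.

```python
import math

def bias_and_rmse(reference, candidate):
    """Return (mean bias, RMSE) of candidate relative to reference."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    diffs = [c - r for r, c in zip(reference, candidate)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse

# Made-up air temperatures in K, for illustration only (not real ERA5 values)
reference_series = [290.0, 291.0, 292.0, 293.0]
candidate_series = [290.5, 291.5, 291.5, 293.5]

bias, rmse = bias_and_rmse(reference_series, candidate_series)
print(f"bias={bias:.2f} K, rmse={rmse:.2f} K")
```

A small bias with a large RMSE would suggest the two sources agree on average but differ in timing or variability, which matters for sub-daily urban simulations.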

1 Download the required variables from Google Earth Engine (GEE)

Before running with source='gee', make sure to install the Earth Engine API: pip install earthengine-api

[1]:
%%time

import pyclmuapp
from pyclmuapp import get_forcing

print(f"pyclmuapp version: {pyclmuapp.__version__}")

lat=51.5
lon=0.12
zbot=30
start_year=2018
end_year=2018
start_month=7
end_month=9
get_forcing(
    lat=lat, lon=lon, zbot=zbot,
    start_year=start_year, end_year=end_year,
    start_month=start_month, end_month=end_month,
    source='gee')
pyclmuapp version: 0.0.2
Get ERA5 data from 2018-07-01 to 2018-10-01 for (51.5, 0.12)
  - 2018-07-01 ~ 2018-08-01
  - 2018-08-01 ~ 2018-09-01
  - 2018-09-01 ~ 2018-10-01
CPU times: user 914 ms, sys: 296 ms, total: 1.21 s
Wall time: 11.1 s
[1]:
'/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc'
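
The log above suggests that the requested period is split into one request per month, with each end date being the first day of the following month. The sketch below reproduces that splitting so you can predict how many requests a given period will produce. It is a reconstruction from the printed output, not pyclmuapp's own code, and `monthly_chunks` is a hypothetical helper name.

```python
from datetime import date

def monthly_chunks(start_year, start_month, end_year, end_month):
    """Yield (start, end) date pairs, one per month, with an exclusive end."""
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        # Roll over to January of the next year after December
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        yield date(year, month, 1), date(next_year, next_month, 1)
        year, month = next_year, next_month

for start, end in monthly_chunks(2018, 7, 2018, 9):
    print(f"  - {start} ~ {end}")
```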

2 Download the required variables with cdsapi

First, we will use the cdsapi package to download the data. If you don't have the package installed, install it with the following command:

pip install cdsapi

Then save your CDS API credentials in ~/.cdsapirc:

cat <<EOF > ~/.cdsapirc
url: {api-url}
key: {uid}:{api-key}
EOF
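
Before starting a long download, it can help to confirm that the credentials file has both entries filled in. The following sketch parses the `name: value` layout shown above and reports entries that are missing or still contain the `{...}` placeholders. The helper names are hypothetical, and the check only looks at the file's shape; it does not contact the CDS or verify that the key is valid.

```python
from pathlib import Path

def parse_cdsapirc(text):
    """Parse 'name: value' lines into a dict, skipping comment lines."""
    settings = {}
    for line in text.splitlines():
        if ":" in line and not line.lstrip().startswith("#"):
            name, value = line.split(":", 1)  # split once: values contain ':'
            settings[name.strip()] = value.strip()
    return settings

def missing_entries(settings):
    """Return required entries that are absent or still placeholders."""
    return [name for name in ("url", "key")
            if not settings.get(name) or settings[name].startswith("{")]

rc_file = Path.home() / ".cdsapirc"
if rc_file.exists():
    print("missing entries:", missing_entries(parse_cdsapirc(rc_file.read_text())))
else:
    print(f"{rc_file} not found")
```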

How to get your CDS API key?

Note: this script takes a long time to run, so you may want to run it in the background and check the output file later.

You can also use the interface to download the data. The code below is an example of how to do so.

ref: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form

[2]:
import pyclmuapp
from pyclmuapp import get_forcing

print(f"pyclmuapp version: {pyclmuapp.__version__}")
pyclmuapp version: 0.0.2
[3]:
%%time

lat=51.5
lon=0.12
zbot=30
start_year=2018
end_year=2018
start_month=7
end_month=9
get_forcing(
    lat=lat, lon=lon, zbot=zbot,
    start_year=start_year, end_year=end_year,
    start_month=start_month, end_month=end_month,
    source='era5-land-ts')
Downloading data for 51.5, 0.1, 2018-07-01 to 2018-10-01...
2025-09-07 16:28:53,277 WARNING [2025-06-04T00:00:00] This dataset provides user-selected location timeseries of [ERA5 Land data](https://doi.org/10.24381/cds.e2161bac) for a limited set of variables. Its content may be undergo changes over time (e.g. file format, data file structure, deprecation etc) and is **not recommended for use in a production environment**.

For users interested in large regions, the original ERA5 Land catalogue entry remains the more efficient option.
We will make every effort to notify users of changes, either through this banner and/or the [Forum](https://forum.ecmwf.int/).
2025-09-07 16:28:53,278 WARNING [2025-09-07T15:28:53.255384] You are using a deprecated API endpoint. If you are using cdsapi, please upgrade to the latest version.
2025-09-07 16:28:53,279 INFO Request ID is b743ee3c-1455-4447-bf63-ddc21d2809bb
2025-09-07 16:28:53,356 INFO status has been updated to accepted
2025-09-07 16:29:00,766 INFO status has been updated to running
2025-09-07 16:29:13,598 INFO status has been updated to successful

CPU times: user 693 ms, sys: 242 ms, total: 935 ms
Wall time: 23.8 s
[3]:
'/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc'
[4]:
%%time

lat=51.5
lon=0.12
zbot=30
start_year=2018
end_year=2018
start_month=7
end_month=9
get_forcing(
    lat=lat, lon=lon, zbot=zbot,
    start_year=start_year, end_year=end_year,
    start_month=start_month, end_month=end_month,
    source='cds')
download: ./era5_data/era5_single/era5_single_2018_07_51.5_0.12.zip
2025-09-07 16:29:18,199 WARNING [2025-09-07T15:29:18.180357] You are using a deprecated API endpoint. If you are using cdsapi, please upgrade to the latest version.
2025-09-07 16:29:18,200 INFO Request ID is 10d5058b-44fb-43d4-b168-aef52f0ca94c
2025-09-07 16:29:18,262 INFO status has been updated to accepted
2025-09-07 16:29:30,737 INFO status has been updated to running
2025-09-07 16:37:37,299 INFO status has been updated to successful

download: ./era5_data/era5_single/era5_single_2018_08_51.5_0.12.zip
2025-09-07 16:37:39,234 WARNING [2025-09-07T15:37:39.214193] You are using a deprecated API endpoint. If you are using cdsapi, please upgrade to the latest version.
2025-09-07 16:37:39,234 INFO Request ID is 72f36502-15ca-4795-af8a-891e76c90535
2025-09-07 16:37:39,295 INFO status has been updated to accepted
2025-09-07 16:37:51,874 INFO status has been updated to running
2025-09-07 16:43:58,271 INFO status has been updated to successful

download: ./era5_data/era5_single/era5_single_2018_09_51.5_0.12.zip
2025-09-07 16:44:00,283 WARNING [2025-09-07T15:44:00.261522] You are using a deprecated API endpoint. If you are using cdsapi, please upgrade to the latest version.
2025-09-07 16:44:00,284 INFO Request ID is 3445edd0-4c41-4aea-a4cc-3de2a291396d
2025-09-07 16:44:00,387 INFO status has been updated to accepted
2025-09-07 16:44:12,881 INFO status has been updated to running
2025-09-07 16:52:19,252 INFO status has been updated to successful

CPU times: user 773 ms, sys: 607 ms, total: 1.38 s
Wall time: 23min 5s
[4]:
'/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc'
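
The three returned paths follow a visible naming pattern: a source-specific prefix followed by latitude, longitude, zbot, start year, start month, end year, and end month. The sketch below rebuilds the expected file name so a script can check whether a forcing file is already present before requesting it again. The prefixes and the `era5_data` folder are read off the outputs above rather than taken from pyclmuapp's documentation, and `expected_forcing_path` is a hypothetical helper.

```python
from pathlib import Path

# Prefixes observed in the file names returned above (an observation,
# not a documented contract of pyclmuapp).
PREFIX = {
    "gee": "era5_gee_forcing",
    "era5-land-ts": "era5_land_ts_forcing",
    "cds": "era5_forcing",
}

def expected_forcing_path(source, lat, lon, zbot, start_year, start_month,
                          end_year, end_month, folder="era5_data"):
    """Build the forcing file path that matches the observed naming pattern."""
    name = (f"{PREFIX[source]}_{lat}_{lon}_{zbot}_"
            f"{start_year}_{start_month}_{end_year}_{end_month}.nc")
    return Path(folder) / name

for source in PREFIX:
    path = expected_forcing_path(source, 51.5, 0.12, 30, 2018, 7, 2018, 9)
    print(f"{source:13s} {path} exists={path.exists()}")
```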

Command line

The same download can be run from the command line:

[5]:
! pyclmuapp --pyclmuapp_mode get_forcing \
    --lat 51.5 --lon 0.12 --zbot 30 \
    --start_year 2018 --end_year 2018 \
    --start_month 7 --end_month 9 --source "era5-land-ts"
# downloads the data and saves the forcing file in the default folder `./era5_data/`
# the output file will look like `./era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc`
Namespace(init=False, pwd='/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='get_forcing', has_container=True, usr_domain=None, usr_forcing=None, usr_surfdata=None, output_prefix='_clm.nc', case_name='usp_case', run_startdate=None, start_tod='00000', stop_option='ndays', stop_n='1', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean=False, surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=51.5, lon=0.12, outputname='surfdata.nc', zbot=30, start_year=2018, end_year=2018, start_month=7, end_month=9, source='era5-land-ts')
The forcing file era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc already exists.
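
If you want to launch the same command from a Python script rather than a notebook cell, one option is to build the argument list and pass it to `subprocess`. The sketch below only uses the flags shown in the cell above; `get_forcing_command` is a hypothetical helper, and the final `subprocess.run` call is left as a comment because it requires pyclmuapp and valid credentials.

```python
import shlex

def get_forcing_command(lat, lon, zbot, start_year, end_year,
                        start_month, end_month, source):
    """Build the argument list for the get_forcing command-line call."""
    options = {
        "--pyclmuapp_mode": "get_forcing",
        "--lat": lat, "--lon": lon, "--zbot": zbot,
        "--start_year": start_year, "--end_year": end_year,
        "--start_month": start_month, "--end_month": end_month,
        "--source": source,
    }
    cmd = ["pyclmuapp"]
    for flag, value in options.items():
        cmd += [flag, str(value)]
    return cmd

cmd = get_forcing_command(51.5, 0.12, 30, 2018, 2018, 7, 9, "era5-land-ts")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list instead of a single string avoids shell quoting problems when a path or value contains spaces.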

Test the forcing data

[6]:
! pyclmuapp \
    --container_type docker \
    --iflog True \
    --logfile "pyclmuapp.log" \
    --usr_forcing "era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc" \
    --usr_surfdata "surfdata.nc" \
    --RUN_STARTDATE "2018-07-01" --STOP_OPTION "ndays" --STOP_N "15" \
    --RUN_TYPE "coldstart" \
    --hist_mfilt "1000000000" \
    --output_prefix "_clm.nc" \
    --CASE_NAME "pyclmuapp" \
    --clean True
Namespace(init=False, pwd='/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='usp', has_container=True, usr_domain=None, usr_forcing='era5_data/era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc', usr_surfdata='surfdata.nc', output_prefix='_clm.nc', case_name='pyclmuapp', run_startdate='2018-07-01', start_tod='00000', stop_option='ndays', stop_n='15', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean='True', surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=None, lon=None, outputname='surfdata.nc', zbot=30, start_year=2012, end_year=2012, start_month=1, end_month=12, source='cds')
Copying the file era5_gee_forcing_51.5_0.12_30_2018_7_2018_9.nc to the /Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/workdir/inputfolder/usp
The domain file is not provided
The case is:  pyclmuapp
The log file is:  pyclmuapp.log
The output file is:  {'original': ['/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/workdir/outputfolder/lnd/hist/pyclmuapp_clm0_2025-09-07_16-52-52_clm.nc']}
[7]:
! pyclmuapp \
    --container_type docker \
    --iflog True \
    --logfile "pyclmuapp.log" \
    --usr_forcing "era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc" \
    --usr_surfdata "surfdata.nc" \
    --RUN_STARTDATE "2018-07-01" --STOP_OPTION "ndays" --STOP_N "15" \
    --RUN_TYPE "coldstart" \
    --hist_mfilt "1000000000" \
    --output_prefix "_clm.nc" \
    --CASE_NAME "pyclmuapp" \
    --clean True
Namespace(init=False, pwd='/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='usp', has_container=True, usr_domain=None, usr_forcing='era5_data/era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc', usr_surfdata='surfdata.nc', output_prefix='_clm.nc', case_name='pyclmuapp', run_startdate='2018-07-01', start_tod='00000', stop_option='ndays', stop_n='15', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean='True', surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=None, lon=None, outputname='surfdata.nc', zbot=30, start_year=2012, end_year=2012, start_month=1, end_month=12, source='cds')
Copying the file era5_land_ts_forcing_51.5_0.12_30_2018_7_2018_9.nc to the /Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/workdir/inputfolder/usp
The domain file is not provided
The case is:  pyclmuapp
The log file is:  pyclmuapp.log
The output file is:  {'original': ['/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/workdir/outputfolder/lnd/hist/pyclmuapp_clm0_2025-09-07_16-53-08_clm.nc']}
[8]:
! pyclmuapp \
    --container_type docker \
    --iflog True \
    --logfile "pyclmuapp.log" \
    --usr_forcing "era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc" \
    --usr_surfdata "surfdata.nc" \
    --RUN_STARTDATE "2018-07-01" --STOP_OPTION "ndays" --STOP_N "15" \
    --CASE_NAME "pyclmuapp" \
    --clean True
Namespace(init=False, pwd='/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own', container_type='docker', input_path=None, output_path=None, log_path=None, scripts_path=None, pyclmuapp_mode='usp', has_container=True, usr_domain=None, usr_forcing='era5_data/era5_forcing_51.5_0.12_30_2018_7_2018_9.nc', usr_surfdata='surfdata.nc', output_prefix='_clm.nc', case_name='pyclmuapp', run_startdate='2018-07-01', start_tod='00000', stop_option='ndays', stop_n='15', run_type='coldstart', run_refcase='None', run_refdate='None', run_reftod='00000', urban_hac='ON', iflog=True, logfile='pyclmuapp.log', hist_type='GRID', hist_nhtfrq=1, hist_mfilt=1000000000, clean='True', surf_var=None, surf_action=0, forcing_var=None, forcing_action=0, script=None, urbsurf=None, soildata=None, pct_urban=[0, 0, 100.0], lat=None, lon=None, outputname='surfdata.nc', zbot=30, start_year=2012, end_year=2012, start_month=1, end_month=12, source='cds')
Copying the file era5_forcing_51.5_0.12_30_2018_7_2018_9.nc to the /Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/workdir/inputfolder/usp
The domain file is not provided
The case is:  pyclmuapp
The log file is:  pyclmuapp.log
The output file is:  {'original': ['/Users/user/Documents/GitHub/pyclmuapp_env/docs/notebooks/own/workdir/outputfolder/lnd/hist/pyclmuapp_clm0_2025-09-07_16-53-22_clm.nc']}
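
Each run above ends with a line of the form `The output file is:  {...}`. If you capture that console output, for example to compare the three runs afterwards, the history file paths can be recovered with a few lines of Python. This is a sketch that assumes the line format stays exactly as printed above; `parse_output_files` is a hypothetical helper and the sample path is a shortened placeholder.

```python
import ast

MARKER = "The output file is:"

def parse_output_files(log_text):
    """Return the dict printed after MARKER, or {} if the line is absent."""
    for line in log_text.splitlines():
        if line.startswith(MARKER):
            # The remainder is a Python literal, e.g. {'original': ['...nc']}
            return ast.literal_eval(line[len(MARKER):].strip())
    return {}

# Shortened placeholder log, for illustration only
sample_log = (
    "The case is:  pyclmuapp\n"
    "The log file is:  pyclmuapp.log\n"
    "The output file is:  {'original': ['workdir/outputfolder/lnd/hist/example_clm.nc']}\n"
)
print(parse_output_files(sample_log)["original"][0])
```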