Add function for accessing MERRA-2 #2572
Open
kandersolar wants to merge 15 commits into pvlib:main from kandersolar:merra2
+333
−1
ff35e6e function, tests, whatsnew, docs (kandersolar)
880646c lint (kandersolar)
40415fd simplify tz handling (kandersolar)
1418273 make new earthdata secrets accessible to tests (kandersolar)
fbb7bb3 add test for tz-aware inputs (kandersolar)
e48901d Merge branch 'main' into merra2 (kandersolar)
12fcd7a Merge remote-tracking branch 'upstream/main' into merra2 (kandersolar)
5e80cba add tests for HTTPError on bad inputs (kandersolar)
b63f621 Merge branch 'main' into merra2 (kandersolar)
5f7fc9e add a few more variables to docstring (kandersolar)
cb728c1 Merge branch 'main' into merra2 (kandersolar)
ecad43b Apply suggestions from code review (kandersolar)
e88a8cc add link to datasets; add table with variable description (kandersolar)
6151946 tweak tests (kandersolar)
c4c72e0 add LWGNT, LWGEM to variable map and docstring (kandersolar)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,193 @@ | ||
| import pandas as pd | ||
| import requests | ||
| from io import StringIO | ||
|
|
||
|
|
||
| VARIABLE_MAP = { | ||
| 'SWGDN': 'ghi', | ||
| 'SWGDNCLR': 'ghi_clear', | ||
| 'ALBEDO': 'albedo', | ||
| 'LWGNT': 'lwn', | ||
| 'LWGEM': 'lwu', | ||
| 'T2M': 'temp_air', | ||
| 'T2MDEW': 'temp_dew', | ||
| 'PS': 'pressure', | ||
| 'TOTEXTTAU': 'aod550', | ||
| } | ||
|
|
||
|
|
||
| def get_merra2(latitude, longitude, start, end, username, password, dataset, | ||
| variables, map_variables=True): | ||
| """ | ||
| Retrieve MERRA-2 time-series irradiance and meteorological reanalysis data | ||
| from NASA's GESDISC data archive. | ||
|
|
||
| MERRA-2 [1]_ offers modeled data for many atmospheric quantities at hourly | ||
| resolution on a 0.5° x 0.625° global grid. | ||
|
|
||
| Access must be granted to the GESDISC data archive before EarthData | ||
| credentials will work. See [2]_ for instructions. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| latitude : float | ||
| In decimal degrees, north is positive (ISO 19115). | ||
| longitude : float | ||
| In decimal degrees, east is positive (ISO 19115). | ||
| start : datetime like or str | ||
| First timestamp of the requested period. If a timezone is not | ||
| specified, UTC is assumed. | ||
| end : datetime like or str | ||
| Last timestamp of the requested period. If a timezone is not | ||
| specified, UTC is assumed. Must be in the same year as ``start``. | ||
| username : str | ||
| NASA EarthData username. | ||
| password : str | ||
| NASA EarthData password. | ||
| dataset : str | ||
| Dataset name (with version), e.g. "M2T1NXRAD.5.12.4". | ||
| variables : list of str | ||
| List of variable names to retrieve. See the documentation of the | ||
| specific dataset you are accessing for options. | ||
| map_variables : bool, default True | ||
| When true, renames columns of the DataFrame to pvlib variable names | ||
| where applicable. See variable :const:`VARIABLE_MAP`. | ||
|
|
||
| Raises | ||
| ------ | ||
| ValueError | ||
| If ``start`` and ``end`` are in different years, when converted to UTC. | ||
|
|
||
| Returns | ||
| ------- | ||
| data : pd.DataFrame | ||
| Time series data. The index corresponds to the middle of the interval. | ||
| meta : dict | ||
| Metadata. | ||
|
|
||
| Notes | ||
| ----- | ||
| The following datasets provide quantities useful for PV modeling: | ||
|
|
||
| +------------------------------------+-----------+------------+ | ||
| | Dataset | Variable | pvlib name | | ||
| +====================================+===========+============+ | ||
| | `M2T1NXRAD.5.12.4 <M2T1NXRAD_>`_ | SWGDN | ghi | | ||
| | +-----------+------------+ | ||
| | | SWGDNCLR | ghi_clear | | ||
| | +-----------+------------+ | ||
| | | ALBEDO | albedo | | ||
| | +-----------+------------+ | ||
| | | LWGNT | lwn | | ||
| | +-----------+------------+ | ||
| | | LWGEM | lwu | | ||
| +------------------------------------+-----------+------------+ | ||
| | `M2T1NXSLV.5.12.4 <M2T1NXSLV_>`_ | T2M | temp_air | | ||
| | +-----------+------------+ | ||
| | | U10 | n/a | | ||
| | +-----------+------------+ | ||
| | | V10 | n/a | | ||
| | +-----------+------------+ | ||
| | | T2MDEW | temp_dew | | ||
| | +-----------+------------+ | ||
| | | PS | pressure | | ||
| | +-----------+------------+ | ||
| | | TO3 | n/a | | ||
| | +-----------+------------+ | ||
| | | TQV | n/a | | ||
| +------------------------------------+-----------+------------+ | ||
| | `M2T1NXAER.5.12.4 <M2T1NXAER_>`_ | TOTEXTTAU | aod550 | | ||
| | +-----------+------------+ | ||
| | | TOTSCATAU | n/a | | ||
| | +-----------+------------+ | ||
| | | TOTANGSTR | n/a | | ||
| +------------------------------------+-----------+------------+ | ||
|
|
||
| .. _M2T1NXRAD: https://disc.gsfc.nasa.gov/datasets/M2T1NXRAD_5.12.4/summary | ||
| .. _M2T1NXSLV: https://disc.gsfc.nasa.gov/datasets/M2T1NXSLV_5.12.4/summary | ||
| .. _M2T1NXAER: https://disc.gsfc.nasa.gov/datasets/M2T1NXAER_5.12.4/summary | ||
|
|
||
| A complete list of datasets and their documentation is available at [3]_. | ||
|
|
||
| Note that MERRA-2 does not currently provide DNI or DHI. | ||
|
|
||
| References | ||
| ---------- | ||
| .. [1] https://gmao.gsfc.nasa.gov/gmao-products/merra-2/ | ||
|
||
| .. [2] https://disc.gsfc.nasa.gov/earthdata-login | ||
| .. [3] https://disc.gsfc.nasa.gov/datasets?project=MERRA-2 | ||
| """ | ||
|
|
||
| # general API info here: | ||
| # https://docs.unidata.ucar.edu/tds/5.0/userguide/netcdf_subset_service_ref.html # noqa: E501 | ||
|
|
||
| def _to_utc_dt_notz(dt): | ||
| dt = pd.to_datetime(dt) | ||
| if dt.tzinfo is not None: | ||
| # convert to utc, then drop tz so that isoformat() is clean | ||
| dt = dt.tz_convert("UTC").tz_localize(None) | ||
| return dt | ||
|
|
||
| start = _to_utc_dt_notz(start) | ||
| end = _to_utc_dt_notz(end) | ||
|
|
||
| if (year := start.year) != end.year: | ||
| raise ValueError("start and end must be in the same year (in UTC)") | ||
|
|
||
| url = ( | ||
| "https://goldsmr4.gesdisc.eosdis.nasa.gov/thredds/ncss/grid/" | ||
| f"MERRA2_aggregation/{dataset}/{dataset}_Aggregation_{year}.ncml" | ||
| ) | ||
|
|
||
| parameters = { | ||
| 'var': ",".join(variables), | ||
| 'latitude': latitude, | ||
| 'longitude': longitude, | ||
| 'time_start': start.isoformat() + "Z", | ||
| 'time_end': end.isoformat() + "Z", | ||
| 'accept': 'csv', | ||
| } | ||
|
|
||
| auth = (username, password) | ||
|
|
||
| with requests.Session() as session: | ||
| session.auth = auth | ||
| login = session.request('get', url, params=parameters) | ||
| response = session.get(login.url, auth=auth, params=parameters) | ||
|
|
||
| response.raise_for_status() | ||
|
|
||
| content = response.content.decode('utf-8') | ||
| buffer = StringIO(content) | ||
| df = pd.read_csv(buffer) | ||
|
|
||
| df.index = pd.to_datetime(df['time']) | ||
|
|
||
| meta = {} | ||
| meta['dataset'] = dataset | ||
| meta['station'] = df['station'].values[0] | ||
| meta['latitude'] = df['latitude[unit="degrees_north"]'].values[0] | ||
| meta['longitude'] = df['longitude[unit="degrees_east"]'].values[0] | ||
|
|
||
| # drop the non-data columns | ||
| dropcols = ['time', 'station', 'latitude[unit="degrees_north"]', | ||
| 'longitude[unit="degrees_east"]'] | ||
| df = df.drop(columns=dropcols) | ||
|
|
||
| # column names are like T2M[unit="K"] by default. extract the unit | ||
| # for the metadata, then rename col to just T2M | ||
| units = {} | ||
| rename = {} | ||
| for col in df.columns: | ||
| name, _ = col.split("[", maxsplit=1) | ||
| unit = col.split('"')[1] | ||
| units[name] = unit | ||
| rename[col] = name | ||
|
|
||
| meta['units'] = units | ||
| df = df.rename(columns=rename) | ||
|
|
||
| if map_variables: | ||
| df = df.rename(columns=VARIABLE_MAP) | ||
|
|
||
| return df, meta | ||
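
The column handling at the end of `get_merra2` can be tried in isolation, without EarthData credentials. The following is a minimal sketch rather than the PR's code: `split_name_and_unit` is a hypothetical helper name, and the header strings are made up to match the `NAME[unit="..."]` pattern described in the diff's comment. The units `W m-2` and `1` are the ones the tests in this PR expect.

```python
def split_name_and_unit(column):
    """Split a header such as 'T2M[unit="K"]' into ('T2M', 'K')."""
    # Everything before the first "[" is the variable name.
    name, _ = column.split("[", maxsplit=1)
    # The unit is the text between the first pair of double quotes.
    unit = column.split('"')[1]
    return name, unit


# Hypothetical header row, modeled on the column names described in the diff.
columns = ['SWGDN[unit="W m-2"]', 'ALBEDO[unit="1"]', 'T2M[unit="K"]']

units = {}   # variable name -> unit string, stored in the metadata
rename = {}  # original column label -> bare variable name
for col in columns:
    name, unit = split_name_and_unit(col)
    units[name] = unit
    rename[col] = name

print(units)
print(rename)
```

Note that `meta['units']` is keyed by the original MERRA-2 names (for example `SWGDN`), even when `map_variables=True` renames the DataFrame columns afterwards. This matches `expected_meta` in the tests.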
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| """ | ||
| tests for pvlib/iotools/merra2.py | ||
| """ | ||
|
|
||
| import pandas as pd | ||
| import pytest | ||
| import pvlib | ||
| import os | ||
| import requests | ||
| from tests.conftest import RERUNS, RERUNS_DELAY, requires_earthdata_credentials | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def params(): | ||
| earthdata_username = os.environ["EARTHDATA_USERNAME"] | ||
| earthdata_password = os.environ["EARTHDATA_PASSWORD"] | ||
|
|
||
| return { | ||
| 'latitude': 40.01, 'longitude': -80.01, | ||
| 'start': '2020-06-01 15:00', 'end': '2020-06-01 20:00', | ||
| 'dataset': 'M2T1NXRAD.5.12.4', 'variables': ['ALBEDO', 'SWGDN'], | ||
| 'username': earthdata_username, 'password': earthdata_password, | ||
| } | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def expected(): | ||
| index = pd.date_range("2020-06-01 15:30", "2020-06-01 20:30", freq="h", | ||
| tz="UTC") | ||
| index.name = 'time' | ||
| albedo = [0.163931, 0.1609407, 0.1601474, 0.1612476, 0.164664, 0.1711341] | ||
| ghi = [ 930., 1002.75, 1020.25, 981.25, 886.5, 743.5] | ||
| df = pd.DataFrame({'albedo': albedo, 'ghi': ghi}, index=index) | ||
| return df | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def expected_meta(): | ||
| return { | ||
| 'dataset': 'M2T1NXRAD.5.12.4', | ||
| 'station': 'GridPointRequestedAt[40.010N_80.010W]', | ||
| 'latitude': 40.0, | ||
| 'longitude': -80.0, | ||
| 'units': {'ALBEDO': '1', 'SWGDN': 'W m-2'} | ||
| } | ||
|
|
||
|
|
||
| @requires_earthdata_credentials | ||
| @pytest.mark.remote_data | ||
| @pytest.mark.flaky(reruns=RERUNS, reruns_delay=RERUNS_DELAY) | ||
| def test_get_merra2(params, expected, expected_meta): | ||
| df, meta = pvlib.iotools.get_merra2(**params) | ||
| pd.testing.assert_frame_equal(df, expected, check_freq=False) | ||
| assert meta == expected_meta | ||
|
|
||
|
|
||
| @requires_earthdata_credentials | ||
| @pytest.mark.remote_data | ||
| @pytest.mark.flaky(reruns=RERUNS, reruns_delay=RERUNS_DELAY) | ||
| def test_get_merra2_map_variables(params, expected, expected_meta): | ||
| df, meta = pvlib.iotools.get_merra2(**params, map_variables=False) | ||
| expected = expected.rename(columns={'albedo': 'ALBEDO', 'ghi': 'SWGDN'}) | ||
| pd.testing.assert_frame_equal(df, expected, check_freq=False) | ||
| assert meta == expected_meta | ||
|
|
||
|
|
||
| def test_get_merra2_error(): | ||
| with pytest.raises(ValueError, match='must be in the same year'): | ||
| pvlib.iotools.get_merra2(40, -80, '2019-12-31', '2020-01-02', | ||
| username='anything', password='anything', | ||
| dataset='anything', variables=[]) | ||
|
|
||
|
|
||
| @requires_earthdata_credentials | ||
| @pytest.mark.remote_data | ||
| @pytest.mark.flaky(reruns=RERUNS, reruns_delay=RERUNS_DELAY) | ||
| def test_get_merra2_timezones(params, expected, expected_meta): | ||
| # check with tz-aware start/end inputs | ||
| for key in ['start', 'end']: | ||
| dt = pd.to_datetime(params[key]) | ||
| params[key] = dt.tz_localize('UTC').tz_convert('Etc/GMT+5') | ||
| df, meta = pvlib.iotools.get_merra2(**params) | ||
| pd.testing.assert_frame_equal(df, expected, check_freq=False) | ||
| assert meta == expected_meta | ||
|
|
||
|
|
||
| @requires_earthdata_credentials | ||
| @pytest.mark.remote_data | ||
| @pytest.mark.flaky(reruns=RERUNS, reruns_delay=RERUNS_DELAY) | ||
| def test_get_merra2_bad_credentials(params, expected, expected_meta): | ||
| params['username'] = 'nonexistent' | ||
| with pytest.raises(requests.exceptions.HTTPError, match='Unauthorized'): | ||
| pvlib.iotools.get_merra2(**params) | ||
|
|
||
|
|
||
| @requires_earthdata_credentials | ||
| @pytest.mark.remote_data | ||
| @pytest.mark.flaky(reruns=RERUNS, reruns_delay=RERUNS_DELAY) | ||
| def test_get_merra2_bad_dataset(params, expected, expected_meta): | ||
| params['dataset'] = 'nonexistent' | ||
| with pytest.raises(requests.exceptions.HTTPError, match='404'): | ||
| pvlib.iotools.get_merra2(**params) | ||
|
|
||
|
|
||
| @requires_earthdata_credentials | ||
| @pytest.mark.remote_data | ||
| @pytest.mark.flaky(reruns=RERUNS, reruns_delay=RERUNS_DELAY) | ||
| def test_get_merra2_bad_variables(params, expected, expected_meta): | ||
| params['variables'] = ['nonexistent'] | ||
| with pytest.raises(requests.exceptions.HTTPError, match='400'): | ||
| pvlib.iotools.get_merra2(**params) |
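
The same-year check runs after conversion to UTC, so two tz-aware timestamps that share a local calendar year can still be rejected. The sketch below reproduces that logic with only the standard library. It is an illustration under assumptions, not the PR's implementation: the PR uses pandas, and `to_utc_naive` and `check_same_year` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def to_utc_naive(dt):
    """Convert a tz-aware datetime to UTC and drop tzinfo.

    Naive inputs are returned unchanged, i.e. they are assumed to be UTC.
    """
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return dt


def check_same_year(start, end):
    """Return the common UTC year, or raise ValueError if the years differ."""
    start, end = to_utc_naive(start), to_utc_naive(end)
    if start.year != end.year:
        raise ValueError("start and end must be in the same year (in UTC)")
    return start.year


# Both timestamps fall on 2020-12-31 in UTC-5, but the end is already
# 2021-01-01 01:00 once converted to UTC.
utc_minus_5 = timezone(timedelta(hours=-5))
start = datetime(2020, 12, 31, 12, 0, tzinfo=utc_minus_5)
end = datetime(2020, 12, 31, 20, 0, tzinfo=utc_minus_5)

try:
    check_same_year(start, end)
except ValueError as err:
    print(f"rejected: {err}")
```

A test along these lines would complement `test_get_merra2_error`, which only uses naive inputs.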
Would it be possible to include all these in the variable map? And make that visible in the documentation?
Also, if there is down-welling long-wave, please add it for module temperature modeling.
I think the remaining items do not have pvlib names.
Several long-wave variables are available (click to the "Variables" tab): https://disc.gsfc.nasa.gov/datasets/M2T1NXRAD_5.12.4/summary
I guess `LWGNT` may be the most relevant. Perhaps someone more familiar with this area can confirm?
I would also guess that `LWGNT` (surface net downward longwave flux) is most relevant. Someone else confirming would still be good.
The "net" description usually denotes downwelling minus upwelling, but in this case the variable is also denoted as downwelling.
The parameter is described in a NASA appendix here:
Thus, this is indeed the correct variable of interest.
I take that statement back. A quick comparison with a BSRN station showed very very different results.

We already have `ir_down` as a parameter in `faiman_rad` -- how about `ir_net` and `ir_up` for these two?
Longwave irradiance (> 4500 nm) is a subset of infrared (> 780 nm); therefore, in my opinion, calling it infrared is imprecise.
Added `'LWGNT': 'lwn', 'LWGEM': 'lwu'` in latest commit
This could be a first for me: I advocate for longer names here! At least `lw_down` or in more typical pvlib style: `longwave_down`. The net data is optional.
I think I was confusing "downward" with "net downward". Looks like it's all cleared up, now.
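
For readers following the "downward" versus "net downward" distinction, here is a small sketch of the arithmetic. It assumes the convention stated earlier in this thread, net = downwelling minus upwelling. The function name and the numbers are hypothetical. Whether `LWGNT` and `LWGEM` map exactly onto the net and upwelling terms should be checked against the dataset documentation.

```python
def longwave_down_from_net(lw_net, lw_up):
    """Return downwelling longwave irradiance in W/m^2.

    Assumes lw_net = lw_down - lw_up, so lw_down = lw_net + lw_up.
    """
    return lw_net + lw_up


# Made-up illustrative values in W/m^2, not taken from MERRA-2 or BSRN.
lw_net = -80.0  # net downward flux; negative when the surface loses energy
lw_up = 420.0   # flux emitted upward from the surface

lw_down = longwave_down_from_net(lw_net, lw_up)
print(lw_down)  # 340.0
```

Under this convention, the net value alone differs from the downwelling value by the full upwelling term.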