ICESat-2 AWS cloud data access
This notebook illustrates the use of icepyx for accessing ICESat-2 data currently available through the AWS (Amazon Web Services) us-west-2 hub s3 data bucket.
Notes
ICESat-2 data became publicly available on the cloud on 29 September 2022. Thus, access methods and example workflows are still being developed by NSIDC, and the underlying code in icepyx will need to be updated now that these data (and the associated metadata) are available. We appreciate your patience and contributions (e.g. reporting bugs, sharing your code, etc.) during this transition!
This example and the code it describes are part of ongoing development. Current limitations to using these features are described throughout the example, as appropriate.
You MUST be working within an AWS instance. Otherwise, you will get a permissions error.
Cloud authentication is still more user-involved than we’d like. We’re working to address this - let us know if you’d like to join the conversation!
import earthaccess
import icepyx as ipx
Create an icepyx Query object
# bounding box
# "producerGranuleId": "ATL03_20191130221008_09930503_004_01.h5",
short_name = 'ATL03'
spatial_extent = [-45, 58, -35, 75]
date_range = ['2019-11-30','2019-11-30']
reg=ipx.Query(short_name, spatial_extent, date_range)
Get the granule s3 urls
You must specify cloud=True to get the needed s3 urls.
This function returns a list containing two lists: the granule IDs and the corresponding s3 urls.
gran_ids = reg.avail_granules(ids=True, cloud=True)
gran_ids
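Because the result is a pair of parallel lists, it can be convenient to pair each granule ID with its url. The following is a minimal sketch that uses a hard-coded stand-in for gran_ids (built from the granule ID and s3 url quoted in this notebook) so that it runs without a query; the names example_gran_ids and url_by_id are illustrative and not part of icepyx.

```python
# Stand-in for the value returned by reg.avail_granules(ids=True, cloud=True):
# [list_of_granule_ids, list_of_s3_urls]. Hard-coded here for illustration only.
example_gran_ids = [
    ["ATL03_20191130221008_09930503_004_01.h5"],
    [
        "s3://nsidc-cumulus-prod-protected/ATLAS/ATL03/004/2019/11/30/"
        "ATL03_20191130221008_09930503_004_01.h5"
    ],
]

# Unpack the two parallel lists and pair each ID with its url.
ids, urls = example_gran_ids
url_by_id = dict(zip(ids, urls))

for granule_id, url in url_by_id.items():
    print(granule_id, "->", url)
```

With a real query, replacing example_gran_ids with gran_ids should give the same kind of mapping, assuming both lists are returned in matching order.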
Log in to Earthdata and generate an s3 token
You can use icepyx’s existing login functionality to generate your s3 data access token, which is valid for one hour. icepyx will renew the token for you once it expires, so if you view your credentials over the course of several hours you may notice that the values change.
You can access your s3 credentials using:
# uncommenting the line below will print your temporary login credentials
# reg.s3login_credentials
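If you want to know how much of the one-hour window is left, one option is to compare the expiration time against the current time. The sketch below assumes the credentials are a dictionary with an "expiration" entry formatted like "2023-01-01 13:00:00+00:00"; that layout is an assumption about what reg.s3login_credentials contains, so check your own output before relying on it. The helper seconds_until_expiry and the made-up credentials are hypothetical.

```python
from datetime import datetime, timezone


def seconds_until_expiry(credentials, now=None):
    """Return the seconds left before the credentials expire (negative if expired).

    Assumes credentials["expiration"] is an ISO-style timestamp with a UTC offset.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    expires = datetime.fromisoformat(credentials["expiration"])
    return (expires - now).total_seconds()


# Made-up credentials and a fixed "current time" so the example is reproducible.
fake_credentials = {
    "accessKeyId": "EXAMPLE",
    "secretAccessKey": "EXAMPLE",
    "sessionToken": "EXAMPLE",
    "expiration": "2023-01-01 13:00:00+00:00",
}
fake_now = datetime(2023, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

remaining = seconds_until_expiry(fake_credentials, now=fake_now)
print(f"{remaining / 60:.0f} minutes left")
```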
Important Authentication Update
Previously, icepyx required you to explicitly use the .earthdata_login() function to log in. Running this function is no longer required, as icepyx will call the login function as needed. You will still need to provide your credentials using one of the three methods described in the ICESat-2 Data Access Notebook example. The .earthdata_login() function is still available for backwards compatibility.
If you are unable to remove earthdata_login() calls from your workflow, note that certain inputs, such as earthdata_uid and email, are no longer required. For example, region_a.earthdata_login(earthdata_uid, email) becomes region_a.earthdata_login().
Set up your s3 file system using your credentials
s3 = earthaccess.get_s3fs_session(daac='NSIDC', provider=reg.s3login_credentials)
Select an s3 url and access the data
Data read-in capabilities for cloud data are coming soon in icepyx (targeted Spring 2023). Stay tuned, and we’d love for you to join us and contribute!
Note: If you get a PermissionDenied Error when trying to read in the data, you may not be sending your request from an AWS hub in us-west-2. We’re currently working on how to alert users if they will not be able to access ICESat-2 data in the cloud for this reason.
# the first index, [1], gets us into the list of s3 urls
# the second index, [0], gets us the first entry in that list.
s3url = gran_ids[1][0]
# s3url = 's3://nsidc-cumulus-prod-protected/ATLAS/ATL03/004/2019/11/30/ATL03_20191130221008_09930503_004_01.h5'
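The granule file name at the end of the s3 url encodes useful metadata. As a rough sketch, the regular expression below splits an ATL03-style name into product, start time, reference ground track (RGT), cycle, region, version, and revision, following the ATLxx_[yyyymmddhhmmss]_[ttttccss]_[vvv]_[rr].h5 naming pattern. The helper parse_granule_name is hypothetical (it is not an icepyx function), and other products may use a different layout.

```python
import re
from datetime import datetime

# Pattern for ATL03-style granule names; assumed layout, see the text above.
GRANULE_PATTERN = re.compile(
    r"(?P<product>ATL\d{2})_(?P<start>\d{14})_"
    r"(?P<rgt>\d{4})(?P<cycle>\d{2})(?P<region>\d{2})_"
    r"(?P<version>\d{3})_(?P<revision>\d{2})\.h5$"
)


def parse_granule_name(name_or_url):
    """Split a granule file name (or a url ending in one) into its fields."""
    match = GRANULE_PATTERN.search(name_or_url)
    if match is None:
        raise ValueError(f"Unrecognized granule name: {name_or_url}")
    fields = match.groupdict()
    # Convert the 14-digit start time into a datetime object.
    fields["start"] = datetime.strptime(fields["start"], "%Y%m%d%H%M%S")
    return fields


info = parse_granule_name(
    "s3://nsidc-cumulus-prod-protected/ATLAS/ATL03/004/2019/11/30/"
    "ATL03_20191130221008_09930503_004_01.h5"
)
print(info)
```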
import h5py
import numpy as np
%time f = h5py.File(s3.open(s3url,'rb'),'r')
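Until icepyx can warn about the region restriction itself, one way to make the permissions failure easier to diagnose is to wrap the open call and add a hint to the error. The sketch below assumes the failure surfaces as Python's built-in PermissionError, which may not match what you see in practice; the wrapper open_with_region_hint and the fake opener used to demonstrate it are hypothetical.

```python
def open_with_region_hint(opener, url):
    """Open url with opener(url, "rb"), adding a region hint if access is denied."""
    try:
        return opener(url, "rb")
    except PermissionError as err:
        raise PermissionError(
            f"Could not open {url}. ICESat-2 cloud data can only be read from "
            "within AWS us-west-2; check where this code is running."
        ) from err


# Demonstration with a fake opener that always denies access,
# so the example runs without AWS or Earthdata credentials.
def fake_denied_opener(url, mode):
    raise PermissionError("Access Denied")


message = None
try:
    open_with_region_hint(fake_denied_opener, "s3://example-bucket/example.h5")
except PermissionError as err:
    message = str(err)
    print(message)
```

In this notebook, the equivalent call would be open_with_region_hint(s3.open, s3url), with the returned file object passed to h5py.File as above.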
Credits
notebook by: Jessica Scheick
historic source material: is2-nsidc-cloud.py by Brad Lipovsky