ICESat-2 AWS cloud data access (BETA ONLY)

This notebook illustrates the use of icepyx for accessing ICESat-2 data currently available through the AWS (Amazon Web Services) us-west-2 hub s3 data bucket.

Critical Caveats

Please do not contact us to say this does not work until you have read this section in detail.

  1. ICESat-2 data is not currently publicly available on the cloud (and likely will not be until at least the end of 2021). A limited subset is currently available in an s3 bucket to developers and beta testers who have been registered with NSIDC.

  2. This example and the code it describes are part of ongoing development. Current limitations to using these features are described throughout the example, as appropriate.

  3. You MUST be working within an AWS instance. Otherwise, you will get a permissions error.

import icepyx as ipx

Create an icepyx Query object

In order to develop and test cloud data access functionality, here we search for an arbitrary granule over Greenland that was previously determined to be available on s3 using Earthdata Search. s3 availability is not yet included in CMR metadata, so it cannot be determined programmatically.

# bounding box
# "producerGranuleId": "ATL03_20191130221008_09930503_004_01.h5",
short_name = 'ATL03'
spatial_extent = [-45, 58, -35, 75]
date_range = ['2019-11-30','2019-11-30']
reg = ipx.Query(short_name, spatial_extent, date_range)

Construct the granule s3 urls

Because cloud data availability is not yet included in the standard granule metadata, we cannot check whether these s3 bucket urls are valid; they are constructed from other granule metadata. Thus, you may get FileNotFound errors when trying to use these urls.

gran_ids = reg.avail_granules(ids=True, s3urls=True)
gran_ids
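To make "constructed from other granule metadata" concrete, here is a minimal sketch of how such a url could be assembled from a granule file name. The bucket name and path layout are inferred only from the single example url shown later in this notebook, and `sketch_s3url` is a hypothetical helper, not the icepyx implementation.

```python
import re

# Assumed bucket and prefix, taken from the one example url in this notebook.
S3_PREFIX = "s3://nsidc-cumulus-prod-protected/ATLAS"


def sketch_s3url(producer_granule_id):
    """Guess an s3 url from a granule file name (hypothetical helper).

    Expects names shaped like ATL03_20191130221008_09930503_004_01.h5:
    product, start date and time, an eight-digit orbit field, version, revision.
    """
    pattern = r"(ATL\d{2})_(\d{4})(\d{2})(\d{2})\d{6}_\d{8}_(\d{3})_\d{2}\.h5"
    match = re.fullmatch(pattern, producer_granule_id)
    if match is None:
        raise ValueError(f"unrecognized granule name: {producer_granule_id}")
    product, year, month, day, version = match.groups()
    # Assumed layout: <prefix>/<product>/<version>/<yyyy>/<mm>/<dd>/<file name>
    return f"{S3_PREFIX}/{product}/{version}/{year}/{month}/{day}/{producer_granule_id}"


print(sketch_s3url("ATL03_20191130221008_09930503_004_01.h5"))
```

Because the url is derived from the name alone, a well-formed url says nothing about whether the object actually exists in the bucket.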

Log in to Earthdata and generate an s3 token

You can use icepyx’s existing login functionality to generate your s3 data access token, which should be good for five hours. We currently do not have this set up to automatically renew, but if you’re interested in adding this functionality please get in touch or submit a PR!

reg.earthdata_login("icepyx_dev","icepyx_dev@gmail.com", s3token=True)
credentials = reg._s3login_credentials
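Since the token does not renew automatically, one option is to record when you logged in and check the token's age before each read. The sketch below assumes the five-hour lifetime stated above; `TOKEN_LIFETIME` and `token_expired` are hypothetical names and are not part of icepyx.

```python
from datetime import datetime, timedelta, timezone

# Assumed lifetime, based on the "good for five hours" statement above.
TOKEN_LIFETIME = timedelta(hours=5)


def token_expired(issued_at, now=None, margin=timedelta(minutes=5)):
    """Return True if a token issued at `issued_at` is expired or about to be.

    `margin` treats the token as expired slightly early so that a read
    started now does not fail partway through.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= issued_at + TOKEN_LIFETIME - margin


# Record the login time right after calling earthdata_login.
issued_at = datetime.now(timezone.utc)
print(token_expired(issued_at))
```

If `token_expired(issued_at)` returns True, log in again and rebuild the s3 file system object with the new credentials.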

Set up your s3 access using your credentials

import s3fs
s3 = s3fs.S3FileSystem(key=credentials['accessKeyId'],
                       secret=credentials['secretAccessKey'],
                       token=credentials['sessionToken'])

Select an s3 url and access the data

Development is underway for data read-in capabilities, which will include options for cloud data access. Stay tuned; we’d love for you to join us and contribute!

Note: If you get a PermissionDenied error when trying to read in the data, you may not be sending your request from an AWS hub in us-west-2. We’re currently working on how to alert users if they will not be able to access ICESat-2 data in the cloud for this reason.

# gran_ids[0] holds the granule IDs and gran_ids[1] the matching s3 urls; take the first url
s3url = gran_ids[1][0]
# s3url =  's3://nsidc-cumulus-prod-protected/ATLAS/ATL03/004/2019/11/30/ATL03_20191130221008_09930503_004_01.h5'
import h5py
import numpy as np
%time f = h5py.File(s3.open(s3url,'rb'),'r')
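Until icepyx can warn about access problems itself, a thin wrapper can attach a hint to the two failures mentioned in this notebook. This is a minimal sketch that assumes those failures surface as Python's built-in `PermissionError` and `FileNotFoundError`; `open_granule` is a hypothetical helper, not an icepyx or s3fs function.

```python
def open_granule(opener, url):
    """Open `url` with `opener(url, "rb")`, adding hints to common failures.

    `opener` is any callable with that signature, for example the `open`
    method of an s3fs file system object.
    """
    try:
        return opener(url, "rb")
    except PermissionError as err:
        # Most likely cause per this notebook: not running inside AWS us-west-2.
        raise PermissionError(
            f"Access denied for {url}. Are you working within an AWS "
            "instance in us-west-2, and is your s3 token still valid?"
        ) from err
    except FileNotFoundError as err:
        # The url was built from metadata and may not exist in the bucket.
        raise FileNotFoundError(
            f"{url} was not found. The granule may not be available on s3."
        ) from err
```

With the objects defined above, this could be used as `f = h5py.File(open_granule(s3.open, s3url), 'r')`.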

Credits