ICESat-2’s Nested Variables
This notebook illustrates the use of icepyx for managing lists of available and wanted ICESat-2 data variables.
The two use cases for variable management within your workflow are:
During the data access process, whether that’s via order and download (e.g. via NSIDC DAAC) or remote (e.g. via the cloud).
When reading in data to a Python object (whether from local files or the cloud).
A given ICESat-2 product may have over 200 variable + path combinations.
icepyx includes a custom Variables
module that is “aware” of the ATLAS sensor and how the ICESat-2 data products are stored.
The module can be accessed independently, but is optimally used as a component of a Query
object (Case 1) or Read
object (Case 2).
This notebook illustrates in detail how the Variables
module behaves using a Query
data access example.
However, module usage is analogous through an icepyx ICESat-2 Read
object.
More detailed example workflows specifically for the query and read tools within icepyx are available as separate Jupyter Notebooks.
Questions? Be sure to check out the FAQs throughout this notebook, indicated as italic headings.
Why do ICESat-2 products need a custom variable manager?
It can be confusing and cumbersome to comb through the 200+ variable and path combinations contained in ICESat-2 data products.
The icepyx Variables module makes it easier for users to quickly find and extract the specific variables they would like to work with across multiple beams, keywords, and variables, and it provides reader-friendly formatting for browsing variables.
A future development goal for icepyx
includes developing an interactive widget to further improve the user experience.
For data read-in, additional tools are available to target specific beam characteristics (e.g. strong versus weak beams).
Some technical details about the Variables module
For those eager to push the limits or who want to know more implementation details…
The only required input to the Variables
module is vartype
.
vartype
has two acceptable string values, ‘order’ and ‘file’.
If you use the module as shown in icepyx examples (namely through a Read
or Query
object), then this flag will be passed automatically.
It simply tells the software how to generate the list of possible variable values - either by pinging NSIDC for a list of available variables (query
) or from the user-supplied file (read
).
Import packages, including icepyx
import icepyx as ipx
from pprint import pprint
Interacting with ICESat-2 Data Variables
Each variables instance (which is actually an associated Variables class object) contains two variable list attributes.
One is the list of possible or available variables (avail
attribute) and is immutable, or unchangeable, as it is based on the input product specifications or files.
The other is the list of variables you’d like to actually have (in your downloaded file or data object) from all the potential options (wanted
attribute) and is updateable.
Thus, your avail
list depends on your data source and whether you are accessing or reading data, while your wanted
list may change for each analysis you are working on or depending on what variables you want to see.
The variables parameter has methods to:
get a list of all available variables, either available from the NSIDC or the file (avail() method).
append new variables to the wanted list (append() method).
remove variables from the wanted list (remove() method).
We’ll showcase the use of all of these methods and attributes below using an icepyx.Query
object.
Usage is identical in the case of an icepyx.Read
object.
More detailed example workflows specifically for the query and read tools within icepyx are available as separate Jupyter Notebooks.
Create a query object and log in to Earthdata
For this example, we’ll be working with a land ice product (ATL06) for an area along West Greenland (Disko Bay). A second option for an atmospheric product (ATL09) that uses profiles instead of the ground track (gt) categorization is also provided.
region_a = ipx.Query('ATL06',[-55, 68, -48, 71],['2019-02-22','2019-02-28'], \
start_time='00:00:00', end_time='23:59:59')
# Uncomment and run the code in this cell to use the second variable subsetting suite of examples,
# with the beam specifier containing "profile" instead of "gt#l"
# region_a = ipx.Query('ATL09',[-55, 68, -48, 71],['2019-02-22','2019-02-28'], \
# start_time='00:00:00', end_time='23:59:59')
region_a.earthdata_login('icepyx_devteam','icepyx.dev@gmail.com')
ICESat-2 data variables
ICESat-2 data is natively stored in a nested file format called hdf5.
Much like a directory-file system on a computer, each variable (file) has a unique path through the hierarchy (directories) within the file.
Thus, some variables (e.g. 'latitude'
, 'longitude'
) have multiple paths (one for each of the six beams in most products).
Determine what variables are available
region_a.order_vars.avail() will return a list of all valid path+variable strings.
region_a.order_vars.avail()
To increase readability, you can use built-in functions to show the 200+ variable + path combinations as a dictionary where the keys are variable names and the values are the paths to that variable.
region_a.order_vars.parse_var_list(region_a.order_vars.avail())
will return a dictionary of variable:paths key:value pairs.
region_a.order_vars.parse_var_list(region_a.order_vars.avail())
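The grouping described above can be pictured with a few lines of plain Python. This is only a minimal sketch, not icepyx's implementation: the helper name group_paths_by_variable and the short sample path list are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample of path+variable strings, in the style of the avail() list.
sample_paths = [
    "gt1l/land_ice_segments/latitude",
    "gt1l/land_ice_segments/longitude",
    "gt1r/land_ice_segments/latitude",
    "orbit_info/rgt",
]


def group_paths_by_variable(paths):
    """Map each variable name (the last path component) to all of its full paths."""
    grouped = defaultdict(list)
    for path in paths:
        variable = path.split("/")[-1]
        grouped[variable].append(path)
    return dict(grouped)


grouped = group_paths_by_variable(sample_paths)
for name, paths in grouped.items():
    print(name, paths)
```

Because the variable name is the final path component, a name such as latitude collects one path per beam group, while a variable such as rgt has a single path.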
By passing the boolean options=True
to the avail
method, you can obtain lists of unique possible variable inputs (var_list inputs) and path subdirectory inputs (keyword_list and beam_list inputs) for your data product. These can be helpful for building your wanted variable list.
region_a.order_vars.avail(options=True)
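To see where such option lists could come from, the sketch below derives unique variable names, beams/profiles, and keywords from a handful of paths. It is a simplified model under stated assumptions, not the icepyx code: the function list_options, the sample paths, and the rule that beam groups are named gt1l ... gt3r or profile_1 ... profile_3 are all chosen for illustration.

```python
import re

# Hypothetical sample of path+variable strings for a land ice product.
sample_paths = [
    "gt1l/land_ice_segments/latitude",
    "gt2r/land_ice_segments/h_li",
    "orbit_info/rgt",
    "ancillary_data/atlas_sdp_gps_epoch",
]

# Assumption: beam/profile groups use the names seen in this notebook.
BEAM_PATTERN = re.compile(r"^(gt[1-3][lr]|profile_[1-3])$")


def list_options(paths):
    """Return sorted unique variable names, beam names, and other path keywords."""
    variables, beams, keywords = set(), set(), set()
    for path in paths:
        *groups, variable = path.split("/")
        variables.add(variable)
        for group in groups:
            if BEAM_PATTERN.match(group):
                beams.add(group)
            else:
                keywords.add(group)
    return sorted(variables), sorted(beams), sorted(keywords)


variables, beams, keywords = list_options(sample_paths)
print("var_list inputs:", variables)
print("beam_list inputs:", beams)
print("keyword_list inputs:", keywords)
```

Whether icepyx itself also lists beam names among the keywords is not modelled here; the sketch only separates the path components so the three kinds of input are easy to tell apart.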
Building your wanted variable list
Now that you know which variables and path components are available, you need to build a list of the ones you’d like included. There are several options for generating your initial list as well as modifying it, giving the user complete control.
The options for building your initial list are:
Use a default list for the product (not yet fully implemented across all products. Have a default variable list for your field/product? Submit a pull request or post it as an issue on GitHub!)
Provide a list of variable names
Provide a list of profiles/beams or other path keywords, where “keywords” are simply the unique subdirectory names contained in the full variable paths of the product. A full list of available keywords for the product is displayed in the error message upon entering keyword_list=[''] into the append function (see below for an example) or by running region_a.order_vars.avail(options=True), as above.
Note: all products have a short list of “mandatory” variables/paths (containing spacecraft orientation and time information needed to convert the data’s delta_time
to a readable datetime) that are automatically added to any built list. If you have any recommendations for other variables that should always be included (e.g. uncertainty information), please let us know!
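As a rough illustration of why the time information matters, the sketch below converts a delta_time value (in seconds) to a datetime. It rests on two assumptions that are not taken from this notebook: the reference epoch is hard-coded as 2018-01-01T00:00:00 UTC (the ATLAS SDP epoch) rather than read from the file, and leap seconds after that epoch are ignored. The helper name delta_time_to_datetime is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: ATLAS SDP epoch, hard-coded here instead of being read from the data file.
ATLAS_SDP_EPOCH = datetime(2018, 1, 1, tzinfo=timezone.utc)


def delta_time_to_datetime(delta_time_seconds):
    """Convert seconds since the assumed epoch to a timezone-aware datetime.

    Simplification: leap seconds after the epoch are not accounted for.
    """
    return ATLAS_SDP_EPOCH + timedelta(seconds=delta_time_seconds)


# One day and half a second after the assumed epoch.
print(delta_time_to_datetime(86400.5))
```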
Examples of using each method to build and modify your wanted variable list are below.
region_a.order_vars.wanted
region_a.order_vars.append(defaults=True)
pprint(region_a.order_vars.wanted)
The keywords available for this product are shown in the error message upon entering a blank keyword_list, as seen in the next cell.
region_a.order_vars.append(keyword_list=[''])
Modifying your wanted variable list
Generating and modifying your variable request list, which is stored in region_a.order_vars.wanted
, is controlled by the append
and remove
functions that operate on region_a.order_vars.wanted
. The input options to append
are as follows (the full documentation for this function can be found by executing help(region_a.order_vars.append)
).
defaults (default False) - include the default variable list for your product (not yet fully implemented for all products; please submit your default variable list for inclusion!)
var_list (default None) - list of variables (entered as strings)
beam_list (default None) - list of beams/profiles (entered as strings)
keyword_list (default None) - list of keywords (entered as strings); use keyword_list=[''] to obtain a list of available keywords
Similarly, the options for remove
are:
all (default False) - reset region_a.order_vars.wanted to None
var_list (as above)
beam_list (as above)
keyword_list (as above)
region_a.order_vars.remove(all=True)
pprint(region_a.order_vars.wanted)
Examples (Overview)
Below are a series of examples to show how you can use append
and remove
to modify your wanted variable list.
For clarity, region_a.order_vars.wanted
is cleared at the start of many examples.
However, multiple append
and remove
commands can be called in succession to build your wanted variable list (see Examples 3+).
There are two example tracks. The first is for land ice (ATL06) data that is separated into beams. The second is for atmospheric data (ATL09) that is separated into profiles. Both example tracks showcase the same functionality and are provided for users of both data types.
Example Track 1 (Land Ice - run with ATL06 dataset)
Example 1.1: choose variables
Add all latitude
and longitude
variables across all six beam groups. Note that the additional required variables for time and spacecraft orientation are included by default.
region_a.order_vars.append(var_list=['latitude','longitude'])
pprint(region_a.order_vars.wanted)
Example 1.2: specify beams and variable
Add latitude
for only gt1l
and gt2l
region_a.order_vars.remove(all=True)
pprint(region_a.order_vars.wanted)
var_dict = region_a.order_vars.append(beam_list=['gt1l', 'gt2l'], var_list=['latitude'])
pprint(region_a.order_vars.wanted)
Example 1.3: add/remove selected beams+variables
Add latitude
for gt3l
and remove it for gt2l
region_a.order_vars.append(beam_list=['gt3l'],var_list=['latitude'])
region_a.order_vars.remove(beam_list=['gt2l'], var_list=['latitude'])
pprint(region_a.order_vars.wanted)
Example 1.4: keyword_list
Add latitude
and longitude
for all beams and with keyword land_ice_segments
region_a.order_vars.append(var_list=['latitude', 'longitude'],keyword_list=['land_ice_segments'])
pprint(region_a.order_vars.wanted)
Example 1.5: target a specific variable + path
Remove gt1r/land_ice_segments/longitude
(but keep gt1r/land_ice_segments/latitude
)
region_a.order_vars.remove(beam_list=['gt1r'], var_list=['longitude'], keyword_list=['land_ice_segments'])
pprint(region_a.order_vars.wanted)
Example 1.6: add variables not specific to beams/profiles
Add rgt
under orbit_info
.
region_a.order_vars.append(keyword_list=['orbit_info'],var_list=['rgt'])
pprint(region_a.order_vars.wanted)
Example 1.7: add all variables+paths of a group
In addition to adding specific variables and paths, we can filter all variables with a specific keyword as well. Here, we add all variables under orbit_info
. Note that paths already in region_a.order_vars.wanted
, such as 'orbit_info/rgt'
, are not duplicated.
region_a.order_vars.append(keyword_list=['orbit_info'])
pprint(region_a.order_vars.wanted)
Example 1.8: add all possible values for variables+paths
Append all longitude
paths and all variables/paths with keyword land_ice_segments
.
Similarly to what is shown in Example 1.4, if you submit only one append
call as region_a.order_vars.append(var_list=['longitude'], keyword_list=['land_ice_segments'])
rather than the two append
calls shown below, you will only add the variable longitude
and only paths containing land_ice_segments
, not ALL paths for longitude
and ANY variables with land_ice_segments
in their path.
region_a.order_vars.append(var_list=['longitude'])
region_a.order_vars.append(keyword_list=['land_ice_segments'])
pprint(region_a.order_vars.wanted)
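The difference between one combined append call and two separate calls can be modelled in plain Python. This is a sketch of the behavior described in Example 1.8, not icepyx code: matching_paths is a hypothetical helper, the wanted list is modelled as a flat list of paths, and the path under other_group is invented so that longitude has a path outside land_ice_segments.

```python
def matching_paths(avail, var_list=None, beam_list=None, keyword_list=None):
    """Return the paths that satisfy every filter that was provided."""
    selected = []
    for path in avail:
        *groups, variable = path.split("/")
        if var_list is not None and variable not in var_list:
            continue
        if beam_list is not None and not set(beam_list) & set(groups):
            continue
        if keyword_list is not None and not set(keyword_list) & set(groups):
            continue
        selected.append(path)
    return selected


avail = [
    "gt1l/land_ice_segments/latitude",
    "gt1l/land_ice_segments/longitude",
    "gt1l/other_group/longitude",  # hypothetical path, for illustration only
    "orbit_info/rgt",
]

# One call with both filters: only paths matching the variable AND the keyword.
single_call = matching_paths(
    avail, var_list=["longitude"], keyword_list=["land_ice_segments"]
)

# Two calls: all longitude paths, plus everything under land_ice_segments,
# skipping paths that are already wanted.
two_calls = []
for path in matching_paths(avail, var_list=["longitude"]) + matching_paths(
    avail, keyword_list=["land_ice_segments"]
):
    if path not in two_calls:
        two_calls.append(path)

print(single_call)
print(two_calls)
```

In this model the combined call narrows the selection, while the two separate calls widen it, which is why the example uses two append calls.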
Example 1.9: remove all variables+paths associated with a beam
Remove all paths for gt1l
and gt3r
region_a.order_vars.remove(beam_list=['gt1l','gt3r'])
pprint(region_a.order_vars.wanted)
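Removal by beam can be pictured the same way. The following is a minimal sketch under assumptions, not the icepyx implementation: remove_beams is a hypothetical helper and the wanted list is again modelled as a flat list of paths.

```python
def remove_beams(wanted, beam_list):
    """Return the wanted paths that do not pass through any of the listed beam groups."""
    beams = set(beam_list)
    return [path for path in wanted if not beams & set(path.split("/")[:-1])]


wanted = [
    "gt1l/land_ice_segments/latitude",
    "gt2l/land_ice_segments/latitude",
    "gt3r/land_ice_segments/latitude",
    "orbit_info/rgt",
]

remaining = remove_beams(wanted, ["gt1l", "gt3r"])
print(remaining)
```

Paths that do not belong to any beam group, such as orbit_info/rgt, are untouched by a beam-based removal in this model.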
Example 1.10: generate a default list for the rest of the tutorial
Generate a reasonable variable list prior to download
region_a.order_vars.remove(all=True)
region_a.order_vars.append(defaults=True)
pprint(region_a.order_vars.wanted)
Example Track 2 (Atmosphere - run with the ATL09 dataset, which is commented out at the start of the notebook)
Example 2.1: choose variables
Add all latitude
and longitude
variables
region_a.order_vars.append(var_list=['latitude','longitude'])
pprint(region_a.order_vars.wanted)
Example 2.2: specify beams/profiles and variable
Add latitude
for only profile_1
and profile_2
region_a.order_vars.remove(all=True)
pprint(region_a.order_vars.wanted)
var_dict = region_a.order_vars.append(beam_list=['profile_1','profile_2'], var_list=['latitude'])
pprint(region_a.order_vars.wanted)
Example 2.3: add/remove selected beams+variables
Add latitude
for profile_3
and remove it for profile_2
region_a.order_vars.append(beam_list=['profile_3'],var_list=['latitude'])
region_a.order_vars.remove(beam_list=['profile_2'], var_list=['latitude'])
pprint(region_a.order_vars.wanted)
Example 2.4: keyword_list
Add latitude
for all profiles and with keyword low_rate
region_a.order_vars.append(var_list=['latitude'],keyword_list=['low_rate'])
pprint(region_a.order_vars.wanted)
Example 2.5: target a specific variable + path
Remove 'profile_1/high_rate/latitude'
(but keep 'profile_3/high_rate/latitude'
)
region_a.order_vars.remove(beam_list=['profile_1'], var_list=['latitude'], keyword_list=['high_rate'])
pprint(region_a.order_vars.wanted)
Example 2.6: add variables not specific to beams/profiles
Add rgt
under orbit_info
.
region_a.order_vars.append(keyword_list=['orbit_info'],var_list=['rgt'])
pprint(region_a.order_vars.wanted)
Example 2.7: add all variables+paths of a group
In addition to adding specific variables and paths, we can filter all variables with a specific keyword as well. Here, we add all variables under orbit_info
. Note that paths already in region_a.order_vars.wanted
, such as 'orbit_info/rgt'
, are not duplicated.
region_a.order_vars.append(keyword_list=['orbit_info'])
pprint(region_a.order_vars.wanted)
Example 2.8: add all possible values for variables+paths
Append all longitude
paths and all variables/paths with keyword high_rate
.
Similarly to what is shown in Example 2.4, if you submit only one append
call as region_a.order_vars.append(var_list=['longitude'], keyword_list=['high_rate'])
rather than the two append
calls shown below, you will only add the variable longitude
and only paths containing high_rate
, not ALL paths for longitude
and ANY variables with high_rate
in their path.
region_a.order_vars.append(var_list=['longitude'])
region_a.order_vars.append(keyword_list=['high_rate'])
pprint(region_a.order_vars.wanted)
Example 2.9: remove all variables+paths associated with a profile
Remove all paths for profile_1
and profile_3
region_a.order_vars.remove(beam_list=['profile_1','profile_3'])
pprint(region_a.order_vars.wanted)
Example 2.10: generate a default list for the rest of the tutorial
Generate a reasonable variable list prior to download
region_a.order_vars.remove(all=True)
region_a.order_vars.append(defaults=True)
pprint(region_a.order_vars.wanted)
Using your wanted variable list
Now that you have your wanted variable list, you can use it within your icepyx object (Query or Read).
With a Query
object
To have your wanted variable list included with your order, you must pass it as a keyword argument to subsetparams(), order_granules(), or download_granules() (which calls order_granules under the hood if you have not already placed your order).
region_a.subsetparams(Coverage=region_a.order_vars.wanted)
Or, you can put the Coverage
parameter directly into order_granules
:
region_a.order_granules(Coverage=region_a.order_vars.wanted)
However, then you cannot view your subset parameters (region_a.subsetparams
) prior to submitting your order.
# You do not need to include the 'Coverage' kwarg when ordering
# if you have already included it in a call to subsetparams.
region_a.order_granules()
# You do not need to include the 'Coverage' kwarg when downloading
# if you have already submitted it with your order.
region_a.download_granules('/home/jovyan/icepyx/dev-notebooks/vardata')
With a Read
object
Calling the load()
method on your Read
object will automatically look for your wanted variable list and use it.
Please see the read-in example Jupyter Notebook for a complete example of this usage.
Credits
Based on the subsetting notebook by Jessica Scheick and Zheng Liu