Commit 73f75543 authored by Ian

docs: add guide for process_gedi_granules

docs: update guide to use process_gedi_granules CLI for batch processing
parent 43635c87
# Guide for process_gedi_granules.py
This script processes a single set of GEDI L1B, L2A, and L4A granules locally to calculate the Ni-Meister Biomass Index (NMBI) for each waveform footprint. It supports both serial and parallel execution modes.
## 1. Installation
Ensure you have the required dependencies installed:
```bash
pip install click h5py pyyaml rasterio
pip install nmbim  # from PyPI; for an editable install, run "pip install -e ." in a source checkout instead
```
## 2. Basic CLI Usage
```bash
python process_gedi_granules.py \
<l1b_path> <l2a_path> <l4a_path> \
<hse_path> <k_allom_path> <output_dir> \
--config config.yaml \
--parallel \
--n_workers 4 \
--boundary boundary.gpkg \
--date_range "YYYY-MM-DD,YYYY-MM-DD"
```
Arguments:
- `<l1b_path>`, `<l2a_path>`, `<l4a_path>`: GEDI HDF5 files (Level 1B, 2A, 4A).
- `<hse_path>`, `<k_allom_path>`: GeoTIFF rasters named `hse.tif` and `k_allom.tif`.
- `<output_dir>`: Directory to write the output GeoPackage (`.gpkg`).

Options:
- `--config, -c`: Path to filter configuration YAML.
- `--parallel, -p`: Enable multiprocessing mode.
- `--n_workers, -n`: Number of parallel workers (default: 4).
- `--boundary`: Path to spatial boundary file (GeoPackage or Shapefile).
- `--date_range`: Temporal filter range as `start,end`, with each date in `YYYY-MM-DD` format.
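
If you build the `--date_range` value in your own scripts, it can help to validate it before launching a long run. The following is a minimal sketch, not the script's own parser: `parse_date_range` is a hypothetical helper, and it only assumes the `YYYY-MM-DD,YYYY-MM-DD` form shown above. Whether the script treats the end date as inclusive is not covered here.

```python
from datetime import date


def parse_date_range(value: str) -> tuple[date, date]:
    """Split a 'start,end' string into two dates and check their order.

    Hypothetical helper for pre-flight validation; it does not reproduce
    how process_gedi_granules.py parses the option internally.
    """
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected 'start,end', got {value!r}")
    # date.fromisoformat raises ValueError on malformed dates
    start, end = (date.fromisoformat(p) for p in parts)
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    return start, end


start, end = parse_date_range("2018-10-01,2018-10-02")
print(start, end)
```

Failing early on a malformed or reversed range avoids discovering the problem only after the granules have been opened.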
## 3. Example CLI Invocation
```bash
python process_gedi_granules.py \
GEDI01_B_20181001_034643_003_01.h5 \
GEDI02_A_20181001_034643_003_02.h5 \
GEDI04_A_20181001_034643_003_04.h5 \
hse.tif k_allom.tif ./output \
--config filters.yaml \
--parallel --n_workers 4 \
--boundary boundary.gpkg \
--date_range "2018-10-01,2018-10-02"
```
## 4. Programmatic Batch Processing via process_gedi_granules
You can invoke the `process_gedi_granules.py` CLI script from Python to process each granule triplet in serial or parallel with `concurrent.futures`.
```python
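import concurrent.futures
import os
import subprocess
import sys

# Define input granule triplets
triplets = [
    {"l1b": "l1b1.h5", "l2a": "l2a1.h5", "l4a": "l4a1.h5"},
    {"l1b": "l1b2.h5", "l2a": "l2a2.h5", "l4a": "l4a2.h5"},
]
hse = "hse.tif"
k_allom = "k_allom.tif"
output_dir = "output"


def run_triplet(t):
    """Run the CLI on one triplet; raises CalledProcessError on failure."""
    cmd = [
        sys.executable, "process_gedi_granules.py",  # same interpreter as this script
        t["l1b"], t["l2a"], t["l4a"],
        hse, k_allom, output_dir,
        "--config", "config.yaml",
    ]
    subprocess.run(cmd, check=True)


# The guard is required when worker processes are started with "spawn"
# (the default on Windows and macOS).
if __name__ == "__main__":
    os.makedirs(output_dir, exist_ok=True)

    # Serial execution
    for t in triplets:
        run_triplet(t)

    # Or parallelize with ProcessPoolExecutor (use this instead of the loop above).
    # list() consumes the results so that a failed subprocess raises here.
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        list(executor.map(run_triplet, triplets))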
```
This approach lets you easily script batch processing of multiple granule sets using the same CLI entrypoint.
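
For larger batches, the `triplets` list can be assembled from filenames instead of typed by hand. The sketch below groups files by a shared timestamp token. The naming pattern is an assumption inferred from the example filenames in this guide (`GEDI01_B_20181001_034643_...`); granules downloaded from a data archive may use a different layout, in which case `PATTERN` needs adjusting. `find_triplets` is a hypothetical helper and not part of `process_gedi_granules.py`.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed layout, based on the example filenames in this guide:
# <product>_<YYYYMMDD>_<HHMMSS>_<anything>.h5
PATTERN = re.compile(r"^(GEDI01_B|GEDI02_A|GEDI04_A)_(\d{8}_\d{6})_.*\.h5$")
PRODUCT_KEYS = {"GEDI01_B": "l1b", "GEDI02_A": "l2a", "GEDI04_A": "l4a"}


def find_triplets(filenames):
    """Group granule paths by timestamp and keep only complete triplets."""
    groups = defaultdict(dict)
    for name in filenames:
        match = PATTERN.match(Path(name).name)
        if match:
            product, stamp = match.groups()
            groups[stamp][PRODUCT_KEYS[product]] = str(name)
    # Skip timestamps that are missing one or more of the three products
    return [groups[s] for s in sorted(groups) if len(groups[s]) == 3]


names = [
    "GEDI01_B_20181001_034643_003_01.h5",
    "GEDI02_A_20181001_034643_003_02.h5",
    "GEDI04_A_20181001_034643_003_04.h5",
    "GEDI01_B_20181002_010101_003_01.h5",  # made-up name: incomplete set, skipped
]
triplets = find_triplets(names)
print(triplets)
```

The returned dictionaries use the same `l1b`, `l2a`, and `l4a` keys as the batch example above, so they can be passed to `run_triplet` unchanged. In practice the input list might come from something like `Path("data").glob("*.h5")`.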