This script will figure out what GEDI files are necessary to cover the query and submit the necessary jobs to the MAAP DPS.
...
With these credentials, you can use the provided `download_from_workspace.py` (
...
--algorithm nmbim_biomass_index \
--version main \
--tag {unique_processing_id}
```

Alternatively, you can still use the AWS CLI directly:

```bash
# List output files
aws s3 ls s3://maap-ops-workspace/{username}/dps_output/nmbim_biomass_index/main/{unique_processing_id}/ --recursive | grep '\.gpkg\.bz2$'
```

9. Post-processing

The result of the download will be a directory structure under your `output-dir` that mirrors the output structure on MAAP: outputs are organized hierarchically by date and time. The files you're interested in are the output GeoPackages, which are compressed by default to reduce download size. To decompress them, use a command like the following:

```
# Decompress downloaded GeoPackages
bunzip2 run_results/*.gpkg.bz2
```

Or, if there are thousands of files and you're on a powerful computer, decompress them in parallel:

```
find . -name "*.gpkg.bz2" | parallel bunzip2
```
Further processing is up to you. It may be advantageous to combine all the output GeoPackages into one (typically huge) GeoPackage for visualization:
```
ogrmerge.py -progress -single -o <output path for combined GPKG> $(find <dir with results downloaded from MAAP> -name "*.gpkg")
```
However, it may also be more efficient to do some processing tasks without first combining the results, as this allows parallelization over the output files.
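The per-file approach can be sketched with `xargs -P` (or GNU `parallel`, as above). This is only a sketch: the `demo_results` directory, its dummy `.gpkg` files, and the `wc -c` stand-in are placeholders for real MAAP outputs and a real per-file task such as `ogrinfo` or `ogr2ogr`:

```shell
# Create a scratch directory with dummy files standing in for real
# downloaded GeoPackages (placeholder data, not actual MAAP output)
mkdir -p demo_results
printf 'a' > demo_results/one.gpkg
printf 'ab' > demo_results/two.gpkg

# Run an independent task on each file, up to 4 at a time;
# 'wc -c' is a stand-in for real per-file work like ogr2ogr
find demo_results -name "*.gpkg" -print0 | xargs -0 -n 1 -P 4 wc -c
```

Because each output GeoPackage is handled independently, this scales with the number of cores without first paying the cost of a merge.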
## Detailed description of arguments for run_on_maap.py