Draft: [DO NOT MERGE] Setting up preprocessing pipelines
This merge request reorganizes the code to see how much work we can do on the initial files without knowing anything about what happened at other timesteps. In other words: start from the pixel and see how far we can get.
So far what we've got is:
| | description | precedes | per-region | per-ingest-file | per-t | timing (WesternUS, Aug 2022) |
|---|---|---|---|---|---|---|
| preprocess_region | takes outline of region and the static_sources file and creates the "swiss cheese" shape | preprocess_region_t | x | | | 30 seconds |
| preprocess_monthly_file | takes the monthly file, normalizes columns, and splits into half-day | preprocess_region_t | | x | | 30 seconds |
| preprocess_region_t | takes the half-day file and the region and does filtering and initial clustering | | x | | x | 5 minutes |
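
To make the ordering in the table concrete, here is a minimal sketch of how the three steps could be scheduled. It is not the code in this MR: `DEPENDS_ON`, `run_order`, and `expand_tasks` are hypothetical names, and the fan-out assumes `preprocess_region` runs once per region, `preprocess_monthly_file` once per ingest file, and `preprocess_region_t` once per (region, t) pair.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each step lists the steps that must finish first.
# Derived from the "precedes" column: both other steps precede preprocess_region_t.
DEPENDS_ON = {
    "preprocess_region": set(),
    "preprocess_monthly_file": set(),
    "preprocess_region_t": {"preprocess_region", "preprocess_monthly_file"},
}


def run_order(depends_on):
    """Return the step names in an order that respects the dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def expand_tasks(regions, ingest_files, timesteps):
    """Fan each step out over the keys it is assumed to run for."""
    tasks = [("preprocess_region", r) for r in regions]
    tasks += [("preprocess_monthly_file", f) for f in ingest_files]
    tasks += [("preprocess_region_t", r, t) for r in regions for t in timesteps]
    return tasks
```

Because neither `preprocess_region` nor `preprocess_monthly_file` depends on the other, they can run in parallel; only `preprocess_region_t` has to wait for both.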
Guiding principles: