mud2dust — strategy plan (rev. 2026-05-06)
Status: plan stage. Empty repo. Domains secured. Hardware in hand. METER meeting scheduled 2026-05-07. Phase 0 implementation begins after METER meeting and AWS/repo scaffolding.
Brand:
mud2dust is locked. Domains owned: mud2dust and mudtodust across .com, .net, .org, .io, .dev, .ag, .farm, .ai, .earth, .co (20 domains total). Primary: mud2dust.com. Defensive coverage is comprehensive — no further TLD acquisition needed.
Hardware in hand: 6× Sentek 36" drill-and-drop probes (multi-depth VWC + temp + EC) on YDOC ML-417ADS data loggers. Plus Tempest weather stations and Vaisala WXT520. Campbell Scientific in conversation. METER hardware to be purchased; logger choice (YDOC vs ZENTRA Cloud) deferred per deployment.
Context
mud2dust is two products built on one platform, deliberately:
- An open, calibrated, high-resolution soil-moisture map of US agricultural land — daily 5–30m VWC raster fused from satellite + soil-texture priors + a contributor probe network + (when available) airborne L-Band SAR cal/val campaigns.
- A multi-vendor contribution platform — farmers, researchers, hobbyists, and partner applications connect not just soil moisture and weather time-series but seven first-class data shapes: observations, profiles, samples, events, collections, annotations, and boundaries. mud2dust normalizes, stores, visualizes, and exposes all of it via API. High-trust contributions train the calibration model. Everyone benefits from the calibrated output.
The platform is what makes the map possible; the platform is also the immediate carrot. Users can browse the public map for free without signing up; contributors sign up to connect their hardware, lab samples, drone flights, agronomist annotations, or field boundaries; partner applications (Farming Game first) integrate via OAuth.
Two existing planning documents in the sibling farminggame project describe overlapping pieces. mud2dust unifies them; Farming Game becomes the first OAuth partner application. The 6× Sentek deployment originally scoped under farminggame Phase 13a becomes mud2dust's anchor training stations.
This plan supersedes both source documents on the soil-map question. Farming Game's Phase 13 should be retitled to "Integrate with mud2dust": deploy Sentek + Vaisala + Tempest sensors as anchor stations, register them via mud2dust's OAuth API, replace AGROMONITORING_API_KEY with mud2dust API calls, drop the in-product SAR pipeline.
1. Vision
An open, calibrated, high-resolution soil-moisture map of US agricultural land — free for research and small operators, paid only at the bulk-extraction tier — built on a multi-shape contribution platform that any farmer, researcher, drone operator, or partner app can connect to.
1a. The map
Daily 5–30m volumetric water content (VWC) raster, fused from Sentinel-1 SAR backscatter + Sentinel-2 vegetation indices + soil-texture priors + the contributor station network. Around it, the same infrastructure serves NDVI/NDWI/NDRE, thermal LST + ET, precipitation, static priors. Pitch the soil moisture map; ship the rest because the pipeline already passes it.
1b. The platform
Multi-shape, multi-vendor contribution surface — contributors connect ZENTRA Cloud, WeatherFlow, WeatherLink, FieldClimate accounts; push from raw loggers (YDOC, Campbell, Davis); upload drone or aircraft scenes; submit lab sample results; mark field boundaries; add agronomist annotations. Cross-source unified dashboard. Per-contribution provenance and trust scoring. Free for everyone. Partner applications integrate via OAuth.
1c. The cal/val story
Distributed L-Band drone SAR campaigns over instrumented fields create triple-validated calibration anchors (in-situ probes + airborne L-Band + Sentinel-1 C-Band). This is SMAPVEX-class cal/val data, distributed across contributors instead of NASA campaigns only. Distinct, fundable research narrative for NASA-CSDA / NSF.
Pitch — to map users
Stop paying $300–1500/mo for someone else's wrapper around free public data. Use the same data, calibrated against ground truth, with an open governance model and an attribution-only license.
Pitch — to platform contributors
One place to see all your soil and weather sensors regardless of vendor — and your drone flights, lab results, field boundaries, and agronomist notes alongside them. We handle the protocols. If you have well-installed research-grade hardware or calibrated airborne instruments, your data improves the public moisture model and you get higher API tiers. If you don't, you still get the dashboard, the cross-source exports, and the calibrated companion data — free.
Pitch — to partner-app developers
Integrate once via OAuth, get sensor data, drone uploads, and calibrated outputs for any user who connects to mud2dust. No per-vendor adapter code in your app. Same tokens give your users access to their own data and to the public model.
Why "open" matters strategically (not ideologically)
- Network effects on calibration. Every contribution makes the model better for everyone in that soil/canopy class. Closed competitors can't accept arbitrary contributor data because their license forbids redistributing improvements.
- Default citation status. Academic papers cite open infrastructure (OpenStreetMap, Zenodo, OpenET). Once cited 100 times, you're permanent.
- Grant and consortium funding. NSF, USDA NIFA, NRCS Conservation Innovation Grants, NASA Western Water Applications all fund open-data projects. None fund closed SaaS.
- Vendor-agnostic platform attracts users locked into single-vendor portals. Onset HOBOlink, ZENTRA, WeatherFlow each lock data in. mud2dust attacks the lock-in.
Competitive landscape
- Wrappers around free satellite data — Agromonitoring, EOS Crop Monitoring, OneSoil. Thin convenience layers.
- Single-vendor sensor platforms — Onset HOBOlink, METER ZENTRA Cloud, WeatherFlow Tempest, Davis WeatherLink. Each is a closed silo per vendor.
- Drone data platforms — DroneDeploy, Pix4D, OpenDroneMap (open). Imagery processing and storage; not a calibration target, not a multi-shape platform.
- Real proprietary products — Climate FieldView, Granular. They hold genuinely private data and bundle it. Not the target.
2. Why this is feasible now (and not in 2018)
2a. AWS Open Data Sponsorship (matured ~2020). AWS pays storage + intra-region egress.
2b. COG / STAC tooling (matured ~2023). GDAL 3.x, rasterio 1.3+, titiler, rio-tiler, pgstac, pystac-client. STAC is the lingua franca for episodic raster, including drone data — the same catalog handles satellite scenes and contributor-uploaded drone flights.
2c. Indigo Ag's RTC bucket. Indigo invested ~$2M/year preprocessing Sentinel-1 to terrain-corrected γ⁰ COGs in AWS Open Data. ~80% of the satellite engineering work is already done.
2d. Ground-truth network is bootstrappable. Sentek anchor stations at JMR + per-contributor data + (when available) METER research network + episodic L-Band drone campaigns. Multiple sources of training data, not one vendor.
2e. Drone L-Band SAR is now commercially available. ImSAR, AeroVironment, others. Was $5M+ aircraft-only in 2018; now sub-$100K drone-mounted. Distributed cal/val is feasible.
3. Architecture
3a. System diagram
┌─────────────────────────────────────────────────────────────────────────┐
│ SATELLITE INGEST (Fargate / Lambda) │
│ Sentinel-1, Sentinel-2/HLS, Landsat, HRRR, NEXRAD, GOES, SMAP, │
│ ECOSTRESS, MODIS, TROPOMI, POLARIS, DEM, SSURGO, OpenET, etc. │
└──────────────────────────────────┬──────────────────────────────────────┘
▼
PROCESSING: σ⁰ → VWC, NDVI/NDWI, fusion model, etc.
▼
S3 (us-west-2) — output COGs
▼
CloudFront + titiler + STAC catalog (pgstac/RDS)
▼
┌──────────────────┬──────────┴──────────┬──────────────────┐
▼ ▼ ▼ ▼
Public web Python/R/QGIS Partner apps Contributor
(free map browse) via STAC via OAuth dashboards
┌──────────────────────────────────────────────────────────────────────────┐
│ CONTRIBUTION BRIDGE (mud2dust/sensor-bridge) │
│ │
│ Seven first-class object types — one auth/trust/privacy stack │
│ │
│ Observation Profile Sample Event Collection Annotation Boundary │
│ (time- (multi- (lab/ (drone/ (multi- (geotag (field │
│ series depth) discrete flight) flight note) bounds) │
│ point) sample) survey) │
│ │ │ │ │ │ │ │ │
│ ▼ ▼ ▼ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Adapter registry (per vendor × per shape) │ │
│ │ YDOC, Campbell, ZENTRA Cloud, WeatherFlow, WeatherLink, │ │
│ │ FieldClimate, Onset, generic JSON, generic CSV/SFTP, │ │
│ │ drone-COG-upload, soil-lab-CSV, shapefile/GeoJSON, │ │
│ │ CRNP, AmeriFlux pull, ... │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Intake gateways: │ │
│ │ • HTTPS webhook • SFTP server • MQTT broker │ │
│ │ • Presigned-multipart S3 upload (for Events) │ │
│ │ • OAuth-pull workers (per contributor token) │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Per-shape storage: │ │
│ │ Observation/Profile → TimescaleDB │ │
│ │ Sample → Postgres (relational) │ │
│ │ Event/Collection → S3 + pgstac │ │
│ │ Annotation → Postgres + PostGIS │ │
│ │ Boundary → Postgres + PostGIS │ │
│ │ + Secrets Manager (per-contributor vendor tokens) │ │
│ └─────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
▲ ▲
│ │
Nightly retrain ──┘ └── Calibration consumer
(training-tier contributions) (everyone gets corrections)
3b. Region rationale (two AWS regions, two worker pools)
- us-west-2 — most Tier 1 + Tier 2 satellite buckets (~70%); also primary for sensor-bridge, dashboards, and APIs.
- us-east-1 — entire NOAA Big Data Program (~25%); workers only.
- Tier 3 (non-AWS) — POLARIS at UC Davis, SoilGrids at ISRIC, USDA CropScape FTP, OpenET via Earth Engine, TROPOMI via Copernicus.
Intra-region S3 reads free; cross-region $0.02/GB.
3c. Worker types
| Type | Use case | Limits | Cost shape |
|---|---|---|---|
| Lambda | Per-scene index calc, COG mosaicking, STAC, intake webhooks, sensor pull workers | 15 min, ~10 GB RAM, 5 GB ephemeral | Pay-per-invoke |
| Fargate | Cross-region pulls, large mosaicking, GDAL native deps, drone preprocessing, retraining, historical-backfill | Any duration, any RAM | Per-second billing |
| EC2 spot | Only if Fargate gets expensive | Spot interruption | Cheaper at sustained load |
Failure isolation rule: each satellite source and each adapter (vendor × shape) gets its own Lambda or Fargate task, triggered independently by EventBridge. SMAP outage doesn't block Sentinel-1; ZENTRA outage doesn't block YDOC push; a stuck drone-preprocessing job doesn't block Observation ingest.
3d. Processing layer
- σ⁰ → VWC index per pixel. Empirical model trained against high-trust contributions (Sentek anchor + verified researchers + L-Band drone campaigns + METER if/when accessible).
  - v1: linear `index = (sigma0 - sigma0_dry) / (sigma0_wet - sigma0_dry)`.
  - v2: scikit-learn random forest trained only on training-tier contributions.
  - v3+: XGBoost or small NN; Fargate or SageMaker.
- NDVI / NDWI / NDRE / EVI. Trivial pixel math from HLS reflectance.
- ET ensemble. Consume OpenET for v1.
- Per-pixel calibration. Conditions on POLARIS soil texture, HRRR precipitation, DEM topographic wetness index. The research contribution that justifies the project's existence.
- Correction surfaces. For every contributor station, every nightly retrain produces a `(satellite_estimate, ground_truth, delta)` series. Surfaced on the contributor's dashboard regardless of trust tier.
- Drone L-Band cal/val anchors. When a drone Event is tagged as L-Band SAR with proper radiometric cal, the preprocessor extracts pixel-level σ⁰ over the flight extent and timestamps it; nightly retrain treats those as high-weight anchor scenes, similar to in-situ profiles.
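The v1 linear model above is small enough to state exactly. A minimal sketch, assuming scalar per-pixel σ⁰ inputs in dB (the production path would run over whole COG arrays rather than one pixel at a time):

```python
def vwc_index(sigma0, sigma0_dry, sigma0_wet):
    """v1 linear moisture index (§3d): 0 at the dry reference, 1 at the
    wet reference, clipped so outliers stay in the physical range.
    sigma0_dry / sigma0_wet are per-pixel reference backscatter values
    learned from high-trust contributions."""
    index = (sigma0 - sigma0_dry) / (sigma0_wet - sigma0_dry)
    return max(0.0, min(1.0, index))

vwc_index(-15.0, sigma0_dry=-20.0, sigma0_wet=-10.0)  # 0.5 — halfway between references
```

The clip is deliberate: backscatter noisier than the dry/wet envelope should saturate, not produce impossible moisture values.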
3e. Output bucket structure
s3://mud2dust-cogs/
├── moisture-surface/ moisture-rootzone/
├── ndvi/ ndwi/ ndre/ lst/ et-daily/ precip-7d/ ...
├── sigma0/ (raw, 90-day retention)
├── priors/
│ ├── polaris/ soilgrids/ ssurgo/ dem-30m/ (static, indefinite)
└── ...
s3://mud2dust-contributions/
├── events/{contributor_id}/{flight_id}/scene.tif (drone, aircraft)
├── collections/{contributor_id}/{survey_id}/
├── samples-attachments/{contributor_id}/{sample_id}/lab-report.pdf
└── boundaries/{contributor_id}/{boundary_id}.geojson
Lifecycle: raw satellite layers 90d, fused/calibrated indefinite, priors indefinite. Contributor raster Standard 90d → IA → Glacier IR (policy details deferred per §12l).
3f. Tile serving — titiler on Lambda + CloudFront
Standard pattern. Cold start ~500ms, warm ~30ms per tile. Two-tier caching (CloudFront edge + titiler render). The same titiler also serves contributor-uploaded drone COGs opted into public visibility — /tiles/contrib/{event_id}/{z}/{x}/{y}.png.
3g. STAC catalog — pgstac on Aurora Serverless v2
Sub-100ms search. Two collection types:
- `mud2dust-{layer}` — official calibrated layers (anonymous public)
- `contrib/{contributor_id}/{collection_id}` — contributor-uploaded Events and Collections (public if opted; private otherwise)
The same pgstac instance handles both — STAC was designed for this.
3h. Contribution bridge — seven first-class object types
One repo (mud2dust/sensor-bridge), MIT-licensed. All seven shapes share auth, multi-tenancy, trust scoring, privacy controls, and adapter registry. Per-shape modules differ only in schema and validators.
| Object type | Covers | Schema basis | Storage | Ingest |
|---|---|---|---|---|
| Observation | Time-series point readings (soil moisture, weather, CRNP, GNSS-IR, sap flow, eddy-cov fluxes, stream gauge) | OGC SensorThings (Observation) | TimescaleDB | HTTPS push, REST pull, SFTP, MQTT |
| Profile | Multi-depth time-series (Sentek 9-depth, soil temp profiles, lysimeters) | SensorThings extended with depth_cm[] | TimescaleDB | HTTPS push, REST pull |
| Sample | Discrete lab results (gravimetric VWC, bulk density, texture, OM, plant tissue, LAI) | Custom (sample location/depth/method/lab) | Postgres + S3 attachments | HTTPS POST + optional PDF/CSV upload |
| Event | Single drone flight, single aircraft pass, single PhenoCam capture, irrigation event, planting event | STAC Item | S3 (raster) + pgstac | Presigned S3 multipart + STAC item POST |
| Collection | Multi-flight drone survey, multi-pass aircraft campaign, weekly PhenoCam series | STAC Collection | pgstac | POST /v1/collections then add Events |
| Annotation | Geotagged agronomist note, citizen-science observation, photo with timestamp + location + free text | GeoJSON Feature (structured properties) | Postgres + PostGIS | HTTPS POST |
| Boundary | Field boundary, management zone, EC-mapped zone, irrigation prescription extent | GeoJSON Feature (Polygon/MultiPolygon) | Postgres + PostGIS | HTTPS POST or shapefile/GeoJSON upload |
Common cross-cutting concerns (single implementation, applied to all shapes):
- Per-contributor credential vault — AWS Secrets Manager, namespaced `contributor/{id}/{vendor}`. Encrypted, scoped IAM, rotation supported. Contributor revocation deletes the token; in-flight pulls drain.
- Trust model — see §3i. Every contribution lands with `sensor_class`, `operator_class`, and a computed `trust_weight`.
- Privacy / coordinate fuzzing — every shape has `geom_internal` (exact, model-internal) and `geom_public` (jittered ±N km or aggregated). Default fuzzed; opt-in to exact precision per object. Boundaries respect a `visibility` flag (private/aggregate-only/public).
- Provenance — every contribution carries `source_adapter`, `raw_ref` (pointer to the original payload in cold storage), `submitter_user_id`, `submitting_app_id` (for OAuth-submitted contributions), creation timestamp, last-edit timestamp.
- QC — per-shape validators. Observation: range checks, constant-value sensor failure. Profile: depth-monotonicity sanity. Sample: unit + range. Event: COG conformance + radiometric cal flag (for SAR/multispec). Annotation: schema validation. Boundary: topology validity (no self-intersection, valid CRS).
- Adapter registry — adapters declare which shapes they can produce. Some adapters output multiple shapes (a Campbell logger emits Observations + Profiles; a drone upload pipeline emits Event + auto-derived Annotations like NDVI summary).
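The coordinate-fuzzing concern above can be sketched concretely. Everything here is an assumption, not a decided design: the jitter radius, the disc-uniform sampling, and deriving the jitter from the object id — the last one so that `geom_public` stays stable across requests, since a fresh random offset per request could be averaged back to the exact point:

```python
import hashlib
import math
import random

def fuzz_coords(lat, lon, object_id, radius_km=2.0):
    """Derive a stable public location by jittering exact coordinates.
    Seeding the RNG from the object id keeps geom_public fixed for a
    given object. radius_km and the seeding scheme are illustrative."""
    seed = int.from_bytes(hashlib.sha256(object_id.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist_km = radius_km * math.sqrt(rng.random())  # uniform over the disc, not the radius
    dlat = (dist_km / 111.32) * math.cos(angle)    # ~111.32 km per degree latitude
    dlon = (dist_km / (111.32 * math.cos(math.radians(lat)))) * math.sin(angle)
    return lat + dlat, lon + dlon
```

The `sqrt` on the radial draw makes points uniform over the disc rather than clustered at the center, which avoids leaking the true point as the centroid of a tight cluster.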
Vendor adapter set (initial, see §5b for full table):
- Sentek + YDOC, Campbell HTTP push, Vaisala-on-logger, METER on YDOC or ZENTRA Cloud
- Tempest WeatherFlow, Davis WeatherLink, FieldClimate, Onset HOBOlink
- Drone uploads: generic COG, OpenDroneMap-processed output, common drone vendor exports (DJI Terra, Pix4D, DroneDeploy)
- Lab CSV / NRCS report PDF (Sample)
- Shapefile / GeoJSON (Boundary)
- AmeriFlux pull (Observation, eddy-cov fluxes — at training-contributor request)
- USDA SCAN/SNOTEL pull (Observation)
3i. Trust model — applies to every shape
Two independent dimensions combine into a per-contribution trust_weight:
sensor_class (intrinsic to the source):
| Class | Examples |
|---|---|
research | METER Teros 12 with cal docs; Campbell CS655 fresh-cal; L-Band drone with corner reflectors and radiometric calibration; lab analysis from accredited lab; AmeriFlux eddy-cov tower |
professional | Sentek drill-and-drop, Vaisala WXT520, agronomist-grade EC mapping, multispectral drone with calibration panel |
consumer | Tempest, Ambient Weather, basic capacitance probes, hand-flown DJI RGB without panels |
diy | Arduino sensors, citizen-science photo annotations |
operator_class (who installed/collected/submitted):
| Class | Verification |
|---|---|
researcher | Institutional email + ORCID/ROR |
agronomist_supported | Self-declared, optional pro reference |
farmer | Self-declared, address geocodes to ag land |
hobbyist | Self-declared homeowner |
unknown | Anonymous |
training_weight = sensor_class × operator_class × installation_quality_factor (the third factor is earned over time — contributions that track neighbors and pass QC accumulate quality; outliers lose it).
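As a sketch of how the two dimensions combine — the multiplier values below are illustrative placeholders, not specified anywhere in this plan; the real values are a modeling decision:

```python
# Illustrative class multipliers (assumptions, not decided values).
SENSOR_CLASS = {"research": 1.0, "professional": 0.8, "consumer": 0.4, "diy": 0.2}
OPERATOR_CLASS = {"researcher": 1.0, "agronomist_supported": 0.9,
                  "farmer": 0.8, "hobbyist": 0.6, "unknown": 0.3}

def trust_weight(sensor_class, operator_class, installation_quality=1.0):
    """training_weight = sensor × operator × earned installation quality.
    installation_quality starts neutral (1.0) and is adjusted over time
    by QC: tracking neighbors raises it, persistent outliers lower it."""
    return SENSOR_CLASS[sensor_class] * OPERATOR_CLASS[operator_class] * installation_quality

# A professionally installed Sentek run by a farmer with neutral history:
trust_weight("professional", "farmer")  # 0.8 * 0.8 * 1.0 = 0.64 → above the 0.5 training cut
```

Under these placeholder values, consumer hardware alone can never reach the training tier without an earned quality bonus, which matches the intent of the two-path split below.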
Two paths through the system:
- Training contributions (`training_weight ≥ 0.5`): used in the nightly model retrain. A few hundred training stations + a few dozen calibrated drone campaigns per year at maturity.
- Correction-takers (everyone else): full feature parity, dashboard, exports, calibrated companion data. Not used in retrain, but bulk anomaly signal informs diagnostics.
UX rule: never show users their trust tier as a number or rank. What we surface instead is contribution health — drift detection, cross-source consistency, sensor-failure or scene-quality flags. Helpful, not pejorative.
3j. Identity, auth, and partner apps
Three identity types:
- User — a person (email + password, optional MFA via Cognito or Auth.js).
- Organization — farm, lab, agency. Owns contributions collectively.
- Partner application — third-party app integrating via OAuth 2.0 + PKCE.
OAuth scopes:
| Scope | Allows |
|---|---|
stations:read stations:write | Manage stations |
observations:read observations:write | Time-series I/O |
profiles:read profiles:write | Multi-depth time-series I/O |
samples:read samples:write | Lab samples |
events:read events:write | Drone/aircraft scene I/O (write requires presigned upload flow) |
collections:read collections:write | Survey grouping |
annotations:read annotations:write | Geotagged notes |
boundaries:read boundaries:write | Field/zone vector I/O |
corrections:read | Calibrated companion data |
alerts:read | Frost/anomaly events |
public:read | Public tile/STAC API (no user context) |
Verification mechanisms:
- `operator_class = researcher`: ORCID iD with affiliation; manual review for borderline cases.
- `operator_class = farmer`: station coords geocode to USDA-mapped agricultural land (CDL/CSB).
- `operator_class = hobbyist`: default.
3k. Where the existing hardware fits
The 6× Sentek 36" drill-and-drop probes on YDOC ML-417ADS go in at JMR as the anchor training station set — multi-depth profiles ideal for calibrating both surface and rootzone VWC outputs. Tempest + Vaisala WXT520 round out on-farm meteorology. METER hardware (planned purchase) lands via either YDOC or ZENTRA Cloud per deployment.
The Sentek drill-and-drop is better training data than the originally-planned single-depth Teros 12 — multi-depth profiles let the calibration model learn rootzone integration directly.
3l. L-Band drone cal/val — the headline research story
When a contributor flies calibrated L-Band SAR over a field that has live in-situ probes during a Sentinel-1 overpass window, three sources of truth are collocated:
- In-situ point truth — Sentek profile, METER probes, etc.
- Airborne high-res truth — drone L-Band σ⁰ over the flight extent (typically 5–50 cm resolution).
- The C-Band layer being calibrated — Sentinel-1 σ⁰ at 10 m.
This is SMAPVEX-class cal/val data, distributed across contributors instead of NASA-campaign-only. The platform's drone Event ingest path treats radiometrically-calibrated L-Band uploads as high-weight anchor scenes in the nightly retrain. NASA-CSDA, NSF-NRT, and DOE Atmospheric Radiation Measurement programs all fund distributed cal/val — this is a fundable research narrative independent of the data-platform story.
Operationally: an L-Band drone Event uploads as a STAC Item with metadata flags sensor=l_band_sar, radiometric_cal_method=corner_reflector, cal_target_in_scene=true, flight_window_overlaps_s1_overpass=true (auto-computed from timestamp). Preprocessor verifies the calibration metadata; contribution lands as sensor_class=research if intact, demoted otherwise.
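The preprocessor's metadata gate can be sketched as a simple all-flags check over the STAC Item properties named above (the property names follow §3l; the exact namespace they live under is an implementation choice):

```python
def is_research_anchor(item_properties):
    """Admit an L-Band drone Event as a research-class anchor scene only
    when every calibration flag the preprocessor requires is present and
    intact; anything missing or falsy demotes the contribution."""
    required = {
        "sensor": "l_band_sar",
        "radiometric_cal_method": "corner_reflector",
        "cal_target_in_scene": True,
        "flight_window_overlaps_s1_overpass": True,  # auto-computed from timestamp
    }
    return all(item_properties.get(k) == v for k, v in required.items())

props = {"sensor": "l_band_sar", "radiometric_cal_method": "corner_reflector",
         "cal_target_in_scene": True, "flight_window_overlaps_s1_overpass": True}
is_research_anchor(props)  # True → lands as sensor_class=research
```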
3m. Where Farming Game's role fits
Farming Game is the first OAuth partner application. JMR's stations register through /v1/stations on behalf of the JMR farm user. Field boundaries, irrigation events, and any drone flights JMR runs over its blocks all flow through the same OAuth scopes. Calibrated corrections come back through /v1/stations/{id}/corrections and the AOI extract endpoints. See §11.
4. AWS account structure
Phase 0–1: single account
One AWS account, root locked, MFA on, IAM Identity Center.
Phase 4+: multi-account AWS Organizations
| Account | Role | Notes |
|---|---|---|
| mud2dust-billing | Org root, no workloads | Consolidated billing only |
| mud2dust-prod | Pipeline + tiles + API + bridge | Default region us-west-2 |
| mud2dust-dev | Sandbox, breakable | Same regions as prod |
| mud2dust-data | Output S3 + contributor S3 + Secrets Manager | Defense in depth — compromised compute can't delete archive |
SCPs:
- Prevent disabling CloudTrail
- Prevent deleting backup snapshots
- Restrict deployable regions to us-west-2 and us-east-1
- Restrict access to `mud2dust-data` Secrets Manager namespaces by IAM role only
Networking
- VPC with private subnets per region.
- Lambda outside VPC for simplicity.
- Fargate in private subnets with VPC interface endpoints for S3, STS, Secrets Manager, CloudWatch Logs.
Cost guards from day one
- AWS Budget at $200/mo with alarm at 80%.
- AWS Cost Anomaly Detection enabled.
- Tag every resource: `project=mud2dust`, `env=prod|dev`, `layer=ingest|process|serve|bridge`.
5. Ingest job catalog
5a. Satellite + raster ingest
The "first six" — phase 1–2 — get 80% of value:
- Sentinel-1 RTC (moisture primary signal)
- HLS Sentinel-2 (NDVI, NDWI, NDRE)
- HLS Landsat (thermal LST, longer-archive optical)
- HRRR (precipitation accumulation, soil temp)
- POLARIS (static soil prior)
- Copernicus DEM (static topography prior)
Second wave — phase 5: ECOSTRESS, SMAP L4, OpenET, GOES-18 thermal, VIIRS Active Fire. Third wave — phase 6+: TROPOMI, GEDI, MODIS LST, GPM IMERG, USDA CDL/CSB, SSURGO/gNATSGO.
| Source | Tier | Cadence | Worker | Output | Retention |
|---|---|---|---|---|---|
| Sentinel-1 RTC | 1 (anonymous S3) | 6-day; nightly STAC poll | Lambda | σ⁰ + VWC COG | 90d raw, ∞ fused |
| Sentinel-2 (HLS) | 2 (Earthdata) | 2–3 day; daily | Lambda + auth | NDVI/NDWI/NDRE COG | 90d raw |
| Landsat 8/9 (HLS-L30) | 2 | 8-day; daily | Lambda + auth | thermal LST + reflectance | 90d |
| HRRR | 1 | hourly | Lambda | precip-7d + soil temp COG | 30d |
| NEXRAD MRMS | 1 | 5-min (sample hourly) | Lambda | precip-hr COG | 30d |
| GOES-18 thermal | 1 | 10-min (sample 30-min) | Lambda | LST COG | 14d |
| SMAP L4 | 2 | daily | Lambda + auth | regional VWC reference | 30d |
| ECOSTRESS | 2 | irregular ISS revisit | Lambda + auth | high-res LST scenes | 90d |
| MODIS LST + Snow | 2 | daily | Lambda + auth | continuity gap-fill | 30d |
| Sentinel-5P TROPOMI | 3 (Copernicus) | daily | Fargate (cross-region) | air quality / methane | 30d |
| Copernicus DEM | 1 | one-shot | Lambda | static prior | indefinite |
| POLARIS | 3 (UC Davis HTTP) | one-shot, refresh annually | Fargate | static soil prior | indefinite |
| SSURGO / gNATSGO | 3 (USDA NRCS) | annual | Fargate | static soil prior | indefinite |
| USDA CDL / CSB | 3 (USDA FTP) | annual | Fargate | crop classification | indefinite |
| OpenET | 3 (GEE/REST) | monthly | Lambda + GEE auth | ET CONUS | 12 months |
| GPM IMERG | 2 | 30-min (sample hourly) | Lambda + auth | global precip | 14d |
5b. Contribution bridge — adapters by shape
| Adapter / source | Shapes produced | Direction | Protocol |
|---|---|---|---|
| Sentek + YDOC ML-417 | Observation, Profile | Push | HTTPS POST (HMAC-signed) |
| Campbell Scientific | Observation, Profile | Push | HTTPS POST or SFTP (CRBasic template) |
| Vaisala WXT520 | Observation | Push | via host logger (Campbell adapter) |
| METER on YDOC | Observation, Profile | Push | YDOC HTTP POST |
| METER on ZENTRA Cloud | Observation, Profile | Pull | REST (per-contributor token) |
| Tempest WeatherFlow | Observation | Pull | REST + UDP local |
| Davis WeatherLink | Observation | Pull | REST (per-contributor token) |
| FieldClimate (METOS) | Observation, Profile | Pull | REST (per-contributor token) |
| Onset HOBOlink | Observation | Pull | REST (per-contributor token) |
| WeatherBug | Observation (regional) | Pull | REST (regional context, not on-farm) |
| Generic JSON webhook | Observation | Push | HTTPS POST |
| Generic CSV/SFTP | Observation, Sample | Push | SFTP |
| AmeriFlux pull | Observation (eddy-cov) | Pull | REST (training-contributor request) |
| USDA SCAN/SNOTEL pull | Observation, Profile | Pull | NRCS Awdb API |
| CRNP / COSMOS-USA | Observation | Pull | REST (where exposed) |
| Soil-lab CSV / PDF | Sample | Push | HTTPS POST + S3 multipart for attachment |
| Drone COG upload | Event | Push | Presigned S3 multipart + STAC POST |
| OpenDroneMap output | Event | Push | Same as above; auto-detect outputs |
| DJI Terra / Pix4D / DroneDeploy export | Event | Push | Presigned S3 multipart + STAC POST |
| Aircraft campaign upload | Event, Collection | Push | Presigned S3 multipart + STAC POST |
| PhenoCam-style series | Collection (auto-grouped Events) | Push | Periodic HTTPS POST |
| Shapefile / GeoJSON | Boundary | Push | HTTPS POST or multipart upload |
| Agronomist note (mobile) | Annotation | Push | HTTPS POST (mobile/web) |
| Citizen-science photo | Annotation | Push | HTTPS POST (multipart with EXIF) |
Failure handling: every adapter emits CloudWatch metric IngestSuccess{adapter=X, contributor=Y, shape=Z}. Alarm on >2 consecutive failures. Failed contributions DLQ'd to SQS for retry. Contributor sees a status indicator per source on their dashboard.
Earthdata auth wrapper (Tier 2 satellite): built once as a shared Lambda layer.
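For the HMAC-signed push adapters in the table above, the server-side check can be sketched as follows — the header name the logger would use and the per-station shared secret are assumptions, not settled protocol details:

```python
import hashlib
import hmac
import json

def sign_payload(secret: bytes, body: bytes) -> str:
    """HMAC-SHA256 hex digest a push logger (e.g. YDOC) would attach,
    say in an X-Mud2dust-Signature header (header name hypothetical)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_payload(secret: bytes, body: bytes, signature: str) -> bool:
    # compare_digest gives constant-time comparison, avoiding timing leaks
    return hmac.compare_digest(sign_payload(secret, body), signature)

secret = b"per-station-shared-secret"
body = json.dumps({"station": "jmr-01", "vwc_10cm": 0.31}).encode()
verify_payload(secret, body, sign_payload(secret, body))  # True
```

Signing the raw request body (rather than parsed fields) means any in-transit tampering or re-serialization invalidates the signature.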
6. Output products / API surface
6a. Public raster tile API (anonymous tier)
GET /tiles/{layer}/{date}/{z}/{x}/{y}.{png|webp}?colormap=...
Standard XYZ tiles. Default PNG; WebP optional; raw GeoTIFF via ?format=tif. CloudFront long TTL on (layer, date). WAF rate limit 600 req/hr per IP. Attribution headers on every response.
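A client consuming this endpoint only needs the standard slippy-map tile math plus the path shape above; the host below is a placeholder assumption:

```python
import math

def lonlat_to_tile(lon, lat, z):
    """Standard XYZ (slippy-map) tile index for a point at zoom z."""
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def tile_url(layer, date, z, x, y, fmt="png", host="https://tiles.mud2dust.com"):
    # host is an assumption; the path shape follows §6a
    return f"{host}/tiles/{layer}/{date}/{z}/{x}/{y}.{fmt}"

x, y = lonlat_to_tile(-121.5, 38.6, 10)  # a point in the Central Valley
tile_url("moisture-rootzone", "2026-04-28", 10, x, y)
```

Because the URL is fully determined by (layer, date, z, x, y), CloudFront can cache with a long TTL exactly as described above.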
6b. Public STAC catalog (anonymous tier)
GET /stac/collections → list official + public-opted contrib collections
GET /stac/collections/{id}/items?bbox=... → search
GET /stac/collections/{id}/items/{id} → single item (links presigned COG, 1-hour TTL)
6c. AOI extraction API (authenticated free tier)
POST /v1/extract
{ "geom": <GeoJSON or boundary_id>, "layer": "moisture-rootzone",
"from": "2026-04-01", "to": "2026-04-28", "format": "parquet" }
→ 202 with job_id; poll /v1/jobs/{job_id}
Async via Step Functions. The geom field accepts a contributor's saved Boundary by ID — natural ergonomics for "extract over my field."
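The 202-plus-polling pattern on the client side is a plain loop; the `status` values and `result_url` field below are assumed response-shape names, since the job schema isn't pinned down in this plan:

```python
import time

def wait_for_job(client, job_id, poll_s=2.0, timeout_s=600):
    """Poll /v1/jobs/{job_id} until the extract succeeds or fails.
    `client` is any object exposing get_job(job_id) -> dict with a
    "status" key and, on success, a "result_url" (hypothetical names)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = client.get_job(job_id)
        if job["status"] == "succeeded":
            return job["result_url"]
        if job["status"] == "failed":
            raise RuntimeError(job.get("error", "extract job failed"))
        time.sleep(poll_s)
    raise TimeoutError(f"job {job_id} still running after {timeout_s}s")
```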
6d. Public station / event browse (anonymous tier)
GET /v1/public/stations?bbox=... → public-opted stations (jittered coords)
GET /v1/public/stations/{id}/observations?...
GET /v1/public/events?bbox=...&type=drone → public-opted Events
6e. Contributor dashboard API (authenticated user tier)
GET /v1/me/stations → all my stations across vendors
GET /v1/me/stations/{id}/observations?...
GET /v1/me/stations/{id}/profiles?...
GET /v1/me/samples?...
GET /v1/me/events?...
GET /v1/me/collections?...
GET /v1/me/annotations?...
GET /v1/me/boundaries?...
GET /v1/me/corrections?stations=...&from=... → satellite vs station companion
GET /v1/me/exports?shapes=...&format=parquet → cross-shape unified export
GET /v1/me/alerts → frost / anomaly / contribution-health
6f. Partner-app API (OAuth scoped tokens)
Endpoints for each shape, all gated by OAuth scopes from §3j:
POST /v1/stations → register a station (scope: stations:write)
POST /v1/observations → push time-series readings (scope: observations:write)
POST /v1/profiles → push multi-depth readings (scope: profiles:write)
POST /v1/samples → submit a lab sample (scope: samples:write)
POST /v1/uploads/initiate → start a presigned upload (scope: events:write)
POST /v1/uploads/complete → finalize, create STAC item (scope: events:write)
POST /v1/collections → group flights/passes (scope: collections:write)
POST /v1/annotations → geotagged note (scope: annotations:write)
POST /v1/boundaries → field/zone vector (scope: boundaries:write)
GET /v1/stations/{id}/corrections → calibrated companion (scope: corrections:read)
GET /v1/alerts → frost / anomaly events (scope: alerts:read)
The in-product onboarding wizard is a thin shell over these endpoints — the API is the product surface, not an afterthought.
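Gating each endpoint on the scopes from §3j reduces to a lookup plus a membership test. A minimal sketch — the route-to-scope mapping below is a small illustrative subset, and space-delimited token scopes follow the usual OAuth 2.0 convention:

```python
# Illustrative subset of the endpoint → required-scope mapping.
REQUIRED_SCOPE = {
    ("POST", "/v1/observations"): "observations:write",
    ("POST", "/v1/boundaries"): "boundaries:write",
    ("GET", "/v1/stations/{id}/corrections"): "corrections:read",
}

def authorize(method, route, token_scopes):
    """Allow a partner-app call only if the user's OAuth token carries
    the scope that route requires. token_scopes is the space-delimited
    scope string granted at consent time."""
    needed = REQUIRED_SCOPE[(method, route)]
    return needed in token_scopes.split()

authorize("POST", "/v1/observations", "observations:write corrections:read")  # True
```

Keeping the mapping in one table makes "anything outside granted scope" (the access-tier rule in §7) a single enforcement point rather than per-handler checks.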
6g. Bulk download (paid tier — defer to phase 6)
7. Access tiers
| Audience | Auth | Rate limit | Can do | Can't do |
|---|---|---|---|---|
| Anonymous public | none | 600 req/hr per IP | Browse calibrated map, public STAC, public-opted stations + events, AOI extract (small) | Connect contributions; access raw contributor data |
| Signed-in viewer (pk_view) | email + password | 5,000 req/hr | All anonymous + saved AOIs, alerts, multi-source dashboard for any public contributions | Connect contributions |
| Connected contributor (pk_contrib_pending) | as above + ≥1 contribution | 10,000 req/hr | All viewer + their own dashboard across all seven shapes, raw API for their contributions, cross-shape export | Train the public model |
| Validated contributor (pk_contrib) | as above + 30 days validated data | 50,000 req/hr | All contributor + recognition badge | — |
| Training contributor (pk_contrib_train) | as above + sensor_class ≥ professional + verified install | 50,000 req/hr | All validated + their data trains the public model + "training contributor" badge | — |
| Partner app (client_id + user OAuth token) | OAuth 2.0 PKCE | per-scope, per-user | Scoped on user's behalf | Anything outside granted scope |
| Commercial bulk | API key + signed agreement | unlimited | Bulk Parquet, no rate limit | (defer to phase 6) |
Tier transitions: sign up → pk_view → connect first contribution → pk_contrib_pending → 30 days validated → pk_contrib → research-grade hardware + verified ORCID + verified install → pk_contrib_train (manual review for first cohort).
Validated = passes per-shape QC + non-degenerate values + plausible location. Stops fake contributions from harvesting tier bumps.
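A toy version of the Observation-shape check behind "validated" — the VWC range bounds are assumptions for illustration; real thresholds would be per-variable and per-sensor:

```python
def validate_observations(values, vmin=0.0, vmax=0.6):
    """Minimal Observation QC: range check plus constant-value (stuck
    sensor) detection. vmin/vmax are illustrative VWC bounds in m3/m3."""
    in_range = all(vmin <= v <= vmax for v in values)
    non_degenerate = len(set(values)) > 1  # a stuck sensor repeats one value forever
    return in_range and non_degenerate

validate_observations([0.21, 0.22, 0.22, 0.24])  # True
validate_observations([0.30] * 96)               # False — constant-value failure
```

The constant-value test is what blocks the trivial gaming path: replaying one plausible reading on a timer passes the range check but never varies.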
8. Cost estimate
Phase 1 — regional satellite + JMR anchor stations
| Item | Low | High | Driver |
|---|---|---|---|
| S3 storage (COGs) | $5 | $15 | 200–600 GB rolling |
| S3 PUT/GET | $1 | $3 | |
| Lambda compute | $5 | $15 | Satellite ingest + bridge intake + pull workers |
| Fargate compute | $50 | $100 | Cross-region NOAA, tier-3, drone preprocessing |
| CloudFront egress | $20 | $200 | Variable |
| Cross-region transfer | $5 | $15 | NOAA us-east-1 → us-west-2 |
| API Gateway | $3 | $10 | |
| RDS Aurora Serverless (pgstac + TimescaleDB + relational + PostGIS) | $80 | $130 | Combined or split |
| Secrets Manager | $2 | $10 | Per-contributor vendor tokens |
| Cognito (or equivalent) | $0 | $5 | Free tier covers <50k MAU |
| CloudWatch logs/metrics | $5 | $15 | |
| EventBridge + SQS + DLQ | $1 | $5 | |
| WAF | $5 | $10 | |
| Total (Phase 1) | $182 | $533 | Doesn't include contributor-uploaded Event volume (§12l) |
Plus contributor probe ops (separate budget line):
- YDOC cellular SIMs, 6 anchor loggers: ~$30–60/mo
- Tempest data plan: $0 (WeatherFlow API free for personal use)
- Vaisala / Campbell logger comms: TBD
- Sentek hardware (sunk): already purchased
- METER hardware: TBD pending purchase
Phase 4+ adders — contribution-driven
When the bridge opens to outside contributors:
- Drone Event storage: 100 contributors × 10 flights/yr × 1 GB avg = 1 TB/yr. S3 Standard grows to ≈$23/mo by the end of year one (at $0.023/GB-mo); tiering to IA/Glacier per §12l.
- Drone preprocessing compute: Fargate spikes per upload. Budget $50–200/mo at moderate volume.
- Contributor pull workers: Lambda invocations scale with contributor count × poll frequency. ~$0.05/contributor/mo at hourly poll; trivial.
- Total Phase 4 platform overhead: $80–300/mo on top of Phase 1 baseline.
CONUS scaling (phase 6)
Storage ~5–8×. Compute ~3–5×. Realistic CONUS-with-traction: $1,500–4,000/mo.
9. Phased build plan
Phase 0 — Foundation (1 week)
Done: brand locked, domains registered, hardware purchased.
Day 1–2: AWS account. MFA on root, IAM Identity Center, budget alarms, CloudTrail, tag policy.
Day 3–4: Repository scaffolding under GitHub org mud2dust/:
- mud2dust/sensor-bridge — multi-shape contribution bridge
- mud2dust/pipeline — satellite workers, CDK or Terraform infra
- mud2dust/titiler — tile server, customized titiler config
- mud2dust/site — Next.js dashboard + public landing + onboarding wizard
MIT license on all four.
Day 5: Auth scaffolding placeholder in mud2dust/site — Cognito user pool created (or Auth.js setup), OAuth app-registration table stubbed in DB, no UI yet.
Day 6: Vercel project for the landing site. mud2dust.com placeholder.
Day 7: Buffer / METER follow-up / JMR conversation per §14.
Phase 1 — Anchor station + one satellite layer end-to-end (3 weeks)
Week 1: Sentinel-1 RTC ingest. Lambda on EventBridge daily cron @ 03:00 UTC. Mosaic clipped to Eastern WA bbox, write s3://mud2dust-cogs/sigma0/yyyy/mm/dd/eastern-wa.tif.
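The Week 1 key layout is simple enough to pin down in a helper. The `sigma0/yyyy/mm/dd/<region>.tif` layout under `s3://mud2dust-cogs/` is from the plan; the function itself is illustrative:

```python
from datetime import date

def sigma0_key(day: date, region: str = "eastern-wa") -> str:
    """Build the daily COG key under s3://mud2dust-cogs/ (Phase 1 layout)."""
    return f"sigma0/{day:%Y/%m/%d}/{region}.tif"

sigma0_key(date(2026, 5, 7))  # 'sigma0/2026/05/07/eastern-wa.tif'
```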
Week 2: Bridge MVP — Observation + Profile shape only, YDOC adapter only. HTTPS endpoint with HMAC validation. JMR's YDOC ML-417ADS configured to POST. Land observations in TimescaleDB. Bare-bones internal dashboard rendering JMR Sentek depths.
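A minimal sketch of the check the bridge endpoint would run on each POST, assuming HMAC-SHA256 over the raw body with a hex-encoded signature (the plan fixes only "HMAC validation", not the wire format):

```python
import hmac
import hashlib

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time HMAC-SHA256 check of a pushed payload.

    SHA-256 and hex encoding are assumptions about the logger's
    configuration; compare_digest avoids timing side channels.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```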
Week 3: Tile route + station overlay. Titiler on Lambda + CloudFront. /tiles/sigma0/{date}/{z}/{x}/{y}.png. Demo page with Mapbox raster + JMR Sentek anchor station overlay.
Phase 1 deliverable: internal URL renders Sentinel-1 backscatter over Franklin County with the JMR Sentek anchor station live-overlaid. σ⁰ change correlates with rain events from Tempest readings.
Phase 2 — Multi-source fusion (4 weeks)
Week 1: HLS ingest with Earthdata auth. NDVI/NDWI COGs.
Week 2: HRRR ingest in us-east-1. 7-day rolling precip.
Week 3: Static priors. POLARIS from UC Davis, Copernicus DEM from S3.
Week 4: First fusion model. RF on JMR Sentek + Tempest + HLS NDVI + Sentinel-1 σ⁰. Pickle, ship to Lambda. /tiles/moisture-rootzone/{date}/... runs the model per pixel. Holdout-validate against held-back Sentek depths.
Phase 2 deliverable: calibrated moisture map. Farming Game can switch off AGROMONITORING_API_KEY.
Phase 3 — Open the public API + dashboard (3 weeks)
Week 1: WAF, attribution headers, STAC catalog populated.
Week 2: Public dashboard at mud2dust.com — calibrated map browse, public-opted contributions visible, no signup.
Week 3: Soft launch — blog, social, HN, Awesome-Geospatial.
Phase 3 deliverable: strangers using the public map.
Phase 4 — Full contribution platform (~15 weeks, split into 4a–4e)
Phase 4a — Bridge expansion to all time-series sources (3 weeks)
Week 1: ZENTRA Cloud pull adapter (per-contributor token + Secrets Manager). Tested against METER hardware once purchased; tested against any METER research-network access from §14.
Week 2: WeatherFlow Tempest pull, Davis WeatherLink pull, FieldClimate pull, Onset HOBOlink pull.
Week 3: Campbell HTTP-push adapter + CRBasic template publication. Generic JSON webhook adapter. SFTP intake gateway. AmeriFlux + USDA SCAN pull.
Phase 4b — Event / Collection (drone + aircraft) (4 weeks)
Week 1: Presigned S3 multipart upload flow. Storage layout under s3://mud2dust-contributions/events/.
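The part-splitting arithmetic behind the presigned multipart flow, using S3's documented multipart limits (5 MiB minimum part size except the last part, 10,000 parts maximum); the 64 MiB default part size is an assumption:

```python
import math

MIN_PART = 5 * 1024**2    # S3 minimum part size (all parts except the last)
MAX_PARTS = 10_000        # S3 hard limit on parts per upload

def plan_parts(total_bytes: int, part_size: int = 64 * 1024**2) -> list[tuple[int, int]]:
    """Split an upload into (offset, length) parts for presigned multipart."""
    if part_size < MIN_PART:
        raise ValueError("part size below S3 minimum")
    n = math.ceil(total_bytes / part_size)
    if n > MAX_PARTS:
        raise ValueError("too many parts; increase part size")
    return [(i * part_size, min(part_size, total_bytes - i * part_size))
            for i in range(n)]
```

The client requests one presigned URL per planned part, uploads each slice, then completes the multipart upload with the collected ETags.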
Week 2: Drone preprocessing pipeline (Fargate). COG-conformance validation, metadata extraction, STAC item generation. Generic COG, OpenDroneMap output, DJI Terra / Pix4D / DroneDeploy exports.
Week 3: L-Band SAR Event handling. Radiometric-cal metadata schema. Cal-target detection flag. Auto-cross-reference with Sentinel-1 overpass windows.
Week 4: Collection grouping. Public-opt-in for STAC catalog. Tile route for public Events.
Phase 4c — Sample, Annotation, Boundary (2 weeks)
Week 1: Sample shape — schema, lab-CSV/PDF intake, attachment storage. Annotation shape — geotagged notes, citizen-science photo intake.
Week 2: Boundary shape — shapefile/GeoJSON upload, PostGIS storage, integration with AOI extract endpoint.
Phase 4d — Onboarding UX + cross-shape dashboard (3 weeks)
Week 1: User signup flow. Onboarding wizard scaffolding ("what do you have?" branching across all seven shapes).
Week 2: Vendor-specific onboarding paths. Per-station/event/etc registration UI with sensor_class / operator_class / privacy controls.
Week 3: Cross-shape unified dashboard — stations + flights + samples + boundaries on one map. Anomaly / drift / contribution-health surfacing. Cross-shape Parquet/CSV export.
Phase 4e — Trust model + retrain + partner-app OAuth (3 weeks)
Week 1: Trust model fully wired across all shapes. ORCID verification flow. Tier transitions automated.
Week 2: Nightly retraining job. Pulls last-30-days training-tier contributions across Observation/Profile/Event (L-Band drone scenes weighted as anchors). A/B test, promote if better.
Week 3: OAuth 2.0 partner-app flow. App registration UI. Farming Game integrated as first partner app — registers JMR's stations + boundaries through the API, pushes Tempest + Vaisala observations, reads calibrated corrections.
Phase 4 deliverable: any farmer, researcher, or drone operator can connect; partner apps integrate via OAuth; calibration model improves visibly per month with multi-shape contributions.
Phase 5 — Add layers + grant momentum (6 weeks)
ECOSTRESS, OpenET, TROPOMI, GOES frost, VIIRS Active Fire. Each ~1 week. Sequence by user demand.
Phase 6 — National + paid tier (8 weeks)
Weeks 1–4: scale satellite ingest to CONUS. Weeks 5–6: Stripe paid tier. Weeks 7–8: first grant application (NSF Pathways, USDA NIFA, NASA CSDA).
10. Key architecture decisions
| Decision | Choice | Why |
|---|---|---|
| Brand | mud2dust | Locked. Domains owned across major TLDs. |
| License | CC-BY for tiles + STAC outputs, MIT for code | Lowest-friction with attribution required. |
| Bridge object types | Seven first-class shapes (Observation, Profile, Sample, Event, Collection, Annotation, Boundary) | Most platforms pick one or two; designing for all seven is the differentiator. Shared auth/trust/privacy/storage layer keeps cost manageable. |
| Drone L-Band as anchor | Treat radiometrically-calibrated L-Band drone Events as training-tier anchor scenes | Distributed SMAPVEX-class cal/val. Distinct, fundable research narrative. |
| Bridge layout | One repo, modular by shape | Shapes share 80% of plumbing; fork modules within the repo, not the codebase. |
| Trust model | sensor_class × operator_class → training_weight, invisible to users | Honest weighting without alienating low-trust contributors. |
| Training/correction split | Few hundred training contributions; everyone else consumes corrections | Calibration needs quality, not quantity. |
| Partner-app integration | OAuth 2.0 + PKCE with per-shape scopes | Standard pattern; lets Farming Game and any future app integrate. |
| Vendor token storage | AWS Secrets Manager, namespaced per contributor | Encrypted at rest; per-contributor IAM; rotation; revocation. |
| Coordinate privacy | Two coord fields per object — internal (exact) and public (jittered) | Default fuzzed for STAC/public; opt-in per object. |
| Event upload mechanism | Presigned S3 multipart, not API POST | Drone scenes are too big for API POST; presigned multipart is the AWS-native pattern. |
| STAC backend | pgstac on Aurora Serverless v2 (one instance, official + contrib collections) | Sub-100ms search; STAC is the lingua franca for both satellite and contributor raster. |
| Sensor-data backend | TimescaleDB on RDS (multi-tenant by contributor_id) | Easier model-training queries, easier contributor SQL. |
| Tile renderer | titiler on Lambda + CloudFront | Idiomatic; copies OpenET / Planetary Computer. |
| Failure isolation | One Lambda/Fargate per source/adapter, EventBridge-triggered | A SMAP outage doesn't block Sentinel-1; ZENTRA outage doesn't block YDOC; drone preprocessing doesn't block Observation ingest. |
| Account split | Single account through phase 3, multi-account org from phase 4 | Defense-in-depth once data exists. |
| Showcase customer | Farming Game as first OAuth partner app | Validates partner-app API; useful for grants. |
| Anchor stations | 6× Sentek 36" drill-and-drop on YDOC ML-417ADS (already purchased) | Multi-depth profile sensors better than single-depth. |
11. What this means for Farming Game
Six concrete changes in the farminggame repo, sequenced after mud2dust phase 4:
- Become the first OAuth partner app on mud2dust. Register JMR's Sentek + Tempest + Vaisala stations through /v1/stations. Stream observations through /v1/observations. Read calibrated corrections through /v1/stations/{id}/corrections.
- Push field boundaries. JMR's block boundaries become Boundary contributions on mud2dust via /v1/boundaries. AOI extracts and per-block calibrated outputs become trivial.
- Push irrigation events as Annotations or domain-specific Events. Closes the loop between "what was applied" and "what the satellite + sensors see."
- Drop Agromonitoring. Replace AGROMONITORING_API_KEY with mud2dust API calls via OAuth user token. Saves $30–300/mo per farm.
- Retitle Phase 13. Rename "Direct Satellite Pipeline & Soil Calibration" to "Integrate with mud2dust." Drop subphases 13b–13d. Keep 13a (sensor deployment) but reframe as "deploy stations into mud2dust as the first OAuth partner app." Subphases 13e–13g stay in farminggame.
- Become the showcase customer. mud2dust's website links to Farming Game as the working partner-app example. Farming Game's website credits mud2dust as the data layer.
The two projects share an AWS organization but separate accounts/billing.
12. Open questions
12a. Brand name + domain — Resolved. mud2dust locked; both mud2dust and mudtodust owned across .com, .net, .org, .io, .dev, .ag, .farm, .ai, .earth, .co (20 domains). Primary mud2dust.com. Defensive coverage complete.
12b. Legal entity
LLC for phase 0–3, hybrid LLC-owned-by-501(c)(3) when contributor revenue exceeds $20k/yr. LLC keeps you nimble; 501(c)(3) unlocks grants; hybrid (OpenStreetMap Foundation pattern) adds ~$3K/yr legal/accounting overhead.
12c. METER co-brand terms
See §14 for the three-tier ask. Standard structure for the biggest ask: logo + advisory seat + 30-day model first-look + joint paper opportunity. No exclusivity.
12d. Anchor station funding split
6× Sentek + YDOC + Tempest + Vaisala (hardware sunk). Ongoing comms ~$30–60/mo cellular + ZENTRA Cloud if used. Options:
- mud2dust absorbs fully.
- JMR co-funds as founding contributor (gets pk_contrib_train in perpetuity).
- Hybrid: mud2dust pays cellular + bridge ops, JMR pays any ZENTRA Cloud subscription.
12e. Initial advisors (recruit before phase 3 launch)
- Northwest credibility — USDA-ARS Pendleton or WSU Prosser.
- Federal-process knowhow — OpenET / NASA open-data ag programs alum.
- Commercial validation — Bayer / Climate / Granular alum.
12f. Funding strategy (sequenced)
| Phase | Source | Amount | Why |
|---|---|---|---|
| 0–3 | self-fund | ~$3–5K | AWS + domains (sunk) + marginal time |
| 4 | METER + 2–3 universities at $5K/yr | $20K | Consortium fee covers ops |
| 5 | NIFA SBIR Phase I + NASA-CSDA L-Band cal/val | $175–500K | Two distinct narratives |
| 6 | NRCS CIG or Climate Smart Commodities | $500K+ | National rollout |
| 6+ | Paid bulk-tier customers | $50–500K/yr | Recurring revenue |
12g. Data-use agreement language
Contributors own their raw data. mud2dust gets a license to (a) render their dashboard, (b) use their data in retrain if they're at training tier, (c) produce aggregate/derived public outputs under CC-BY. Revocable. One-paragraph plain-English summary at signup + longer version reviewed by counsel. Tied to §12b legal entity timing.
12h. Researcher verification mechanism
ORCID iD with affiliation at v1; manual review for borderline cases (institutional email without ORCID; non-academic researchers with publication record).
12i. Hardware in hand — reflects updated plan
6× Sentek 36" drill-and-drop on YDOC ML-417ADS, plus Tempest + Vaisala WXT520 + Campbell-in-conversation + METER-to-purchase. Multi-depth profile sensors are better training data than single-depth Teros 12.
12j. Contributor freemium / dashboard-only path
Contributors who only want the dashboard support pk_contrib with all stations/events marked geom_public_mode = "private". Their data is excluded from public model and public browse. Costs almost nothing to support. Keeps funnel wide.
12k. METER hardware logger choice
For each METER probe purchase, choose YDOC (existing fleet, push) vs ZENTRA Cloud (vendor portal, pull) per deployment.
12l. Contributor raster storage policy — deferred
At what volume should contributor-uploaded Events trigger storage policy (per-tier quotas, aggressive Glacier lifecycle, COG-only at upload, etc.)? Defer until phase 5 when real volume signal is available. Track contributor-storage as a separate budget line from day one so the trigger point is visible.
12m. L-Band drone cal/val — recruitment strategy
Who flies L-Band SAR over instrumented fields and would partner? Candidates: NASA AirMOSS alumni, USDA-ARS Beltsville, university radar labs (CSU, OU, Univ. Michigan), commercial L-Band drone vendors (ImSAR pilot programs). At least one before phase 5 makes the cal/val story concrete enough to write into a NASA-CSDA proposal.
12n. Cross-shape exports — what does "unified" mean?
A contributor with stations + flights + samples + boundaries on one farm wants a single coherent export. What's the format? Options:
- One Parquet per shape, zipped together with a manifest.
- A single STAC catalog where stations are STAC items with assets pointing to Parquet.
- A custom mud2dust archive format.
Recommendation: Parquet-per-shape + STAC manifest; defer custom format unless users ask. Decide during phase 4d.
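Under the recommended option, the zip-level manifest is small. The seven shape names come from §2; the manifest schema here is illustrative, since the real export would likely use STAC item/asset structure:

```python
import json

SHAPES = ("observations", "profiles", "samples", "events",
          "collections", "annotations", "boundaries")

def export_manifest(contributor_id: str, files: dict[str, str]) -> str:
    """Build the manifest for a Parquet-per-shape zipped export.

    `files` maps shape name to the Parquet filename inside the archive.
    Schema is a sketch, not a committed format.
    """
    return json.dumps({
        "contributor": contributor_id,
        "format": "parquet-per-shape",
        "assets": {s: files[s] for s in SHAPES if s in files},
    }, indent=2)
```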
13. Things deliberately deferred
| Item | Defer until | Why |
|---|---|---|
| Disaster recovery (cross-region replication, RDS backups, full IaC) | Phase 4 | Single account is recoverable enough during build |
| Observability beyond CloudWatch | Phase 3 launch | CloudWatch is fine until you have users |
| GDPR compliance | Phase 3 | Trivial — only collecting emails for API keys |
| SOC 2 | Phase 6 | Only matters for paid-tier enterprise customers |
| Versioning (model_version on tiles, reproducible old outputs) | Phase 4 | Once retraining starts, old tiles need to be reproducible |
| Internationalization | Phase 6+ | Algorithms generalize globally; only POLARIS + SSURGO are US-only |
| MFA enforcement on all users | Phase 4e | Encouraged at training-contributor tier; required there; optional elsewhere until then |
| Mobile app | Phase 6+ | Web dashboard responsive enough; native app waits for product/market fit |
| Water-quality / nitrate runoff samples | Phase 6+ | Adjacent industry; could fragment focus |
| Stream gauges / hydrology beyond awareness | Phase 6+ | Not on the soil-moisture critical path |
| In-cab / equipment telemetry (John Deere, AgLeader, Trimble) | Phase 6+ | Valuable but a different integration class; revisit once partner-app pattern is proven |
| Climate model output as contribution | Phase 6+ | Drought projection is a separate product; defer |
| Contributor raster storage policy | Phase 5 | See §12l |
14. Next concrete steps
METER meeting (2026-05-07) — three asks ranked by ease
- (Easy / must-have) Confirm ZENTRA Cloud REST API can be used as a per-contributor pull adapter — any METER customer who authorizes mud2dust with their own ZENTRA token grants us read access to their devices and observations. Standard third-party API use; should not require formal agreement, but worth confirming there's no ToS restriction.
- (Medium) Co-publish the ZENTRA Cloud adapter under MIT in mud2dust/sensor-bridge. Gives METER a co-contributor citation; gives any open-data project a reusable adapter.
- (Big) METER's research network contributes as a Training Contributor on the platform, with co-branding (logo + tile attribution; advisory seat; 30-day model first-look; joint paper opportunity).
Their developer-relations / API team is likely the right counterpart for asks 1 and 2, and research/sales for ask 3; try to bring both groups into the meeting.
Other Phase 0 prerequisites
- Stand up an AWS account. 15 minutes if not already done — root locked, MFA, IAM Identity Center.
- Conversation with Jackass Mountain Ranch to confirm willingness to host the anchor stations and decide §12d funding split.
- Legal-entity decision (LLC formation) — gate for any contributor-data-use agreement at scale.
- Repo scaffolding: create mud2dust/sensor-bridge, mud2dust/pipeline, mud2dust/titiler, mud2dust/site under the GitHub org.
- Domain DNS: point mud2dust.com at Vercel (landing) and api.mud2dust.com at API Gateway (placeholder).
- L-Band drone cal/val partnership scouting — first conversation per §12m, before phase 5 grant-writing.
Once those are done, phase 0 closes and phase 1 starts.
Verification
| Phase | Verification gate |
|---|---|
| 0 | Domains registered (done). AWS account exists with budget alarm + tags. Four repos scaffolded with MIT license. Auth scaffolding placeholder in site. Placeholder landing live at mud2dust.com. |
| 1 | Internal URL renders Sentinel-1 backscatter tiles over Franklin County, WA. JMR Sentek station registered via API; observations flowing into TimescaleDB; visible on internal dashboard. σ⁰ change correlates with rain events from Tempest readings. |
| 2 | /tiles/moisture-rootzone/{date} returns calibrated VWC. Holdout JMR Sentek depths validated within ±3% VWC RMSE. Farming Game can switch off AGROMONITORING_API_KEY and field-detail panel still renders moisture. |
| 3 | Anonymous tile + STAC API live behind WAF. Public dashboard browseable without signup. Five external users have made pystac-client requests. Attribution headers present on all responses. |
| 4a | Bridge supports ≥6 vendor adapters end-to-end across Observation + Profile shapes (YDOC push, ZENTRA pull, WeatherFlow pull, WeatherLink pull, Campbell push, AmeriFlux pull). Per-contributor credential vault working. |
| 4b | Drone Event upload works end-to-end (presigned multipart → preprocessing → STAC item → tile route). At least one L-Band SAR upload validated with calibration metadata. Public tile route for opted Events live. |
| 4c | Sample, Annotation, and Boundary shapes all support upload + retrieval + dashboard rendering. Boundary integrates with AOI extract. |
| 4d | Onboarding wizard live. Farmer can connect their hardware in <10 min from signup. Cross-shape dashboard renders mixed sources correctly. Cross-shape Parquet/STAC export produces a coherent archive. |
| 4e | OAuth partner-app flow live. Farming Game integrated as first partner app — registers stations + boundaries, pushes observations + irrigation events, reads corrections. Trust model weighting confirmed in retrain logs. Nightly retrain promotes a new model with measurable accuracy gain. |
| 5+ | Each new layer ships with a STAC collection, a tile route, and a demo notebook. |
| 6 | Stripe-billed paid customer pulls bulk Parquet. Grant application submitted (preferably both NASA-CSDA L-Band cal/val and NIFA open ag data infra). |
Critical files / paths
This plan currently lives at /Users/willmachugh/.claude/plans/we-were-working-on-lovely-widget.md and the prior revision at /Users/willmachugh/.claude/plans/glittery-stirring-origami.md. After approval, copy into the new umbrella repo as mud2dust/PLAN.md (or mud2dust/.claude/plans/mud2dust-plan.md to mirror farminggame convention) and mark farminggame/.claude/plans/openagdata-architecture.md superseded with a pointer here.
Files to be created during Phase 0:
- mud2dust/PLAN.md (copy of this file)
- mud2dust/README.md
- mud2dust/LICENSE (MIT)
- Four sub-repos under the GitHub org: sensor-bridge, pipeline, titiler, site.
No existing functions or utilities to reuse — mud2dust/ is empty. Phase 13 references in farminggame/.claude/plans/farminggame-plan.md need to be retitled per §11 once this plan is approved.