Scientific plan

The full strategy doc, kept in the repo so it evolves with the work. The status banner below is auto-generated from BUILD-PROGRESS.md on every build.

Build status

17 done · 19 open · 47% complete

Foundations (new — gates AWS deploy + local-dev verification)

  • F.1 ~~Local-dev Docker Compose stack~~ **Reverted.** AWS-only chosen; Docker artifacts removed. Was in cc87ef9, reverted in
  • F.2 `infra/` AWS CDK app skeleton (TypeScript) — empty `Mud2dustCore-dev` stack, `cdk.json` config, `aws-cdk-lib` + `constructs` deps, `cdk:synth` clean, `cdk:bootstrap` script for us-west-2 + us-east-1. _Done in
  • F.3 ~~`docs/local-dev.md`~~ **Reverted.** Replaced by F.4 (AWS-direct dev guide). Was in cb00d70, removed in the dev-mode-sw
  • F.5 `infra/` NEW `Mud2dustDev` stack — Aurora Serverless v2 cluster (Postgres 15, `min_capacity=0` auto-pause, `max_capacity=2`, publicly accessible, SG locked to dev IP), master credentials in Secrets Manager `mu
  • F.4 `docs/dev-setup.md` AWS-direct developer setup: prereqs, AWS + Tiger CLI auth, connecting to Aurora + Timescale Cloud, running sensor-bridge locally, common operations, troubleshooting. _Done in d19bf79._
  • F.6 `docs/deployment.md` step-by-step prod deploy: account split, region rationale, stack catalog, `cdk deploy` per stack, secret rotation, rollback, teardown, cost levers, Phase 4 cutover checklist. _Done in e97e391; updated

Phase 0 scaffolding

  • P0.1 Top-level monorepo scaffolding: `pnpm-workspace.yaml`, `.nvmrc`, `.editorconfig`, `tsconfig.base.json`, root `package.json` workspace config, sub-dir READMEs (`sensor-bridge/`, `pipeline/`, `titiler-deploy/`, `site/`, `infra/`). _Do
  • P0.2 `sensor-bridge` TypeScript + Hono scaffold: `package.json`, `tsconfig.json`, eslint, vitest. Health endpoint + base router. _Done in d55bc47._
  • P0.3 `sensor-bridge` OpenAPI 3.1 spec for v1 endpoints + types via `openapi-typescript` + `/v1/openapi.json` route serving the spec. _Spec in bdc7a5e (`sensor-bridge/openapi/v1.yaml`); type gen + route wiring + tests in 5
  • P0.4 `sensor-bridge` SQL migrations split across Aurora (relational + spatial) and Timescale Cloud (hypertables). Aurora: contributors, stations, boundaries, annotations, samples, collections, events_meta. Timescale: obse
  • P0.5 `sensor-bridge` domain types (re-exported from generated OpenAPI types), trust-model utility (`sensor_class × operator_class × installation_quality → training_weight`), coordinate fuzzing (deterministic seeded jitter
  • P0.6 `sensor-bridge` vendor-adapter framework + registry + generic-json webhook adapter + YDOC adapter (parses VWC/TEMP/EC channel names → soil_vwc/soil_temp/soil_ec at depth) + HMAC verify/sign helper. _Done in 84d6d48._

Phase 1 — Anchor + one satellite layer

  • P1.1 `infra` core resources substantially covered by F.5: Aurora SS v2 + dev S3 buckets (`mud2dust-cogs-dev`, `mud2dust-contributions-dev`) + Secrets Manager namespace (`mud2dust/dev/*` for service creds, `contributor/{id}/*` fo
  • P1.2 `pipeline` Sentinel-1 ingest **as a CLI** — `mud2dust_pipeline` Python package, `CmrClient` against NASA CMR, OPERA RTC-S1 source, EDL auth (`9f576ca`), mosaic + COG writer (`5b5dfb1`), HTTPS downloader + end-to
  • P1.6 (NEW, formerly P1.2b iter 4) `pipeline` Lambda packaging — Dockerfile based on AWS Lambda Python 3.12 with GDAL/rasterio, lambda_handler wrapping the CLI ingest logic, CDK `PipelineStack` with the DockerImageFunction + EventBridg
  • P1.3 `titiler-deploy` container Lambda (Python 3.12 ARM_64) running titiler.core + Mangum, CDK `Mud2dustTitilerStack` with HTTP API + CloudFront + tile-friendly cache policy. Real OPERA σ⁰ COG served as PNG tiles end-to-en
  • P1.4 `site` Next.js 15 App Router scaffold, public landing page at `/`, Mapbox GL JS map (basemap = Mapbox Satellite Streets), **layer-registry pattern** at `site/src/layers/registry.ts` with the σ⁰ entry (raster · **iter 1 done in 215399a:** local Next.js + Mapbox + layer-registry + credits-registry + LayerPanel with Sources block per card and "Built on open data — credits & thanks" footer rendering all six
  • P1.5 `sensor-bridge` `/v1/observations` and `/v1/stations` POST handlers wired to Aurora (stations) + Timescale Cloud (observations); 74/74 tests pass; smoke against real DBs verified — POST station returns 201 with jitte
  • P1.7 `dev-site` Next.js 15 static export at `mud2dust.dev`, plus brand mark + OG card on **both** site + dev-site so iMessage / WhatsApp / Slack / Twitter / Android+iOS share previews + home-screen banners all render · **OpenAPI explorer (Scalar) for `/api/`** is queued as a follow-up — the static page lists endpoints today; the interactive renderer lands when sensor-bridge serves the OpenAPI spec at `https://api
  • P1.8 API versioning hardening: `src/versions.ts` is the single source of truth (id, label, status, sunsetDate, successor, openapi URL). Sensor-bridge serves `GET /v1/version` returning `{current, all[]}`. `src/middleware/deprecatio
  • P1.9 AWS SES for `mud2dust.com`: CDK `SesStack` provisioning (a) SES domain identity for `mud2dust.com`, (b) DKIM key auto-rotated, (c) configuration set with bounce/complaint event tracking, (d) IAM role granting `ses:SendEmail` to
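P1.8's registry can be sketched as plain data plus a lookup. A hypothetical Python rendering of the shape `src/versions.ts` describes (the real implementation is TypeScript; field defaults here are illustrative):

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ApiVersion:
    """One registry row: id, label, status, sunsetDate, successor, openapi URL."""
    id: str
    label: str
    status: str                       # "current" | "deprecated" | "sunset"
    sunset_date: Optional[str] = None
    successor: Optional[str] = None
    openapi_url: str = "/v1/openapi.json"

VERSIONS = [ApiVersion(id="v1", label="v1", status="current")]

def version_payload() -> dict:
    """Response shape served by GET /v1/version: {current, all[]}."""
    current = next(v.id for v in VERSIONS if v.status == "current")
    return {"current": current, "all": [asdict(v) for v in VERSIONS]}
```

Keeping the registry as data means the deprecation middleware and the `/v1/version` handler read the same rows, so a sunset date never has to be updated in two places.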

Phase 2 — Multi-source fusion

  • P2.1 `pipeline` Earthdata Login → STS Lambda layer (token exchange).
  • P2.2 `pipeline` HLS Sentinel-2 ingest Lambda — NDVI/NDWI/NDRE COGs. **Adds layer cards** to `site/src/layers/registry.ts` for each (NDVI = greenness/canopy density, used to gate σ⁰ sensitivity to soil; NDWI = canopy
  • P2.3 `pipeline` HLS Landsat ingest Lambda — thermal LST + reflectance. **Adds LST layer card** (surface temperature; combines with σ⁰ in the fusion model — drier soils heat faster mid-day).
  • P2.4 `pipeline` HRRR ingest Lambda (us-east-1) — 7-day rolling precipitation-accumulation COG. **Adds precip layer card** (precipitation history is the strongest single predictor of VWC at the surface; the fusion mod
  • P2.5 `pipeline` static-priors one-shots — POLARIS (UC Davis), Copernicus DEM. **Adds prior layer cards** (soil texture from POLARIS sets retention capacity; topographic wetness index from DEM modifies drainage assump
  • P2.6 `pipeline` fusion-model training scaffold (scikit-learn RF) + serving Lambda (per-pixel inference). **Adds OPERA RTC-STATIC local incidence angle (LIA) as a model input** + **adds the calibrated VWC layer card**

Phase 3 — Open public API + dashboard

  • P3.1 `pipeline` WAF web ACL stack, attribution headers via CloudFront response-headers policy.
  • P3.2 `pipeline` pgstac on Aurora — STAC catalog populated for `mud2dust-sigma0`, `mud2dust-moisture-rootzone`, `mud2dust-ndvi`, etc.
  • P3.3 `site` public dashboard at `/` — calibrated map browse via Mapbox GL JS + tile API, date picker, expanded LayerPanel (now showing all Phase-2 layers), attribution.
  • P3.4 `site` STAC catalog browse page (`/stac`) — collection list + per-item view.
  • P3.5 `site` AOI extract UI — paste GeoJSON or draw bbox, submit to `/v1/extract`, poll for result.
  • P3.6 `site` Auth.js (NextAuth v5) wiring with email magic link + ORCID OAuth providers. User signup → `pk_view`.
  • P3.7 `site` contributor dashboard skeleton at `/dashboard` — empty-state guidance, "connect a sensor" CTA.

Cross-cutting

  • X.1 `.gitlab-ci.yml` lint + test pipelines per subdir.
  • X.2 Root README updated with monorepo structure + per-subdir how-to-run.
  • X.3 `docs/deployment.md` _moved earlier as F.6._
  • X.4 `docs/local-dev.md` _moved earlier as F.3._

mud2dust — strategy plan (rev. 2026-05-06)

Status: plan stage. Empty repo. Domains secured. Hardware in hand. METER meeting scheduled 2026-05-07. Phase 0 implementation begins after METER meeting and AWS/repo scaffolding.

Brand: mud2dust is locked. Domains owned: mud2dust and mudtodust across .com, .net, .org, .io, .dev, .ag, .farm, .ai, .earth, .co (20 domains total). Primary mud2dust.com. Defensive coverage is comprehensive — no further TLD acquisition needed.

Hardware in hand: 6× Sentek 36" drill-and-drop probes (multi-depth VWC + temp + EC) on YDOC ML-417ADS data loggers. Plus Tempest weather stations and Vaisala WXT520. Campbell Scientific in conversation. METER hardware to be purchased; logger choice (YDOC vs ZENTRA Cloud) deferred per deployment.


Context

mud2dust is two products built on one platform, deliberately:

  1. An open, calibrated, high-resolution soil-moisture map of US agricultural land — daily 5–30m VWC raster fused from satellite + soil-texture priors + a contributor probe network + (when available) airborne L-Band SAR cal/val campaigns.
  2. A multi-vendor contribution platform — farmers, researchers, hobbyists, and partner applications connect not just soil moisture and weather time-series but seven first-class data shapes: observations, profiles, samples, events, collections, annotations, and boundaries. mud2dust normalizes, stores, visualizes, and exposes all of it via API. High-trust contributions train the calibration model. Everyone benefits from the calibrated output.

The platform is what makes the map possible; the platform is also the immediate carrot. Users can browse the public map for free without signing up; contributors sign up to connect their hardware, lab samples, drone flights, agronomist annotations, or field boundaries; partner applications (Farming Game first) integrate via OAuth.

Two existing planning documents in the sibling farminggame project describe overlapping pieces. mud2dust unifies them; Farming Game becomes the first OAuth partner application. The 6× Sentek deployment originally scoped under farminggame Phase 13a becomes mud2dust's anchor training stations.

This plan supersedes both source documents on the soil-map question. Farming Game's Phase 13 should be retitled to "Integrate with mud2dust": deploy Sentek + Vaisala + Tempest sensors as anchor stations, register them via mud2dust's OAuth API, replace AGROMONITORING_API_KEY with mud2dust API calls, drop the in-product SAR pipeline.


1. Vision

An open, calibrated, high-resolution soil-moisture map of US agricultural land — free for research and small operators, paid only at the bulk-extraction tier — built on a multi-shape contribution platform that any farmer, researcher, drone operator, or partner app can connect to.

1a. The map

Daily 5–30m volumetric water content (VWC) raster, fused from Sentinel-1 SAR backscatter + Sentinel-2 vegetation indices + soil-texture priors + the contributor station network. Around it, the same infrastructure serves NDVI/NDWI/NDRE, thermal LST + ET, precipitation, static priors. Pitch the soil moisture map; ship the rest because the pipeline already passes it.

1b. The platform

Multi-shape, multi-vendor contribution surface — contributors connect ZENTRA Cloud, WeatherFlow, WeatherLink, FieldClimate accounts; push from raw loggers (YDOC, Campbell, Davis); upload drone or aircraft scenes; submit lab sample results; mark field boundaries; add agronomist annotations. Cross-source unified dashboard. Per-contribution provenance and trust scoring. Free for everyone. Partner applications integrate via OAuth.

1c. The cal/val story

Distributed L-Band drone SAR campaigns over instrumented fields create triple-validated calibration anchors (in-situ probes + airborne L-Band + Sentinel-1 C-Band). This is SMAPVEX-class cal/val data, distributed across contributors instead of NASA campaigns only. Distinct, fundable research narrative for NASA-CSDA / NSF.

Pitch — to map users

Stop paying $300–1500/mo for someone else's wrapper around free public data. Use the same data, calibrated against ground truth, with an open governance model and an attribution-only license.

Pitch — to platform contributors

One place to see all your soil and weather sensors regardless of vendor — and your drone flights, lab results, field boundaries, and agronomist notes alongside them. We handle the protocols. If you have well-installed research-grade hardware or calibrated airborne instruments, your data improves the public moisture model and you get higher API tiers. If you don't, you still get the dashboard, the cross-source exports, and the calibrated companion data — free.

Pitch — to partner-app developers

Integrate once via OAuth, get sensor data, drone uploads, and calibrated outputs for any user who connects to mud2dust. No per-vendor adapter code in your app. Same tokens give your users access to their own data and to the public model.

Why "open" matters strategically (not ideologically)

  1. Network effects on calibration. Every contribution makes the model better for everyone in that soil/canopy class. Closed competitors can't accept arbitrary contributor data because their license forbids redistributing improvements.
  2. Default citation status. Academic papers cite open infrastructure (OpenStreetMap, Zenodo, OpenET). Once cited 100 times, you're permanent.
  3. Grant and consortium funding. NSF, USDA NIFA, NRCS Conservation Innovation Grants, NASA Western Water Applications all fund open-data projects. None fund closed SaaS.
  4. Vendor-agnostic platform attracts users locked into single-vendor portals. Onset HOBOlink, ZENTRA, WeatherFlow each lock data in. mud2dust attacks the lock-in.

Competitive landscape

  • Wrappers around free satellite data — Agromonitoring, EOS Crop Monitoring, OneSoil. Thin convenience layers.
  • Single-vendor sensor platforms — Onset HOBOlink, METER ZENTRA Cloud, WeatherFlow Tempest, Davis WeatherLink. Each is a closed silo per vendor.
  • Drone data platforms — DroneDeploy, Pix4D, OpenDroneMap (open). Imagery processing and storage; not a calibration target, not a multi-shape platform.
  • Real proprietary products — Climate FieldView, Granular. These hold genuinely private data and bundle it. Not the target.

2. Why this is feasible now (and not in 2018)

2a. AWS Open Data Sponsorship (matured ~2020). AWS pays storage + intra-region egress.

2b. COG / STAC tooling (matured ~2023). GDAL 3.x, rasterio 1.3+, titiler, rio-tiler, pgstac, pystac-client. STAC is the lingua franca for episodic raster, including drone data — the same catalog handles satellite scenes and contributor-uploaded drone flights.

2c. Indigo Ag's RTC bucket. Indigo invested ~$2M/year preprocessing Sentinel-1 to terrain-corrected γ⁰ COGs in AWS Open Data. ~80% of the satellite engineering work is already done.

2d. Ground-truth network is bootstrappable. Sentek anchor stations at JMR + per-contributor data + (when available) METER research network + episodic L-Band drone campaigns. Multiple sources of training data, not one vendor.

2e. Drone L-Band SAR is now commercially available. ImSAR, AeroVironment, others. Was $5M+ aircraft-only in 2018; now sub-$100K drone-mounted. Distributed cal/val is feasible.


3. Architecture

3a. System diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                    SATELLITE INGEST (Fargate / Lambda)                  │
│  Sentinel-1, Sentinel-2/HLS, Landsat, HRRR, NEXRAD, GOES, SMAP,         │
│  ECOSTRESS, MODIS, TROPOMI, POLARIS, DEM, SSURGO, OpenET, etc.          │
└──────────────────────────────────┬──────────────────────────────────────┘
                                   ▼
                 PROCESSING: σ⁰ → VWC, NDVI/NDWI, fusion model, etc.
                                   ▼
                      S3 (us-west-2) — output COGs
                                   ▼
                CloudFront + titiler + STAC catalog (pgstac/RDS)
                                   ▼
        ┌──────────────────┬──────────┴──────────┬──────────────────┐
        ▼                  ▼                     ▼                  ▼
   Public web        Python/R/QGIS         Partner apps      Contributor
   (free map browse) via STAC              via OAuth         dashboards

┌──────────────────────────────────────────────────────────────────────────┐
│                CONTRIBUTION BRIDGE (mud2dust/sensor-bridge)              │
│                                                                          │
│  Seven first-class object types — one auth/trust/privacy stack          │
│                                                                          │
│  Observation  Profile  Sample   Event    Collection  Annotation Boundary │
│  (time-       (multi-  (lab/    (drone/  (multi-     (geotag    (field   │
│   series      depth)   discrete  flight)  flight       note)    bounds)  │
│   point)               sample)            survey)                        │
│      │         │         │        │         │           │         │      │
│      ▼         ▼         ▼        ▼         ▼           ▼         ▼      │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │         Adapter registry (per vendor × per shape)               │   │
│  │  YDOC, Campbell, ZENTRA Cloud, WeatherFlow, WeatherLink,        │   │
│  │  FieldClimate, Onset, generic JSON, generic CSV/SFTP,           │   │
│  │  drone-COG-upload, soil-lab-CSV, shapefile/GeoJSON,             │   │
│  │  CRNP, AmeriFlux pull, ...                                      │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                          ▼                                              │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │  Intake gateways:                                                │   │
│  │   • HTTPS webhook  • SFTP server   • MQTT broker                │   │
│  │   • Presigned-multipart S3 upload (for Events)                  │   │
│  │   • OAuth-pull workers (per contributor token)                  │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                          ▼                                              │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │  Per-shape storage:                                              │   │
│  │   Observation/Profile → TimescaleDB                             │   │
│  │   Sample             → Postgres (relational)                    │   │
│  │   Event/Collection   → S3 + pgstac                              │   │
│  │   Annotation         → Postgres + PostGIS                       │   │
│  │   Boundary           → Postgres + PostGIS                       │   │
│  │  + Secrets Manager (per-contributor vendor tokens)              │   │
│  └─────────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────────┘
                                ▲                ▲
                                │                │
              Nightly retrain ──┘                └── Calibration consumer
              (training-tier contributions)         (everyone gets corrections)

3b. Region rationale (two AWS regions, two worker pools)

  • us-west-2 — most Tier 1 + Tier 2 satellite buckets (~70%); also primary for sensor-bridge, dashboards, and APIs.
  • us-east-1 — entire NOAA Big Data Program (~25%); workers only.
  • Tier 3 (non-AWS) — POLARIS at UC Davis, SoilGrids at ISRIC, USDA CropScape FTP, OpenET via Earth Engine, TROPOMI via Copernicus.

Intra-region S3 reads are free; cross-region reads cost $0.02/GB.

3c. Worker types

| Type | Use case | Limits | Cost shape |
| --- | --- | --- | --- |
| Lambda | Per-scene index calc, COG mosaicking, STAC, intake webhooks, sensor pull workers | 15 min, ~10 GB RAM, 5 GB ephemeral | Pay-per-invoke |
| Fargate | Cross-region pulls, large mosaicking, GDAL native deps, drone preprocessing, retraining, historical backfill | Any duration, any RAM | Per-second billing |
| EC2 spot | Only if Fargate gets expensive | Spot interruption | Cheaper at sustained load |

Failure isolation rule: each satellite source and each adapter (vendor × shape) gets its own Lambda or Fargate task, triggered independently by EventBridge. SMAP outage doesn't block Sentinel-1; ZENTRA outage doesn't block YDOC push; a stuck drone-preprocessing job doesn't block Observation ingest.

3d. Processing layer

  1. σ⁰ → VWC index per pixel. Empirical model trained against high-trust contributions (Sentek anchor + verified researchers + L-Band drone campaigns + METER if/when accessible).
    • v1: linear index = `(sigma0 - sigma0_dry) / (sigma0_wet - sigma0_dry)`.
    • v2: scikit-learn random forest trained only on training-tier contributions.
    • v3+: XGBoost or small NN; Fargate or SageMaker.
  2. NDVI / NDWI / NDRE / EVI. Trivial pixel math from HLS reflectance.
  3. ET ensemble. Consume OpenET for v1.
  4. Per-pixel calibration. Conditions on POLARIS soil texture, HRRR precipitation, DEM topographic wetness index. The research contribution that justifies the project's existence.
  5. Correction surfaces. For every contributor station, every nightly retrain produces (satellite_estimate, ground_truth, delta) series. Surfaced on the contributor's dashboard regardless of trust tier.
  6. Drone L-Band cal/val anchors. When a drone Event is tagged as L-Band SAR with proper radiometric cal, the preprocessor extracts pixel-level σ⁰ over the flight extent and timestamps it; nightly retrain treats those as high-weight anchor scenes, similar to in-situ profiles.
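The v1 index in step 1 is simple pixel math over the σ⁰ raster. A sketch with illustrative dry/wet endmember values — in practice the endmembers would be fit per pixel or per soil class, not hard-coded:

```python
import numpy as np

def vwc_index_v1(sigma0_db, sigma0_dry=-18.0, sigma0_wet=-6.0):
    """v1 linear soil-moisture index from σ⁰ backscatter (dB).

    Maps σ⁰ linearly between dry and wet reference endmembers onto [0, 1].
    The default endmembers here are illustrative placeholders only.
    """
    idx = (np.asarray(sigma0_db, dtype=float) - sigma0_dry) / (sigma0_wet - sigma0_dry)
    return np.clip(idx, 0.0, 1.0)   # clamp out-of-range scenes
```

Because it is pure ufunc math, the same function applies to a scalar, a tile, or a full mosaic array unchanged, which is what makes the v1 model a cheap Lambda workload.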

3e. Output bucket structure

s3://mud2dust-cogs/
  ├── moisture-surface/   moisture-rootzone/
  ├── ndvi/  ndwi/  ndre/  lst/  et-daily/  precip-7d/  ...
  ├── sigma0/                          (raw, 90-day retention)
  ├── priors/
  │   ├── polaris/ soilgrids/ ssurgo/ dem-30m/ (static, indefinite)
  └── ...

s3://mud2dust-contributions/
  ├── events/{contributor_id}/{flight_id}/scene.tif    (drone, aircraft)
  ├── collections/{contributor_id}/{survey_id}/
  ├── samples-attachments/{contributor_id}/{sample_id}/lab-report.pdf
  └── boundaries/{contributor_id}/{boundary_id}.geojson

Lifecycle: raw satellite layers 90d, fused/calibrated indefinite, priors indefinite. Contributor raster Standard 90d → IA → Glacier IR (policy details deferred per §12l).
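The contributor-raster tiering above maps directly onto an S3 lifecycle configuration. A sketch only — §12l defers the real policy, so the prefix, day counts, and rule ID below are all placeholders (the dict shape matches the S3 `PutBucketLifecycleConfiguration` API):

```python
# Hypothetical lifecycle rules for s3://mud2dust-contributions.
# Values are placeholders; the actual policy is deferred per §12l.
CONTRIB_LIFECYCLE = {
    "Rules": [
        {
            "ID": "contrib-raster-tiering",
            "Filter": {"Prefix": "events/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "STANDARD_IA"},   # Standard 90d → IA
                {"Days": 365, "StorageClass": "GLACIER_IR"},   # → Glacier Instant Retrieval
            ],
        }
    ]
}
```

Expressing the policy as data keeps it reviewable in the repo and deployable via CDK or `put_bucket_lifecycle_configuration` without hand-editing the console.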

3f. Tile serving — titiler on Lambda + CloudFront

Standard pattern. Cold start ~500 ms, warm ~30 ms per tile. Two-tier caching (CloudFront edge + titiler render). The same titiler also serves contributor-uploaded drone COGs that opted into public visibility — `/tiles/contrib/{event_id}/{z}/{x}/{y}.png`.

3g. STAC catalog — pgstac on Aurora Serverless v2

Sub-100ms search. Two collection types:

  • mud2dust-{layer} — official calibrated layers (anonymous public)
  • contrib/{contributor_id}/{collection_id} — contributor-uploaded Events and Collections (public if opted; private otherwise)

The same pgstac instance handles both — STAC was designed for this.

3h. Contribution bridge — seven first-class object types

One repo (mud2dust/sensor-bridge), MIT-licensed. All seven shapes share auth, multi-tenancy, trust scoring, privacy controls, and adapter registry. Per-shape modules differ only in schema and validators.

| Object type | Covers | Schema basis | Storage | Ingest |
| --- | --- | --- | --- | --- |
| Observation | Time-series point readings (soil moisture, weather, CRNP, GNSS-IR, sap flow, eddy-cov fluxes, stream gauge) | OGC SensorThings (Observation) | TimescaleDB | HTTPS push, REST pull, SFTP, MQTT |
| Profile | Multi-depth time-series (Sentek 9-depth, soil temp profiles, lysimeters) | SensorThings extended with depth_cm[] | TimescaleDB | HTTPS push, REST pull |
| Sample | Discrete lab results (gravimetric VWC, bulk density, texture, OM, plant tissue, LAI) | Custom (sample location/depth/method/lab) | Postgres + S3 attachments | HTTPS POST + optional PDF/CSV upload |
| Event | Single drone flight, single aircraft pass, single PhenoCam capture, irrigation event, planting event | STAC Item | S3 (raster) + pgstac | Presigned S3 multipart + STAC item POST |
| Collection | Multi-flight drone survey, multi-pass aircraft campaign, weekly PhenoCam series | STAC Collection | pgstac | POST /v1/collections then add Events |
| Annotation | Geotagged agronomist note, citizen-science observation, photo with timestamp + location + free text | GeoJSON Feature (structured properties) | Postgres + PostGIS | HTTPS POST |
| Boundary | Field boundary, management zone, EC-mapped zone, irrigation prescription extent | GeoJSON Feature (Polygon/MultiPolygon) | Postgres + PostGIS | HTTPS POST or shapefile/GeoJSON upload |

Common cross-cutting concerns (single implementation, applied to all shapes):

  1. Per-contributor credential vault — AWS Secrets Manager, namespaced contributor/{id}/{vendor}. Encrypted, scoped IAM, rotation supported. Contributor revocation deletes the token; in-flight pulls drain.
  2. Trust model — see §3i. Every contribution lands with sensor_class, operator_class, and computed trust_weight.
  3. Privacy / coordinate fuzzing — every shape has geom_internal (exact, model-internal) and geom_public (jittered ±N km or aggregated). Default fuzzed; opt-in to exact precision per object. Boundaries respect a visibility flag (private / aggregate-only / public).
  4. Provenance — every contribution carries source_adapter, raw_ref (pointer to the original payload in cold storage), submitter_user_id, submitting_app_id (for OAuth-submitted), creation timestamp, last-edit timestamp.
  5. QC — per-shape validators. Observation: range checks, constant-value sensor failure. Profile: depth-monotonicity sanity. Sample: unit + range. Event: COG conformance + radiometric cal flag (for SAR/multispec). Annotation: schema validation. Boundary: topology validity (no self-intersection, valid CRS).
  6. Adapter registry — adapters declare which shapes they can produce. Some adapters output multiple shapes (a Campbell logger emits Observations + Profiles; a drone upload pipeline emits Event + auto-derived Annotations like NDVI summary).
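Concern 3's deterministic fuzzing can be sketched as a seeded, hash-derived offset so the same object always maps to the same public location. A hypothetical Python rendering — the actual P0.5 utility lives in sensor-bridge and is TypeScript; the salt name and radius are illustrative:

```python
import hashlib
import math

def fuzz_coords(lat, lon, object_id, secret_salt, radius_km=2.0):
    """Deterministic jitter: same (object_id, salt) → same offset, within radius_km."""
    digest = hashlib.sha256(f"{secret_salt}:{object_id}".encode()).digest()
    # Two pseudo-uniform draws in [0, 1) from independent halves of the digest.
    u1 = int.from_bytes(digest[:8], "big") / 2**64
    u2 = int.from_bytes(digest[8:16], "big") / 2**64
    r = radius_km * math.sqrt(u1)            # sqrt → uniform over the disc
    theta = 2 * math.pi * u2
    dlat = (r * math.cos(theta)) / 110.574   # ~km per degree of latitude
    dlon = (r * math.sin(theta)) / (111.320 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon
```

Deriving the offset from a hash rather than a stored random number means `geom_public` never drifts between re-ingests, and no per-object jitter state needs to be persisted.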

Vendor adapter set (initial, see §5b for full table):

  • Sentek + YDOC, Campbell HTTP push, Vaisala-on-logger, METER on YDOC or ZENTRA Cloud
  • Tempest WeatherFlow, Davis WeatherLink, FieldClimate, Onset HOBOlink
  • Drone uploads: generic COG, OpenDroneMap-processed output, common drone vendor exports (DJI Terra, Pix4D, DroneDeploy)
  • Lab CSV / NRCS report PDF (Sample)
  • Shapefile / GeoJSON (Boundary)
  • AmeriFlux pull (Observation, eddy-cov fluxes — at training-contributor request)
  • USDA SCAN/SNOTEL pull (Observation)

3i. Trust model — applies to every shape

Two independent dimensions combine into a per-contribution trust_weight:

sensor_class (intrinsic to the source):

| Class | Examples |
| --- | --- |
| research | METER Teros 12 with cal docs; Campbell CS655 fresh-cal; L-Band drone with corner reflectors and radiometric calibration; lab analysis from accredited lab; AmeriFlux eddy-cov tower |
| professional | Sentek drill-and-drop, Vaisala WXT520, agronomist-grade EC mapping, multispectral drone with calibration panel |
| consumer | Tempest, Ambient Weather, basic capacitance probes, hand-flown DJI RGB without panels |
| diy | Arduino sensors, citizen-science photo annotations |

operator_class (who installed/collected/submitted):

| Class | Verification |
| --- | --- |
| researcher | Institutional email + ORCID/ROR |
| agronomist_supported | Self-declared, optional pro reference |
| farmer | Self-declared, address geocodes to ag land |
| hobbyist | Self-declared homeowner |
| unknown | Anonymous |

training_weight = sensor_class × operator_class × installation_quality_factor (the third factor is earned over time — contributions that track neighbors and pass QC accumulate quality; outliers lose it).
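A minimal sketch of that product, assuming illustrative per-class multipliers (the actual factors are not specified in this plan) and the training-tier cutoff from below:

```python
# Placeholder class factors — the real multipliers are a modeling decision.
SENSOR_FACTOR = {"research": 1.0, "professional": 0.8, "consumer": 0.5, "diy": 0.3}
OPERATOR_FACTOR = {"researcher": 1.0, "agronomist_supported": 0.9, "farmer": 0.8,
                   "hobbyist": 0.6, "unknown": 0.4}

def training_weight(sensor_class, operator_class, installation_quality=1.0):
    """trust_weight = sensor_class × operator_class × installation_quality.

    installation_quality in [0, 1] is earned over time via QC history.
    """
    w = SENSOR_FACTOR[sensor_class] * OPERATOR_FACTOR[operator_class] * installation_quality
    return round(w, 3)

def is_training_tier(weight):
    """Training contributions enter the nightly retrain at weight ≥ 0.5."""
    return weight >= 0.5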

Two paths through the system:

  1. Training contributions (training_weight ≥ 0.5): used in nightly model retrain. Few hundred training stations + a few dozen calibrated drone campaigns per year at maturity.
  2. Correction-takers (everyone else): full feature parity, dashboard, exports, calibrated companion data. Not used in retrain, but bulk anomaly signal informs diagnostics.

UX rule: never show users their trust tier as a number or rank. What we surface instead is contribution health — drift detection, cross-source consistency, sensor-failure or scene-quality flags. Helpful, not pejorative.

3j. Identity, auth, and partner apps

Three identity types:

  • User — a person (email + password, optional MFA via Cognito or Auth.js).
  • Organization — farm, lab, agency. Owns contributions collectively.
  • Partner application — third-party app integrating via OAuth 2.0 + PKCE.

OAuth scopes:

| Scope | Allows |
| --- | --- |
| `stations:read` `stations:write` | Manage stations |
| `observations:read` `observations:write` | Time-series I/O |
| `profiles:read` `profiles:write` | Multi-depth time-series I/O |
| `samples:read` `samples:write` | Lab samples |
| `events:read` `events:write` | Drone/aircraft scene I/O (write requires presigned upload flow) |
| `collections:read` `collections:write` | Survey grouping |
| `annotations:read` `annotations:write` | Geotagged notes |
| `boundaries:read` `boundaries:write` | Field/zone vector I/O |
| `corrections:read` | Calibrated companion data |
| `alerts:read` | Frost/anomaly events |
| `public:read` | Public tile/STAC API (no user context) |
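Each endpoint gates on the token's granted scopes. A minimal check, assuming the standard OAuth 2.0 convention of a space-delimited scope string (function names here are illustrative, not the sensor-bridge API):

```python
def has_scope(granted: str, required: str) -> bool:
    """True if the space-delimited granted-scope string covers `required`."""
    return required in granted.split()

def require_scopes(granted: str, *required: str) -> None:
    """Raise if any required scope is missing (surfaced as HTTP 403 in the API layer)."""
    missing = [s for s in required if not has_scope(granted, s)]
    if missing:
        raise PermissionError(f"missing scopes: {' '.join(missing)}")
```

For example, the presigned-upload flow for Events would call `require_scopes(token_scopes, "events:write")` before issuing upload URLs.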

Verification mechanisms:

  • operator_class = researcher: ORCID iD with affiliation; manual review for borderline cases.
  • operator_class = farmer: station coords geocode to USDA-mapped agricultural land (CDL/CSB).
  • operator_class = hobbyist: default.

3k. Where the existing hardware fits

The 6× Sentek 36" drill-and-drop probes on YDOC ML-417ADS go in at JMR as the anchor training station set — multi-depth profiles ideal for calibrating both surface and rootzone VWC outputs. Tempest + Vaisala WXT520 round out on-farm meteorology. METER hardware (planned purchase) lands via either YDOC or ZENTRA Cloud per deployment.

The Sentek drill-and-drop is better training data than the originally-planned single-depth Teros 12 — multi-depth profiles let the calibration model learn rootzone integration directly.

3l. L-Band drone cal/val — the headline research story

When a contributor flies calibrated L-Band SAR over a field that has live in-situ probes during a Sentinel-1 overpass window, three sources of truth are collocated:

  1. In-situ point truth — Sentek profile, METER probes, etc.
  2. Airborne high-res truth — drone L-Band σ⁰ over the flight extent (typically 5–50 cm resolution).
  3. The C-Band layer being calibrated — Sentinel-1 σ⁰ at 10 m.

This is SMAPVEX-class cal/val data, distributed across contributors instead of NASA-campaign-only. The platform's drone Event ingest path treats radiometrically-calibrated L-Band uploads as high-weight anchor scenes in the nightly retrain. NASA-CSDA, NSF-NRT, and DOE Atmospheric Radiation Measurement programs all fund distributed cal/val — this is a fundable research narrative independent of the data-platform story.

Operationally: an L-Band drone Event uploads as a STAC Item with metadata flags `sensor=l_band_sar`, `radiometric_cal_method=corner_reflector`, `cal_target_in_scene=true`, `flight_window_overlaps_s1_overpass=true` (auto-computed from timestamp). The preprocessor verifies the calibration metadata; the contribution lands as `sensor_class=research` if intact, demoted otherwise.
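That metadata gate can be sketched as a simple check over the STAC Item properties. Field names follow the flags listed above; the demotion target (`professional`) is an assumption here, not specified by the plan:

```python
# Calibration flags the preprocessor requires for research-grade L-Band Events.
REQUIRED_CAL_FLAGS = {
    "sensor": "l_band_sar",
    "radiometric_cal_method": "corner_reflector",
    "cal_target_in_scene": True,
}

def classify_lband_event(properties: dict) -> str:
    """Return the sensor_class for an L-Band drone Event's STAC properties.

    'research' only if all calibration metadata is intact; demoted otherwise
    (the demotion target 'professional' is illustrative).
    """
    intact = all(properties.get(k) == v for k, v in REQUIRED_CAL_FLAGS.items())
    return "research" if intact else "professional"
```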

3m. Where Farming Game's role fits

Farming Game is the first OAuth partner application. JMR's stations register through /v1/stations on behalf of the JMR farm user. Field boundaries, irrigation events, and any drone flights JMR runs over its blocks all flow through the same OAuth scopes. Calibrated corrections come back through /v1/stations/{id}/corrections and the AOI extract endpoints. See §11.


4. AWS account structure

Phase 0–1: single account

One AWS account, root locked, MFA on, IAM Identity Center.

Phase 4+: multi-account AWS Organizations

| Account | Role | Notes |
| --- | --- | --- |
| mud2dust-billing | Org root, no workloads | Consolidated billing only |
| mud2dust-prod | Pipeline + tiles + API + bridge | Default region us-west-2 |
| mud2dust-dev | Sandbox, breakable | Same regions as prod |
| mud2dust-data | Output S3 + contributor S3 + Secrets Manager | Defense in depth — compromised compute can't delete archive |

SCPs:

  • Prevent disabling CloudTrail
  • Prevent deleting backup snapshots
  • Restrict deployable regions to us-west-2 and us-east-1
  • Restrict access to mud2dust-data Secrets Manager namespaces by IAM role only

Networking

  • VPC with private subnets per region.
  • Lambda outside VPC for simplicity.
  • Fargate in private subnets with VPC interface endpoints for S3, STS, Secrets Manager, CloudWatch Logs.

Cost guards from day one

  • AWS Budget at $200/mo with alarm at 80%.
  • AWS Cost Anomaly Detection enabled.
  • Tag every resource: project=mud2dust, env=prod|dev, layer=ingest|process|serve|bridge.

5. Ingest job catalog

5a. Satellite + raster ingest

The "first six" — phase 1–2 — get 80% of value:

  1. Sentinel-1 RTC (moisture primary signal)
  2. HLS Sentinel-2 (NDVI, NDWI, NDRE)
  3. HLS Landsat (thermal LST, longer-archive optical)
  4. HRRR (precipitation accumulation, soil temp)
  5. POLARIS (static soil prior)
  6. Copernicus DEM (static topography prior)

Second wave — phase 5: ECOSTRESS, SMAP L4, OpenET, GOES-18 thermal, VIIRS Active Fire. Third wave — phase 6+: TROPOMI, GEDI, MODIS LST, GPM IMERG, USDA CDL/CSB, SSURGO/gNATSGO.

| Source | Tier | Cadence | Worker | Output | Retention |
| --- | --- | --- | --- | --- | --- |
| Sentinel-1 RTC | 1 (anonymous S3) | 6-day; nightly STAC poll | Lambda | σ⁰ + VWC COG | 90d raw, ∞ fused |
| Sentinel-2 (HLS) | 2 (Earthdata) | 2–3 day; daily | Lambda + auth | NDVI/NDWI/NDRE COG | 90d raw |
| Landsat 8/9 (HLS-L30) | 2 | 8-day; daily | Lambda + auth | thermal LST + reflectance | 90d |
| HRRR | 1 | hourly | Lambda | precip-7d + soil temp COG | 30d |
| NEXRAD MRMS | 1 | 5-min (sample hourly) | Lambda | precip-hr COG | 30d |
| GOES-18 thermal | 1 | 10-min (sample 30-min) | Lambda | LST COG | 14d |
| SMAP L4 | 2 | daily | Lambda + auth | regional VWC reference | 30d |
| ECOSTRESS | 2 | irregular ISS revisit | Lambda + auth | high-res LST scenes | 90d |
| MODIS LST + Snow | 2 | daily | Lambda + auth | continuity gap-fill | 30d |
| Sentinel-5P TROPOMI | 3 (Copernicus) | daily | Fargate (cross-region) | air quality / methane | 30d |
| Copernicus DEM | 1 | one-shot | Lambda | static prior | indefinite |
| POLARIS | 3 (UC Davis HTTP) | one-shot, refresh annually | Fargate | static soil prior | indefinite |
| SSURGO / gNATSGO | 3 (USDA NRCS) | annual | Fargate | static soil prior | indefinite |
| USDA CDL / CSB | 3 (USDA FTP) | annual | Fargate | crop classification | indefinite |
| OpenET | 3 (GEE/REST) | monthly | Lambda + GEE auth | ET CONUS | 12 months |
| GPM IMERG | 2 | 30-min (sample hourly) | Lambda + auth | global precip | 14d |

5b. Contribution bridge — adapters by shape

| Adapter / source | Shapes produced | Direction | Protocol |
| --- | --- | --- | --- |
| Sentek + YDOC ML-417 | Observation, Profile | Push | HTTPS POST (HMAC-signed) |
| Campbell Scientific | Observation, Profile | Push | HTTPS POST or SFTP (CRBasic template) |
| Vaisala WXT520 | Observation | Push | via host logger (Campbell adapter) |
| METER on YDOC | Observation, Profile | Push | YDOC HTTP POST |
| METER on ZENTRA Cloud | Observation, Profile | Pull | REST (per-contributor token) |
| Tempest WeatherFlow | Observation | Pull | REST + UDP local |
| Davis WeatherLink | Observation | Pull | REST (per-contributor token) |
| FieldClimate (METOS) | Observation, Profile | Pull | REST (per-contributor token) |
| Onset HOBOlink | Observation | Pull | REST (per-contributor token) |
| WeatherBug | Observation (regional) | Pull | REST (regional context, not on-farm) |
| Generic JSON webhook | Observation | Push | HTTPS POST |
| Generic CSV/SFTP | Observation, Sample | Push | SFTP |
| AmeriFlux pull | Observation (eddy-cov) | Pull | REST (training-contributor request) |
| USDA SCAN/SNOTEL pull | Observation, Profile | Pull | NRCS Awdb API |
| CRNP / COSMOS-USA | Observation | Pull | REST (where exposed) |
| Soil-lab CSV / PDF | Sample | Push | HTTPS POST + S3 multipart for attachment |
| Drone COG upload | Event | Push | Presigned S3 multipart + STAC POST |
| OpenDroneMap output | Event | Push | Same as above; auto-detect outputs |
| DJI Terra / Pix4D / DroneDeploy export | Event | Push | Presigned S3 multipart + STAC POST |
| Aircraft campaign upload | Event, Collection | Push | Presigned S3 multipart + STAC POST |
| PhenoCam-style series | Collection (auto-grouped Events) | Push | Periodic HTTPS POST |
| Shapefile / GeoJSON | Boundary | Push | HTTPS POST or multipart upload |
| Agronomist note (mobile) | Annotation | Push | HTTPS POST (mobile/web) |
| Citizen-science photo | Annotation | Push | HTTPS POST (multipart with EXIF) |

Failure handling: every adapter emits CloudWatch metric IngestSuccess{adapter=X, contributor=Y, shape=Z}. Alarm on >2 consecutive failures. Failed contributions DLQ'd to SQS for retry. Contributor sees a status indicator per source on their dashboard.

Earthdata auth wrapper (Tier 2 satellite): built once as a shared Lambda layer.
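Several push adapters above sign their HTTPS POSTs with HMAC. A minimal verification sketch in TypeScript (Node), assuming HMAC-SHA256 over the raw body and a hex-encoded signature — the actual header name and encoding are not yet specified in this plan:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify an HMAC-SHA256 signature over a raw request body using the
// contributor's shared secret. Constant-time comparison avoids leaking
// how many signature bytes matched.
function verifyHmac(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const provided = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so guard first.
  if (provided.length !== expected.length) return false;
  return timingSafeEqual(provided, expected);
}
```

In the real bridge the secret would come from the per-contributor Secrets Manager namespace (§10), and verification must run against the raw bytes, before any JSON parsing re-serializes the body.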


6. Output products / API surface

6a. Public raster tile API (anonymous tier)

GET /tiles/{layer}/{date}/{z}/{x}/{y}.{png|webp}?colormap=...

Standard XYZ tiles. Default PNG; WebP optional; raw GeoTIFF via ?format=tif. CloudFront long TTL on (layer, date). WAF rate limit 600 req/hr per IP. Attribution headers on every response.
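Since the tile endpoint is a plain XYZ template, a client-side URL helper is trivial. The `api.mud2dust.com` host matches the DNS plan in §14; the helper itself is a sketch, not a published SDK:

```typescript
// Build a public tile URL from the XYZ template above.
function tileUrl(
  layer: string,
  date: string, // ISO date, e.g. "2026-04-28"
  z: number, x: number, y: number,
  opts: { format?: "png" | "webp"; colormap?: string } = {},
): string {
  const ext = opts.format ?? "png"; // PNG is the documented default
  const qs = opts.colormap ? `?colormap=${encodeURIComponent(opts.colormap)}` : "";
  return `https://api.mud2dust.com/tiles/${layer}/${date}/${z}/${x}/${y}.${ext}${qs}`;
}
```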

6b. Public STAC catalog (anonymous tier)

GET /stac/collections                          → list official + public-opted contrib collections
GET /stac/collections/{id}/items?bbox=...      → search
GET /stac/collections/{id}/items/{id}          → single item (links presigned COG, 1-hour TTL)

6c. AOI extraction API (authenticated free tier)

POST /v1/extract
{ "geom": <GeoJSON or boundary_id>, "layer": "moisture-rootzone",
  "from": "2026-04-01", "to": "2026-04-28", "format": "parquet" }
→ 202 with job_id; poll /v1/jobs/{job_id}

Async via Step Functions. The geom field accepts a contributor's saved Boundary by ID — natural ergonomics for "extract over my field."
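The async flow can be sketched client-side: build the request body, POST it, then poll `/v1/jobs/{job_id}` with backoff. The field names mirror the example request above; the backoff schedule is an assumption:

```typescript
// Request body for POST /v1/extract. geom accepts inline GeoJSON or a
// saved boundary_id string, per the spec above.
interface ExtractRequest {
  geom: object | string;
  layer: string;
  from: string;
  to: string;
  format: "parquet" | "csv";
}

// Build an extract request over a saved Boundary ("extract over my field").
function buildExtractRequest(
  boundaryId: string, layer: string, from: string, to: string,
): ExtractRequest {
  return { geom: boundaryId, layer, from, to, format: "parquet" };
}

// Hypothetical poll schedule for /v1/jobs/{job_id}: 1s, 2s, 4s, ... capped at 30s.
function backoffMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 30_000);
}
```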

6d. Public station / event browse (anonymous tier)

GET /v1/public/stations?bbox=...               → public-opted stations (jittered coords)
GET /v1/public/stations/{id}/observations?...
GET /v1/public/events?bbox=...&type=drone       → public-opted Events

6e. Contributor dashboard API (authenticated user tier)

GET /v1/me/stations                             → all my stations across vendors
GET /v1/me/stations/{id}/observations?...
GET /v1/me/stations/{id}/profiles?...
GET /v1/me/samples?...
GET /v1/me/events?...
GET /v1/me/collections?...
GET /v1/me/annotations?...
GET /v1/me/boundaries?...
GET /v1/me/corrections?stations=...&from=...    → satellite vs station companion
GET /v1/me/exports?shapes=...&format=parquet    → cross-shape unified export
GET /v1/me/alerts                                → frost / anomaly / contribution-health

6f. Partner-app API (OAuth scoped tokens)

Endpoints for each shape, all gated by OAuth scopes from §3j:

POST /v1/stations            → register a station         (scope: stations:write)
POST /v1/observations        → push time-series readings  (scope: observations:write)
POST /v1/profiles            → push multi-depth readings  (scope: profiles:write)
POST /v1/samples             → submit a lab sample        (scope: samples:write)
POST /v1/uploads/initiate    → start a presigned upload   (scope: events:write)
POST /v1/uploads/complete    → finalize, create STAC item (scope: events:write)
POST /v1/collections         → group flights/passes       (scope: collections:write)
POST /v1/annotations         → geotagged note             (scope: annotations:write)
POST /v1/boundaries          → field/zone vector          (scope: boundaries:write)
GET  /v1/stations/{id}/corrections    → calibrated companion (scope: corrections:read)
GET  /v1/alerts              → frost / anomaly events     (scope: alerts:read)

The in-product onboarding wizard is a thin shell over these endpoints — the API is the product surface, not an afterthought.
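Since the partner-app flow is standard OAuth 2.0 + PKCE (§10), the verifier/challenge pair can be sketched with nothing mud2dust-specific — this is the RFC 7636 S256 method:

```typescript
import { createHash, randomBytes } from "node:crypto";

// PKCE (RFC 7636): the client generates a random code_verifier, sends
// code_challenge = BASE64URL(SHA256(verifier)) with the authorize request,
// and presents the verifier when exchanging the code for a token.
function makeVerifier(): string {
  return randomBytes(32).toString("base64url"); // 43-char URL-safe string
}

function challengeFor(verifier: string): string {
  return createHash("sha256").update(verifier).digest("base64url");
}
```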

6g. Bulk download (paid tier — defer to phase 6)


7. Access tiers

| Audience | Auth | Rate limit | Can do | Can't do |
| --- | --- | --- | --- | --- |
| Anonymous public | none | 600 req/hr per IP | Browse calibrated map, public STAC, public-opted stations + events, AOI extract (small) | Connect contributions; access raw contributor data |
| Signed-in viewer (pk_view) | email + password | 5,000 req/hr | All anonymous + saved AOIs, alerts, multi-source dashboard for any public contributions | Connect contributions |
| Connected contributor (pk_contrib_pending) | as above + ≥1 contribution | 10,000 req/hr | All viewer + their own dashboard across all seven shapes, raw API for their contributions, cross-shape export | Train the public model |
| Validated contributor (pk_contrib) | as above + 30 days validated data | 50,000 req/hr | All contributor + recognition badge | |
| Training contributor (pk_contrib_train) | as above + sensor_class ≥ professional + verified install | 50,000 req/hr | All validated + their data trains the public model + "training contributor" badge | |
| Partner app (client_id + user OAuth token) | OAuth 2.0 PKCE | per-scope, per-user | Scoped on user's behalf | Anything outside granted scope |
| Commercial bulk | API key + signed agreement | unlimited | Bulk Parquet, no rate limit | (defer to phase 6) |

Tier transitions: sign up → pk_view → connect first contribution → pk_contrib_pending → 30 days validated → pk_contrib → research-grade hardware + verified ORCID + verified install → pk_contrib_train (manual review for first cohort).

Validated = passes per-shape QC + non-degenerate values + plausible location. Stops fake contributions from harvesting tier bumps.
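The tier ladder above can be sketched as a pure function over contributor facts. Field names here are hypothetical, and the real flow adds manual review for the first training cohort:

```typescript
type Tier = "pk_view" | "pk_contrib_pending" | "pk_contrib" | "pk_contrib_train";

// Assumed shape of the facts the tier check would consume.
interface ContributorFacts {
  contributions: number;            // connected contributions
  validatedDays: number;            // days of data passing per-shape QC
  sensorClassAtLeastProfessional: boolean;
  installVerified: boolean;
  orcidVerified: boolean;
}

function tierFor(f: ContributorFacts): Tier {
  if (f.contributions < 1) return "pk_view";
  if (f.validatedDays < 30) return "pk_contrib_pending";
  if (f.sensorClassAtLeastProfessional && f.installVerified && f.orcidVerified) {
    return "pk_contrib_train"; // first cohort still gets manual review
  }
  return "pk_contrib";
}
```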


8. Cost estimate

Phase 1 — regional satellite + JMR anchor stations

| Item | Low | High | Driver |
| --- | --- | --- | --- |
| S3 storage (COGs) | $5 | $15 | 200–600 GB rolling |
| S3 PUT/GET | $1 | $3 | |
| Lambda compute | $5 | $15 | Satellite ingest + bridge intake + pull workers |
| Fargate compute | $50 | $100 | Cross-region NOAA, tier-3, drone preprocessing |
| CloudFront egress | $20 | $200 | Variable |
| Cross-region transfer | $5 | $15 | NOAA us-east-1 → us-west-2 |
| API Gateway | $3 | $10 | |
| RDS Aurora Serverless (pgstac + TimescaleDB + relational + PostGIS) | $80 | $130 | Combined or split |
| Secrets Manager | $2 | $10 | Per-contributor vendor tokens |
| Cognito (or equivalent) | $0 | $5 | Free tier covers <50k MAU |
| CloudWatch logs/metrics | $5 | $15 | |
| EventBridge + SQS + DLQ | $1 | $5 | |
| WAF | $5 | $10 | |
| **Total (Phase 1)** | **$182** | **$533** | Doesn't include contributor-uploaded Event volume (§12l) |

Plus contributor probe ops (separate budget line):

  • YDOC cellular SIMs, 6 anchor loggers: ~$30–60/mo
  • Tempest data plan: $0 (WeatherFlow API free for personal use)
  • Vaisala / Campbell logger comms: TBD
  • Sentek hardware (sunk): already purchased
  • METER hardware: TBD pending purchase

Phase 4+ adders — contribution-driven

When the bridge opens to outside contributors:

  • Drone Event storage: 100 contributors × 10 flights/yr × 1 GB avg = 1 TB/yr. Standard tier $23/mo first year; tiering to IA/Glacier per §12l.
  • Drone preprocessing compute: Fargate spikes per upload. Budget $50–200/mo at moderate volume.
  • Contributor pull workers: Lambda invocations scale with contributor count × poll frequency. ~$0.05/contributor/mo at hourly poll; trivial.
  • Total Phase 4 platform overhead: $80–300/mo on top of Phase 1 baseline.

CONUS scaling (phase 6)

Storage ~5–8×. Compute ~3–5×. Realistic CONUS-with-traction: $1,500–4,000/mo.


9. Phased build plan

Phase 0 — Foundation (1 week)

Done: brand locked, domains registered, hardware purchased.

Day 1–2: AWS account. MFA on root, IAM Identity Center, budget alarms, CloudTrail, tag policy.

Day 3–4: Repository scaffolding under GitHub org mud2dust/:

  • mud2dust/sensor-bridge — multi-shape contribution bridge
  • mud2dust/pipeline — satellite workers, CDK or Terraform infra
  • mud2dust/titiler — tile server, customized titiler config
  • mud2dust/site — Next.js dashboard + public landing + onboarding wizard

MIT license on all four.

Day 5: Auth scaffolding placeholder in mud2dust/site — Cognito user pool created (or Auth.js setup), OAuth app-registration table stubbed in DB, no UI yet.

Day 6: Vercel project for the landing site. mud2dust.com placeholder.

Day 7: Buffer / METER follow-up / JMR conversation per §14.

Phase 1 — Anchor station + one satellite layer end-to-end (3 weeks)

Week 1: Sentinel-1 RTC ingest. Lambda on EventBridge daily cron @ 03:00 UTC. Mosaic clipped to Eastern WA bbox, write s3://mud2dust-cogs/sigma0/yyyy/mm/dd/eastern-wa.tif.

Week 2: Bridge MVP — Observation + Profile shape only, YDOC adapter only. HTTPS endpoint with HMAC validation. JMR's YDOC ML-417ADS configured to POST. Land observations in TimescaleDB. Bare-bones internal dashboard rendering JMR Sentek depths.

Week 3: Tile route + station overlay. Titiler on Lambda + CloudFront. /tiles/sigma0/{date}/{z}/{x}/{y}.png. Demo page with Mapbox raster + JMR Sentek anchor station overlay.

Phase 1 deliverable: internal URL renders Sentinel-1 backscatter over Franklin County with the JMR Sentek anchor station live-overlaid. σ⁰ change correlates with rain events from Tempest readings.

Phase 2 — Multi-source fusion (4 weeks)

Week 1: HLS ingest with Earthdata auth. NDVI/NDWI COGs. Week 2: HRRR ingest in us-east-1. 7-day rolling precip. Week 3: Static priors. POLARIS from UC Davis, Copernicus DEM from S3. Week 4: First fusion model. RF on JMR Sentek + Tempest + HLS NDVI + Sentinel-1 σ⁰. Pickle, ship to Lambda. /tiles/moisture-rootzone/{date}/... runs the model per pixel. Holdout-validate against held-back Sentek depths.

Phase 2 deliverable: calibrated moisture map. Farming Game can switch off AGROMONITORING_API_KEY.

Phase 3 — Open the public API + dashboard (3 weeks)

Week 1: WAF, attribution headers, STAC catalog populated. Week 2: Public dashboard at mud2dust.com — calibrated map browse, public-opted contributions visible, no signup. Week 3: Soft launch — blog, social, HN, Awesome-Geospatial.

Phase 3 deliverable: strangers using the public map.

Phase 4 — Full contribution platform (~15 weeks, split into 4a–4e)

Phase 4a — Bridge expansion to all time-series sources (3 weeks)

Week 1: ZENTRA Cloud pull adapter (per-contributor token + Secrets Manager), tested against METER hardware once purchased and against any METER research-network access from §14. Week 2: WeatherFlow Tempest pull, Davis WeatherLink pull, FieldClimate pull, Onset HOBOlink pull. Week 3: Campbell HTTP-push adapter + CRBasic template publication. Generic JSON webhook adapter. SFTP intake gateway. AmeriFlux + USDA SCAN pull.

Phase 4b — Event / Collection (drone + aircraft) (4 weeks)

Week 1: Presigned S3 multipart upload flow. Storage layout under s3://mud2dust-contributions/events/. Week 2: Drone preprocessing pipeline (Fargate). COG-conformance validation, metadata extraction, STAC item generation. Generic COG, OpenDroneMap output, DJI Terra / Pix4D / DroneDeploy exports. Week 3: L-Band SAR Event handling. Radiometric-cal metadata schema. Cal-target detection flag. Auto-cross-reference with Sentinel-1 overpass windows. Week 4: Collection grouping. Public-opt-in for STAC catalog. Tile route for public Events.

Phase 4c — Sample, Annotation, Boundary (2 weeks)

Week 1: Sample shape — schema, lab-CSV/PDF intake, attachment storage. Annotation shape — geotagged notes, citizen-science photo intake. Week 2: Boundary shape — shapefile/GeoJSON upload, PostGIS storage, integration with AOI extract endpoint.

Phase 4d — Onboarding UX + cross-shape dashboard (3 weeks)

Week 1: User signup flow. Onboarding wizard scaffolding ("what do you have?" branching across all seven shapes). Week 2: Vendor-specific onboarding paths. Per-station/event/etc registration UI with sensor_class / operator_class / privacy controls. Week 3: Cross-shape unified dashboard — stations + flights + samples + boundaries on one map. Anomaly / drift / contribution-health surfacing. Cross-shape Parquet/CSV export.

Phase 4e — Trust model + retrain + partner-app OAuth (3 weeks)

Week 1: Trust model fully wired across all shapes. ORCID verification flow. Tier transitions automated. Week 2: Nightly retraining job. Pulls last-30-days training-tier contributions across Observation/Profile/Event (L-Band drone scenes weighted as anchors). A/B test, promote if better. Week 3: OAuth 2.0 partner-app flow. App registration UI. Farming Game integrated as first partner app — registers JMR's stations + boundaries through the API, pushes Tempest + Vaisala observations, reads calibrated corrections.

Phase 4 deliverable: any farmer, researcher, or drone operator can connect; partner apps integrate via OAuth; calibration model improves visibly per month with multi-shape contributions.

Phase 5 — Add layers + grant momentum (6 weeks)

ECOSTRESS, OpenET, TROPOMI, GOES frost, VIIRS Active Fire. Each ~1 week. Sequence by user demand.

Phase 6 — National + paid tier (8 weeks)

Weeks 1–4: scale satellite ingest to CONUS. Weeks 5–6: Stripe paid tier. Weeks 7–8: first grant application (NSF Pathways, USDA NIFA, NASA CSDA).


10. Key architecture decisions

| Decision | Choice | Why |
| --- | --- | --- |
| Brand | mud2dust | Locked. Domains owned across major TLDs. |
| License | CC-BY for tiles + STAC outputs, MIT for code | Lowest-friction with attribution required. |
| Bridge object types | Seven first-class shapes (Observation, Profile, Sample, Event, Collection, Annotation, Boundary) | Most platforms pick one or two; designing for all seven is the differentiator. Shared auth/trust/privacy/storage layer keeps cost manageable. |
| Drone L-Band as anchor | Treat radiometrically calibrated L-Band drone Events as training-tier anchor scenes | Distributed SMAPVEX-class cal/val. Distinct, fundable research narrative. |
| Bridge layout | One repo, modular by shape | Shapes share 80% of plumbing; fork modules within the repo, not the codebase. |
| Trust model | sensor_class × operator_class → training_weight, invisible to users | Honest weighting without alienating low-trust contributors. |
| Training/correction split | Few hundred training contributions; everyone else consumes corrections | Calibration needs quality, not quantity. |
| Partner-app integration | OAuth 2.0 + PKCE with per-shape scopes | Standard pattern; lets Farming Game and any future app integrate. |
| Vendor token storage | AWS Secrets Manager, namespaced per contributor | Encrypted at rest; per-contributor IAM; rotation; revocation. |
| Coordinate privacy | Two coord fields per object — internal (exact) and public (jittered) | Default fuzzed for STAC/public; opt-in per object. |
| Event upload mechanism | Presigned S3 multipart, not API POST | Drone scenes are too big for API POST; presigned multipart is the AWS-native pattern. |
| STAC backend | pgstac on Aurora Serverless v2 (one instance, official + contrib collections) | Sub-100ms search; STAC is the lingua franca for both satellite and contributor raster. |
| Sensor-data backend | TimescaleDB on RDS (multi-tenant by contributor_id) | Easier model-training queries, easier contributor SQL. |
| Tile renderer | titiler on Lambda + CloudFront | Idiomatic; copies OpenET / Planetary Computer. |
| Failure isolation | One Lambda/Fargate per source/adapter, EventBridge-triggered | A SMAP outage doesn't block Sentinel-1; a ZENTRA outage doesn't block YDOC; drone preprocessing doesn't block Observation ingest. |
| Account split | Single account through phase 3, multi-account org from phase 4 | Defense in depth once data exists. |
| Showcase customer | Farming Game as first OAuth partner app | Validates partner-app API; useful for grants. |
| Anchor stations | 6× Sentek 36" drill-and-drop on YDOC ML-417ADS (already purchased) | Multi-depth profile sensors better than single-depth. |
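The coordinate-privacy decision above keeps an exact internal field and a jittered public field. One sketch, assuming deterministic per-object jitter (random per-request jitter could be averaged away by repeated calls) and a roughly 1 km radius — neither is locked:

```typescript
import { createHash } from "node:crypto";

// Deterministic jitter for the public coordinate field: hash the object id
// so the fuzzed location is stable across requests. radiusDeg ≈ 0.01° is
// roughly 1 km of latitude; an assumed default, not a spec.
function jitterCoords(
  id: string, lat: number, lon: number, radiusDeg = 0.01,
): { lat: number; lon: number } {
  const h = createHash("sha256").update(id).digest();
  // Map two 16-bit slices of the hash onto [-radiusDeg, +radiusDeg].
  const dLat = ((h.readUInt16BE(0) / 0xffff) * 2 - 1) * radiusDeg;
  const dLon = ((h.readUInt16BE(2) / 0xffff) * 2 - 1) * radiusDeg;
  return { lat: lat + dLat, lon: lon + dLon };
}
```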

11. What this means for Farming Game

Six concrete changes in the farminggame repo, sequenced after mud2dust phase 4:

  1. Become the first OAuth partner app on mud2dust. Register JMR's Sentek + Tempest + Vaisala stations through /v1/stations. Stream observations through /v1/observations. Read calibrated corrections through /v1/stations/{id}/corrections.
  2. Push field boundaries. JMR's block boundaries become Boundary contributions on mud2dust via /v1/boundaries. AOI extracts and per-block calibrated outputs become trivial.
  3. Push irrigation events as Annotations or domain-specific Events. Closes the loop between "what was applied" and "what the satellite + sensors see."
  4. Drop Agromonitoring. Replace AGROMONITORING_API_KEY with mud2dust API calls via OAuth user token. Saves $30–300/mo per farm.
  5. Retitle Phase 13. Rename "Direct Satellite Pipeline & Soil Calibration" to "Integrate with mud2dust." Drop subphases 13b–13d. Keep 13a (sensor deployment) but reframe as "deploy stations into mud2dust as the first OAuth partner app." Subphases 13e–13g stay in farminggame.
  6. Become the showcase customer. mud2dust's website links to Farming Game as the working partner-app example. Farming Game's website credits mud2dust as the data layer.

The two projects share an AWS organization but separate accounts/billing.


12. Open questions

12a. Brand name + domain — Resolved. mud2dust locked; both mud2dust and mudtodust owned across .com, .net, .org, .io, .dev, .ag, .farm, .ai, .earth, .co (20 domains). Primary mud2dust.com. Defensive coverage complete.

12b. Legal entity

LLC for phase 0–3, hybrid LLC-owned-by-501(c)(3) when contributor revenue exceeds $20k/yr. LLC keeps you nimble; 501(c)(3) unlocks grants; hybrid (OpenStreetMap Foundation pattern) adds ~$3K/yr legal/accounting overhead.

12c. METER co-brand terms

See §14 for the three-tier ask. Standard structure for the biggest ask: logo + advisory seat + 30-day model first-look + joint paper opportunity. No exclusivity.

12d. Anchor station funding split

6× Sentek + YDOC + Tempest + Vaisala (hardware sunk). Ongoing comms ~$30–60/mo cellular + ZENTRA Cloud if used. Options:

  • mud2dust absorbs fully.
  • JMR co-funds as founding contributor (gets pk_contrib_train in perpetuity).
  • Hybrid: mud2dust pays cellular + bridge ops, JMR pays any ZENTRA Cloud subscription.

12e. Initial advisors (recruit before phase 3 launch)

  1. Northwest credibility — USDA-ARS Pendleton or WSU Prosser.
  2. Federal-process knowhow — OpenET / NASA open-data ag programs alum.
  3. Commercial validation — Bayer / Climate / Granular alum.

12f. Funding strategy (sequenced)

| Phase | Source | Amount | Why |
| --- | --- | --- | --- |
| 0–3 | self-fund | ~$3–5K | AWS + domains (sunk) + marginal time |
| 4 | METER + 2–3 universities at $5K/yr | $20K | Consortium fee covers ops |
| 5 | NIFA SBIR Phase I + NASA-CSDA L-Band cal/val | $175–500K | Two distinct narratives |
| 6 | NRCS CIG or Climate Smart Commodities | $500K+ | National rollout |
| 6+ | Paid bulk-tier customers | $50–500K/yr | Recurring revenue |

12g. Data-use agreement language

Contributors own their raw data. mud2dust gets a license to (a) render their dashboard, (b) use their data in retrain if they're at training tier, (c) produce aggregate/derived public outputs under CC-BY. Revocable. One-paragraph plain-English summary at signup + longer version reviewed by counsel. Tied to §12b legal entity timing.

12h. Researcher verification mechanism

ORCID iD with affiliation at v1; manual review for borderline cases (institutional email without ORCID; non-academic researchers with publication record).

12i. Hardware in hand — reflects updated plan

6× Sentek 36" drill-and-drop on YDOC ML-417ADS, plus Tempest + Vaisala WXT520 + Campbell-in-conversation + METER-to-purchase. Multi-depth profile sensors are better training data than single-depth Teros 12.

12j. Contributor freemium / dashboard-only path

Contributors who only want the dashboard are supported at pk_contrib with all stations/events marked geom_public_mode = "private". Their data is excluded from the public model and public browse. Costs almost nothing to support. Keeps the funnel wide.

12k. METER hardware logger choice

For each METER probe purchase, choose YDOC (existing fleet, push) vs ZENTRA Cloud (vendor portal, pull) per deployment.

12l. Contributor raster storage policy — deferred

At what volume should contributor-uploaded Events trigger storage policy (per-tier quotas, aggressive Glacier lifecycle, COG-only at upload, etc.)? Defer until phase 5 when real volume signal is available. Track contributor-storage as a separate budget line from day one so the trigger point is visible.

12m. L-Band drone cal/val — recruitment strategy

Who flies L-Band SAR over instrumented fields and would partner? Candidates: NASA AirMOSS alumni, USDA-ARS Beltsville, university radar labs (CSU, OU, Univ. Michigan), commercial L-Band drone vendors (ImSAR pilot programs). At least one before phase 5 makes the cal/val story concrete enough to write into a NASA-CSDA proposal.

12n. Cross-shape exports — what does "unified" mean?

A contributor with stations + flights + samples + boundaries on one farm wants a single coherent export. What's the format? Options:

  • One Parquet per shape, zipped together with a manifest.
  • A single STAC catalog where stations are STAC items with assets pointing to Parquet.
  • A custom mud2dust archive format.

Recommendation: Parquet-per-shape + STAC manifest; defer custom format unless users ask. Decide during phase 4d.


13. Things deliberately deferred

| Item | Defer until | Why |
| --- | --- | --- |
| Disaster recovery (cross-region replication, RDS backups, full IaC) | Phase 4 | Single account is recoverable enough during build |
| Observability beyond CloudWatch | Phase 3 launch | CloudWatch is fine until you have users |
| GDPR compliance | Phase 3 | Trivial — only collecting emails for API keys |
| SOC 2 | Phase 6 | Only matters for paid-tier enterprise customers |
| Versioning (model_version on tiles, reproducible old outputs) | Phase 4 | Once retraining starts, old tiles need to be reproducible |
| Internationalization | Phase 6+ | Algorithms generalize globally; only POLARIS + SSURGO are US-only |
| MFA enforcement on all users | Phase 4e | Required at the training-contributor tier; encouraged but optional elsewhere until then |
| Mobile app | Phase 6+ | Web dashboard responsive enough; native app waits for product/market fit |
| Water-quality / nitrate runoff samples | Phase 6+ | Adjacent industry; could fragment focus |
| Stream gauges / hydrology beyond awareness | Phase 6+ | Not on the soil-moisture critical path |
| In-cab / equipment telemetry (John Deere, AgLeader, Trimble) | Phase 6+ | Valuable but a different integration class; revisit once partner-app pattern is proven |
| Climate model output as contribution | Phase 6+ | Drought projection is a separate product; defer |
| Contributor raster storage policy | Phase 5 | See §12l |

14. Next concrete steps

METER meeting (2026-05-07) — three asks ranked by ease

  1. (Easy / must-have) Confirm ZENTRA Cloud REST API can be used as a per-contributor pull adapter — any METER customer who authorizes mud2dust with their own ZENTRA token grants us read access to their devices and observations. Standard third-party API use; should not require formal agreement, but worth confirming there's no ToS restriction.
  2. (Medium) Co-publish the ZENTRA Cloud adapter under MIT in mud2dust/sensor-bridge. Gives METER a co-contributor citation; gives any open-data project a reusable adapter.
  3. (Big) METER's research network contributes as a Training Contributor on the platform, with co-branding (logo + tile attribution; advisory seat; 30-day model first-look; joint paper opportunity).

Their developer-relations / API team is likely the right counterpart for asks 1 and 2, and research/sales for ask 3; try to bring both groups into the meeting.

Other Phase 0 prerequisites

  1. Stand up an AWS account. 15 minutes if not already done — root locked, MFA, IAM Identity Center.
  2. Conversation with Jackass Mountain Ranch to confirm willingness to host the anchor stations and decide §12d funding split.
  3. Legal-entity decision (LLC formation) — gate for any contributor-data-use agreement at scale.
  4. Repo scaffolding: create mud2dust/sensor-bridge, mud2dust/pipeline, mud2dust/titiler, mud2dust/site under the GitHub org.
  5. Domain DNS: point mud2dust.com at Vercel (landing) and api.mud2dust.com at API Gateway (placeholder).
  6. L-Band drone cal/val partnership scouting — first conversation per §12m, before phase 5 grant-writing.

Once those are done, phase 0 closes and phase 1 starts.


Verification

| Phase | Verification gate |
| --- | --- |
| 0 | Domains registered (done). AWS account exists with budget alarm + tags. Four repos scaffolded with MIT license. Auth scaffolding placeholder in site. Placeholder landing live at mud2dust.com. |
| 1 | Internal URL renders Sentinel-1 backscatter tiles over Franklin County, WA. JMR Sentek station registered via API; observations flowing into TimescaleDB; visible on internal dashboard. σ⁰ change correlates with rain events from Tempest readings. |
| 2 | /tiles/moisture-rootzone/{date} returns calibrated VWC. Holdout JMR Sentek depths validated within ±3% VWC RMSE. Farming Game can switch off AGROMONITORING_API_KEY and the field-detail panel still renders moisture. |
| 3 | Anonymous tile + STAC API live behind WAF. Public dashboard browseable without signup. Five external users have made pystac-client requests. Attribution headers present on all responses. |
| 4a | Bridge supports ≥6 vendor adapters end-to-end across Observation + Profile shapes (YDOC push, ZENTRA pull, WeatherFlow pull, WeatherLink pull, Campbell push, AmeriFlux pull). Per-contributor credential vault working. |
| 4b | Drone Event upload works end-to-end (presigned multipart → preprocessing → STAC item → tile route). At least one L-Band SAR upload validated with calibration metadata. Public tile route for opted Events live. |
| 4c | Sample, Annotation, and Boundary shapes all support upload + retrieval + dashboard rendering. Boundary integrates with AOI extract. |
| 4d | Onboarding wizard live. Farmer can connect their hardware in <10 min from signup. Cross-shape dashboard renders mixed sources correctly. Cross-shape Parquet/STAC export produces a coherent archive. |
| 4e | OAuth partner-app flow live. Farming Game integrated as first partner app — registers stations + boundaries, pushes observations + irrigation events, reads corrections. Trust model weighting confirmed in retrain logs. Nightly retrain promotes a new model with measurable accuracy gain. |
| 5+ | Each new layer ships with a STAC collection, a tile route, and a demo notebook. |
| 6 | Stripe-billed paid customer pulls bulk Parquet. Grant application submitted (preferably both NASA-CSDA L-Band cal/val and NIFA open ag data infra). |

Critical files / paths

This plan currently lives at /Users/willmachugh/.claude/plans/we-were-working-on-lovely-widget.md and the prior revision at /Users/willmachugh/.claude/plans/glittery-stirring-origami.md. After approval, copy into the new umbrella repo as mud2dust/PLAN.md (or mud2dust/.claude/plans/mud2dust-plan.md to mirror farminggame convention) and mark farminggame/.claude/plans/openagdata-architecture.md superseded with a pointer here.

Files to be created during Phase 0:

  • mud2dust/PLAN.md (copy of this file)
  • mud2dust/README.md
  • mud2dust/LICENSE (MIT)
  • Four sub-repos under the GitHub org: sensor-bridge, pipeline, titiler, site.

No existing functions or utilities to reuse — mud2dust/ is empty. Phase 13 references in farminggame/.claude/plans/farminggame-plan.md need to be retitled per §11 once this plan is approved.