diff --git a/SCHEMA.md b/SCHEMA.md
new file mode 100644
index 0000000..8420ca5
--- /dev/null
+++ b/SCHEMA.md
@@ -0,0 +1,40 @@
+# Evidence schema (v1)
+
+This repo is coordinated across parallel sessions. Each session writes its own evidence under:
+
+- `data/ho36/`
+- `data/flaneur/`
+
+## `evidence.json`
+
+Top-level:
+
+```json
+{
+  "hostel_name": "string",
+  "collected_at": "ISO datetime",
+  "collector_session": "A|B|C",
+  "evidence": [],
+  "profile": {}
+}
+```
+
+Each evidence item (one “fact”) is a row/object with:
+
+- `target`: `"ho36"` | `"flaneur"`
+- `source`: `"official_site"` | `"google_maps"` | `"booking"` | `"hostelworld"` | `"tripadvisor"` | `"instagram"` | `"facebook"` | `"tiktok"` | `"press"` | `"other"`
+- `metric_name`: string
+- `metric_value`: string | number | null
+- `url`: string
+- `captured_at`: ISO datetime
+- `status`: `"ok"` | `"blocked"` | `"unknown"` | `"error"`
+- `confidence`: `"high"` | `"med"` | `"low"`
+- `notes`: string
+- `screenshot_path`: string | null
+
+## `evidence.csv`
+
+One row per evidence item, with the same fields:
+
+`target,source,metric_name,metric_value,url,captured_at,status,confidence,notes,screenshot_path`
+
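
The field list in this schema could be checked mechanically before a session commits its evidence. The following is a minimal sketch, not part of the patch: `validate_item` and `to_csv` are hypothetical helper names, the check assumes `captured_at` uses an ISO form that Python's `datetime.fromisoformat` accepts (for example with a `+00:00` offset), and it assumes `null` values become empty CSV cells, which the schema does not specify.

```python
import csv
import io
from datetime import datetime

# Column order taken from the evidence.csv header in SCHEMA.md.
FIELDS = [
    "target", "source", "metric_name", "metric_value", "url",
    "captured_at", "status", "confidence", "notes", "screenshot_path",
]

# Enumerated values taken from the evidence item definition in SCHEMA.md.
ALLOWED = {
    "target": {"ho36", "flaneur"},
    "source": {
        "official_site", "google_maps", "booking", "hostelworld",
        "tripadvisor", "instagram", "facebook", "tiktok", "press", "other",
    },
    "status": {"ok", "blocked", "unknown", "error"},
    "confidence": {"high", "med", "low"},
}


def validate_item(item: dict) -> list[str]:
    """Return a list of problems found in one evidence item (empty if none)."""
    errors = [f"missing field: {name}" for name in FIELDS if name not in item]

    # Enumerated fields must be one of the listed strings.
    for name, allowed in ALLOWED.items():
        if name in item:
            value = item[name]
            if not isinstance(value, str) or value not in allowed:
                errors.append(f"{name}: {value!r} is not an allowed value")

    # Plain string fields.
    for name in ("metric_name", "url", "notes"):
        if name in item and not isinstance(item[name], str):
            errors.append(f"{name}: expected a string")

    # metric_value: string | number | null (bool is excluded on purpose).
    value = item.get("metric_value")
    if value is not None and (
        isinstance(value, bool) or not isinstance(value, (str, int, float))
    ):
        errors.append("metric_value: expected a string, number, or null")

    # screenshot_path: string | null.
    shot = item.get("screenshot_path")
    if shot is not None and not isinstance(shot, str):
        errors.append("screenshot_path: expected a string or null")

    # captured_at: assumed to be parseable by datetime.fromisoformat.
    if "captured_at" in item:
        try:
            datetime.fromisoformat(item["captured_at"])
        except (TypeError, ValueError):
            errors.append("captured_at: not an ISO datetime")

    return errors


def to_csv(items: list[dict]) -> str:
    """Serialize evidence items to CSV text using the schema's column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for item in items:
        writer.writerow(item)  # None is written as an empty cell
    return buffer.getvalue()


if __name__ == "__main__":
    example = {
        "target": "ho36",
        "source": "google_maps",
        "metric_name": "rating",
        "metric_value": 4.6,                 # illustrative value, not real data
        "url": "https://example.com/",       # placeholder URL
        "captured_at": "2024-01-01T12:00:00+00:00",
        "status": "ok",
        "confidence": "high",
        "notes": "",
        "screenshot_path": None,
    }
    print(validate_item(example))
    print(to_csv([example]))
```

Returning a list of messages rather than raising on the first problem lets a session report every issue in an item at once. Whether extra keys should be rejected is left open here, since the schema does not say.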