GM report: HO36 vs Flaneur evidence + pricing
parent 57f73db262
commit f3c330d909
24 changed files with 10722 additions and 75 deletions
@@ -1,4 +1,4 @@
-# SESSION A PROMPT — HO36 Data Collection (Codex)
+# SESSION A PROMPT - HO36 Data Collection (Codex)
 
 Copy/paste this entire prompt into your **Session A** Codex window.
 
@@ -7,12 +7,12 @@ Copy/paste this entire prompt into your **Session A** Codex window.
 ```markdown
 You are a competitive-intel researcher collecting **public, no-login** online footprint data for **HO36 Lyon**. You are working in parallel with Session B (Flâneur) and Session C (discovery helper). Your job is to gather the **highest-signal numbers + URLs + screenshots** so we can explain why **HO36 was likely full around NYE** while Flâneur had availability.
 
-## Non‑Negotiables (do not stall)
+## Non-Negotiables (do not stall)
 - Public pages only. **No login**, no captcha solving, no bypassing anti-bot.
 - If blocked, **record `status: blocked`** with URL + screenshot and move on.
-- Be “insistent”: timebox each source to **10 minutes**, then move on.
+- Be "insistent": timebox each source to **10 minutes**, then move on.
 - Always capture evidence: URL + timestamp + screenshot path.
-- Use the repo’s schema exactly (`SCHEMA.md`).
+- Use the repo's schema exactly (`SCHEMA.md`).
 
 ## Start Here (Git coordination)
 1) Read Forgejo details from `~/readme.md`
@@ -41,7 +41,7 @@ Capture pages with:
 --html "data/ho36/raw/<name>__YYYYMMDD.html" \
 --wait-ms 2000
 ```
-Convert JSON→CSV:
+Convert JSON->CSV:
 ```bash
 python3 tools/json_to_csv.py --json data/ho36/evidence.json --csv data/ho36/evidence.csv
 ```
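The conversion script `tools/json_to_csv.py` itself is not part of this diff. As a rough sketch of what such a JSON-to-CSV step might look like, assuming `evidence.json` is a list of flat objects with the ten fields visible in the JSON hunks of this commit, and assuming the CSV has a header row in the same column order as the CSV rows shown here (both are assumptions; `records_to_csv` is a hypothetical name):

```python
import csv
import io

# Column order inferred from the CSV rows in this commit (an assumption).
FIELDS = [
    "target", "source", "metric_name", "metric_value", "url",
    "captured_at", "status", "confidence", "notes", "screenshot_path",
]


def records_to_csv(records):
    """Render evidence records as CSV text; JSON null becomes an empty cell."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        writer.writerow({k: "" if rec.get(k) is None else rec.get(k) for k in FIELDS})
    return buf.getvalue()


# Two records shaped like entries in this commit's evidence.json.
sample = [
    {
        "target": "flaneur", "source": "press", "metric_name": "press_mentions",
        "metric_value": None, "url": "https://leflaneur-guesthouse.com/",
        "captured_at": "2026-01-02T19:30:41+01:00", "status": "unknown",
        "confidence": "low", "notes": "No dedicated press/mentions page identified.",
        "screenshot_path": None,
    },
    {
        "target": "flaneur", "source": "official_site",
        "metric_name": "direct_booking_min_bed_eur", "metric_value": 22.88,
        "url": "https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/",
        "captured_at": "2026-01-03T01:21:09+00:00", "status": "ok",
        "confidence": "high", "notes": "Room: 'Dortoir mixte 16 lits', refundable.",
        "screenshot_path": None,
    },
]

print(records_to_csv(sample))
```

The `csv` module quotes any cell containing a comma or a double quote, which matches the doubled-quote style (`""canonical""`) seen in the CSV rows of this commit.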
@@ -55,7 +55,7 @@ Collect:
 - booking engine domain (e.g. Mews/RoomRaccoon/etc.)
 - languages visible
 - inventory/price claims (if stated)
-- any NYE/seasonal policy hints (min nights, sold out banners) — if none, record `unknown`
+- any NYE/seasonal policy hints (min nights, sold out banners) - if none, record `unknown`
 Capture: homepage + any booking/rooms page you can access.
 
 ### 2) Google Maps (must get rating + review count)
@@ -73,7 +73,7 @@ cat > /root/tmp/ho36_maps_iframe.html <<'EOF'
 </body></html>
 EOF
 ```
-Then use Playwright to screenshot and read the iframe body text (look for “#### avis”).
+Then use Playwright to screenshot and read the iframe body text (look for \"#### avis\").
 Record `google_maps.rating` and `google_maps.review_count`.
 
 ### 3) Hostelworld (must get listing URL + rating + review count)
@@ -116,4 +116,3 @@ git push origin data/ho36
 
 Begin now. Post progress after each source.
 ```
-
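The Google Maps step in the prompt above asks for a rating and a review count read from the iframe body text (the "#### avis" pattern). A minimal sketch of that parsing step, assuming the body text has already been read into a string; the function names and regular expressions are illustrative assumptions, not code from the repository. It handles the French formatting recorded in this commit's evidence ("4,1" for the rating, "1 447 avis" with a narrow no-break space for the count):

```python
import re

# A count such as "1 447", "1\u202f447", "1,447" or "502", followed by "avis".
# The lookbehind stops a match from starting in the middle of a number like "4,1".
_AVIS_RE = re.compile(
    r"(?<![\d.,])(\d{1,3}(?:[\s.,]\d{3})+|\d+)\s*avis", re.IGNORECASE
)
# A one-decimal rating between 0 and 5, with a French or English decimal mark.
_RATING_RE = re.compile(r"(?<![\d.,])([0-5][.,]\d)(?![\d.,])")


def parse_review_count(text):
    """Return the review count as an int, or None if no '... avis' is found."""
    match = _AVIS_RE.search(text)
    if match is None:
        return None
    return int(re.sub(r"\D", "", match.group(1)))


def parse_rating(text):
    """Return the rating as a float, or None if no rating-like token is found."""
    match = _RATING_RE.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))


# Hypothetical body text, shaped like the values recorded for HO36.
body_text = "ho36 Lyon Guillotière\n4,1\n1\u202f447 avis"
print(parse_rating(body_text), parse_review_count(body_text))
```

In Python, `\s` matches Unicode whitespace for `str` patterns, so the narrow no-break space needs no special casing. The rating pattern is deliberately loose and could pick up unrelated tokens on a busier page, so the screenshot remains the evidence of record.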
@@ -5,7 +5,7 @@ flaneur,other,canonical_url_booking_engine,https://booking.roomraccoon.fr/le-fl-
 flaneur,other,canonical_url_instagram,https://www.instagram.com/leflaneur_gh/,https://leflaneur-guesthouse.com/,2026-01-02T19:01:21+01:00,ok,high,Linked from official site footer/social icons.,
 flaneur,other,canonical_url_facebook,https://www.facebook.com/leflaneurlyon,https://leflaneur-guesthouse.com/,2026-01-02T19:01:21+01:00,ok,high,Linked from official site footer/social icons.,
 flaneur,other,canonical_url_tiktok,,https://leflaneur-guesthouse.com/,2026-01-02T19:01:21+01:00,unknown,low,No TikTok link found on official site homepage/footer; will attempt direct check later.,
-flaneur,other,canonical_url_booking_listing,,https://www.booking.com/searchresults.html?ss=Le%20Fl%C3%A2neur%20Guesthouse%20Lyon,2026-01-02T19:01:21+01:00,unknown,low,Attempted Booking.com search; listing URL not yet identified (timeboxed).,
+flaneur,other,canonical_url_booking_listing,https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html,https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html,2026-01-03T01:57:11+00:00,ok,high,"Captured Booking.com listing; canonical URL from <link rel=""canonical"">. HTML snapshot: data/flaneur/raw/flaneur__booking__listing__20260103.html.",data/flaneur/screenshots/flaneur__booking__listing__20260103.png
 flaneur,other,canonical_url_hostelworld_listing,https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/,https://www.hostelworld.com/st/hostels/lyon/,2026-01-02T19:30:41+01:00,ok,high,Found on Hostelworld Lyon hostels page.,
 flaneur,other,canonical_url_tripadvisor_listing,https://www.tripadvisor.com/Hotel_Review-g187265-d8778985-Reviews-Le_Flaneur_Guesthouse-Lyon_Rhone_Auvergne_Rhone_Alpes.html,https://www.tripadvisor.com/Hotel_Review-g187265-d8778985-Reviews-Le_Flaneur_Guesthouse-Lyon_Rhone_Auvergne_Rhone_Alpes.html,2026-01-02T19:01:21+01:00,blocked,med,TripAdvisor page blocked (captcha-delivery.com JS challenge in curl capture).,data/flaneur/screenshots/flaneur__tripadvisor__20260102.png
 flaneur,official_site,hero_cta_text,Réservez votre séjour maintenant !,https://leflaneur-guesthouse.com/,2026-01-02T19:09:44+01:00,ok,high,Homepage headline/CTA text.,data/flaneur/screenshots/flaneur__official_site__20260102.png
@@ -27,8 +27,15 @@ flaneur,google_maps,category,Hostel,"https://www.google.com/maps/place/Le+Fl%C3%
 flaneur,google_maps,address,"56 Rue Sébastien Gryphe, 69007 Lyon, France","https://www.google.com/maps/place/Le+Fl%C3%A2neur+Guesthouse/@45.7512135,4.8428045,17z/data=!3m1!5s0x47f4ea4464dcb499:0x7fbb59cd88d1026a!4m9!3m8!1s0x47f4ea446430af35:0xe27846417ed8f4f!5m2!4m1!1i2!8m2!3d45.7512135!4d4.8428045!16s%2Fg%2F11ckqn6t7v",2026-01-02T19:30:41+01:00,ok,high,Address from Maps meta/OG content and official site footer.,data/flaneur/screenshots/flaneur__google_maps__20260102.png
 flaneur,hostelworld,rating,8.1,https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/,2026-01-02T19:30:41+01:00,ok,high,Hostelworld header shows '8.1 Fabulous (2332)'.,data/flaneur/screenshots/flaneur__hostelworld__20260102.png
 flaneur,hostelworld,review_count,2332,https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/,2026-01-02T19:30:41+01:00,ok,high,Hostelworld header shows '8.1 Fabulous (2332)'.,data/flaneur/screenshots/flaneur__hostelworld__20260102.png
+flaneur,hostelworld,kitchen_facilities,"Self-Catering Facilities, Fridge/Freezer, Utensils, Microwave, Pots and Pans, Sink, Stove",https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/,2026-01-03T01:55:47+00:00,ok,high,From Hostelworld facilities list captured separately (verify/results/hostelworld_facilities_flaneur_100844.json).,
+flaneur,official_site,direct_booking_min_bed_eur,22.88,https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/,2026-01-03T01:21:09+00:00,ok,high,Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Dortoir mixte 16 lits' (available units: 10). Refund policy hint: Remboursable jusqu'a 15h la veille. HTML snapshot: data/flaneur/raw/flaneur__roomraccoon__pricing__20260103_20260104__20260103.html.,data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png
+flaneur,official_site,direct_booking_min_private_room_eur,50.88,https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/,2026-01-03T01:21:09+00:00,ok,high,Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Chambre Privée 4 personnes' (availability not shown). Refund policy hint: Remboursable jusqu'a 15h la veille. HTML snapshot: data/flaneur/raw/flaneur__roomraccoon__pricing__20260103_20260104__20260103.html.,data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png
 flaneur,booking,access_status,blocked_by_waf,https://www.booking.com/searchresults.html?ss=Le%20Fl%C3%A2neur%20Guesthouse%20Lyon,2026-01-02T19:30:41+01:00,blocked,high,Booking.com returns AWS WAF 'verify you're not a robot' JS challenge in curl capture.,data/flaneur/screenshots/flaneur__booking__20260102.png
+flaneur,booking,rating,8.1,https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html,2026-01-03T01:57:11+00:00,ok,high,AggregateRating.ratingValue from JSON-LD on the listing page. HTML snapshot: data/flaneur/raw/flaneur__booking__listing__20260103.html.,data/flaneur/screenshots/flaneur__booking__listing__20260103.png
+flaneur,booking,review_count,502,https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html,2026-01-03T01:57:11+00:00,ok,high,AggregateRating.reviewCount from JSON-LD on the listing page. HTML snapshot: data/flaneur/raw/flaneur__booking__listing__20260103.html.,data/flaneur/screenshots/flaneur__booking__listing__20260103.png
 flaneur,instagram,access_status,blocked_or_login_required,https://www.instagram.com/leflaneur_gh/,2026-01-02T19:30:41+01:00,blocked,med,Instagram profile did not yield follower counts in automated capture (likely login/consent wall).,data/flaneur/screenshots/flaneur__instagram__20260102.png
+flaneur,instagram,followers_count,2296,https://www.instagram.com/leflaneur_gh/,2026-01-02T20:23:30+00:00,ok,high,From og:description meta (verify/results/flaneur_googlebot_audit.jsonl).,
+flaneur,instagram,posts_count,742,https://www.instagram.com/leflaneur_gh/,2026-01-02T20:23:30+00:00,ok,high,From og:description meta (verify/results/flaneur_googlebot_audit.jsonl).,
 flaneur,facebook,access_status,blocked_or_login_required,https://www.facebook.com/leflaneurlyon,2026-01-02T19:30:41+01:00,blocked,high,Facebook page prompts login (screenshot shows login form).,data/flaneur/screenshots/flaneur__facebook__20260102.png
 flaneur,tiktok,search_results_visible,True,https://www.tiktok.com/search?q=le%20flaneur%20guesthouse%20lyon,2026-01-02T19:30:41+01:00,ok,low,"Search results page loads, but identifying official account requires further navigation/login.",data/flaneur/screenshots/flaneur__tiktok__20260102.png
 flaneur,press,press_mentions,,https://leflaneur-guesthouse.com/,2026-01-02T19:30:41+01:00,unknown,low,No dedicated press/mentions page identified on official site during this session.,
@@ -79,13 +79,13 @@
 "target": "flaneur",
 "source": "other",
 "metric_name": "canonical_url_booking_listing",
-"metric_value": null,
-"url": "https://www.booking.com/searchresults.html?ss=Le%20Fl%C3%A2neur%20Guesthouse%20Lyon",
-"captured_at": "2026-01-02T19:01:21+01:00",
-"status": "unknown",
-"confidence": "low",
-"notes": "Attempted Booking.com search; listing URL not yet identified (timeboxed).",
-"screenshot_path": null
+"metric_value": "https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html",
+"url": "https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html",
+"captured_at": "2026-01-03T01:57:11+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "Captured Booking.com listing; canonical URL from <link rel=\"canonical\">. HTML snapshot: data/flaneur/raw/flaneur__booking__listing__20260103.html.",
+"screenshot_path": "data/flaneur/screenshots/flaneur__booking__listing__20260103.png"
 },
 {
 "target": "flaneur",
@@ -339,6 +339,42 @@
 "notes": "Hostelworld header shows '8.1 Fabulous (2332)'.",
 "screenshot_path": "data/flaneur/screenshots/flaneur__hostelworld__20260102.png"
 },
+{
+"target": "flaneur",
+"source": "hostelworld",
+"metric_name": "kitchen_facilities",
+"metric_value": "Self-Catering Facilities, Fridge/Freezer, Utensils, Microwave, Pots and Pans, Sink, Stove",
+"url": "https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/",
+"captured_at": "2026-01-03T01:55:47+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "From Hostelworld facilities list captured separately (verify/results/hostelworld_facilities_flaneur_100844.json).",
+"screenshot_path": null
+},
+{
+"target": "flaneur",
+"source": "official_site",
+"metric_name": "direct_booking_min_bed_eur",
+"metric_value": 22.88,
+"url": "https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/",
+"captured_at": "2026-01-03T01:21:09+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Dortoir mixte 16 lits' (available units: 10). Refund policy hint: Remboursable jusqu'a 15h la veille. HTML snapshot: data/flaneur/raw/flaneur__roomraccoon__pricing__20260103_20260104__20260103.html.",
+"screenshot_path": "data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png"
+},
+{
+"target": "flaneur",
+"source": "official_site",
+"metric_name": "direct_booking_min_private_room_eur",
+"metric_value": 50.88,
+"url": "https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/",
+"captured_at": "2026-01-03T01:21:09+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Chambre Privée 4 personnes' (availability not shown). Refund policy hint: Remboursable jusqu'a 15h la veille. HTML snapshot: data/flaneur/raw/flaneur__roomraccoon__pricing__20260103_20260104__20260103.html.",
+"screenshot_path": "data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png"
+},
 {
 "target": "flaneur",
 "source": "booking",
@@ -351,6 +387,30 @@
 "notes": "Booking.com returns AWS WAF 'verify you're not a robot' JS challenge in curl capture.",
 "screenshot_path": "data/flaneur/screenshots/flaneur__booking__20260102.png"
 },
+{
+"target": "flaneur",
+"source": "booking",
+"metric_name": "rating",
+"metric_value": 8.1,
+"url": "https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html",
+"captured_at": "2026-01-03T01:57:11+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "AggregateRating.ratingValue from JSON-LD on the listing page. HTML snapshot: data/flaneur/raw/flaneur__booking__listing__20260103.html.",
+"screenshot_path": "data/flaneur/screenshots/flaneur__booking__listing__20260103.png"
+},
+{
+"target": "flaneur",
+"source": "booking",
+"metric_name": "review_count",
+"metric_value": 502,
+"url": "https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html",
+"captured_at": "2026-01-03T01:57:11+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "AggregateRating.reviewCount from JSON-LD on the listing page. HTML snapshot: data/flaneur/raw/flaneur__booking__listing__20260103.html.",
+"screenshot_path": "data/flaneur/screenshots/flaneur__booking__listing__20260103.png"
+},
 {
 "target": "flaneur",
 "source": "instagram",
@@ -363,6 +423,30 @@
 "notes": "Instagram profile did not yield follower counts in automated capture (likely login/consent wall).",
 "screenshot_path": "data/flaneur/screenshots/flaneur__instagram__20260102.png"
 },
+{
+"target": "flaneur",
+"source": "instagram",
+"metric_name": "followers_count",
+"metric_value": 2296,
+"url": "https://www.instagram.com/leflaneur_gh/",
+"captured_at": "2026-01-02T20:23:30+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "From og:description meta (verify/results/flaneur_googlebot_audit.jsonl).",
+"screenshot_path": null
+},
+{
+"target": "flaneur",
+"source": "instagram",
+"metric_name": "posts_count",
+"metric_value": 742,
+"url": "https://www.instagram.com/leflaneur_gh/",
+"captured_at": "2026-01-02T20:23:30+00:00",
+"status": "ok",
+"confidence": "high",
+"notes": "From og:description meta (verify/results/flaneur_googlebot_audit.jsonl).",
+"screenshot_path": null
+},
 {
 "target": "flaneur",
 "source": "facebook",
@@ -72,8 +72,9 @@ Observed:
 
 ## Distribution status (required platforms)
 
-- Booking.com: blocked by WAF JS challenge in this environment
-- https://www.booking.com/searchresults.html?ss=Le%20Fl%C3%A2neur%20Guesthouse%20Lyon
+- Booking.com: listing captured (note: WAF challenge also occurred on search access in this environment)
+- Listing: https://www.booking.com/hotel/fr/le-flaneur-guesthouse.fr.html
+- Rating: 8.1; Reviews: 502 (schema.org JSON-LD on listing page)
 - Hostelworld: listing accessible + rating/reviews captured
 - https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/
 - TripAdvisor: blocked by captcha/JS challenge in this environment
@@ -81,7 +82,7 @@ Observed:
 
 ## Socials (status)
 
-- Instagram: blocked/login/consent wall in automated capture (no follower metric extracted)
+- Instagram: 2,296 followers; 742 posts (from `og:description` meta; UI may still prompt login depending on session)
 - https://www.instagram.com/leflaneur_gh/
 - Facebook: blocked/login required in automated capture
 - https://www.facebook.com/leflaneurlyon
@@ -92,6 +93,16 @@ Observed:
 
 Unknown (no explicit NYE min-nights / sold-out / seasonal policy found on captured pages).
 
+## Direct booking snapshot (proxy window)
+
+Proxy window used (past dates not queryable): 2026-01-03 to 2026-01-04.
+
+- Direct booking engine: RoomRaccoon (`booking.roomraccoon.fr`)
+- Min dorm bed: 22.88 EUR ("Dortoir mixte 16 lits"; 10 beds available in snapshot)
+- Min private room: 50.88 EUR ("Chambre Privee 4 personnes")
+
+Kitchen signal (Hostelworld facilities): full self-catering (stove, utensils, fridge/freezer).
+
 ## Strong / weak vs HO36 (evidence-based only)
 
 Strong signals:
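The direct-booking snapshot in the report hunk above (22.88 EUR dorm bed, 50.88 EUR private room for Flâneur) can be set against the HO36 rows added in the same commit (28.0 EUR and 55.0 EUR). A small worked calculation, as a sketch only: the prices come from the evidence rows, while the `price_gap` helper and the choice to express the gap as a share of the HO36 price are assumptions made for illustration.

```python
# Minimum direct-booking prices from the 2026-01-03 to 2026-01-04 proxy window,
# as recorded in the evidence rows of this commit.
SNAPSHOT = {
    "direct_booking_min_bed_eur": {"flaneur": 22.88, "ho36": 28.0},
    "direct_booking_min_private_room_eur": {"flaneur": 50.88, "ho36": 55.0},
}


def price_gap(flaneur, ho36):
    """Return (gap in EUR, gap as a percentage of the HO36 price)."""
    gap = round(ho36 - flaneur, 2)
    pct = round(100 * (ho36 - flaneur) / ho36, 1)
    return gap, pct


for metric, prices in SNAPSHOT.items():
    gap, pct = price_gap(prices["flaneur"], prices["ho36"])
    print(f"{metric}: Flaneur is {gap:.2f} EUR ({pct}%) below HO36")
```

This is a single-night proxy window, and the two offers are not like-for-like: the Flâneur rates are noted as refundable until 15h the day before, whereas the HO36 rates are noted as non-refundable.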
3197
data/flaneur/raw/flaneur__booking__listing__20260103.html
Normal file
File diff suppressed because one or more lines are too long
BIN
data/flaneur/screenshots/flaneur__booking__listing__20260103.png
Normal file
Binary file not shown.
After size: 274 KiB
Binary file not shown.
After size: 2 MiB
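The Booking.com rating and review count in this commit are noted as coming from `AggregateRating` in the JSON-LD of the listing page, whose HTML snapshot is the suppressed file above. A minimal standard-library sketch of that kind of extraction, assuming the page embeds a `<script type="application/ld+json">` block whose top-level object carries an `aggregateRating` property. The class and function names are hypothetical, the sample HTML is synthetic, and nested structures such as `@graph` are not handled.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def aggregate_rating(html):
    """Return (ratingValue, reviewCount) from the first usable block, else None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than fail the capture
        for item in doc if isinstance(doc, list) else [doc]:
            rating = item.get("aggregateRating") if isinstance(item, dict) else None
            if isinstance(rating, dict):
                return float(rating["ratingValue"]), int(rating["reviewCount"])
    return None


# Synthetic snippet using the values recorded for the Flaneur listing.
sample_html = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Hostel", "aggregateRating": {"@type": "AggregateRating", '
    '"ratingValue": 8.1, "reviewCount": 502}}'
    "</script></head><body></body></html>"
)
print(aggregate_rating(sample_html))
```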
@@ -8,24 +8,30 @@ ho36,official_site,booking_cta_text,Réserver un lit,https://ho36lyon.com/,2026-
 ho36,official_site,booking_engine_domain,www.mews.li,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,Homepage loads https://www.mews.li/distributor/distributor.min.js and uses a Mews widget token.,data/ho36/screenshots/ho36__official_site__home__20260102.png
 ho36,official_site,languages_available,"fr,en,it,nl",https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,From hreflang + language switcher.,data/ho36/screenshots/ho36__official_site__home__20260102.png
 ho36,official_site,address,"36 Rue Montesquieu, 69007 Lyon",https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,From official site footer map link.,data/ho36/screenshots/ho36__official_site__home__20260102.png
-ho36,official_site,starting_price_eur_per_night,22,https://ho36lyon.com/en/,2026-01-02T18:22:11+00:00,ok,high,Marketing claim in meta description: “Beds from 22€/night”.,data/ho36/screenshots/ho36__official_site__en__20260102.png
-ho36,official_site,private_rooms_count,13,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,Section “13 chambres privatives”.,data/ho36/screenshots/ho36__official_site__home__20260102.png
-ho36,official_site,dorm_beds_count,50,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,Section “50 lits en dortoirs”.,data/ho36/screenshots/ho36__official_site__home__20260102.png
-ho36,official_site,amenities_highlights,"accueil_multilingue_24_7,bagagerie,boutique_essentiels,petit_dejeuner,cafeterie,bar_bieres_cocktails,espaces_de_vie,programmation_artistique",https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,med,"Compiled from visible sections (“Services disponibles”, “PETIT DEJEUNER ET BOISSONS 7j/7”, and experience description).",data/ho36/screenshots/ho36__official_site__home__20260102.png
+ho36,official_site,starting_price_eur_per_night,22,https://ho36lyon.com/en/,2026-01-02T18:22:11+00:00,ok,high,"Marketing claim in meta description: ""Beds from 22€/night"".",data/ho36/screenshots/ho36__official_site__en__20260102.png
+ho36,official_site,private_rooms_count,13,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,"Section ""13 chambres privatives"".",data/ho36/screenshots/ho36__official_site__home__20260102.png
+ho36,official_site,dorm_beds_count,50,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,high,"Section ""50 lits en dortoirs"".",data/ho36/screenshots/ho36__official_site__home__20260102.png
+ho36,official_site,amenities_highlights,"accueil_multilingue_24_7,bagagerie,boutique_essentiels,petit_dejeuner,cafeterie,bar_bieres_cocktails,espaces_de_vie,programmation_artistique",https://ho36lyon.com/,2026-01-02T18:03:15+00:00,ok,med,"Compiled from visible sections (""Services disponibles"", ""PETIT DEJEUNER ET BOISSONS 7j/7"", and experience description).",data/ho36/screenshots/ho36__official_site__home__20260102.png
 ho36,other,canonical_url_google_maps_shortlink,https://maps.app.goo.gl/vfGnGGQxJBNwvdgX8,https://maps.app.goo.gl/vfGnGGQxJBNwvdgX8,2026-01-02T18:03:15+00:00,ok,high,Linked from official site footer.,data/ho36/screenshots/ho36__official_site__home__20260102.png
-ho36,google_maps,rating,4.1,"https://www.google.fr/maps/place/HO36+Hostel+Lyon/@45.7529047,4.8394703,17z/data=!4m9!3m8!1s0x47f4ea44c206c2fd:0xb36a1c20ef67ead4!5m2!4m1!1i2!8m2!3d45.752901!4d4.8420452!16s%2Fg%2F1tnpkbvv?entry=tts",2026-01-02T18:05:56+00:00,ok,med,"Extracted from Google Maps UI (displayed as “4,1”). Review count was not visible in the rendered view.",data/ho36/screenshots/ho36__google_maps__20260102.png
-ho36,google_maps,review_count,1447,https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d11135.645489923992!2d4.84204!3d45.7529227!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x0%3A0xb36a1c20ef67ead4!2sho36%20Lyon%20Guilloti%C3%A8re!5e0!3m2!1sfr!2sfr!4v1567089009427!5m2!1sfr!2sfr,2026-01-02T18:40:13+00:00,ok,high,Extracted from the Google Maps embed iframe (same embed URL appears on the official site). The embed shows “1 447 avis”.,data/ho36/screenshots/ho36__google_maps_embed_iframe__20260102.png
+ho36,google_maps,rating,4.1,"https://www.google.fr/maps/place/HO36+Hostel+Lyon/@45.7529047,4.8394703,17z/data=!4m9!3m8!1s0x47f4ea44c206c2fd:0xb36a1c20ef67ead4!5m2!4m1!1i2!8m2!3d45.752901!4d4.8420452!16s%2Fg%2F1tnpkbvv?entry=tts",2026-01-02T18:05:56+00:00,ok,med,"Extracted from Google Maps UI (displayed as ""4,1""). Review count was not visible in the rendered view.",data/ho36/screenshots/ho36__google_maps__20260102.png
+ho36,google_maps,review_count,1447,https://www.google.com/maps/embed?pb=!1m14!1m8!1m3!1d11135.645489923992!2d4.84204!3d45.7529227!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x0%3A0xb36a1c20ef67ead4!2sho36%20Lyon%20Guilloti%C3%A8re!5e0!3m2!1sfr!2sfr!4v1567089009427!5m2!1sfr!2sfr,2026-01-02T18:40:13+00:00,ok,high,"Extracted from the Google Maps embed iframe (same embed URL appears on the official site). The embed shows ""1,447 avis"".",data/ho36/screenshots/ho36__google_maps_embed_iframe__20260102.png
 ho36,instagram,profile_url,https://www.instagram.com/ho36hotel_lyon/,https://www.instagram.com/ho36hotel_lyon/,2026-01-02T18:11:22+00:00,ok,high,Linked from official site footer.,data/ho36/screenshots/ho36__instagram__20260102.png
-ho36,instagram,followers_count,3247,https://www.instagram.com/ho36hotel_lyon/,2026-01-02T18:11:22+00:00,ok,high,"From og:description meta (“3,247 followers…”).",data/ho36/screenshots/ho36__instagram__20260102.png
-ho36,instagram,posts_count,108,https://www.instagram.com/ho36hotel_lyon/,2026-01-02T18:11:22+00:00,ok,high,From og:description meta (“…108 publications…”).,data/ho36/screenshots/ho36__instagram__20260102.png
+ho36,instagram,followers_count,3247,https://www.instagram.com/ho36hotel_lyon/,2026-01-02T18:11:22+00:00,ok,high,"From og:description meta (""3,247 followers..."").",data/ho36/screenshots/ho36__instagram__20260102.png
+ho36,instagram,posts_count,108,https://www.instagram.com/ho36hotel_lyon/,2026-01-02T18:11:22+00:00,ok,high,"From og:description meta (""...108 publications..."").",data/ho36/screenshots/ho36__instagram__20260102.png
 ho36,facebook,page_url,https://www.facebook.com/ho36hotels/,https://www.facebook.com/ho36hotels/,2026-01-02T18:12:26+00:00,ok,high,Linked from official site footer.,data/ho36/screenshots/ho36__facebook__20260102.png
 ho36,facebook,likes_count,3185,https://www.facebook.com/ho36hotels/,2026-01-02T18:12:26+00:00,ok,high,From og:description meta (brand-level page: includes Lyon + other locations).,data/ho36/screenshots/ho36__facebook__20260102.png
 ho36,facebook,people_were_here,104,https://www.facebook.com/ho36hotels/,2026-01-02T18:12:26+00:00,ok,high,From og:description meta (brand-level page).,data/ho36/screenshots/ho36__facebook__20260102.png
 ho36,tripadvisor,listing_url,https://www.tripadvisor.fr/Hotel_Review-g187265-d293643-Reviews-Ho36_Hostel-Lyon_Rhone_Auvergne_Rhone_Alpes.html,https://www.tripadvisor.fr/Hotel_Review-g187265-d293643-Reviews-Ho36_Hostel-Lyon_Rhone_Auvergne_Rhone_Alpes.html,2026-01-02T18:09:34+00:00,blocked,low,Blocked by DataDome CAPTCHA in this environment (no bypass attempted).,data/ho36/screenshots/ho36__tripadvisor__20260102.png
 ho36,booking,listing_url,https://www.booking.com/hotel/fr/ho36-hostel.html,https://www.booking.com/hotel/fr/ho36-hostel.html,2026-01-02T19:11:29+00:00,blocked,high,Booking.com returned an AWS WAF / bot challenge (HTTP 202 + challenge.js) when attempting to access the (probable) HO36 listing URL. No bypass attempted.,data/ho36/screenshots/ho36__booking__listing_waf__20260102.png
+ho36,booking,listing_url_canonical,https://www.booking.com/hotel/fr/ho36-hostels-lyon.html,https://www.booking.com/hotel/fr/ho36-hostels-lyon.html,2026-01-03T01:57:06+00:00,ok,high,"Captured listing successfully; canonical URL from <link rel=""canonical"">. HTML snapshot: data/ho36/raw/ho36__booking__listing__20260103.html.",data/ho36/screenshots/ho36__booking__listing__20260103.png
+ho36,booking,rating,8.2,https://www.booking.com/hotel/fr/ho36-hostels-lyon.html,2026-01-03T01:57:06+00:00,ok,high,AggregateRating.ratingValue from JSON-LD on the listing page.,data/ho36/screenshots/ho36__booking__listing__20260103.png
+ho36,booking,review_count,1356,https://www.booking.com/hotel/fr/ho36-hostels-lyon.html,2026-01-03T01:57:06+00:00,ok,high,AggregateRating.reviewCount from JSON-LD on the listing page.,data/ho36/screenshots/ho36__booking__listing__20260103.png
 ho36,hostelworld,listing_url,https://www.hostelworld.com/hostels/p/270217/ho36-hostel/,https://www.hostelworld.com/hostels/p/270217/ho36-hostel/,2026-01-02T19:01:18+00:00,ok,high,"Located via Hostelworld Lyon directory page, then captured directly.",data/ho36/screenshots/ho36__hostelworld_listing__20260102.png
 ho36,hostelworld,rating,8.86,https://www.hostelworld.com/hostels/p/270217/ho36-hostel/,2026-01-02T19:01:18+00:00,ok,high,AggregateRating.ratingValue from schema.org JSON-LD on the listing page.,data/ho36/screenshots/ho36__hostelworld_listing__20260102.png
 ho36,hostelworld,review_count,1587,https://www.hostelworld.com/hostels/p/270217/ho36-hostel/,2026-01-02T19:01:18+00:00,ok,high,AggregateRating.reviewCount from schema.org JSON-LD on the listing page.,data/ho36/screenshots/ho36__hostelworld_listing__20260102.png
 ho36,hostelworld,lyon_directory_position,8,https://www.hostelworld.com/hostels/europe/france/lyon/,2026-01-02T19:09:05+00:00,ok,med,Position in the Hostelworld Lyon directory page (schema.org ItemList position). Sorting may vary by user/session.,data/ho36/screenshots/ho36__hostelworld_lyon__20260102.png
+ho36,hostelworld,kitchen_facilities,Microwave,https://www.hostelworld.com/hostels/p/270217/ho36-hostel/,2026-01-03T01:55:38+00:00,ok,high,From Hostelworld facilities list captured separately (verify/results/hostelworld_facilities_ho36_270217.json).,
+ho36,official_site,direct_booking_min_bed_eur,28.0,https://ho36lyon.com/,2026-01-03T01:21:23+00:00,ok,high,Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Lit en dortoir mixte' (available units: 3). Refund policy hint: Non remboursable. Source: embedded Mews widget. Text snapshot: data/ho36/raw/ho36__mews__pricing__20260103_20260104__20260103.txt.,data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png
+ho36,official_site,direct_booking_min_private_room_eur,55.0,https://ho36lyon.com/,2026-01-03T01:21:23+00:00,ok,high,Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Chambre single RDC' (available units: 1). Refund policy hint: Non remboursable. Source: embedded Mews widget. Text snapshot: data/ho36/raw/ho36__mews__pricing__20260103_20260104__20260103.txt.,data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png
 ho36,tiktok,profile_url,,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,unknown,low,No TikTok link found on official site footer/header in captured pages.,data/ho36/screenshots/ho36__official_site__home__20260102.png
 ho36,other,nye_availability_or_policy_indicator,,https://ho36lyon.com/,2026-01-02T18:03:15+00:00,unknown,low,No publicly visible NYE-specific sold-out/min-night indicator captured without deep booking-engine interaction.,data/ho36/screenshots/ho36__official_site__home__20260102.png
@@ -120,7 +120,7 @@
 "captured_at": "2026-01-02T18:22:11+00:00",
 "status": "ok",
 "confidence": "high",
-"notes": "Marketing claim in meta description: “Beds from 22€/night”.",
+"notes": "Marketing claim in meta description: \"Beds from 22€/night\".",
 "screenshot_path": "data/ho36/screenshots/ho36__official_site__en__20260102.png"
 },
 {
@@ -132,7 +132,7 @@
 "captured_at": "2026-01-02T18:03:15+00:00",
 "status": "ok",
 "confidence": "high",
-"notes": "Section “13 chambres privatives”.",
+"notes": "Section \"13 chambres privatives\".",
 "screenshot_path": "data/ho36/screenshots/ho36__official_site__home__20260102.png"
 },
 {
@@ -144,7 +144,7 @@
 "captured_at": "2026-01-02T18:03:15+00:00",
 "status": "ok",
 "confidence": "high",
-"notes": "Section “50 lits en dortoirs”.",
+"notes": "Section \"50 lits en dortoirs\".",
 "screenshot_path": "data/ho36/screenshots/ho36__official_site__home__20260102.png"
 },
 {
@@ -156,7 +156,7 @@
 "captured_at": "2026-01-02T18:03:15+00:00",
 "status": "ok",
 "confidence": "med",
-"notes": "Compiled from visible sections (“Services disponibles”, “PETIT DEJEUNER ET BOISSONS 7j/7”, and experience description).",
+"notes": "Compiled from visible sections (\"Services disponibles\", \"PETIT DEJEUNER ET BOISSONS 7j/7\", and experience description).",
 "screenshot_path": "data/ho36/screenshots/ho36__official_site__home__20260102.png"
 },
 {
@@ -180,7 +180,7 @@
 "captured_at": "2026-01-02T18:05:56+00:00",
 "status": "ok",
 "confidence": "med",
-"notes": "Extracted from Google Maps UI (displayed as “4,1”). Review count was not visible in the rendered view.",
+"notes": "Extracted from Google Maps UI (displayed as \"4,1\"). Review count was not visible in the rendered view.",
 "screenshot_path": "data/ho36/screenshots/ho36__google_maps__20260102.png"
 },
 {
@@ -192,7 +192,7 @@
 "captured_at": "2026-01-02T18:40:13+00:00",
 "status": "ok",
 "confidence": "high",
-"notes": "Extracted from the Google Maps embed iframe (same embed URL appears on the official site). The embed shows \u201c1\u202f447 avis\u201d.",
+"notes": "Extracted from the Google Maps embed iframe (same embed URL appears on the official site). The embed shows \"1,447 avis\".",
 "screenshot_path": "data/ho36/screenshots/ho36__google_maps_embed_iframe__20260102.png"
 },
 {
@@ -216,7 +216,7 @@
 "captured_at": "2026-01-02T18:11:22+00:00",
 "status": "ok",
 "confidence": "high",
-"notes": "From og:description meta (“3,247 followers…”).",
+"notes": "From og:description meta (\"3,247 followers...\").",
 "screenshot_path": "data/ho36/screenshots/ho36__instagram__20260102.png"
 },
 {
@@ -228,7 +228,7 @@
 "captured_at": "2026-01-02T18:11:22+00:00",
 "status": "ok",
 "confidence": "high",
-"notes": "From og:description meta (“…108 publications…”).",
+"notes": "From og:description meta (\"...108 publications...\").",
 "screenshot_path": "data/ho36/screenshots/ho36__instagram__20260102.png"
 },
 {
@@ -291,6 +291,42 @@
 "notes": "Booking.com returned an AWS WAF / bot challenge (HTTP 202 + challenge.js) when attempting to access the (probable) HO36 listing URL. No bypass attempted.",
 "screenshot_path": "data/ho36/screenshots/ho36__booking__listing_waf__20260102.png"
 },
+{
+"target": "ho36",
+"source": "booking",
+"metric_name": "listing_url_canonical",
+"metric_value": "https://www.booking.com/hotel/fr/ho36-hostels-lyon.html",
|
||||
"url": "https://www.booking.com/hotel/fr/ho36-hostels-lyon.html",
|
||||
"captured_at": "2026-01-03T01:57:06+00:00",
|
||||
"status": "ok",
|
||||
"confidence": "high",
|
||||
"notes": "Captured listing successfully; canonical URL from <link rel=\"canonical\">. HTML snapshot: data/ho36/raw/ho36__booking__listing__20260103.html.",
|
||||
"screenshot_path": "data/ho36/screenshots/ho36__booking__listing__20260103.png"
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "booking",
|
||||
"metric_name": "rating",
|
||||
"metric_value": 8.2,
|
||||
"url": "https://www.booking.com/hotel/fr/ho36-hostels-lyon.html",
|
||||
"captured_at": "2026-01-03T01:57:06+00:00",
|
||||
"status": "ok",
|
||||
"confidence": "high",
|
||||
"notes": "AggregateRating.ratingValue from JSON-LD on the listing page.",
|
||||
"screenshot_path": "data/ho36/screenshots/ho36__booking__listing__20260103.png"
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "booking",
|
||||
"metric_name": "review_count",
|
||||
"metric_value": 1356,
|
||||
"url": "https://www.booking.com/hotel/fr/ho36-hostels-lyon.html",
|
||||
"captured_at": "2026-01-03T01:57:06+00:00",
|
||||
"status": "ok",
|
||||
"confidence": "high",
|
||||
"notes": "AggregateRating.reviewCount from JSON-LD on the listing page.",
|
||||
"screenshot_path": "data/ho36/screenshots/ho36__booking__listing__20260103.png"
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "hostelworld",
|
||||
|
|
@ -339,6 +375,42 @@
|
|||
"notes": "Position in the Hostelworld Lyon directory page (schema.org ItemList position). Sorting may vary by user/session.",
|
||||
"screenshot_path": "data/ho36/screenshots/ho36__hostelworld_lyon__20260102.png"
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "hostelworld",
|
||||
"metric_name": "kitchen_facilities",
|
||||
"metric_value": "Microwave",
|
||||
"url": "https://www.hostelworld.com/hostels/p/270217/ho36-hostel/",
|
||||
"captured_at": "2026-01-03T01:55:38+00:00",
|
||||
"status": "ok",
|
||||
"confidence": "high",
|
||||
"notes": "From Hostelworld facilities list captured separately (verify/results/hostelworld_facilities_ho36_270217.json).",
|
||||
"screenshot_path": null
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "official_site",
|
||||
"metric_name": "direct_booking_min_bed_eur",
|
||||
"metric_value": 28.0,
|
||||
"url": "https://ho36lyon.com/",
|
||||
"captured_at": "2026-01-03T01:21:23+00:00",
|
||||
"status": "ok",
|
||||
"confidence": "high",
|
||||
"notes": "Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Lit en dortoir mixte' (available units: 3). Refund policy hint: Non remboursable. Source: embedded Mews widget. Text snapshot: data/ho36/raw/ho36__mews__pricing__20260103_20260104__20260103.txt.",
|
||||
"screenshot_path": "data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png"
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "official_site",
|
||||
"metric_name": "direct_booking_min_private_room_eur",
|
||||
"metric_value": 55.0,
|
||||
"url": "https://ho36lyon.com/",
|
||||
"captured_at": "2026-01-03T01:21:23+00:00",
|
||||
"status": "ok",
|
||||
"confidence": "high",
|
||||
"notes": "Pricing snapshot window 2026-01-03 to 2026-01-04. Room: 'Chambre single RDC' (available units: 1). Refund policy hint: Non remboursable. Source: embedded Mews widget. Text snapshot: data/ho36/raw/ho36__mews__pricing__20260103_20260104__20260103.txt.",
|
||||
"screenshot_path": "data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png"
|
||||
},
|
||||
{
|
||||
"target": "ho36",
|
||||
"source": "tiktok",
|
||||
|
|
@ -365,7 +437,7 @@
|
|||
}
|
||||
],
|
||||
"profile": {
|
||||
"positioning": "Atypical, cool and affordable hostel designed as a place of exchange and discovery (with artistic programming) offering a “true travel experience”.",
|
||||
"positioning": "Atypical, cool and affordable hostel designed as a place of exchange and discovery (with artistic programming) offering a \"true travel experience\".",
|
||||
"target_audience": [
|
||||
"hostel_travelers",
|
||||
"backpackers",
|
||||
|
|
|
|||
|
|
@ -1,8 +1,9 @@
|
|||
# HO36 Lyon — Footprint profile (Session A)
|
||||
# HO36 Lyon - Footprint profile (Session A)
|
||||
|
||||
## Canonical links
|
||||
|
||||
- Official site: https://ho36lyon.com/
|
||||
- Booking.com: https://www.booking.com/hotel/fr/ho36-hostels-lyon.html
|
||||
- Google Maps (shortlink): https://maps.app.goo.gl/vfGnGGQxJBNwvdgX8
|
||||
- Hostelworld: https://www.hostelworld.com/hostels/p/270217/ho36-hostel/
|
||||
- Instagram: https://www.instagram.com/ho36hotel_lyon/
|
||||
|
|
@ -12,28 +13,29 @@
|
|||
## Quick facts (from captured sources)
|
||||
|
||||
- Address: 36 Rue Montesquieu, 69007 Lyon
|
||||
- Booking CTA: “Réserver un lit” embedded on homepage
|
||||
- Booking CTA: "Réserver un lit" embedded on homepage
|
||||
- Booking engine: Mews (`www.mews.li`) embedded widget on the official site
|
||||
- Languages visible: FR / EN / IT / NL
|
||||
- Inventory claims: 13 private rooms + 50 dorm beds
|
||||
- Price signal: “Beds from 22€/night” (site meta description)
|
||||
- Price signal: "Beds from 22 EUR/night" (site meta description)
|
||||
|
||||
## Positioning / messaging (exact copy excerpts)
|
||||
|
||||
- FR meta description: “Le HO36 est un lieu atypique, cool et abordable… lieu d'échange et de découvertes…”
|
||||
- EN meta description: “HO36 is an atypical, cool and affordable place… designed as a place of exchange and discovery…”
|
||||
- Tagline (JSON‑LD): “Eat - drink - live - sleep”
|
||||
- FR meta description: "Le HO36 est un lieu atypique, cool et abordable... lieu d'échange et de découvertes..."
|
||||
- EN meta description: "HO36 is an atypical, cool and affordable place... designed as a place of exchange and discovery..."
|
||||
- Tagline (JSON-LD): "Eat - drink - live - sleep"
|
||||
|
||||
## Amenities / experience cues (high-signal)
|
||||
|
||||
- “Services disponibles” includes: 24/7 multilingual reception, luggage storage, small essentials shop
|
||||
- On-site food/drink cues: breakfast (7:00–11:00), cafeteria (7:00–23:30), beers + “creative cocktails”
|
||||
- “L'expérience ho36” text explicitly emphasizes: quality service at “prix doux”, cosmopolitan teams, and “programmation artistique”
|
||||
- "Services disponibles" includes: 24/7 multilingual reception, luggage storage, small essentials shop
|
||||
- On-site food/drink cues: breakfast (7:00-11:00), cafeteria (7:00-23:30), beers + "creative cocktails"
|
||||
- "L'expérience ho36" text explicitly emphasizes: quality service at "prix doux", cosmopolitan teams, and "programmation artistique"
|
||||
|
||||
## Social proof (public, no-login)
|
||||
|
||||
- Booking.com: 8.2 rating; 1,356 reviews (schema.org JSON-LD on listing page)
|
||||
- Instagram @ho36hotel_lyon: 3,247 followers; 108 posts (from `og:description`)
|
||||
- Facebook page “HO36” (brand-level page): 3,185 likes; 104 “people were here” (from `og:description`)
|
||||
- Facebook page "HO36" (brand-level page): 3,185 likes; 104 "people were here" (from `og:description`)
|
||||
- Google Maps: 4.1 rating; 1,447 reviews (via Google Maps embed iframe)
|
||||
- Hostelworld: 8.86 rating; 1,587 reviews (schema.org JSON-LD on listing page)
|
||||
|
||||
|
|
@ -41,5 +43,5 @@
|
|||
|
||||
- Google Maps: 4.1 rating; 1,447 reviews
|
||||
- TripAdvisor: blocked by DataDome CAPTCHA (no bypass attempted)
|
||||
- Booking.com: AWS WAF / bot challenge when attempting listing access (no bypass attempted)
|
||||
- Booking.com: listing captured; 8.2 rating; 1,356 reviews (note: WAF challenge also occurred in an earlier attempt)
|
||||
- Hostelworld: listing accessible and captured (rating + review count extracted)
|
||||
|
|
|
|||
3266
data/ho36/raw/ho36__booking__listing__20260103.html
Normal file
File diff suppressed because one or more lines are too long
|
|
@ -0,0 +1,103 @@
|
|||
Passer à :
|
||||
Barre d'outils
|
||||
Navigation du site
|
||||
Contenu principal
|
||||
HO36 Hostel Lyon
|
||||
FR - FR
|
||||
EUR
|
||||
Fermer
|
||||
Dates
|
||||
2
|
||||
Catégories
|
||||
Tarifs
|
||||
Récapitulatif
|
||||
Détails et paiement
|
||||
Sélectionnez une catégorie
|
||||
Nuits sélectionnées 1
|
||||
Sam. 03/01/2026
|
||||
→
|
||||
Dim. 04/01/2026
|
||||
Clients sélectionnés 1
|
||||
Adultes 1
|
||||
Modifier
|
||||
Image suivante
|
||||
Image précédente
|
||||
2
|
||||
Chambre Ho my God
|
||||
Non remboursable
|
||||
Personnes maximum : 2
|
||||
Disponible : 4
|
||||
|
||||
Grande chambre double avec lit king size et salle de bain privative
|
||||
|
||||
Plus
|
||||
À partir de
|
||||
75,00 €
|
||||
par chambre/par nuit
|
||||
(Hors taxe de séjour, TVA comprise)
|
||||
Afficher les tarifs
|
||||
Image suivante
|
||||
Image précédente
|
||||
3
|
||||
Chambre familiale
|
||||
Non remboursable
|
||||
Personnes maximum : 6
|
||||
Disponible : 1
|
||||
|
||||
Chambre avec 1 lit double et 2 lits superposés pour accueillir 6 personnes, salle de bain privative, idéal pour les familles
|
||||
|
||||
Plus
|
||||
À partir de
|
||||
180,00 €
|
||||
par chambre/par nuit
|
||||
(Hors taxe de séjour, TVA comprise)
|
||||
Afficher les tarifs
|
||||
Image suivante
|
||||
Image précédente
|
||||
2
|
||||
Chambre single RDC
|
||||
Non remboursable
|
||||
Personnes maximum : 1
|
||||
Disponible : 1
|
||||
|
||||
Chambre lit simple au RDC avec salle de bain privée
|
||||
|
||||
Plus
|
||||
À partir de
|
||||
55,00 €
|
||||
par chambre/par nuit
|
||||
(Hors taxe de séjour, TVA comprise)
|
||||
Afficher les tarifs
|
||||
Image suivante
|
||||
Image précédente
|
||||
2
|
||||
Dortoir privatif pour 2 personnes
|
||||
Non remboursable
|
||||
Personnes maximum : 2
|
||||
Disponible : 1
|
||||
|
||||
Dortoir privatif avec lit superposé, salle de bain partagée
|
||||
|
||||
Plus
|
||||
À partir de
|
||||
76,00 €
|
||||
par dortoir/par nuit
|
||||
(Hors taxe de séjour, TVA comprise)
|
||||
Afficher les tarifs
|
||||
Image suivante
|
||||
Image précédente
|
||||
4
|
||||
Lit en dortoir mixte
|
||||
Non remboursable
|
||||
Personnes maximum : 1
|
||||
Disponible : 3
|
||||
|
||||
1 lit dans un dortoir mixte de 6 à 8 personnes
|
||||
|
||||
Plus
|
||||
À partir de
|
||||
28,00 €
|
||||
par lit/par nuit
|
||||
(Hors taxe de séjour, TVA comprise)
|
||||
Afficher les tarifs
|
||||
Catégories sans disponibilité
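The text snapshot above follows a repeating layout per category (name, refund hint, maximum persons, `Disponible : N`, description, then `À partir de` and a price). A minimal sketch of how such a snapshot could be turned into structured rows is shown below. It is not the extraction script used in this repository; the function name `parse_categories`, the regular expressions, and the assumption that the category name always sits three non-empty lines above the `Disponible` line are all illustrative.

```python
import re

# Assumed line shapes, taken from the snapshot text above.
AVAIL_RE = re.compile(r"^Disponible\s*:\s*(\d+)$")
PRICE_RE = re.compile(r"^(\d+),(\d{2})\s*€$")
UNIT_RE = re.compile(r"^par\s+(\w+)/par\s+nuit$")


def parse_categories(text):
    """Return one dict per category found in a Mews-style text snapshot."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    categories = []
    for i, line in enumerate(lines):
        avail = AVAIL_RE.match(line)
        if not avail or i < 3:
            continue
        price, unit = None, None
        # Scan forward for the first price, stopping at the next category.
        for j in range(i + 1, len(lines)):
            if AVAIL_RE.match(lines[j]):
                break
            found = PRICE_RE.match(lines[j])
            if found:
                price = float(f"{found.group(1)}.{found.group(2)}")
                if j + 1 < len(lines):
                    unit_match = UNIT_RE.match(lines[j + 1])
                    unit = unit_match.group(1) if unit_match else None
                break
        categories.append({
            "name": lines[i - 3],  # assumption: name is 3 lines above
            "available_units": int(avail.group(1)),
            "min_price_eur": price,
            "unit": unit,
        })
    return categories


# Two categories copied from the snapshot above.
SAMPLE = """
Chambre single RDC
Non remboursable
Personnes maximum : 1
Disponible : 1

Chambre lit simple au RDC avec salle de bain privée

Plus
À partir de
55,00 €
par chambre/par nuit
(Hors taxe de séjour, TVA comprise)
Afficher les tarifs
Image suivante
Image précédente
4
Lit en dortoir mixte
Non remboursable
Personnes maximum : 1
Disponible : 3

1 lit dans un dortoir mixte de 6 à 8 personnes

Plus
À partir de
28,00 €
par lit/par nuit
"""

if __name__ == "__main__":
    for row in parse_categories(SAMPLE):
        print(row)
```

Anchoring on the `Disponible` line rather than on the category name keeps the sketch independent of the image-carousel lines (`Image suivante`, `Image précédente`, photo count) that precede each category.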
|
||||
BIN
data/ho36/screenshots/ho36__booking__listing__20260103.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 324 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 1,010 KiB |
124
reports/flaneur_vs_ho36_gm_report_2026-01-03.md
Normal file
|
|
@ -0,0 +1,124 @@
# Lyon Hostel Footprint Report (HO36 vs Le Flaneur)

Date: 2026-01-03 (UTC)

## Scope and method (what is and is not verified)

- Public, no-login sources only (no CAPTCHA or paywall bypass).
- "HO36 full vs Flaneur empty at New Year" cannot be verified directly after the fact because booking engines do not allow querying past dates. This report uses a proxy stay window to compare current demand signals and conversion drivers.
- Proxy pricing/availability window used: 2026-01-03 to 2026-01-04 (1 night).
- Review theme analysis uses Hostelworld reviews from the last 12 months (public API observed via browser network traffic; no login). Google Maps review text was not collected (only rating + review count).

## Executive summary (what likely explains the gap)

1) Cleanliness/facilities perception is the strongest gap.
   - Flaneur: 11 of 51 Hostelworld reviews in the last 12 months are negative (<= 65/100), with repeated hygiene/bathroom complaints.
   - HO36: 1 of 16 reviews in the last 12 months is negative; cleanliness and facilities subscores average 9.1 and 8.9 out of 10 (vs 7.4 and 7.3 for Flaneur).

2) Booking.com channel traction appears materially stronger for HO36.
   - Similar rating (HO36 8.2 vs Flaneur 8.1), but HO36 has far more Booking reviews (1,356 vs 502), which usually correlates with higher visibility and stronger conversion on that channel.

3) Price is not the primary driver.
   - In the proxy window, Flaneur is cheaper (22.88 vs 28.00 EUR for the cheapest dorm bed) and more flexible (refundable), yet shows materially higher availability than HO36 (10 vs 3 beds on the cheapest dorm).

4) Flaneur has a product advantage (full kitchen) that is not currently compensating for the trust gap.
   - Kitchen/self-catering is a differentiator vs HO36 (microwave-only), but it needs to be paired with visible cleanliness and sleep-comfort improvements to convert.

## Side-by-side comparison (high-signal metrics)

| Category | HO36 | Le Flaneur | Evidence |
|---|---:|---:|---|
| Booking.com rating (reviews) | 8.2 (1,356) | 8.1 (502) | `data/ho36/screenshots/ho36__booking__listing__20260103.png`, `data/flaneur/screenshots/flaneur__booking__listing__20260103.png` |
| Hostelworld rating (reviews) | 8.86 (1,587) | 8.1 (2,332) | `data/ho36/screenshots/ho36__hostelworld_listing__20260102.png`, `data/flaneur/screenshots/flaneur__hostelworld__20260102.png` |
| Google Maps rating (reviews) | 4.1 (1,447) | 4.3 (855) | `data/ho36/screenshots/ho36__google_maps_embed_iframe__20260102.png`, `data/flaneur/screenshots/flaneur__google_maps__20260102.png` |
| Instagram followers | 3,247 | 2,296 | `verify/results/ho36_googlebot_audit.jsonl`, `verify/results/flaneur_googlebot_audit.jsonl` |
| Direct booking engine | Mews (`www.mews.li`) | RoomRaccoon (`booking.roomraccoon.fr`) | `data/ho36/evidence.json`, `data/flaneur/evidence.json` |
| Proxy min dorm bed EUR (availability) | 28.00 (3 beds) | 22.88 (10 beds) | `data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png`, `data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png` |
| Proxy min private room EUR | 55.00 | 50.88 | `data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png`, `data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png` |
| Kitchen (Hostelworld facilities) | Microwave only | Full self-catering | `verify/results/hostelworld_facilities_ho36_270217.json`, `verify/results/hostelworld_facilities_flaneur_100844.json` |

## Pricing and availability (proxy window 2026-01-03 to 2026-01-04)

This is a like-for-like snapshot from each hostel's direct booking engine (not an OTA), used to compare relative demand and conversion constraints.

| Hostel | Cheapest dorm bed | Cheapest private room | Cancellation signal (proxy window) | Availability signal (proxy window) |
|---|---|---|---|---|
| HO36 | 28.00 EUR ("Lit en dortoir mixte") | 55.00 EUR ("Chambre single RDC") | Non-refundable | 3 beds available on cheapest dorm |
| Le Flaneur | 22.88 EUR ("Dortoir mixte 16 lits") | 50.88 EUR ("Chambre Privée 4 personnes") | Refundable until 15:00 day before | 10 beds available on cheapest dorm |

Interpretation: Flaneur is cheaper and more flexible, yet shows higher availability. This points away from price/policy as the root cause and toward trust (cleanliness, comfort, safety perception) and channel visibility.
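
The "cheapest dorm bed" and "beds available" figures can be recomputed from room-level data. The following is a minimal sketch, not the script used for this report. It assumes room entries shaped like those in `verify/results/pricing_window__20260103_20260104__20260103.json` (`name`, `available_units`, `min_price_eur`, `unit`). The helper names `cheapest` and `total_available` are hypothetical, and the two HO36 rows are transcribed by hand from the Mews text snapshot, with their `unit` values assumed.

```python
from typing import Optional

# Flaneur rows copied from the pricing snapshot JSON.
FLANEUR_ROOMS = [
    {"name": "Dortoir mixte 16 lits", "available_units": 10, "min_price_eur": 22.88, "unit": "bed"},
    {"name": "Dortoir mixte 10 lits - 1", "available_units": 8, "min_price_eur": 25.88, "unit": "bed"},
    {"name": "Dortoir mixte 10 lits - 2", "available_units": 9, "min_price_eur": 25.88, "unit": "bed"},
    {"name": "Dortoir féminin 12 lits", "available_units": 11, "min_price_eur": 25.88, "unit": "bed"},
    {"name": "Dortoir mixte 6 lits - 1", "available_units": 5, "min_price_eur": 28.88, "unit": "bed"},
    {"name": "Dortoir mixte 4 lits D4 /D8", "available_units": 7, "min_price_eur": 30.88, "unit": "bed"},
    {"name": "Dortoir mixte 4 lits D7", "available_units": 3, "min_price_eur": 30.88, "unit": "bed"},
    {"name": "Chambre Privée 4 personnes", "available_units": None, "min_price_eur": 50.88, "unit": "room"},
    {"name": "D9 - Dortoir Féminin 4 lits", "available_units": 3, "min_price_eur": 30.88, "unit": "bed"},
]

# Subset of HO36 categories, transcribed from the Mews text snapshot
# (the "unit" values here are an assumption for illustration).
HO36_ROOMS = [
    {"name": "Lit en dortoir mixte", "available_units": 3, "min_price_eur": 28.00, "unit": "bed"},
    {"name": "Chambre single RDC", "available_units": 1, "min_price_eur": 55.00, "unit": "room"},
]


def cheapest(rooms: list, unit: str) -> Optional[dict]:
    """Return the lowest-priced entry of the given unit type, or None."""
    candidates = [r for r in rooms if r["unit"] == unit]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["min_price_eur"])


def total_available(rooms: list, unit: str) -> int:
    """Sum available units of one type; unknown (None) counts as 0."""
    return sum(r["available_units"] or 0 for r in rooms if r["unit"] == unit)


if __name__ == "__main__":
    for label, rooms in (("HO36", HO36_ROOMS), ("Flaneur", FLANEUR_ROOMS)):
        bed = cheapest(rooms, "bed")
        print(label, bed["name"], bed["min_price_eur"], bed["available_units"])
```

Treating a missing `available_units` as zero keeps the totals conservative; a stricter variant could skip or flag those rows instead.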

## Product/amenities - what is different (observable)

| Area | HO36 | Le Flaneur | Why it matters |
|---|---|---|---|
| Kitchen | Microwave only | Stove + utensils + fridge + self-catering | Kitchen is a strong value driver for budget travelers; it can be a conversion lever if paired with cleanliness trust. |
| Bar/cafe | Yes | Yes | Both compete on "social + bar/cafe"; differentiation needs to be sharper (events, atmosphere, review narrative). |
| Coworking / meeting | Not clearly listed in Hostelworld facilities | Listed (meeting rooms, coworking space) | If Flaneur targets remote workers, the offer must be visible on OTAs and reflected in reviews/photos. |
| Accessibility | Not listed in captured Hostelworld facilities | Wheelchair friendly + accessible bathrooms | Can widen addressable audience; should be highlighted consistently across channels. |

## Review themes (last 12 months - Hostelworld)

Source: `verify/results/hostelworld_review_themes.md`

### HO36 (12m)

- Reviews (12m): 16
- Mean score: 84.4/100; median: 89/100
- Positive/Neutral/Negative: 12 / 3 / 1
- Biggest positives: cleanliness, staff; consistent "feels safe" despite some neighborhood concern mentions.
- Main negative (low frequency): occasional check-in / keycard / process issues.

### Le Flaneur (12m)

- Reviews (12m): 51
- Mean score: 78.3/100; median: 83/100
- Positive/Neutral/Negative: 25 / 15 / 11
- Recurring pain points (seen in multiple reviews, not single outliers):
  - Cleanliness issues (especially bathrooms, odor)
  - Safety/neighborhood discomfort framing (more frequent than HO36 in negative reviews)
  - Reception availability / staff process issues (distinct from "staff friendliness", which is often praised)
- Recurring positives:
  - Staff friendliness
  - Kitchen/self-catering
  - Value for money

## Ranked hypotheses (evidence-backed)

1) Cleanliness + bathroom trust gap reduces conversion (HIGH).
   - Evidence: Flaneur has repeated negative cleanliness/bathroom mentions and lower cleanliness/facilities averages (7.4 and 7.3 out of 10); HO36 has far fewer negatives and higher averages (9.1 and 8.9).

2) Booking.com visibility gap (MED-HIGH).
   - Evidence: Similar rating, but HO36 has ~2.7x the review volume on Booking (1,356 vs 502). Review volume generally correlates with ranking and click-through.

3) "Safety in the neighborhood" narrative is hurting Flaneur more than HO36 (MED).
   - Evidence: Safety/neighborhood appears as a repeated negative theme for Flaneur in the last 12 months (5 mentions in negative reviews); for HO36 it appears more as a neutral/positive reassurance theme (7 mentions in positive reviews).

4) Sleep comfort and dorm UX issues compound the cleanliness narrative (MED).
   - Evidence: Flaneur has recurring mentions of sleep/noise and dorm comfort in neutral/negative themes (sleep_noise: 5 mentions in neutral reviews); this matters disproportionately around peak periods when guests compare options quickly.

5) Flaneur's differentiators (kitchen, coworking, "tiers lieu") are not driving enough demand because they are not turning into social proof (MED).
   - Evidence: Kitchen is praised but does not dominate the overall review narrative; Booking review volume is relatively low for the category.
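
The ratios quoted in hypotheses 1 and 2 are simple arithmetic on counts reported elsewhere in this report. A short sketch makes the calculation explicit; the helper names `volume_ratio` and `share` are hypothetical, and the inputs are the Booking.com review counts and the Hostelworld 12-month negative/total review counts.

```python
def volume_ratio(a: int, b: int) -> float:
    """How many times larger a is than b."""
    if b <= 0:
        raise ValueError("b must be positive")
    return a / b


def share(part: int, total: int) -> float:
    """Fraction of total represented by part."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


if __name__ == "__main__":
    # Booking.com review counts: HO36 1,356 vs Flaneur 502.
    print(f"Booking review volume ratio: {volume_ratio(1356, 502):.1f}x")
    # Hostelworld negative reviews in the 12-month window.
    print(f"Flaneur negative share: {share(11, 51):.1%}")
    print(f"HO36 negative share: {share(1, 16):.1%}")
```

With only 16 HO36 reviews in the window, the negative share for HO36 moves a lot with each additional review, so it is best read alongside the absolute counts.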

## Action plan for Flaneur (10 concrete, fast experiments)

1) Bathrooms: 14-day deep clean + odor elimination sprint; publish proof (photos, short reels) and push it to OTAs as new images.
2) Housekeeping QA: introduce a visible checklist and nightly spot checks; track defects; respond to every cleanliness review with a specific fix.
3) Sleep product upgrade: add curtains where feasible, tighten bunks, improve lighting; then message it as "better sleep" on Hostelworld/Booking.
4) Reception reliability: ensure real 24/7 coverage or clearly communicate the actual hours and self-check-in; reduce "process" complaints.
5) Safety perception: improve lighting/signage at entrance; add clear guest guidance for late arrivals; emphasize lockers/security features on listings.
6) Reposition the kitchen: run 2-3 weekly communal cooking nights (cheap, high-UGC) and push as a reason to stay during winter.
7) OTA listing optimization: refresh top-fold photos (bathrooms, beds, common areas, kitchen), and align amenity lists across all channels.
8) Review ops: QR code at checkout; staff asks happy guests for reviews; target +30 Booking reviews and +50 Hostelworld reviews in 60 days.
9) Pricing tests: keep dorm prices, but bundle value for privates (late checkout, breakfast voucher, bar credit) to drive higher ADR.
10) Local partnerships for NYE-like periods: bar/event partners + "stay and go out" packages; create a dedicated landing page and social cadence.

## Evidence index (quick links)

- HO36 evidence: `data/ho36/evidence.json`, `data/ho36/evidence.csv`, `data/ho36/profile.md`
- Flaneur evidence: `data/flaneur/evidence.json`, `data/flaneur/evidence.csv`, `data/flaneur/profile.md`
- Pricing snapshot JSON: `verify/results/pricing_window__20260103_20260104__20260103.json`
- Review theme summary: `verify/results/hostelworld_review_themes.md`
- Hostelworld facilities JSON:
  - `verify/results/hostelworld_facilities_ho36_270217.json`
  - `verify/results/hostelworld_facilities_flaneur_100844.json`
2
verify/.gitignore
vendored
|
|
@ -1,4 +1,4 @@
|
|||
__pycache__/
|
||||
*.pyc
|
||||
.venv/
|
||||
|
||||
results/hostelworld_reviews_*.json
|
||||
|
|
|
|||
130
verify/results/hostelworld_facilities_flaneur_100844.json
Normal file
|
|
@ -0,0 +1,130 @@
|
|||
{
|
||||
"captured_at": "2026-01-03T01:55:47",
|
||||
"property_id": 100844,
|
||||
"url": "https://www.hostelworld.com/hostels/p/100844/le-flaneur-guesthouse/",
|
||||
"facilities": [
|
||||
{
|
||||
"category": "Free",
|
||||
"items": [
|
||||
"Free City Maps",
|
||||
"Free WiFi",
|
||||
"Free Internet Access",
|
||||
"Free Security Lockers",
|
||||
"Free Luggage Storage"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "General",
|
||||
"items": [
|
||||
"Bicycle Parking",
|
||||
"Meeting Rooms",
|
||||
"Adaptors",
|
||||
"Coworking Space"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Services",
|
||||
"items": [
|
||||
"Luggage Storage",
|
||||
"24 Hour Reception",
|
||||
"Internet café"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Entertainment",
|
||||
"items": [
|
||||
"Internet access",
|
||||
"Board Games",
|
||||
"Wi-Fi"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Food & Drinks",
|
||||
"items": [
|
||||
"Bar",
|
||||
"Mini-Supermarket",
|
||||
"Cafe",
|
||||
"Tea & Coffee Making Facilities",
|
||||
"Filtered Water Dispenser",
|
||||
"Mini-bar"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Kitchen",
|
||||
"items": [
|
||||
"Self-Catering Facilities",
|
||||
"Fridge/Freezer",
|
||||
"Utensils",
|
||||
"Microwave",
|
||||
"Pots and Pans",
|
||||
"Sink",
|
||||
"Stove"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Accessibility",
|
||||
"items": [
|
||||
"Wheelchair Friendly",
|
||||
"Wheelchair-Accessible Bathrooms"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Laundry",
|
||||
"items": [
|
||||
"Laundry Facilities",
|
||||
"Dryer",
|
||||
"Iron / Ironing Board",
|
||||
"Washing Machine",
|
||||
"Free Iron / Ironing Board",
|
||||
"Recycling Bins"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Wellness",
|
||||
"items": [
|
||||
"Hot Showers"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Outdoors",
|
||||
"items": [
|
||||
"Outdoor Terrace"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Safety & Security",
|
||||
"items": [
|
||||
"Security Lockers",
|
||||
"24 Hour Security",
|
||||
"Safe Deposit Box",
|
||||
"Fire Extinguishers",
|
||||
"Smoke Alarms"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Social Areas",
|
||||
"items": [
|
||||
"Common Room",
|
||||
"Lounge Room"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Bedroom",
|
||||
"items": [
|
||||
"Linen Included",
|
||||
"Air Conditioning",
|
||||
"Reading Light",
|
||||
"Hair Dryers",
|
||||
"Ceiling fan"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Available for a Fee",
|
||||
"items": [
|
||||
"Towels for hire",
|
||||
"Breakfast Not Included",
|
||||
"Hair Dryers For Hire"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
100
verify/results/hostelworld_facilities_ho36_270217.json
Normal file
|
|
@ -0,0 +1,100 @@
|
|||
{
|
||||
"captured_at": "2026-01-03T01:55:38",
|
||||
"property_id": 270217,
|
||||
"url": "https://www.hostelworld.com/hostels/p/270217/ho36-hostel/",
|
||||
"facilities": [
|
||||
{
|
||||
"category": "Free",
|
||||
"items": [
|
||||
"Free WiFi",
|
||||
"Free Internet Access",
|
||||
"Free Luggage Storage"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Services",
|
||||
"items": [
|
||||
"Luggage Storage",
|
||||
"24 Hour Reception",
|
||||
"Housekeeping"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Entertainment",
|
||||
"items": [
|
||||
"Internet access",
|
||||
"Book Exchange",
|
||||
"Board Games",
|
||||
"Wi-Fi"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Food & Drinks",
|
||||
"items": [
|
||||
"Bar",
|
||||
"Cafe",
|
||||
"Tea & Coffee Making Facilities"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Kitchen",
|
||||
"items": [
|
||||
"Microwave"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Laundry",
|
||||
"items": [
|
||||
"Iron / Ironing Board",
|
||||
"Free Iron / Ironing Board"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Outdoors",
|
||||
"items": [
|
||||
"Outdoor Terrace"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Safety & Security",
|
||||
"items": [
|
||||
"Key Card Access",
|
||||
"24 Hour Security",
|
||||
"First Aid Kits",
|
||||
"Smoke Alarms",
|
||||
"Smoke Detector"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Social Areas",
|
||||
"items": [
|
||||
"Common Room",
|
||||
"Games Room"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Bedroom",
|
||||
"items": [
|
||||
"Linen Included",
|
||||
"Reading Light",
|
||||
"Towels Not Included",
|
||||
"Charging Plugs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Hostelworld Policies",
|
||||
"items": [
|
||||
"Flexible NRR"
|
||||
]
|
||||
},
|
||||
{
|
||||
"category": "Available for a Fee",
|
||||
"items": [
|
||||
"Towels for hire",
|
||||
"Breakfast Not Included",
|
||||
"Hair Dryers For Hire",
|
||||
"Paid Security lockers"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
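The two facilities files above share one shape: a top-level `facilities` list of `{category, items}` groups. A minimal sketch of how the "Kitchen" groups could be compared is shown below. The helper name `items_in` is hypothetical, and the inline JSON strings are trimmed copies of the captured files (only the `Kitchen` group is kept), not the full documents.

```python
import json


def items_in(doc: dict, category: str) -> list:
    """Return the items listed under one facilities category (empty if absent)."""
    for group in doc.get("facilities", []):
        if group.get("category") == category:
            return list(group.get("items", []))
    return []


# Trimmed copies of the captured facilities files (Kitchen group only).
HO36 = json.loads("""
{"property_id": 270217,
 "facilities": [{"category": "Kitchen", "items": ["Microwave"]}]}
""")
FLANEUR = json.loads("""
{"property_id": 100844,
 "facilities": [{"category": "Kitchen",
                 "items": ["Self-Catering Facilities", "Fridge/Freezer",
                           "Utensils", "Microwave", "Pots and Pans",
                           "Sink", "Stove"]}]}
""")

ho36_kitchen = set(items_in(HO36, "Kitchen"))
flaneur_kitchen = set(items_in(FLANEUR, "Kitchen"))

# Items Flaneur lists that HO36 does not, and the reverse.
only_flaneur = sorted(flaneur_kitchen - ho36_kitchen)
only_ho36 = sorted(ho36_kitchen - flaneur_kitchen)

if __name__ == "__main__":
    print("Only Flaneur:", only_flaneur)
    print("Only HO36:", only_ho36)
```

The same helper works for any other category name in these files, for example `"Accessibility"`, which is present for Flaneur and absent for HO36.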
|
||||
89
verify/results/hostelworld_review_themes.md
Normal file
|
|
@ -0,0 +1,89 @@
# Hostelworld review themes (last 12 months)

- Generated: 2026-01-03T01:54:19
- Window: last 365 days
- Negative threshold: <= 65/100
- Positive threshold: >= 85/100
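
The thresholds above imply a three-way split of review scores: negative at or below 65, positive at or above 85, and neutral in between. A minimal sketch of that bucketing is shown below. It is not the generator of this file; the function names `bucket` and `summarize` are hypothetical, and the scores in the example are made up for illustration.

```python
from collections import Counter
from statistics import mean, median

NEGATIVE_MAX = 65  # scores <= 65/100 count as negative
POSITIVE_MIN = 85  # scores >= 85/100 count as positive


def bucket(score: float) -> str:
    """Classify one overall review score (0-100) into a sentiment bucket."""
    if score <= NEGATIVE_MAX:
        return "negative"
    if score >= POSITIVE_MIN:
        return "positive"
    return "neutral"


def summarize(scores: list) -> dict:
    """Count, mean, median and bucket totals for a list of scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = Counter(bucket(s) for s in scores)
    return {
        "reviews": len(scores),
        "mean": round(mean(scores), 1),
        "median": float(median(scores)),
        "positive": counts["positive"],
        "neutral": counts["neutral"],
        "negative": counts["negative"],
    }


if __name__ == "__main__":
    # Hypothetical scores, not taken from the captured reviews.
    print(summarize([90, 70, 60]))
```

Both thresholds are inclusive, so a score of exactly 65 is negative and exactly 85 is positive.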

## HO36

| Metric | Value |
|---|---|
| Window since | 2025-01-03 |
| Reviews (12m) | 16 |
| Overall mean (/100) | 84.4 |
| Overall median (/100) | 89.0 |
| Positive / Neutral / Negative | 12 / 3 / 1 |

| Subscore (avg/10) | Score |
|---|---|
| cleanliness | 9.1 |
| facilities | 8.9 |
| staff | 9.4 |
| atmosphere | 8.2 |
| safety | 8.1 |
| location | 6.4 |
| value | 9.0 |

### Themes (mentions, min 5)

**positive**
- safety_neighborhood: 7
- cleanliness: 7

### Top keywords (sanity check)

**negative**
when (3), checked (2), ask (2), out (2), luggage (2), staff (1), didn (1), give (1), keycard (1), nor (1), towel (1), had (1), back (1), reception (1), area (1)

**positive**
hostel (16), nice (13), great (8), area (7), good (7), rooms (5), there (5), staff (5), clean (4), super (4), central (4), location (4), comfortable (4), walking (4), stay (4)

## Flaneur

| Metric | Value |
|---|---|
| Window since | 2025-01-03 |
| Reviews (12m) | 51 |
| Overall mean (/100) | 78.3 |
| Overall median (/100) | 83.0 |
| Positive / Neutral / Negative | 25 / 15 / 11 |

| Subscore (avg/10) | Score |
|---|---|
| cleanliness | 7.4 |
| facilities | 7.3 |
| staff | 9.1 |
| atmosphere | 7.8 |
| safety | 8.0 |
| location | 7.3 |
| value | 8.0 |

### Themes (mentions, min 5)

**negative**
- cleanliness: 7
- safety_neighborhood: 5
- staff_reception: 5

**neutral**
- cleanliness: 12
- kitchen_food: 8
- staff_reception: 7
- sleep_noise: 5

**positive**
- staff_reception: 11
- cleanliness: 9
- kitchen_food: 8
- safety_neighborhood: 6
- sleep_noise: 5

### Top keywords (sanity check)

**negative**
hostel (8), people (6), had (6), staff (6), room (6), bathrooms (5), there (4), like (4), nice (4), place (3), felt (3), beds (3), location (3), stay (3), super (3)

**positive**
good (15), staff (12), hostel (12), great (10), clean (9), stay (9), nice (9), room (7), common (7), there (7), place (7), from (7), which (7), kitchen (6), would (6)
127
verify/results/pricing_window__20260103_20260104__20260103.json
Normal file
|
|
@ -0,0 +1,127 @@
|
|||
{
  "captured_at": "2026-01-03T01:21:23+00:00",
  "window": {
    "checkin": "2026-01-03",
    "checkout": "2026-01-04"
  },
  "flaneur": {
    "url": "https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/",
    "final_url": "https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/",
    "captured_at": "2026-01-03T01:21:09+00:00",
    "checkin": "2026-01-03",
    "checkout": "2026-01-04",
    "screenshot_path": "/root/flaneur-analysis/data/flaneur/screenshots/flaneur__roomraccoon__pricing__20260103_20260104__20260103.png",
    "html_path": "/root/flaneur-analysis/data/flaneur/raw/flaneur__roomraccoon__pricing__20260103_20260104__20260103.html",
    "rooms": [
      {
        "name": "Dortoir mixte 16 lits",
        "available_units": 10,
        "min_price_eur": 22.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Dortoir mixte 10 lits - 1",
        "available_units": 8,
        "min_price_eur": 25.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Dortoir mixte 10 lits - 2",
        "available_units": 9,
        "min_price_eur": 25.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Dortoir féminin 12 lits",
        "available_units": 11,
        "min_price_eur": 25.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Dortoir mixte 6 lits - 1",
        "available_units": 5,
        "min_price_eur": 28.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Dortoir mixte 4 lits D4 /D8",
        "available_units": 7,
        "min_price_eur": 30.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Dortoir mixte 4 lits D7",
        "available_units": 3,
        "min_price_eur": 30.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "Chambre Privée 4 personnes",
        "available_units": null,
        "min_price_eur": 50.88,
        "unit": "room",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      },
      {
        "name": "D9 - Dortoir Féminin 4 lits",
        "available_units": 3,
        "min_price_eur": 30.88,
        "unit": "bed",
        "refund_policy_hint": "Remboursable jusqu'à 15h la veille"
      }
    ]
  },
  "ho36": {
    "url": "https://ho36lyon.com/",
    "final_url": "https://ho36lyon.com/",
    "captured_at": "2026-01-03T01:21:23+00:00",
    "checkin": "2026-01-03",
    "checkout": "2026-01-04",
    "screenshot_path": "/root/flaneur-analysis/data/ho36/screenshots/ho36__mews__pricing__20260103_20260104__20260103.png",
    "html_path": "/root/flaneur-analysis/data/ho36/raw/ho36__mews__pricing__20260103_20260104__20260103.txt",
    "rooms": [
      {
        "name": "Chambre Ho my God",
        "available_units": 4,
        "min_price_eur": 75.0,
        "unit": "par chambre/par nuit",
        "refund_policy_hint": "Non remboursable"
      },
      {
        "name": "Chambre familiale",
        "available_units": 1,
        "min_price_eur": 180.0,
        "unit": "par chambre/par nuit",
        "refund_policy_hint": "Non remboursable"
      },
      {
        "name": "Chambre single RDC",
        "available_units": 1,
        "min_price_eur": 55.0,
        "unit": "par chambre/par nuit",
        "refund_policy_hint": "Non remboursable"
      },
      {
        "name": "Dortoir privatif pour 2 personnes",
        "available_units": 1,
        "min_price_eur": 76.0,
        "unit": "par dortoir/par nuit",
        "refund_policy_hint": "Non remboursable"
      },
      {
        "name": "Lit en dortoir mixte",
        "available_units": 3,
        "min_price_eur": 28.0,
        "unit": "par lit/par nuit",
        "refund_policy_hint": "Non remboursable"
      }
    ]
  }
}
||||
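One way to read the result file above is to compare the cheapest per-bed offer at each property. The sketch below inlines four offers copied from the JSON rather than loading the file, and assumes a per-bed offer is one whose `unit` is `"bed"` (Flâneur) or starts with `"par lit"` (HO36); the helper names `is_bed_offer` and `cheapest_bed` are hypothetical.

```python
# Four offers copied from pricing_window__20260103_20260104__20260103.json.
SAMPLE = {
    "flaneur": [
        {"name": "Dortoir mixte 16 lits", "min_price_eur": 22.88, "unit": "bed"},
        {"name": "Chambre Privée 4 personnes", "min_price_eur": 50.88, "unit": "room"},
    ],
    "ho36": [
        {"name": "Chambre single RDC", "min_price_eur": 55.0, "unit": "par chambre/par nuit"},
        {"name": "Lit en dortoir mixte", "min_price_eur": 28.0, "unit": "par lit/par nuit"},
    ],
}


def is_bed_offer(room):
    """Assumption: the two booking engines label per-bed offers differently."""
    unit = room.get("unit") or ""
    return unit == "bed" or unit.startswith("par lit")


def cheapest_bed(rooms):
    """Return the lowest per-bed price, or None when no bed offer has a price."""
    prices = [
        r["min_price_eur"]
        for r in rooms
        if is_bed_offer(r) and r.get("min_price_eur") is not None
    ]
    return min(prices) if prices else None


flaneur_bed = cheapest_bed(SAMPLE["flaneur"])
ho36_bed = cheapest_bed(SAMPLE["ho36"])
print(f"Flaneur {flaneur_bed:.2f} EUR vs HO36 {ho36_bed:.2f} EUR")
```

The comparison is only indicative: the captured refund terms differ (refundable at Flâneur, non-refundable at HO36), so the two prices are not for identical conditions.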
315
verify/tools/analyze_hostelworld_reviews.py
Normal file
@@ -0,0 +1,315 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import re
from collections import Counter
from datetime import date, datetime, timedelta
from pathlib import Path
from statistics import mean, median
from typing import Any


STOPWORDS = {
    "the",
    "and",
    "a",
    "an",
    "to",
    "of",
    "in",
    "for",
    "on",
    "at",
    "with",
    "is",
    "it",
    "was",
    "were",
    "are",
    "be",
    "been",
    "i",
    "we",
    "you",
    "they",
    "this",
    "that",
    "as",
    "but",
    "so",
    "if",
    "not",
    "very",
    "really",
    "just",
    "my",
    "our",
    "their",
    "your",
    "me",
    "us",
    "them",
}


THEME_RULES: dict[str, list[re.Pattern[str]]] = {
    "cleanliness": [
        re.compile(r"\bclean\b", re.I),
        re.compile(r"\bdirty\b", re.I),
        re.compile(r"\bsmell\b", re.I),
        re.compile(r"\bstink\b", re.I),
        re.compile(r"\breek\b", re.I),
        re.compile(r"\bmold\b", re.I),
        re.compile(r"\bbath(room|rooms)?\b", re.I),
        re.compile(r"\btoilet(s)?\b", re.I),
        re.compile(r"\bshower(s)?\b", re.I),
    ],
    "staff_reception": [
        re.compile(r"\bstaff\b", re.I),
        re.compile(r"\breception\b", re.I),
        re.compile(r"\bfront\s*desk\b", re.I),
        re.compile(r"\bhelpful\b", re.I),
        re.compile(r"\brude\b", re.I),
        re.compile(r"\bfriendl(y|iness)\b", re.I),
        re.compile(r"\b24h\b", re.I),
        re.compile(r"\b24\s*hour\b", re.I),
    ],
    "safety_neighborhood": [
        re.compile(r"\bsafe\b", re.I),
        re.compile(r"\bunsafe\b", re.I),
        re.compile(r"\bdanger(ous|)\b", re.I),
        re.compile(r"\bdrug(s|)\b", re.I),
        re.compile(r"\bdealer(s)?\b", re.I),
        re.compile(r"\bsketchy\b", re.I),
        re.compile(r"\bafter\s+dark\b", re.I),
        re.compile(r"\bnight\b", re.I),
        re.compile(r"\bhomeless\b", re.I),
    ],
    "sleep_noise": [
        re.compile(r"\bnois(e|y)\b", re.I),
        re.compile(r"\bloud\b", re.I),
        re.compile(r"\bsleep\b", re.I),
        re.compile(r"\bbunk\b", re.I),
        re.compile(r"\brattl(e|ing)\b", re.I),
        re.compile(r"\bcurtain(s)?\b", re.I),
        re.compile(r"\bprivacy\b", re.I),
    ],
    "kitchen_food": [
        re.compile(r"\bkitchen\b", re.I),
        re.compile(r"\bbreakfast\b", re.I),
        re.compile(r"\bfood\b", re.I),
        re.compile(r"\bbar\b", re.I),
        re.compile(r"\bcafe\b", re.I),
        re.compile(r"\bcoffee\b", re.I),
        re.compile(r"\bdrink(s)?\b", re.I),
    ],
    "value_price": [
        re.compile(r"\bvalue\b", re.I),
        re.compile(r"\bprice\b", re.I),
        re.compile(r"\bexpensive\b", re.I),
        re.compile(r"\bcheap\b", re.I),
        re.compile(r"\bworth\b", re.I),
    ],
}


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Summarize Hostelworld reviews JSON into theme counts.")
    parser.add_argument("--in", dest="inputs", action="append", required=True, help="Input JSON file")
    parser.add_argument("--out", required=True, help="Output Markdown path")
    parser.add_argument("--label", action="append", default=[], help="Optional label per --in (same order)")
    parser.add_argument(
        "--days",
        type=int,
        default=365,
        help="Only include reviews within the last N days (based on the review 'date' field)",
    )
    parser.add_argument("--low-threshold", type=float, default=65.0, help="Overall score <= this is 'negative' (0-100)")
    parser.add_argument("--high-threshold", type=float, default=85.0, help="Overall score >= this is 'positive' (0-100)")
    parser.add_argument("--min-theme-count", type=int, default=5, help="Only show themes with at least this many mentions")
    return parser.parse_args()


def get_overall(review: dict[str, Any]) -> float | None:
    rating = review.get("rating") or {}
    overall = rating.get("overall")
    return float(overall) if isinstance(overall, (int, float)) else None


def get_text(review: dict[str, Any]) -> str:
    return str(review.get("notes") or "")


def bucket(overall: float | None, *, low: float, high: float) -> str:
    if overall is None:
        return "unknown"
    if overall <= low:
        return "negative"
    if overall >= high:
        return "positive"
    return "neutral"


def tokenize(text: str) -> list[str]:
    words = re.findall(r"[a-zA-Z]{3,}", text.lower())
    return [w for w in words if w not in STOPWORDS]


def detect_themes(text: str) -> set[str]:
    hits: set[str] = set()
    for theme, patterns in THEME_RULES.items():
        if any(p.search(text) for p in patterns):
            hits.add(theme)
    return hits


def fmt10(score100: float | None) -> str:
    if score100 is None:
        return "n/a"
    return f"{score100 / 10:.1f}"


def summarize(payload: dict[str, Any], *, low: float, high: float) -> dict[str, Any]:
    reviews_all = payload.get("reviews") or []
    fetched_at = payload.get("fetched_at")
    ref_dt = None
    try:
        ref_dt = datetime.fromisoformat(fetched_at) if isinstance(fetched_at, str) else None
    except ValueError:
        ref_dt = None
    if ref_dt is None:
        ref_dt = datetime.now()

    days = int(payload.get("_analysis_days", 365))
    since_date = (ref_dt.date() - timedelta(days=days))

    reviews: list[dict[str, Any]] = []
    for r in reviews_all:
        d = r.get("date")
        if not isinstance(d, str) or len(d) < 10:
            continue
        try:
            rd = date.fromisoformat(d[:10])
        except ValueError:
            continue
        if rd >= since_date:
            reviews.append(r)
    overall_scores = [get_overall(r) for r in reviews]
    overall_clean = [s for s in overall_scores if s is not None]

    bucket_counts = Counter(bucket(get_overall(r), low=low, high=high) for r in reviews)

    sub_keys = ["safety", "location", "staff", "atmosphere", "cleanliness", "facilities", "value"]
    subs: dict[str, list[float]] = {k: [] for k in sub_keys}
    for r in reviews:
        rating = r.get("rating") or {}
        for k in sub_keys:
            v = rating.get(k)
            if isinstance(v, (int, float)):
                subs[k].append(float(v))
    sub_avgs = {k: (mean(v) if v else None) for k, v in subs.items()}

    theme_counts: dict[str, Counter[str]] = {b: Counter() for b in ["positive", "neutral", "negative"]}
    keyword_counts: dict[str, Counter[str]] = {b: Counter() for b in ["positive", "neutral", "negative"]}

    for r in reviews:
        text = get_text(r)
        b = bucket(get_overall(r), low=low, high=high)
        if b not in theme_counts:
            continue
        for t in detect_themes(text):
            theme_counts[b][t] += 1
        keyword_counts[b].update(tokenize(text))

    return {
        "property_id": payload.get("property_id"),
        "month_count": payload.get("month_count"),
        "total_reviews": len(reviews),
        "since_date": since_date.isoformat(),
        "bucket_counts": dict(bucket_counts),
        "overall_mean": mean(overall_clean) if overall_clean else None,
        "overall_median": median(overall_clean) if overall_clean else None,
        "subscore_avgs": sub_avgs,
        "theme_counts": {k: dict(v) for k, v in theme_counts.items()},
        "top_keywords": {k: keyword_counts[k].most_common(25) for k in keyword_counts},
        "review_statistics": payload.get("review_statistics") or {},
    }


def main() -> int:
    args = parse_args()
    labels = list(args.label or [])
    while len(labels) < len(args.inputs):
        labels.append(Path(args.inputs[len(labels)]).stem)

    summaries = []
    for path, label in zip(args.inputs, labels, strict=True):
        payload = json.loads(Path(path).read_text(encoding="utf-8"))
        payload["_analysis_days"] = args.days
        s = summarize(payload, low=args.low_threshold, high=args.high_threshold)
        s["label"] = label
        summaries.append(s)

    out_path = Path(args.out)
    out_path.parent.mkdir(parents=True, exist_ok=True)

    lines: list[str] = []
    lines.append("# Hostelworld review themes (last 12 months)")
    lines.append("")
    lines.append(f"- Generated: {datetime.now().isoformat(timespec='seconds')}")
    lines.append(f"- Window: last {args.days} days")
    lines.append(f"- Negative threshold: <= {args.low_threshold:.0f}/100")
    lines.append(f"- Positive threshold: >= {args.high_threshold:.0f}/100")
    lines.append("")

    for s in summaries:
        lines.append(f"## {s['label']}")
        lines.append("")
        lines.append("| Metric | Value |")
        lines.append("|---|---|")
        lines.append(f"| Window since | {s['since_date']} |")
        lines.append(f"| Reviews (12m) | {s['total_reviews']} |")
        lines.append(f"| Overall mean (/100) | {s['overall_mean']:.1f} |" if s["overall_mean"] is not None else "| Overall mean (/100) | n/a |")
        lines.append(f"| Overall median (/100) | {s['overall_median']:.1f} |" if s["overall_median"] is not None else "| Overall median (/100) | n/a |")
        bc = s["bucket_counts"]
        lines.append(f"| Positive / Neutral / Negative | {bc.get('positive',0)} / {bc.get('neutral',0)} / {bc.get('negative',0)} |")
        lines.append("")

        sub = s["subscore_avgs"]
        lines.append("| Subscore (avg/10) | Score |")
        lines.append("|---|---|")
        for k in ["cleanliness", "facilities", "staff", "atmosphere", "safety", "location", "value"]:
            lines.append(f"| {k} | {fmt10(sub.get(k))} |")
        lines.append("")

        lines.append(f"### Themes (mentions, min {args.min_theme_count})")
        lines.append("")
        for bucket_name in ["negative", "neutral", "positive"]:
            counts = Counter(s["theme_counts"].get(bucket_name, {}))
            counts = Counter({k: v for k, v in counts.items() if v >= args.min_theme_count})
            if not counts:
                continue
            lines.append(f"**{bucket_name}**")
            for theme, cnt in counts.most_common():
                lines.append(f"- {theme}: {cnt}")
            lines.append("")

        lines.append("### Top keywords (sanity check)")
        lines.append("")
        for bucket_name in ["negative", "positive"]:
            kws = s["top_keywords"].get(bucket_name, [])
            if not kws:
                continue
            lines.append(f"**{bucket_name}**")
            lines.append(", ".join([f"{w} ({c})" for w, c in kws[:15]]))
            lines.append("")

    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    print(out_path)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
||||
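The theme rules in `analyze_hostelworld_reviews.py` are whole-word regular expressions, which affects what the theme counts can and cannot pick up. The sketch below reuses two patterns from `THEME_RULES` verbatim; the sample sentences and the `matches` helper are made up for illustration.

```python
import re

# Two patterns copied from THEME_RULES in the script.
CLEAN = re.compile(r"\bclean\b", re.I)  # cleanliness
NIGHT = re.compile(r"\bnight\b", re.I)  # safety_neighborhood


def matches(pattern, text):
    """True when the pattern occurs anywhere in the text."""
    return pattern.search(text) is not None


# Invented review snippets, not taken from the fetched data.
samples = [
    (CLEAN, "Very clean rooms"),
    (CLEAN, "They cleaned daily"),
    (CLEAN, "Cleanliness was poor"),
    (NIGHT, "Booked one night, great value"),
]

for pattern, text in samples:
    print(f"{pattern.pattern!r:14} {text!r:34} -> {matches(pattern, text)}")
```

Because of the `\b` word boundaries, inflected forms such as "cleaned" or "cleanliness" are not counted, while a neutral mention of "night" is counted under `safety_neighborhood`. The theme counts are therefore best read as rough indicators, which fits the report labelling its keyword list a sanity check.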
313
verify/tools/capture_pricing_window.py
Normal file
@@ -0,0 +1,313 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import re
from dataclasses import asdict, dataclass
from datetime import date, datetime, timezone
from pathlib import Path
from typing import Any

from playwright.sync_api import Error as PlaywrightError
from playwright.sync_api import sync_playwright


FLANEUR_URL = "https://booking.roomraccoon.fr/le-fl-neur-guesthouse-8346/fr/"
HO36_URL = "https://ho36lyon.com/"


def iso_now() -> str:
    return datetime.now(timezone.utc).astimezone().isoformat(timespec="seconds")


def parse_iso_date(value: str) -> date:
    try:
        return date.fromisoformat(value)
    except ValueError as exc:
        msg = f"Invalid ISO date: {value!r} (expected YYYY-MM-DD)"
        raise SystemExit(msg) from exc


def fmt_dd_mm_yyyy(d: date, sep: str = "-") -> str:
    return f"{d.day:02d}{sep}{d.month:02d}{sep}{d.year:04d}"


def fmt_mm_dd_yyyy(d: date) -> str:
    return f"{d.month:02d}/{d.day:02d}/{d.year:04d}"


def ensure_parent(path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)


def parse_eur_amount(text: str) -> float | None:
    cleaned = text.replace("\xa0", " ").strip()
    m = re.search(r"([0-9]+(?:[.,][0-9]{1,2})?)", cleaned)
    if not m:
        return None
    return float(m.group(1).replace(",", "."))


def parse_int(text: str) -> int | None:
    m = re.search(r"(\d+)", text)
    return int(m.group(1)) if m else None


@dataclass(frozen=True)
class RoomOffer:
    name: str
    available_units: int | None
    min_price_eur: float | None
    unit: str | None
    refund_policy_hint: str | None


@dataclass(frozen=True)
class Capture:
    url: str
    final_url: str | None
    captured_at: str
    checkin: str
    checkout: str
    screenshot_path: str
    html_path: str
    rooms: list[RoomOffer]


def capture_flaneur(*, checkin: date, checkout: date, page, screenshot_path: Path, html_path: Path) -> Capture:
    page.goto(FLANEUR_URL, wait_until="domcontentloaded", timeout=60_000)
    page.wait_for_timeout(1_500)

    expected_start = fmt_dd_mm_yyyy(checkin, sep="-")
    expected_end = fmt_dd_mm_yyyy(checkout, sep="-")

    # RoomRaccoon uses readonly inputs with overlay divs; if defaults differ, we still capture
    # but record what was actually present.
    actual_start = page.input_value("#reservationStart")
    actual_end = page.input_value("#reservationEnd")

    if actual_start != expected_start or actual_end != expected_end:
        # Best-effort adjust using DOM injection + event dispatch (may be ignored by the app).
        page.evaluate(
            """([start, end]) => {
              const s = document.querySelector('#reservationStart');
              const e = document.querySelector('#reservationEnd');
              if (s) { s.value = start; s.dispatchEvent(new Event('change', { bubbles: true })); }
              if (e) { e.value = end; e.dispatchEvent(new Event('change', { bubbles: true })); }
            }""",
            [expected_start, expected_end],
        )
        page.wait_for_timeout(300)
        actual_start = page.input_value("#reservationStart")
        actual_end = page.input_value("#reservationEnd")

    page.locator('div:has-text("V\u00c9RIFIER LA DISPONIBILIT\u00c9")').first.click()
    page.wait_for_selector(".be-room", timeout=45_000)
    page.wait_for_timeout(800)

    ensure_parent(screenshot_path)
    ensure_parent(html_path)
    page.screenshot(path=str(screenshot_path), full_page=True)
    html_path.write_text(page.content(), encoding="utf-8")

    rooms: list[RoomOffer] = []
    cards = page.locator(".be-room")
    for i in range(cards.count()):
        card = cards.nth(i)
        name = (card.locator("h2,h3").first.inner_text().strip() if card.locator("h2,h3").count() else "").strip()
        if not name:
            continue
        avail_text = card.locator(".be-room-availability").first.inner_text().strip() if card.locator(".be-room-availability").count() else ""
        available_units = parse_int(avail_text)
        price_texts = [t.strip() for t in card.locator(".be-room-ratetype-price").all_text_contents() if t.strip()]
        prices = [p for p in (parse_eur_amount(t) for t in price_texts) if p is not None]
        min_price = min(prices) if prices else None
        card_text = card.inner_text().strip()
        refund_hint = None
        for line in card_text.splitlines():
            line = line.strip()
            if not line:
                continue
            if "Remboursable" in line or "Non remboursable" in line:
                refund_hint = line
                break
        if name.lower().startswith("chambre"):
            unit = "room"
        else:
            unit = "bed"
        rooms.append(
            RoomOffer(
                name=name,
                available_units=available_units,
                min_price_eur=min_price,
                unit=unit,
                refund_policy_hint=refund_hint,
            )
        )

    return Capture(
        url=FLANEUR_URL,
        final_url=page.url,
        captured_at=iso_now(),
        checkin=checkin.isoformat(),
        checkout=checkout.isoformat(),
        screenshot_path=str(screenshot_path),
        html_path=str(html_path),
        rooms=rooms,
    )


def capture_ho36(*, checkin: date, checkout: date, page, screenshot_path: Path, html_path: Path) -> Capture:
    page.goto(HO36_URL, wait_until="domcontentloaded", timeout=60_000)
    page.wait_for_timeout(2_500)

    page.locator('input[id^="mews-checkin-"]').first.fill(fmt_mm_dd_yyyy(checkin))
    page.locator('input[id^="mews-checkout-"]').first.fill(fmt_mm_dd_yyyy(checkout))
    page.wait_for_timeout(300)
    page.locator(".mews-button").first.click()
    page.wait_for_selector("iframe.mews-distributor", timeout=30_000)

    frame = page.frame_locator("iframe.mews-distributor")
    frame.locator("text=/S\u00e9lectionnez une cat\u00e9gorie/i").wait_for(timeout=45_000)

    # Currency is sometimes CAD by default. Switch to EUR if needed.
    if frame.locator("text=CAD").count():
        frame.locator("text=CAD").first.click()
        frame.locator('h2:has-text("S\u00e9lectionnez votre devise")').wait_for(timeout=30_000)
        frame.locator("text=\u20ac\xa0EUR").first.click()
        frame.locator("text=EUR").first.wait_for(timeout=30_000)
        page.wait_for_timeout(800)

    # Currency switching can trigger a brief reload; wait for room names to appear.
    frame.locator("text=/S\u00e9lectionnez une cat\u00e9gorie/i").wait_for(timeout=45_000)
    frame.locator("text=/^(Chambre|Lit|Dortoir|Suite)\\b/i").first.wait_for(timeout=45_000)

    body_text = frame.locator("body").inner_text(timeout=30_000)
    lines = [l.strip() for l in body_text.splitlines() if l.strip()]

    rooms: list[RoomOffer] = []
    name_re = re.compile(r"^(Chambre|Lit|Dortoir|Suite)\b", re.IGNORECASE)

    i = 0
    while i < len(lines):
        line = lines[i]
        if line.lower().startswith("cat\u00e9gories sans disponibilit\u00e9"):
            break
        if line != "Image pr\u00e9c\u00e9dente":
            i += 1
            continue

        j = i + 1
        while j < len(lines) and lines[j].isdigit():
            j += 1
        if j >= len(lines) or not name_re.search(lines[j]):
            i += 1
            continue

        name = lines[j]
        available_units = None
        price_eur = None
        unit = None
        refund_hint = None

        k = j + 1
        while k < len(lines):
            nxt = lines[k]
            if nxt.lower().startswith("cat\u00e9gories sans disponibilit\u00e9") or nxt == "Image pr\u00e9c\u00e9dente":
                break
            if nxt == "Non remboursable" and refund_hint is None:
                refund_hint = nxt
            if nxt.startswith("Disponible") and available_units is None:
                available_units = parse_int(nxt)
            if "\u20ac" in nxt and price_eur is None:
                price_eur = parse_eur_amount(nxt)
            if nxt.startswith("par ") and unit is None:
                unit = nxt
            k += 1

        rooms.append(
            RoomOffer(
                name=name,
                available_units=available_units,
                min_price_eur=price_eur,
                unit=unit,
                refund_policy_hint=refund_hint,
            )
        )
        i = k

    ensure_parent(screenshot_path)
    ensure_parent(html_path)
    page.screenshot(path=str(screenshot_path), full_page=True)
    html_path.write_text(body_text, encoding="utf-8")

    return Capture(
        url=HO36_URL,
        final_url=page.url,
        captured_at=iso_now(),
        checkin=checkin.isoformat(),
        checkout=checkout.isoformat(),
        screenshot_path=str(screenshot_path),
        html_path=str(html_path),
        rooms=rooms,
    )


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Capture a comparable pricing/availability window for both hostels.")
    parser.add_argument("--checkin", default="2026-01-03", help="ISO date YYYY-MM-DD")
    parser.add_argument("--checkout", default="2026-01-04", help="ISO date YYYY-MM-DD")
    parser.add_argument("--repo-root", default=str(Path(__file__).resolve().parents[2]), help="Repo root (default: auto)")
    parser.add_argument("--run-tag", default=None, help="Optional YYYYMMDD tag for filenames (default: today)")
    return parser.parse_args()


def main() -> int:
    args = parse_args()
    repo_root = Path(args.repo_root).resolve()
    checkin = parse_iso_date(args.checkin)
    checkout = parse_iso_date(args.checkout)

    if checkout <= checkin:
        raise SystemExit("--checkout must be after --checkin")

    run_tag = args.run_tag or datetime.now().strftime("%Y%m%d")
    window_tag = f"{checkin.strftime('%Y%m%d')}_{checkout.strftime('%Y%m%d')}"

    flaneur_png = repo_root / "data" / "flaneur" / "screenshots" / f"flaneur__roomraccoon__pricing__{window_tag}__{run_tag}.png"
    flaneur_html = repo_root / "data" / "flaneur" / "raw" / f"flaneur__roomraccoon__pricing__{window_tag}__{run_tag}.html"
    ho36_png = repo_root / "data" / "ho36" / "screenshots" / f"ho36__mews__pricing__{window_tag}__{run_tag}.png"
    ho36_html = repo_root / "data" / "ho36" / "raw" / f"ho36__mews__pricing__{window_tag}__{run_tag}.txt"

    out_json = repo_root / "verify" / "results" / f"pricing_window__{window_tag}__{run_tag}.json"
    ensure_parent(out_json)

    try:
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            context = browser.new_context(locale="fr-FR", timezone_id="Europe/Paris")
            page = context.new_page()

            flaneur = capture_flaneur(checkin=checkin, checkout=checkout, page=page, screenshot_path=flaneur_png, html_path=flaneur_html)
            page.wait_for_timeout(1_250)

            ho36 = capture_ho36(checkin=checkin, checkout=checkout, page=page, screenshot_path=ho36_png, html_path=ho36_html)
            browser.close()
    except (PlaywrightError, OSError, ValueError) as exc:
        raise SystemExit(f"capture failed: {exc}") from exc

    payload: dict[str, Any] = {
        "captured_at": iso_now(),
        "window": {"checkin": checkin.isoformat(), "checkout": checkout.isoformat()},
        "flaneur": asdict(flaneur),
        "ho36": asdict(ho36),
    }

    out_json.write_text(json.dumps(payload, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
    print(json.dumps(payload, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
||||
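The price parsing in `capture_pricing_window.py` takes the first number it finds in a label. The sketch below copies `parse_eur_amount` from the script and runs it on a few price strings; the strings are invented examples, not captured page text, and are only meant to show which formats the helper handles.

```python
from __future__ import annotations

import re


def parse_eur_amount(text: str) -> float | None:
    """Copied from the script: first integer or decimal found in the text."""
    cleaned = text.replace("\xa0", " ").strip()
    m = re.search(r"([0-9]+(?:[.,][0-9]{1,2})?)", cleaned)
    if not m:
        return None
    return float(m.group(1).replace(",", "."))


# Invented labels covering the decimal comma, no decimals, a thousands
# separator (non-breaking space) and a label with no number at all.
for label in ["28,00 €", "75 €", "1\xa0234,50 €", "sur demande"]:
    print(repr(label), "->", parse_eur_amount(label))
```

Prices under 1000 EUR with a comma or dot decimal parse as expected, which covers every amount in the captured window. A label with a thousands separator would be truncated to its leading digits, so the helper would need extending before it is pointed at higher price ranges.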
109
verify/tools/fetch_hostelworld_reviews.py
Normal file
@@ -0,0 +1,109 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import random
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


API_BASE = "https://prod.apigee.hostelworld.com/legacy-hwapi-service/2.2"


def iso_now() -> str:
    return datetime.now(timezone.utc).astimezone().isoformat(timespec="seconds")


def fetch_json(url: str, *, timeout_s: float = 30.0, user_agent: str | None = None) -> dict[str, Any]:
    req = Request(url)
    req.add_header("Accept", "application/json")
    req.add_header("User-Agent", user_agent or "Mozilla/5.0")
    with urlopen(req, timeout=timeout_s) as resp:
        return json.load(resp)


def jitter_sleep(base_s: float, jitter_s: float) -> None:
    time.sleep(max(0.0, base_s + random.uniform(-jitter_s, jitter_s)))


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Fetch Hostelworld property reviews (no login).")
    parser.add_argument("--property-id", type=int, required=True)
    parser.add_argument(
        "--month-count",
        type=int,
        default=12,
        help="Legacy API uses a fixed tail window; set high and filter client-side by date",
    )
    parser.add_argument("--currency", default="EUR")
    parser.add_argument("--out", required=True, help="Output JSON path")
    parser.add_argument("--sleep-s", type=float, default=1.0, help="Delay between page requests")
    parser.add_argument("--jitter-s", type=float, default=0.25, help="Random +/- jitter applied to --sleep-s")
    parser.add_argument("--timeout-s", type=float, default=30.0)
    parser.add_argument("--user-agent", default=None)
    return parser.parse_args()


def main() -> int:
    args = parse_args()
    out_path = Path(args.out)
    out_path.parent.mkdir(parents=True, exist_ok=True)

    base_url = (
        f"{API_BASE}/properties/{args.property_id}/reviews/"
        f"?sort=-date&allLanguages=false&page=1&monthCount={args.month_count}&currency={args.currency}"
    )

    fetched_at = iso_now()
    try:
        page1 = fetch_json(base_url, timeout_s=args.timeout_s, user_agent=args.user_agent)
    except (HTTPError, URLError, TimeoutError, ValueError) as exc:
        raise SystemExit(f"failed to fetch page 1: {exc}") from exc

    pagination = page1.get("pagination") or {}
    number_of_pages = int(pagination.get("numberOfPages") or 1)
    total_items = int(pagination.get("totalNumberOfItems") or 0)

    pages: list[dict[str, Any]] = [page1]
    for page_num in range(2, number_of_pages + 1):
        jitter_sleep(args.sleep_s, args.jitter_s)
        url = (
            f"{API_BASE}/properties/{args.property_id}/reviews/"
            f"?sort=-date&allLanguages=false&page={page_num}&monthCount={args.month_count}&currency={args.currency}"
        )
        pages.append(fetch_json(url, timeout_s=args.timeout_s, user_agent=args.user_agent))

    reviews: list[dict[str, Any]] = []
    review_statistics: dict[str, Any] | None = None
    for page in pages:
        if review_statistics is None:
            review_statistics = page.get("reviewStatistics") or {}
        reviews.extend(page.get("reviews") or [])

    payload: dict[str, Any] = {
        "fetched_at": fetched_at,
        "api_base": API_BASE,
        "property_id": args.property_id,
        "month_count": args.month_count,
        "currency": args.currency,
        "pagination": {
            "number_of_pages": number_of_pages,
            "total_number_of_items": total_items,
        },
        "review_statistics": review_statistics or {},
        "reviews": reviews,
    }

    out_path.write_text(json.dumps(payload, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
    print(json.dumps(payload, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
||||
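`fetch_hostelworld_reviews.py` builds the same reviews URL twice by string interpolation, once for page 1 and once inside the pagination loop. A possible alternative, sketched here under the assumption that the endpoint accepts the same query parameters in the same form, is a single builder based on `urllib.parse.urlencode`. The helper name `reviews_url` and the property id `123` are hypothetical.

```python
from urllib.parse import urlencode

API_BASE = "https://prod.apigee.hostelworld.com/legacy-hwapi-service/2.2"


def reviews_url(property_id, page, month_count=12, currency="EUR"):
    """Build the reviews URL for one page; parameter names mirror the script."""
    query = urlencode(
        {
            "sort": "-date",
            "allLanguages": "false",
            "page": page,
            "monthCount": month_count,
            "currency": currency,
        }
    )
    return f"{API_BASE}/properties/{property_id}/reviews/?{query}"


# 123 is a placeholder id, not a real property.
print(reviews_url(123, page=1))
print(reviews_url(123, page=2))
```

Keeping the parameter names in a dictionary means they are spelled once, and `urlencode` escapes any value that would otherwise need quoting.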