Saturday, May 17, 2025

Docker + Node.js + Puppeteer: scraping TuGo travel insurance quotes

I built a tool to scrape travel insurance quotes from various provider websites. For most of them, the process was straightforward: I could analyze the HTTP traffic to identify the relevant endpoints, then use PROC HTTP to send POST requests directly and retrieve the quotes.

However, scraping quotes from the TuGo website proved to be much more challenging.
TuGo’s quote system is highly dynamic and relies heavily on JavaScript. Initially, I considered using Python with Selenium, but it quickly became clear that this approach wasn’t ideal because of performance limitations, maintenance problems, and compatibility issues with the site’s rendering logic.

After reviewing several popular tools, I decided to go with Node.js and Puppeteer. This combination proved to be significantly more reliable and better suited for interacting with TuGo’s modern front-end framework.

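Here is the command I used to run the scraper inside the official Puppeteer Docker image. The three arguments are the traveller's date of birth, the trip start date, and the trip end date, all in DD/MM/YYYY format: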
$ docker run -it \
  -v "$PWD:/home/pptruser/app" \
  -w /home/pptruser \
  --rm ghcr.io/puppeteer/puppeteer \
  node ./app/tugo_quote.js 06/07/1966 17/05/2025 31/05/2025
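
The source of tugo_quote.js is not listed in this post, so the following is only a minimal sketch of how the three positional arguments from the command above could be read and validated before any browser work starts. The helper names parseDmy and parseArgs and the error messages are hypothetical; only the argument order and the DD/MM/YYYY format come from the command and its output.

```javascript
// Hypothetical helper (not from the original script): parse a DD/MM/YYYY
// string into a Date, rejecting malformed or impossible dates.
function parseDmy(text) {
  const match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(text);
  if (!match) {
    throw new Error(`Expected DD/MM/YYYY, got "${text}"`);
  }
  const [day, month, year] = match.slice(1).map(Number);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC silently rolls over impossible dates (e.g. 31/02/2025 becomes
  // 03/03/2025), so check that the components survive a round trip.
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    throw new Error(`Not a real calendar date: "${text}"`);
  }
  return date;
}

// Hypothetical helper: map the positional arguments to named inputs,
// using the same labels that appear in the script's "Inputs" banner.
function parseArgs(argv) {
  if (argv.length !== 3) {
    throw new Error("Usage: node tugo_quote.js <dob> <trip_start> <trip_end>");
  }
  const [traveller_dob, trip_start_date, trip_end_date] = argv;
  parseDmy(traveller_dob);
  if (parseDmy(trip_end_date) < parseDmy(trip_start_date)) {
    throw new Error("trip_end_date is before trip_start_date");
  }
  return { traveller_dob, trip_start_date, trip_end_date };
}

// In the script itself this would be called as:
//   const inputs = parseArgs(process.argv.slice(2));
```

Failing early on a bad date is cheaper than launching a browser and discovering the problem halfway through the quote form.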
    
Sample output:
********  Inputs ********
traveller_dob: 06/07/1966
trip_start_date: 17/05/2025
trip_end_date: 31/05/2025
********  Tugo travel quote START ********
1. Basic information input
1.a Origin
1.b Destination
1.c Trip Start Date
1.d Trip End Date
1.e Trip Arrival Date
1.f Trip Cost
1.g Traveller info
1.h Click Button - Get a Quote
2. Quote results
2.a Close promotion dialog
2.b Fill in Questions if exist
No Questionnaire button found — skipping.
2.c Enable sliders
2.d Quote loop
Quote for Sum Insured $50K Deductiable $0 =  $60.48
Quote for Sum Insured $50K Deductiable $500 =  $54.43
Quote for Sum Insured $50K Deductiable $1000 =  $48.38
Quote for Sum Insured $100K Deductiable $0 =  $85.99
Quote for Sum Insured $100K Deductiable $500 =  $77.39
Quote for Sum Insured $100K Deductiable $1000 =  $68.80
******** Tugo travel quote END ********
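
To give a rough idea of what steps 2.b and 2.d in the output above might look like in code, here is a minimal sketch. It is not the actual script: clickIfPresent, formatQuote, runQuoteLoop and the readPremium callback are hypothetical names, and the page interaction that would move the sliders and read the price is left to the callback because the real selectors are not shown in this post. The only Puppeteer behaviour relied on is that page.$() resolves to null when nothing matches the selector.

```javascript
// Hypothetical helper for step 2.b: click an optional element if it exists.
// `page` is assumed to behave like a Puppeteer Page, whose page.$()
// resolves to null when no element matches the selector.
async function clickIfPresent(page, selector, label) {
  const handle = await page.$(selector);
  if (handle === null) {
    console.log(`No ${label} found, skipping.`);
    return false;
  }
  await handle.click();
  return true;
}

// Format one result line in the style of the sample output.
function formatQuote(sumInsuredK, deductible, premium) {
  return `Quote for Sum Insured $${sumInsuredK}K Deductible $${deductible} =  $${premium.toFixed(2)}`;
}

// Hypothetical sketch of step 2.d: try every sum insured / deductible pair.
// `readPremium` is an assumed async callback that would set the sliders on
// the page and return the displayed premium as a number.
async function runQuoteLoop(readPremium, sumsInsuredK, deductibles) {
  const quotes = [];
  for (const sumInsuredK of sumsInsuredK) {
    for (const deductible of deductibles) {
      // Run the combinations one at a time, since they share one page.
      const premium = await readPremium(sumInsuredK, deductible);
      console.log(formatQuote(sumInsuredK, deductible, premium));
      quotes.push({ sumInsuredK, deductible, premium });
    }
  }
  return quotes;
}
```

Passing the page-specific logic in as a callback keeps the loop itself testable without a browser: runQuoteLoop(readPremium, [50, 100], [0, 500, 1000]) visits the same six combinations, in the same order, as the sample output.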