Thursday, September 11, 2025

Hello, my web automation framework!

Recently I’ve been enjoying web automation. If you often find yourself repeating the same manual steps in the browser — logging in, clicking menus, filling forms, checking results — you know how tedious it can get. To save time, I built a lightweight web automation framework using Node.js and Puppeteer. The best part is that I can pause the automation at any step, do some manual work in the browser if needed, and then continue with the following steps seamlessly.

Here are the key design ideas:

1. Store inputs in .env
Sensitive values like username, password, and target URLs don’t belong in code. I load them at runtime with dotenv:
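/* .env
LOGIN_USER=myuser
LOGIN_PASS=secretpassword
URL=https://example.com
*/

import dotenv from "dotenv";
dotenv.config();

// Avoid the name USER: most shells already set it, and dotenv does not
// override existing environment variables by default.
console.log(process.env.LOGIN_USER, process.env.URL);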
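A missing entry in .env only shows up later as undefined, so it can help to check the required names up front. The following is a minimal sketch; requireEnv is a hypothetical helper and not part of dotenv.

```javascript
// Hypothetical helper (not part of dotenv): fail fast when a setting is missing.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  // Return only the requested values so callers need not touch process.env.
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// Example with an explicit object instead of process.env:
const config = requireEnv(["URL"], { URL: "https://example.com" });
console.log(config.URL); // prints "https://example.com"
```

In a real script the call would come right after dotenv.config(), with the list of names the steps depend on.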
2. Share state with a global context object
Instead of passing around dozens of variables, I keep everything in one ctx object. This context is passed into all modules and steps, so they can use common objects like browser, page, or config flags.
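As a rough illustration, the context could start out like this. The property names (config, browser, page, steps) and the createContext function are assumptions made for this sketch, not a fixed layout.

```javascript
// Hypothetical factory; the property names are illustrative assumptions.
function createContext(env = process.env) {
  return {
    config: { url: env.URL, verbose: false }, // settings and flags
    browser: null, // set once the script connects to Chrome
    page: null,    // set once a page is attached or created
    steps: [],     // filled in by the app script
  };
}

const ctx = createContext({ URL: "https://example.com" });
console.log(Object.keys(ctx));
```

Every step then receives the same object, so a value stored by one step (for example ctx.page) is visible to the next.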

3. Reuse or create a browser page
The framework can attach to an existing page of the Chrome session if it’s already there, or launch a new one if not. This way I can debug in a real browser window or let the script create a fresh page for me.
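The attach-or-create logic might be wrapped in a small function like the one below. This is a sketch under assumptions: getOrCreatePage is a hypothetical name, and browser is expected to be a Puppeteer Browser object that the script has already connected to.

```javascript
// Sketch: reuse an open tab whose URL contains `url`, otherwise open a new one.
// `browser` is expected to be a Puppeteer Browser; the function name is hypothetical.
async function getOrCreatePage(browser, url) {
  const pages = await browser.pages();
  const existing = pages.find((p) => p.url().includes(url));
  if (existing) {
    await existing.bringToFront(); // make the reused tab visible
    return { page: existing, reused: true };
  }
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "domcontentloaded" });
  return { page, reused: false };
}
```

The reused flag lets the caller decide whether a login step is still needed or can be skipped.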

4. Define steps as reusable objects
All actions are defined in ctx.steps, where each step has a name and a run(ctx) function. For example:
// app.mjs
import { runner } from "../runner.mjs";

// Shared context object; created here so the example is self-contained
const ctx = {};

ctx.steps = [
  {
    name: "Page step 1",
    run: async (ctx) => {
      const { page } = ctx;
      // do something on page
    },
  },
  {
    name: "Page step 2",
    run: async (ctx) => {
      const { page } = ctx;
      // do something on page
    },
  },
];

// kick off the web automation
await runner(ctx);
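runner.mjs is not shown in this post, so the following is only a guess at its core loop, not the actual implementation. It assumes 1-based step numbers and optional from/to arguments.

```javascript
// Hypothetical runner: executes steps `from`..`to` (1-based, inclusive) in order.
// In runner.mjs this function would be exported.
async function runner(ctx, from = 1, to = ctx.steps.length) {
  const first = Math.max(1, from);
  const last = Math.min(to, ctx.steps.length);
  const done = [];
  for (let i = first; i <= last; i++) {
    const step = ctx.steps[i - 1];
    console.log(`[${i}/${ctx.steps.length}] ${step.name}`);
    await step.run(ctx); // a thrown error stops the run at this step
    done.push(step.name);
  }
  return done; // names of the steps that completed
}
```

Because each step is awaited before the next one starts, a step that waits for manual input naturally pauses the whole run.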

5. Focus only on the steps
With this framework, I can focus on writing the steps. Because they’re modular, I can run them sequentially, skip some, or even execute them in any order, all with the same framework.
Usage:
    node app.mjs                     # run all steps
    node app.mjs 3                   # run from step 3 to end
    node app.mjs 2 5                 # run steps 2..5
    node app.mjs --list              # list steps
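The command-line forms above could be parsed along these lines. parseStepArgs is a hypothetical name, and the range check is an assumption about how invalid input should be handled.

```javascript
// Hypothetical parser for: node app.mjs [from] [to] | --list
function parseStepArgs(argv, stepCount) {
  if (argv.includes("--list")) return { list: true };
  const numbers = argv.map(Number).filter(Number.isInteger);
  const from = numbers[0] ?? 1;         // default: start at the first step
  const to = numbers[1] ?? stepCount;   // default: run to the last step
  if (from < 1 || to > stepCount || from > to) {
    throw new RangeError(`Step range ${from}..${to} is outside 1..${stepCount}`);
  }
  return { list: false, from, to };
}

console.log(parseStepArgs(["2", "5"], 6));
```

In the app script the first argument would be process.argv.slice(2) and the second ctx.steps.length.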


In the future, I plan to extend this framework with screenshots on errors, step runtime tracking, and detailed reporting. That way, not only will the automation save me from repetitive work, but it will also give me clear insights into how each step performs and where failures happen.

Tuesday, September 2, 2025

Partial Web Automation: Chrome Debugging + Puppeteer

I don’t want to fully automate every interaction in the browser - just certain repetitive routines (e.g., filling in forms, checking values, etc.). At the same time, I like to watch the script run step by step so I can verify what’s happening.

Here’s how I’ve set things up to partially automate my workflow.
For example, when I need to log in to an insurance company’s website and fill out an application, I store the applicant information (name, birthday, address) in an Excel file. I log in to the site manually, then launch a Node.js script to handle the repetitive parts. This way, I can monitor each step as it runs and still perform the final review myself.

Steps:
1. Install Node.js and Puppeteer
# install the package in the project folder so that require('puppeteer') can find it
npm install puppeteer
2. Start Chrome with a debugging port
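Launch Chrome with the option --remote-debugging-port=8888. Recent Chrome versions ignore this option for the default profile, so you may also need to add --user-data-dir pointing to a separate profile folder. To confirm it’s running correctly, check either of these URLs in your browser:
chrome://version/ (the Command Line entry should include the option)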
http://127.0.0.1:8888/json/version (shows Chrome’s DevTools JSON endpoint)
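The same check can be done from Node.js before connecting. This is a minimal sketch that assumes Node.js 18 or later (for the global fetch); getDebugInfo is a hypothetical helper name, and the fetchImpl parameter exists only so the function can be tried without a running browser.

```javascript
// Sketch: query Chrome's DevTools endpoint and return its version info.
// Uses the global fetch of Node.js 18+; `fetchImpl` can be swapped for testing.
async function getDebugInfo(port = 8888, fetchImpl = fetch) {
  const res = await fetchImpl(`http://127.0.0.1:${port}/json/version`);
  if (!res.ok) {
    throw new Error(`DevTools endpoint returned HTTP ${res.status}`);
  }
  const info = await res.json();
  // "Browser" and "webSocketDebuggerUrl" are fields of the /json/version response.
  return { browser: info.Browser, wsUrl: info.webSocketDebuggerUrl };
}
```

Calling getDebugInfo() with no arguments rejects with a connection error when Chrome is not listening on the port, which is a clearer signal than a failed connect later in the script.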

3. Log in and let Puppeteer handle the routine
After logging in to the site manually, you can add code at the // DO SOMETHING placeholder to automate whichever steps you’d like.
const url = 'https://www.manulife.ca';
const puppeteer = require('puppeteer');

(async () => {
    // connect to Chrome
    const browser = await puppeteer.connect({
        browserURL: 'http://127.0.0.1:8888', // debug endpoint
        defaultViewport: null,
    });
    
    const pages = await browser.pages();
    console.log('NOTE: Pages currently open:', pages.map(p => p.url()));

    let page = pages.find(p => p.url().includes(url));

    if (page) {
        console.log('NOTE: Attaching existing page:', page.url());
        await page.bringToFront();
    } else {
        console.log('NOTE: Creating new one...');
        page = await browser.newPage();
        await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 90_000 });
    }

    // DO SOMETHING ...

    // detach from Chrome without closing the browser window
    await browser.disconnect();
})().catch(err => {
    console.error('ERROR: automation failed.');
    console.error(err);
    process.exit(1);
});
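As an illustration of what could go in place of the // DO SOMETHING placeholder, here is a sketch that types applicant data into a form. The selectors (#name, #birthday, #address) and the shape of the applicant object are made up for this example and need to be replaced with those of the real page.

```javascript
// Sketch of a routine for the placeholder. The selectors and the applicant
// fields are hypothetical; adjust them to the real form.
async function fillApplicant(page, applicant) {
  const fields = {
    "#name": applicant.name,
    "#birthday": applicant.birthday,
    "#address": applicant.address,
  };
  for (const [selector, value] of Object.entries(fields)) {
    await page.waitForSelector(selector); // wait until the field is in the DOM
    await page.type(selector, value);     // type the value like a user would
  }
}
```

Inside the script this would be called as await fillApplicant(page, applicant), leaving the final review and the submit click to be done by hand.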