# Scrape Your Browser Into Your Second Brain
Building with an AI agent? Give your agent this URL to fetch and read. The Implementation Guide below has everything it needs.
You’re staring at a web app — a hosting panel, a SaaS dashboard, your DNS settings — and you want that information in your vault. Not as a bookmark. Not as a screenshot you’ll never look at again. As structured, searchable notes you can actually use.
The options aren’t great. You can screenshot the page and hand it to an LLM to transcribe — but you lose table structure, metadata, and anything that wasn’t visible on screen. You can copy-paste, but web apps render tables in ways that turn to mush in a text editor. Or you can give an AI agent your credentials and let it log in directly — which means trusting it with access you wouldn’t hand to a coworker.
What changed recently: Chrome now lets external tools connect to a browser session you already have open. It was built for developers debugging websites — inspect the DOM, watch network traffic, run JavaScript against a live page. But there’s nothing stopping you from using it to collect information. Your agent connects to your browser, sees what you see, and reads the data. No credentials change hands. You’re already logged in.
## A Real Example
I was migrating domains from Dreamhost to Cloudflare. That meant extracting everything from Dreamhost first — 81 email forwarding rules, a dozen hosted websites with their PHP versions and configurations — making sure I had it right, and then rebuilding it on Cloudflare. The kind of thing where you’d normally open two browser tabs and copy settings back and forth, taking notes as you go so you don’t lose track of what’s been moved.
Instead, I had Claude connect to my open Dreamhost panel, pull the data, and write it into my vault as structured notes. Two minutes. The email rules came back as a grouped list, the websites as a markdown table. Now I had a record of the original configuration I could reference while setting things up on the other side — and it stays in my vault as permanent documentation.
```markdown
## Email Forwarding

### dreamhost.com
- admin → [email protected]
- billing → [email protected], [email protected]
- support → [email protected]

### otherdomain.net
- catch-all → [email protected]

## Hosted Websites

| Domain | PHP | Status | Monthly Visits |
|--------|-----|--------|----------------|
| example.com | 8.2 | Active | 12,400 |
| blog.example.com | 8.1 | Active | 3,200 |
```

## How It Works
Chrome DevTools MCP connects your AI agent to the browser session you already have open. You navigate to the page. The agent reads what’s there — or better, reads the API responses sitting in the network tab. The browser is already a pre-authenticated API client; the agent just sees what it fetched.
## The Surprise: The DOM Will Lie to You
Here’s where it got interesting. My first instinct was to have the agent grab table rows directly from the page. It found 12 entries out of 81.
Modern web apps use virtual rendering — libraries like virtuoso that only create DOM elements for rows visible on screen. You scroll and see 80 rows, but the DOM holds a handful at a time. Standard queries return incomplete data.
The data had to come from somewhere. Virtual lists don’t generate data — they render it. The actual payload sits in a network response: a single JSON object from an API call. The agent filtered network requests, found the endpoint that loaded the page data, and read the response body directly. Dreamhost’s panel fetched from /api3/v1/emails/?size=20 — a clean JSON array with every field.
Since the agent runs JavaScript in the browser context — already authenticated — it re-called the same API with a larger page size. No pagination, no scrolling, no DOM scraping. Just the raw data, structured and complete.
## What I Learned
The network tab is the real interface to a modern web app. SPAs fetch everything as JSON from internal APIs, and that structured data is cleaner than anything you’d extract from rendered HTML. Once you see this, “scraping” stops being the right word. You’re reading the same API the app reads. The browser already did the hard part.
This pattern works for anything behind a login: SaaS dashboards, admin panels, config pages, billing portals. The data already exists in structured form — it’s just trapped behind GUIs designed for humans to look at, not for agents to read. Chrome DevTools MCP turns your browser into a bridge between those web apps and your vault.
## Implementation Guide

For AI agents and technical readers. Everything below is the nuts and bolts: architecture, setup, key patterns, and gotchas. If you're an agent implementing this for a user, start here.
### Architecture
```
User's browser (authenticated session)
  ↕ Chrome DevTools Protocol (port 9222)
Chrome DevTools MCP Server
  ↕ MCP tool calls
Claude Code (or any MCP-capable agent)
  → reads network responses (structured JSON)
  → formats as markdown
  → writes to Obsidian vault
```

No credentials are exchanged. The agent connects to the browser's debug port and reads data from the session the user already has open.
### Setup: Chrome with Remote Debugging
Chrome must be launched with the debugging flag. Fully quit Chrome first (Cmd+Q on Mac) — if an existing process is running, macOS reuses it and ignores the flag.
```shell
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
```

Verify it's working:
```shell
curl http://localhost:9222/json/version
```

If that returns JSON, you're set. If you get a 404 or no response, Chrome didn't pick up the flag: quit it fully and relaunch.
### MCP Server Configuration
Add Chrome DevTools MCP to your Claude Code config (`.mcp.json` in the project root, or via `claude mcp add`):
```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    }
  }
}
```

### Key Pattern: Screenshot → Network Tab → Format
Step 1: Orient with a screenshot. Confirm the agent sees the right page. This creates shared context.
Step 2: Skip the DOM. Virtual rendering means DOM queries return incomplete data. Look for signs: data-virtuoso-scroller, dynamically inserted rows, elements with display: none. If present, DOM scraping will miss most of the data.
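That check can be made mechanical. Here is a sketch of a marker scan over whatever HTML snapshot the agent has on hand; the marker strings are best-guess examples for common libraries, not an exhaustive or authoritative list:

```javascript
// Heuristic check for virtual rendering, given a page's HTML as a string.
// Marker strings below are illustrative examples, not a complete list.
function looksVirtualized(html) {
  const markers = [
    'data-virtuoso-scroller',   // React Virtuoso
    'ag-body-viewport',         // ag-Grid
    'ReactVirtualized__Grid',   // react-virtualized
  ];
  const hit = markers.find((m) => html.includes(m));
  return hit ? { virtualized: true, marker: hit } : { virtualized: false, marker: null };
}

// A fragment rendered by React Virtuoso trips the heuristic:
const sample = '<div data-virtuoso-scroller="true" style="height: 400px"></div>';
console.log(looksVirtualized(sample));
```

A hit means DOM scraping will likely return a partial result, so go straight to the network tab.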
Step 3: Read network responses. Filter to xhr/fetch requests. Find the API endpoint that loaded the page data. Read the response body — it’s usually a single JSON payload with everything.
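One way to pick out that endpoint programmatically is to rank captured responses by how many records they carry. A sketch, assuming the agent holds each request as a `{ url, resourceType, body }` object; these field names are illustrative, not the MCP server's actual schema:

```javascript
// Sketch: find the likeliest data endpoint among captured requests by
// counting the records in each JSON response body.
function findDataEndpoint(requests) {
  return requests
    .filter((r) => r.resourceType === 'xhr' || r.resourceType === 'fetch')
    .map((r) => {
      try {
        const json = JSON.parse(r.body);
        // A top-level array, or the largest array-valued field.
        const records = Array.isArray(json)
          ? json.length
          : Math.max(0, ...Object.values(json).map((v) => (Array.isArray(v) ? v.length : 0)));
        return { url: r.url, records };
      } catch {
        return null; // not JSON — skip
      }
    })
    .filter(Boolean)
    .sort((a, b) => b.records - a.records)[0] ?? null;
}
```

Against the Dreamhost example, the `/api3/v1/emails/` response would outrank pings and static assets because it carries the largest array.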
Step 4: Re-fetch with larger page size. The agent can execute JavaScript in the browser context (already authenticated). If the original request was paginated:
```javascript
const resp = await fetch('/api3/v1/emails/?size=100&sorts[]=domain_asc');
const data = await resp.json();
// data.emails contains all records as structured JSON
```

Step 5: Format and write. Structure the JSON as markdown suited to the user's vault: tables, grouped lists, whatever fits the existing note format. Append to the relevant note rather than creating a new file.
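Step 5 can be sketched as a small formatter that groups records by domain and emits the list format from the example above. The record fields here (`domain`, `alias`, `destinations`) are assumptions about the API's shape, not a documented schema, and the addresses are placeholders:

```javascript
// Sketch: group email-forwarding records by domain and emit grouped
// markdown. Field names are assumed, not Dreamhost's documented schema.
function emailRulesToMarkdown(emails) {
  const byDomain = new Map();
  for (const e of emails) {
    if (!byDomain.has(e.domain)) byDomain.set(e.domain, []);
    byDomain.get(e.domain).push(`- ${e.alias} → ${e.destinations.join(', ')}`);
  }
  const sections = [...byDomain.entries()].map(
    ([domain, rules]) => `### ${domain}\n${rules.join('\n')}`
  );
  return `## Email Forwarding\n\n${sections.join('\n\n')}`;
}

// Hypothetical records standing in for data.emails:
const md = emailRulesToMarkdown([
  { domain: 'dreamhost.com', alias: 'admin', destinations: ['[email protected]'] },
  { domain: 'dreamhost.com', alias: 'billing', destinations: ['[email protected]', '[email protected]'] },
]);
console.log(md);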
### Gotchas
- Chrome flag doesn't persist. Must relaunch with `--remote-debugging-port=9222` each session. Consider a shell alias.
- Navigate first. The agent sees whatever page is open. The user must be on the right page before the agent starts reading.
- Virtual rendering is common. Any app using React Virtuoso, react-window, ag-Grid virtual mode, or similar will have incomplete DOMs. Always check the network tab first.
- Auth tokens are visible. The agent can see cookies and headers in network responses. Session-scoped, but be aware.
- Some sites detect DevTools. Rare for admin panels, but possible. If the page behaves differently with DevTools open, the site may be checking.
### When to Use This Pattern
- Documenting SaaS tool configurations (hosting panels, DNS, email rules)
- Pulling data from dashboards behind auth (analytics, billing, usage stats)
- Capturing settings from admin interfaces you’d never manually transcribe
- Any web app where the data is structured but the export options are poor