Block Ads in Playwright with an Adblocker
Playwright is such a powerful technology because it gives you programmatic access to a full web browser. However, that means it also suffers from the same weaknesses as the browser. Anything that slows down a human browsing the web also slows down your scripts.
Not least among these slow-downs are online ads, trackers, and other annoyances. By preventing these third-party resources from loading, we can improve performance, improve stability, and decrease cost.
This article covers how you can efficiently reduce the load of ads and trackers within Playwright scripts for an optimal automation experience. If you're interested in going deeper into resource management and optimization, you'll love my Playwright resource deep-dive.
Let's get started...
Block Ads With Built-In Methods
A straightforward method to stop ads and trackers is to block them by domain. For websites that depend on only a few third-party domains, this is an effective solution with no extra dependencies:
// intercept every request the page makes
await page.route('**/*', async (route) => {
  // abort requests to known ad domains; let everything else through
  if (route.request().url().includes('cdn.ads.com')) {
    await route.abort();
  } else {
    await route.continue();
  }
});
This is an excellent first step in managing ads during a Playwright job, but it's also quite limited. First, it requires you to keep track of every third party you want to block on your target site. Second, it requires you to know ahead of time which websites your script will run on. That isn't always possible: your tool might load sites selected by your end users, or it might crawl multiple arbitrary sites during runtime.
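If you do stick with the built-in approach, matching on the hostname is a bit safer than a substring check on the whole URL, which would also fire on a page that merely mentions the ad domain in a query string. Here's a minimal sketch of that idea. The `BLOCKED_DOMAINS` list and the helper names `isBlocked` and `blockDomains` are my own placeholders, not part of Playwright:

```javascript
// Hypothetical blocklist; replace with the domains you actually want to stop.
const BLOCKED_DOMAINS = ['cdn.ads.com', 'tracker.example'];

// True when the URL's hostname is a blocked domain or one of its subdomains.
function isBlocked(url, domains = BLOCKED_DOMAINS) {
  let hostname;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // not a parseable absolute URL, so let it through
  }
  return domains.some((d) => hostname === d || hostname.endsWith('.' + d));
}

// Registers a single route handler on a Playwright page.
async function blockDomains(page, domains = BLOCKED_DOMAINS) {
  await page.route('**/*', (route) =>
    isBlocked(route.request().url(), domains) ? route.abort() : route.continue()
  );
}
```

Call `await blockDomains(page)` once before navigating. You still have to maintain the list yourself, so the limitations above still apply.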
Embed an Ad Blocker
Just like in a real browser, you can run an adblocker in your Playwright scripts, leveraging huge open-source blacklists of known offenders.
While this strategy depends on an external library, it's by far the best way to block wasteful web traffic in your automations.
From all the options out there, I recommend @cliqz/adblocker-playwright. It's a great library with clean docs and great config options (if you even need them). Let's set it up...
Getting Started with @cliqz/adblocker-playwright
It all starts with installing the package:
npm install --save @cliqz/adblocker-playwright
Then set it up like so:
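import * as pw from 'playwright';
import { PlaywrightBlocker } from '@cliqz/adblocker-playwright';
// top-level await needs an ES module (or wrap this in an async function)
const browser = await pw.chromium.launch();
const context = await browser.newContext();
// `fetch` is global in Node 18+; on older versions pass a polyfill such as cross-fetch
const blocker = await PlaywrightBlocker.fromPrebuiltAdsAndTracking(fetch);
// every page opened from this context is now filtered
blocker.enableBlockingInContext(context);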
By tapping into an up-to-date list of ad sources and trackers, this strategy efficiently filters out unwanted noise. And if you need more control, well, you got it...
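It's worth confirming that the blocker is actually doing something on your target sites. The sketch below tallies blocked requests per hostname. It assumes the blocker emits a `request-blocked` event whose payload has a `url` property, which is how I understand the library's event API; `trackBlockedRequests` is a hypothetical helper name of my own:

```javascript
// Counts blocked requests per hostname. `blocker` is expected to be a
// PlaywrightBlocker, but anything with an `on(event, handler)` method works.
function trackBlockedRequests(blocker) {
  const stats = { total: 0, byHost: new Map() };
  blocker.on('request-blocked', (request) => {
    stats.total += 1;
    let host = 'unknown';
    try {
      host = new URL(request.url).hostname;
    } catch {
      // keep the 'unknown' bucket for URLs that fail to parse
    }
    stats.byHost.set(host, (stats.byHost.get(host) || 0) + 1);
  });
  return stats;
}
```

Call `const stats = trackBlockedRequests(blocker)` right after creating the blocker, then log `stats.total` and `stats.byHost` once your job finishes.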
Pick Your Favorite Blacklists
Perhaps you've got a custom blacklist. Or perhaps you want to include a few of the open-source lists maintained by good webizens. @cliqz/adblocker supports leveraging both local and remote blacklists.
Here's an example:
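import { readFileSync } from 'fs';
import { PlaywrightBlocker } from '@cliqz/adblocker-playwright';
// remote lists: downloaded and parsed at startup
const blocker = await PlaywrightBlocker.fromLists(fetch, [
  'https://easylist.to/easylist/easylist.txt',
]);
// local list: parse a filter file from disk instead (the path is a placeholder)
// const blocker = PlaywrightBlocker.parse(readFileSync('my-filters.txt', 'utf-8'));
// `context` is the browser context created in the setup snippet
blocker.enableBlockingInContext(context);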
Interested in picking some custom lists? I suggest starting with EasyList, then if you discover gaps, search the web. There are many, many options.
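If your "custom list" is really just a handful of domains, you can turn them into Adblock Plus style network filters, the same syntax EasyList uses, where `||example.com^` matches a domain and its subdomains. A small sketch, with `domainsToFilters` being a hypothetical helper of my own:

```javascript
// Turns bare domains into Adblock Plus style network filters, one per line.
function domainsToFilters(domains) {
  return domains
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0)
    .map((d) => `||${d}^`)
    .join('\n');
}
```

The resulting string can be saved to a file and loaded as a local list, or, assuming the library's static `parse` method behaves as I expect, passed straight to `PlaywrightBlocker.parse(...)`.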
Next Steps...
These ad-blocking strategies with Playwright pave the way for cleaner automation workflows, better performance, and enhanced security. But this is only one tool in your Playwright optimization toolkit. Be sure to explore my in-depth guide on managing Playwright resources. It has a lot of great tips packed as densely together as I could fit them.
In the meantime, happy automating!