Simplify and Stabilize Your Playwright Locators



Who controls the locators, controls the web.

Much of Playwright's power comes from its ability to target and interact with elements on a webpage. But as you well know, the web is a finicky place. Elements come and go, and the slightest change in HTML can break your automation scripts.

In this article, we'll explore strategies for creating simple and stable Playwright locators that can withstand the test of time and changes in web applications. We'll also look at how to optimize your selectors for speed and efficiency.

For a more in-depth look at these concepts, check out my complete deep dive on optimizing Playwright locators.

Let's go!

Optimize Selector Specificity

Refining selector specificity involves finding an ideal equilibrium: you need selectors that are precise enough to unambiguously identify target elements across different page states, yet not so intricate that they shatter with the slightest DOM adjustments.

Steer clear of overly rigid selectors: a trivial HTML modification can render them useless. On the flip side, overly loose selectors can push you toward position-dependent logic (like indices) or unwarranted reliance on the web page's current implementation.

We're looking for the Goldilocks zone of search queries.

Here are a few anti-patterns and how to fix them:

// 🔴 BAD: Selectors based on fragile structural assumptions
const $link = page.locator('li:nth-child(3) > a');
// 🟢 GOOD: Selectors grounded in distinguishing features
const $link = page.locator('a[href*="privacy"]');

// 🔴 BAD: Selectors anchored to mutable text
const $button = page.locator('button', {hasText: 'Sign up'});
// 🟢 GOOD: Robust, pattern-based selectors
const $button = page.locator('button', {
  hasText: /(sign|start|subscribe|launch)/i,
});

// 🔴 BAD: Selectors tied to changeable style-specific classes
const $input = page.locator('input.email');
// 🟢 GOOD: Selectors focused on functional attributes
const $input = page.locator('input[type="email"]');

// 🔴 BAD: Excessively specific hierarchical selectors
const $list = page.locator('footer > ul.inline-links > li');
// 🟢 GOOD: Shallow, adaptive nesting
const $list = page.locator('footer:last-of-type li', {
  has: page.locator('a'),
});

To sum up:

  • Prioritize attributes that capture the essence of the target element, rather than arbitrary qualities.

  • If your target elements have no stable defining characteristics, anchor your queries to parents or children who do have such qualities.

  • Keep chain selectors short, and try to only use stable elements within them.

  • Leverage regular expressions for text matching to accommodate variations.
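One caveat on regular expressions for text matching: an unanchored alternation also matches inside longer words. The sketch below is plain TypeScript with no Playwright dependency. The button labels are made up for illustration, and `looseCta` and `boundedCta` are names I've invented here. It compares the pattern from the example above with a variant that adds a leading `\b` word boundary.

```typescript
// The pattern from the example above: matches a keyword anywhere in the text.
const looseCta = /(sign|start|subscribe|launch)/i;

// A stricter variant (an assumption, not part of the original example):
// the leading `\b` requires each keyword to begin at a word boundary.
const boundedCta = /\b(sign|start|subscribe|launch)/i;

// Hypothetical button labels, for illustration only.
const labels = ['Sign up', 'START FREE TRIAL', 'Subscribe', 'Design tools', 'Restart'];

const looseMatches = labels.filter((label) => looseCta.test(label));
const boundedMatches = labels.filter((label) => boundedCta.test(label));

console.log(looseMatches);
console.log(boundedMatches);
```

The loose pattern accepts all five labels, including "Design tools" and "Restart", because they contain "sign" and "start". The bounded variant keeps only the first three. Whether that extra strictness is worth it depends on how much unrelated button text lives on the pages you target.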

Prefer Semantic Locator Methods

When it comes to selecting elements, it's always better to prioritize semantic attributes over stylistic or structural ones. While an element's classes or its location in the DOM are typically the easiest to target, they're also the most prone to change. On the other hand, semantic attributes (such as role, label, or title) change infrequently. And when they do, they tend to change in ways that can be accounted for ahead of time.

Given the benefit of targeting semantic attributes, Playwright provides utility methods for directly accessing them. Use them whenever possible. Here's why:

  • Semantic methods encourage best practices. As you'll see in the passages that follow, the most durable selector patterns have helper methods.

  • Semantic methods are typed, improving your IDE experience and alerting you to errors. Plain selector strings can only be debugged at runtime, resulting in more bugs.

  • Semantic methods are chainable. This dramatically simplifies the queries themselves. It also results in better error messages when queries fail.

Let's run through each method that's currently available:

// Target Test IDs when you have control of the HTML
// e.g. `<section data-testid="delete-modal">`
const $modal = page.getByTestId('delete-modal');

// Target explicit or implied element roles
// "button" matches `<button>`, `<input type="button">`, or `<div role="button">`
const $button = page.getByRole('button', {name: 'Buy'});

// Target labels and text attributes that are unlikely to change
const $input = page.getByLabel('Email');
const $search = page.getByPlaceholder(/^search/i);
const $image = page.getByAltText('Profile Picture');
const $icon = page.getByTitle('Info', {exact: false});

// Or target elements by `innerText`, when you can be sure it's stable
const $dialog = page.getByText(/^confirm/i);

I cannot stress enough how valuable it is to target based on role and data-testid in particular.

When you have control of the DOM, data-testid is a fantastic convention to implement across dev teams. Because data-testid exists only for testing, there is rarely any reason for it to change, which makes it one of the least brittle selectors available.

That said, given that you won't always have control over the HTML of target pages, role is an excellent fallback. As shown in the example code above, role is an inherently forgiving selector. It can continue working even across substantial DOM changes.
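Here is a rough sketch of how those two locators might be wrapped in small helpers. To keep the snippet self-contained, `Scope` is a hypothetical stand-in that mirrors only the two Playwright methods used here, and a real `Page` or `Locator` is expected to fit that shape. The helper names and the "Delete" button label are assumptions for illustration. The test ID comes from the example above.

```typescript
// Hypothetical stand-in for the small slice of Playwright's Page/Locator API
// used below. It is not Playwright's own type definition.
interface RoleOptions {
  name?: string | RegExp;
  exact?: boolean;
}
interface Scope {
  getByTestId(testId: string): Scope;
  getByRole(role: 'button' | 'link' | 'dialog', options?: RoleOptions): Scope;
}

// With control of the HTML: scope to a test ID, then pick the control by role.
// The accessible name 'Delete' is an assumed label.
function confirmDeleteButton(page: Scope): Scope {
  return page.getByTestId('delete-modal').getByRole('button', {name: 'Delete'});
}

// Without control of the HTML: rely on role plus accessible name only.
function buyButton(page: Scope): Scope {
  return page.getByRole('button', {name: /buy/i});
}
```

In a real script you would pass Playwright's `page` straight in, for example `await buyButton(page).click()`. Keeping locators in named helpers like these means a markup change touches one function rather than every test.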

Chain Locators, Not Selectors

As described above, it's best to avoid long selector query strings. They're inherently difficult to debug, and they result in less descriptive error messages.

Of course, you're not always going to be able to avoid chaining. Sometimes, you need to target particularly evasive DOM nodes. More often, you simply want to break a page up into subtrees and drill down from an intermediate HTML element. (For example, a list of articles in a blog feed.)

Here are two strategies for designing locators that are both easy to read and easy to debug...

Drill Down Within Subtrees

Chaining locators is like adding layers to a sketch; each additional stroke refines the image. Begin with a broad locator and use methods like .filter(), .first(), and .last(), along with filter options such as hasText, to progressively narrow down to your target element.

// Find the first article about BrowserCat
const $articles = page.locator('article');
const $aboutBrowserCat = $articles.filter({
  hasText: /BrowserCat/i
});
const $firstArticle = $aboutBrowserCat.first();

This approach separates concerns, ties the locators to logical entities, and keeps the code readable and adaptable. If your layout changes, you might only need to adjust the parent locator instead of unraveling multiple complex strings.
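To make the "adjust only the parent" point concrete, here is a minimal sketch that splits the drill-down into two helpers. `PageLike` and `LocatorLike` are hypothetical stand-ins covering only the Playwright methods used here, so the snippet compiles on its own. The function names and the "read more" link text are also assumptions.

```typescript
// Hypothetical stand-ins for the parts of Playwright's API used below.
interface LocatorLike {
  filter(options: {hasText?: string | RegExp}): LocatorLike;
  first(): LocatorLike;
  getByRole(role: 'link', options?: {name?: string | RegExp}): LocatorLike;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}

// Parent locator: the only place that knows what an "article" looks like.
function firstArticleAbout(page: PageLike, topic: RegExp): LocatorLike {
  return page.locator('article').filter({hasText: topic}).first();
}

// Child locator: resolved relative to whatever article it is given.
// The link text is an assumed example.
function readMoreLink($article: LocatorLike): LocatorLike {
  return $article.getByRole('link', {name: /read more/i});
}
```

With a real Playwright `page`, usage would look something like `readMoreLink(firstArticleAbout(page, /BrowserCat/i))`. If the feed markup moves from `article` to some other container, only `firstArticleAbout` needs to change.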

Filter by Content with {has} and {hasNot}

Sometimes you need to select an element not only by its properties but also by its relation to others. With {has} and {hasNot} parameters, you can define these relationships clearly, creating a robust context for your selectors.

// Find the first article with an image
const $articles = page.locator('article');
const $withImages = $articles.filter({ 
  has: page.locator('img'),
});
const $firstArticle = $withImages.first();

These parameters act as assertions about the presence of certain elements within a parent. This increases the number of stable, unique attributes you can leverage in creating good query patterns.
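The example above covers `has`. As a companion, here is a sketch of the `hasNot` side and of combining the two by chaining `.filter()` calls. As before, `PageLike` and `LocatorLike` are hypothetical stand-ins for the slice of Playwright's API in use, and the function names and the `video` selector are assumptions for illustration.

```typescript
// Hypothetical stand-ins for the parts of Playwright's API used below.
interface LocatorLike {
  filter(options: {has?: LocatorLike; hasNot?: LocatorLike}): LocatorLike;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}

// Articles that contain no image at all.
function articlesWithoutImages(page: PageLike): LocatorLike {
  return page.locator('article').filter({hasNot: page.locator('img')});
}

// Articles that contain an image but no video: each filter adds one condition.
function articlesWithImageButNoVideo(page: PageLike): LocatorLike {
  return page
    .locator('article')
    .filter({has: page.locator('img')})
    .filter({hasNot: page.locator('video')});
}
```

Chaining one condition per `.filter()` call keeps each relationship readable on its own line, which also makes it easier to see which condition stopped matching when a query comes back empty.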

Next steps

We've touched on several key practices for creating simple and stable Playwright locators that can withstand the test of time and changes in web applications. I go much deeper on performance, error handling, and refactoring your selectors in my Playwright locators deep-dive. Check it out.

Until next time, happy automating!
