When a suite passes on localhost, passes in CI, and then fails only in preview deployments, the obvious instinct is to blame flakiness. Sometimes that is correct, but more often the failure is telling you that the preview environment is exposing a real difference in how the application is built, deployed, cached, seeded, or authenticated.

That makes preview-only failures especially valuable. They are usually not random. They are symptoms of environment drift between where the test was written and where the app is actually running. In a modern web stack, those differences can hide in asset manifests, CDN behavior, auth redirects, asynchronous data seeding, feature flags, service worker caches, or the timing of a deployment pipeline.

This guide is for frontend engineers, DevOps engineers, and QA engineers who need to debug why test suites fail only in preview environments without turning every incident into a guessing game. It focuses on practical causes, fast triage, and the kinds of checks that help you separate genuine regressions from environment-specific failures.

What makes preview environments different

Preview environments are often treated like disposable copies of production, but they are usually not identical. They are optimized for speed, isolation, and developer feedback, which means teams accept tradeoffs that rarely show up in a local browser session.

Common differences include:

  • Different base URLs, subdomains, or path prefixes
  • Separate identity providers or mocked authentication flows
  • Short-lived infrastructure with warm-up delays
  • CDN caches that behave differently from localhost
  • Staged database seeding or partial fixture loading
  • Feature flags tied to the branch, environment, or tenant
  • Build artifacts produced by a different pipeline than the one used locally

A test that assumes a stable environment may fail when any of those assumptions change. Preview deployments are often the first place where those assumptions are challenged.

If a test only fails in preview, treat the environment as part of the system under test, not as a neutral backdrop.

First question: is it a test problem or an environment problem?

Before chasing assertions and selectors, decide whether the failure is caused by the test, the app, or the environment. This saves time and prevents overfitting a workaround.

Ask three questions:

  1. Does the same build fail consistently in the same preview environment?
  2. Does the test fail only when run through the full suite, or also in isolation?
  3. Does the failure disappear when you bypass one environmental dependency, such as auth, caching, or a seeded dataset?

A useful debugging pattern is to reproduce the failure with the smallest possible surface area. Run one test, on one browser, against one preview URL, with logging enabled. If it still fails, the environment is likely the trigger. If it only fails in the suite, your problem may be test order, shared state, or resource contention.

Capture the basics before changing anything

At minimum, log these values with every preview test run:

  • commit SHA
  • preview URL
  • browser and version
  • test start time
  • environment name or tenant ID
  • feature flag state if available
  • API base URL
  • authenticated user role

Those details create a breadcrumb trail. Without them, preview failures become difficult to correlate with deployment events and config changes.
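
One way to make that breadcrumb trail automatic is to assemble the values into a single record at the start of every run. The sketch below is only an illustration: the `RunMetadata` shape and the environment variable names (`COMMIT_SHA`, `ENVIRONMENT_NAME`, `TEST_USER_ROLE`, and so on) are assumptions, not names defined by any particular CI system.

```typescript
// Hypothetical record of the values listed above.
interface RunMetadata {
  commitSha: string;
  previewUrl: string;
  browser: string;
  startedAt: string;
  environment: string;
  featureFlags: string;
  apiBaseUrl: string;
  userRole: string;
}

type EnvLike = Record<string, string | undefined>;

// Build the record from an env-like object. Missing values become "unknown"
// so gaps show up in the log instead of being silently dropped.
function buildRunMetadata(env: EnvLike, now: Date = new Date()): RunMetadata {
  const read = (key: string): string => env[key] ?? "unknown";
  return {
    commitSha: read("COMMIT_SHA"),
    previewUrl: read("PREVIEW_URL"),
    browser: read("BROWSER"),
    startedAt: now.toISOString(),
    environment: read("ENVIRONMENT_NAME"),
    featureFlags: read("FEATURE_FLAGS"),
    apiBaseUrl: read("API_BASE_URL"),
    userRole: read("TEST_USER_ROLE"),
  };
}
```

Calling `buildRunMetadata(process.env)` once per run and writing the result as JSON next to the trace artifacts is usually enough to line a failure up with a deploy event later.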

Asset caching and stale builds

One of the most common reasons browser tests fail only in preview environments is stale client-side assets. The application may have been deployed, but the browser, CDN, service worker, or reverse proxy is still serving an older JavaScript bundle, stylesheet, or image.

This can cause failures that look unrelated at first:

  • selectors no longer match because the UI changed but old JavaScript is loaded
  • API calls target an outdated endpoint
  • a component renders differently because a stale stylesheet hides or repositions elements
  • a service worker serves cached application shell content from a previous build

What to check

Look for mismatches between HTML and asset manifests. A preview page might reference a bundle hash that was updated in one path but not another. Also check whether the browser is using cached service worker content. Service workers are a frequent source of confusing preview-only failures because they can survive reloads and keep serving old assets.

Useful debugging steps:

  • hard refresh with cache disabled in the browser
  • open the page in an incognito session
  • unregister the service worker temporarily
  • verify the deployed asset manifest matches the HTML response
  • confirm cache-control headers are appropriate for HTML and immutable assets
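
The manifest comparison in the steps above can be sketched as a small helper that takes the HTML response body and the list of asset paths from the deployed manifest. This is a rough sanity check under stated assumptions: the function names are made up for illustration, the manifest is assumed to be a flat list of paths, and the regex is not a full HTML parser.

```typescript
// Collect the src attribute of every <script> tag in an HTML string.
// A regex is enough for a sanity check; it will miss unusual markup.
function extractScriptSources(html: string): string[] {
  const sources: string[] = [];
  const pattern = /<script\b[^>]*\bsrc=["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    sources.push(match[1]);
  }
  return sources;
}

// Return the scripts the HTML references that the manifest does not list.
// A non-empty result suggests the HTML and the assets come from different builds.
function findUnlistedAssets(html: string, manifestAssets: string[]): string[] {
  const known = new Set(manifestAssets);
  return extractScriptSources(html).filter((src) => !known.has(src));
}
```

If the result is non-empty for a freshly deployed preview, stale HTML or a half-published asset directory is a more likely explanation than a broken selector.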

A practical Playwright check

import { test, expect } from '@playwright/test';

test('loads the current app shell', async ({ page }) => {
  const response = await page.goto(process.env.PREVIEW_URL!);
  expect(response?.status()).toBe(200);

  // Confirm the page references a bundle whose name contains the expected prefix.
  const assetHint = await page.locator('script[src*="app-"]').first().getAttribute('src');
  expect(assetHint).toContain('app-');
});

This is not a full cache test, but it helps surface obvious bundle mismatches. If you suspect service worker caching, add logging for navigator.serviceWorker.controller and inspect the installed worker version.

Why this shows up in preview first

Preview environments often sit behind different caching layers than production or local development. A branch-specific subdomain may be cached aggressively by an edge proxy, or the deploy process may publish HTML before assets are fully available. Local testing rarely simulates those layers unless you deliberately introduce them.

Authentication and authorization mismatches

Auth bugs are another common source of preview-only failures. The app may be authenticated locally using a developer session, but preview deploys often rely on a separate identity provider, temporary cookies, signed URLs, or role-based access configured per environment.

Failures can appear as:

  • redirects to login pages instead of the expected screen
  • missing user-specific data
  • 401 or 403 responses from API calls
  • browser tests that time out while waiting for a page that is actually gated behind auth
  • sessions that work in one preview environment but not another

Typical causes

  1. Redirect URI mismatch between preview subdomains and the identity provider
  2. Cookie domain or SameSite settings that work on localhost but not on preview URLs
  3. CSRF or anti-forgery tokens tied to hostnames
  4. Role-based permissions that differ between test users and production-like accounts
  5. Feature access hidden behind enterprise SSO, but not enabled for ephemeral branches

Debugging approach

Inspect network traffic and confirm the auth handshake completes. If the login flow is mocked in local development but real in preview, the test may not be waiting for the full redirect cycle. If preview uses a different domain pattern, cookie scope is often the first thing to verify.
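
Cookie scope can be checked without a browser. The helper below is a simplified sketch of the domain-matching idea from RFC 6265: a cookie set with a `Domain` attribute is sent to that host and its subdomains. The function name is hypothetical, and the sketch ignores IP addresses, public-suffix rules, and host-only cookies (those set without a `Domain` attribute, which match only the exact host).

```typescript
// Simplified domain match: the request host must equal the cookie domain
// or be a subdomain of it. A leading dot on the cookie domain is ignored.
function cookieDomainMatches(cookieDomain: string, requestHost: string): boolean {
  const domain = cookieDomain.replace(/^\./, "").toLowerCase();
  const host = requestHost.toLowerCase();
  return host === domain || host.endsWith("." + domain);
}
```

A session cookie scoped to `app.example.com`, for instance, would not be sent to `pr-42.preview.example.com`, which is consistent with a test that keeps bouncing back to the login page.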

A simple check in Playwright can help validate state before clicking through the UI:

import { test, expect } from '@playwright/test';
test('user lands on dashboard', async ({ page }) => {
  await page.goto(process.env.PREVIEW_URL!);
  await expect(page).toHaveURL(/dashboard|login/);
});

If the test lands on login, that is not necessarily a test failure. It may be the correct signal that preview auth is misconfigured.

What to compare across environments

Compare these values between local, CI, and preview:

  • cookie names and domains
  • redirect URL configuration
  • identity provider client ID and secret scope
  • user role mappings
  • environment variables for auth middleware

A small mismatch in redirect URI handling can break a suite that otherwise looks stable.
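
A redirect URI mismatch can be caught before any browser test runs. The sketch below assumes the identity provider compares redirect URIs by exact string match, which many do; the callback path `/auth/callback` and both function names are placeholders for illustration.

```typescript
// Build the callback URL the app would send for a given preview origin.
// "/auth/callback" is a placeholder path, not a convention of any provider.
function buildRedirectUri(previewOrigin: string, callbackPath: string = "/auth/callback"): string {
  return previewOrigin.replace(/\/+$/, "") + callbackPath;
}

// Strict comparison on purpose: a trailing slash or a different subdomain
// is treated as a mismatch, as an exact-match provider would treat it.
function isRedirectUriRegistered(redirectUri: string, registered: string[]): boolean {
  return registered.indexOf(redirectUri) !== -1;
}
```

Running this against the list of URIs registered for the preview client shows quickly whether a new branch subdomain was ever added to the allowlist.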

Data seeding and test fixtures

Preview environments often use synthetic data, partial fixtures, or branch-specific database seeds. That is practical, but it creates another failure mode: tests that pass only when the expected record exists, the seeded order is in a specific state, or the dataset is small enough that one query remains fast.

Common issues include:

  • tests assume a record exists when seed jobs are still running
  • fixtures are created in a different timezone or locale
  • IDs are generated differently in preview than in local mocks
  • the test data is correct but not yet visible because background jobs are delayed
  • cleanup logic from previous preview runs leaked into a shared database

Signs the seed is the problem

  • the failure occurs immediately after deploy, then disappears later
  • the UI loads, but expected content is missing or inconsistent
  • API responses show empty arrays or null fields where fixtures should exist
  • the test fails for a specific branch but not another, because the seed script changed

Safer debugging patterns

Use data setup that is idempotent and explicit. Instead of assuming a record exists, create or fetch the exact test entity within the test or a setup endpoint. If that is not possible, add a preflight check to confirm seed completion before the browser test begins.
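
The idempotent part is the piece that tends to get skipped, so here is a minimal sketch of it. The in-memory `Map` stands in for whatever the real setup endpoint or database would be, and `TestEntity` and `ensureEntity` are hypothetical names.

```typescript
interface TestEntity {
  key: string;
  name: string;
}

// Stand-in for a setup endpoint or database table.
type EntityStore = Map<string, TestEntity>;

// Return the entity for this key, creating it only if it is absent.
// Calling it twice with the same key leaves exactly one record.
function ensureEntity(store: EntityStore, key: string, name: string): TestEntity {
  const existing = store.get(key);
  if (existing !== undefined) {
    return existing;
  }
  const created: TestEntity = { key, name };
  store.set(key, created);
  return created;
}
```

Keying the entity on a deterministic value, such as the test name, means reruns and retries reuse the same record instead of piling up duplicates.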

For API-backed checks, it is often better to verify state through a setup endpoint than to search the UI for seed artifacts.

import os
import requests

PREVIEW_URL = os.environ["PREVIEW_URL"]  # assumed to be set by the pipeline
resp = requests.get(f"{PREVIEW_URL}/api/test-data-status", timeout=10)
assert resp.json()["seed_complete"] is True

If your team does not expose a status endpoint, consider adding one for internal test environments. That is usually cheaper than debugging dozens of opaque failures.

Be careful with shared databases

Preview environments are supposed to be isolated, but in practice teams sometimes reuse databases to save time or cost. That can create state bleed between branches. If tests become more reliable after database resets, shared state is probably part of the issue.

Deployment timing and eventual consistency

Not every preview failure is about the browser. Sometimes the deploy is not truly ready when the suite starts. A frontend container may be live while the API, background worker, database migration, or search index is still catching up.

This creates classic timing failures:

  • page loads before dependent API routes are available
  • the app renders a component before migration-backed fields exist
  • a queue-driven process has not finished seeding required data
  • the preview URL resolves before the application is fully warmed up

A deployment is not ready just because the URL returns 200

A successful health check is only one signal. Your test suite may need stronger readiness criteria, especially if it depends on external services or post-deploy jobs.

Consider separate readiness checks for:

  • web app startup
  • schema migrations completed
  • test fixture generation completed
  • background workers online
  • search or indexing services ready
  • CDN cache propagation complete when relevant
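
Those separate signals can be combined into one gate. The sketch below assumes each check has already been reduced to a boolean; the report keys and function names are illustrative.

```typescript
// One boolean per readiness signal; the keys are illustrative.
type ReadinessReport = Record<string, boolean>;

// List every component that has not reported ready.
function notReady(report: ReadinessReport): string[] {
  return Object.keys(report).filter((name) => !report[name]);
}

// The suite should start only when nothing is outstanding.
function isReadyForTests(report: ReadinessReport): boolean {
  return notReady(report).length === 0;
}
```

Logging the output of `notReady` when the gate blocks a run gives the failure a name, such as "seed" or "workers", instead of a generic timeout.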

Useful GitHub Actions gating pattern

name: preview-tests
on:
  workflow_dispatch:

jobs:
  wait-for-preview:
    runs-on: ubuntu-latest
    steps:
      - name: Wait for app readiness
        run: |
          for i in $(seq 1 30); do
            code=$(curl -s -o /dev/null -w "%{http_code}" "$PREVIEW_URL/health")
            if [ "$code" = "200" ]; then exit 0; fi
            sleep 10
          done
          exit 1

This does not guarantee the app is fully ready, but it prevents tests from racing the first moment the URL becomes reachable.

Watch for migration lag

Preview failures often appear after a schema change. The frontend merges first, the database migration lands later, and the test suite runs in the middle. If the app expects a new field or enum value, old data can break rendering or assertions.

That is one reason preview test failures should be correlated with deploy order, not just with code changes.
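
Correlating with deploy order can be as simple as comparing timestamps. The sketch below assumes the pipeline can report when each step finished; the `DeployEvent` shape and the step names are hypothetical.

```typescript
interface DeployEvent {
  name: string;
  finishedAt: string; // ISO 8601 timestamp
}

// Return the required steps that finished after the suite started,
// or that never reported finishing at all.
function lateSteps(testStartedAt: string, required: string[], events: DeployEvent[]): string[] {
  const start = Date.parse(testStartedAt);
  return required.filter((name) => {
    const event = events.find((e) => e.name === name);
    return event === undefined || Date.parse(event.finishedAt) > start;
  });
}
```

If a migration step shows up in the result, the run raced the deploy and the failure says more about pipeline ordering than about the code change.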

Why flaky browser tests get worse in preview environments

Browser automation tends to be sensitive to timing, layout, and asynchronous state. Preview environments amplify those weaknesses because they are more variable than local dev.

Common examples:

  • animations take longer or shorter depending on CPU allocation
  • responsive breakpoints shift because preview has different viewport defaults
  • fonts load differently, changing text width and element positions
  • network latency increases the chance of race conditions
  • the page loads with hydration gaps that local dev does not expose

A selector like page.locator('text=Save') might work locally and fail in preview if the page renders two save buttons or the text is replaced by an icon while loading. That is not a preview problem alone; it is often a sign that the test depends too heavily on transient UI state.

Prefer state-based waits over arbitrary sleep

Avoid waitForTimeout unless you are trying to diagnose a very specific timing issue. Better to wait on a network response, a stable DOM change, or a known application state.

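// Start waiting before the click so a fast response is not missed.
const saveResponse = page.waitForResponse(resp => resp.url().includes('/api/save') && resp.ok());
await page.getByRole('button', { name: 'Save' }).click();
await saveResponse;
await expect(page.getByText('Saved')).toBeVisible();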

This is more resilient than sleeping for a fixed interval, which can be too short in preview and too long everywhere else.

Check locator strategy

If preview exposes hidden layout differences, selectors based on roles and stable attributes are usually more reliable than brittle CSS chains. Use data-testid for critical elements if the UI is expected to evolve.

Environment drift, the quiet culprit

Environment drift means the application behaves differently because one layer is not actually the same across environments. It can be subtle enough that nobody notices until tests start failing.

Sources of drift include:

  • different Node.js or browser versions
  • different env vars for API or auth
  • changed CDN headers
  • missing feature flags in preview
  • Docker images built from a different base tag
  • branch-specific config that diverged from main

Drift is hard to notice because each individual difference may seem harmless. Together, they can create a failure only visible in preview.

Build a comparison checklist

When a preview-only failure appears, compare local, CI, and preview across these dimensions:

Area                   | Local              | CI                | Preview
Node / runtime version | same?              | same?             | same?
Browser version        | same?              | same?             | same?
Base URL               | localhost          | CI target         | preview subdomain
Auth provider          | mocked or real     | mocked or real    | real or branch-specific
Database               | local container    | ephemeral         | preview instance
Seed timing            | manual             | automated         | async job
Feature flags          | developer defaults | pipeline defaults | branch defaults

The value of the table is not in filling it out once; it is in making drift visible early. If two columns differ in ways the test depends on, the failure may be expected rather than mysterious.
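
The same comparison can be automated once each environment exposes its configuration as key-value pairs. The following is a minimal sketch under that assumption; `findDrift` and the `"(unset)"` marker are illustrative choices, and secrets should be excluded or hashed before anything like this is logged.

```typescript
type EnvConfig = Record<string, string | undefined>;

// For each key, collect the value per environment and keep only the keys
// whose values are not identical everywhere. A missing key counts as a difference.
function findDrift(configs: Record<string, EnvConfig>): Record<string, Record<string, string>> {
  const envNames = Object.keys(configs);
  const allKeys = new Set<string>();
  for (const env of envNames) {
    for (const key of Object.keys(configs[env])) {
      allKeys.add(key);
    }
  }
  const drift: Record<string, Record<string, string>> = {};
  allKeys.forEach((key) => {
    const values: Record<string, string> = {};
    for (const env of envNames) {
      values[env] = configs[env][key] ?? "(unset)";
    }
    const distinct = new Set(envNames.map((env) => values[env]));
    if (distinct.size > 1) {
      drift[key] = values;
    }
  });
  return drift;
}
```

Keys that appear in the result are candidates for the row in the table that explains the failure; keys that do not appear can be ruled out.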

How to triage preview-only failures in a repeatable order

A reliable triage order helps teams avoid random experimentation. The goal is to reduce uncertainty quickly.

1. Reproduce with the same URL and same commit

Do not start by changing code. Confirm the exact build and preview environment that failed.

2. Check network and console logs

A browser failure is often downstream of an earlier network error, failed script, blocked request, or CSP issue.

3. Disable caches and service workers

If the failure vanishes, the issue is likely stale client-side state or asset mismatches.

4. Verify auth state

Confirm the test user is logged in, has the right role, and reaches the intended route.

5. Validate seeded data readiness

Check that the expected records exist before the test starts.

6. Add explicit readiness gates

If the preview is still warming up, wait for it intentionally instead of relying on luck.

7. Compare environment config

Inspect env vars, build metadata, browser versions, and deploy order.

If you cannot explain the difference between local and preview, do not assume the test is flaky. Assume the environment is different until proven otherwise.
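
The ordering itself can be encoded so that a triage script always reports the earliest broken layer. This is only a sketch: the `TriageStep` shape is hypothetical, and each `passes` function is assumed to wrap whatever real check the team already has for that step.

```typescript
interface TriageStep {
  name: string;
  passes: () => boolean;
}

// Run the steps in order and stop at the first one that fails, so the
// earliest broken layer is reported rather than a downstream symptom.
function firstFailingStep(steps: TriageStep[]): string | null {
  for (const step of steps) {
    if (!step.passes()) {
      return step.name;
    }
  }
  return null;
}
```

Stopping at the first failure is deliberate: a broken auth state makes every later check noisy, so there is little value in reporting seed or config problems on top of it.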

When to fix the test and when to fix the environment

Not every preview failure should be solved by making the test more lenient. That can hide real regressions.

Fix the test when:

  • the locator is too brittle
  • the assertion depends on ephemeral UI text
  • the test does not wait for the correct application state
  • the flow assumes a deterministic order that the app does not guarantee

Fix the environment when:

  • assets are stale or inconsistently cached
  • auth redirects are broken for preview domains
  • seeds are incomplete or race with test execution
  • readiness checks are too weak
  • preview config diverges from production in ways that matter

A good rule is this: if the app is behaving correctly but the test is too strict, improve the test. If the app is behaving incorrectly because preview setup is incomplete, improve the environment.

A minimal debugging checklist your team can reuse

Before escalating a preview-only failure, collect the following:

  • preview URL and branch name
  • commit SHA and deploy time
  • browser and test runner version
  • screenshots or trace artifacts
  • console errors and failed network requests
  • auth state and user role
  • seed status and relevant fixture IDs
  • cache or service worker status
  • any recent config or flag changes

You can turn this into a standard incident note, a bug template, or a CI artifact bundle. The important part is consistency.
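
As one possible shape for that standard note, the sketch below renders the checklist as plain text and marks anything that was not collected. The field labels follow the list above; the function name and the `MISSING` marker are assumptions.

```typescript
// Field labels follow the checklist above.
const CHECKLIST_FIELDS: string[] = [
  "preview URL",
  "branch",
  "commit SHA",
  "deploy time",
  "browser",
  "test runner version",
  "trace artifacts",
  "console errors",
  "failed requests",
  "auth state",
  "seed status",
  "cache status",
  "recent config changes",
];

// Render one line per field, flagging gaps so an incomplete report is obvious.
function renderIncidentNote(values: Record<string, string | undefined>): string {
  return CHECKLIST_FIELDS
    .map((field) => `${field}: ${values[field] ?? "MISSING"}`)
    .join("\n");
}
```

Because every note has the same fields in the same order, two failures from different branches can be compared line by line.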

Building better preview environments for testing

The long-term answer is not just better debugging; it is making preview environments less surprising.

A few practical improvements go a long way:

  • publish readiness endpoints for migrations and seed completion
  • pin browser and runtime versions in CI
  • keep preview auth configuration aligned with production-like domains
  • make test data creation explicit and idempotent
  • invalidate HTML and app-shell caches aggressively on deploy
  • expose build metadata in the page, such as commit SHA and deploy timestamp
  • standardize feature flag defaults across preview and CI

These changes reduce the number of unexplained failures and make the remaining failures more meaningful.
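
Exposed build metadata is only useful if something reads it. As a sketch, assume the page embeds the commit in a tag such as `<meta name="build-sha" content="...">`; that tag name is hypothetical, and the regex expects `name` to come before `content`.

```typescript
// Read the commit SHA from a <meta name="build-sha" content="..."> tag.
// The tag name is an assumption made for this example.
function extractBuildSha(html: string): string | null {
  const match = /<meta\s+name=["']build-sha["']\s+content=["']([^"']+)["']/i.exec(html);
  return match ? match[1] : null;
}

// True only when the page was built from the commit the test run expects.
function servesExpectedBuild(html: string, expectedSha: string): boolean {
  return extractBuildSha(html) === expectedSha;
}
```

Failing fast when the served SHA differs from the commit under test turns a confusing selector failure into a clear "stale build" message.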

Final thought

When test suites fail only in preview environments, that failure is usually trying to tell you something important about deployment reality. Preview environments sit at the intersection of frontend code, infrastructure, auth, data, and caching, which is exactly where many assumptions break down.

The fastest path to stability is not to mask the symptom. It is to identify which layer changed, make that layer observable, and remove ambiguity from the test setup. Once you do that, preview environments stop feeling random and start acting like what they should be: a practical bridge between development and production.