Partial Prerendering (PPR) in Production: Architecture Patterns (2026 Edition)

Published: March 25, 2026 • Updated: April 22, 2026 • Reading time: 17 min read

Next.js Partial Prerendering (PPR), experimental since Next.js 14 and adoptable route by route through incremental mode in Next.js 15, solves the fundamental rendering tradeoff that has limited React architectures since 2016: you either get fast TTFB with static generation (no server-side personalization) or correct server-rendered personalization with slow TTFB. PPR's approach - pre-generating a static HTML shell at build time and streaming dynamic content into Suspense boundaries at request time - isn't a tweak to either model. It's a different rendering model that combines two things that didn't work together before: React's concurrent renderer with Suspense streaming, and Next.js's ability to split a single route into a build-time static artifact and a request-time streaming renderer. In this post I walk through how the two-phase response mechanism works, how to architect Suspense boundaries correctly, how PPR interacts with the Full Route Cache, and what the measured performance outcomes look like in production.

Part I - The Rendering Tradeoff That PPR Resolves

To appreciate what PPR actually changes, it helps to look at how every prior rendering strategy fails for pages that mix static and dynamic content. Consider a product detail page (PDP) with: a product title, description, and images (static per SKU, cacheable for hours); a price that varies by user geography and logged-in status (dynamic, cannot be cached); an inventory badge (dynamic, must be real-time); and a 'Recently Viewed' section (dynamic, user-specific).

ISR (entire page static): Pre-renders the page at build time with a fixed price and inventory state. The CDN serves the page with sub-50ms TTFB. But the price shown may be wrong for the user's region, and inventory may be stale. To show correct dynamic values, the application must add client-side fetches (useEffect → fetch → setState), reintroducing the waterfall pattern that RSC was designed to eliminate. The page renders with skeleton placeholders while JavaScript executes, producing LCP degradation.
SSR (entire page dynamic): Reads cookies() for region detection, queries the pricing API, checks inventory - all at request time. Shows correct values. But TTFB is now 300–800ms (the time for all upstream data dependencies to resolve) rather than sub-50ms. This TTFB directly delays LCP: the browser cannot paint the Largest Contentful Paint element until the first byte arrives.
Streaming SSR (without PPR): React's concurrent renderer can stream HTML chunks as data resolves. But without PPR, the entire route is still dynamic - the first byte is still the result of the streaming renderer executing at request time, not a cached CDN response. The shell is computed on every request. Streaming SSR improves Time to First Byte *relative to blocking SSR* by sending the shell before database queries complete - but the shell is still generated per-request, not served from CDN cache.
PPR (static shell + dynamic holes): The static product content (title, description, images) is pre-rendered at build time and cached on the CDN edge. Request time: CDN returns the cached static shell immediately (TTFB < 50ms), then opens a streaming connection to the dynamic renderer which fills the price, inventory, and personalization Suspense boundaries as their data resolves. The LCP element (product hero image) is in the static shell - it renders at CDN speed, not at dynamic rendering speed.
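The TTFB differences between the four strategies above can be sketched as a toy latency model. This is only an illustration: the field names and every number are assumptions chosen to resemble the PDP example, not measurements.

```typescript
// Toy latency model. Every number and field name here is an illustrative assumption.
interface Latencies {
  cdnMs: number;         // edge cache hit for a prebuilt response
  originMs: number;      // network hop plus function start before rendering begins
  shellRenderMs: number; // time to render the static parts at the origin
  upstreamMs: number[];  // dynamic data dependencies, fetched in parallel
}

type Strategy = 'isr' | 'ssr' | 'streaming-ssr' | 'ppr';

function ttfbMs(strategy: Strategy, l: Latencies): number {
  const slowestUpstream = Math.max(0, ...l.upstreamMs);
  switch (strategy) {
    case 'isr':
    case 'ppr':
      // First byte is the cached shell, served from the edge.
      return l.cdnMs;
    case 'streaming-ssr':
      // The shell is flushed early, but it is still rendered per request.
      return l.originMs + l.shellRenderMs;
    case 'ssr':
      // Nothing is sent until the slowest upstream dependency resolves.
      return l.originMs + l.shellRenderMs + slowestUpstream;
  }
}

const pdp: Latencies = { cdnMs: 30, originMs: 80, shellRenderMs: 40, upstreamMs: [120, 200, 350] };
const strategies: Strategy[] = ['isr', 'ssr', 'streaming-ssr', 'ppr'];
for (const strategy of strategies) {
  console.log(strategy, ttfbMs(strategy, pdp), 'ms');
}
```

ISR and PPR share the same first-byte time in this model; the difference is that PPR still delivers server-rendered dynamic values on the same response, while ISR needs a client-side fetch afterwards.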

Part II - The Two-Phase Response Mechanism

PPR works by splitting a single route's rendering into two phases that execute in different compute contexts: a build-time phase that produces a static HTML shell, and a request-time phase that streams dynamic content into that shell.

At build time, Next.js renders the route with all dynamic contexts suspended - cookies(), headers(), and searchParams calls inside Suspense boundaries cause those boundaries to render their fallback content. The result is a complete HTML document where dynamic regions are represented by their Suspense fallbacks (skeleton loaders, loading indicators, or empty placeholders). This HTML is stored in the CDN's edge cache, identical to a statically generated page.

At request time, when a user requests the page, the CDN immediately returns the cached static shell HTML. The HTTP connection stays open (HTTP/1.1 chunked transfer encoding, or HTTP/2 / HTTP/3 multiplexed streams) while the request is simultaneously forwarded to the dynamic renderer - a serverless function deployed at the origin. The dynamic renderer executes the Suspense-wrapped components with real request context (cookies() reads actual cookies, headers() reads actual request headers). As each Suspense boundary's data resolves, the dynamic renderer streams the rendered markup for that boundary along with a small inline <script> that instructs the browser's React runtime to swap it in for the Suspense fallback. The swap touches only that boundary's DOM subtree - the rest of the page is not re-rendered - so it does not cause layout shift as long as the fallback occupies the same dimensions as the real content.
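The ordering guarantee of the two phases (shell first, then each hole as soon as its own data resolves) can be simulated without any framework. This is a sketch only: the Hole type, streamTwoPhase, and the <template data-hole> markup are invented for illustration and are not the wire format React or Next.js actually emits.

```typescript
// Framework-free simulation of a two-phase response.
// The <template data-hole="..."> markup is a made-up stand-in for React's real stream format.
interface Hole {
  id: string;
  resolveAfterMs: number; // stand-in for upstream latency
  html: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function streamTwoPhase(
  cachedShell: string,
  holes: Hole[],
  write: (chunk: string) => void,
): Promise<void> {
  // Phase 1: the prebuilt shell goes out before any dynamic work has finished.
  write(cachedShell);
  // Phase 2: all holes render concurrently; each is flushed as soon as it resolves.
  await Promise.all(
    holes.map(async (hole) => {
      await sleep(hole.resolveAfterMs);
      write(`<template data-hole="${hole.id}">${hole.html}</template>`);
    }),
  );
}

const chunks: string[] = [];
const done = streamTwoPhase(
  '<html><body><h1>Linen Shirt</h1><div id="price">loading</div></body></html>',
  [
    { id: 'price', resolveAfterMs: 10, html: '$49.00' },
    { id: 'recommendations', resolveAfterMs: 40, html: '<ul><li>Wool Coat</li></ul>' },
    { id: 'inventory', resolveAfterMs: 20, html: 'In stock' },
  ],
  (chunk) => chunks.push(chunk),
);
done.then(() => console.log(chunks.map((chunk) => chunk.slice(0, 48))));
```

The chunks arrive in resolution order (price, inventory, recommendations) rather than declaration order, which is the property that keeps a slow hole from delaying a fast one.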

Chunked transfer encoding mechanics: HTTP/1.1 chunked transfer encoding (RFC 7230, since superseded by RFC 9112) allows a server to send a response in multiple pieces without knowing the total content length in advance. HTTP/2 and HTTP/3 achieve the same incremental delivery with DATA frames instead of chunked encoding. The CDN sends the static shell as the first chunk immediately. The dynamic renderer appends subsequent chunks (the <script> tags for each resolved Suspense boundary) to the same HTTP response as they become available. The browser processes each chunk as it arrives - the first chunk (static shell) is fully parseable HTML that the browser can render and display while waiting for subsequent chunks.
The static shell artifact: The build-time generated shell contains a complete HTML document: <html>, <head> with all metadata and <link rel='preload'> hints, the full static RSC tree, and Suspense fallback HTML for each dynamic boundary. This means the browser receives preload hints for the LCP image in the first chunk - it can begin fetching the product hero image while the dynamic renderer is still processing the price and inventory queries.
Dynamic renderer isolation: The dynamic portion of a PPR route is deployed as a separate serverless function invocation. It has access to the full request context (cookies, headers, search params) but executes only the Suspense-wrapped subtrees. This is architecturally different from a full SSR render - the dynamic renderer does less work per request, which reduces cold start impact and execution cost.
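For readers who have not looked at chunked framing on the wire, here is a minimal encoder for the HTTP/1.1 format: each chunk is its byte length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. The function names are illustrative; in practice the HTTP server or CDN does this framing for you.

```typescript
// HTTP/1.1 chunked transfer coding: <size in hex> CRLF <data> CRLF ... 0 CRLF CRLF
const CRLF = '\r\n';

function encodeChunk(data: string): string {
  // The size prefix counts bytes, not characters, so encode as UTF-8 first.
  const byteLength = new TextEncoder().encode(data).length;
  return byteLength.toString(16) + CRLF + data + CRLF;
}

function encodeChunkedBody(parts: string[]): string {
  // A zero-length chunk terminates the body, so empty parts are skipped.
  const encoded = parts.filter((part) => part.length > 0).map(encodeChunk);
  return encoded.join('') + '0' + CRLF + CRLF;
}

const wire = encodeChunkedBody([
  '<html><body><h1>Shell</h1>',            // first chunk: the static shell
  '<script>/* price boundary */</script>', // later chunk: one resolved boundary
]);
console.log(JSON.stringify(wire));
```

Because no Content-Length is declared up front, the sender can keep appending chunks until the last dynamic hole has resolved.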

Part III - Suspense Boundary Architecture: The Rules of Static vs Dynamic

The Suspense boundary is the fundamental unit of PPR decomposition. Everything outside a Suspense boundary is static; everything inside is dynamic. The architectural discipline required is to place Suspense boundaries precisely - wrapping the minimum dynamic surface rather than the maximum.

What forces dynamic rendering (must be inside Suspense for PPR to work): cookies() reads, headers() reads, searchParams access (from page props), Math.random() calls, reads of the current time (new Date() with no arguments, Date.now()), fetch() with cache: 'no-store' or next: { revalidate: 0 }, and any data fetching that requires request-specific context (user ID from session, locale from cookie, A/B variant from cookie).
What is safely static (keep it OUTSIDE the dynamic Suspense boundaries): Product title, description, images and their URL structure, static navigation, footer, schema.org structured data, breadcrumbs (when they don't depend on session), page metadata, font preloads, and any data that is the same for every user and cached at a reasonable TTL.
The Suspense placement anti-pattern: Wrapping the entire <main> in a single Suspense boundary defeats PPR - the entire visible content is a dynamic hole, and the static shell contains only <html>/<head>/<nav>/<footer>. This is functionally equivalent to full SSR with a skeleton loader. The correct granularity: one Suspense boundary per *independently dynamic concern* (price, inventory, personalized recommendations, cart state). Independent Suspense boundaries stream in parallel and resolve independently - a slow recommendations query does not block the price from rendering.
`searchParams` is the most common PPR trap: In Next.js, searchParams is a dynamic value - it cannot be known at build time. In Next.js 15 it reaches the page as a Promise, and a page component that awaits it (or otherwise reads its values) at the top level causes the entire page to opt into dynamic rendering, bypassing PPR entirely. The fix: pass the unresolved searchParams prop down and consume it in a child component wrapped in Suspense. <Suspense fallback={<SortFallback />}><SortedProductList searchParams={searchParams} /></Suspense> - the static product grid shell renders from CDN; only the sorted/filtered variant triggers a dynamic request.
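A minimal sketch of the boundary placement described above, assuming Next.js 15 with ppr: 'incremental' enabled. The route path, SortFallback, SortedProductList, fetchProducts, and the product data are hypothetical names for illustration; only Suspense, the experimental_ppr export, and the Promise-typed searchParams prop come from React and Next.js.

```tsx
// app/products/page.tsx (hypothetical route)
import { Suspense } from 'react';

// Per-route opt-in when next.config.ts sets experimental.ppr to 'incremental'.
export const experimental_ppr = true;

type SearchParams = Promise<{ sort?: string }>;
type Product = { id: string; title: string; price: number };

export default function ProductsPage({ searchParams }: { searchParams: SearchParams }) {
  // searchParams is NOT awaited here, so the heading stays in the static shell.
  return (
    <main>
      <h1>All products</h1>
      <Suspense fallback={<SortFallback />}>
        <SortedProductList searchParams={searchParams} />
      </Suspense>
    </main>
  );
}

// Fixed-height skeleton so the swap does not move surrounding content.
function SortFallback() {
  return <div className="h-96 animate-pulse rounded bg-gray-200" />;
}

async function SortedProductList({ searchParams }: { searchParams: SearchParams }) {
  // The dynamic read happens inside the boundary, so only this subtree is a dynamic hole.
  const { sort = 'featured' } = await searchParams;
  const products = await fetchProducts(sort);
  return (
    <ul className="min-h-96">
      {products.map((product) => (
        <li key={product.id}>{product.title}</li>
      ))}
    </ul>
  );
}

// Hypothetical stand-in for a real data source.
async function fetchProducts(sort: string): Promise<Product[]> {
  const products: Product[] = [
    { id: '1', title: 'Linen Shirt', price: 49 },
    { id: '2', title: 'Wool Coat', price: 189 },
  ];
  return sort === 'price-desc' ? [...products].sort((a, b) => b.price - a.price) : products;
}
```

The same shape applies to cookies() and headers(): keep the call inside the component that sits under the Suspense boundary, not in the page component itself.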

Part IV - PPR Configuration in Next.js 15

Incremental adoption mode (recommended for production): In next.config.ts: experimental: { ppr: 'incremental' }. Per-route opt-in: export const experimental_ppr = true; at the top of any page.tsx or layout.tsx. This allows PPR to be enabled route-by-route rather than globally - enabling safe production rollout with monitoring at each step.
Global enable mode: experimental: { ppr: true } enables PPR for all routes in the application. This is the path to full site PPR adoption but requires that every route's Suspense boundary placement has been audited and validated.
`generateStaticParams` compatibility: PPR works correctly with generateStaticParams. The static shell for each pre-generated param combination is stored in the CDN cache. Dynamic holes are resolved per-request with actual request context. ISR (revalidate = N) refreshes the static shell independently of the dynamic holes - when the shell is regenerated, new static content is cached; dynamic content is always computed fresh.
Vercel deployment behavior: On Vercel, PPR routes produce two deployment artifacts: a CDN-cached static asset (the shell) and an Edge Function / Serverless Function (the dynamic renderer). The CDN asset is served globally from edge nodes; the dynamic renderer executes in the region closest to the user. This separation is visible in Vercel's deployment analytics - static shell requests never appear as function invocations.
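Put together, the incremental setup from this section looks roughly like the fragment below. It shows two files in one block for brevity, and the route path and the revalidate value are illustrative assumptions. The config object can also be annotated with the NextConfig type exported by next for type checking.

```typescript
// --- next.config.ts (Next.js 15) ---
const nextConfig = {
  experimental: {
    // Only routes that export experimental_ppr = true opt in.
    ppr: 'incremental' as const,
  },
};

export default nextConfig;

// --- app/product/[handle]/page.tsx (hypothetical route), top of the file ---
// Opt this route into PPR.
export const experimental_ppr = true;
// Regenerate the static shell at most once per hour; dynamic holes are unaffected.
export const revalidate = 3600;
```

Routes without the experimental_ppr export keep their existing rendering behavior, which is what makes a monitored, route-by-route rollout possible.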

Part V - Production Use Cases: Where PPR Delivers Maximum Value

PPR's benefit is proportional to the fraction of above-the-fold page content that is static. Pages where the LCP element and majority of visible content are static, but meaningful dynamic elements (personalization, session state, real-time data) exist below or alongside, are the ideal PPR candidates.

eCommerce PDP (highest impact): Static: product title, hero image, description, specifications, breadcrumbs, schema.org markup. Dynamic (Suspense-wrapped): price (geo-dependent), inventory badge, 'Add to Cart' button state, recommended products, review summary. The product hero image - typically the LCP element on mobile - is in the static shell and loads from CDN cache. LCP improvement from PPR on a typical PDP: from ~1.8–2.4s (SSR with upstream API calls) to ~0.6–1.2s (CDN-served shell with preloaded hero image), the same range reported in Part VIII. For a detailed analysis of how PPR fits into the broader headless commerce architecture, see the headless Shopify build guide.
Marketing pages with A/B testing: Static: all marketing copy, hero images, social proof. Dynamic (Suspense-wrapped): A/B variant assignment (reads from experiment cookie), personalized CTA text. Without PPR, any A/B testing via cookies forces the entire page into SSR. With PPR, the above-fold hero and copy render from CDN; only the A/B-sensitive CTA buttons are dynamic holes.
Authenticated dashboards: Static: navigation structure, sidebar menu, page skeleton, non-personalized widgets (time-of-day greeting without user name). Dynamic: user name, account balance, notifications, activity feed. The navigation - which typically contains the LCP element on desktop (logo or hero banner) - loads at CDN speed regardless of auth state.
Blog / content sites with reader features: Static: the article content itself (title, body, images, structured data - the dominant majority of the page). Dynamic: logged-in state indicator, bookmarked status, subscription prompt variant. A blog post where 95% of the page is the article content is essentially free to PPR-optimize - the static shell is almost the entire page.

Part VI - Suspense Fallback Design: Zero CLS

Cumulative Layout Shift (CLS) is the Core Web Vitals metric most at risk from PPR adoption. When a Suspense fallback is replaced by the real content, if the real content has different dimensions than the fallback, the page layout shifts - a CLS event. Poor CLS from PPR fallbacks is worse than the CLS from a full SSR page render, because it is visible to the user after they have already seen a rendered layout.

Skeleton UI sizing: Suspense fallback components must have explicit, fixed dimensions that match the real content dimensions. A price display fallback of <div className="h-7 w-20 animate-pulse bg-gray-200 rounded" /> works if the real price always renders in a ~28px × 80px space. If the real price can vary in height (one-line vs two-line with promotional badge), the fallback must account for the maximum height to prevent downward shift.
`min-height` on dynamic regions: For dynamic content with variable height (recommendations carousel, review summary), set a min-height on the Suspense wrapper matching the minimum expected content height. This prevents the page from compacting during the fallback state and expanding when real content loads.
`loading.tsx` as the full-page streaming fallback: Next.js App Router's loading.tsx file creates a Suspense boundary wrapping the entire page segment. This is appropriate for route transitions (navigating to a new page) but should not be the only Suspense boundary in a PPR route - it would make the entire page dynamic. Use loading.tsx for navigation loading states; use fine-grained inline Suspense for PPR boundary decomposition.
Preload hints in the static shell: Because the static shell is the first HTML the browser receives, any <link rel='preload'> tags in the shell's <head> are discovered immediately. For the LCP image, this means <link rel='preload' as='image' href='/product-hero.avif' /> in the <head> - emitted by Next.js's next/image component when the image is marked with the priority prop - starts the image fetch while the browser is still receiving the static HTML. The combination of CDN-fast TTFB and early image preload is what produces PPR's LCP improvements.
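To see why fallback dimensions matter so much, it helps to compute a layout shift score by hand. The browser scores each shift as impact fraction × distance fraction. The sketch below applies that formula to a single full-width block sitting below a dynamic hole. The viewport size and box positions are assumptions, and the model is simplified to one vertically shifted box that moves by less than its own height or is clipped by the viewport.

```typescript
interface Viewport { width: number; height: number }

interface ShiftedBox {
  top: number;    // px from the viewport top before the shift
  height: number;
  width: number;
  shiftY: number; // px the box moves down when the fallback is replaced
}

// Layout shift score = impact fraction * distance fraction (single-box simplification).
function layoutShiftScore(viewport: Viewport, box: ShiftedBox): number {
  const clampY = (y: number) => Math.min(Math.max(y, 0), viewport.height);
  // Union of the box's visible area before and after the shift.
  const unionTop = clampY(Math.min(box.top, box.top + box.shiftY));
  const unionBottom = clampY(Math.max(box.top + box.height, box.top + box.height + box.shiftY));
  const visibleWidth = Math.min(box.width, viewport.width);
  const impactFraction =
    ((unionBottom - unionTop) * visibleWidth) / (viewport.width * viewport.height);
  // Distance moved, relative to the viewport's larger dimension.
  const distanceFraction = Math.abs(box.shiftY) / Math.max(viewport.width, viewport.height);
  return impactFraction * distanceFraction;
}

const phone: Viewport = { width: 400, height: 800 };
const contentBelowPrice = { top: 300, height: 400, width: 400 };

console.log(layoutShiftScore(phone, { ...contentBelowPrice, shiftY: 0 }));   // skeleton matches content
console.log(layoutShiftScore(phone, { ...contentBelowPrice, shiftY: 28 }));  // price wraps to a second line
console.log(layoutShiftScore(phone, { ...contentBelowPrice, shiftY: 200 })); // hidden fallback becomes a 200px block
```

Under these assumptions the 28px mismatch scores about 0.019, while the 200px case scores about 0.156, which on its own exceeds the 0.1 'good' threshold for CLS.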

Part VII - PPR and the Full Route Cache

Understanding how PPR interacts with Next.js's Full Route Cache is essential for predicting its TTFB improvement in production. The Full Route Cache stores rendered route output to avoid per-request server rendering. Without PPR, any route that reads cookies() or headers() opts out of the Full Route Cache - it must be fully server-rendered on every request. This is the mechanism by which any personalization or authentication check forces SSR.

PPR restores Full Route Cache for personalized routes: When cookies() is called inside a Suspense boundary (the PPR dynamic hole), Next.js knows that only the Suspense-wrapped subtree is dynamic - the rest of the route is statically cacheable. The static shell is stored in the Full Route Cache (and CDN cache) regardless of what happens inside the Suspense boundaries. The cookies() call no longer pollutes the entire route's cacheability.
Cache invalidation is independent for shell vs dynamic content: The static shell is invalidated by ISR revalidation (time-based or on-demand revalidateTag). The dynamic content is never cached - it is computed fresh on every request. A product page with revalidate = 3600 regenerates its static shell hourly; the price and inventory Suspense holes always reflect current data. This separation of static freshness (hourly ISR) and dynamic freshness (per-request) is architecturally cleaner than the alternatives.
CDN edge vs origin execution: Static shell: served directly from CDN edge node globally (TTFB ~10–50ms, depending on geographic proximity). Dynamic holes: executed at the origin server (or Vercel Edge Function for global distribution). The user's perceived TTFB is the CDN's response time - the dynamic rendering latency is hidden behind the streaming connection that was already opened when the CDN returned the first byte.
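The independence of shell freshness and dynamic freshness can be modeled in a few lines. This is a simplified, framework-free sketch: createShellCache and handleRequest are hypothetical names, the clock is injected so the example is deterministic, and the shell is regenerated synchronously on expiry, whereas real ISR serves the stale shell while regenerating in the background.

```typescript
interface ShellCache {
  get(): string;
  renderCount(): number;
}

// Time-based shell cache: re-render only when the revalidate window has elapsed.
function createShellCache(
  renderShell: () => string,
  revalidateMs: number,
  now: () => number,
): ShellCache {
  let cached: { html: string; renderedAt: number } | undefined;
  let renders = 0;
  return {
    get() {
      if (cached === undefined || now() - cached.renderedAt >= revalidateMs) {
        cached = { html: renderShell(), renderedAt: now() };
        renders += 1;
      }
      return cached.html;
    },
    renderCount: () => renders,
  };
}

// The shell comes from cache; the dynamic hole is computed on every request.
function handleRequest(cache: ShellCache, resolvePrice: () => string) {
  return { shell: cache.get(), price: resolvePrice() };
}

let clockMs = 0;
let priceCalls = 0;
const shellCache = createShellCache(() => '<h1>Linen Shirt</h1>', 3_600_000, () => clockMs);
const resolvePrice = () => {
  priceCalls += 1;
  return `$${49 + priceCalls}.00`;
};

handleRequest(shellCache, resolvePrice); // t = 0: shell rendered
clockMs = 60_000;
handleRequest(shellCache, resolvePrice); // t = 1 min: shell served from cache
clockMs = 3_600_000;
handleRequest(shellCache, resolvePrice); // t = 1 h: shell regenerated
console.log({ shellRenders: shellCache.renderCount(), priceCalls });
```

Three requests produce two shell renders but three price computations, which is the split between hourly static freshness and per-request dynamic freshness described above.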

Part VIII - Measured Performance Outcomes

The performance improvement from PPR is primarily in TTFB and LCP. The following benchmark profile is representative of production deployments on Vercel with Shopify Storefront API data fetching (50–200ms API latency to origin).

TTFB: Full SSR with upstream API calls: 300–800ms (p75). PPR with CDN-cached shell: 20–80ms (p75). Improvement: roughly 4× in the least favorable pairing (300ms vs 80ms) and 10× or more when like ends of the ranges are compared (800ms vs 80ms, 300ms vs 20ms). This improvement is largely independent of upstream API latency, because the CDN shell is returned before any dynamic API call is made.
LCP: On a PDP where the hero image is in the static shell: full SSR produces LCP of 1.8–2.4s on mobile p75; PPR produces LCP of 0.6–1.2s. The reduction comes from two compounding improvements: lower TTFB (the HTML arrives sooner) and earlier <link rel='preload'> execution (image fetch begins with the first HTML byte). For a detailed analysis of LCP mechanics and measurement methodology, see The Universal Web Performance Architecture.
INP: PPR has no direct impact on INP. INP measures interaction responsiveness (event handler latency), not page load. Indirect benefit: smaller client bundles (RSC server-renders more content) reduce JS parse time, which can reduce INP on page load interactions - but this is an RSC benefit, not a PPR-specific benefit.
CLS: CLS outcome depends entirely on Suspense fallback design. Well-designed skeletons with explicit dimensions: CLS = 0 (no layout shift). Poorly designed fallbacks (e.g., display: none replaced by a 200px block): CLS > 0.1, which is outside Google's 'good' threshold (scores above 0.25 are rated poor). PPR introduces a CLS risk that does not exist in full SSR - the tradeoff requires deliberate fallback engineering.
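When comparing your own numbers against ranges like these, compute the same statistic (p75) and apply the published Core Web Vitals thresholds. The helper below is a small sketch: the sample values are made up, the percentile uses the nearest-rank method, and the thresholds are the ones Google documents for TTFB, LCP, INP, and CLS.

```typescript
type Rating = 'good' | 'needs-improvement' | 'poor';

// Good / poor boundaries as published on web.dev (milliseconds, except unitless CLS).
const THRESHOLDS = {
  ttfbMs: { good: 800, poor: 1800 },
  lcpMs: { good: 2500, poor: 4000 },
  inpMs: { good: 200, poor: 500 },
  cls: { good: 0.1, poor: 0.25 },
} as const;

type Metric = keyof typeof THRESHOLDS;

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile of empty sample set');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function rate(metric: Metric, value: number): Rating {
  const threshold = THRESHOLDS[metric];
  if (value <= threshold.good) return 'good';
  if (value <= threshold.poor) return 'needs-improvement';
  return 'poor';
}

// Made-up TTFB samples (ms) for a PPR route.
const ttfbSamples = [20, 35, 40, 60, 80, 45, 30, 55];
const ttfbP75 = percentile(ttfbSamples, 75);
console.log(ttfbP75, rate('ttfbMs', ttfbP75));
```

Field tools such as CrUX report p75 directly; a helper like this is mainly useful for rating your own RUM samples on the same basis.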

Part IX - Decision Framework: PPR vs Alternatives

Use PPR when: (1) The page has a clear above-the-fold static region that contains the LCP element. (2) Dynamic content is limited to specific, identifiable regions (price, session state, recommendations) that can be isolated in Suspense boundaries. (3) The current architecture is full SSR with > 300ms TTFB due to upstream dependencies. (4) You are on Vercel or a platform that supports PPR's static shell + streaming dynamic architecture. (5) Core Web Vitals show LCP > 2.5s on mobile p75 and TTFB is the dominant contributing factor.
Use ISR + client-side fetch (no PPR) when: (1) Dynamic content is below the fold and does not affect LCP. (2) Dynamic content is infrequent or optional (e.g., notification badge) and client-side fetch waterfall is acceptable. (3) The team is not yet familiar with PPR's Suspense boundary discipline and the risk of CLS from fallback design errors is high.
Use full SSR (no PPR) when: (1) Every above-the-fold element is dynamic and user-specific - there is no static shell to cache. (2) The page requires real-time bid/auction pricing that must be synchronized across the full HTML response. (3) The application is not deployed on a platform that supports PPR's CDN+streaming architecture.
PPR is the wrong choice when: searchParams drives the entire page layout (e.g., a search results page where position, sort, and filter all affect above-the-fold content). In this case, the static shell would be a skeleton with no meaningful content, and the PPR benefit disappears. For search results pages, ISR with client-side filtering or full SSR with aggressive edge caching is typically superior.
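The framework above can be condensed into a small decision function. Treat it as one possible encoding, not a rule: the RouteProfile fields are hypothetical names, the order in which the checks run is an assumption, and the 300ms cutoff is the figure quoted in the criteria above.

```typescript
interface RouteProfile {
  platformSupportsPpr: boolean;       // CDN-cached shell plus streaming dynamic renderer
  lcpElementIsStatic: boolean;        // LCP element can live in the static shell
  searchParamsDriveLayout: boolean;   // e.g. search results pages
  dynamicContentAboveFold: boolean;   // dynamic values are visible without scrolling
  dynamicRegionsIsolatable: boolean;  // each can sit in its own Suspense boundary
  ssrTtfbP75Ms: number;               // current TTFB under full SSR
}

type Recommendation = 'ppr' | 'isr-plus-client-fetch' | 'full-ssr';

function recommend(route: RouteProfile): Recommendation {
  // Below-the-fold dynamic content does not affect LCP, so a client fetch is acceptable.
  if (!route.dynamicContentAboveFold) return 'isr-plus-client-fetch';
  // Without platform support or a meaningful static shell, PPR has nothing to cache.
  if (!route.platformSupportsPpr) return 'full-ssr';
  if (route.searchParamsDriveLayout || !route.lcpElementIsStatic) return 'full-ssr';
  // PPR pays off when holes are isolatable and SSR TTFB is the bottleneck.
  if (route.dynamicRegionsIsolatable && route.ssrTtfbP75Ms > 300) return 'ppr';
  return 'full-ssr';
}

const productPage: RouteProfile = {
  platformSupportsPpr: true,
  lcpElementIsStatic: true,
  searchParamsDriveLayout: false,
  dynamicContentAboveFold: true,
  dynamicRegionsIsolatable: true,
  ssrTtfbP75Ms: 550,
};
console.log(recommend(productPage));
```

A function like this is most useful as a checklist during a route audit; borderline routes still need measurement before and after.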

Part X - Known Limitations and Production Caveats

`searchParams` propagation: As noted in Part III, any component in the static shell tree that accesses searchParams forces the entire page into dynamic rendering. This is a common migration pain point - search for searchParams usage in the component tree before enabling PPR on a route. Move all searchParams access to Suspense-wrapped components.
Middleware and response header mutations: Middleware that modifies response headers (e.g., setting Set-Cookie or Vary headers based on request content) can interact unexpectedly with PPR's cached static shell. The static shell is cached without knowledge of Middleware mutations. Test Middleware + PPR interactions explicitly before production deployment.
Cold start latency for the dynamic renderer: On serverless platforms, the dynamic renderer function has a cold start penalty (50–500ms on first invocation after inactivity). This affects the latency of dynamic hole resolution, not the static shell TTFB. For routes with frequent traffic, cold starts are rare; for low-traffic routes, cold start can make PPR's dynamic holes slower than expected.
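PPR is still labeled `experimental` in Next.js 15: Both the global ppr: true setting and the per-route experimental_ppr = true flag live under the experimental namespace, and the official Next.js 15 documentation describes PPR as experimental and subject to change. Treat the production guidance in this article as field experience with incremental mode, not as an official stability guarantee. Check the Next.js changelog and upgrade notes for your version before copying this configuration: in Next.js 16 the PPR flags were reworked under the Cache Components configuration, so the options shown here apply to Next.js 15.
Error boundaries in dynamic holes: If a dynamic hole's data fetch fails (API timeout, 500 response), the error propagates to the nearest React error boundary. An error.tsx file covers its whole route segment, so to degrade only the failing hole, wrap it in its own component-level error boundary alongside the Suspense boundary. With that in place the static shell remains intact and rendered: users see a gracefully degraded page (static content visible, the failed hole shows an error state) rather than a full error page. This is architecturally superior to blocking SSR, where a single failed upstream call can produce a 500 for the entire page.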
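The per-hole failure isolation described above can be sketched without React: each hole gets its own timeout and its own error handling, so one failing or slow dependency degrades only its own region. The names resolveHole, withTimeout, and HoleResult are hypothetical; in a real route the same role is played by an error boundary plus Suspense around each hole.

```typescript
type HoleResult =
  | { status: 'ok'; html: string }
  | { status: 'error'; reason: string };

// Reject if the work has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Never throws: a failed hole becomes an error state instead of failing the page.
async function resolveHole(render: () => Promise<string>, timeoutMs: number): Promise<HoleResult> {
  try {
    return { status: 'ok', html: await withTimeout(render(), timeoutMs) };
  } catch (error) {
    return { status: 'error', reason: error instanceof Error ? error.message : String(error) };
  }
}

const holes = Promise.all([
  resolveHole(() => Promise.resolve('<span>$49.00</span>'), 50),
  resolveHole(() => Promise.reject(new Error('inventory API returned 500')), 50),
  resolveHole(() => new Promise<string>(() => { /* never resolves */ }), 20),
]);
holes.then((results) => console.log(results));
```

The first hole renders normally while the other two resolve to error states, and none of them affects the shell that was already sent.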

Conclusion

PPR is the biggest rendering architecture change in Next.js since RSC. It fixes a structural problem that ISR, SSR, and streaming SSR all left open: you couldn't serve CDN-cached HTML with sub-50ms TTFB on any route that had server-side dynamic content. The Suspense-boundary approach - build-time shell, request-time streaming fill - fits cleanly into the rest of the App Router model.

What it demands in practice is Suspense boundary discipline: every dynamic concern needs its own Suspense boundary, and every fallback needs the right dimensions to prevent CLS. Teams that invest in this boundary engineering get LCP improvements measurable in CrUX field data within days of deployment. For a complete picture of how PPR fits into App Router's rendering model, see the App Router migration guide.


FAQ

Is PPR available in Next.js 14? PPR was introduced as experimental in Next.js 14 under experimental: { ppr: true }. The incremental ('incremental') mode that enables per-route adoption was added in Next.js 15. For production use, Next.js 15 incremental mode is the recommended path.
Does PPR work with `getServerSideProps`? No. PPR is an App Router feature only. getServerSideProps is a Pages Router API. If you are on Pages Router and need PPR's capabilities, migrate to App Router first - see the migration guide.
Can I use PPR with internationalization (i18n)? Yes. The static shell is generated per-locale if you have locale-specific routes. The app/[locale]/product/[handle]/page.tsx generates a static shell per locale per product handle. Dynamic holes (price, session state) are resolved per-request regardless of locale.
How does PPR affect Time to Interactive? TTI is improved indirectly: smaller client bundles (RSC-rendered static content has no client JS) reduce parse time. PPR's main contribution is to TTFB and LCP, not directly to TTI. If your TTI is high, the root cause is likely excessive client-side JavaScript - see the 'use client' boundary analysis in the App Router migration guide.
What happens if the dynamic renderer is slow? Users see the Suspense fallbacks (skeleton UI) for the duration of the dynamic rendering latency. The static shell content is fully visible and interactive (navigation works, static text is readable) while the dynamic holes are loading. This is a fundamentally better user experience than full SSR blocking the entire page on the slowest upstream dependency.
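For the i18n question above, a per-locale, per-handle shell simply means generateStaticParams returns one entry per combination. A minimal sketch, assuming the app/[locale]/product/[handle]/page.tsx route from the answer; the locale list and fetchProductHandles are hypothetical stand-ins for your own configuration and data source.

```typescript
// app/[locale]/product/[handle]/page.tsx (excerpt)

// Hypothetical locale list; in a real app this would come from your i18n config.
const LOCALES = ['en', 'de', 'fr'];

// Hypothetical stand-in for a Storefront API or CMS query.
async function fetchProductHandles(): Promise<string[]> {
  return ['linen-shirt', 'wool-coat'];
}

type RouteParams = { locale: string; handle: string };

// One static shell is prerendered for every locale/handle pair returned here.
export async function generateStaticParams(): Promise<RouteParams[]> {
  const handles = await fetchProductHandles();
  const params: RouteParams[] = [];
  for (const locale of LOCALES) {
    for (const handle of handles) {
      params.push({ locale, handle });
    }
  }
  return params;
}
```

With three locales and two products this yields six shells; the build cost grows with the product of the two lists, which is worth keeping in mind for large catalogs.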

References

Next.js Team. 'Partial Prerendering': https://nextjs.org/docs/app/building-your-application/rendering/partial-prerendering
Vercel. 'Partial Prerendering with Next.js': https://vercel.com/blog/partial-prerendering-with-nextjs-creating-a-new-default-rendering-model
Next.js Team. 'Next.js 15 Release Notes': https://nextjs.org/blog/next-15
React Team. 'Suspense Reference': https://react.dev/reference/react/Suspense
RFC 7230. 'Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing' (Section 4.1, Chunked Transfer Coding): https://datatracker.ietf.org/doc/html/rfc7230
Google. 'Largest Contentful Paint': https://web.dev/articles/lcp
Google. 'Cumulative Layout Shift': https://web.dev/articles/cls
Google. 'Time to First Byte': https://web.dev/articles/ttfb
Next.js Team. 'Full Route Cache': https://nextjs.org/docs/app/building-your-application/caching#full-route-cache
HTTP Archive. 'Web Almanac 2025 - Performance': https://almanac.httparchive.org