Platform Engineering & Infrastructure
Modernizing Penn State’s Web Platform: Multi-Tenant Architecture, Automation, and Infrastructure Engineering
When I stepped into the platform engineering role, Penn State’s web infrastructure faced several systemic issues: sites were fragmented across Gatsby Cloud, deployments were brittle, releases were largely manual, redirects triggered unnecessary rebuilds, QA environments weren’t standardized, and automated testing didn’t exist. Moving content from stage to production often introduced instability because environments weren’t consistent or reproducible.
The architecture needed an overhaul: a platform that was multi-tenant, testable, automated, predictable, and scalable.
My work focused on rebuilding the platform from the ground up using Next.js, Vercel, GitHub Actions, Playwright, and a set of internal tools that streamlined the entire delivery pipeline.
Migrating From Gatsby Cloud to Next.js + Vercel
The first major shift was the migration away from Gatsby Cloud. The build process was slow, updates were fragile, and the ecosystem couldn’t support the multi-tenant flexibility Penn State needed.
I rebuilt the platform in Next.js, deployed on Vercel, which immediately provided:
Faster builds and Incremental Static Regeneration (ISR)
Edge middleware for routing and redirects
A more stable preview environment
Built-in scalability for multi-site hosting
Lower operational friction for new feature development
This move laid the foundation for all future improvements.
Centralizing the Web Ecosystem Into a Monorepo
Previously, each site lived in its own isolated repository. There was no shared architecture, duplicated code grew out of control, and updates were costly to propagate.
I consolidated all sites and shared packages into a single multi-package monorepo, which provided:
Shared components, utilities, and schemas
Centralized versioning and governance
Faster onboarding for new projects
Consistency across every Penn State domain
This monorepo became the backbone of our multi-tenant strategy.
Creating a Boilerplate Next.js App for New Sites
Before this work, starting a new site required dozens of manual steps, copied code, and inconsistent configurations.
I created a fully configured boilerplate Next.js application that includes:
Design system integration
Auth, routing, and layout primitives
CMS integration
Analytics setup
Environment variable scaffolding
Core middleware and Edge Config rules
New sites can now be created in minutes, not weeks.
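As a rough illustration of the environment variable scaffolding item above, a boilerplate can validate its configuration at startup so that a misconfigured site fails fast instead of at request time. This is a minimal sketch, not the platform's actual code: the variable names (SITE_ID, CMS_SPACE_ID, CMS_API_TOKEN, ANALYTICS_ID) and the loadSiteEnv helper are hypothetical.

```typescript
// Hypothetical variable names; the real boilerplate's keys are not listed in this write-up.
const REQUIRED_KEYS = ["SITE_ID", "CMS_SPACE_ID", "CMS_API_TOKEN"] as const;
const OPTIONAL_KEYS = ["ANALYTICS_ID"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type OptionalKey = (typeof OPTIONAL_KEYS)[number];
type SiteEnv = Record<RequiredKey, string> & Partial<Record<OptionalKey, string>>;
type RawEnv = Record<string, string | undefined>;

// Validate a raw environment map and return a typed config,
// or throw a single error that lists every missing required key.
function loadSiteEnv(raw: RawEnv): SiteEnv {
  const missing = REQUIRED_KEYS.filter((key) => !raw[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const env = {} as SiteEnv;
  for (const key of REQUIRED_KEYS) {
    env[key] = raw[key] as string; // safe: presence was checked above
  }
  for (const key of OPTIONAL_KEYS) {
    const value = raw[key];
    if (value) env[key] = value; // optional keys are copied only when set
  }
  return env;
}
```

In a Node.js process this would typically be called once at startup as loadSiteEnv(process.env), so every site created from the boilerplate reports all of its missing settings in one message.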
Automating Deployments With GitHub Actions
To eliminate unreliable manual deployments, I implemented a full CI/CD pipeline using GitHub Actions, enabling:
Automated test runs
Controlled release channels
Consistent build processes across stage, QA, and production
Auto-generated release notes via webhooks
PR-level preview builds for stakeholder review
This replaced the unpredictable release process with a stable, repeatable workflow.
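One way to picture controlled release channels is a small function that a pipeline step calls to decide where a build should go. The sketch below assumes GitFlow-style branch names and a four-channel layout; the mapping, the Channel type, and resolveChannel are illustrative assumptions rather than the pipeline's real configuration.

```typescript
type Channel = "production" | "qa" | "stage" | "preview";

// Hypothetical branch-to-channel mapping, assuming GitFlow-style branch names.
// Accepts either a short branch name or a full ref such as "refs/heads/main".
function resolveChannel(ref: string): Channel {
  const branch = ref.replace(/^refs\/heads\//, "");

  if (branch === "main") return "production";
  if (branch.startsWith("release/") || branch.startsWith("hotfix/")) return "qa";
  if (branch === "develop") return "stage";

  // Feature branches and anything unrecognized only get a PR-level preview.
  return "preview";
}
```

In GitHub Actions the ref is available to scripts through the GITHUB_REF environment variable, so a step could call resolveChannel on it and pass the result to the deploy command. Defaulting unknown branches to the preview channel keeps an unexpected branch name from ever reaching production.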
Stabilizing Stage-to-Prod Releases
Before the rebuild, moving from stage to production frequently introduced breaking changes. There was no environment parity, no automation, and no guardrails.
I solved this by:
Standardizing environment variables
Automating QA and UAT environment creation
Implementing GitFlow standards
Enforcing consistent versioning and dependency controls
Introducing integration tests across builds
The result was a stable, reliable promotion pipeline for high-traffic sites.
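Environment parity can be checked mechanically. As a sketch of the idea, and assuming each environment's variable names can be exported as a list, the hypothetical findMissingKeys helper below reports which keys each environment lacks relative to the union of all environments.

```typescript
// Map of environment name -> the variable names defined there.
type EnvKeys = Record<string, readonly string[]>;

// Return, per environment, the keys that exist somewhere else but not here.
// Environments with nothing missing are left out of the report.
function findMissingKeys(envs: EnvKeys): Record<string, string[]> {
  const allKeys = new Set<string>();
  for (const keys of Object.values(envs)) {
    for (const key of keys) allKeys.add(key);
  }

  const report: Record<string, string[]> = {};
  for (const [name, keys] of Object.entries(envs)) {
    const present = new Set(keys);
    const missing = Array.from(allKeys)
      .filter((key) => !present.has(key))
      .sort();
    if (missing.length > 0) report[name] = missing;
  }
  return report;
}
```

A CI job could fail the build whenever the report is non-empty, which turns "stage and production drifted apart" from a release-day surprise into a pull request check.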
Implementing Playwright for Automated Testing
Manual testing was slow, inconsistent, and dependent on tribal engineering knowledge. I replaced it with a suite of automated Playwright tests, covering:
Navigation flows
Critical UI components
Form behavior
Redirect integrity
CMS-driven dynamic content
This reduced release risk and provided a safety net for every deployment.
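The redirect integrity checks can be sketched as a comparison between the expected rules and what the site actually returns. The example below is an assumption-laden outline, not the real test suite: RedirectRule, Probe, and findRedirectFailures are hypothetical names, and the HTTP call is injected so the logic can run without a browser or network.

```typescript
interface RedirectRule {
  source: string;
  destination: string;
  status: 301 | 302 | 307 | 308;
}

interface ProbeResult {
  status: number;
  location: string | null; // value of the Location header, if any
}

// Performs one request WITHOUT following redirects and reports what came back.
type Probe = (path: string) => Promise<ProbeResult>;

// Check every rule and collect a readable message for each mismatch.
async function findRedirectFailures(
  rules: readonly RedirectRule[],
  probe: Probe,
): Promise<string[]> {
  const failures: string[] = [];
  for (const rule of rules) {
    const result = await probe(rule.source);
    if (result.status !== rule.status) {
      failures.push(`${rule.source}: expected status ${rule.status}, got ${result.status}`);
    } else if (result.location !== rule.destination) {
      failures.push(`${rule.source}: expected Location ${rule.destination}, got ${result.location}`);
    }
  }
  return failures;
}
```

Inside a Playwright spec, the probe could wrap the request fixture, for example request.get(path, { maxRedirects: 0 }), reading response.status() and response.headers()["location"], with the test asserting that the returned list is empty.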
Redirect Strategy With Vercel Edge Config
Originally, every redirect update an editor made in the CMS triggered a full rebuild of the site, a major performance bottleneck.
I rearchitected the redirect system using Vercel Edge Config, enabling:
Dynamic, instant redirect updates
Zero rebuilds
Middleware-driven routing logic
Support for multi-space CMS data
This significantly reduced build times and eliminated redirect-related downtime.
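The core of a middleware-driven redirect is a lookup against data that lives outside the build. The sketch below shows only that lookup step, under the assumption that the redirect table is a plain object keyed by normalized path; RedirectEntry, normalizePath, and resolveRedirect are hypothetical names, and the table shape is not taken from the real system.

```typescript
interface RedirectEntry {
  destination: string;
  permanent: boolean;
}

// Assumed shape: normalized source path -> redirect entry.
type RedirectTable = Record<string, RedirectEntry>;

// Lower-case the path and drop trailing slashes so "/About/" and "/about" share a key.
function normalizePath(pathname: string): string {
  return pathname.toLowerCase().replace(/\/+$/, "") || "/";
}

// Return the redirect to issue for this path, or null to continue to the page.
function resolveRedirect(
  pathname: string,
  table: RedirectTable,
): { destination: string; status: 307 | 308 } | null {
  const key = normalizePath(pathname);
  // Own-property check avoids accidental hits on inherited object properties.
  if (!Object.prototype.hasOwnProperty.call(table, key)) return null;

  const entry = table[key];
  return { destination: entry.destination, status: entry.permanent ? 308 : 307 };
}
```

In a Next.js middleware, the table could be read at request time with get from the @vercel/edge-config package (under an assumed key such as "redirects") and a non-null result passed to NextResponse.redirect. Because the data is fetched per request rather than baked into the build, editing a redirect needs no rebuild.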
Static Fallbacks for Outage Hardening
To prevent total outages during upstream failures or platform issues, I implemented a static fallback strategy. If the dynamic site becomes unavailable, users automatically receive a cached static version, which preserves availability even during critical incidents.
This provides resilience for high-traffic university sites that must remain accessible.
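The fallback decision can be outlined in a few lines. This is a simplified sketch under stated assumptions, not the production implementation: the origin fetcher and snapshot store are injected as hypothetical functions, and a 5xx status or a thrown error is treated as "the dynamic site is unavailable".

```typescript
interface PageResponse {
  status: number;
  body: string;
  source: "origin" | "snapshot" | "none";
}

// Hypothetical dependencies: how the live page and the cached copy are obtained.
type OriginFetcher = (path: string) => Promise<{ status: number; body: string }>;
type SnapshotStore = (path: string) => Promise<string | null>;

// Try the dynamic site first; on a thrown error or a 5xx, serve the static snapshot.
async function serveWithFallback(
  path: string,
  fetchOrigin: OriginFetcher,
  readSnapshot: SnapshotStore,
): Promise<PageResponse> {
  try {
    const live = await fetchOrigin(path);
    if (live.status < 500) return { ...live, source: "origin" };
  } catch {
    // Network failure or timeout: fall through to the snapshot.
  }

  const snapshot = await readSnapshot(path);
  if (snapshot !== null) return { status: 200, body: snapshot, source: "snapshot" };

  // No live page and no cached copy: report the outage honestly.
  return { status: 503, body: "Service temporarily unavailable", source: "none" };
}
```

Treating 4xx responses as valid origin answers is a deliberate choice in this sketch: a genuine 404 should stay a 404 rather than be masked by an older cached page.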
The Result: A Reliable, Automated, Multi-Site Platform
Through architectural redesign, DevOps automation, and scalable infrastructure patterns, I transformed Penn State’s web platform from fragile and inconsistent to modern, stable, and fully automated.
The outcomes:
Migrated to a modern tech stack (Next.js + Vercel)
Adopted a monorepo for shared development & multi-site efficiency
Automated CI/CD with GitHub Actions
Introduced Playwright testing for stability and risk reduction
Eliminated rebuild-triggering redirects with Edge Config
Standardized environment promotion and release cadence
Created a boilerplate application for rapid new-site development
Hardened the platform with static fallbacks for outages
This work established a long-term foundation that supports dozens of university sites and the teams that build and maintain them.