Multi-tenant CMS architecture

@da_poling | March 1, 2025

The promise of a headless CMS is structured content. The cost is everything editors took for granted in Drupal.

Drupal is opinionated in ways that feel constraining until they are gone. The authoring experience is deeply customized, preview is immediate, and the relationship between what an editor does and what appears on the page is direct and legible. Moving to a headless architecture breaks that contract. Content becomes structured data, preview becomes a separate system that has to be built and maintained, and the editor who used to drag a component into a page now has to understand a content model to do the same thing. The flexibility is real. So is the disorientation.

That transition was the starting condition at Penn State. Teams that had worked in Drupal for years were now operating in Contentful spaces set up independently, without shared models, consistent naming conventions, or unified authoring patterns. Structured content existed. The experience of working with it did not.

Making multi-tenant structure real

The first problem was architectural. Every site maintained its own content types, so every site drifted from every other site in ways that accumulated silently. A module in one space did not exist in another. A field named one thing here was named something else there. Environment sync was manual and error-prone.

I rebuilt this around a modular content model: composable types shared across spaces, adopted selectively by each tenant, and updated in one place instead of many. The sync dashboard turned environment parity into something visible and actionable instead of something you had to reconstruct manually each time.
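A minimal sketch of the kind of parity check a sync dashboard could run, assuming content type definitions can be exported from each space as plain objects. The `FieldDef`, `ContentTypeDef`, and `DriftReport` shapes and the `diffContentType` function are hypothetical names for illustration. They are not Contentful's API types, and this is not the actual dashboard code.

```typescript
// Hypothetical, simplified shapes (assumptions for illustration only).
interface FieldDef {
  id: string;
  type: string;
  required: boolean;
}

interface ContentTypeDef {
  id: string;
  fields: FieldDef[];
}

interface DriftReport {
  contentTypeId: string;
  missingFields: string[]; // in the shared model, absent from the tenant
  extraFields: string[];   // in the tenant, absent from the shared model
  changedFields: string[]; // present in both, but type or required flag differs
}

// Compare a tenant's copy of a content type against the shared definition.
function diffContentType(shared: ContentTypeDef, tenant: ContentTypeDef): DriftReport {
  const sharedById = new Map(shared.fields.map((f) => [f.id, f] as const));
  const tenantById = new Map(tenant.fields.map((f) => [f.id, f] as const));

  const missingFields = shared.fields
    .filter((f) => !tenantById.has(f.id))
    .map((f) => f.id);

  const extraFields = tenant.fields
    .filter((f) => !sharedById.has(f.id))
    .map((f) => f.id);

  const changedFields = shared.fields
    .filter((f) => {
      const other = tenantById.get(f.id);
      return other !== undefined && (other.type !== f.type || other.required !== f.required);
    })
    .map((f) => f.id);

  return { contentTypeId: shared.id, missingFields, extraFields, changedFields };
}

// A tenant is in parity for this type when nothing is missing, extra, or changed.
function inParity(report: DriftReport): boolean {
  return (
    report.missingFields.length === 0 &&
    report.extraFields.length === 0 &&
    report.changedFields.length === 0
  );
}
```

Returning a report instead of a boolean is the point: drift becomes a list someone can act on, field by field, instead of a suspicion.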

The export tooling came from a simpler observation: migrations were consuming engineering time for tasks that were fundamentally content operations. Non-technical teams needed to move structured content between spaces to seed QA environments, onboard new sites, and migrate program content. Making that self-service was not technically novel. It was operationally significant.

Experimentation changed behavior

The A/B testing integration was the most culturally impactful system in the stack. Running experiments through Next.js feature flags against CMS-driven variants let editorial teams test decisions instead of only debating them. That shift, from opinion-driven content decisions to data-driven ones, is harder than it sounds because it requires the organization to treat user behavior as a more reliable signal than internal preference.
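One piece of this that is easy to sketch is variant assignment. The example below assumes each experiment lists the IDs of its CMS-driven variants and that a stable visitor ID is available. The function names and the choice of an FNV-1a hash are assumptions for illustration, not a description of the actual integration or of any Next.js or feature-flag vendor API.

```typescript
// 32-bit FNV-1a hash over UTF-16 code units. Not cryptographic; used here
// only to spread visitors across buckets in a repeatable way.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: pick one variant ID for a visitor, with an equal
// split across variants. The same visitor and experiment always map to the
// same variant, so a returning visitor does not flip between versions.
function assignVariant(visitorId: string, experimentId: string, variantIds: string[]): string {
  if (variantIds.length === 0) {
    throw new Error(`experiment ${experimentId} has no variants`);
  }
  const bucket = fnv1a(`${experimentId}:${visitorId}`) % variantIds.length;
  return variantIds[bucket];
}
```

Including the experiment ID in the hashed key keeps assignments independent across experiments, so a visitor in the first variant of one test is not systematically in the first variant of every other test.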

When that shift holds, it compounds. Decisions get closer to users. The product gets closer to what users actually need. ROI follows from that, not the other way around.

LLM audit: useful and humbling

The LLM content audit was the most interesting and the most humbling system I built in this area. The goal was straightforward: analyze content at scale, identify low-value pages, flag taxonomy mismatches, and surface consolidation candidates.

Taxonomy proved harder than expected. A classification is only as reliable as the match between a page's type definition and how that page actually behaves. A page that appears to belong to one category by metadata may function like another when you look at traffic patterns and user pathways. The boundary between good and bad content is not fixed either. A page ready for retirement this month may be exactly right next month when the context around it changes.

Working with LLMs on this problem made one thing clear: context shape matters more than context volume. With too little context, the model pattern-matches on superficial features. With too much, it starts scripting against the shape of your input instead of reasoning beyond it. The right context gives the model room to surface things you could not have articulated directly. Finding that balance is the engineering problem, and it changes by domain.
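To make "shape over volume" concrete, here is a hedged sketch of a bounded context builder: it passes the declared type alongside a few behavioral signals and a capped excerpt, instead of the full page. Every name, field, and default limit below is a hypothetical choice for illustration; the right fields and caps would have to be tuned per domain.

```typescript
// Hypothetical record for a page under audit (assumed shape).
interface PageRecord {
  url: string;
  declaredType: string;   // category according to CMS metadata
  title: string;
  body: string;
  monthlyViews: number;
  topNextPages: string[]; // where visitors most often go next
}

// What actually gets sent to the model: small, fixed, and explicit.
interface AuditContext {
  url: string;
  declaredType: string;
  title: string;
  bodyExcerpt: string;
  bodyTruncated: boolean;
  monthlyViews: number;
  topNextPages: string[];
}

function buildAuditContext(
  page: PageRecord,
  maxBodyChars: number = 1200, // illustrative default, not a tuned value
  maxPathways: number = 3,     // illustrative default, not a tuned value
): AuditContext {
  const bodyTruncated = page.body.length > maxBodyChars;
  return {
    url: page.url,
    declaredType: page.declaredType,
    title: page.title,
    bodyExcerpt: bodyTruncated ? page.body.slice(0, maxBodyChars) : page.body,
    bodyTruncated,
    monthlyViews: page.monthlyViews,
    topNextPages: page.topNextPages.slice(0, maxPathways),
  };
}
```

Putting the declared type next to the behavioral signals lets the model see a mismatch between metadata and usage, and the caps keep any single field from dominating the prompt.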

What the system has to enforce

The deeper finding is about how people approach content. Most editors are not thinking in systems while writing. They are thinking about the page in front of them. If architecture does not enforce consistency, if content models do not make certain failures impossible, strategy drifts the same way code drifted before the monorepo.
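As one illustration of enforcement living in the system instead of in an editor's memory, the sketch below checks an entry against declarative rules before it can be published. The `Rule` type, the rule kinds, and `validateEntry` are hypothetical; a real CMS would typically express similar constraints through its own field validations.

```typescript
// Hypothetical rule vocabulary (assumption for illustration).
type Rule =
  | { kind: "required"; field: string }
  | { kind: "maxLength"; field: string; max: number }
  | { kind: "oneOf"; field: string; allowed: string[] };

type Entry = Record<string, unknown>;

// Return a human-readable violation for every rule the entry breaks.
// An empty array means the entry satisfies all rules.
function validateEntry(entry: Entry, rules: Rule[]): string[] {
  const violations: string[] = [];
  for (const rule of rules) {
    const value = entry[rule.field];
    switch (rule.kind) {
      case "required":
        if (value === undefined || value === null || value === "") {
          violations.push(`${rule.field} is required`);
        }
        break;
      case "maxLength":
        if (typeof value === "string" && value.length > rule.max) {
          violations.push(`${rule.field} exceeds ${rule.max} characters`);
        }
        break;
      case "oneOf":
        if (typeof value === "string" && !rule.allowed.includes(value)) {
          violations.push(`${rule.field} must be one of: ${rule.allowed.join(", ")}`);
        }
        break;
    }
  }
  return violations;
}
```

Because the rules are data, a shared content type can carry them to every tenant that adopts it, and the editor only sees them when something is actually wrong.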

Structure without creativity produces dead content. Creativity without structure produces chaos that does not scale.

The engineer's job is to make it structurally difficult for content to lose its shape. The author's job is to fill that structure with something worth reading. Those responsibilities are interdependent, and neither one works if the other is missing.