Why I'm Building a CMS Around Static HTML
WordPress runs something like 43% of all websites. Depending on how you count, that number has held roughly steady for years, drifting upward if anything. It is, from a market penetration standpoint, an extraordinary success. It's also — and I say this as someone who has built on it, maintained it, and periodically debugged production WordPress installations at 2am — a system designed around assumptions that are about fifteen years old.
I'm not here to stack another "WordPress is bad" post on the pile. The problems are well-documented and most of the complaints about them are valid. What I want to talk about is the architectural decision I made for my own CMS project — a thing I'm calling LogoPress for now — and specifically why I landed on static HTML as the core data format when there are approximately forty other options that are more obviously "correct."
The Basic Problem
When you're building a content management system, you have to decide where the truth lives. This is the most important architectural question and it gets answered, implicitly, very early — often before anyone calls it an architectural question.
In WordPress, the truth lives in a MySQL database. The HTML you see in a browser is generated at request time (or served from a cache that's a copy of the generated HTML) by combining templates with database records. This made sense in 2003. It makes less sense now, for several reasons:
- Dynamic generation means every page load depends on the database being available and responsive, even for content that hasn't changed in months
- Caching plugins exist specifically to paper over this — you're generating static HTML anyway, just in a roundabout way
- The database becomes a giant mutex; scaling WordPress horizontally is possible but substantially more complex than it should be
- The database format is opaque — your content is trapped in a MySQL schema, and exporting it to anything else requires a plugin or a script
The people who figured out early that "the cache is the real thing" built static site generators — Jekyll, Hugo, Eleventy. Good tools. But they went too far in the other direction: the content is in Markdown files with front matter, the "site" is generated by a build step, and the output is deployed to a CDN. Which is fine until you want any dynamism at all, or until the build step takes four minutes because your site has 8,000 pages.
What LogoPress Does Differently
The core idea is that the content files are the site. An HTML file for each post, each page, each piece of content — not generated from a database, not generated from Markdown, just HTML. Real HTML, readable and valid without a build step or a server or a template engine.
Around each HTML file is a thin metadata layer: a JSON sidecar with the same base filename, containing title, date, tags, status, and whatever other structured data makes sense for that content type. The HTML is for humans (and for browsers). The JSON is for machines — for the CMS tooling, for the search index, for the RSS feed generator, for anything that needs to query or sort or filter.
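To make the pairing concrete, here is a minimal sketch in Python of how tooling might read the sidecars. The flat directory layout, the exact field names (`title`, `date`, `tags`, `status`), and the helper names `load_entries` and `published_by_tag` are assumptions for illustration, not the actual LogoPress schema or code.

```python
import json
from pathlib import Path


def load_entries(content_dir):
    """Pair each HTML file with its JSON sidecar (same base filename)."""
    entries = []
    for html_path in sorted(Path(content_dir).glob("*.html")):
        sidecar = html_path.with_suffix(".json")
        if not sidecar.exists():
            continue  # assumption: a file without a sidecar is not managed content
        meta = json.loads(sidecar.read_text(encoding="utf-8"))
        entries.append({"path": html_path, "meta": meta})
    return entries


def published_by_tag(entries, tag):
    """Filter and sort using only the metadata; the HTML is never parsed."""
    hits = [
        e for e in entries
        if e["meta"].get("status") == "published"
        and tag in e["meta"].get("tags", [])
    ]
    # Assumes ISO 8601 date strings, which sort chronologically as plain text.
    return sorted(hits, key=lambda e: e["meta"]["date"], reverse=True)
```

Something like `published_by_tag(load_entries("content"), "cms")` would then give a feed generator or an archive page its input without opening a single HTML file.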
The system that manages all of this — the editor, the publishing workflow, the media handling — sits alongside the files and talks to them. It's not the source of truth. The files are the source of truth.
Here's why this matters:
Portability. If I decide tomorrow to stop using my CMS and switch to something else, I have a folder of HTML files and a folder of JSON files. I can do anything with those. I'm not trapped in a database schema that requires an export script that may or may not work depending on the plugin version.
Durability. HTML from 2008 renders correctly in a browser today. HTML from 1994 renders correctly in a browser today. The format is extraordinarily stable. If I write a blog post in 2026 and come back to it in 2040, the file will still be there, still readable, without running a migration.
Cacheability. A static file on a CDN is about as fast as a web response gets. No database query, no template rendering, no server-side processing. The file is the response. For content that doesn't change between requests (which is almost all content), this is the optimal architecture.
Simplicity. The system has fewer moving parts. There's no database to back up separately, no ORM to maintain, no connection pool to tune. The content is files. Files can be backed up with rsync.
The Hard Parts
I don't want to oversell this. The static-HTML-as-source-of-truth approach has real costs.
Search is harder. You can't query the filesystem efficiently. I'm building a search index — currently a SQLite database that's rebuilt from the content files when they change. This works, but it's a piece of infrastructure that a database-backed CMS gets for free.
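A rough sketch of what "rebuilt from the content files" can look like, using Python's standard `sqlite3` module. The table layout, the crude tag stripping, and the function names `rebuild_index` and `search` are assumptions for illustration. It also uses a plain `LIKE` match rather than SQLite's full-text extension, to keep the example dependency-free.

```python
import json
import re
import sqlite3
from pathlib import Path


def rebuild_index(content_dir, db_path=":memory:"):
    """Rebuild a disposable search index from the content files."""
    conn = sqlite3.connect(db_path)
    conn.execute("DROP TABLE IF EXISTS posts")
    conn.execute("CREATE TABLE posts (path TEXT PRIMARY KEY, title TEXT, body TEXT)")
    for html_path in sorted(Path(content_dir).glob("*.html")):
        sidecar = html_path.with_suffix(".json")
        meta = json.loads(sidecar.read_text(encoding="utf-8")) if sidecar.exists() else {}
        html = html_path.read_text(encoding="utf-8")
        # Crude tag stripping for illustration; a real indexer would parse the HTML.
        text = re.sub(r"<[^>]+>", " ", html)
        conn.execute(
            "INSERT INTO posts VALUES (?, ?, ?)",
            (html_path.name, meta.get("title", ""), text),
        )
    conn.commit()
    return conn


def search(conn, term):
    """Substring match on title or body (case-insensitive for ASCII)."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT path FROM posts WHERE title LIKE ? OR body LIKE ? ORDER BY path",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

The point of the design is that the index is disposable: delete the database file and the next rebuild recreates it from the HTML and JSON, so it never competes with the files as the source of truth.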
Real-time content is harder. Comments, user-specific views, anything that varies per-request — these don't fit cleanly into a static model. I'm planning to handle this with a thin dynamic layer for truly dynamic content, but it adds complexity.
Tooling assumptions. A lot of existing CMS tooling assumes a database. Plugins, migration scripts, analytics integrations — many of them expect to talk to an API or a database, not to a file system. Some things I want to do require building the integration from scratch.
The mental model takes getting used to. "The HTML file is the source of truth" is slightly unfamiliar even to developers who've worked with static site generators. Explaining the system to someone takes more words than "it's like WordPress but different."
What This Is For
LogoPress, at this stage, is for my own use first. I'm deploying it on polygeek.com — this blog runs on it. If the design is sound, it extends to RunPee.com, which has specific requirements around multilingual content and machine-readable data about movies that a general-purpose CMS handles poorly.
After that, I don't know. I'm building it because I think the architectural approach is right, and I wanted to work on something where I controlled the whole stack. Whether it becomes something other people use is a separate question.
The part I keep coming back to is the durability argument. I have content on the web that I care about. I want it to exist and be readable in twenty years. The systems most likely to make that happen are the ones with the fewest dependencies — the ones where the content format is its own documentation, where the tools are optional and the files are not.
HTML files in a directory are about as close to that as you can get while still building something usable in 2026.
The irony of writing this post on the system I'm describing is not lost on me. The content for this post is an HTML file. The metadata is in a JSON sidecar. The system is serving it to you right now. Either the architecture works or you're reading an error page, and if you're reading this, it works.
So far.