I wrote here how this blog is built. The better question – thanks Aman! – is why? Why not post these words on Medium, or put a static site on GitHub Pages or in an S3 bucket, or use WordPress on a virtual private server, or even go super old-school and run a web server at home (I do have this sweet, sweet Sonic fiber after all)? It’s not as if we’re short on options in 2020.
Is there really something wrong with all of them? Well, I guess it depends what you mean by “wrong”…
The core technologies on which the internet is built embody some key design choices that
- enable centralized control of software systems and data, whether by corporations or the state,
- allow for the pervasive spread of disinformation that is very costly to identify and correct relative to the cost to produce and distribute it, and
- make it quite difficult to implement solutions to those problems by building applications on top of them, rather than replacing them.1
I think that’s the minimum decentralized-web proselytization required to set up the rest of this post, which is really a justification for me to use some nerd tech. And lest it need be said: IPFS is, at best, a small part of the solution to these problems. It does not, on its own, cure the internet’s ills – that’s what bitcoin is for.
The main change that IPFS makes relative to the internet we’re used to is in how web pages are named: addresses are tied to content, not a particular server. To understand the importance of this shift, let’s go back to the old days of the internet. Let’s say it’s 1995, and I want to put a brand new hypertext document on the World Wide Web. First, I get all of my HTML written (including <blink> tags for emphasis!). Now I need to put that document somewhere where it can be found and retrieved. Unlike today, that probably means I’m going to put it in a specific directory on a specific hard drive on a specific computer with a specific IP address connected through a specific network connection. And unlike today, domain names are hard to come by, so I’m either using a folder on an existing web server, or, if I’m particularly well-resourced and adventurous, I’m running my own Apache web server on my very own internet-connected computer.
I go through all of these minutiae because all of these specifics were crucial to the quality, reliability, and speed of the connection, and whether the web page could be discovered by anyone I didn’t directly tell about its existence. There were a lot of details to get right, and lots of things could and did go wrong.
In fact, all of these things still matter, greatly. They matter so much that, in the intervening years, billions of dollars have been invested, earned, lost, and re-earned in solving these problems better than any individual could do for themselves.2 And because software automation allows for engineering costs to be spread over a very large number of customers, these services are all available for a tiny fraction of the cost of the web 1.0 era. From a certain perspective, these are solved problems.
Let’s look at three reasonably modern approaches to publishing hyperlinked words and pictures on the World-Wide-Web.3
1. WordPress
WordPress is truly a dinosaur walking among us – an extremely successful dinosaur. It is the most successful example of what we might call the “traditional” dynamic web page model. Your content is stored in a MySQL database, and on each request it is fetched, formatted with templates, and turned into HTML by the WordPress server-side PHP code. That HTML is then served over HTTP by a traditional web server – either Apache or nginx.
As always, there are many ways to complexify this picture in the name of performance: running multiple servers behind a load balancer, caching pages on the server as HTML so they don’t need to be re-generated from the database every time, or using a content delivery network so that pages can be served in many cases without needing to bother the “real” servers at all.
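To make that request path concrete, here is a toy sketch, in Python rather than PHP. Everything in it is a stand-in of my own choosing: `sqlite3` plays the part of MySQL, `string.Template` plays the part of a theme template, and the `posts` table, `render_post`, and `handle_request` are hypothetical names. It is not how WordPress is actually implemented, just the shape of the fetch, render, and cache loop described above.

```python
import sqlite3
from html import escape
from string import Template

# Stand-in for the MySQL database (assumption: one hypothetical "posts" table).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (slug TEXT PRIMARY KEY, title TEXT, body TEXT)")
db.execute(
    "INSERT INTO posts VALUES (?, ?, ?)",
    ("why-ipfs", "Why IPFS?", "Because it is super cool."),
)

# Stand-in for a server-side template.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)

page_cache = {}          # slug -> rendered HTML (the "cache pages as HTML" trick)
stats = {"renders": 0}   # counts how often we actually hit the database


def render_post(slug):
    """Fetch a post from the database and turn it into HTML, or return None."""
    row = db.execute(
        "SELECT title, body FROM posts WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return None
    stats["renders"] += 1
    title, body = row
    return PAGE.substitute(title=escape(title), body=escape(body))


def handle_request(slug):
    """Serve from the cache when possible; otherwise render and remember."""
    if slug not in page_cache:
        html = render_post(slug)
        if html is None:
            return "<h1>404 Not Found</h1>"
        page_cache[slug] = html
    return page_cache[slug]


first = handle_request("why-ipfs")   # rendered from the database
second = handle_request("why-ipfs")  # served from the cache, no database query
```

Note that every piece of this lives on a server somebody has to run: the database, the rendering code, and the cache.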
2. GitHub Pages
While there are many ways of hosting a static site, GitHub Pages is a particularly modern and popular solution. In short, GitHub allows its users to populate a git repo with a web site (or, in many cases, source files that can be translated into a website in one go through a “static site generator”), and from that point forward GitHub takes responsibility for serving those files. While they do this with a lot of complex infrastructure in order to achieve performance, scalability, and reliability, the end result is just a very capable web server that finds files on disk (or in a cache) and sends them back to the client.
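For contrast with the dynamic model, here is a minimal sketch of the “translate everything in one go” step. The conventions are assumptions I made up for illustration (plain `.txt` sources whose first line is the title, and a hypothetical `build_site` function); real static site generators do far more, but the essential property is the same: all rendering happens once, ahead of time, and the output is just files.

```python
import tempfile
from html import escape
from pathlib import Path


def build_site(source_dir, output_dir):
    """Render every .txt file in source_dir to an .html file in output_dir.

    Assumed (hypothetical) source format: first line is the title,
    every following non-blank line becomes a paragraph.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(source_dir.glob("*.txt")):
        lines = src.read_text(encoding="utf-8").splitlines()
        title = escape(lines[0]) if lines else escape(src.stem)
        paragraphs = "".join(
            f"<p>{escape(line)}</p>" for line in lines[1:] if line.strip()
        )
        out = output_dir / f"{src.stem}.html"
        out.write_text(
            f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1>{paragraphs}</body></html>",
            encoding="utf-8",
        )
        written.append(out)
    return written


# Build a one-page site in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp) / "src", Path(tmp) / "public"
    src.mkdir()
    (src / "why-ipfs.txt").write_text(
        "Why IPFS?\nBecause it is super cool.\n", encoding="utf-8"
    )
    pages = build_site(src, out)
    print(pages[0].name)
```

Once the output directory exists, serving it requires no database and no rendering code, which is why it can be handed off to someone else's web server so easily.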
3. “Put it on Medium”
Even though WordPress and GitHub Pages (along with a static site generator) take care of a lot of the work of creating and serving a website, they still require that you, like, know what a website is. Even with a lot of the details being taken care of, that’s still a pretty high bar for someone whose main goal is to get their words and pictures into as many people’s browser tabs as possible. Medium is the latest iteration of a fully-hosted blogging platform, a category which promises that the user doesn’t need to know what HTML is, much less HTTP, a web server, load balancers, and DNS.
When it comes to sending words and pictures over the internet, this makes a lot of sense. Medium’s engineers are much better at all of this than you will ever be. It’s silly for everyone who wants to publish on the internet to have to know or care about the technologies that make it all go. But even though these engineering decisions are very much behind the curtain, they must still be made. Medium’s engineers are working from the same set of options as anyone else; they’re just getting paid to make those decisions well enough that the service’s users never have to worry about them.
|Method|Browser Code|Server Rendering|Web Server|Web Host|Naming|Routing|
|---|---|---|---|---|---|---|
|WordPress|None4|PHP|nginx / Apache|A known server|DNS|TCP|
|GitHub Pages|None4|None|¯\_(ツ)_/¯ 5|¯\_(ツ)_/¯ 5|DNS|TCP|
|IPFS|None4|None|IPFS daemon|Any IPFS node|IPNS|libp2p|
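The Naming column is where the real difference hides, so here is a toy sketch of what “addresses are tied to content, not a particular server” means. This is a simplification under stated assumptions: I use a bare SHA-256 hex digest as the address, whereas real IPFS identifiers (CIDs) involve additional encoding and chunking of files, and the `ContentStore` class is a hypothetical name, not part of any IPFS library.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the bytes themselves."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is a hash of the content, so it says nothing about
        # which machine happens to hold the bytes.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Anyone can check that what they received matches what they asked for.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


# Two independent "nodes" storing the same page end up with the same address.
node_a = ContentStore()
node_b = ContentStore()
page = b"<html><body>Hello, decentralized web</body></html>"
addr_a = node_a.put(page)
addr_b = node_b.put(page)
```

With host-based addressing, the same page on two servers has two different URLs. Here the address is identical on both nodes, so a reader can fetch it from whichever node has it and verify the result without trusting that node.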
As I see it, the interesting arguments for IPFS boil down to these three.6
- Ownership, control, censorship
At the end of the day, Medium the company controls what happens on medium.com, subject to the laws of the jurisdictions where it operates and the private decisions of its leaders and owners. To a first approximation, Medium’s interests are mostly aligned with authors who put their writing there. But that alignment is far from perfect, and an individual writer has almost no say in the decisions the company makes. In many ways, Medium has been better than some of its web predecessors about communicating its goals and values. But the fact remains that its users are wholly subject to its business decisions, without any right to appeal. Of course, Medium is beholden to its customers as a whole – but this is of no help to a user of the system whose needs diverge from the chosen path. In addition to, and intertwined with, this corporate control is the state’s ability to dictate how Medium should act, including prohibiting certain content and requiring that certain information be turned over on demand.
As I attempted to communicate in the discussion above, even the most sophisticated modern web services still operate within the fundamental limitations of HTTP and DNS. And while it is possible to create systems using these technologies that are robust to many perturbations, outages, and catastrophic events, doing so is difficult, expensive, and harmful to the quality of the user experience. It does not make sense for any profit-seeking entity to prioritize this kind of resilience – certainly not for consumer services based on ad or subscription revenue. From their point of view, the existing tools work well enough.
What role should aesthetics play in these decisions? I don’t know, and I think I’m pretty middle-of-the-road in the endless war between programmers who favor purity and formal elegance and those who prioritize outcomes and practical considerations. But I will say that content addressing strikes me, and many software people who come across it, as obviously superior to host-based addressing along certain dimensions. Still, this beauty can certainly feel like a slender reed on which to rest the design of a system that needs to get real work done. So weight this as you will.
These advantages are, to varying degrees, abstract and aspirational. Building anything on IPFS in 2020 represents a bet on a future in which priorities have changed, and engineering decisions are therefore made differently. Put another way: IPFS is a technology that belongs to a certain class of futures – a technology that will thrive in those futures (but not others), and that may help to bring them about.
Plus, it’s super cool. You should try it!
- To be clear, these are all statements about how these technologies operate through human psychology, and within our sociopolitical context. They are statements about technosocial systems. ↩︎
- True story: my software engineering professor at Berkeley in 2000, Eric Brewer, was at the time a co-founder of Inktomi – basically the first of what we would now call a CDN. This was absolutely cutting edge technology at the time. My main memory of him is that he seemed very smug; around that time, he was a paper billionaire. ↩︎
- These examples are brutally simplified, and each one represents just one possible way of designing the system. For example, I describe self-hosted WordPress, but many people use WordPress’ SaaS offering. ↩︎
- Not that we can’t or don’t know, but that we don’t need to. ↩︎
- You’ll see people tout that, with IPFS, you get CDN functionality “for free” – because of content addressing, files can live and be served from anywhere. But, 1), CDNs exist and work well, and 2) just because content can be served from any IPFS node doesn’t mean that most content will be available from many nodes. ↩︎