On 12 March this year, the world wide web turned 30. Later this year, in October, the internet itself will turn 50 years old. It is clearly a banner year for internet historians.
But even just a glance at what those birthdays mark shows the strange evolution of the internet. For its first few decades, growth was markedly slow compared to its more recent, astonishingly fast expansion. From four computers in 1969, to around 100,000 in 1990, to several billion today – in little more than the blink of an eyelid, the internet has gone from being an interesting side-project to becoming critical infrastructure.
This rapid growth means some of the protocols and systems upon which the internet relies have not kept pace. The network, now a mainstay of the world’s economic and political health, still relies worryingly often on systems sketched out decades ago for a network of a few hundred people, all of whom knew each other.
The internet was born as an experiment between a few universities, funded by the US Department of Defense. The coming 29 October anniversary will mark 50 years since the first message was sent over ARPANET, the precursor to the internet, down a connection of a few hundred miles between the University of California, Los Angeles, and the Stanford Research Institute.
The initial plan for ARPANET was to link three US universities and the telecoms company that helped build the hardware to allow them to network their computers – virtually a side-project to further research in other areas, the logic being it was easier to have different specialist computers at different universities, and allow them to be shared, than funding multiple highly specialised and expensive machines.
The DoD’s research agency had something of an ulterior motive in that the project would also provide a low-risk environment in which to test new networking technologies which could then prove useful in military systems.
The network, then, was a build-it-as-you-go ad hoc thing, put together often by relatively junior staffers. Even its first message shows the teething issues: the very first message sent on ARPANET failed, and caused a computer crash. Trying to be efficient, the computer transmitting the first message had been set up to spot when a user could only be typing one possible command – and once it had worked this out, to automatically send the rest of it.
The first message as intended was a prosaic, if functional, one: “login”. The “l” and the “o” were sent fine – but as soon as the sender hit “g”, his computer sent “gin” all in one flurry. The receiving computer, expecting one letter rather than three, promptly crashed. The first message, therefore, was “lo” – a nicely dramatic, Lord of the Rings-style greeting.
Over the next five years or so, the network grew from two computers, to four, to around a dozen, to a few dozen, and the protocols on how data was sent and received, how computers were marked on the network, and even how new protocols would be agreed were set up as the network went – at a time when everyone on it knew one another, and would often phone around to fix outages and similar issues.
One of the earliest issues which needed resolving was deciding how traffic should flow across the network, and what “addresses” to give to each computer, so it knew where to go. The protocol which decided this – TCP/IP – was developed by Vint Cerf and Bob Kahn in 1973, earning them the titles of “fathers of the internet”. But revolutionary though it was, it also showcases the headaches of turning an experimental network into a world-spanning one on the fly.
Steve Crocker, one of the UCLA post-docs who was there at the time of the internet’s very first message, recounts the problem. The first addressing system ARPANET was given would keep working for up to roughly sixty machines connected to the network, he explains. That seemed like plenty, given only four computers were connected in the first phase – until it wasn’t.
As the ARPANET began to transition to the internet, the addressing system – by now known as IP, or Internet Protocol – needed future-proofing, so that it wouldn’t keep having to be changed and updated, which was increasingly difficult now that the network was expanding beyond a group who knew one another.
“In the transition to the internet,” Crocker explains, “the decision was made to have 32-bit addresses. 32 bits gives you, if you use all of them, it’s four billion.”
The “32 bits” mentioned by Crocker is the amount of computer memory allocated to IP addresses – essentially a limitation on how many addresses you can have (like US five-digit zip codes). “If I thought sixty was large…four billion is bigger than sixty,” he says, laughing.
But a few years ago it became clear that four billion IP addresses, an impossibly big number when it was chosen, is nowhere near enough: we are already bumping up against the limit. In a world with seven billion people and, thanks to the internet of things, soon more devices than people, we need to change again. And for a network as big as the internet, that’s not easy.
“Then you get to this awkward point where that turns out not to be big enough,” he says. “I have some pains of embarrassment about we should have known better, we should have encapsulated the addresses in a way that it would have been trivial to change the structure of that and everything would have been smooth.”
Changing the addressing system – formally, moving from IPv4 to IPv6 (there was a version five, but it didn’t catch on) – will increase the total number of addresses from just under 4.3 billion to 340,282,366,920,938,463,463,374,607,431,768,211,456 (that’s around 340 undecillion, for those wondering).
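The two totals can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the standard address widths: 32 bits for IPv4, as Crocker describes, and 128 bits for IPv6, a figure not given in the text above. The variable names are illustrative only.

```python
# Address-space arithmetic. Each extra bit doubles the number of
# possible addresses, so an n-bit address allows 2**n of them.
IPV4_BITS = 32    # from Crocker's account
IPV6_BITS = 128   # assumption: standard IPv6 width, not stated in the text

ipv4_total = 2 ** IPV4_BITS
ipv6_total = 2 ** IPV6_BITS

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:,}")

# Quadrupling the address length multiplies the space by 2**96,
# not by four.
growth_factor = ipv6_total // ipv4_total
print(f"Growth factor:  {growth_factor:,}")
```

The jump is so large because the address length sits in the exponent: every added bit doubles the space.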
This one really should prove future-proof – it’s enough to assign an IP address to every single atom on the earth’s crust and still have plenty left over. But the pain of the transition, and the difficulty of the process, shows the unprecedented challenge of the internet.
Because previous technological revolutions haven’t been so inter-connected, early experiments could make later building better: London’s single-track underground system has to be shut down for maintenance – so later systems added a third track, to allow for round-the-clock maintenance, for example. By contrast, in a very real sense, the internet is built on top of the very first networks and protocols it used, some of which were quite literally drawn on the back of a napkin.
If that’s a headache when it comes to working out the addressing system of the internet, it can be a nightmare when it comes to matters essential to the internet’s security. Humans have always found it easier to use text addresses when we use the internet than to remember the long strings of numbers used in IP addresses.
Even before the advent of the world wide web (and web addresses), text addresses were used for email. These were kept up to date with a text file on one folder on one computer which told the network which IP addresses corresponded to which host (usually email) addresses. On an internet with billions of users, thankfully, the system has evolved somewhat.
The new system is known as the Domain Name System, and features a range of dynamically updating servers which check-in and update each other on where addresses point. There is a small group of “core” servers which tell browsers where to go to find “.com”, “.net” etc, and then thousands or millions more which detail individual sites. It is the internet’s answer to a phone book.
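The layered lookup described above can be sketched as a toy model. This is an illustration under stated assumptions, not how real DNS software is written: every server name and address below is hypothetical (the addresses come from ranges reserved for documentation), and real resolvers add caching, timeouts and many record types.

```python
# Toy model of a layered name lookup. All names and addresses are
# made up for illustration.

# "Core" servers: they only know which server handles each ending.
ROOT = {"com": "tld-com", "net": "tld-net"}

# Servers for each ending: they know which server handles each site.
TLD = {
    "tld-com": {"example.com": "ns-example-com"},
    "tld-net": {"example.net": "ns-example-net"},
}

# Servers for individual sites: they hold the actual addresses.
SITE = {
    "ns-example-com": {"www.example.com": "192.0.2.10"},
    "ns-example-net": {"www.example.net": "198.51.100.20"},
}


def resolve(hostname):
    """Follow the chain core server -> ending server -> site server."""
    name = hostname.lower().rstrip(".")
    labels = name.split(".")
    if len(labels) < 2:
        raise ValueError(f"not a full host name: {hostname!r}")
    ending = labels[-1]               # e.g. "com"
    domain = ".".join(labels[-2:])    # e.g. "example.com"
    ending_server = ROOT[ending]
    site_server = TLD[ending_server][domain]
    return SITE[site_server][name]


print(resolve("www.example.com"))
```

Note that each step in this sketch simply accepts whatever the previous server says: nothing checks that an answer is genuine.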
Astonishingly, though, the entire system runs on trust, meaning if someone has access to a major and well-trusted name server they could re-route that traffic. This could just be done to steal traffic from a rival – I could hypothetically re-route traffic from Google.com by saying it should actually point to jamesrball.com (I would regret this as it would crash my site).
But it has more severe implications: earlier this year researchers at FireEye found evidence the Iranian state may have been behind a subtle and sophisticated attack re-routing traffic to sites of Middle Eastern governments, NGOs, and others across Iranian networks, giving it the chance to spy on such networks – with no way for the sites concerned to track or prevent the attack.
Because the early internet was built on trust, the protocols underlying the internet rely on it, too – and efforts to introduce better security are slow to be taken up. A secure version of DNS has existed for five years, but its rollout in practice is glacial. It’s an issue even those running the system acknowledge.
“I’m a firm believer that all technologies have to evolve,” says Göran Marby, CEO of ICANN, the non-governmental body overseeing DNS. “Rightfully, you could say that the DNS has not fundamentally changed since its inception… If someone comes around with a better methodology to connect people, because that’s what we do, we’re connecting people, we should step aside.”
But, of course, they haven’t.
DNS is hardly the only such protocol facing these issues. Its lesser-known sister, Border Gateway Protocol, is a system used to manage which physical cables traffic flows along: routers advertise which other routers they’re connected to, in order to explain what routes are open for traffic to flow across the internet. If DNS is the phonebook of the internet, then BGP is effectively its satnav.
It is also fundamentally unchanged since it was drawn up by two network engineers at a conference in 1989 – quite literally on the backs of (depending who tells the story) two or three napkins. The solution they came up with was suitably ad-hoc and seen as temporary, to the point that the napkins concerned weren’t even kept for the historical record, though photocopies of them exist in Cisco’s archives.
BGP, even more than DNS, is based on trust – and so even small changes to the protocols of major internet service providers can suddenly re-route huge volumes of traffic. The danger with BGP is if someone says they can offer you a route to a particular website when they actually cannot.
This happened around a decade ago when Pakistani officials ordered traffic blocked to a certain YouTube video – reportedly an anti-Islamic video by the extreme Dutch politician Geert Wilders. A Pakistani ISP used BGP to do this, offering a “route” to YouTube – the whole site, not just one video – which actually dropped its traffic nowhere. The ISP accidentally advertised this route not only to its customers, but to everyone.
Seeing a new route advertised, lots of BGP servers took it – causing a huge worldwide outage for YouTube, mostly by accident. Such incidents are hardly a thing of the past: late last year another BGP error saw lots of traffic intended for Google suddenly re-routed into China – sparking fears of a deliberate attack aimed at surveilling traffic, but later revealed to be the result of a fat-finger error by someone at a Nigerian internet provider, who had been updating their BGP settings.
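One reason a mistaken advertisement can win so quickly, which the accounts above do not spell out, is that routers generally prefer the most specific block of addresses that matches a destination. The sketch below illustrates that preference only. It is not real BGP, the origin names are hypothetical, and the address blocks come from a range reserved for documentation.

```python
import ipaddress


def best_route(routes, destination):
    """Return the origin of the most specific route covering destination.

    routes is a list of (prefix, origin) pairs. No check is made that
    an origin really owns the prefix it advertises, which is the
    trust problem in miniature.
    """
    dest = ipaddress.ip_address(destination)
    matching = []
    for prefix, origin in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matching.append((network.prefixlen, origin))
    if not matching:
        return None
    # A longer prefix covers fewer addresses, so it is more specific.
    return max(matching)[1]


# Hypothetical legitimate advertisement for a site's address block.
routes = [("203.0.113.0/24", "legitimate-site")]
before = best_route(routes, "203.0.113.77")

# A mistaken, narrower advertisement covering part of the same block.
routes.append(("203.0.113.64/26", "mistaken-isp"))
after = best_route(routes, "203.0.113.77")

print(before, "->", after)
```

In this toy version the narrower advertisement takes over the moment it appears, even though the legitimate route is still present.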
One of the joys of the internet is the way in which it has evolved naturally and almost spontaneously from a small and experimental network, to become a rival to telecoms monopolies, established power, and to show new ways of connecting the world.
But that strength is a weakness, too: we would not build our power grid on a trial-and-error basis, and wouldn’t tolerate these kinds of issues now. Yet we have no option to do otherwise: we might want to say that “we wouldn’t start from here”, but we have no choice but to do so.
The internet has become critical to our free societies, our free media, and our economic futures. It is critical infrastructure. And we’re going to have to rebuild it, while it’s live, in real-time, for more than four billion users across nearly 200 countries. The internet’s next fifty years could be even livelier than its first.