The Domain Name System

by Steven J. Owens (unless otherwise attributed)

An Explanation of the Details, History and Philosophy of the Domain Name System (DNS)

Or: MAC addresses and IP addresses and Domain Names, Oh My!

This started life as two separate articles, which I've quickly and clumsily merged. I'm going to start with the gory details of the basic technology. Below, I'll talk about how the whole system is organized, and how it got that way, and how that should affect your philosophy.

Gory Details

Domain names are human-friendly identifiers for computers that are connected to the internet. They make it easier to remember and look up computers on the internet. To understand the basics of how internet domain names actually work, you have to understand how the underlying technology works, so let's get into the gory details.

The part that plugs your computer into the network is called the ethernet card, aka NIC (Network Interface Card). Ethernet isn't the only kind of networking hardware, but it's the most popular, so popular that people pretty much assume that if you're talking about your networking hardware in an internet context, you're talking about ethernet.

Every ethernet card, when it's manufactured, has a MAC address set for it; essentially a unique serial number built into it. Every ethernet card in existence has one. MAC stands for "Medium Access Control"; medium as in "whatever it is that the message travels through", which in this case is the ethernet network.

Here's a fun URL; it appears to let you identify the manufacturer from your ethernet card's MAC:

http://www.coffer.com/mac_find/

Here's a more technical discussion of MAC addresses:

http://www.erg.abdn.ac.uk/users/gorry/course/lan-pages/mac-vendor-codes.html

You plug an ethernet card into a local area network, and anything on the network that wants to get data to your ethernet card does so by putting your ethernet card's MAC address on the data.

Essentially it's as if the cards were all transmitting many short messages, each message starting with something like "This message is for MAC address 8c:a9:82:81:df:c8". All the ethernet cards on that local area network hear everything, but they ignore all messages that don't start with their MAC address.
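To make the "ignore everything that isn't for me" idea concrete, here's a minimal Python sketch. It's only an illustration: real cards do this filtering in hardware, and the function names here are made up for the example. It assumes two facts not spelled out above: the first three bytes of a MAC address are the prefix assigned to the manufacturer (which is what those lookup pages key on), and the all-ones address ff:ff:ff:ff:ff:ff means "everybody on this network".

```python
def parse_mac(text):
    """Turn 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {text!r}")
    return bytes(int(part, 16) for part in parts)


BROADCAST = parse_mac("ff:ff:ff:ff:ff:ff")  # addressed to every card on the LAN


def vendor_prefix(mac):
    """First three bytes: the prefix assigned to the card's manufacturer."""
    return ":".join(f"{byte:02x}" for byte in mac[:3])


def card_accepts(my_mac, destination_mac):
    """A card keeps messages addressed to it (or to everyone) and ignores the rest."""
    return destination_mac == my_mac or destination_mac == BROADCAST


mine = parse_mac("8c:a9:82:81:df:c8")
print(vendor_prefix(mine))                                   # 8c:a9:82
print(card_accepts(mine, parse_mac("00:11:22:33:44:55")))    # False
```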

That's what the local network uses to get bits along the wire from one machine (the hub, or switch) to another. One level up (in terms of organizational hierarchy) is the IP address. The infamous "dotted-quad", like 192.168.1.1.

Note: By the way, any time you see an address like 192.168.xxx.xxx, it's a private network address. The 192.168.xxx.xxx range is never assigned to anybody on the public internet. It's meant to be used in-house, and then the gateway machine between the private network and the public internet translates addresses going in and out. There are a few other reserved ranges: 10.xxx.xxx.xxx and 172.16.xxx.xxx through 172.31.xxx.xxx are also private, and 169.254.xxx.xxx is the "link-local" range that machines assign themselves when nothing else hands them an address. Practically speaking, a machine on a private network can have any IP address you want, but the idea is that the official private IP ranges mean you can tell just from looking at an IP address whether it's supposed to be part of an internal, private network or not.
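If you want to check an address against those ranges without squinting at the digits, Python's standard ipaddress module can do it. This is a small sketch; the classify function and its labels are just names chosen for this example, and the 172.16 block is written out in full as 172.16.0.0/12 (that is, 172.16.x.x through 172.31.x.x).

```python
import ipaddress

PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.x.x through 172.31.x.x
    ipaddress.ip_network("192.168.0.0/16"),
]
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")  # self-assigned addresses


def classify(text):
    """Label an IPv4 address as 'private', 'link-local' or 'other'."""
    addr = ipaddress.ip_address(text)
    if any(addr in block for block in PRIVATE_BLOCKS):
        return "private"
    if addr in LINK_LOCAL:
        return "link-local"
    return "other"


for example in ("192.168.1.1", "172.31.255.255", "172.32.0.1", "169.254.10.20"):
    print(example, classify(example))
```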

The IP address is like the MAC address, except that it isn't permanent. An IP address is a temporarily assigned number, that identifies a machine at the Internet level of organization. In general, the IP address is supposed to uniquely identify a single given machine at any one time. It's generally supposed to be one IP address per machine (not counting machines on private networks), but of course since the internet went public, that got mucked up.

An IP address identifies a specific computer at any given instant. Or more properly, a specific network card at any given instant, which is usually a one-to-one mapping, but not always. For example the gateway machine I mentioned above would probably have two network cards: one speaking to the private network and one speaking to the public internet.

The first generation of webservers that served multiple domain names off one box used multiple IP addresses to identify the same NIC, one per domain name, and the webserver figured out what page to give back based on the IP address the request came in on. Nowadays they've enhanced the HTTP protocol so the browser includes the domain name in the request itself, and the webserver can look at that.
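Here's a rough sketch of that second approach, where the webserver picks the site from the domain name inside the request (the HTTP "Host" header). The site table and page contents are made up for illustration, and a real webserver does a great deal more parsing and validation than this.

```python
SITES = {  # hypothetical domains that all live on one IP address
    "foo.com": "<h1>Welcome to foo</h1>",
    "bar.net": "<h1>Welcome to bar</h1>",
}


def host_from_request(raw_request):
    """Pull the Host header value out of a raw HTTP request, or return None."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:            # skip the request line ("GET / HTTP/1.1")
        if line == "":                # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            return host.split(":")[0]  # drop an optional ":port" (ignores IPv6 literals)
    return None


def page_for(raw_request):
    """Choose which site's page to send back, based only on the Host header."""
    return SITES.get(host_from_request(raw_request), "<h1>Unknown site</h1>")


request = "GET / HTTP/1.1\r\nHost: bar.net\r\n\r\n"
print(page_for(request))   # <h1>Welcome to bar</h1>
```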

Sometimes an IP address is semi-permanently assigned to a particular use; this is called a "static IP address", but it's still not at all permanent in a hardware sense; it's not built into the NIC, it's just that nobody else is supposed to answer to that IP address.

In fact there's nothing to prevent somebody from just screwing up and assigning the same IP address to two machines, which can cause all sorts of confusion and poor performance on your local network, not to mention flakiness.

The general nature of static IP addresses is that they are defined in a simple config file that is often (even usually) edited by hand. So, they tend to stay that way until somebody changes them.

IP addresses can be dynamically assigned per session (the kind you usually get with a dialup or with a consumer-level dsl/cable modem) or statically assigned.

Dynamic IP addresses are usually assigned from a range, e.g. 192.168.1.9 to 192.168.1.16.

Each ISP is assigned a block of IP addresses. This is typically done in a hierarchical fashion - a large "backbone" provider is assigned a very large block of IP addresses, they hand out smaller chunks of that block to their customers, who hand out smaller chunks to their customers. This turns out to be somewhat important, because of how "reverse DNS" works (see below).

Three or four levels down this hierarchy, a range of IP addresses is assigned to a DHCP server that belongs to the ISP (DHCP is Dynamic Host Configuration Protocol - a host being any machine connected to the internet). The DHCP server works at the infrastructure level of the network. When your machine connects, it broadcasts a request for a DHCP address. The DHCP server responds with an IP address that is "leased" to your machine for a given time (minutes, hours or even days). When the lease expires, the DHCP server erases the entry in its records that says it was assigned to your machine, and may assign it to somebody else. Hopefully your machine erases the lease at the same time (it can be a real pain when it doesn't).
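The leasing idea can be sketched as a toy address pool in Python. This is not the DHCP protocol itself (the real thing is a multi-step exchange of messages between your machine and the server, which is left out here); the LeasePool class and its method names are invented for this example, and the clock is passed in by hand so the behaviour is easy to follow.

```python
import ipaddress


class LeasePool:
    """Toy DHCP-style pool: hands out addresses from a range for a limited time."""

    def __init__(self, first, last, lease_seconds):
        start = int(ipaddress.ip_address(first))
        end = int(ipaddress.ip_address(last))
        self.addresses = [str(ipaddress.ip_address(n)) for n in range(start, end + 1)]
        self.lease_seconds = lease_seconds
        self.leases = {}  # MAC address -> (IP address, time the lease expires)

    def _expire(self, now):
        """Forget every lease whose time has run out."""
        self.leases = {mac: lease for mac, lease in self.leases.items() if lease[1] > now}

    def request(self, mac, now):
        """Return an address for this MAC, renewing its lease if it already has one."""
        self._expire(now)
        if mac in self.leases:
            ip = self.leases[mac][0]
        else:
            in_use = {leased_ip for leased_ip, _ in self.leases.values()}
            free = [candidate for candidate in self.addresses if candidate not in in_use]
            if not free:
                return None  # pool exhausted; nothing to hand out
            ip = free[0]
        self.leases[mac] = (ip, now + self.lease_seconds)
        return ip


pool = LeasePool("192.168.1.9", "192.168.1.10", lease_seconds=3600)
print(pool.request("8c:a9:82:81:df:c8", now=0))   # 192.168.1.9
```

Once a lease runs out, the same address can go to a different machine, which is exactly why a dynamic address tells you little about who had it yesterday.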

Lately, small office routers and firewalls come with built-in DHCP servers. Many of these provide an option to always assign a given IP address to a given MAC address. So we have a sort-of-static, dynamically-assigned IP address. But at least it's centrally managed.

Domain Names

Okay, now we come to domain names. Domain names are basically a logical identifier assigned to a given IP address. Actually, let me rephrase that.

Domain names are human-readable, logical identifiers. In some cases not too human-readable :-).

Domain names are partly a mnemonic device, partly a logical structure.

An entry in the domain name system defines a logical name and what IP address is running the DNS server that keeps track of what IP address answers to that logical name. The logical names are defined in a fairly human-readable format, which we know as foo.com, bar.net, and so forth. The .com or .net or whatever part is called a TLD (Top Level Domain; we do love our acronyms, don't we?).

How DNS works is, you type in a domain name like www.google.com in your browser bar. One part of your browser handles DNS queries. It keeps track of domain names you've asked for addresses for recently, and the IP that it eventually tracked down for that domain.

(In case you're wondering what the heck "foo" is: Foo and bar are generic variables that programmers use in conversation, sort of like somebody might say "John Doe", only less specific (since we usually mean a person when we say "John Doe", where foo or bar can be anything). The jargon term for them is "metasyntactic variable". This generally comes from the expletive acronym "fubar", which dates from World War II: Fucked Up Beyond All Recognition. It's explained further at:

http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=foo

Foldoc, by the way, is a generally useful information source. FOLDOC == Free On Line Dictionary Of Computing.)

Your computer has a DNS resolver, which basically has two jobs: first, to send a request to the Domain Name System (DNS) server to get the IP that a particular domain name points to; second, to remember the recently requested IP addresses, to avoid having to go out and re-request all the time.

Your browser has a DNS cache (I'm not sure, but it may be built into Windows itself by now). I'm reliably informed that on modern mozilla browsers (e.g. firefox, mozilla, AOL's netscape distribution) the DNS cache runs as a separate process. On Linux, a separate program handles this.

In your local network, there's one specific machine that has the job of doing this for the local network. It's a lot like calling information on your phone. You call information, give them the name, they give you the number, then you call the number.

Your local DNS server does exactly the same thing - it keeps a list of addresses recently requested, and anything it can't handle, it sends a request up the line - usually at this point it's your organization's ISP DNS that gets the request. That DNS server keeps track of the numbers it's looked up recently, so it doesn't have to go up the hierarchy for every single request.

This hierarchy repeats all the way up the line to the "root level DNS servers".
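Here's a small Python sketch of that "answer from memory if you can, otherwise ask the next one up" chain. It follows the simplified picture painted above rather than the full DNS protocol; the CachingServer class, the server names and the addresses are all made up for illustration, and the clock is passed in by hand.

```python
class CachingServer:
    """Answers from its own cache when it can; otherwise asks the next server up."""

    def __init__(self, name, upstream, keep_seconds):
        self.name = name
        self.upstream = upstream        # another CachingServer, or a plain dict at the top
        self.keep_seconds = keep_seconds
        self.cache = {}                 # domain -> (ip, time the entry expires)
        self.asked_upstream = 0         # how often this server had to go up the line

    def lookup(self, domain, now):
        cached = self.cache.get(domain)
        if cached and cached[1] > now:
            return cached[0]            # still fresh: no need to ask anybody
        self.asked_upstream += 1
        if isinstance(self.upstream, dict):
            ip = self.upstream[domain]
        else:
            ip = self.upstream.lookup(domain, now)
        self.cache[domain] = (ip, now + self.keep_seconds)
        return ip


top = {"foo.com": "192.0.2.10"}   # stands in for whoever really knows the answer
isp = CachingServer("isp", top, keep_seconds=3600)
office = CachingServer("office", isp, keep_seconds=600)

office.lookup("foo.com", now=0)    # goes all the way up the line
office.lookup("foo.com", now=60)   # answered straight from the office cache
print(office.asked_upstream, isp.asked_upstream)   # 1 1
```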

The root level DNS servers used to be maintained by Network Solutions, aka the InterNIC (Internet Network Information Center).

Since the U.S. Department of Commerce split everything up, the job of maintaining the root level servers has been separated from the job of actually adding new names (in theory at least, though both organizations are still heavily intertwined with each other).

Adding new names is now referred to as being a registrar, and is meant to be competitive and customer-service oriented. Maintaining the root level servers is supposed to be not-for-profit and infrastructure related.

So what you do is, through your registrar, you tell the root level servers, "this machine over here is responsible for keeping track of what IP address SAGReiss.com points to." Usually that machine belongs to your registrar or your ISP, and you then tell them to have it point to whatever IP your site lives on.

One nuance to be aware of is, because of all of those lists-of-recently-requested-domains (aka DNS caches), if you make a change, it can take a while for the DNS hierarchy to catch up with it. Sometimes days. In the old days, typically 2-4 days. These days it generally seems faster... it all depends on how long the individual DNS servers in the hierarchy are configured to keep information in their cache. But I've recently heard people complain about a change taking days, which may have been the DNS system, or may have been organizational holdups at their ISP, who knows.

Bear in mind that all of the DNS servers are privately owned by different people. The root level servers are owned by the not-for-profit which was set up by the U.S. Department of Commerce - a gov't organization - and ICANN. ICANN is responsible for handing out blocks of IP addresses to the top of the hierarchy, as well as (apparently) setting policy for how things like the not-for-profit are handled.

Reverse DNS

This is a topic that's becoming more important, especially for sending email, these days. Ordinary DNS takes a domain name (foo.com) and converts it to an IP address (192.168.1.1). Reverse DNS takes an IP address (192.168.1.1) and tells you what domain name (foo.com) is assigned to it.
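Under the hood, a reverse lookup is just an ordinary DNS query for a specially constructed name: the address written backwards, with ".in-addr.arpa" tacked on the end. Here's a short Python sketch using only the standard library; the two function names are made up for this example, and reverse_lookup needs a working network and resolver to return anything useful.

```python
import ipaddress
import socket


def reverse_name(ip_text):
    """The special domain name that gets queried to reverse-resolve an address."""
    return ipaddress.ip_address(ip_text).reverse_pointer


def reverse_lookup(ip_text):
    """Ask the system resolver which host name an address maps back to."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip_text)
        return hostname
    except OSError:
        return None   # no reverse DNS listing, or the lookup itself failed


print(reverse_name("192.168.1.1"))   # 1.1.168.192.in-addr.arpa
```

Writing the address backwards puts the biggest block on the right, just as .com sits on the right of foo.com.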

Why would you care? Mainly, as it turns out, because lots of machines run Microsoft's OS, and funnily enough, these machines (that run Microsoft's OS) seem to get routinely subverted and used to send out spam. Any machine can be used to open a network connection to another machine, and pretend to be an email server (MTA), and ask it to accept a few million spam emails for delivery to addresses on that machine. These spam emails inevitably have a "from" address that is from a domain completely unconnected with the subverted Microsoft machine (or with the spammer, for that matter).

Because of this spam, a lot of mail servers, these days, are configured to check the reverse DNS of the IP address when they get an incoming email connection. If the domain name on the email doesn't match the domain name assigned to the IP address, or if there's no reverse DNS listing, they drop the email (and often send back an error message: "Who? Never heard of 'em.")
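As a rough sketch, the check described above boils down to something like the following Python. Real mail servers differ a lot in exactly what they compare and how strict they are, so treat the matching rule here (the reverse name must equal the claimed domain or be a host inside it) as an assumption made for illustration. The lookup function is passed in, so the example runs without touching the network; the addresses and host names are invented.

```python
def accept_mail(client_ip, claimed_domain, reverse_lookup):
    """Apply the reverse DNS policy; returns (accepted, reason).

    reverse_lookup is any function that maps an IP string to a host name or None.
    """
    hostname = reverse_lookup(client_ip)
    if hostname is None:
        return False, "no reverse DNS listing"
    hostname = hostname.rstrip(".").lower()
    claimed = claimed_domain.rstrip(".").lower()
    if hostname == claimed or hostname.endswith("." + claimed):
        return True, "reverse DNS matches"
    return False, f"{hostname} is not part of {claimed}"


# A pretend reverse DNS table standing in for real lookups.
fake_reverse_dns = {
    "192.0.2.25": "mail.foo.com",
    "198.51.100.7": "dsl-7.example-isp.net",
}

print(accept_mail("192.0.2.25", "foo.com", fake_reverse_dns.get))
print(accept_mail("198.51.100.7", "foo.com", fake_reverse_dns.get))
```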

Remember when I said, above, that IP address blocks are usually handed out in a hierarchical fashion? This hierarchy works a lot the way the regular domain name hierarchy works; somebody is in charge of the main block of IP addresses, and they tell you who's in charge of the smaller block, etc. Eventually they get to the ISP that assigned you the IP address, and hopefully if you're meant to be sending email from your machine, they've got a reverse DNS listing set up for your machine's IP address.

While it's possible to generate fake IP packets with bogus source IP addresses, it's less trivial than just taking over one of the millions of random, weakly-secured Microsoft machines.

History and Philosophy

Before I get into the details, there's one thing that you should bear in mind:

There Ain't No Such Thing As A Free Lunch (no matter what your contract says)

You have to temper your expectations with an understanding of the real world. Partly this means you have to expect people to fall down now and again, and deal with it. Partly it means you have to understand that energy is energy, whether it's thermal or financial, and if the energy ain't there, the energy ain't there. Any system that's broken is going to run down, sooner or later.

With DNS, there are basically three separate services you are getting. Before I explain that, let me take a moment to explain some basics of how the DNS system works.

The domains are all maintained in a set of databases on a set of "top level domain" (TLD) servers. The entire DNS system is essentially a hierarchy. When you configure your PC for the 'net, you tell it what numeric IP address to go to for DNS information. That server in turn has an address it goes to when it doesn't have the answer, and so on, all the way up the hierarchy until it gets to the top level. (It's slightly more complicated than this, see the explanation of the third part, below).

By the way, these servers also remember the answers they got, for a short while, and instead of going up the line, they just give you the answer they got the last time they went up the line. This means that when things change, it can take a while for the new data to percolate all the way down. How long is a short while? It used to be a few days, lately I've heard that a lot of DNS servers are configured more in the hours range. Note that if every DNS server was set to one day, it might still take several days for the data to percolate down.

I mention this second fact mainly because it's something that sort of catches people off guard, a lot of the time, and understanding that will help you understand what's going on if you ever change your domain name stuff.
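To see how a chain of one-day caches can add up to several days, here's a small worked example in Python. It uses the simplified model described above, where every server keeps an answer for its own fixed time and re-asks the moment that time runs out. As far as I understand it, real DNS servers usually pass along the time remaining on an answer rather than starting a fresh clock, so in practice the stacking tends to be less severe. The function name and the numbers are made up for illustration.

```python
import math


def first_hour_with_new_data(layers):
    """Hour at which the bottom server first hands out the changed answer.

    layers: list of (keep_hours, last_refresh_hour), nearest-the-source first.
    The change is published at hour 0, so every last_refresh_hour is <= 0.
    """
    available_at = 0   # when the server above starts giving out the new answer
    for keep_hours, last_refresh in layers:
        # This server re-asks every keep_hours; find its first refresh that
        # happens at or after the moment the new answer is available above it.
        cycles = math.ceil((available_at - last_refresh) / keep_hours)
        available_at = last_refresh + cycles * keep_hours
    return available_at


# Three servers, each keeping answers for a day, each unluckily refreshed
# an hour or so before the one above it.
unlucky = [(24, -1), (24, -2), (24, -3)]
print(first_hour_with_new_data(unlucky))   # 69 hours, nearly three days

# Same three servers, but all of them were about to refresh anyway.
lucky = [(24, -23), (24, -22), (24, -21)]
print(first_hour_with_new_data(lucky))     # 3 hours
```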

So, three services. The first two are the original DNS stuff, which consists of a) maintaining the big database(s) of domain names and the hardware for the top level of DNS servers and b) the customer service aspect of getting domain names into the big database(s), and making changes to domain names already in the big database(s). (see the section below "The History" for fun history facts).

These services are in fact quite easy to provide for a very reasonable, low cost. I would generally recommend never paying more than $15/year for registering a domain, and in fact you might even be able to find it cheaper ($10 isn't wholly unreasonable). However, that's not all there is to it.

The third service is that the registry doesn't really contain the numeric IP address of the server where your domain lives. Instead, it's a bit more indirect (which helps spread the work around). The registry contains the numeric IP address of a "primary DNS server", the DNS server that has the primary responsibility for keeping track of what numeric IP address goes with your domain.
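That indirection can be sketched as two lookups instead of one. The tables below are invented stand-ins (the addresses come from ranges reserved for documentation), and the rule for finding the registered domain is deliberately naive; the point is only that the registry knows who to ask, not the final answer.

```python
# Hypothetical data for illustration only.
REGISTRY = {                       # domain -> IP address of its primary DNS server
    "foo.com": "192.0.2.53",
}
DNS_SERVERS = {                    # DNS server IP -> the records that server keeps
    "192.0.2.53": {
        "foo.com": "198.51.100.10",
        "www.foo.com": "198.51.100.10",
    },
}


def registered_domain(hostname):
    """Naive rule: treat the last two labels as the registered domain."""
    return ".".join(hostname.split(".")[-2:])


def resolve(hostname):
    dns_server_ip = REGISTRY[registered_domain(hostname)]   # step 1: who keeps track?
    return DNS_SERVERS[dns_server_ip][hostname]             # step 2: ask that server


print(resolve("www.foo.com"))   # 198.51.100.10
```

Moving your site to a new IP address means changing only the record on the primary DNS server; the registry entry stays exactly as it was.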

In the early days, when everybody involved was a research organization (either at a university or at various high tech defense firms and then later just high tech firms), you handled this yourself by setting up a primary DNS server before you did any of this, or by getting somebody else who already had one to be your primary.

As the Internet became more public and inevitably more commercial, ISPs provided this service for the domains they hosted. Eventually, people started offering domain hosting just by itself as their business, instead of offering dialup and domain hosting both. Eventually, primary DNS hosting was split out as a separate service (called "domain parking" at the time, since technically you're not allowed to have a domain without having a box somewhere that answers to it), and then eventually as separate companies that just provide DNS hosting.

Which is where we come to companies like NameZero and EasyDNS and Nominum. Nowadays, a lot of companies offer additional services that are somewhere between real domain hosting and simply primary DNS hosting; email-forwarding, a place-holder webpage, etc. Even this stuff, in my semi-competent opinion, should not cost much per year; email is a drop in the bucket for resources, these days. (Although of course it's possible to swamp any channel -- start spamming a few million people a day with binary attachments and you'll do some serious damage to your provider's bandwidth, before they yank your account.) However, there is a certain minimum overhead associated with setting up and running such an organization.

The History

Back in the early days, the NSF (National Science Foundation), which was providing grant funding for a lot of the Internet stuff then, contracted with a company called Network Solutions to handle all of this, creating an organization/service called the InterNIC (NIC being an acronym for Network Information Center).

When the Internet became much more public and much more popular, this turned into a lucrative monopoly for Network Solutions, in part because (some people would argue, and I'd agree) Network Solutions exceeded their authority and went from charging a one-time fee (of $35) for registering a domain name to charging a yearly, excessively expensive fee ($50, first two years up front) for registering a domain name and for maintaining existing domain names. Network Solutions also did (and continues to do) a lot of other questionable things, like completely ignoring the guidelines (.com for commercial, .org for nonprofit, .net for network infrastructure), and a great variety of other abuses which I'm sure you can learn a lot about by just googling for "internic sucks" or "network solutions sucks".

Eventually this all came to a head and the NSF actually did something fairly smart (for them, that is... for us, it kinda sucks that they didn't give Network Solutions the spanking they so richly deserved). The NSF decided "Hey, y'know, this is all turning into commerce, and we do science, not commerce. So we're just going to let the contract with Network Solutions run out later this year and not renew it, and some other governmental organization might want to be thinking about stepping up and doing something."

That other government organization was the U.S. Department of Commerce, which acted with unusual (for a gov't bureaucracy - I don't know, this may actually be characteristic of the Department of Commerce) swiftness and even more unusual creativity - they stepped up and said, "Hey, we do commerce, so that makes this our bailiwick." The truly startlingly creative bit is, they then said, "Y'know, we've never managed an Internet before, here's our draft RFC, and we'd like to hear comments from the Internet community on it, until this date, at which point we'll decide what we're going to do."

What they ultimately came up with is a classic governmental compromise; that is, they did something that is neither fair nor just, but is neither unjust enough to really piss off the people who were previously getting screwed (that'd be the public), nor just enough to really piss off the people doing the screwing (I'll leave that up to you to guess :-). Network Solutions got to keep the large (certainly tens of millions, maybe hundreds of millions, I'm sure you could learn more by some diligent googling) sums of money they'd, ahem, acquired and were not spanked. (Though a later lawsuit finally forced Network Solutions to both drop their rate back down to $35/year and return some of the overcharge. The extra $15 in the $50, instead of $35, was supposedly to be spent on buffing up the network infrastructure, but never was.)

However, Network Solutions lost their monopoly, and other people were allowed into the game. Here's how:

A not-for-profit was set up, spun off from Network Solutions in some manner (though I don't recall the exact details) to manage and maintain the infrastructure aspect of the registry and servers for the domains that Network Solutions was responsible for (.com, .net, .org being the most well-known, though they also maintained .edu and .gov and some others at the time; I'm not sure if they still do).

Meanwhile, the customer service aspect was made public and competitive. Network Solutions itself remained in the registrar business, but other companies could, by meeting certain requirements, get into the registrar business. These weren't trivial requirements, by the way; they involved some serious financial resources. For a while there still wasn't that much competition and the price still didn't drop significantly (other than that drop back down to $35 a year).

However, somebody managed to become a registrar and immediately turned around and started acting as a wholesale registrar, selling retail franchises quite cheaply, and the prices quickly got down to $15 and in some cases even $10 a year.

Recommendations

EasyDNS was highly recommended to me. I don't know if they provide webmail or even mail forwarding. You may want to consider using two separate services, one for DNS and one for basic hosting, mail forwarding, etc.

Back to my theme of "There Is No Security", you simply cannot and never will be able to count on any one company, regardless of how highly recommended they are at the time you choose to go with them. Companies change, grow, fall, the world changes, constantly.

This is something I first started noticing as the ISP market started exploding in the mid-nineties. All of these ISPs were providing dial-up service, which involves mostly phone lines and a connection to the internet and some hardware in between. The hardware in between is relatively speaking much easier to deal with - it just involves technological expertise and competence - whereas the other two involve dealing with other people and logistics.

For example, a lot of these ISPs started out small, often out of residential locations, which at the time were by default wired for up to 16 phone lines. Most customers would dial up only for a few hours each day. Later, as the net went even more mainstream and Ma and Pa Kettle got into the act, an hour a day, or an hour every couple of days.

That meant a small ISP could easily support several times 16 customers - typical "oversell" rates accepted in the ISP field over the last half of the nineties (after which I fell out of touch with this topic) were at least 3-to-1 and often up to 8-to-1. In our example with 16 lines, this would mean the ISP expected to provide reasonably good quality of service to up to 48 users. Or, if the ISP got greedy or desperate, acceptable (as in too much trouble to change providers or just forget about this newfangled internet thing) quality of service to 128 users.

The latter number, by the way, generally sucked.

The thing is, this or similar issues happened to every ISP. All ISPs, somewhere along the line, hit growth issues and started to suck (except AT&T worldcom, which started big, but STILL sucked, because it was run with a big company mentality, and a big PHONE company mentality to boot (i.e. you'll take what we offer you and you'll like it, because we're the phone company)).

Lest I sound too cruel and harsh to the big phone companies, I must say I have great amounts of respect for the phone company employees back in the days when the credo was "universal service". The fact that the US landline phone system quality and societal reach is still largely unmatched in the world, even in what we laughably think of as "the first world", is a testimony to the dedication and service philosophy of these folks...

Sadly, the US cellular system sucks rocks for various reasons, mostly deriving from the fact that the US gov't treats cellular bandwidth licenses as a monstrous cash cow, completely betraying the original purpose of the FCC, which was to manage a limited, common resource (radio spectrum) for the good of the nation, NOT to make lots of cash.

But getting back to why this happened to every ISP:

This happened because the Internet got more popular, and every ISP went through growth, and every ISP started exceeding its capacity. Obviously you want to order more phone lines, no problem. Except that it isn't that easy; new phone lines have to be ordered MONTHS in advance, and they cost not-insignificant money. So, inevitably, every one of these companies would start losing ground in the quality of service area, the ratio of resources to customers. The company that was rock solid last year started sucking this year. And the company that sucked rocks last year got its shit together and started performing well this year.

(Frankly, one thing I'd really like to see in broadband, for example DSL, is some dynamic reconfiguration ability. It should be possible for my DSL modem to monitor network performance and decide the current DSL provider is losing ground, bid out for other services and switch configuration to another provider for a few minutes, and then switch to another, etc. Of course, at the moment the DSL protocols and DSLAMs and etc don't support anything at all like this. My point is simply that there's no technical reason this couldn't be done, and it would truly encourage competition and innovation by putting the choice back in the hands of the customers.)

Also, and more pernicious (in my opinion) it was not uncommon, as these companies grew, for them to reach a growth point at which they went more formal, more corporate, often were bought out by various silver-haired business eminences. I know for a fact, for example, from a friend who worked there, that this happened to Netcom. Netcom was one of the earliest public ISPs, originally started as a sort of bandwidth co-op by a crowd of hackers, until it was bought out by corporate types.

Inevitably, these eminences (now speaking back in the generic case, again; I know nothing of the inner workings of Netcom) would then start to do various stupid things, in many cases degrading the service quality, in order to serve business agendas. Often mis-guided business agendas, or personal agendas contrary to the actual agenda of the business itself, though certainly not always. Businesses are not, after all, in business to provide a service to customers; they're in business to generate money (in theory for the shareholders, but in reality for the people involved in making the decisions).

The point we return to is, you must always assume that things will change and that as a result of this, sooner or later, things will go wrong (for you). The only defense against this, in my opinion, is enabling greater choice and flexibility for the public, as in the DSL suggestion above. Not that this will prevent things from going wrong, but it will certainly tighten the financial feedback loop that will encourage people to clean up their act, and it will minimize the damage when things do inevitably go wrong.

(In fact, if you look hard, and squint the right way, you can maybe see that this lies at the core of the genius of internet technology. Packets and traffic flow around obstructions quickly and dynamically.)

Sadly, we are not there, and frankly, I'm not seeing a trend in that direction; quite the reverse, I'm seeing a trend towards lock-in.

