The Domain Name System

by Steven J. Owens (unless otherwise attributed)

An Explanation of the Details, History and Philosophy of the Domain Name System (DNS)

Or: MAC addresses and IP addresses and Domain Names, Oh My!

This started life as two separate articles, which I've quickly and clumsily merged. I'm going to start with the gory details of the basic technology. Below, I'll talk about how the whole system is organized, and how it got that way, and how that should affect your philosophy.

Gory Details

The part that plugs your computer into the network is called the ethernet card, aka NIC (Network Interface Card). Ethernet isn't the only kind of networking, but it's the most popular.

Now, every ethernet card, when it's manufactured, has a MAC address set for it. Essentially, every ethernet card in existence has a unique serial number built into it, called the MAC address. Why is it called a MAC address? Good question, lemme look it up... Here's a fun URL, it appears to let you identify the manufacturer from your ethernet card's MAC:

http://www.coffer.com/mac_find/

Here's a more technical discussion of MAC addresses:

http://www.erg.abdn.ac.uk/users/gorry/course/lan-pages/mac-vendor-codes.html

Ah, reading here, the protocol that ethernet uses to encode data is called the "Medium Access Control" protocol, so the MAC address is the address under that protocol.
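As a rough sketch of how a manufacturer lookup like the one above can work: the first three bytes of a MAC address are a prefix assigned to the manufacturer, and the last three are the serial number the manufacturer picks. The function name and the sample address here are made up for illustration.

```python
def split_mac(mac):
    """Split a MAC address into (manufacturer prefix, device part).

    Accepts the common colon- or dash-separated hex notation.
    """
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError("expected six two-digit hex octets: %r" % mac)
    for o in octets:
        int(o, 16)  # raises ValueError if an octet is not valid hex
    return ":".join(octets[:3]), ":".join(octets[3:])


prefix, device = split_mac("00-1A-2B-3C-4D-5E")
print(prefix)  # the part a vendor lookup would key on
print(device)  # the device-specific part
```

A real lookup site would then match the prefix against the published list of manufacturer prefixes, which this sketch doesn't include.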

Okay, so you plug an ethernet card into a local area network, and the network gets data to the ethernet card by putting the MAC address on it. Essentially it's as if the cards were all transmitting many short messages, each message starting with something like "This message is for MAC address 12345". All the ethernet cards on that local area network hear everything, but only the ethernet card that cares about that MAC address pays attention to messages that start with that MAC address.

So that's what the local network uses to get bits along the wire from one machine (the hub) to another.
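Here's a toy simulation of that "everybody hears it, only the addressee listens" behavior. It's a sketch, not how a driver is written: the class and function names are invented, and it ignores things like broadcast addresses and switches.

```python
class Nic:
    """A pretend ethernet card that only keeps frames addressed to it."""

    def __init__(self, mac):
        self.mac = mac
        self.inbox = []

    def hear(self, dest_mac, payload):
        # Every card on the segment hears every frame...
        if dest_mac == self.mac:
            # ...but only the addressee keeps it.
            self.inbox.append(payload)


def transmit(segment, dest_mac, payload):
    """Put one frame on the wire: every card on the segment hears it."""
    for nic in segment:
        nic.hear(dest_mac, payload)


a = Nic("00:00:00:00:00:0a")
b = Nic("00:00:00:00:00:0b")
transmit([a, b], "00:00:00:00:00:0b", "hello b")
print(a.inbox, b.inbox)
```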

Now, one level up, in terms of organizational hierarchy, is the IP address. The infamous "dotted-quad", like 192.168.1.1.

(By the way, any time you see an address like 192.168.xxx.xxx, it's a private network address. The 192.168.xxx.xxx range is never assigned to anybody. It's meant to be used in-house, and then the gateway machine between the private network and the public internet translates addresses going in and out. There are a few other private IP ranges, like 10.xxx.xxx.xxx and 172.16.xxx.xxx through 172.31.xxx.xxx, plus 169.254.xxx.xxx, which machines assign themselves when nothing else hands them an address. Practically speaking, a machine on a private network can have any IP address you want, but the idea is that the official private IP ranges mean you can tell just from looking at an IP address whether it's private or not.)
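If you'd rather have a program do the "just from looking at it" part, here's a minimal sketch using Python's standard ipaddress module. The list of blocks and the function name are mine; the 172 block is written as 172.16.0.0/12, which covers 172.16.xxx.xxx through 172.31.xxx.xxx.

```python
import ipaddress

# The ranges named in the text. 169.254.0.0/16 is the self-assigned
# "link-local" range rather than a private range in the strict sense.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("169.254.0.0/16"),
]


def looks_private(address):
    """Return True if the dotted-quad falls in one of the blocks above."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)


for addr in ("192.168.1.1", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
    print(addr, looks_private(addr))
```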

In general, an IP address uniquely identifies a single given machine at any one time. It's generally supposed to be one IP address per machine (not counting machines on private networks), but of course in the past seven years, that got considerably mucked up. An IP address identifies a specific computer at any given instant. Or more properly, a specific network card at any given instant, which is almost always a one-to-one mapping, but not always. For example the gateway machine I mentioned above would probably have two network cards, one speaking to the private network and one speaking to the public internet. The first generation of webservers that served multiple domain names off one box used multiple IP addresses to identify the same NIC, one per domain name, and the webserver figured out what page to give back based on the IP address the request came in on. Nowadays they've enhanced the HTTP protocol so the browser includes the domain name in the request itself, and the webserver can look at that.
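To make that last bit concrete, here's a sketch of a webserver picking a site by the domain name in the request. The browser sends the name in a "Host" header; the site table and the function name below are hypothetical, and a real webserver does a great deal more than this.

```python
# Hypothetical sites hosted on one box, keyed by domain name.
SITES = {
    "foo.com": "<h1>Welcome to foo</h1>",
    "bar.net": "<h1>Welcome to bar</h1>",
}


def pick_site(raw_request):
    """Return the page for whichever domain the request's Host header names."""
    for line in raw_request.split("\r\n")[1:]:
        if line == "":
            break  # a blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop an optional ":port" suffix and normalise case.
            host = value.strip().lower().split(":")[0]
            return SITES.get(host)
    return None


request = "GET / HTTP/1.1\r\nHost: bar.net\r\n\r\n"
print(pick_site(request))
```

Both sites can share one IP address, because the webserver no longer needs the address to tell them apart.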

So the IP address is like the MAC address, except that it isn't permanent. Sometimes an IP address is semi-permanently assigned to a particular use; this is called a "static IP address", but it's still not at all permanent in a hardware sense. It's just that nobody else is supposed to answer to that IP address. IP addresses can be dynamically assigned per session (the kind you usually get with a dialup or with a consumer-level dsl/cable modem) or statically assigned. An IP address is a temporarily assigned number, that identifies a machine at the Internet level of organization.

In fact there's nothing to prevent somebody from just screwing up and assigning the same IP address to two machines, which can cause all sorts of confusion and poor performance on your local network, not to mention flakiness.

The general nature of static IP addresses is that they are defined in a simple config file that is often (even usually) edited by hand. So, they tend to stay that way until somebody changes them.

Dynamic IP addresses are usually assigned in a range, e.g. 192.168.1.9 to 192.168.1.16. Each ISP is assigned a block of IP addresses. This is typically done in a hierarchical fashion - a large "backbone" provider is assigned a very large block of IP addresses, they hand out smaller chunks of that block to their customers, who hand out smaller chunks to their customers. This turns out to be somewhat important, because of how "reverse DNS" works (see below).
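The carving-up is easy to play with in Python's standard ipaddress module. This is only an illustration: I'm using a private block as a stand-in for a backbone provider's allocation, and the chunk sizes are picked arbitrarily.

```python
import ipaddress

# Illustrative only: a private /8 stands in for a backbone provider's block.
backbone = ipaddress.ip_network("10.0.0.0/8")

# The backbone carves its block into /16 chunks for its customers...
regional = list(backbone.subnets(new_prefix=16))

# ...and one of those customers carves its /16 into /24 chunks in turn.
local = list(regional[3].subnets(new_prefix=24))

print(len(regional), regional[3])    # how many /16 chunks, and the fourth one
print(len(local), local[7])          # how many /24 chunks, and the eighth one
print(local[7].subnet_of(backbone))  # each chunk still sits inside its parent
```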

Three or four levels down, a range of IP addresses is assigned to a DHCP server that belongs to the ISP (Dynamic Host Configuration Protocol - a host being any machine connected to the net). The DHCP server works at the infrastructure level of the network. When your machine connects, it broadcasts a DHCP request for an address. The DHCP server responds with an IP address that is "leased" to your machine for a given time (minutes, hours, days). When the lease expires, the DHCP server erases the entry in its records that says it was assigned to your machine, and may assign it to somebody else. Hopefully your machine erases the lease at the same time (it can be a real pain when it doesn't).

Lately, small office routers and firewalls come with built-in DHCP servers. Many of these provide an option to tell the DHCP server to always assign a given IP address to a given MAC address. So we have a sort-of-static, dynamically-assigned IP address. But at least it's centrally managed.
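Here's a toy model of the bookkeeping a DHCP server does, including the "always give this MAC that address" option. It's a sketch of the record-keeping only, not the actual DHCP wire protocol; the class name, method names and MAC addresses are all invented, and time is passed in by hand so you can watch leases expire.

```python
import ipaddress


class LeasePool:
    """Toy model of a DHCP server's bookkeeping (not the wire protocol)."""

    def __init__(self, first, last, lease_seconds, reservations=None):
        start = int(ipaddress.ip_address(first))
        end = int(ipaddress.ip_address(last))
        self.pool = [str(ipaddress.ip_address(n)) for n in range(start, end + 1)]
        self.lease_seconds = lease_seconds
        self.reservations = dict(reservations or {})  # MAC -> fixed IP
        self.leases = {}  # IP -> (MAC, expiry time)

    def _expire(self, now):
        # Forget leases whose time has run out so the address can be reused.
        self.leases = {ip: (mac, exp) for ip, (mac, exp) in self.leases.items()
                       if exp > now}

    def request(self, mac, now):
        """Lease an address to `mac` at time `now` (seconds); None if full."""
        self._expire(now)
        reserved = set(self.reservations.values())
        if mac in self.reservations:
            candidates = [self.reservations[mac]]
        else:
            # Renew an existing lease if there is one, else take a free address.
            held = [ip for ip, (m, _) in self.leases.items() if m == mac]
            candidates = held or [ip for ip in self.pool
                                  if ip not in self.leases and ip not in reserved]
        if not candidates:
            return None
        ip = candidates[0]
        self.leases[ip] = (mac, now + self.lease_seconds)
        return ip


pool = LeasePool("192.168.1.9", "192.168.1.16", lease_seconds=3600,
                 reservations={"aa:aa:aa:aa:aa:aa": "192.168.1.16"})
print(pool.request("bb:bb:bb:bb:bb:bb", now=0))     # first free address
print(pool.request("aa:aa:aa:aa:aa:aa", now=10))    # the reserved address
print(pool.request("cc:cc:cc:cc:cc:cc", now=4000))  # old leases have expired
```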

Domain Names

Okay, now we come to domain names. Domain names are basically a logical identifier assigned to a given IP address. Actually, let me rephrase that.

Domain names are human-readable, logical identifiers. In some cases not too human-readable :-).

Domain names are partly a mnemonic device, partly a logical structure.

A domain defines a logical name. The DNS system defines what IP address answers to that logical name. The logical names are defined in a fairly human-readable format, which we know as foo.com, bar.net, and so forth.

How DNS works is, you type in a domain name like www.google.com in your browser bar. One part of your browser handles DNS queries. It keeps track of domain names you've asked for addresses for, recently, and the IP that it eventually tracked down for that domain.

(In case you're wondering what the heck "foo" is: Foo and bar are generic variables that programmers use in conversation, sort of like somebody might say "John Doe", only less specific (since we usually mean a person when we say "John Doe", where foo and bar can be anything). The jargon term for them is "metasyntactic variable". This generally comes from the expletive acronym "fubar", which dates from World War II: Fucked Up Beyond All Recognition. It's explained further at:

http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=foo

Foldoc, by the way, is a generally useful information source. FOLDOC == Free On Line Dictionary Of Computing.)

Your computer has a DNS resolver, which basically has two jobs: first, to send a request to the Domain Name System (DNS) server to get the IP that a particular domain name points to. Second, to remember the recently requested IP addresses, to avoid having to go out and re-request all the time.
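Those two jobs fit in a few lines of Python. This is a minimal sketch, not how any particular operating system does it: the class name is mine, the five-minute default is an arbitrary assumption, and the "server" is a stand-in function so the example runs without a network.

```python
import time


class CachingResolver:
    """Remember answers for `ttl` seconds; ask `lookup` only on a miss."""

    def __init__(self, lookup, ttl=300, clock=time.monotonic):
        self.lookup = lookup  # function: domain name -> IP string
        self.ttl = ttl
        self.clock = clock    # injectable so the example is testable
        self.cache = {}       # name -> (ip, time the entry expires)

    def resolve(self, name):
        now = self.clock()
        hit = self.cache.get(name)
        if hit and hit[1] > now:
            return hit[0]                        # still fresh: nothing sent
        ip = self.lookup(name)                   # job one: go ask the server
        self.cache[name] = (ip, now + self.ttl)  # job two: remember the answer
        return ip


# Stand-in for a real DNS server.
asked = []


def fake_server(name):
    asked.append(name)
    return {"foo.com": "192.168.1.1"}[name]


resolver = CachingResolver(fake_server)
resolver.resolve("foo.com")
resolver.resolve("foo.com")
print(asked)  # the server was only consulted for the first call
```

In real use you could pass something like Python's socket.gethostbyname as the lookup function instead of the stand-in.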

Now, your browser has a DNS cache - I'm not sure, it may be built into Windows by now. I'm reliably informed that on modern Mozilla browsers (e.g. Firefox, Mozilla, AOL's Netscape distribution) the DNS cache runs as a separate process. On Linux, a separate program handles this.

In your local network, there's one specific machine that has the job of doing this for the local network. It's a lot like calling information on your phone. You call information, give them the name, they give you the number, then you call the number.

Your local DNS server does exactly the same thing - it keeps a list of addresses recently requested, and anything it can't handle, it sends a request up the line - usually at this point it's your organization's ISP DNS that gets the request. That DNS server keeps track of the numbers it's looked up recently, so it doesn't have to go up the hierarchy for every single request.
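Here's that "ask up the line, remember the answer" chain as a toy model. The server labels are invented, nothing ever expires in this sketch, and real DNS servers are more complicated than a simple chain, but it shows why the second request never leaves the building.

```python
class DnsServer:
    """Toy caching server: answers from memory, else asks the next one up."""

    def __init__(self, label, upstream=None, records=None):
        self.label = label
        self.upstream = upstream          # next server up the line, if any
        self.known = dict(records or {})  # name -> IP

    def query(self, name, trail):
        trail.append(self.label)          # record who got asked
        if name not in self.known:
            if self.upstream is None:
                return None               # nobody above us to ask
            answer = self.upstream.query(name, trail)
            if answer is None:
                return None
            self.known[name] = answer     # remember it for next time
        return self.known[name]


top = DnsServer("top", records={"foo.com": "192.168.1.1"})
isp = DnsServer("isp", upstream=top)
office = DnsServer("office", upstream=isp)

first, second = [], []
office.query("foo.com", first)
office.query("foo.com", second)
print(first)   # the first request travels all the way up
print(second)  # the repeat is answered locally
```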

This hierarchy repeats all the way up the line to the "root level DNS servers", which used to be maintained by Network Solutions, aka the InterNIC (Internet Network Information Center). Since the FTC split everything, they split maintaining the root level servers out from actually adding new names (in theory at least, though both organizations are still heavily intertwined with each other).

Adding new names is now referred to as being a registrar, and is meant to be competitive and customer-service oriented. Maintaining the root level servers is supposed to be not-for-profit and infrastructure related.

So what you do is, through your registrar, you tell the root level servers, "this machine over here is responsible for keeping track of what IP address SAGReiss.com points to." Usually that machine belongs to your registrar or your ISP, and you then tell them to have it point to whatever IP your site lives on.
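The two-step arrangement looks something like this in miniature. All the names and addresses are hypothetical, and I've kept the registry as a plain table of "who is responsible for this domain", which glosses over a lot of real-world detail.

```python
# Hypothetical data. The registry only knows which name server is
# responsible for a domain, not where the site itself lives.
REGISTRY = {"foo.com": "ns1.example-host.net"}

# Each responsible name server holds the actual name -> IP records.
NAME_SERVERS = {
    "ns1.example-host.net": {"foo.com": "192.168.1.1",
                             "www.foo.com": "192.168.1.1"},
}


def resolve(name):
    """Follow the delegation: registry -> responsible server -> IP."""
    domain = ".".join(name.split(".")[-2:])  # naive: just the last two labels
    server = REGISTRY.get(domain)
    if server is None:
        return None  # nobody has registered this domain
    return NAME_SERVERS[server].get(name)


def move_site(domain, name, new_ip):
    """Changing hosts only touches the name server, never the registry."""
    NAME_SERVERS[REGISTRY[domain]][name] = new_ip


print(resolve("www.foo.com"))
move_site("foo.com", "www.foo.com", "192.168.1.2")
print(resolve("www.foo.com"))
```

The point of the indirection is in move_site: you can shuffle your site between IP addresses all day without going back to the registrar.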

One nuance to be aware of is, because of all of those lists-of-recently-requested-domains (aka DNS caches), if you make a change, it can take a while for the DNS hierarchy to catch up with it. Sometimes days. In the old days, typically 2-4 days. These days it generally seems faster... it all depends on how long the individual DNS servers in the hierarchy are configured to keep information in their cache. But I've recently heard people complain about a change taking days, which may have been the DNS system, or may have been organizational holdups at their ISP, who knows.

Bear in mind that all of the DNS servers are privately owned by different people. The root level servers are owned by the not-for-profit which was set up by the FTC - a gov't organization - and ICANN. ICANN is responsible for handing out blocks of IP addresses to the top of the hierarchy, as well as (apparently) setting policy for how things like the not-for-profit are handled.

Reverse DNS

This is a topic that's becoming more important, especially for sending email, these days. Ordinary DNS takes a domain name (foo.com) and converts it to an IP address (192.168.1.1). Reverse DNS takes an IP address and tells you what domain name is assigned to it.
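One background detail that isn't spelled out above: a reverse lookup is really an ordinary DNS query for a specially built name, made by writing the IP address backwards and sticking "in-addr.arpa" on the end. Python's standard ipaddress module will build that name for you; the wrapper function name is mine.

```python
import ipaddress


def reverse_name(address):
    """Return the name a reverse DNS lookup for `address` would ask about."""
    return ipaddress.ip_address(address).reverse_pointer


print(reverse_name("192.168.1.1"))
```

Writing the address backwards puts the biggest block first when you read the name from right to left, the same way "com" comes last in foo.com.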

Why would you care? Mainly, as it turns out, because lots of machines run Microsoft's OS, and funnily enough, these machines (that run Microsoft's OS) seem to get routinely subverted and used to send out spam. Any machine can be used to open a network connection to another machine, and pretend to be an email server (MTA), and ask it to accept a few million spam emails for delivery to addresses on that machine. These spam emails inevitably have a "from" address that is from a domain completely unconnected with the subverted Microsoft machine (or with the spammer, for that matter).

A lot of mail servers, these days, are configured to check the reverse DNS of the IP address when they get an incoming email connection. If the domain name on the email doesn't match the domain name assigned to the IP address, or if there's no reverse DNS listing, they drop the email (and often send back an error message: "Who? Never heard of 'em.")
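Here's a sketch of that check. Real mail servers differ in exactly how strict they are, so treat the matching rule below as one plausible assumption, not the rule; the function name and the little reverse DNS table are made up so the example runs without a network.

```python
def accept_mail(client_ip, claimed_domain, reverse_lookup):
    """Decide whether to accept a connection, per the check described above.

    `reverse_lookup` maps an IP string to a host name, or returns None
    when the address has no reverse DNS listing.
    """
    host = reverse_lookup(client_ip)
    if host is None:
        return False  # no reverse DNS listing at all
    host = host.lower().rstrip(".")
    domain = claimed_domain.lower().rstrip(".")
    # Accept "foo.com" itself or any host under it, e.g. "mail.foo.com".
    return host == domain or host.endswith("." + domain)


# Made-up reverse DNS table standing in for real lookups.
PTR = {"192.168.1.1": "mail.foo.com"}

print(accept_mail("192.168.1.1", "foo.com", PTR.get))   # names match
print(accept_mail("192.168.1.1", "bar.net", PTR.get))   # names don't match
print(accept_mail("192.168.1.99", "foo.com", PTR.get))  # no listing
```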

Remember when I said, above, that IP address blocks are usually handed out in a hierarchical fashion? This hierarchy works a lot the way the regular domain name hierarchy works; somebody is in charge of the main block of IP addresses, and they tell you who's in charge of the smaller block, etc. Eventually they get to the ISP that assigned you the IP address, and hopefully if you're meant to be sending email from your machine, they've got a reverse DNS listing set up for your machine's IP address.

While it's possible to generate fake IP packets with bogus source IP addresses, it's less trivial than just cracking one of the millions of random, weakly-secured Microsoft machines.

History and Philosophy

Before I get into the details, there's one thing that you should bear in mind:

There Ain't No Such Thing As A Free Lunch (no matter what your contract says)

You have to temper your expectations with an understanding of the real world. Partly this means you have to expect people to fall down now and again, and deal with it. Partly it means you have to understand that energy is energy, whether it's thermal or financial, and if the energy ain't there, the energy ain't there. Any system that's broken is going to run down, sooner or later.

With DNS, there are basically three separate services you are getting. Before I explain that, let me take a moment to explain some basics of how the DNS system works.

The domains are all maintained in a set of databases on a set of "top level domain" (TLD) servers. The entire DNS system is essentially a hierarchy. When you configure your PC for the 'net, you tell it what numeric IP address to go to for DNS information. That server in turn has an address it goes to when it doesn't have the answer, and so on, all the way up the hierarchy until it gets to the top level. (It's slightly more complicated than this, see the explanation of the third part, below).

By the way, these servers also remember the answers they got, for a short while, and instead of going up the line, they just give you the answer they got the last time they went up the line. This means that when things change, it can take a while for the new data to percolate all the way down. How long is a short while? It used to be a few days, lately I've heard that a lot of DNS servers are configured more in the hours range. Note that if every DNS server was set to one day, it might still take several days for the data to percolate down.
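The "several days" arithmetic can be sketched like this. It's a deliberately simplified model: I'm assuming each server in the chain starts its own timer from scratch whenever it fetches an answer from the one above it, which is the pessimistic reading of the paragraph above.

```python
def worst_case_delay_hours(cache_hours):
    """Worst-case staleness if each layer restarts its own timer.

    `cache_hours` lists how long each caching server in the chain keeps
    an answer, from the top of the chain down to the one you ask.
    """
    # Each layer can fetch the old answer just before the layer above it
    # lets go of it, so in the worst case the waits add up.
    return sum(cache_hours)


print(worst_case_delay_hours([24, 24, 24]) / 24)  # in days
```

So three servers that each hold on for a day can, with bad timing, keep handing out the old answer for about three days.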

I mention this second fact mainly because it's something that sort of catches people off guard, a lot of the time, and understanding that will help you understand what's going on if you ever change your domain name stuff.

So, three services. The first two are the original DNS stuff, which consists of a) maintaining the big database(s) of domain names and the hardware for the top level of DNS servers and b) the customer service aspect of getting domain names into the big database(s), and making changes to domain names already in the big database(s). (see the section below "The History" for fun history facts).

These services are in fact quite easy to provide for a very reasonable, low cost. I would generally recommend never paying more than $15/year for registering a domain, and in fact you might even be able to find it cheaper ($10 isn't wholly unreasonable). However, that's not all there is to it.

The third service is that the registry doesn't really contain the numeric IP address of the server where your domain lives. Instead, it's a bit more indirect (which helps spread the work around). The registry contains the numeric IP address of a "primary DNS server", the DNS server that has the primary responsibility for keeping track of what numeric IP address goes with your domain.

In the early days, when everybody involved was a research organization (either at a university or at various high tech defense firms and then later just high tech firms), you handled this yourself by setting up a primary DNS server before you did any of this, or by getting somebody else who already had one to be your primary.

As the Internet became more public and inevitably more commercial, ISPs provided this service for the domains they hosted. Eventually, people started offering domain hosting just by itself as their business, instead of offering dialup and domain hosting both. Eventually, primary DNS hosting was split out as a separate service (called "domain parking" at the time, since technically you're not allowed to have a domain without having a box somewhere that answers to it), and then eventually as separate companies that just provide DNS hosting.

Which is where we come to companies like NameZero and EasyDNS and Nominum. Now, a lot of companies offer additional services that are somewhere between real domain hosting and simply primary DNS hosting; email-forwarding, a place-holder webpage, etc. Even this stuff, in my semi-competent opinion, should not cost much per year; email is a drop in the bucket for resources, these days. (Although of course it's possible to swamp any channel -- start spamming a few million people a day with binary attachments and you'll do some serious damage to your provider's bandwidth, before they yank your account.) However, there is a certain minimum overhead associated with setting up and running such an organization.

The History

Now, back in the early days, the NSF (National Science Foundation), which was providing grant funding for a lot of the Internet stuff then, contracted with a company called Network Solutions to handle all of this, creating an organization/service called the InterNIC (NIC being an acronym for Network Information Center).

When the Internet became much more public and much more popular, this turned into a lucrative monopoly for Network Solutions (in part because, some people would argue (and I'd agree), Network Solutions exceeded their authority and went from charging a one-time fee (of $35) for registering a domain name to charging a yearly, excessively expensive fee ($50, first two years up front) for registering a domain name and for maintaining existing domain names). Network Solutions also did (and continues to do) a lot of other questionable things, like completely ignoring the guidelines (.com for commercial, .org for nonprofit, .net for network infrastructure), and a great variety of other abuses which I'm sure you can learn a lot about by just googling for "internic sucks" or "network solutions sucks".

Eventually this all came to a head and the NSF actually did something fairly smart (for them, that is... for us, it kinda sucks that they didn't give Network Solutions the spanking they so richly deserved). The NSF decided "Hey, y'know, this is all turning into commerce, and we do science, not commerce. So we're just going to let the contract with Network Solutions run out later this year and not renew it, and some other governmental organization might want to be thinking about stepping up and doing something."

That other government organization was the FTC (Federal Trade Commission), which acted with unusual (for a gov't bureaucracy - I don't know, this may actually be characteristic of the FTC) swiftness and even more unusual creativity - they stepped up and said, "Hey, we do commerce, so that makes this our bailiwick." The truly startlingly creative bit is, they then said, "Y'know, we've never managed an Internet before, here's our draft RFC, and we'd like to hear comments from the Internet community on it, until this date, at which point we'll decide what we're going to do."

What they ultimately came up with is a classic governmental compromise; that is, they did something that is neither fair nor just, but is neither unjust enough to really piss off the people who were previously getting screwed (that'd be the public), nor just enough to really piss off the people doing the screwing (I'll leave that up to you to guess :-). Network Solutions got to keep the large (certainly tens of millions, maybe hundreds of millions, I'm sure you could learn more by some diligent googling) sums of money they'd, ahem, acquired and were not spanked. (Though a later lawsuit finally forced Network Solutions to both drop their rate back down to $35/year and return some of the overcharge; the extra $15 in the $50, instead of $35, was supposedly to be spent on buffing up the network infrastructure, but never was.)

However, Network Solutions lost their monopoly, and other people were allowed into the game. Here's how:

A not-for-profit was set up, spun off from Network Solutions in some manner (though I don't recall the exact details) to manage and maintain the infrastructure aspect of the registry and servers for the domains that Network Solutions was responsible for (.com, .net, .org being the most well-known, though they also maintained .edu and .gov and some others at the time; I'm not sure if they still do).

Meanwhile, the customer service aspect was made public and competitive. Network Solutions itself remained in the registrar business, but other companies could, by meeting certain requirements, get into the registrar business. These weren't trivial requirements, by the way, they involved some serious financial resources. For a while there still wasn't that much competition and the price still didn't drop significantly (other than that drop back down to $35 a year).

However, somebody managed to get a registrar and immediately turned around and started acting as a wholesale registrar, selling retail franchises quite cheaply, and the prices quickly got down to $15 and in some cases even $10 a year.

Recommendations

EasyDNS was highly recommended to me. I don't know if they provide webmail or even mail forwarding. You may want to consider using two separate services, one for DNS and one for basic hosting, mail forwarding, etc.

Back to my theme of "There Is No Security", you simply cannot and never will be able to count on any one company, regardless of how highly recommended they are at the time you choose to go with them. Companies change, grow, fall, the world changes, constantly.

This is something I first started noticing as the ISP market started exploding in the mid-nineties. All of these ISPs were providing dial-up service, which involves mostly phone lines and a connection to the internet and some hardware in between. The hardware in between is relatively speaking much easier to deal with - it just involves technological expertise and competence - whereas the other two involve dealing with other people and logistics.

For example, a lot of these ISPs started out small, often out of residential locations, which at the time were by default wired for up to 16 phone lines. Now, most customers would dial up only for a few hours each day. Later, as the net went even more mainstream and Ma and Pa Kettle got into the act, an hour a day, or an hour every couple of days.

That meant a small ISP could easily support several times 16 customers at once - typical "oversell" rates accepted in the ISP field over the last half of the nineties (after which I fell out of touch with this topic) were at least 3-to-1 and often up to 8-to-1. In our example with 16 lines, this would mean the ISP expected to provide good quality of service to up to 48 users. Or, if the ISP got greedy or desperate, acceptable (as in too much trouble to change providers or just forget about this newfangled internet thing) quality of service to 128 users.

The latter number, by the way, generally sucked.

Now, the thing is, this or similar issues happened to every ISP. All ISPs, somewhere along the line, hit growth issues and started to suck (except AT&T worldcom, which started big, but STILL sucked, because it was run with a big company mentality, and a big PHONE company mentality to boot: you'll take what we offer you and you'll like it, because we're the phone company (*)).

(* Lest I sound too cruel and harsh to the big phone companies, I must say I have great amounts of respect for the phone company employees back in the days when the credo was "universal service". The fact that the US landline phone system is still largely unmatched in the world, even in what we laughably think of as "the first world", for quality and societal reach is a testimony to the dedication and service philosophy of these folks... Sadly, the US cellular system sucks rocks for various reasons (mostly deriving from the fact that the US gov't treats cellular bandwidth licenses as a monstrous cash cow, completely betraying the original purpose of the FCC).)

But getting back to why this happened to every ISP:

This happened because the Internet got more popular, and every ISP went through growth, and every ISP started exceeding its capacity. Obviously you want to order more phone lines, no problem. Except that it isn't that easy; new phone lines have to be ordered MONTHS in advance, and they cost not-insignificant money. So, inevitably, every one of these companies would start losing ground in the quality of service area, the ratio of resources to customers. The company that was rock solid last year started sucking this year. And the company that sucked rocks last year got its shit together and started performing well this year.

(Frankly, one thing I'd really like to see in broadband, for example DSL, is some dynamic reconfiguration ability. It should be possible for my DSL modem to monitor network performance and decide the current DSL provider is losing ground, bid out for other services and switch configuration to another provider for a few minutes, and then switch to another, etc. Now, of course, at the moment the DSL protocols and DSLAMs and etc don't support anything at all like this. My point is simply that there's no technical reason this couldn't be done, and it would truly encourage competition and innovation by putting the choice back in the hands of the customers.)

Also, and more pernicious (in my opinion) it was not uncommon, as these companies grew, for them to reach a growth point at which they went more formal, more corporate, often were bought out by various silver-haired business eminences. I know for a fact, for example, from a friend who worked there, that this happened to Netcom. Netcom was one of the earliest public ISPs, originally started as a sort of bandwidth co-op by a crowd of hackers, until it was bought out by corporate types.

Inevitably, these eminences (now speaking back in the generic case, again; I know nothing of the inner workings of Netcom) would then start to do various stupid things, in many cases degrading the service quality, in order to serve business agendas. Often mis-guided business agendas, or personal agendas contrary to the actual agenda of the business itself, though certainly not always. Businesses are not, after all, in business to provide a service to customers; they're in business to generate money (in theory for the shareholders, but in reality for the people involved in making the decisions).

The point we return to is, you must always assume that things will change and that as a result of this, sooner or later, things will go wrong (for you). The only defense against this, in my opinion, is enabling greater choice and flexibility for the public, as in the DSL suggestion above. Not that this will prevent things from going wrong, but it will certainly tighten the financial feedback loop that will encourage people to clean up their act, and it will minimize the damage when things do inevitably go wrong.

(In fact, if you look hard, and squint the right way, you can maybe see that this lies at the core of the genius of internet technology. Packets and traffic flow around obstructions quickly and dynamically.)

Sadly, we are not there, and frankly, I'm not seeing a trend in that direction; quite the reverse, I'm seeing a trend in the opposite direction, towards lock-in.

