Java Web App Development: Browser State

by Steven J. Owens (unless otherwise attributed)


Browser State

In this modern age of AJAX and JavaScript, some of the following may seem quaint, but these are still the fundamentals of how the browser handles state.

Bouncing Data Off The Browser

Browsers are mostly stateless, but there's a difference between mostly stateless and completely stateless. Everybody, sooner or later, finds it useful to push a little state to the browser and then get it back, usually somewhere else in the application, or on rare occasions in completely different applications.

In a nutshell, this is done by causing the browser to make a second request, either back to your own application or to another site. There are only a few things that cause the browser to make a request without requiring the user to click on something, and there are only a few ways to sneak data into that second request. Put these together, and the list you get is:

Redirects with GET-style parameters

Pages with IMG tags with URLs that include GET-style parameters

Forms with hidden inputs and JavaScript auto-submits

Redirects with Cookies

Pages with IMG tags with Cookies

Framesets (in theory, but rarely in practice)

Most of these are pretty obvious to figure out.
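As a rough sketch of the first three items on the list, here's one way to build the pieces in plain Java. The class and method names (BounceSketch, urlWithParams, imgTag, autoSubmitForm) are my own for illustration, not from any library, and it assumes the base URL has no query string yet.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

final class BounceSketch {

    // Append URL-encoded GET-style parameters to a base URL (assumed to have no query string).
    // The result can be used as a redirect target or as an image source.
    static String urlWithParams(String baseUrl, Map<String, String> params) {
        if (params.isEmpty()) {
            return baseUrl;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + "?" + query;
    }

    // Minimal escaping so a value is safe inside a double-quoted HTML attribute.
    static String escapeAttr(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;")
                .replace("<", "&lt;").replace(">", "&gt;");
    }

    // An invisible image whose URL carries the data; the browser fetches it without a click.
    static String imgTag(String url) {
        return "<img src=\"" + escapeAttr(url) + "\" width=\"1\" height=\"1\" alt=\"\">";
    }

    // A page whose only job is to POST its hidden inputs to the target as soon as it loads.
    static String autoSubmitForm(String action, Map<String, String> params) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body onload=\"document.forms[0].submit()\">\n");
        html.append("<form method=\"post\" action=\"").append(escapeAttr(action)).append("\">\n");
        for (Map.Entry<String, String> e : params.entrySet()) {
            html.append("<input type=\"hidden\" name=\"").append(escapeAttr(e.getKey()))
                .append("\" value=\"").append(escapeAttr(e.getValue())).append("\">\n");
        }
        html.append("</form>\n</body></html>\n");
        return html.toString();
    }
}
```

For a redirect, you'd put the string from urlWithParams in the Location header of the response; for the other two, you'd write the returned HTML into the page you send back.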

Cookies

Everybody knows what a cookie is, but just for the sake of completeness, I'll define it: it's a little chunk of data that the web server asks the browser to hang onto for some length of time, so that some server-side script can use it at a later point in time.

Go read up in your reference materials on cookies for attributes and such, but I'm going to briefly explain them, mainly because there are a couple nuances people usually miss.

Cookies are defined or redefined by the server by simply including a Set-Cookie header with any set of response headers. You can even do this in a redirect response.

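For example, a redirect response that also sets a cookie looks something like this (the cookie name, value, and URL are made up for illustration):

    HTTP/1.1 302 Found
    Location: http://www.foo.com/welcome
    Set-Cookie: visitor=abc123; Path=/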

Once you define a cookie, the browser just sends it back in a Cookie header with any request it makes to the same domain.

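A request carrying a cookie looks something like this (the cookie name, value, and host are made up for illustration):

    GET /welcome HTTP/1.1
    Host: www.foo.com
    Cookie: visitor=abc123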

You can get more specific about it by using the Path attribute to tell the browser to only include the cookie header with requests to specific paths on the web server.
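Here's a minimal sketch of both halves of that: building the value of a Set-Cookie header with a Path attribute, and the check a browser makes before sending the cookie back. The class and method names are hypothetical. The path check follows the rule in RFC 6265 (exact match, or a prefix that ends at a "/" boundary), and the sketch does no validation or encoding of the cookie name and value.

```java
final class SetCookieSketch {

    // Build the value of a Set-Cookie response header.
    // path and domain may be null; a negative maxAgeSeconds means "no Max-Age" (session cookie).
    // Assumption: name and value are already legal cookie text; nothing is validated here.
    static String setCookieValue(String name, String value, String path,
                                 String domain, long maxAgeSeconds) {
        StringBuilder sb = new StringBuilder(name).append('=').append(value);
        if (path != null) {
            sb.append("; Path=").append(path);
        }
        if (domain != null) {
            sb.append("; Domain=").append(domain);
        }
        if (maxAgeSeconds >= 0) {
            sb.append("; Max-Age=").append(maxAgeSeconds);
        }
        return sb.toString();
    }

    // Browser-side check: is a cookie with this Path sent along with a request for requestPath?
    static boolean pathMatches(String cookiePath, String requestPath) {
        if (requestPath.equals(cookiePath)) {
            return true;
        }
        if (!requestPath.startsWith(cookiePath)) {
            return false;
        }
        // A prefix only counts if it ends at a "/" boundary, so "/shop" does not match "/shopping".
        return cookiePath.endsWith("/") || requestPath.charAt(cookiePath.length()) == '/';
    }
}
```

So a cookie set with Path=/shop goes along with requests for /shop and /shop/checkout, but not with requests for /shopping or /about.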

The cookie can only be included in requests to the same domain, to safeguard the user's privacy, but there are two important gotchas that people miss: image tags and subdomains.

Image tags in a page can point at entirely different sites. This is how nefarious types like DoubleClick keep track of what sites you visit and build up a dossier of your interests. They have banner ad image tags on sites all over the place that point back to doubleclick.com. When your browser requests the banner image file directly from DoubleClick, they can set a cookie. When an entirely unrelated site has another DoubleClick ad banner, your browser happily includes the doubleclick.com cookie along with the request for that banner image file.

Typically this sort of slimy trick is combined with GET-style parameters in the URL in the image tag's source attribute. The GET parameter identifies what site the user is viewing, the cookie groups that together with the other sites the user viewed, and now DoubleClick knows way more about what websites you like to spend your time at.

Subdomains are much less often used, but can be useful if you're working on a complex web application for a company that has more than one site with the same root domain (e.g. they have bar.foo.com and baz.foo.com). Remember, if you set a cookie to have a domain of "foo.com", every time the browser requests something from "foo.com" it will include a copy of that cookie along with the request. You can also set the cookie with the domain ".foo.com" (note the leading period; it mattered to older browsers) and it will be included in requests to www.foo.com and also in requests to bar.foo.com, baz.foo.com, etc. Browsers that follow the current cookie spec, RFC 6265, ignore the leading period and treat "foo.com" and ".foo.com" the same way, so either form covers the subdomains; to keep a cookie on one exact host, leave the Domain attribute off entirely. Sharing a cookie across subdomains can be handy if you control both servers, and want a convenient way to pass data from one server to the other for some reason. It's not super-useful; most of the time, if you control both ends, you'd be better off using redirects with parameters, or some back-end channel.
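To make the subdomain matching concrete, here's a small sketch of the domain check as RFC 6265 describes it, where a leading period on the cookie's domain is dropped before comparing. The class and method names are hypothetical. It leaves out details a real browser handles, such as refusing to do suffix matching on IP addresses and rejecting cookies set for public suffixes like "com".

```java
import java.util.Locale;

final class CookieDomainSketch {

    // Would a cookie with this Domain attribute be sent along with a request to requestHost?
    static boolean domainMatches(String cookieDomain, String requestHost) {
        String domain = cookieDomain.toLowerCase(Locale.ROOT);
        if (domain.startsWith(".")) {
            domain = domain.substring(1); // the leading period is ignored
        }
        String host = requestHost.toLowerCase(Locale.ROOT);
        // Either the host is the domain itself, or it is a subdomain of it.
        return host.equals(domain) || host.endsWith("." + domain);
    }
}
```

With this rule, a cookie for "foo.com" goes to foo.com, bar.foo.com and baz.foo.com, but a cookie for "bar.foo.com" never goes to baz.foo.com, and neither goes to notfoo.com.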

Caching and Double-Submitting and Redirects

There are all sorts of headers and meta tags you can try to use to fiddle with browser caching. Despite this, it is, strictly speaking, completely impossible to prevent the user from intentionally double-submitting, and pretty much impossible to prevent the possibility of an accidental double-submit.

My advice: don't bother trying to stop it from happening at the client. Instead, make sure you code the server to cope, either by recognizing it as a double submit or by making double submits harmless.
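One common way to recognize a double submit on the server is a one-time token: hand out a token when you render the form (in a hidden input), and only act on the first submit that brings it back. The sketch below is a minimal, in-memory version of that idea. The class and method names are hypothetical, and a real application would probably keep the tokens in the user's session and expire old ones.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class SubmitTokenSketch {

    // Tokens that have been handed out with a form and not yet submitted.
    // Assumption: kept in memory forever; nothing here expires unused tokens.
    private final Set<String> outstanding = ConcurrentHashMap.newKeySet();

    // Call when rendering the form; put the result in a hidden input.
    String issueToken() {
        String token = UUID.randomUUID().toString();
        outstanding.add(token);
        return token;
    }

    // Call when handling the submit. Returns true only the first time a token comes back;
    // a repeat of the same token, or a token we never issued, returns false.
    boolean consume(String token) {
        return token != null && outstanding.remove(token);
    }
}
```

When consume returns false, the server can skip the work and just show the result of the first submit again, which makes the repeat harmless.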

However, what I do want to avoid is having the browser generate lots of double-submits when the user uses the back and forward buttons. It's been a few years since I figured this out, so maybe browsers have gotten saner. The only way I've found is to have the response to the submit be a redirect to another page that then displays whatever you want, usually by pulling it from the user's session cache. The redirect doesn't show up in the browser history, though the browser automatically follows it anyway. The browser never pops up a dialog offering to re-post the data.
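This trick is usually called Post/Redirect/Get these days. Here's a minimal sketch of it using the small HTTP server that ships with the JDK (com.sun.net.httpserver), so it runs without a servlet container. The class name, the /submit and /result paths, and the single static field standing in for the user's session are all assumptions for illustration. I've used status 303 (See Other), which tells the browser to fetch the new page with GET; a plain 302 is also commonly used for this.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

final class PostRedirectGetSketch {

    // Stand-in for the user's session; a real app would keep this per user.
    private static volatile String lastResult = "nothing submitted yet";

    // Start the server on the given port (0 picks any free port).
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/submit", PostRedirectGetSketch::handleSubmit);
        server.createContext("/result", PostRedirectGetSketch::handleResult);
        server.start();
        return server;
    }

    // Do the work for the POST, stash the outcome, and answer with a redirect instead of a page.
    private static void handleSubmit(HttpExchange ex) throws IOException {
        if (!"POST".equals(ex.getRequestMethod())) {
            ex.sendResponseHeaders(405, -1); // -1 means "no response body"
            ex.close();
            return;
        }
        byte[] body = ex.getRequestBody().readAllBytes();
        lastResult = "received " + body.length + " bytes";
        ex.getResponseHeaders().set("Location", "/result");
        ex.sendResponseHeaders(303, -1);
        ex.close();
    }

    // The page the user actually sees; reloading or going back to it never re-posts anything.
    private static void handleResult(HttpExchange ex) throws IOException {
        byte[] page = lastResult.getBytes(StandardCharsets.UTF_8);
        ex.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
        ex.sendResponseHeaders(200, page.length);
        try (OutputStream out = ex.getResponseBody()) {
            out.write(page);
        }
    }
}
```

To try it, call PostRedirectGetSketch.start(8080), POST a form to /submit, and the browser should land on /result; call stop(0) on the returned server when you're done.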
