by Steven J. Owens (unless otherwise attributed)
If you POST to a CGI and display the returned HTML, it becomes very hard to avoid the browser either spontaneously rePOSTing or popping up a dialog to the user asking if they'd like to rePOST.
Due to differing cache behaviors, no-cache headers and directives are not a reliable way to prevent this.
The best fix is to have the CGI return a redirect to a viewable page, a page that is "idempotent", which means it doesn't actually alter the state of the application; it just takes the data passed in and displays it.
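As a rough illustration of the redirect idea, here is a minimal sketch using the small HTTP server that ships with the JDK (com.sun.net.httpserver) rather than a real CGI. The paths "/submit" and "/view", the "msg" parameter and the class name are all made up for the example. It answers the POST with a 303 See Other, which is the status that tells the browser to fetch the target with a GET.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class PostRedirectGet {

    // URL of the viewable page; "/view" and "msg" are hypothetical names.
    static String viewUrl(String message) {
        return "/view?msg=" + URLEncoder.encode(message, StandardCharsets.UTF_8);
    }

    // Handles the state-changing POST, then redirects instead of returning HTML.
    static void handleSubmit(HttpExchange ex) throws IOException {
        try {
            if (!"POST".equals(ex.getRequestMethod())) {
                ex.sendResponseHeaders(405, -1); // -1 means "no response body"
                return;
            }
            byte[] body = ex.getRequestBody().readAllBytes();
            // ... update application state from the form body here ...
            ex.getResponseHeaders().set("Location", viewUrl("saved " + body.length + " bytes"));
            ex.sendResponseHeaders(303, -1); // See Other: the browser follows up with a GET
        } finally {
            ex.close();
        }
    }

    // The viewable page: changes nothing, only displays what it was given.
    static void handleView(HttpExchange ex) throws IOException {
        try {
            // Shown raw to keep the sketch short; a real page would parse and escape it.
            String query = ex.getRequestURI().getRawQuery();
            byte[] out = ("Result: " + query).getBytes(StandardCharsets.UTF_8);
            ex.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            ex.sendResponseHeaders(200, out.length);
            ex.getResponseBody().write(out);
        } finally {
            ex.close();
        }
    }

    // Starts the server; pass 0 to let the OS pick a free port.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/submit", PostRedirectGet::handleSubmit);
        server.createContext("/view", PostRedirectGet::handleView);
        server.start();
        return server;
    }
}
```

Calling PostRedirectGet.start(8080) and submitting a form to /submit should leave the browser sitting on a /view URL, which can be reloaded without any rePOST prompt.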
Any data that the CGI generated has to be passed somehow:
The problem with GET query parameters is that they're limited in how much data they can carry, and they're very visible. The visibility issue is mainly a matter of perception management risks:
Risk #1 isn't really that much of a risk - there's so much use of GET with query parameters on the net that users should be used to it by now.
In the case of a login form, risks #2, #3 and #4 are actually valid, since the bookmarked URL would contain the username and password.
Other than a login form, bookmarking and URL caching should not actually be a problem. As stated above, the redirect-target page should be "idempotent", so what happens is simply that the page redisplays with that data. This can actually be a convenience, since the user can bookmark a particular bit of data display.
One could make a weak argument that this convenience could be a security risk, in situations where the data itself isn't supposed to be exposed in a different context. This is a weak argument since most of the time the user's browser would try to reload the remote page, which would take the user to whatever login process is in place.
There is a small amount of credible risk, because a technologically savvy user could copy and paste the query parameters and manually deconstruct them. However, in any case where this is a significant security issue, you could encrypt the parameter string. In any event, you probably shouldn't be using a web UI in such a high-security context.
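Encrypting the parameter string could look something like the sketch below. It uses AES-GCM from the standard javax.crypto package, which is my choice for the example, not something the approach requires. The class name SealedQuery and its seal/open methods are invented here, and the throwaway key from newKey() stands in for a key the server would keep somewhere persistent.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class SealedQuery {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final int IV_BYTES = 12;   // usual nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final SecretKey key;

    SealedQuery(SecretKey key) {
        this.key = key;
    }

    // Throwaway 128-bit key for the example; a real app would load a stored key.
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Turns "a=1&b=2" into a URL-safe token: base64url(iv + ciphertext + tag).
    String seal(String query) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // fresh nonce for every token
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(query.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recovers the query string; fails if the token was altered or uses another key.
    String open(String token) {
        try {
            byte[] in = Base64.getUrlDecoder().decode(token);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("invalid token", e);
        }
    }
}
```

The redirect would then carry a single opaque parameter (say, "?q=" plus the token) instead of the readable name/value pairs. Note that this makes the URL longer, so it does nothing for the size limit.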
Still, between the general hassle of these issues, and the occasional instance where you really need to worry about them, you may want to implement a better mechanism, and once you have, you might as well use the same mechanism everywhere.
Setting the data in a cookie in the redirect response, and reading it in the viewable page, offers many of the advantages of the GET query parameter approach, without the disadvantages. The big problem with this approach is collision - if multiple windows to your application are open at the same time, then each response will write to the same cookie.
What you really want is a cookie that acts like a GET query parameter, and is local to that request sequence. The obvious fix is to uniquify each cookie name, but now you're proliferating a zillion cookies to the user's browser, which can be annoying to the user.
It also has the same problem as the GET query parameter, which is that you can't stuff a lot of data into it.
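To make the uniquified-cookie idea concrete, here is a small sketch that only builds and parses header strings, so it doesn't depend on any particular server API. The "rd_" name prefix, the "c" query parameter and the 60-second lifetime are arbitrary choices for the example. The short Max-Age is one way to keep the zillion cookies from piling up, assuming the browser honors it.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class RedirectCookie {

    // Hypothetical naming scheme: one short-lived cookie per redirect.
    static String newCookieName() {
        return "rd_" + UUID.randomUUID().toString().replace("-", "");
    }

    // Value for the Set-Cookie header sent along with the redirect response.
    static String setCookieHeader(String name, String data) {
        String value = URLEncoder.encode(data, StandardCharsets.UTF_8);
        return name + "=" + value + "; Path=/; Max-Age=60; HttpOnly";
    }

    // The redirect URL only has to say which cookie belongs to this request sequence.
    static String location(String viewPath, String name) {
        return viewPath + "?c=" + name;
    }

    // Finds the named cookie in a request's Cookie header, or returns null.
    static String read(String cookieHeader, String name) {
        for (String pair : cookieHeader.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq > 0 && trimmed.substring(0, eq).equals(name)) {
                return URLDecoder.decode(trimmed.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return null;
    }
}
```

Two windows doing a POST at the same time get two different cookie names, so neither overwrites the other's data.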
Note: In moderately-high-security contexts there's also a security risk with using cookies for login credentials or other information that shouldn't stick around, which is that there's no way to really make sure that the cookie is cleared.
If you set the cookie to not be stored persistently (a session cookie), it'll hang around as long as the browser's open. If the user walks away from an open browser, there goes your security.
You could try to make sure there's always one last browser-server roundtrip to send back a cookie-setting header that sets the cookie to a harmless value, but there's no way to be certain this will happen.
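For what it's worth, the cookie-clearing header from that last roundtrip might be built like this. The class and method names are made up. The Path has to match whatever was used when the cookie was set, and none of this helps if the roundtrip never happens.

```java
class CookieClearing {

    // Set-Cookie value that blanks the cookie and asks the browser to discard it.
    static String expireHeader(String name, String path) {
        return name + "=; Path=" + path
                + "; Max-Age=0; Expires=Thu, 01 Jan 1970 00:00:00 GMT";
    }
}
```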
Caching the data on the server side, in the user's session, is generally the way this particular problem is solved. It does have some collision risks, but generally the caching is much closer to home, under your almost-complete control, and doesn't have to shove large chunks of data across the wire and back.
Note: In some cases the collision risks can actually work to your advantage. I've developed applications where there was a key set of cached server-side objects, one object per type. Each page that had a select dropdown to load the object defaulted to the currently cached object, which made it quite convenient to hop around the application and always have, for example, the same customer record loaded in a different screen.
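A one-object-per-type cache like that can be sketched in a few lines. This is a reconstruction of the idea and not the code from those applications; the class name CurrentObjects is invented, and one instance would live in each user's session.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CurrentObjects {
    // At most one "current" object per type, e.g. the customer being worked on.
    private final Map<Class<?>, Object> current = new ConcurrentHashMap<>();

    // Loading a new object of a type replaces the previous one of that type.
    <T> void load(Class<T> type, T value) {
        current.put(type, value);
    }

    // Returns the currently loaded object of this type, or null if none is loaded.
    <T> T current(Class<T> type) {
        return type.cast(current.get(type));
    }
}
```

A page with a customer dropdown would call current(Customer.class), with Customer being whatever record class the application has, to pick its default selection.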
A thought I'm considering now, mainly for use in a Java webapp input validation scheme I'm designing, is a generalized Redirect Cache.
Each user would have a RedirectCache in their session object. When, for reasons of security, I want to send a fairly complex set of data to another page, but have that page be idempotent, I would simply set the data on the redirect cache, get a unique key for the redirect cache entry, and send a redirect with that unique key as a GET query parameter.
For extra-high security, the Redirect Cache key could be configured to be single-use - clear that set of data from the cache on the first use.
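Here is a minimal sketch of what such a Redirect Cache might look like, assuming one instance is stored in each user's session. The method names, the use of a random UUID as the key and the constructor flag for single-use mode are guesses at a reasonable shape, since the design isn't settled.

```java
import java.util.Map;
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

class RedirectCache<T> {
    private final Map<String, T> entries = new ConcurrentHashMap<>();
    private final boolean singleUse;

    RedirectCache(boolean singleUse) {
        this.singleUse = singleUse;
    }

    // Stores the data and returns the unique key to send as a GET query parameter.
    String put(T data) {
        Objects.requireNonNull(data, "data");
        String key = UUID.randomUUID().toString();
        entries.put(key, data);
        return key;
    }

    // Looks the data up; in single-use mode the entry is cleared on first use.
    T get(String key) {
        if (key == null) {
            return null; // no key on the request, nothing to look up
        }
        return singleUse ? entries.remove(key) : entries.get(key);
    }
}
```

The POST handler would call put(...) and redirect to something like "/view?rc=" plus the key ("rc" is an arbitrary parameter name), and the viewable page would call get(...) with that parameter. In this sketch, entries that aren't single-use are never evicted; they just go away with the session.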
Since I'm considering implementing this specifically for HTTP input validation, I will probably go further, and have a generic request attribute/parameter mapping scheme. I'll be able to pass a request object in, and the cache will map all request parameters and attributes to a cache entry. When the app receives the redirect GET, it'll pass the request in and get back an HttpRequestWrapper with all of the parameters and attributes from the original request in it.
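The cache entry for that mapping scheme could be as simple as a frozen copy of the request's parameters and attributes. The sketch below uses plain maps so that it compiles without a servlet container; RequestSnapshot is an invented name. In a real webapp the maps would presumably be filled from the servlet request (getParameterMap() and the attribute names), and the snapshot would sit behind a wrapper built on the servlet API's HttpServletRequestWrapper.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class RequestSnapshot {
    private final Map<String, String[]> parameters;
    private final Map<String, Object> attributes;

    RequestSnapshot(Map<String, String[]> parameters, Map<String, Object> attributes) {
        // Copy defensively so the snapshot outlives the original request.
        Map<String, String[]> copy = new LinkedHashMap<>();
        parameters.forEach((name, values) -> copy.put(name, values.clone()));
        this.parameters = Collections.unmodifiableMap(copy);
        this.attributes = Collections.unmodifiableMap(new LinkedHashMap<>(attributes));
    }

    // Same contract as the servlet getParameter: first value, or null if absent.
    String getParameter(String name) {
        String[] values = parameters.get(name);
        return (values == null || values.length == 0) ? null : values[0];
    }

    // All values for a multi-valued parameter, or null if absent.
    String[] getParameterValues(String name) {
        String[] values = parameters.get(name);
        return values == null ? null : values.clone();
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}
```

The handler for the redirect GET would look the snapshot up by the key on the URL and answer getParameter and getAttribute calls from it instead of from the (nearly empty) GET request.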