The second major choice [in designing the Web] was that the Web would
be "stateless". Imagine a network connection as your computer phoning
up HQ and starting a conversation. In a stateful protocol, these are
long conversations -- "Hello?" "Hello, welcome to Amazon. This is
Shirley." "Hi Shirley, how are you doing?" "Oh, fine, how are you?"
"Oh, great. Just great." "Glad to hear it. What can I do for you?"
"Well, I was wondering what you had in the Books department." "Hmm,
let me see. Well, it looks like we have over 15 million books. Could
you be a bit more specific?" "Well, do you have any by Dostoevsky?"
(etc.). But the Web is stateless -- each connection begins completely
anew, with no prior history.
This has its upsides. For one thing, if you're in the middle of
looking for a book on Amazon but right as you're about to find it you
notice the clock and geebus! it's late, you're about to miss your
flight! So you slam your laptop shut and toss it in your bag and dash
to your gate and board the plane and eventually get to your hotel
entire _days_ later, there's nothing stopping you from reopening your
laptop in this completely different country and picking up your search
right where you left off. All the links will still work, after all. A
stateful conversation, on the other hand, would never survive a
day-long pause or a change of country. (Similarly, you can send a link
to your search to a friend across the globe and you both can use it
without a hitch.)
It has benefits for servers too. Instead of having each client tie up
part of a particular server for as long as their conversation lasts,
stateless conversations get wrapped up very quickly and can be handled
by any old server, since they don't need to know any history.
Some bad web apps try to avoid the Web's stateless nature. The most
common way is through session cookies. Now cookies certainly have their
uses. Just like when you call your bank on the phone and they ask you
for your account number so they can pull up your file, cookies can
allow servers to build pages customized just for you. There's nothing
wrong with that.
(Although you have to wonder whether users might not be better served
by the more secure Digest authentication features built into HTTP, but
since just about every application on the Web uses cookies at this
point, that's probably a lost cause. There's some hope for improvement
in HTML5 (the next version of HTML) since they're-- oh, wait, they're
not fixing this. Hmm, well, I'll try suggesting it.[^w])
[^w]: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-October/016742.html
The real problem comes when you use cookies to create sessions. For
example, imagine if Amazon.com just had one URL:
http://www.amazon.com/ The first time you visited it'd give you the
front page and a session number (let's say 349382). Then you'd
call back and say "I'm session number 349382 and I want to look at
books" and it'd send you back the books page. Then you'd call back
and say "I'm session number 349382 and I want to search for
Dostoevsky". And so on.
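To make the contrast concrete, here is a minimal sketch of the two designs. Everything in it is made up for illustration: the `BOOKS` list, the `SessionSite` class, the `stateless_site` function and the `/books/search?q=...` URL layout are assumptions, not how Amazon or any real framework works.

```python
import itertools
from urllib.parse import parse_qs, urlsplit

# Hypothetical catalog, just for illustration.
BOOKS = [
    "Crime and Punishment by Dostoevsky",
    "The Idiot by Dostoevsky",
    "Emma by Austen",
]


class SessionSite:
    """One URL; the server remembers where each visitor is."""

    def __init__(self):
        self._sessions = {}                  # session number -> current page
        self._ids = itertools.count(349382)  # first session number handed out

    def visit(self):
        sid = next(self._ids)
        self._sessions[sid] = "front"
        return sid, "front page"

    def request(self, sid, action, arg=None):
        # The reply depends on history stored on this particular server.
        if sid not in self._sessions:
            return "session expired, please start over"
        if action == "browse":
            self._sessions[sid] = arg
            return f"{arg} page"
        if action == "search":
            if self._sessions[sid] != "books":
                return "pick a department first"
            return [b for b in BOOKS if arg.lower() in b.lower()]
        return "unknown action"


def stateless_site(url):
    """Everything needed to answer the request is in the URL itself."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if parts.path == "/books/search":
        term = query.get("q", [""])[0]
        return [b for b in BOOKS if term.lower() in b.lower()]
    if parts.path == "/books":
        return "books page"
    return "front page"
```

In the session version the same "search" request gives different answers depending on what came before and on which server instance holds the session. In the stateless version the URL can be bookmarked, mailed to a friend, or handled by any server.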
Crazy as it sounds, a lot of sites work this way (and many more used
to). For many years, the worst offender was probably a toolkit called
WebObjects, which most famously runs Apple's Web store. But, after
years and years, it seems WebObjects might have been fixed. Still, new
frameworks like Arc and Seaside are springing up to take its place.
All do it for the same basic reason: they're software for building Web
apps that want to hide the Web from you. They want to make it so that
you just write some software normally and it magically becomes a web
app, without you having to do any of the work of thinking up URLs or
following REST. Well, you may get an application you can use through a
browser out of it, but you won't get a web app.
Aaron,
How do you think sessions should be carried between page requests? Do you
want to replace cookies with URLs (e.g. http://some.url/page?sesid=1234)? Or
some other way?
toomyem
--
rootnode rulez ;)
No, that's even worse. I want to get rid of sessions altogether.
So how do you want to implement user authentication/authorization in this
case?
toomyem
--
rootnode rulez ;) (I guess)
Either Digest Auth or a cookie for the user. In which case the cookie
for the user isn't a session; it just identifies a request coming from
a user.
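One way to picture a cookie that identifies a user without being a session is a signed value the server can check on its own, with no lookup table of open conversations. The sketch below is an assumption about how that might be done: the HMAC signing, the `SECRET` key, the `name|signature` layout and both function names are hypothetical.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical server-side key, shared by every server


def _sign(username):
    return hmac.new(SECRET, username.encode("utf-8"), hashlib.sha256).hexdigest()


def make_user_cookie(username):
    """Cookie value that says who is asking, and nothing about where they are."""
    return f"{username}|{_sign(username)}"


def user_from_cookie(value):
    """Return the username if the signature checks out, else None."""
    username, sep, sig = value.rpartition("|")
    if not sep:
        return None
    expected = _sign(username)
    # Compare as bytes so odd characters in a forged cookie cannot raise.
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return username
    return None
```

Any server holding the key can verify the cookie, and the request still carries its own context in the URL.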
If it's a multi-step operation, will digest auth or user cookie
help in terms of remembering where the user is in the process?
Or does the user have to start over again?
Thanks,
Jack
For multi-step operations, you should design the forms so that all
relevant state is conveyed in the pages themselves. (Obviously the
details depend on the operation.)
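A rough sketch of what "state conveyed in the pages themselves" could look like: each step's form carries the earlier answers along as hidden fields, so the server needs no memory between requests. The step names, the `/checkout/...` paths and both functions are hypothetical.

```python
from html import escape

STEPS = ["shipping", "payment", "confirm"]  # hypothetical step names


def render_step(step, carried):
    """Return the HTML form for `step`, with earlier answers as hidden fields."""
    hidden = "\n".join(
        f'  <input type="hidden" name="{escape(k)}" value="{escape(v)}">'
        for k, v in sorted(carried.items())
    )
    return (
        f'<form method="post" action="/checkout/{step}">\n'
        f"{hidden}\n"
        f'  <input name="{step}">\n'
        f'  <input type="submit">\n'
        f"</form>"
    )


def handle_post(step, form):
    """Process one submission using only what the request itself contains."""
    nxt = STEPS.index(step) + 1
    if nxt == len(STEPS):
        return "done: " + ", ".join(f"{k}={form[k]}" for k in STEPS)
    return render_step(STEPS[nxt], form)
```

Hidden fields can be edited by the client, so a real handler would have to re-validate every carried value at the final step.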
Thanks. That requires a lot of information to be sent back and forth.
When the forms have many fields, and/or when there are many steps
in the operation, it doesn't sound optimal. Or am I missing
anything obvious?
Jack
So we are going back to the times when all data was sent back and forth
with every request. I thought that sessions were invented to solve this
problem, weren't they?
toomyem
--
rootnode rulez ;) (I guess so ...)
That's a good example. Personally, I'd have a cookie that listed the
product IDs they'd added to the cart and if they went over the
cookie-size limit I'd ask them to create an account.
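Roughly, the cart cookie could look like the sketch below. The 4096-byte figure is a common browser limit that I'm treating as an assumption, and the cookie name `cart`, the dot-separated encoding and the function names are made up.

```python
from http.cookies import SimpleCookie

MAX_COOKIE_BYTES = 4096  # assumed per-cookie limit; browsers vary


def cart_cookie(product_ids):
    """Return a Set-Cookie header value for the cart, or None if it won't fit."""
    cookie = SimpleCookie()
    # Dots need no quoting in a cookie value, unlike commas.
    cookie["cart"] = ".".join(str(int(p)) for p in product_ids)
    cookie["cart"]["path"] = "/"
    header = cookie["cart"].OutputString()
    if len(header.encode("utf-8")) > MAX_COOKIE_BYTES:
        return None  # caller should ask the shopper to create an account
    return header


def read_cart(cookie_header):
    """Parse a Cookie request header back into a list of product IDs."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "cart" not in cookie or not cookie["cart"].value:
        return []
    return [int(p) for p in cookie["cart"].value.split(".")]
```

The cart lives entirely on the client, so any server can render it without looking anything up.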
No, this is the first time I've shared anything from it.
- Building Programmable Web Sites
- Introduction: A Programmable Web?
- Building for users: designing URLs
- Building for search engines: following REST
- Building for choice: allowing import and export
- Building a platform: providing APIs
- HTML
- JSON
- XML
- RDF
- Building a database: queries and dumps
- SPARQL
- compressed API
- Building for freedom: open data, open source
- open service definition
- open knowledge
- open source
- SQL dumps
- source repositories
- Conclusion: A Semantic Web?
I've written the first three chapters so far.
It still has the problems I mentioned, right: a) it uses the broken
MD5 hash, b) it requires passwords to be stored in cleartext.
Do you have a demo of the AJAX thing somewhere?
Wikipedia says that the new version of SSL uses SHA and MD5 and that
browsers refuse to use the MD5-only version.
Wikipedia says that Digest isn't yet broken but that researchers are
getting closer to using the collisions to break Digest.
> Digest authentication does not require you to store passwords in
> plaintext. You only need to store the hash of
> "username:realm:password".
Hmm, that's better than I thought.
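For reference, here is a small sketch of that stored hash and of how a server can check a Digest response with it. It follows the form without `qop` from RFC 2617; the function names are my own.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def stored_hash(username, realm, password):
    """What the server keeps: HA1 = MD5(username:realm:password)."""
    return md5_hex(f"{username}:{realm}:{password}")


def expected_response(ha1, nonce, method, uri):
    """Digest response without qop: MD5(HA1:nonce:HA2), HA2 = MD5(method:uri)."""
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")
```

The server compares `expected_response(...)` against what the client sent and never needs the cleartext password. The stored hash is still sensitive, though: anyone who steals it can authenticate for that realm.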
I work for scientists doing air pollution research. I'm using web.py to
deliver emissions, weather and pollution data to anyone.
In this domain, sessions are obviously totally useless. The data is
public, so there's no authentication. It's read-only, so there's no
state. So I'm close to the original static hypertext system, except
that I extract subsets of data and format them on the server side. Using
anything other than stateless HTTP GET is pointless.
But sometimes I have to check out data from other organizations. Too
often, their site is an application, which requires several http-post
roundtrips to get what you want. Very difficult to access with scripts,
which is what I do all the time.
So my point here is that tools like ASP.NET, which make creating an
application with sessions a breeze, push people too much toward creating
stateful systems when stateless would be much, much better.
Kari
Hi Aaron,
RFC 2616: HTTP/1.1, Section 15.6:
> 15.6 Authentication Credentials and Idle Clients
>
> Existing HTTP clients and user agents typically retain authentication
> information indefinitely. HTTP/1.1. does not provide a method for a
> server to direct clients to discard these cached credentials. This is
> a significant defect that requires further extensions to HTTP.
> Circumstances under which credential caching can interfere with the
> application's security model include but are not limited to:
>
> - Clients which have been idle for an extended period following
> which the server might wish to cause the client to reprompt the
> user for credentials.
>
> - Applications which include a session termination indication
> (such as a `logout' or `commit' button on a page) after which
> the server side of the application `knows' that there is no
> further reason for the client to retain the credentials.
>
> This is currently under separate study. There are a number of work-
> arounds to parts of this problem, and we encourage the use of
> password protection in screen savers, idle time-outs, and other
> methods which mitigate the security problems inherent in this
> problem. In particular, user agents which cache credentials are
> encouraged to provide a readily accessible mechanism for discarding
> cached credentials under user control.
(Source: <http://www.ietf.org/rfc/rfc2616.txt>)
That is the reason HTTP authentication hasn't found wide deployment.
In my applications, at least. Of course, it doesn't "look pretty", yes,
that's an added problem, but using existing and well-tested mechanisms
like this requires less code and is thus less error-prone. If HTTP/1.1
provided logout functionality, we'd see far fewer SQL injection
attacks in login forms. Simply because the forms wouldn't exist.
I will leave your real point ("web applications should be stateless")
untouched, though :). For today.
Greetings.
Thanks :)
Anyway, this anti-static-RFC rant is going offtopic :) I guess I should
just read the news more ;)
Greetings