Caching Discussed

Brent Halliburton (hallib35@opim.wharton.upenn.edu)
Wed, 8 Feb 1995 14:30:42 -0800

Arnold says:

>Am I to understand that when user Alpha from a site with a caching proxy
>accesses a page from my server, that when user Beta accesses the same
>page within a certain period of time they will get what Alpha accessed,
>without going to my server? And this time period could be a day, or
>longer?
[Snip]
>Actually, I hope that *I* am the one who does not understand how this works.
>
>Finally, someone mentioned that you can put an "expiration" on a
>page and that may affect (override?) caching. What's that about?

Brent here. Thought this deserved a little explanation.

Arnold may be right, and he may be wrong. It really depends on how smart
the designer of the server is, and how smart the designer of the proxy is.

Let's look at a really smart proxy. A really smart proxy for a big site
like Prodigy takes advantage of everything. He caches every document under
the sun, he makes full use of Expires headers where available, and he is
generally real careful. When he gets a request for a document he has
cached, he looks to see if it has an Expires header. If it doesn't, he
says, "Damn that server". Then he does a HEAD request on the document.

If the Last-Modified date is older than the date the document was last
retrieved, then he serves the cached document. If it has been changed
since he last retrieved it, then he goes across the Net and fetches it. If
the document has an Expires header, the proxy is in an ideal position. He
can trash files that have expired, allowing him to save space and sort
through files more efficiently, and when someone requests a document, he
can look at the Expires header and see right away whether the document
needs to be retrieved again. If the document has expired, he can retrieve
a fresh copy for the user without spending time on a HEAD request first.
Or, he could retrieve the latest version asynchronously, anticipating a
user request.
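The decision the smart proxy makes can be sketched roughly like this. It is only an illustration of the logic described above, not code from any real proxy: the CacheEntry record, the decide function, and the head_last_modified callback (standing in for the HEAD request) are all names made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass
class CacheEntry:
    """Hypothetical record a proxy keeps for one cached document."""
    fetched_at: datetime                 # when the proxy last retrieved it
    expires: Optional[datetime] = None   # Expires header, if the server sent one


def decide(entry: CacheEntry, now: datetime,
           head_last_modified: Callable[[], datetime]) -> str:
    """Return "serve-cached" or "refetch" for a request at time `now`.

    `head_last_modified` stands in for a HEAD request to the origin server
    and returns its Last-Modified date. It is only called when the cached
    document carries no Expires header.
    """
    if entry.expires is not None:
        # Expires is known: the answer needs no round trip at all.
        return "serve-cached" if now < entry.expires else "refetch"
    # No Expires header: ask the origin server whether the document changed.
    if head_last_modified() < entry.fetched_at:
        return "serve-cached"
    return "refetch"


fetched = datetime(1995, 2, 7, 5, 0, tzinfo=timezone.utc)
now = fetched + timedelta(hours=2)
unchanged = lambda: fetched - timedelta(days=3)  # pretend HEAD result

print(decide(CacheEntry(fetched), now, unchanged))  # serve-cached
print(decide(CacheEntry(fetched, expires=fetched + timedelta(hours=1)),
             now, unchanged))                       # refetch
```

Note that the callback is never touched in the second call. That skipped round trip is the whole benefit of the Expires header.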

What this means is that users get service faster. We have all discussed
the problems of bandwidth. Caching is a solution. If Prodigy users were
eating up all of your other customers' bandwidth, you would not be happy.
Here, you are getting lightning fast redistribution to them. That makes
them and you happy.

How can you be a part of this? Use a smart server. If you have a lot of
static documents (not the case for many of us, but if you go talk to the
NCSA...) then you should make full use of "Expires" headers to take care of
the proxy. You may say you need hit counts to justify the site to your
customer. But the customers this is most useful for (for example, Sun)
will realize the benefits not in hit rates, but in happy consumers.

If you have a lot of rapidly changing documents, should you do things
differently? No. Make full use of Expires headers. Set a document that
changes every two seconds to expire right away. If you don't, then a smart
proxy will waste time that is valuable to a customer and to you by making a
HEAD request and then retrieving the full document.

Here is a nice example of a document that changes often and its header:

HTTP/1.0 200 Virtual doc follows
MIME-Version: 1.0
Server: NiceWaitress/3.0
Date: Tuesday, 07-Feb-95 05:48:59 GMT
Expires: Tuesday, 07-Feb-95 05:48:59 GMT

Very smart server.
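Here is roughly how a proxy might read that header, as a sketch only. The parse_headers and is_stale_on_arrival helpers are invented for the example. The date parsing leans on Python's email.utils.parsedate_to_datetime, which accepts the old two-digit-year date style used above.

```python
from email.utils import parsedate_to_datetime

# The response head from the example above, copied as-is.
RESPONSE_HEAD = """\
HTTP/1.0 200 Virtual doc follows
MIME-Version: 1.0
Server: NiceWaitress/3.0
Date: Tuesday, 07-Feb-95 05:48:59 GMT
Expires: Tuesday, 07-Feb-95 05:48:59 GMT
"""


def parse_headers(raw):
    """Return a dict of lower-cased header names to values (hypothetical helper)."""
    headers = {}
    for line in raw.splitlines()[1:]:      # skip the status line
        if not line.strip():
            break                          # blank line ends the headers
        name, _, value = line.partition(":")  # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return headers


def is_stale_on_arrival(headers):
    """True when Expires is not later than Date, i.e. never serve from cache."""
    date = parsedate_to_datetime(headers["date"])
    expires = parsedate_to_datetime(headers["expires"])
    return expires <= date


headers = parse_headers(RESPONSE_HEAD)
print(is_stale_on_arrival(headers))
```

Because Expires equals Date, the proxy knows without asking that it must fetch a fresh copy every time, and it never needs a HEAD request for this document.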

Who has a server that provides this functionality? No one. Not enough
demand yet. Netscape caches documents and does HEAD requests. There are
CGI programs that put in Expires headers, but if you want a general
solution, you have to hack it yourself. Proxies are in demand, but really
smart, bandwidth-conscious servers? They are working on it. (Anybody
bought the 10 times faster Netscape server, or are the CERN and NCSA
servers "good enough for now"?)
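For the do-it-yourself route, a CGI program that adds its own Expires header might look something like the sketch below. The cgi_response function and the one-hour lifetime are made up for illustration, and it assumes your server passes headers printed by a CGI script through to the client. The date comes out in the four-digit-year style rather than the style in the example above; HTTP accepts both.

```python
import sys
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cgi_response(body, now, lifetime):
    """Build CGI output: headers, a blank line, then the document body.

    `now` must be a timezone-aware UTC datetime. `lifetime` is how long
    a proxy may keep serving its copy (an assumption you choose per page).
    """
    headers = [
        "Content-Type: text/html",
        "Expires: " + format_datetime(now + lifetime, usegmt=True),
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    page = "<p>Mostly static page.</p>\n"
    sys.stdout.write(
        cgi_response(page, datetime.now(timezone.utc), timedelta(hours=1)))
```

Pass timedelta(0) as the lifetime for a page that changes constantly, which gives the "expire right away" behavior described earlier.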

Anyway, you don't have to let a proxy foil you. Remember, if you hack a
smart server now, you can trust that everyone else will write a smart
proxy. I can see the press release now:

-----
Latest Internet Ad Campaign launched by AOL

AOL announced their Internet browser to crazy fanfare. Steve Case was
quoted as saying: "We provide the best Web service an on-line network has.
Our caching system is state of the art. There won't be the accidents that
happened on Prodigy. People getting out of date information. Information
taking forever to download. On AOL, these are a thing of the past. I
guarantee that our cache will provide more accurate data more quickly."
-----

I don't think Prodigy or Compuserve or Microsoft will let that happen.
Proxying should be something they take quite seriously.

This is the future. Your one site will not support accesses from 100
million people all over the globe in however many years. Look at GNN.
Already they are setting up mirrors globally. Proxies are mirrors provided
by those who make money keeping customers happy. Because I am too poor to
set up mirrors nationwide, Prodigy does. They can use this as a selling
point. A Prodigy user with a 9600 baud modem may get faster access to
www.playboy.com than me on a 56k line. That is something they can sell to
people. And they profit.

Brent Halliburton
Director of Business Operations, Group Cortex    Philadelphia Design Building
Professional Internet Services                   2300 Chestnut Street, Suite 230
Brent@cccc.com                                   Philadelphia, PA 19104
http://Www.NetWeb.Com/cortex/                    Voice: (215) 854-0646 Fax: (215) 854-066