10/02/2003 @10:50:31 ^12:01:37
SNAFU!
This site was put up nearly a year ago. Since then, a number of other people I know have put up similar things. A number of them were only updated two or three times, but some of them endure... so I stole the idea from newsdot to make:
Site News of Acquaintances From University (SNAFU)
It was described last night as "clever" and "quite useful" but it's not yet without its problems.
- Not every web server sends a Last-Modified header with its pages. In that case you have to fetch the entire file and compare it with the previous copy to check for updates.
- Worse, Pete's web host's advertising policy puts a load of JavaScript into his pages that keeps changing, so it looks like his site's always updating when it isn't.
- It currently updates at forty-two minutes past every hour. I don't think that's often enough, but any more and its bandwidth use becomes somewhat unsociable. There are some things you can do with HTTP request headers, like sending an If-Modified-Since time to make the request conditional. I'll have to look into that further.
- I don't think it's got enough people's pages on it but, like I say, a lot of people started weblogs and then failed to keep them up. I'm open to requests, but bear in mind that if your site never updates you'll be embarrassed...
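As a rough sketch of how a checker could deal with the first three problems: the helpers below build an If-Modified-Since request header, recognise a 304 Not Modified reply, and fall back to hashing the page body with script blocks stripped out so that rotating advert code doesn't count as an update. The function names are made up for illustration, and nothing here is taken from the real SNAFU code.

```php
<?php
// Hypothetical helpers for the checker side; all names are illustrative.

// Build the request header that makes a GET conditional on a timestamp.
function ifModifiedSinceHeader(int $lastSeen): string
{
    // HTTP dates are always expressed in GMT.
    return 'If-Modified-Since: ' . gmdate('D, d M Y H:i:s', $lastSeen) . ' GMT';
}

// True when the status line says nothing has changed (304 Not Modified).
function isNotModified(string $statusLine): bool
{
    return preg_match('#^HTTP/\S+\s+304\b#', $statusLine) === 1;
}

// Fallback for servers that send no Last-Modified header: hash the body,
// ignoring <script> blocks so that changing advert code is not an "update".
function contentFingerprint(string $html): string
{
    $stripped = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html);
    // preg_replace returns null on a regex error; fall back to the raw page.
    return md5($stripped ?? $html);
}
```

The header string could be passed to file_get_contents() through stream_context_create() (the 'header' entry of the 'http' options), and the stored fingerprint compared with a fresh one on each run. A 304 reply has no body, so a quiet site costs only a few hundred bytes per check.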
Peter said to me "how can a CGI script have a last modified time?". I think there are many ways in which a dynamically created page can return the time it was last updated.
- A weblog has the time it was last added to...see below
- More generally, any page that is an interface to some sort of database has the time the database was last modified. Even something as frequently updated as a share price index.
- A possible exception would be a form processor, you know, something that takes user input and does something with it. p3find, for example. You could possibly return the time the page source was last modified, but that might screw up caching.
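For a page built from more than one source (several database files, say), one reasonable choice is to report the newest modification time among them. This is a minimal sketch of that idea; lastModifiedHeader() is a hypothetical helper, and where the timestamps come from is left to you.

```php
<?php
// Hypothetical helper: given the modification times (Unix timestamps) of
// everything a page is built from, return a Last-Modified header line.
// The array must contain at least one timestamp.
function lastModifiedHeader(array $timestamps): string
{
    $newest = max($timestamps); // the page is as fresh as its newest source
    return 'Last-Modified: ' . gmdate('D, d M Y H:i:s', $newest) . ' GMT';
}
```

The timestamps might come from filemtime() on each file, or from a "last updated" column in the database.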
Here's some PHP to fix the first case. For this to work you need a file whose last modified time changes when, and only when, your site updates. For example, I have my Berkeley database of updates. Then you write the following code, or similar:
header(sprintf("Last-Modified: %s GMT", gmdate("D, d M Y H:i:s", filemtime("/PATH/TO/FILE"))));
This goes at the top of your index.php (or something else - I can have it check a URL different from the one that is linked), but it must come before any actual content is sent. See header() in the PHP manual for more details.
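Going one step further, the same page could also answer conditional requests, so that a checker sending If-Modified-Since gets a short 304 reply instead of the whole page. The sketch below assumes the same marker file as above; notModifiedSince() and sendLastModified() are hypothetical names, not part of PHP.

```php
<?php
// Decide whether a conditional request can be answered with 304.
// $ifModifiedSince is the raw request header value, or null if none was sent.
function notModifiedSince(?string $ifModifiedSince, int $lastModified): bool
{
    if ($ifModifiedSince === null) {
        return false;
    }
    $since = strtotime($ifModifiedSince); // false if the date cannot be parsed
    return $since !== false && $lastModified <= $since;
}

// Send Last-Modified for the marker file; returns true if a 304 was sent,
// in which case the caller should stop before producing any page content.
function sendLastModified(string $path): bool
{
    $lastModified = filemtime($path);
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');
    if (notModifiedSince($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $lastModified)) {
        http_response_code(304);
        return true;
    }
    return false;
}
```

At the top of index.php this would be used as: if (sendLastModified("/PATH/TO/FILE")) exit;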