Re: So many files, so few files.
- From: Phil Hobbs <pcdh@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 26 Jan 2006 10:08:00 -0500
Mac wrote:
Indeed, I'm sure you're right. But then what? As the months and years go by, one would want a program to automatically crawl the site and add updated material. Note that I say add, since one wouldn't want to follow the site in deleting material. But this brings up the issue of how to maintain the links, etc.
That's a good point. Thinking off the top of my head, the main thing you want is the PDFs (maybe?). You could search for and delete redundant ones (automatically, of course). Meanwhile, you could just update the HTML (using wget) but never delete any PDFs unless they are redundant. If some of the PDFs get orphaned from their links, that is OK, because you could just use Google Desktop, or whatever it is called, to index all your own disk space.
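As a rough sketch of the "search for redundant ones" step, one way to do it is to group files by a hash of their contents, so that copies saved under different names or directories show up together. The function names and the choice of SHA-256 are assumptions for illustration, not something anyone in the thread specified, and the sketch only reports duplicates rather than deleting them.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_redundant_pdfs(root):
    """Group PDFs under root by content.

    Returns {kept_path: [duplicate_paths]}, keeping the first path in
    sorted order. Only names matching '*.pdf' are considered, and
    whether that match is case-sensitive depends on the platform.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*.pdf")):
        by_digest[file_digest(path)].append(path)
    return {paths[0]: paths[1:] for paths in by_digest.values() if len(paths) > 1}
```

Reviewing the returned mapping before removing anything keeps the "never delete unless redundant" rule easy to check by hand.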
Since the HTML pages are usually pretty small compared with the PDFs, you could do the Wayback Machine thing: just keep snapshots of the link pages, with appropriately modified PDF file names so that a revision change doesn't overwrite the old file. That way, you could ask for the latest data***, or the latest as of June 2005, or whenever.
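To make the renaming idea concrete, here is a minimal sketch, assuming each saved file gets the snapshot date inserted before its extension. The naming scheme, the function names, and the example file name are all hypothetical; any scheme that sorts by date would do.

```python
from datetime import date
from pathlib import Path


def snapshot_name(filename, snapshot_date):
    """Insert an ISO date before the extension of filename."""
    p = Path(filename)
    return f"{p.stem}.{snapshot_date.isoformat()}{p.suffix}"


def latest_as_of(snapshots, filename, as_of):
    """Return the newest snapshot of filename dated on or before as_of.

    snapshots is an iterable of names produced by snapshot_name().
    Returns None if no snapshot qualifies.
    """
    p = Path(filename)
    best = None
    for name in snapshots:
        q = Path(name)
        if q.suffix != p.suffix:
            continue
        stem, _, stamp = q.stem.rpartition(".")
        if stem != p.stem:
            continue
        try:
            stamped = date.fromisoformat(stamp)
        except ValueError:
            continue  # not one of our dated snapshot names
        if stamped <= as_of and (best is None or stamped > best[0]):
            best = (stamped, name)
    return best[1] if best else None
```

Asking for "the latest" is then just `latest_as_of(names, "datasheet.pdf", date.today())`, and "as of June 2005" means passing `date(2005, 6, 30)` instead.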
Cheers,
Phil Hobbs