|
Steven Karel
Administrator
November 25, 2002
02:23:16 PM
|
As suggested on the TWiki docs page, this is not a well-supported service. In this case, the calendar was completely open (no password set), I had never thought to prevent web robots from visiting /webcal/ on www.bio.brandeis.edu, and I'd never had a problem before. However, alexa.com's robot seemed to think it was a good idea to follow all the links for "delete this item":
209.237.238.165 - - [22/Nov/2002:15:12:22 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021108&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=280 HTTP/1.0" 200 2799 "-" "ia_archiver"
209.237.238.165 - - [22/Nov/2002:15:16:44 -0500] "GET /webcal/webcal.cgi?function=webweek&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&year=2002&month=11&day=03 HTTP/1.0" 200 9537 "-" "ia_archiver"
209.237.238.165 - - [22/Nov/2002:15:17:42 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021123&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=309 HTTP/1.0" 200 2799 "-" "ia_archiver"
209.237.238.164 - - [22/Nov/2002:15:18:32 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021109&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=281 HTTP/1.0" 200 2799 "-" "ia_archiver"
209.237.238.164 - - [22/Nov/2002:15:19:41 -0500] "GET /webcal/webcal.cgi?function=webweek&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&year=2002&month=11&day=05 HTTP/1.0" 200 9426 "-" "ia_archiver"
209.237.238.165 - - [22/Nov/2002:15:19:42 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021102&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=241 HTTP/1.0" 200 2799 "-" "ia_archiver"
209.237.238.164 - - [22/Nov/2002:15:20:16 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021116&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=300 HTTP/1.0" 200 2799 "-" "ia_archiver"
209.237.238.165 - - [22/Nov/2002:15:20:23 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021115&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=275 HTTP/1.0" 200 2799 "-" "ia_archiver"
209.237.238.163 - - [22/Nov/2002:15:20:33 -0500] "GET /webcal/webcal.cgi?function=data_ops&op=del&return=1&date=20021109&cal=Real+Time+PCR+Machine+-+Rosbash+Lab&index=248 HTTP/1.0" 200 2799 "-" "ia_archiver"
Users should set a webcal password, and I have added /webcal/ to the deny list in robots.txt.
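For anyone wanting to check a rule like this, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents below are an assumption (the actual file on www.bio.brandeis.edu is not shown in this thread), and `allowed` is just a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents; the real file is not shown in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /webcal/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A delete link of the kind seen in the access log above.
delete_url = (
    "http://www.bio.brandeis.edu/webcal/webcal.cgi"
    "?function=data_ops&op=del&return=1&date=20021108"
)

print(allowed("ia_archiver", delete_url))
print(allowed("ia_archiver", "http://www.bio.brandeis.edu/index.html"))
```

Note that robots.txt is advisory: it only keeps out crawlers that choose to honor it, so it does not replace setting a password.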
|
David Wisniewski
Administrator
November 25, 2002
02:51:57 PM
|
I have to say, that's an almost amusing result of an automated service impacting web data. [until, of course, one realizes that it was real data that disappeared]. Gotta love 'dem 'bots.
|
Elliot Kendall
November 25, 2002
06:22:39 PM
|
I would advise not putting /webcal/ in the robots.txt deny list. It may prevent legitimate web spiders from visiting that directory, but there are spam bots out there which will be sure to specifically visit any directories listed in robots.txt, and which are almost certainly poorly written enough to trigger effects like this. I agree that the best solution is to make sure that administration functions are passworded, or possibly to restrict access to the appropriate directories to on-campus IP addresses if you're really going for the kind of site where everyone has full access.
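As a rough sketch of the two suggestions above (password the administration functions, or restrict them to on-campus addresses), a check along these lines could sit in front of any destructive operation. Everything here is hypothetical: `may_perform`, `DESTRUCTIVE_OPS`, and the network range are illustrative names and values, not part of WebCal. The POST requirement is an extra assumption not raised in this thread; it is included because crawlers follow plain GET links.

```python
import ipaddress

# Placeholder range reserved for documentation (RFC 5737);
# a real deployment would list the actual campus networks here.
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

# Only "del" appears in the access log; other destructive ops would be added here.
DESTRUCTIVE_OPS = {"del"}


def is_on_campus(remote_addr: str) -> bool:
    """Return True if the client address falls inside a campus network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in CAMPUS_NETWORKS)


def may_perform(op: str, method: str, remote_addr: str, has_password: bool) -> bool:
    """Decide whether a request may run the given calendar operation."""
    if op not in DESTRUCTIVE_OPS:
        return True  # read-only views stay open to everyone
    if method != "POST":
        return False  # assumption: never modify data on a followable GET link
    return has_password or is_on_campus(remote_addr)


# The crawler request from the log: GET, off campus, no password.
print(may_perform("del", "GET", "209.237.238.165", False))
```

Whether to accept a password, an on-campus address, or both is a policy choice; the sketch accepts either.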
|
Steven Karel
Administrator
November 25, 2002
06:41:59 PM
|
Elliot's points
Generally I agree with what Elliot says above. In this particular case, though, there is nothing of lasting interest (it's a calendar), so I don't mind excluding well-behaved robots. I would limit it to on-campus addresses, except that at least one group is using a (passworded) calendar to schedule a resource that's used by people from neighboring universities, at least in principle. I thought I saw something from Dave somewhere saying that the next mod of the my.brandeis calendar might include resource scheduling. If so, bye bye WebCal.
|
David Wisniewski
Administrator
December 2, 2002
08:01:30 AM
|
The rebuild of the calendar may include resource reservation. We're focusing on building the functionality that we have now, but better. We may be able to extend it for your needs as well.
|