web services - Too aggressive bot?
I'm making a little bot to crawl a few websites. I'm testing it out right now, and I tried 2 types of settings:
About 10 requests every 3 seconds - the IP got banned, so I said - OK, that's too fast.
2 requests every 3 seconds - the IP got banned after 30 minutes, with 1000+ links crawled.
Is that still too fast? I mean, we're talking about close to 1,000,000 links. Should I take the ban as the message "we don't want to be crawled", or is it still too fast?
Thanks.
Edit
Tried again - 2 requests every 5 seconds - 30 minutes and 550 links later I got banned.
I'll go with 1 request every 2 seconds, but I suspect the same will happen. I guess I'll have to contact the admin - if I can find him.
Here are some guidelines for web crawler politeness.
Typically, if a page takes x seconds to download, it is polite to wait at least 10x-15x before re-downloading.
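To make the 10x-15x rule of thumb concrete, here is a minimal sketch of one way it could be applied. The names polite_delay and timed_fetch, the 1-second floor, and the fetch callable are all assumptions made for illustration; they are not from any particular crawler library.

```python
import time


def polite_delay(download_seconds, factor=10.0, minimum=1.0):
    """Return how long to wait before hitting the same host again.

    factor is the politeness multiplier (10 to 15 per the rule of thumb);
    minimum is an assumed floor so that very fast responses still get a pause.
    """
    if download_seconds < 0:
        raise ValueError("download_seconds must be non-negative")
    return max(minimum, download_seconds * factor)


def timed_fetch(fetch, url, factor=10.0, minimum=1.0, sleep=time.sleep):
    """Call fetch(url), time it, then sleep for the polite delay.

    fetch is any callable that downloads a URL (hypothetical; plug in your
    own HTTP client). sleep is injectable so the function can be tested
    without actually waiting.
    """
    start = time.monotonic()
    result = fetch(url)
    elapsed = time.monotonic() - start
    sleep(polite_delay(elapsed, factor, minimum))
    return result
```

With these defaults, a page that takes 0.5 seconds to download leads to a 5 second pause, which is much slower than the 2 requests every 3 seconds tried in the question.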
Also make sure you are honoring robots.txt as well.
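For the robots.txt part, Python's standard library ships urllib.robotparser, which can answer "may I fetch this URL?" and report a Crawl-delay if the site sets one. The sketch below parses an in-memory sample rather than a real site; the sample rules, the user agent name "my-little-bot" and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt_text):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt_text.splitlines())
    return parser


# Made-up robots.txt content, standing in for what a site might serve.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

parser = build_parser(SAMPLE_ROBOTS_TXT)

# Check individual URLs before requesting them.
allowed = parser.can_fetch("my-little-bot", "https://example.com/index.html")
blocked = parser.can_fetch("my-little-bot", "https://example.com/private/page.html")

# Crawl-delay is a non-standard directive; crawl_delay() returns None
# when the site does not specify one for this user agent.
delay = parser.crawl_delay("my-little-bot")
```

Against a live site you would instead call set_url() with the site's robots.txt address and then read() before asking can_fetch(). If the site publishes a crawl delay, using the larger of that value and your own politeness delay seems like the safer choice.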