#1 2008-07-30 17:19:57
- hcgtv
- Member

- From: Charlotte, NC
- Registered: 2008-05-07
- Posts: 419
- Website
Web crawlers
Seems like every day a new search engine is on the scene.
Just saw ScoutJet hitting my sites and wondered what it was, then realized it's just another crawler eating up my bandwidth. The three main engines (Google, Yahoo! and MSN) eat up enough of my server's time; I don't see the need for any others.
Maybe it's time to start blocking spiders?
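If you do go the blocking route, the usual first step is a robots.txt rule, which only well-behaved crawlers honor. Here is a rough sketch in Python, using the standard library's urllib.robotparser, that checks what a given rule set would permit. The robots.txt content, the example.com URL and the allowed() helper are made up for illustration, and it assumes the crawler identifies itself with the token ScoutJet.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out ScoutJet, leave every other crawler alone.
ROBOTS_TXT = """\
User-agent: ScoutJet
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Pass the bare bot token, not the full User-Agent header string.
    for bot in ("ScoutJet", "Googlebot"):
        print(bot, allowed(bot, "http://example.com/index.php"))
```

A crawler that ignores robots.txt would have to be blocked at the server or firewall instead, which this sketch does not cover.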
Bert Garcia - When all you have is a keyboard
#2 2008-07-30 17:25:48
- Aabaz
- Member

- From: Paris, France
- Registered: 2008-05-11
- Posts: 35
Re: Web crawlers
Just another search engine: Cuil
#3 2008-07-30 17:32:55
- hcgtv
- Member

- From: Charlotte, NC
- Registered: 2008-05-07
- Posts: 419
- Website
Re: Web crawlers
Cuil started hitting me a few months ago. I mean hitting me really hard; I thought it was a DoS attack.
#4 2008-07-30 17:39:57
- Reines
- Lead developer

- From: Scotland
- Registered: 2008-05-11
- Posts: 3,163
- Website
Re: Web crawlers
I've never really had an issue with spiders using bandwidth or CPU, but I do tend to use a dedicated server, so maybe I just don't notice it.
#5 2008-07-30 17:55:52
- hcgtv
- Member

- From: Charlotte, NC
- Registered: 2008-05-07
- Posts: 419
- Website
Re: Web crawlers
Reines, sign on to your server and run iftop.
This will give you a real-time idea of what is eating up your bandwidth. Spiders never stop crawling.
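iftop shows live traffic per host. For a per-crawler total after the fact, something like the following over the web server's access log might do. It is only a sketch: it assumes the Apache/nginx "combined" log format, the bytes_by_agent() helper is a name invented here, and the sample lines are made up rather than real traffic.

```python
import re
from collections import Counter

# Assumption: the server writes the Apache/nginx "combined" log format.
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def bytes_by_agent(lines):
    """Sum response bytes per User-Agent string; skip lines that don't parse."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        size = match.group("size")
        # A "-" size means no body was sent (for example a 304 response).
        totals[match.group("agent")] += 0 if size == "-" else int(size)
    return totals


# Made-up sample lines, not real traffic.
SAMPLE_LOG = [
    '192.0.2.10 - - [30/Jul/2008:17:19:57 +0000] "GET /index.php HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [30/Jul/2008:17:20:03 +0000] "GET /viewtopic.php?id=1 HTTP/1.1" 200 20480 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [30/Jul/2008:17:21:11 +0000] "GET /index.php HTTP/1.1" 304 - "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for agent, total in bytes_by_agent(SAMPLE_LOG).most_common():
        print(f"{total:>10} bytes  {agent}")
```

To run it on a real log, pass an open file object instead of SAMPLE_LOG; the function only needs an iterable of lines.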
#6 2008-07-30 18:31:45
- Reines
- Lead developer

- From: Scotland
- Registered: 2008-05-11
- Posts: 3,163
- Website
Re: Web crawlers
hcgtv wrote: "Reines, sign on to your server and run iftop. This will give you a real-time idea of what is eating up your bandwidth. Spiders never stop crawling."
Maybe true, but I wouldn't really say the amount of bandwidth they use up is an issue.
#7 2008-07-30 18:42:26
- SuperMAG
- Member
- Registered: 2008-05-10
- Posts: 707
Re: Web crawlers
How much bandwidth can they eat anyway? It's not that much of a problem. Maybe 1, 2, or maybe up to 10 MB.
You get visitors from that search engine later.
#8 2008-07-30 20:12:33
- Smartys
- Former Developer
- Registered: 2008-04-27
- Posts: 3,135
- Website
Re: Web crawlers
SuperMAG wrote: "How much bandwidth can they eat anyway? It's not that much of a problem. Maybe 1, 2, or maybe up to 10 MB."
"640K ought to be enough for anybody."
The two statements are equally true.
#9 2008-07-30 20:32:48
- SuperMAG
- Member
- Registered: 2008-05-10
- Posts: 707
Re: Web crawlers
640 KB? Hmm, why would you worry about that much bandwidth?
#10 2008-07-30 20:35:02
- Smartys
- Former Developer
- Registered: 2008-04-27
- Posts: 3,135
- Website
Re: Web crawlers

#11 2008-07-30 22:05:46
- yemgi
- Member

- From: Crawley, West Sussex
- Registered: 2008-05-09
- Posts: 69
- Website
Re: Web crawlers
ROFL I could not have expressed it better.
SuperMAG wrote: "How much bandwidth can they eat anyway? It's not that much of a problem. Maybe 1, 2, or maybe up to 10 MB."
They can eat the bandwidth if you have FluxBB on your server, because FluxBB is a fork.
Sorry, I could not resist.
Last edited by yemgi (2008-07-30 22:07:14)
#12 2008-07-31 10:02:02
- elbekko
- Former Developer

- From: Leuven, Belgium
- Registered: 2008-04-30
- Posts: 1,131
- Website
Re: Web crawlers
Ahh, the eternal wisdom of Picard.
Ben
SVN repository for my extensions - The thread
Quickmarks 0.5
“Question: How does a large software project get to be one year late? Answer: One day at a time!” - Fred Brooks
#13 2008-07-31 10:26:08
- SuperMAG
- Member
- Registered: 2008-05-10
- Posts: 707
Re: Web crawlers
What are you guys rambling about?
#14 2008-07-31 11:33:26
- Smartys
- Former Developer
- Registered: 2008-04-27
- Posts: 3,135
- Website
Re: Web crawlers
I quoted your post and what Bill Gates said and pointed out that they were equally true (in other words, you were both way off). Not that you were talking about the same thing.
#15 2008-07-31 12:02:07
- SuperMAG
- Member
- Registered: 2008-05-10
- Posts: 707
Re: Web crawlers
Smartys wrote: "I quoted your post and what Bill Gates said and pointed out that they were equally true (in other words, you were both way off). Not that you were talking about the same thing."
So that much bandwidth really is eaten by the crawlers?
#16 2008-07-31 12:16:30
- Smartys
- Former Developer
- Registered: 2008-04-27
- Posts: 3,135
- Website
Re: Web crawlers
According to the Google Webmaster Tools, Google (on average) crawls 52 MB of data per day from FluxBB.org. That is 1.5 GB of bandwidth per month.
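For anyone who wants to check the arithmetic, here is a quick sketch. It assumes a 30-day month and 1 GB = 1024 MB; the monthly_gb() helper is just a name made up for this example.

```python
def monthly_gb(mb_per_day: float, days: int = 30) -> float:
    """Convert a daily average in MB to a monthly total in GB (1 GB = 1024 MB)."""
    return mb_per_day * days / 1024


if __name__ == "__main__":
    # 52 MB/day is the Google Webmaster Tools average quoted above.
    for days in (30, 31):
        print(f"{days} days: {monthly_gb(52, days):.2f} GB")
    # 30 days: 1.52 GB
    # 31 days: 1.57 GB
```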
#17 2008-07-31 12:50:07
- damaxxed
- Member

- From: Germany
- Registered: 2008-05-16
- Posts: 353
Re: Web crawlers
If you really intend to block some spiders, you only strengthen the 3 (soon to be 4?) large search engines.
#18 2008-07-31 14:27:44
- Connor
- Former Developer
- Registered: 2008-04-27
- Posts: 1,127
Re: Web crawlers
Smartys wrote: "According to the Google Webmaster Tools, Google (on average) crawls 52 MB of data per day from FluxBB.org. That is 1.5 GB of bandwidth per month."
According to our logs, 1.68 GB last month.
#19 2008-07-31 14:48:27
- hcgtv
- Member

- From: Charlotte, NC
- Registered: 2008-05-07
- Posts: 419
- Website
Re: Web crawlers
The last time I made a major change to the cross references at PHPXref was November of last year:
Googlebot 1269448+76 4.91 GB 30 Nov 2007 - 21:17
Keep in mind that the Xrefs generate over 2 million crawlable pages.
#20 2008-07-31 19:47:36
- SuperMAG
- Member
- Registered: 2008-05-10
- Posts: 707
Re: Web crawlers
Wow, that's some bandwidth.
It's like the bigger your site, the bigger the bandwidth.
#21 2008-07-31 19:58:21
- elbekko
- Former Developer

- From: Leuven, Belgium
- Registered: 2008-04-30
- Posts: 1,131
- Website
Re: Web crawlers
No shit.
#22 2008-07-31 20:35:51
- yemgi
- Member

- From: Crawley, West Sussex
- Registered: 2008-05-09
- Posts: 69
- Website
Re: Web crawlers
I have this for one of mine in the Google Webmaster Tools Crawl Stats:
Number of kilobytes downloaded per day
Maximum 400944
Average 169390
Minimum 69074
#23 2008-07-31 20:55:29
- hcgtv
- Member

- From: Charlotte, NC
- Registered: 2008-05-07
- Posts: 419
- Website
Re: Web crawlers
elbekko wrote: "No shit."
If you really want to get the spiders crawling, just add Google Site Search to your website.
#24 2008-07-31 21:23:23
- Gotipe
- Member
- Registered: 2008-05-10
- Posts: 181
Re: Web crawlers
Did not know search engines affected your bandwidth.
EDIT: Check out the post time, 23:23:23.
Last edited by Gotipe (2008-07-31 21:24:07)
#25 2008-07-31 22:13:13
- elbekko
- Former Developer

- From: Leuven, Belgium
- Registered: 2008-04-30
- Posts: 1,131
- Website
Re: Web crawlers
You think they use magic to get the content of your site?
