What is the BecomeBot Spider?
Become.com is building a new generation of shopping search engine, so its recently released spider, BecomeBot, focuses on e-commerce sites. That is not a bad thing for e-commerce sites (especially if Become.com builds a significant presence with online shoppers), but the spider itself is currently somewhat "undisciplined" and is gaining a reputation as a bad spider that should be blocked. There are reports that BecomeBot is crippling some sites by indexing them too aggressively. When any search engine spider hits a site with a rapid series of requests, site performance can degrade significantly.
There is also a concern about the amount of bandwidth being consumed. We have seen this spider gobble up gigabytes of bandwidth, which matters to site owners who pay stiff prices for exceeding their allocated monthly bandwidth. Some hosting companies shut a site down when the bandwidth limit is reached, which is an even greater concern. One of our own sites was shut down because of excessive and unanticipated bandwidth consumption by the BecomeBot spider.
Become.com has addressed this issue and recommends a modification to the robots.txt file if you wish to slow down the indexing of a site. From the feedback we have seen from site owners, the BecomeBot spider does appear to be obeying the rules set in the robots.txt file. If you are not familiar with the robots.txt file, read the article, Using the robots.txt File.
The following code should be added to the robots.txt file to slow the BecomeBot spider’s requests down to one every 30 seconds.
User-agent: BecomeBot
Crawl-Delay: 30
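If you want to confirm that the rule above reads the way you expect before uploading it, a minimal sketch using Python's standard urllib.robotparser module is shown here. It only checks how a standards-style parser interprets the text; it does not prove how BecomeBot itself behaves. The helper name read_crawl_delay and the agent name SomeOtherBot are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule from the article, exactly as it would appear in robots.txt.
CRAWL_DELAY_RULES = """\
User-agent: BecomeBot
Crawl-Delay: 30
"""


def read_crawl_delay(robots_txt, user_agent):
    """Return the crawl delay (in seconds) that applies to user_agent, or None.

    Hypothetical helper for illustration; it parses the text locally and
    makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# BecomeBot is named in the file, so the delay applies to it.
becomebot_delay = read_crawl_delay(CRAWL_DELAY_RULES, "BecomeBot")

# "SomeOtherBot" is an invented agent name; no rule in the file covers it.
other_delay = read_crawl_delay(CRAWL_DELAY_RULES, "SomeOtherBot")

print("BecomeBot delay:", becomebot_delay)
print("Other spider delay:", other_delay)
```

Because the rule is scoped to the BecomeBot user agent, other spiders are not slowed down by it.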
If you do not want this spider to index your site, the addition of the following code should block it entirely.
User-agent: BecomeBot
Disallow: /
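The blocking rule can be checked the same way. The sketch below, again using Python's standard urllib.robotparser module, asks whether a given user agent may fetch a page under these rules. The helper name is_allowed, the www.example.com URL, and the agent name SomeOtherBot are placeholders, not anything taken from Become.com.

```python
from urllib.robotparser import RobotFileParser

# The blocking rule from the article, exactly as it would appear in robots.txt.
BLOCK_RULES = """\
User-agent: BecomeBot
Disallow: /
"""

# Placeholder URL for illustration; substitute a page from your own site.
SAMPLE_URL = "http://www.example.com/products/widget.html"


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text locally and
    makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# BecomeBot is disallowed from every path on the site.
becomebot_allowed = is_allowed(BLOCK_RULES, "BecomeBot", SAMPLE_URL)

# "SomeOtherBot" is an invented agent name; the rule does not mention it.
other_allowed = is_allowed(BLOCK_RULES, "SomeOtherBot", SAMPLE_URL)

print("BecomeBot allowed:", becomebot_allowed)
print("Other spider allowed:", other_allowed)
```

Keep in mind that robots.txt is advisory: it blocks only spiders that choose to honor it, which, according to the site owner feedback mentioned earlier, BecomeBot appears to do.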
For more information about the BecomeBot spider, visit the Become.com Site Owner’s page.





