Back in March 2013, I moved to a new host because the host I was using at the time just wasn't cutting it. After the move, I updated my site software but forgot to activate the site firewall.
I was flabbergasted when I looked at the resource monitor! My server CPU usage was through the roof! We're talking 80% to 100%+ according to the cPanel resource monitor. It was literally off the charts.
I'm on an "unlimited" hosting plan, but "unlimited" doesn't mean the host won't shut down your site for being a resource hog. If I were running a business built around my web sites, being shut down without warning would be disastrous!
The cause was non-compliant and black hat bots: the kind of bots that swamp your site trying to crawl every single page and every single link. The most heinous of the bad bots were the content scrapers! These bots attempt to leech virtually everything from your site: articles, compressed archives, media files, the kitchen sink, you name it!
When several bad bots like these visit your web site every day and attempt to load every single page, the server load can be significant, to say the least.
While the bots were running roughshod over my sites, my bandwidth usage for Nebulous in April was 582.40 MB. That's a lot considering how little content I have published.
After I activated my site firewall, updated its signatures file and banned the key hosting networks from which the bots were operating, my bandwidth usage from May to July averaged 179 MB a month.
That's an average reduction of about 69%! More importantly, my server resource usage for all five sites now peaks at only around 20% on average.
On another, older site of mine, the savings were even more significant. Bandwidth usage for that site dropped from a whopping 5.97 GB in April to an average of 1.71 GB. That's an average reduction of about 71%!
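For anyone who wants to check the math, here is a quick sketch of how those percentages fall out of the figures above. The function name `percent_reduction` is just something I made up for this illustration; the only real inputs are the bandwidth numbers already quoted.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Figures quoted in the post; units only need to match within each pair.
nebulous = percent_reduction(582.40, 179.0)   # MB: April vs. May-July average
older_site = percent_reduction(5.97, 1.71)    # GB: April vs. later average

print(f"Nebulous:   {nebulous:.1f}%")
print(f"Older site: {older_site:.1f}%")
```

Both results round to the "about 69%" and "about 71%" mentioned above.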
The take-away is that you don't need to be running an important business site to benefit from banning bad bots.
Studies show that 51% of all Internet traffic comes from automated sources, comprising hacking tools, "surveillance/spy" tools (including SEO analyzers etc.), scrapers and spammers. So stopping abusive automated traffic is vital for all site owners.
My sites are PHP-based, so I use ZB Block. It does take a little bit of PHP know-how to write your own signatures, but its default configuration alone will reduce bad bot activity significantly.
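To give a rough feel for what signature-based blocking means, here is a minimal sketch written in Python rather than PHP. It is not how ZB Block is implemented and it does not use ZB Block's signature format. The user-agent strings and network ranges are hypothetical placeholders (the ranges are reserved documentation addresses), and `is_blocked` is a name I invented for the example.

```python
import ipaddress

# Hypothetical signatures, for illustration only.
BAD_AGENT_SUBSTRINGS = ("examplescraper", "badbot")
BANNED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # placeholder "hosting network"
    ipaddress.ip_network("198.51.100.0/24"),  # placeholder "hosting network"
]


def is_blocked(remote_addr: str, user_agent: str) -> bool:
    """Return True if the request matches a user-agent or network signature."""
    agent = user_agent.lower()
    # Case-insensitive substring match against known bad user agents.
    if any(sig in agent for sig in BAD_AGENT_SUBSTRINGS):
        return True
    # Membership test against each banned network range.
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in BANNED_NETWORKS)


print(is_blocked("203.0.113.7", "Mozilla/5.0"))        # banned network
print(is_blocked("192.0.2.10", "ExampleScraper/1.0"))  # bad user agent
print(is_blocked("192.0.2.10", "Mozilla/5.0"))         # neither
```

A real firewall would run a check like this before the site does any expensive work, which is where the CPU and bandwidth savings come from.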