Tuesday, January 04, 2005

Most web activity is now non-human - implications for personal web site traffic limits

MacInTouch Home Page
[Cameron Knowlton] ... The mass majority of web traffic is from non-human surfers, such as positioning agents and other such web goodies. Many sites are poorly designed to work only with IE/Windows... these sites detect the 'agent string' from the user's browser, and work only when they see an expected agent string. Accordingly, developers of web agent software mimic the lowest common denominator browser -- IE/Windows. Even Safari is wired to do this.

Search engine marketing studies have estimated as much as 75% of search engine traffic is from non-humans... other web activity (email crawlers, web site crawlers, etc.) would be similarly skewed.
This is becoming a cost issue, and it also skews the data on which browsers people use (it suggests Firefox may actually account for close to 10% of human web access). ISPs charge for bandwidth. I have a fair bit of data on my personal site, but most of it is of limited current interest. I keep it there for my own purposes, and for archival retrieval by others. The bots don't know this, however, and they suck down the entire site. This adds up to hundreds of MBs of traffic a month, and that can start heading towards my ISP's traffic limits. Since I switched to LunarPages I've been OK, but if the trend continues I'll run into problems there too.
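
One rough way to see how much of a site's bandwidth goes to bots is to tally bytes served per agent string from the web server's access log. The following is only a minimal sketch, assuming the log is in the Apache-style Combined Log Format. The function name, the BOT_HINTS keyword list and the sample log lines are made up for illustration and are not from any real log. Since many agents mimic IE/Windows, as noted in the quote, anything that disguises itself lands in the "other" bucket, so the bot total is at best a lower bound.

```python
import re
from collections import defaultdict

# Matches the tail of a Combined Log Format line:
#   "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical keyword list; real crawlers use many other names.
BOT_HINTS = ("bot", "crawl", "spider", "slurp")


def bytes_by_visitor_type(lines):
    """Sum response bytes per category ('bot' or 'other')."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = match.group("size")
        sent = 0 if size == "-" else int(size)  # "-" means no body was sent
        agent = match.group("agent").lower()
        kind = "bot" if any(hint in agent for hint in BOT_HINTS) else "other"
        totals[kind] += sent
    return dict(totals)


# Invented sample lines, for illustration only.
sample = [
    '1.2.3.4 - - [04/Jan/2005:10:00:00 -0800] "GET /a.html HTTP/1.1" 200 5000 '
    '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '5.6.7.8 - - [04/Jan/2005:10:00:05 -0800] "GET /b.html HTTP/1.1" 200 1200 '
    '"-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20041107 Firefox/1.0"',
    '9.9.9.9 - - [04/Jan/2005:10:00:09 -0800] "GET /c.html HTTP/1.1" 304 - '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

print(bytes_by_visitor_type(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example `bytes_by_visitor_type(open("access.log"))`, where the file name is whatever your host uses.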
