Elliott C. Back: Technology FTW!

Is Blogspot really bad?

Posted in Blogging, Computers & Technology, Google, SEO, Science, Spam by Elliott Back on October 16th, 2005.

The Theory:

Blogspot is more spam than blogs. Or any one or more of the following:

However, no one ever posts numbers to back up these theories, so I’ll take a quick sample, just like an exercise from a statistics textbook chapter on binomial distributions.

The Facts:

I browsed 50 Google Blogger blogs at Blogspot, looking for any that might be splogs. I used Blogger’s random blog link, which I will assume to be truly random. For each blog I recorded four variables: last update, front-page size (word count), splog or not, and AdSense or not. Here is the raw data in XLS format:

Blogspot Splogs.xls

The rough descriptive statistics for the splog variable, coded so that 1 means spam and 0 means clean, are:

Variable   N  N*    Mean  SE Mean   StDev
splog     50   0  0.2800   0.0641  0.4536
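For anyone without the spreadsheet handy, the summary row can be rebuilt from the counts alone. This is a minimal sketch, assuming the mean of 0.28 over 50 blogs corresponds to 14 splogs and 36 real blogs; the variable names are mine, not taken from the raw data file.

```python
import math
import statistics

n_blogs = 50
n_splogs = 14  # assumption: 0.28 * 50, implied by the reported mean

# 1 = splog, 0 = real blog; the order of observations does not matter here.
splog = [1] * n_splogs + [0] * (n_blogs - n_splogs)

mean = statistics.mean(splog)
stdev = statistics.stdev(splog)        # sample standard deviation (divides by n - 1)
se_mean = stdev / math.sqrt(n_blogs)   # standard error of the mean

print(f"N={n_blogs}  Mean={mean:.4f}  SE Mean={se_mean:.4f}  StDev={stdev:.4f}")
```

Rounded to four decimals, these values should agree with the Mean, SE Mean, and StDev columns above.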

This bar chart should give you a better idea:

[Bar chart: Splogs to blogs]

Yes, approximately 72% of the sampled Blogspot blogs are real. The other 28% are spam blogs, or splogs. And, by the central limit theorem, we can put a bound on the accuracy of this experiment: a 95% confidence interval for the splog proportion, from a one-sample proportion test, is (0.162311, 0.424905), with a p-value of 0.003.
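The post does not say which interval method or null hypothesis the statistics package used. As a rough check, here is a sketch that assumes an exact (Clopper-Pearson) interval and an exact two-sided test against a null proportion of 0.5, with 14 splogs out of 50. Both choices are my assumptions, and the helper names are hypothetical; only the Python standard library is used.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect_increasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Find p in [lo, hi] with f(p) = target, for f increasing in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, x = 50, 14      # assumption: 14 splogs among the 50 sampled blogs
alpha = 0.05       # 95% confidence level
p0 = 0.5           # assumption: null hypothesis proportion

# Clopper-Pearson bounds (valid here because 0 < x < n):
# lower solves P(X >= x | p) = alpha / 2, upper solves P(X <= x | p) = alpha / 2.
lower = bisect_increasing(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
upper = bisect_increasing(lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)

# Exact two-sided p-value: double the smaller tail, capped at 1.
left_tail = binom_cdf(x, n, p0)
right_tail = 1 - binom_cdf(x - 1, n, p0)
p_value = min(1.0, 2 * min(left_tail, right_tail))

print(f"95% CI: ({lower:.6f}, {upper:.6f})  p-value: {p_value:.3f}")
```

Under these assumptions the interval should land close to the one quoted above, and it is noticeably wider than the plain normal-approximation interval of 0.28 ± 1.96 × sqrt(0.28 × 0.72 / 50). Note that a p-value measured against 0.5 only says the splog share differs from an even split; it says nothing about whether 28% is high or low.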

This entry was posted on Sunday, October 16th, 2005 at 12:30 pm and is tagged with central limit theorem, statistics textbook, google inc, binary distributions, splog, google, statics, raw data, subdomains, splogs, proportions, variables, front page, accuracy, blogs.
