Elliott C. Back: Internet & Technology

Excessive Amazon Packaging

Posted in Amazon, Scalability by Elliott Back on July 28th, 2009.

I bought two tubes of Dap 21500 Plastic Wood Filler (1.5-Ounce) on Amazon. Each of these tiny tubes of wood filler holds just 1.5 oz. Unfortunately for the environment, Amazon shipped them in two boxes:

I’m not alone; check out the Excessive Packaging category on Sustainable Is Good. Fortunately, Amazon is starting a Frustration-Free Packaging initiative, which aims to “determine the ‘right-sized’ box for any given item to be shipped to a customer, based on that item’s dimensions and weight.” You can also leave packaging feedback, something I’ve just done.

WP Super Cache Benchmark

Posted in Blogging, Performance, Plugins, Scalability, WP, Wordpress by Elliott Back on September 28th, 2008.

If you’ve thought about whether upgrading from WP Cache 2.0 to WP Super Cache is a good idea, hopefully this benchmark will convince you. I followed my instructions on benchmarking Wordpress with Apache Bench on four configurations of this blog’s main page to measure performance:

  1. Without any caching plugins
  2. With WP Cache 2.0
  3. With WP Super Cache (no compression)
  4. With WP Super Cache (compression enabled)

wp-caching-plugins.png

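The results show that WP Super Cache is the clear winner: with compression enabled it served about 2.25 times as many requests per second as the older WP Cache (1960.39 vs. 872.30), and about 1.74 times as many without compression. Here is the raw data I gathered during the test: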

No caching:
Requests per second: 22.81 [#/sec] (mean)
Time per request: 4383.559 [ms] (mean)
Time per request: 43.836 [ms] (mean, across all concurrent requests)
Transfer rate: 613.75 [Kbytes/sec] received

WP cache:
Requests per second: 872.30 [#/sec] (mean)
Time per request: 114.640 [ms] (mean)
Time per request: 1.146 [ms] (mean, across all concurrent requests)
Transfer rate: 23549.46 [Kbytes/sec] received

Super cache (no compression):
Requests per second: 1518.90 [#/sec] (mean)
Time per request: 65.837 [ms] (mean)
Time per request: 0.658 [ms] (mean, across all concurrent requests)
Transfer rate: 41150.81 [Kbytes/sec] received

Super cache (compression):
Requests per second: 1960.39 [#/sec] (mean)
Time per request: 51.010 [ms] (mean)
Time per request: 0.510 [ms] (mean, across all concurrent requests)
Transfer rate: 53108.70 [Kbytes/sec] received
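
To make the comparison easier to check, here is a minimal Python sketch that recomputes the throughput ratios from the figures above. It also infers the concurrency level, assuming Apache Bench's usual relationship in which the mean time per request equals the concurrency times the "across all concurrent requests" figure. The concurrency of 100 is derived from the numbers, not something stated in this post, and the function names are mine.

```python
# Figures copied from the Apache Bench output above.
results = {
    "No caching": {"rps": 22.81, "ms": 4383.559, "ms_concurrent": 43.836},
    "WP cache": {"rps": 872.30, "ms": 114.640, "ms_concurrent": 1.146},
    "Super cache (no compression)": {"rps": 1518.90, "ms": 65.837, "ms_concurrent": 0.658},
    "Super cache (compression)": {"rps": 1960.39, "ms": 51.010, "ms_concurrent": 0.510},
}


def speedup(data, baseline):
    """Return each configuration's requests/sec as a multiple of the baseline's."""
    base = data[baseline]["rps"]
    return {name: row["rps"] / base for name, row in data.items()}


def implied_concurrency(row):
    """Estimate the concurrency level from the two time-per-request figures."""
    return round(row["ms"] / row["ms_concurrent"])


if __name__ == "__main__":
    for name, ratio in speedup(results, "WP cache").items():
        print(f"{name}: {ratio:.2f}x the throughput of WP cache")
    for name, row in results.items():
        print(f"{name}: implied concurrency {implied_concurrency(row)}")
```

Measured against the uncached site instead (`speedup(results, "No caching")`), the same function shows how large the gap is between any caching and none.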

For more tips on how to improve your Wordpress performance, check out Wordpress Performance: Why My Site Is So Much Faster Than Yours. Another interesting WP caching plugin is Batcache, which uses memcached as its backend to serve requests from RAM across a cluster of machines.

How many users does DIGG have?

Posted in Blogging, Quantitative, Scalability, Science, Web 2.0 by Elliott Back on February 3rd, 2008.

When John Graham-Cumming asked the question How Many Users Does Digg Have?, there were a few things he couldn’t tell you, since his data consisted of randomly self-sampled users. Well, with the power of two PHP scripts, we can pull large amounts of user data and run queries against it. Our first question: how has DIGG grown over time?

digg-users-over-time.png
A graph of 187,054 digg users, randomly plotted against when they joined

This doesn’t tell us much, though, about how many DIGG users there actually are, or how active they are, so I plotted a histogram of the number of times these roughly 187,000 users’ profiles had been viewed. The answer, unsurprisingly, is not very often in most cases:

digg-profile-views-histogram.png
83% of users had fewer than 50 profile views
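
As a rough illustration of how a figure like this could be computed, here is a Python sketch that reads CSV rows with the same columns the script in the Code Appendix prints (ID, Number, Name, Date, Views), then buckets the view counts. The sample rows are made up for illustration and are not my data; the function names and the bucket width of 50 are assumptions.

```python
import csv
import io


def share_below(rows, threshold=50):
    """Fraction of rows whose Views value is below the threshold."""
    views = [int(row["Views"]) for row in rows]
    if not views:
        return 0.0
    return sum(1 for v in views if v < threshold) / len(views)


def bucket_counts(rows, width=50):
    """Count users per bucket of profile views (0-49, 50-99, ...)."""
    counts = {}
    for row in rows:
        start = (int(row["Views"]) // width) * width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical sample rows, invented for this example only.
SAMPLE_CSV = """ID,Number,Name,Date,Views
101,5000,alice,1170000000,3
102,5001,bob,1170000500,12
103,5002,carol,1170001000,49
104,5003,dave,1170001500,50
105,5004,erin,1170002000,740
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(share_below(rows))    # 0.6
print(bucket_counts(rows))  # {0: 3, 50: 1, 700: 1}
```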

And what about users who are active? How many people are digging stories every day? The answer is very few. I took a random subsample of 29,225 users from the previous sample and used the DIGG API to query for their last digg. It turns out 31% (9,125) had never dugg anything! After I removed those, here is the histogram I got:

digg-last-dugg.png
About 15% of Digg users dugg a story in the last week
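
The subsample arithmetic above can be checked with a few lines of Python. This is only a sanity check on the two counts quoted in the text; the variable and function names are mine.

```python
sample_size = 29_225   # users queried through the DIGG API
never_dugg = 9_125     # users in that sample with no diggs at all


def never_dugg_share(sample, never):
    """Fraction of the sample that has never dugg a story."""
    return never / sample


# Users left in the histogram once the never-dugg accounts are removed.
remaining = sample_size - never_dugg

print(f"{never_dugg_share(sample_size, never_dugg):.1%} never dugg anything")
print(f"{remaining} users remain")
```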

Concluding thoughts

Digg boasts an official tally of 2.2M users, but at most 20% of them can be considered real, active users. That would bring their user count down to 440,000, far fewer than a popular Web 2.0 boom child should be able to boast of, and it significantly undercuts the $300M valuation (about $700 per active user) that they keep trying to get.
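
For anyone who wants to replay the back-of-the-envelope numbers, here is a short Python sketch. The three inputs are the figures quoted above; the 20% share is my upper-bound estimate, not a measured constant, and the function names are mine.

```python
official_users = 2_200_000    # Digg's official tally
active_share = 0.20           # estimated upper bound on real, active users
valuation_usd = 300_000_000   # valuation Digg has reportedly been seeking


def active_users(total, share):
    """Estimated number of active users, rounded to a whole user."""
    return round(total * share)


def value_per_user(valuation, users):
    """Valuation divided evenly across the given number of users."""
    return valuation / users


users = active_users(official_users, active_share)
print(users)                                        # 440000
print(round(value_per_user(valuation_usd, users)))  # 682
```

Lowering `active_share` (for example to the roughly 15% who dugg a story in the last week) pushes the per-user figure higher still.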

Code Appendix

The {digg user, time joined, digg id, profile page views} information was gathered by the following script:

<?php
    error_reporting(E_ALL);
    ini_set('user_agent', 'My-Application/2.5');
    ini_set("include_path", ".:/usr/share/pear");
    require_once 'Services/Digg.php';
    require_once 'Services/Digg/Response/php.php';

    $base = 'http://services.digg.com/users/?appkey=http://example.com&type=php';

    // Ask for zero users first, just to learn the total user count.
    $data = unserialize(file_get_contents($base . '&count=0'));

    $total = $data->total;
    echo "There are $total total users\n";
    echo "ID,Number,Name,Date,Views\n";

    // Pull 1000 pages of 100 users each, starting at random offsets.
    for ($i = 0; $i < 1000; $i++) {
        $offset = rand(0, $total - 100);
        $data = unserialize(@file_get_contents($base . '&count=100&offset=' . $offset));

        $j = 0;
        foreach ($data->users as $user) {
            // Fetch the profile page to scrape the numeric digg id.
            $page = @file_get_contents('http://digg.com/users/' . $user->name . '/');

            if (!$page)
                continue;

            preg_match('/id="userid" value="(\d+)"/i', $page, $matches);
            echo $matches[1] . ",";
            echo ($offset + $j++) . ",";
            echo $user->name . ",";
            echo $user->registered . ",";
            echo $user->profileviews . "\n";
        }
    }
?>
