Elliott C. Back: Internet & Technology

Black Diversity in IT and Computer Science

Posted in Computers & Technology,Education,Quantitative,Science by Elliott Back on July 7th, 2008.

If you haven’t had a chance to read Why Black Nerds are Unpopular by David Adewumi, you should run over there right now. It gives an interesting cultural explanation for why the author believes we don’t see many African Americans in IT / Computer Science. It’s the inspiration for this post, a sort of state of the world of black diversity in IT. In his article, David writes about how few of his black friends are “nerds”:

I would say, as a young black male, there is a strong inverse correlation between being a nerd and black, and being popular. I’ve seen many black friends who are fairly intelligent that were mediocre students in high school, and either failed out or were equally mediocre at the University level. Why? Popularity is, as Paul mentions, often times a choice of priorities — some sacrifice intelligence for popularity — and for blacks, this probably happens for 9 out of every 10.

I would go so far as to say that the lack of black nerds is probably a cause for major concern, but within the scope of this writing, possibly too large a problem to properly address, although certainly an interesting one.

After some Googling, I was able to find data from the National Science Foundation (NSF), Science and Engineering Degrees, by Race/Ethnicity of Recipients: 1995-2004, with information about degree recipients partitioned by self-identified race:


In 2004, 5,934 black students out of 57,405 total (10.34%) received undergraduate degrees in computer science. Overall, among all degrees, 4.84% of black students chose a degree in Computer Science as opposed to just 3.15% of white students. I don’t have enough personal or intellectual background to discuss these figures, but to my uninformed eye, they look quite promising. More blacks (by percentage) are choosing to study Computer Science than whites (our baseline majority in the US). And while black undergraduate students, at 8.4%, feel underrepresented, the news indicates that graduation rates are improving.

America needs to keep moving forward on providing excellent education to all Americans, not just the privileged majority. Perhaps our next President–who looks to be Barack Obama–will be tougher on education than his “no child left behind” predecessor and we’ll see these numbers get even better.

Twitter is Shared Perception, Not Science

Posted in P2P,Quantitative,Science,Web 2.0 by Elliott Back on May 12th, 2008.

Today’s post by Robert Scoble on the earthquake that rocked China brings out an important distinction about the nature of a distributed messaging service like Twitter. Scoble enthuses about the speed of information delivery in his post, thrilled that he knew about the earthquake 50 miles from Chengdu three minutes before anyone else did:

I reported the major quake to my followers on Twitter before the USGS Website had a report up and about an hour before CNN or major press started talking about it. […] Several people in China reported to me they felt the quake WHILE IT WAS GOING ON!!!

While this is a great leap in keeping the world informed about what is going on in any part of it, literally at the speed of light, what Twitter does is let you share perception and opinion with the rest of the world. This is different from sharing facts about what is going on. For example, the USGS report, which came out three minutes after Chinese citizens began twittering that there was seismic activity, is full of precise details:

Magnitude: 7.9
Monday, May 12, 2008 at 06:28:00 UTC

Location: 31.099°N, 103.279°E
Depth: 10 km (6.2 miles) set by location program
Distances: 90 km (55 miles) WNW of Chengdu, Sichuan, China
Location Uncertainty: horizontal +/- 5.8 km (3.6 miles); depth fixed by location program
Event ID: us2008ryan

If you look at Robert Scoble’s twitter stream, what you get instead is a succession of misinformation, subsequent corrections, noise, predictions of doom, and frenzy:

  • 06:37:49 – @dtan just reported an earthquake in Beijing. Wonder how large it is?
  • 06:40:50 – @keso reported earthquake too. @dtan said it lasted 10 seconds. I’d guess it’s a 4.5 then.
  • 06:41:21 – @michaelrice says it was a 7.8.
  • 06:44:14 – @gaberivera says it’s 57 miles from Chengdu, which has 11 million residents.
  • 06:57:46 – @jwalkerjr says to hold off on predictions. Well, I need to pass along my experience with earthquakes. This is a HUGE one.
  • 07:15:20 – @casperodj just said it felt like the earth was going to split. Literally everything was shaking.
  • For more just wade through the mud

To his credit, you can get an impression of the event, as seen through his and others’ eyes. You can get an idea of the scope, and the impact it has had on people around the world. But you can’t get trustworthy facts from listening to what the general public is saying in the face of a disaster. A calm rationality is needed that Twitter cannot provide.

Still, Rory Cellan-Jones of the BBC is holding out hope that Twitter can mature into a real-time news service:

Let’s see, as this story unfolds, whether this is the moment when Twitter comes of age as a platform which can bring faster coverage of a major news event than traditional media, while allowing participants and onlookers to share their experiences.

Unfortunately, I don’t think that will happen. Twitter is fast, and it will let you share your experiences, but it will never replace solid journalism and hard facts. What do you think?

How many users does DIGG have?

Posted in Blogging,Quantitative,Scalability,Science,Web 2.0 by Elliott Back on February 3rd, 2008.

When John Graham-Cumming asked the question How Many Users Does Digg Have?, there were a few things he couldn’t tell you, since his data consisted of randomly self-sampled users. Well, with the power of two PHP scripts, we can pull large amounts of user data and form queries. Our first question is how has DIGG grown over time?

A graph of 187,054 digg users, randomly plotted against when they joined

This doesn’t tell us much, though, about how many DIGG users there actually are, or how active they are, so I plotted a histogram of the number of times these 187,054 users’ profiles had been viewed; the answer, unsurprisingly, is not very often in most cases:

83% of users had fewer than 50 profile views
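The bucketing behind a histogram like this can be sketched in a few lines of PHP. This is not the script that produced the graph; `histogram` and `shareBelow` are hypothetical helper names, and the sample view counts are made up purely for illustration.

```php
<?php
// Hypothetical helper: count values per fixed-width bin, keyed by the
// lower bound of each bin (0, 50, 100, ...).
function histogram(array $values, int $binWidth): array
{
    $bins = [];
    foreach ($values as $v) {
        $start = intdiv($v, $binWidth) * $binWidth;
        $bins[$start] = ($bins[$start] ?? 0) + 1;
    }
    ksort($bins); // order bins by their lower bound
    return $bins;
}

// Hypothetical helper: fraction of values strictly below $threshold.
function shareBelow(array $values, int $threshold): float
{
    if (count($values) === 0) {
        return 0.0; // avoid division by zero on an empty sample
    }
    $below = 0;
    foreach ($values as $v) {
        if ($v < $threshold) {
            $below++;
        }
    }
    return $below / count($values);
}

// Made-up profile-view counts, NOT the real Digg sample.
$views = [3, 12, 0, 47, 51, 260, 8, 19, 1200, 5];

foreach (histogram($views, 50) as $start => $count) {
    printf("%d-%d views: %d users\n", $start, $start + 49, $count);
}
printf("%.0f%% of sampled users had fewer than 50 profile views\n",
    100 * shareBelow($views, 50));
```

Feeding the profile-view column from the script in the Code Appendix into `shareBelow` with a threshold of 50 would be one way to arrive at a figure like the one in the caption.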

And what about users who are active? How many people are digging stories every day? The answer is very few. I took a random sample of 29,225 users from the previous sample and used the DIGG API to query for their last digg. It turns out 31% (9,125) had never dugg anything! After I removed those, here is the histogram I got:

About 15% of Digg users dugg a story in the last week
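Once each sampled user's last digg time is known, the two shares quoted above are simple counts. The sketch below assumes the times are Unix timestamps, with `null` standing for "never dugg", and it assumes the weekly share is taken over the whole sample; `activitySummary` is a hypothetical helper and the timestamps are invented.

```php
<?php
// Hypothetical helper: summarize activity from last-digg Unix timestamps.
// A null entry means the user has never dugg anything (an assumption about
// how the data is encoded, not something the Digg API guarantees).
function activitySummary(array $lastDiggs, int $now, int $windowDays = 7): array
{
    $total = count($lastDiggs);
    $never = 0;
    $recent = 0;
    $cutoff = $now - $windowDays * 86400; // start of the activity window
    foreach ($lastDiggs as $ts) {
        if ($ts === null) {
            $never++;
        } elseif ($ts >= $cutoff) {
            $recent++;
        }
    }
    return [
        'total'        => $total,
        'never'        => $never,
        'recent'       => $recent,
        'never_share'  => $total > 0 ? $never / $total : 0.0,
        'recent_share' => $total > 0 ? $recent / $total : 0.0,
    ];
}

// Invented timestamps relative to an arbitrary reference time.
$now = 1202000000;
$sample = [
    null, $now - 3600, $now - 30 * 86400, null, $now - 2 * 86400,
    $now - 400 * 86400, null, $now - 10 * 86400, $now - 90 * 86400,
    $now - 6 * 86400,
];
$s = activitySummary($sample, $now);
printf("%d of %d never dugg (%.1f%%), %d dugg in the last week (%.1f%%)\n",
    $s['never'], $s['total'], 100 * $s['never_share'],
    $s['recent'], 100 * $s['recent_share']);

// The never-dugg share from the figures in the post: 9,125 of 29,225.
printf("%.1f%% never dugg in the real sample\n", 100 * 9125 / 29225);
```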

Concluding thoughts

Digg boasts an official tally of 2.2M users, but at most 20% of them can be considered real, active users. That would bring their user count down to 440,000, far fewer than a popular web 2.0 boom child should be able to boast about, and it significantly undercuts the $300M (or ~$700 a user) valuation that they keep trying to get.
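For anyone who wants to check the arithmetic, here it is spelled out using only the figures quoted above. The variable names are mine, and the per-user number is a plain division, not a real valuation model.

```php
<?php
$officialUsers = 2200000;   // official tally quoted above (2.2M)
$activeShare   = 0.20;      // upper bound on real, active users
$valuation     = 300000000; // the $300M valuation

// Estimated active users: 20% of the official tally.
$activeUsers = (int) round($officialUsers * $activeShare);

// Valuation spread over active users versus over the official tally.
$perActiveUser   = $valuation / $activeUsers;
$perOfficialUser = $valuation / $officialUsers;

printf("Active users: %s\n", number_format($activeUsers));
printf("Valuation per active user: $%.2f\n", $perActiveUser);
printf("Valuation per official user: $%.2f\n", $perOfficialUser);
```

Per active user this works out to roughly $682, which is where the rounded ~$700 comes from; spread over the full official tally it would be closer to $136.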

Code Appendix

The {digg user, time joined, digg id, profile page views} information was gathered by the following script:


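    // Digg API endpoint for listing users; type=php returns PHP-serialized data.
    $base = 'http://services.digg.com/users/?appkey=http://example.com&type=php';
    $data = unserialize(file_get_contents($base . '&count=0'));
    $total = $data->total;
    echo "There are $total total users\n";

    // Fetch 1000 pages of 100 users each, starting at random offsets.
    for ($i = 0; $i < 1000; $i++) {
        $offset = rand(0, $total - 100);
        $data = unserialize(@file_get_contents($base . '&count=100&offset=' . $offset));

        $j = 0;
        foreach ($data->users as $user) {
            // Scrape the numeric digg id from the user's profile page.
            $page = @file_get_contents('http://digg.com/users/' . $user->name . '/');

            preg_match('/id="userid" value="(\d+)"/i', $page, $matches);
            // One CSV row per user: digg id, position, name, join time, profile views.
            echo $matches[1] . ",";
            echo ($offset + $j++) . ",";
            echo $user->name . ",";
            echo $user->registered . ",";
            echo $user->profileviews . "\n";
        }
    }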
