Elliott C. Back: Internet & Technology

Optimizing Wordpress Performance & Speed

Posted in Blogging, Code, Performance, Wordpress by Elliott Back on July 27th, 2006.

We start with a fresh copy of Wordpress 2.0.3, straight from the Wordpress download center. Our goal is to figure out which parts are slow and can be improved. We’re not interested in sacrificing features or existing code for speed, like Lightpress has done. Rather, we’d like to identify the worst performers and improve them, if possible, in the default install.

Our microscope will be XDebug, a PHP performance tool that we’ve installed as a zend extension on our server. It’s a C module that keeps track of the time spent running PHP code, so we can grab and analyse its output to determine what Wordpress is doing under the hood.

Our data set will be all the posts, comments, and pages from this blog. Currently, that is 1405 posts with 6977 comments and 4 pages. This should provide ample material for testing. The database will be optimized after importing.

Initial profiling

Loading the main Wordpress page produces the following traces:

wordpress-trace-01.jpg

That’s more than 10,000 calls to preg_replace, 6,410 to str_replace, and 2,465 to strstr.

wordpress-trace-02.jpg

wptexturize is the heaviest function, with 24.4% of the time spent in its own code. MySQL accounts for 5.5%, and apply_filters for 4.5%. Template loading and PHP compilation (require_once) together make up another 30-40% of the load time.

First optimizations

The first thing I notice is that index.php sets a config variable and just includes another file. Why not just move that define() into wp-blog-header.php? It’ll save a function call; that’s not a lot of time, but it’s cleaner. Also, the ABSPATH define should be done before any other files are included, and thus copied from wp-config.php to index.php to save a few more function calls. There’s also a line that sets the timer with an unnecessary extra assignment.

Nitpicking’s not going to get me anywhere. Let’s take a look at wptexturize. We note that we can replace some preg_replace calls with str_replace, because only static strings are replaced, so we add the ['\'s','’s'] to the cockneyreplace array hack. We apply this process to all static strings. We can also convert the strings from double quotes, which PHP has to scan for variables and escape sequences, to cheaper single quotes, and put the regular-expression rules into an array of their own, just like the static ones.

These simple operations reduce the time spent inside wptexturize from 24% to 16% of the total, and in absolute terms from 600ms to 200ms. We’ve also reduced the number of preg_replace calls dramatically, from 10,439 to 3,289, and their total time from 74ms to 36ms. We’ve gone from 54ms for 6,410 str_replace calls to 29ms for 1,405 calls. Why is this? By calling each function only once per chunk of text, with an array of rules, we save a lot of extra PHP operations and just hand off a bunch of data to fast, underlying C functions. Interestingly, using array_walk/array_map is not faster than a plain loop.
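To make the batching idea concrete, here is a minimal sketch. The three sample rules and the helper names replace_one_by_one and replace_batched are made up for illustration; they are not part of Wordpress. Both helpers should produce the same text, but the batched version makes a single PHP-level call and lets the C implementation walk the rule arrays.

```php
<?php
// Hypothetical sample rules, much shorter than the real wptexturize tables.
$search  = array('---', '--', '...');
$replace = array('&#8212;', '&#8211;', '&#8230;');

// One str_replace call per rule: several PHP-level calls per chunk of text.
function replace_one_by_one($text, array $search, array $replace) {
    foreach ($search as $i => $needle) {
        $text = str_replace($needle, $replace[$i], $text);
    }
    return $text;
}

// One batched call: str_replace applies the rules in array order internally.
function replace_batched($text, array $search, array $replace) {
    return str_replace($search, $replace, $text);
}

$sample   = 'Wait... really -- yes---really';
$looped   = replace_one_by_one($sample, $search, $replace);
$batched  = replace_batched($sample, $search, $replace);

// preg_replace accepts arrays of patterns and replacements in the same way.
$measures = preg_replace(
    array('/(\d+)x(\d+)/', '/(\d+)\'/'),
    array('$1&#215;$2', '$1&#8242;'),
    "640x480 and 6'"
);
```

Rule order matters in both versions: '---' has to come before '--', or the longer dash would never match.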

Here’s the new wptexturize function:

<?php

function wptexturize($text) {
    $next = true;
    $output = '';
    $curl = '';
    $textarr = preg_split('/(<.*>)/Us', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $stop = count($textarr);

    for($i = 0; $i < $stop; $i++){
        $curl = $textarr[$i];

      if (isset($curl[0]) && '<' != $curl[0] && $next) { // If it's not a tag
          // static strings
          $static_characters = array('---', ' -- ', '--', 'xn&#8211;', '...', '``', '\'tain\'t', '\'twere', '\'twas', '\'tis', '\'twill', '\'til', '\'bout', '\'nuff', '\'round', '\'cause', '\'s', '\'\'', ' (tm)');
          $static_replacements = array('&#8212;', ' &#8212; ', '&#8211;', 'xn--', '&#8230;', '&#8220;', '&#8217;tain&#8217;t', '&#8217;twere', '&#8217;twas', '&#8217;tis', '&#8217;twill', '&#8217;til', '&#8217;bout', '&#8217;nuff', '&#8217;round', '&#8217;cause', '&#8217;s', '&#8221;', ' &#8482;');
          // one call applies every static rule, in array order
          $curl = str_replace($static_characters, $static_replacements, $curl);

          // regular expressions
          $dynamic_characters = array('/\'(\d\d(?:&#8217;|\')?s)/', '/(\s|\A|")\'/', '/(\d+)"/', '/(\d+)\'/', '/(\S)\'([^\'\s])/', '/(\s|\A)"(?!\s)/', '/"(\s|\S|\Z)/', '/\'([\s.]|\Z)/', '/(\d+)x(\d+)/');
          $dynamic_replacements = array('&#8217;$1','$1&#8216;', '$1&#8243;', '$1&#8242;', '$1&#8217;$2', '$1&#8220;$2', '&#8221;$1', '&#8217;$1', '$1&#215;$2');
          $curl = preg_replace($dynamic_characters, $dynamic_replacements, $curl);
      } elseif (strstr($curl, '<code') || strstr($curl, '<pre') || strstr($curl, '<kbd') || strstr($curl, '<style') || strstr($curl, '<script')) {
          // strstr is fast
          $next = false;
      } else {
          $next = true;
      }

      // escape bare ampersands that are not already part of an entity
      $curl = preg_replace('/&([^#])(?![a-zA-Z1-4]{1,8};)/', '&#038;$1', $curl);
      $output .= $curl;
    }

    return $output;
}
?>

Update: As of change #4511 on 11/21/06, Ryan Boren has merged this into the Wordpress core code. So you now have it!

What’s next

A new look at the numbers shows the next biggest culprit is apply_filters. Even on a default installation, it’s slow, taking 106ms of its own time, and 438ms total. Unfortunately, there doesn’t look to be an easy way to optimize it. Other slow spots, like get_settings and list_cats are likewise difficult to immediately improve. An easy way to increase performance would be to rewrite the plugins architecture and improve the filters mechanism, but that would mean an API change.
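To see why a filters mechanism adds up, here is a deliberately simplified sketch of a filter registry. The SketchFilters class and its methods are hypothetical and are not the Wordpress implementation; the assumption is only that callbacks are stored per tag and per priority, and that every apply call loops over all of them.

```php
<?php
// Hypothetical, simplified filter registry; not Wordpress code.
class SketchFilters {
    // $filters[tag][priority] = list of callbacks
    private $filters = array();

    public function add($tag, $callback, $priority = 10) {
        $this->filters[$tag][$priority][] = $callback;
    }

    public function apply($tag, $value) {
        if (empty($this->filters[$tag])) {
            return $value;
        }
        ksort($this->filters[$tag]); // lower priority numbers run first
        foreach ($this->filters[$tag] as $callbacks) {
            foreach ($callbacks as $callback) {
                // every registered callback costs one more PHP-level call
                $value = call_user_func($callback, $value);
            }
        }
        return $value;
    }
}

$registry = new SketchFilters();
$registry->add('the_title', 'strtoupper', 20);
$registry->add('the_title', 'trim');
$title = $registry->apply('the_title', '  hello  ');
```

In a design like this, the cost of each apply grows with the number of callbacks registered for the tag, and a page that filters every title, excerpt, and comment pays that cost many times over.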

Some good news

It’s not all bad though; here’s how the change to wptexturize performs, tested with PHP 5 on Windows on a 1.8 GHz machine, with 10 large entries on the homepage:

ab -n 100 -c 1 http://localhost/test/
Document Length: 190017 bytes
Requests per second: 0.78 [#/sec] (mean)
Time per request: 1278.438 [ms] (mean)

Now, here’s the default install of Wordpress 2.0.3:

ab -n 100 -c 1 http://localhost/test/
Document Length: 189786 bytes
Requests per second: 0.77 [#/sec] (mean)
Time per request: 1297.344 [ms] (mean)

Conclusion

While the new wptexturize function performs well on a server with an opcode cache, on an “out of the box” configuration the gain is lost among other costs: the two ab runs above differ by only about 19ms per request, or roughly 1.5%. Also, a quick glance at the Wordpress core code shows no easy patches that would increase performance by a large margin, just design ideas that could be gradually improved over time.

Bye bye, Amazon.com!

Posted in Computers & Technology, Family, Friends by Elliott Back on August 19th, 2005.

Today was my last day at Amazon.com. I finished up my project, made sure the documentation was all in place, and sent some farewell emails. Then at the end of the day, when all my coworkers told me to go home, I turned in my badge(s), laptop, and other miscellaneous stuff and took off. I had a great time working there. The team was great, and I think for a first internship I really learned a lot about how corporate work in software engineering is done. Now I just need to update my resume, cross my fingers, and hope they call me back for another shot!

Update:

Funny thing–a guy just emailed me and asked:

I was looking around about amazon internships and stumbled onto your blog. I just wanted to e-mail
you and see if you’d be willing to let me know a little about the experience. I would really like to intern
there next summer, although I am not sure I will have enough experience to pass the interview. How did
you like it? How was life outside the building? Look forward to hearing back from you :)

I sent him the following more detailed note about my summer:

Today was my last day working at Amazon.com, but I loved every minute of it. My team was great! I worked for the Procurement team, which is responsible for buying all of the products that Amazon sells from their respective vendors. My job over the summer was to write a metrics framework for evaluating group progress towards business goals. I eventually created a start-to-finish solution that took raw data and after some work gave you metrics numbers. The metrics framework had to be scalable, extendable, sturdy, and fast.

Outside the Amazon buildings (there are four in Seattle), there’s a lot to do. Just the other day I went out with a friend from Cornell and a bunch of other people I didn’t know, and a couple Seattle party crashers for dinner and afterparty. There shouldn’t be any lack of fun.

From your email I would assume you’re a rising sophomore? What classes have you taken? Read this article here:

elliottback.com/wp/archives/2005/08/04/how-to-hire-the-best/

Then tell me, at the bottom, if you can write good versions of strstr, is_anagram, and atoi in at least two of c, c++, c#, java, lisp, or perl.

It’s interesting that young people are so worried about interviews, and preparing for them. I’m not going to give away any interview secrets, but not because there actually are any. Rather, you just need to be competent, self-confident, experienced, and winsome!

How to hire the best

Posted in Computers & Technology, Deals & Savings, Education, Science by Elliott Back on August 4th, 2005.

The infamous Mark Jen has posted his take on Joel’s hiring essay. Basically, Joel makes the argument that hiring the absolute best programmers is the best thing for a software company, because superb programmers are investments that more than pay for themselves. It’s basically an argument of averages: everyone can build software, but the companies that can build great software are few and noticeable. To give a concrete example:

When everyone is making ugly square mp3 players, a stylish mp3 player with rounded edges and careful design will be king.

A coworker and I were discussing this yesterday and today. Obviously, when hiring candidates for positions, we want good ones. However, we go beyond the code of hiring the best of the best; we actually do what we say here. A candidate whom you can’t respect as being of equal or greater skill, who doesn’t appear to possess basic skills, or who is in any way lacking is simply not good enough. A company shouldn’t hire someone that limps over the corporate minimum bar to fill a position.

Until there’s someone you find who can leap over a bar twice as high with ease, you don’t want to fill that position. So, don’t make your interviews easy. If you’re doing an interview, make it moderately challenging for someone of your level. Include a “screener” technical question that you think anyone with similar skills and general knowledge should be able to easily answer. Some good interview question choices include:

  • Tell me if there are two numbers in an array that sum to x
  • How do hashmaps work? How would you hash a string?
  • Generate permutations of x
  • Reverse a C string
  • Write a tree to linked-list function
  • Write an efficient recursive function to garbage collect memory
  • Describe how a compiler works.
  • Give an overview of DNS, TCP, filesystems, process scheduling, pipelining, or some other high-level CS topic
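As a rough idea of what a passing answer to two of these screeners might look like, here is a minimal sketch in PHP. The function names has_pair_with_sum and reverse_bytes are my own, the pair-sum version assumes integer input, and the reversal works on bytes rather than multibyte characters.

```php
<?php
// True if two different elements of $numbers sum to $x.
// Assumes integers; one pass with a lookup table, so O(n) expected time.
function has_pair_with_sum(array $numbers, $x) {
    $seen = array();
    foreach ($numbers as $n) {
        if (isset($seen[$x - $n])) {
            return true; // the complement showed up earlier in the array
        }
        $seen[$n] = true;
    }
    return false;
}

// Reverses a string byte by byte by swapping from both ends,
// the same way you would reverse a C string in place.
function reverse_bytes($s) {
    $i = 0;
    $j = strlen($s) - 1;
    while ($i < $j) {
        $tmp   = $s[$i];
        $s[$i] = $s[$j];
        $s[$j] = $tmp;
        $i++;
        $j--;
    }
    return $s;
}
```

For example, has_pair_with_sum(array(3, 8, 11, 4), 15) finds 11 + 4, while a lone 5 does not count as a pair summing to 10.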

Once you’ve passed them through an easy coding question and another general question, you can start to interview them based on their resume, because you know that they’ve met a minimum requirement to do their job. If you’re impressed at the end, hire them. Otherwise, why bother? The cost of hiring someone who doesn’t impress you and your teammates is greater than the benefit of filling that vacant position.

Update:

I just noticed Shelley’s comment on this old hiring post. It reads:

That is the worst interview question I’ve heard of. It is guaranteed to discriminate in favor of a certain type of developer, and not necessarily a good one.

No wonder you people can’t find good engineers. You don’t know how to interview worth a damn. You’re looking for code monkeys, but interviewing engineers. I had a feeling this was what was happening when I talked with someone who interviewed at Microsoft and the same thing happened. Absolutely silly questions-and yes, very biased. Your HR department has done a poor job.

Asking somebody how to do code the strstr function. I’d hire the person who looked at you like you were daft and said, “I’d use the function built into the language. Now what _job_ is it you want me to do?”

I just have to add to the conversation, and point out that asking an interviewee to code a basic function like that is industry best practice. It’s the absolute lowest bar. Sure, if you actually can code, then these questions will seem ridiculous, but otherwise? You don’t hire a programmer who can’t write code, so you need to see if they can write code. Shelley would rather have interviews, I guess, that go like this:

Interviewer: So, you can code basic functions, do recursion, handle arrays, right?
Shelley: You bet I can! And more!
Interviewer: Fantastic–just had to check.
Shelley: Let’s move onto more interesting things…

Nope, it doesn’t work like that, because we can’t trust you to tell us the truth. Your abilities have to be assessed. Unfortunately, in another comment, Shelley goes on to say:

Any interview that resorts to having the interviewee code is a bad interview. Shows that your staff is too inexperienced to know how to interview.

She also makes a big hand-waving pseudoscientific argument about long term / short term memory with regards to coding. See, the thing is, the most basic part of this kind of job description is writing code. Sure, we create systems, do designs, model databases, and create relational object oriented structures, but then a software developer sits down and implements. Writes code. You wouldn’t believe how many people cannot write a function to reverse the elements of an array, in any language.

Here’s your challenge:

O readers, show your might. I’m going on vacation this weekend, but when I come back, I want efficient implementations of strstr, is_anagram, atoi for any base, and edit_distance. Log the time it takes you to write each one, too. Remember–these are basic interview “crawl over the bar” questions…
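For reference, here is one possible set of answers for three of the four, sketched in PHP. These are byte-oriented versions under my own assumptions: atoi_base accepts bases 2 through 36, stops at the first invalid digit, and ignores overflow. PHP already ships a built-in levenshtein(), so edit_distance is written out only to show the technique.

```php
<?php
// Anagram check: same length and same per-byte character counts.
function is_anagram($a, $b) {
    if (strlen($a) !== strlen($b)) {
        return false;
    }
    return count_chars($a, 1) == count_chars($b, 1);
}

// atoi for bases 2..36: optional leading whitespace and sign,
// then digits until the first character that is not valid in $base.
function atoi_base($s, $base = 10) {
    if ($base < 2 || $base > 36) {
        throw new InvalidArgumentException('base must be between 2 and 36');
    }
    $digits = '0123456789abcdefghijklmnopqrstuvwxyz';
    $s = strtolower(ltrim($s));
    $len = strlen($s);
    $i = 0;
    $sign = 1;
    if ($i < $len && ($s[$i] === '-' || $s[$i] === '+')) {
        $sign = ($s[$i] === '-') ? -1 : 1;
        $i++;
    }
    $value = 0;
    for (; $i < $len; $i++) {
        $d = strpos($digits, $s[$i]);
        if ($d === false || $d >= $base) {
            break; // not a digit in this base
        }
        $value = $value * $base + $d;
    }
    return $sign * $value;
}

// Levenshtein edit distance, keeping only two rows of the DP table.
function edit_distance($a, $b) {
    $m = strlen($a);
    $n = strlen($b);
    $prev = range(0, $n); // distance from the empty prefix of $a
    for ($i = 1; $i <= $m; $i++) {
        $curr = array($i);
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($a[$i - 1] === $b[$j - 1]) ? 0 : 1;
            $curr[$j] = min($prev[$j] + 1, $curr[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        $prev = $curr;
    }
    return $prev[$n];
}
```

The two-row table keeps edit_distance at O(n) memory while the running time stays O(mn).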