Meet the new Loop – a lesson about the PHP SPL

A single Query for all loops

Some days ago I found an interesting question about displaying posts. The user wanted to first show posts that have a thumbnail and then those that don’t have one. On top of that, he wanted to show posts from different categories on different parts of the page. Overall a pretty common request.

Take your time as I’m going into detail. You’ll need approximately 6 minutes to read the post without reading the code. The result of the approach explained in this post can be used in themes. It can be used as a library, included from within your `functions.php` file, or as a standalone plugin. Either way, it will greatly simplify your life when crafting a new theme from scratch or altering an existing one.

Multiple Loops: The basics

The first thing that a novice designer who has read some (painfully wrong) tutorials would do is to use one call to query_posts() for each loop. A more seasoned designer or developer would then maybe add multiple get_posts() calls after the primary loop where s/he just skipped the posts that were of no use. Or s/he would just use the main loop, pushing all unused posts into a new array and then using a simple foreach()-loop for those leftover posts later on.

Advanced Loops

A good developer would suggest using the WP_Query class. And a really good dev would alter the output using the pre_get_posts or (if s/he knows a bit about SQL or can spend some time) the posts_clauses and similar filters to reduce the impact on the database. A real master might even extend the WP_Query class to reduce the footprint, wrap everything up as a nice little package and make it portable and easy to share. Here’s an example of such a “master”-query object:

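class My_Book_Query extends WP_Query
{
    public function __construct( $args = array() )
    {
        add_filter( 'posts_fields', array( $this, 'posts_fields' ) );

        // Forced args: values set here win over whatever the caller passes
        parent::__construct( wp_parse_args( array(
            'posts_per_page' => -1
        ), $args ) );

        // The query has run: stop altering other queries of this request
        remove_filter( 'posts_fields', array( $this, 'posts_fields' ) );
    }

    public function posts_fields( $sql )
    {
        // Assumes the terms table gets JOINed elsewhere, e.g. via posts_join
        return "{$sql}, {$GLOBALS['wpdb']->terms}.name AS 'book_category'";
    }
}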

You can then use it like this:

$book_query = new My_Book_Query();
if ( $book_query->have_posts() )
{
    while ( $book_query->have_posts() )
    {
        $book_query->the_post();
        # ...do stuff...
    }
    wp_reset_postdata();
}

Thoughts about “advanced”

Now all this is nice. But often it takes quite some time to alter the query and get it right. And then there’s the huge problem that you never know if there isn’t a plugin or theme adding another callback to the same filter, messing up your carefully crafted query. And finally there’s the need to call those queries in templates and loop over them. And if you’re into the separation of concerns, then you probably already split those queries off of your theme, moved them into a plugin and are calling them with a filter in your templates. And then you got a real pile of code that needs documentation, maybe unit tests and so on.

As a result of all that, I thought about going down a different route and asked myself the following question: What’s the purpose of the loop? The answer is simple. It spares people the depths of WordPress core code and offers an easy way to adjust the look of your site by just adding or removing some function calls. Now all those functions rely on the global $wp_query object. So calling a “template tag” in fact just reads something from whatever is currently inside the global $post. The conclusion is that if I want to simplify things, I only need something that constantly sets the global to the object the WP_Query object would currently point to. That would give us all the ease of Template Tags, and I only have to care about simplifying how to actually split one query into multiple loops. Easy, right?
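To make that idea a bit more tangible, here is a tiny sketch in plain PHP. Nothing in it is WordPress code: demo_the_title() and the post objects are stand-ins I made up to mimic how a template tag reads from the global $post.

```php
<?php
// Stand-in for a template tag (hypothetical name): it takes no post
// argument and simply reads whatever currently sits in the global $post.
function demo_the_title()
{
    return '<h2>' . $GLOBALS['post']->post_title . '</h2>';
}

$titles = array();
foreach ( array( 'Dune', 'Solaris' ) as $name )
{
    // Whoever drives the loop only has to swap the global ...
    $GLOBALS['post'] = (object) array( 'post_title' => $name );

    // ... and the "template tag" picks up the right post on its own.
    $titles[] = demo_the_title();
}

echo implode( PHP_EOL, $titles ), PHP_EOL;
```

So whatever ends up driving the loop, the only contract it has to fulfil is keeping that global in sync.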

Going beyond “advanced” – Meet the SPL

And the solution is right in front of us: the PHP SPL, or Standard PHP Library. This library is part of PHP’s core and has been enabled by default since PHP 5.0 (before PHP 5.3 it could still be turned off, but why would anyone do that?). It brings us a nice set of interfaces like Countable and one extremely useful thing: Iterators.

Iterators are something like “advanced” loops. They exist above the basic foreach, while, etc. loops, but basically do the same thing – looping over “things”. By utilizing them you’re able to be much more specific and have access to lots of information. In other words: No workarounds needed, as you got every bit of needed info at your fingertips.
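A minimal sketch of what “info at your fingertips” can look like, using the SPL’s ArrayIterator. The book titles and variable names are made up for illustration.

```php
<?php
// Illustrative data only: the titles are made up for this sketch.
$books = new ArrayIterator( array( 'Dune', 'Neuromancer', 'Solaris' ) );

// An iterator is an object: ArrayIterator implements Countable,
// so there is no need for a manual counter variable.
$total = count( $books );

$lines = array();
foreach ( $books as $index => $title )
{
    $lines[] = sprintf( '%d/%d: %s', $index + 1, $total, $title );
}

// Jump straight to a position instead of looping up to it.
$books->seek( 2 );
$third = $books->current();

echo implode( PHP_EOL, $lines ), PHP_EOL, $third, PHP_EOL;
```

The loop itself still is a plain foreach. The difference is that the thing being looped can answer questions about itself.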

For example, a RecursiveIteratorIterator allows looping through a multidimensional array, accesses all levels and lets you identify them easily by calling its getDepth() method. You also got stuff like GlobIterator or its multidimensional counterpart RecursiveDirectoryIterator, which is kind of an advanced glob function that lets you loop through directory structures and easily skip the . and .. filesystem pointers. The list of Iterators is long and you should really give it some time and play around with it to see its full power. Of course there are other benefits we gain. Normally a foreach loop over an object would just expose its public properties. With iterators we can tell exactly which part of an object is loopable (or Traversable).
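Both points can be sketched in a few lines. The nested array and the BookShelf class below are hypothetical examples of mine; the SPL classes and methods are the real ones.

```php
<?php
// Made-up nested data for this sketch: genres, sub-genres, titles.
$shelf = array(
    'Fiction' => array(
        'Sci-Fi' => array( 'Dune', 'Solaris' ),
        'Crime'  => array( 'The Big Sleep' ),
    ),
    'Cooking' => array( 'Salt Fat Acid Heat' ),
);

$walker = new RecursiveIteratorIterator(
    new RecursiveArrayIterator( $shelf ),
    RecursiveIteratorIterator::SELF_FIRST // visit parents before children
);

$outline = array();
foreach ( $walker as $key => $value )
{
    // Arrays are (sub-)genres, everything else is a title.
    $label     = is_array( $value ) ? $key : $value;
    $outline[] = str_repeat( '  ', $walker->getDepth() ) . $label;
}

// Hypothetical class: foreach only sees $reviews, not the public $owner.
class BookShelf implements IteratorAggregate
{
    public $owner   = 'me';
    public $reviews = array( 'Dune', 'Solaris' );

    public function getIterator(): Iterator
    {
        return new ArrayIterator( $this->reviews );
    }
}

$looped = iterator_to_array( new BookShelf() );

echo implode( PHP_EOL, $outline ), PHP_EOL;
echo implode( ', ', $looped ), PHP_EOL;
```

The first half indents every entry by its depth, the second half shows that an object implementing IteratorAggregate decides on its own what a foreach gets to see.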

Let’s get back to our task and look into the benefits we’d gain from using Iterators on our loop. First, let’s sketch out a possible scenario: We’re building a special category page for a category named books with the ID of 25. The template would be named either category-books.php or category-25.php and be used as archive for book reviews. Then we want to first show all books that have a featured image and later show a list of the ten recent reviews that we haven’t already displayed in our Thumbnail list.

Figure: MockUp for our fictional book review archive page – crafted in MS Paint.

As I’ve written previously, the default thing to do would be to add a meta_query to the main query from inside a callback to pre_get_posts and search only those posts that have _thumbnail_id set. Then we’d need to figure out how to order those that have that meta value set, etc. Or we could use a FilterIterator.

The FilterIterator is an abstract class. This means that it’s not meant to be used directly. You need to extend it with your own class and implement the mandatory method accept(). While looping through our array or object, this method tells the iterator whether the current key/value pair should be part of the loop or be skipped. Let’s take a look at our real world example and the ThumbnailFilter class – the explanation follows below:
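Stripped of everything WordPress, the accept() contract can be sketched like this. EvenFilter is a hypothetical class name and the numbers are just sample data.

```php
<?php
// Hypothetical filter for this sketch: keeps only even numbers.
class EvenFilter extends FilterIterator
{
    public function accept(): bool
    {
        // TRUE keeps the current element in the loop, FALSE skips it.
        return 0 === $this->current() % 2;
    }
}

$numbers = new ArrayIterator( range( 1, 6 ) );

// FALSE as second argument: don't preserve the original keys.
$even = iterator_to_array( new EvenFilter( $numbers ), false );

// The wrapped iterator itself is untouched and still holds all six items.
echo implode( ', ', $even ), ' of ', count( $numbers ), PHP_EOL;
```

Note that nothing gets removed from the underlying data. The filter only decides what this particular loop gets to see.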

class ThumbnailFilter extends FilterIterator
{
    protected $wp_query;

    protected $allowed;

    protected $total = 0;

    public function __construct( Iterator $iterator, WP_Query $wp_query )
    {
        NULL === $this->wp_query AND $this->wp_query = $wp_query;

        // Number of posts actually found (posts_per_page may be -1)
        $this->total = $this->wp_query->post_count;

        // Save some processing time by saving it once
        NULL === $this->allowed
            AND $this->allowed = $this->wp_query->have_posts();

        parent::__construct( $iterator );
    }

    public function accept()
    {
        if (
            ! $this->allowed
            OR ! $this->current() instanceof WP_Post
        )
            return FALSE;

        // Switch index, Setup post data, etc.
        $this->wp_query->the_post();

        // Last WP_Post reached: Setup WP_Query for next loop
        $this->wp_query->current_post === $this->total -1
            AND $this->wp_query->rewind_posts();

        // deny() reports the posts to skip, so accept the opposite:
        return ! $this->deny();
    }

    public function deny()
    {
        return ! has_post_thumbnail( $this->current()->ID );
    }
}

Basically it’s nothing more than a plain vanilla FilterIterator. But the part that makes it “compatible” with the WP_Query – and therefore allows the usage of Template Tags – is the important bit. I added a protected property named $wp_query. So when you create an instance of our ThumbnailFilter, you have to pass the global $wp_query, which holds the main query object, as second argument. That way we have access to this object and can keep track of the current post like WordPress’ native loop does. Let me quickly walk you through what happens:

  1. We have the protected $wp_query property. If it’s not set yet, it gets set to the object passed as second argument: the one and only global $wp_query object. The argument got a type hint to WP_Query to let PHP throw readable and understandable error messages when we provide the wrong object.
  2. The protected $allowed property gets set: It is set to the result of WP_Query::have_posts(), so we don’t run into an empty query result by accident.
  3. Inside the accept() method, the $allowed property gets checked. Additionally we check if the current()ly looped object really is an instance of WP_Post. If so, we proceed. If not, we skip forward to the next object. Everything that isn’t a WP_Post clearly isn’t anything that we want to loop.
  4. If we passed our test at the beginning of accept(), we call $wp_query->the_post(). This triggers WP_Query::the_post(), which moves the internal pointer to the next post in the main loop, sets up its post data and, on the first call, runs any callbacks attached to loop_start.
  5. Next we check if we already reached the end of all posts for this request. If so, we let WordPress rewind the query and its internal pointer, so the next loop starts at the first post again.
  6. Finally we call the deny() method. And this is what it all is about. Hint: $this->current() always points to our currently looped WP_Post object. To target a post, we need to read the ID as a property of what this method returns: $this->current()->ID, for example.
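The steps above can be tried out without a WordPress install. The sketch below replaces WP_Query and WP_Post with two tiny stand-in classes of my own (DemoQuery, DemoPost) and uses a simple flag instead of has_post_thumbnail(). All of these names are assumptions for illustration, not WordPress APIs.

```php
<?php
// Stand-ins for this sketch only: NOT the WordPress classes.
class DemoPost
{
    public $ID;
    public $has_thumb;

    public function __construct( $ID, $has_thumb )
    {
        $this->ID        = $ID;
        $this->has_thumb = $has_thumb;
    }
}

class DemoQuery
{
    public $posts;
    public $post_count;
    public $current_post = -1;
    public $post; // plays the role of the global $post

    public function __construct( array $posts )
    {
        $this->posts      = $posts;
        $this->post_count = count( $posts );
    }

    public function have_posts()   { return $this->post_count > 0; }
    public function the_post()     { $this->post = $this->posts[ ++$this->current_post ]; }
    public function rewind_posts() { $this->current_post = -1; }
}

class DemoThumbFilter extends FilterIterator
{
    protected $query;
    protected $want_thumb;

    public function __construct( Iterator $iterator, DemoQuery $query, $want_thumb )
    {
        $this->query      = $query;
        $this->want_thumb = $want_thumb;
        parent::__construct( $iterator );
    }

    public function accept(): bool
    {
        // Step 3: bail on empty results and on anything that isn't a post.
        if ( ! $this->query->have_posts() OR ! $this->current() instanceof DemoPost )
            return FALSE;

        // Step 4: keep the query's pointer in sync with the iterator.
        $this->query->the_post();

        // Step 5: last post reached, rewind for the next loop.
        $this->query->current_post === $this->query->post_count - 1
            AND $this->query->rewind_posts();

        // Step 6: the actual decision.
        return $this->current()->has_thumb === $this->want_thumb;
    }
}

$query = new DemoQuery( array(
    new DemoPost( 1, true ),
    new DemoPost( 2, false ),
    new DemoPost( 3, true ),
) );
$loopObj = new ArrayObject( $query->posts );

// Small helper: collect the IDs a loop hands out.
$ids = function ( Iterator $loop ) {
    $out = array();
    foreach ( $loop as $post )
        $out[] = $post->ID;
    return $out;
};

$with    = $ids( new DemoThumbFilter( $loopObj->getIterator(), $query, true ) );
$without = $ids( new DemoThumbFilter( $loopObj->getIterator(), $query, false ) );

echo implode( ',', $with ), ' | ', implode( ',', $without ), PHP_EOL;
```

Both loops run over the same result set, and because the pointer gets rewound after the last post, the second loop starts from scratch.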

The deny() method is there for one single reason: The ability to extend our *Filter class. It contains only one thing: The native WordPress Conditional Tags that tell if we want to output that post or not. Doing so, we simplify things massively for our (fictional) designer who crafts the final templates. Let’s take a look at the specific implementation in the category-books.php template:

global $wp_query;
// Use the posts that are already loaded; get_posts() would run the query again
$loopObj = new ArrayObject( $wp_query->posts );
$primaryLoop = new ThumbnailFilter( $loopObj->getIterator(), $wp_query );
 
foreach ( $primaryLoop as $post )
{
    the_post_thumbnail();
    the_title( '<h2>', '</h2>' );
}

And we’re done. We didn’t have to separate any query with WP_Query args or fight with any SQL statements to get our stuff in the right order. And as an added benefit, our template file is nice and clean. Yes, we could go even further and abstract the ArrayObject creation into a template tag like get_loop_object() or similar, but we’ve already taken a big step.

But there’s another, maybe even more interesting part: We can reuse that pattern for later loops and even show our designer how to use it without the need to learn anything new. As explained above, we separated the deny() method out of accept(). Let’s now take a look at what a secondary loop might look like to give you a better understanding of the reason behind this:

class NoThumbnailFilter extends ThumbnailFilter
{
    public function deny()
    {
        return has_post_thumbnail( $this->current()->ID );
    }
}

This second class extends our previous ThumbnailFilter and does nothing else than define the conditional tag that keeps our Iterator from looping through a post. In the above case, we want to loop every post that we haven’t looped before. And this would be all posts that don’t have a thumbnail. Example implementation in the template:

// global $wp_query; -> Already set earlier in our template file
// $loopObj = new ArrayObject( $wp_query->posts ); -> Set earlier
$secondaryLoop = new NoThumbnailFilter( $loopObj->getIterator(), $wp_query );
 
foreach ( $secondaryLoop as $post )
{
    the_title( '<h2>', '</h2>' );
    the_excerpt();
}

Quite easy, isn’t it? You can easily extend the base filter iterator by just defining deny() with the Conditional Tags. You can even go one step further and make the base class abstract, declaring deny() without defining it, to get a more general approach. You could as well implement Countable, add a $counter property that counts up on each accepted post and return it from a count() method to easily output the number of posts for the current loop. Overall everything I’ve shown in this post is just a starting point. This Gist contains a proof-of-concept plugin. If you want to see how far I get with this approach, just follow me on Github. If I manage to build a base theme, then I’ll publish it there and announce it on this blog.
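A rough sketch of how such an abstract, countable base class could look. BaseFilter, ShortTitleFilter and $counter are hypothetical names of mine and not taken from the proof-of-concept plugin, and plain strings stand in for posts so the snippet runs without WordPress.

```php
<?php
// Hypothetical base class for this sketch.
abstract class BaseFilter extends FilterIterator implements Countable
{
    protected $counter = 0;

    // Child classes only have to say which items to skip.
    abstract public function deny();

    public function rewind(): void
    {
        // Start counting from scratch whenever the loop restarts.
        $this->counter = 0;
        parent::rewind();
    }

    public function accept(): bool
    {
        if ( $this->deny() )
            return FALSE;

        $this->counter++;
        return TRUE;
    }

    public function count(): int
    {
        // Number of items accepted so far in the current run.
        return $this->counter;
    }
}

// Hypothetical child class: skips every title longer than six characters.
class ShortTitleFilter extends BaseFilter
{
    public function deny()
    {
        return strlen( $this->current() ) > 6;
    }
}

$titles = new ArrayIterator( array( 'Dune', 'Neuromancer', 'Solaris', 'Ubik' ) );
$loop   = new ShortTitleFilter( $titles );

$shown = array();
foreach ( $loop as $title )
    $shown[] = $title;

$count = count( $loop );

echo implode( ', ', $shown ), ' (', $count, ')', PHP_EOL;
```

Keep in mind that a counter like this only knows about posts it has already seen, so count() is meaningful after the loop has finished, not before.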

Happy crafting!

5 thoughts on “Meet the new Loop – a lesson about the PHP SPL”

  1. I guess you don’t intend the posted code as a real world scenario, but what I wanted to say is that I strongly agree that iterators are better than foreach/while loops on arrays for different reasons, including performance. But to filter query results I think it is better to act on the SQL, possibly using a class that extends WP_Query and implements IteratorAggregate. This way the FilterIterator implementations can be used on SQL-filtered results, instead of on an ArrayObject created from an unfiltered WP_Query::$posts array.

    • It’s not actually “filtering” as in “dropping stuff”. FilterIterators are used to “filter for this current loop”. But yes, you’ll have to reduce the overall number of results to those that you need to process. You just don’t have to add them in a specific order up front – this can easily be done with (Filter)Iterators. The example I’ve shown in this post should just show a very basic usage. It’s assuming that you’re already happy with your settings and the default set of posts that you get per request and your only need is to display parts of them in different locations.

  2. Very interesting. Recently I’ve been working on integrating SPL with WordPress as well. However, IMHO even if a filter iterator can be very useful, the example you posted is probably not the best one. The reason is performance. The filter iterator doesn’t act on the SQL query, so if you have dozens of posts and only a few have a thumbnail, you waste resources on getting all posts and filtering them after that. In addition, the ‘deny’ method runs an additional db query (has_post_thumbnail = get_post_meta), so if the actual query returns, let’s say, 400 posts (you use posts_per_page = -1), and only 5 have a featured image, your code runs 401 db queries to show 5 posts. In this specific example I’d have extended WP_Query, adding filters on posts_join and posts_where to retrieve only posts with a thumbnail. Same result but 1 db query vs. 401.

    • The example plugin sets the posts_per_page value to -1 for demonstration – loading all your posts of that post type. In a real world scenario you wouldn’t do that. I just did that to really catch posts that have a thumbnail. Else someone who wants to give it a try would have to go in and edit posts, depending on her/his set order for that archive, just to be able to give it a try. Please also keep in mind that this post is about multiple loops in a template/request.

      About the performance: Actually the performance increases. The reason is simple: SPL Iterators (opposite to a foreach/while loop) don’t generate a copy of the Array and therefore don’t process each single item.

    • Just to respond to the performance part, WordPress pulls down all meta-data for any posts in a `WP_Query` and stuffs it in a cache, so using `get_post_meta()` doesn’t (generally) cause any additional db queries.
