Build complete RSS feed reader app – PHP RSS to JSON API

Hello again! In this three-part series I am going to show you how to build a complete RSS feed reader app using Vue.js and Vuex, in which you can add channels, favorite posts, search, and more. For the backend we will use a PHP API that fetches the feed, caches it for a day, and returns simple JSON to be consumed later by our Vue app.

Source Code

In this first part, we will create a PHP RSS parser that returns its response as JSON. Here are the requirements for this API.

  • Get the XML feed from URL
  • Validate it for XML
  • Cache it
  • Parse XML and return JSON

It's always a good idea to start with the API of your class: how you would use it in a perfect scenario. I am sure you know there are lots of packages out there that parse RSS feeds, but most of them come with extra features we rarely need, so we are going to build a simple class, QRss, which we can use with syntax like the following to get the RSS feed response as JSON.

// get the JSON from the feed
(new QRss($feed_url))->json();
// cache for a week
(new QRss($feed_url))->cache_for('7 days')->json();
// get a fresh copy, ignoring the cache
(new QRss($feed_url))->fresh()->json();
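
The chaining above only works if cache_for() and fresh() each return the object itself. The class below is a minimal sketch of that idea, not the actual QRss source: the class name FluentSketch, the property defaults, the settings() method, and the use of strtotime() to turn the string into seconds are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for QRss that shows only the chainable setters.
class FluentSketch
{
    private $url;
    private $fresh_copy = false;  // assumed default: use the cache
    private $cache_ttl = 86400;   // assumed default: one day, in seconds

    public function __construct($url)
    {
        $this->url = $url;
    }

    // Accepts a relative time string such as '10 minutes' or '7 days'.
    public function cache_for($period)
    {
        $now = time();
        $until = strtotime($period, $now);

        // Ignore strings strtotime() cannot parse or that point to the past.
        if ($until !== false && $until > $now) {
            $this->cache_ttl = $until - $now;
        }

        return $this; // returning $this is what makes chaining possible
    }

    public function fresh()
    {
        $this->fresh_copy = true;

        return $this;
    }

    // Illustration only: exposes the state so we can inspect it.
    public function settings()
    {
        return ['ttl' => $this->cache_ttl, 'fresh' => $this->fresh_copy];
    }
}

$settings = (new FluentSketch('https://en.blog.wordpress.com/feed/'))
    ->cache_for('10 minutes')
    ->fresh()
    ->settings();

print_r($settings);
```

Each setter changes one property and hands the same object back, so the calls can be combined in any order before json() is finally invoked.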

Structure of RSS Feed

Let’s look at the structure of an RSS feed XML document. Below is a simple RSS feed response with only the required fields; real-world feeds can contain more elements, all of which are described in the full RSS 2.0 specification.

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>QCode</title>
        <link>http://qcode.in</link>
        <description>All about Full Stack Web Development</description>
        ...

        <item>
            <title>Post title</title>
            <link>http://qcode.in/post-link</link>
            <description>...</description>
        </item>
    </channel>
</rss>
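
To get a feel for how this structure maps to PHP, here is a small sketch that loads the sample feed with simplexml_load_string() and reads a few fields. The elided parts of the sample are left out, and the item description uses placeholder text of my own.

```php
<?php
// The sample feed from above; "Placeholder description" is made-up filler.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>QCode</title>
        <link>http://qcode.in</link>
        <description>All about Full Stack Web Development</description>
        <item>
            <title>Post title</title>
            <link>http://qcode.in/post-link</link>
            <description>Placeholder description</description>
        </item>
    </channel>
</rss>
XML;

$rss = simplexml_load_string($sample);

// Child elements are exposed as properties; cast to string to get the text.
$channel_title = (string) $rss->channel->title;
$first_item_link = (string) $rss->channel->item[0]->link;

// Repeated elements such as <item> can be counted and looped over.
$item_count = count($rss->channel->item);

echo $channel_title, "\n";
echo $first_item_link, "\n";
echo $item_count, "\n";
```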

Initialize with URL

The QRss($url) constructor accepts a URL. The first thing we do is validate it and assign it to the url property; if the URL is invalid, we simply return a JSON response with an error message.

public function __construct($url)
    {
        if ( ! filter_var($url, FILTER_VALIDATE_URL) ) {

            $this->json_response([
                "{$this->error_msg_key}" => 'Invalid feed URL'],
            400);
        }

        $this->url = $url;
    }
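
To see what this check accepts and rejects, here is a short sketch. Keep in mind that FILTER_VALIDATE_URL only checks the shape of the URL and does not limit it to HTTP, so the is_http_url() helper below (a hypothetical addition, not part of QRss) also restricts the scheme to http and https.

```php
<?php
// Hypothetical stricter check: a well-formed URL that also uses http or https.
function is_http_url($url)
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}

$candidates = [
    'https://en.blog.wordpress.com/feed/', // complete URL
    'not a url',                           // free text
    'en.blog.wordpress.com/feed/',         // scheme is missing
];

$results = [];
foreach ($candidates as $candidate) {
    // filter_var() returns the URL on success and false on failure.
    $results[$candidate] = filter_var($candidate, FILTER_VALIDATE_URL) !== false;
}

var_dump($results);
var_dump(is_http_url('file:///etc/passwd'));
```

Only the first candidate passes the plain filter_var() check. Restricting the scheme is worth considering here, because the class later hands the URL to file_get_contents().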

Fetch the Feed

Now that we have a valid URL, we can request it to get the XML feed. We do this when the ->json() method is called on the object. This method first fetches the URL; if the content is valid XML, it delegates caching and parsing, and finally outputs the JSON response.

public function json()
    {
        if ( ! $xml = $this->fetch() ) {
            // error_get_last() returns null when no PHP error was recorded
            $last_error = error_get_last();

            $this->json_response( [
                "{$this->error_msg_key}" =>  "Unable to connect to URL",
                'error' => $last_error['message'] ?? null],
            500);
        }

        $this->json_response($this->parse($xml));
    }
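
The json_response() helper is used everywhere but not listed in this post. A minimal sketch of how such a helper could look is below; it is written as a plain function rather than a method, and the 'message' key, the encode_payload() split, and the JSON_UNESCAPED_SLASHES flag are my assumptions for the example.

```php
<?php
// Hypothetical helper: builds the JSON body. Kept separate so it can be
// exercised without sending headers or ending the script.
function encode_payload(array $data)
{
    return json_encode($data, JSON_UNESCAPED_SLASHES);
}

// Hypothetical json_response(): sends status code, header and body, then stops.
function json_response(array $data, $status = 200)
{
    http_response_code($status);
    header('Content-Type: application/json');
    echo encode_payload($data);
    exit; // nothing else should be written after the JSON body
}

// Only the body is built here, so the script keeps running.
$body = encode_payload(['message' => 'Invalid feed URL']);
echo $body, "\n";
```

Ending the script inside the helper would explain why the constructor and json() can call it and carry on without an explicit return.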

Fetch Method

Since we need caching, we first check whether $this->fresh_copy is false (it can be set to true by calling ->fresh() on the object). If it is false, we look in the cache first, and if a valid cached copy exists we return its content. Otherwise we fetch the content from the URL and cache it if it is valid XML.

private function fetch()
    {
        // check if fresh copy needed
        if( ! $this->fresh_copy ) {
            if( $cached = $this->get_cache($this->url)) {
                return $cached;
            }
        }

        $content = @file_get_contents($this->url);

        if ( ! $content ) return false;

        // validate the xml
        if( $xml = $this->validate_xml($content) ) {
            // put it in cache
            return $this->cache_it($this->url, $content);
        }

        return false;
    }

Validate the XML

We need to check whether the content returned from the URL is a valid XML feed. To do that we can simply try to load it with simplexml_load_string($content), which returns false if the content is not valid XML. We also require a channel element, so well-formed XML that is not an RSS feed is rejected too. I used @ to suppress the warnings, since we will be sending the error in the JSON output.

private function validate_xml($xml)
    {
        $rss = @simplexml_load_string($xml);

        return ($rss && $rss->channel) ? $rss : false;
    }
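
If you would rather not use @, libxml can collect the parser errors for you. The function below is a sketch of that alternative, not the method QRss uses; the name validate_xml_verbose() and its return shape are made up for this example.

```php
<?php
// Variant of validate_xml() that collects parser errors instead of hiding them.
function validate_xml_verbose($xml)
{
    // Buffer libxml errors internally rather than raising PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $rss = simplexml_load_string($xml);

    $errors = [];
    foreach (libxml_get_errors() as $error) {
        $errors[] = trim($error->message);
    }

    // Empty the buffer and restore whatever setting was active before.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $valid = $rss !== false && isset($rss->channel);

    return ['rss' => $valid ? $rss : false, 'errors' => $errors];
}

$good = validate_xml_verbose('<rss version="2.0"><channel><title>QCode</title></channel></rss>');
$bad = validate_xml_verbose('<rss><channel>');
$not_feed = validate_xml_verbose('<html><body>Hello</body></html>');

var_dump($good['rss'] !== false);
var_dump($bad['errors']);
var_dump($not_feed['rss']);
```

The broken document comes back with parser messages we could pass along in the JSON error, while the HTML document parses cleanly but is still rejected because it has no channel element.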

Cache it

We have a simple caching implementation that checks whether a cache file exists and whether its last-modified time is still within the current TTL. If the cache has expired, the file is removed and the flow continues: new content is fetched from the URL and put in the cache for later use. It's all handled in the fetch() method.

private function cache_it($url, $content)
    {
        $file_path = $this->get_cache_dir() . '/' . $this->generate_filename($url);

        $this->setup_cache_dir();

        file_put_contents($file_path, $content);

        return $this->validate_xml($content);
    }
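
The helpers get_cache_dir(), generate_filename() and setup_cache_dir() are not listed in this post. As a rough sketch of what they could do, the functions below store a feed under a hashed file name in the system temp directory. All function names, the directory name, and the choice of md5() are assumptions for illustration.

```php
<?php
// Hypothetical helpers; the real QRss versions are not shown in this post.
function sketch_cache_dir()
{
    return sys_get_temp_dir() . '/qrss-cache-sketch';
}

function sketch_filename($url)
{
    // Hashing the URL gives a fixed-length name without slashes or colons.
    return md5($url) . '.xml';
}

function sketch_cache_put($url, $content)
{
    $dir = sketch_cache_dir();

    // Create the cache directory on first use.
    if (!is_dir($dir)) {
        mkdir($dir, 0775, true);
    }

    $path = $dir . '/' . sketch_filename($url);
    file_put_contents($path, $content, LOCK_EX);

    return $path;
}

$url = 'https://en.blog.wordpress.com/feed/';
$path = sketch_cache_put($url, '<rss version="2.0"><channel/></rss>');

// Read the file back to confirm the round trip, then clean up.
$stored = file_get_contents($path);
unlink($path);

echo basename($path), "\n";
```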

Is Cache expired

Here we simply check whether the cached file's last-modified time, read with filemtime($file_path), is still within the cache TTL. The method returns true once the file is older than the TTL.

private function is_cache_expired($file_path)
    {
        return filemtime($file_path) < (time() - $this->get_cache_ttl());
    }
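
To watch the comparison at work, the sketch below runs the same check against a temporary file, first when it is brand new and then after backdating it by an hour. The function name is_expired() and the explicit $ttl_seconds parameter are changes made only so the example runs outside the class.

```php
<?php
// Same comparison as is_cache_expired(), with the TTL passed in explicitly.
function is_expired($file_path, $ttl_seconds)
{
    return filemtime($file_path) < (time() - $ttl_seconds);
}

$file = tempnam(sys_get_temp_dir(), 'qrss');
$ttl = 600; // ten minutes, in seconds

// A file created just now is still inside the TTL window.
$fresh_result = is_expired($file, $ttl);

// Backdate the modification time by an hour to simulate an old cache file.
touch($file, time() - 3600);
clearstatcache(); // filemtime() results are cached within a request
$stale_result = is_expired($file, $ttl);

unlink($file);

var_dump($fresh_result, $stale_result);
```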

Parse RSS Feed

Now that we have the XML, it’s time to parse it and convert it to JSON. We already have a SimpleXMLElement object, so we can traverse it very easily. The parse($xml) method is protected, so you can change the parsing by extending the class and overriding this method. For our app we only need the channel info and each post's title, description, permalink, and publish date, so a simple loop will do.

protected function parse($xml)
    {
        if( is_object($xml) ) {
            // channel info
            $this->parser['channel'] = [
                'title' => (string) $xml->channel->title,
                'link' => (string) $xml->channel->link,
                'img' => (string) $xml->channel->image->url,
                'description' => (string) $xml->channel->description,
                'lastBuildDate' => (string) $xml->channel->lastBuildDate,
                'generator' => (string) $xml->channel->generator
            ];

            // feed items
            $this->parser['items'] = [];

            foreach ( $xml->channel->item as $item ) {
                array_push($this->parser['items'], [
                    'title' => (string) $item->title,
                    'link' => (string) $item->link,
                    'description' => (string) $item->description,
                    // cast first so entity-escaped HTML is decoded before tags are stripped
                    'description_text' => strip_tags((string) $item->description),
                    'pubDate' => (string) $item->pubDate
                ]);
            }

            return $this->parser;
        }

        $this->json_response([ "{$this->error_msg_key}" => 'Unable to Parse xml format.'], 500);

        return false;
    }
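
Here is a trimmed-down, standalone sketch of the same loop so you can see the shape of the JSON it produces. The sample feed and its CDATA description are made up for the example, and the sketch casts the description to a string before calling strip_tags().

```php
<?php
// Made-up feed with an HTML description wrapped in CDATA, as many blogs do.
$xml = simplexml_load_string(
    '<rss version="2.0"><channel><title>QCode</title>' .
    '<item><title>Post title</title><link>http://qcode.in/post-link</link>' .
    '<description><![CDATA[<p>Hello <b>world</b></p>]]></description></item>' .
    '</channel></rss>'
);

$items = [];
foreach ($xml->channel->item as $item) {
    // Casting to string returns the element text, including CDATA content.
    $html = (string) $item->description;

    $items[] = [
        'title' => (string) $item->title,
        'link' => (string) $item->link,
        'description' => $html,
        'description_text' => strip_tags($html),
    ];
}

$json = json_encode([
    'channel' => ['title' => (string) $xml->channel->title],
    'items' => $items,
]);

// Decode again to show the structure the Vue app will receive.
$decoded = json_decode($json, true);
print_r($decoded);
```

The description keeps its HTML for rendering, while description_text holds the plain text, which is handy for excerpts and search.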

QRss class is ready

Now our class is ready. Let’s use it with our desired API syntax.

require 'QRss.php';

// Get the feed and cache for 10 min
(new QRss('https://en.blog.wordpress.com/feed/'))->cache_for('10 minutes')->json();

// Get the fresh feed ignoring cache
(new QRss('https://en.blog.wordpress.com/feed/'))->fresh()->json();

In the next part, we will build the front end using Vue.js, with Vuex for state management. Hope to see you in the next one. You can grab the source code from the git repo. Please do comment if you need any help or have suggestions.
