11 November 2016

Blogger Integration

Soon after launching this website, it became clear to me that this blog wasn't visible enough on the homepage. It was very easy to miss entirely, hidden behind a solitary grey button in the navigation.

My original knee-jerk response was to consider just placing a bigger central button on the homepage: "Come read my blog! Please!" Certainly, this would be an easy thing to do, but visitors to the homepage still wouldn't know about any new blog posts I've published, or that I'm still active. It's just a static page.

What I needed was a way to list the latest blog posts on the home page. Then I thought it'd be cool if I could highlight newly added posts the visitor hadn't yet seen. How would I go about doing this?


My blog is actually hosted by Blogger, which has a pretty comprehensive API and client libraries for a bunch of programming languages. Given all I needed was a way to retrieve the top N blog posts, I felt that including the whole library (and its necessary dependencies) was excessive. Instead, I could call the REST endpoint directly, and because this website is a Spring Boot application, I knew I could do it easily using RestTemplate.

Because the top N blog posts are publicly available information, I didn't need to worry about authenticating my request. I just needed to generate an API key.

So, the first step for me is always just opening the endpoint in a browser to see what's returned, which turned out to be far more than I needed!

{
 "kind": "blogger#postList",
 "nextPageToken": "PAGETOKEN",
 "items": [
  {
   "kind": "blogger#post",
   "id": "POSTID",
   "blog": {
    "id": "BLOGID"
   },
   "published": "2016-11-11T02:21:00-08:00",
   "updated": "2016-11-11T02:21:46-08:00",
   "etag": "\"ETAG\"",
   "url": "http://TESTACCOUNT.blogspot.com/2016/11/new-post.html",
   "selfLink": "https://www.googleapis.com/blogger/v3/blogs/BLOGID/posts/POSTID",
   "title": "New post",
   "content": "\u003cdiv dir=\"ltr\" style=\"text-align: left;\" trbidi=\"on\"\u003e\nTesting post\u003c/div\u003e\n",
   "author": {
    "id": "AUTHORID",
    "displayName": "Mark Simkins",
    "url": "https://www.blogger.com/profile/PROFILEID",
    "image": {
     "url": "//lh3.googleusercontent.com/-ODggBYX_s5k/ABCDEFG/photo.jpg"
    }
   },
   "replies": {
    "totalItems": "0",
    "selfLink": "https://www.googleapis.com/blogger/v3/blogs/BLOGID/posts/POSTID/comments"
   }
  },
  <!-- Many, many more items -->
 ],
 "etag": "\"ETAG\""
}
(using a test blog, and sanitised)

Reading the documentation a little more, I found parameters I could pass to reduce the amount of information being returned. It's always a good idea to do this, as a smaller response is quicker to transfer and simpler to parse.

For a start, the endpoint returns many more posts than I'd ever want to show on the homepage, so adding "maxResults=3" as a parameter reduced that down. The response for each post also contained a bunch of information I wasn't interested in, like "kind", "etag", "content", "replies" etc. Adding "fields=items(id,published,title,url)" as a parameter tidied this up.
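
As a rough sketch of how those two parameters fit into the request URL, here is one way the string could be put together using only the JDK. The class and method names are made up for illustration, and BLOGID and the API key are placeholders. The only thing of note is that the brackets and commas in the "fields" value get percent-encoded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds the Blogger "list posts" URL described above. */
final class BloggerPostsUrl {
    private static final String BASE = "https://www.googleapis.com/blogger/v3/blogs/";
    // Only the fields the homepage needs.
    private static final String FIELDS = "items(id,published,title,url)";

    private BloggerPostsUrl() {}

    /** blogId is assumed to be a plain alphanumeric ID, so it is not encoded. */
    static String build(String blogId, String apiKey, int maxResults) {
        return BASE + blogId + "/posts"
                + "?key=" + encode(apiKey)
                + "&maxResults=" + maxResults
                + "&fields=" + encode(FIELDS);
    }

    private static String encode(String value) {
        // Query-string encoding: "(" , "," and ")" become %28, %2C and %29.
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Calling BloggerPostsUrl.build("BLOGID", "APIKEY", 3) gives a URL ending in "maxResults=3&fields=items%28id%2Cpublished%2Ctitle%2Curl%29".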

Now, I was getting this as a response:

{
 "items": [
  {
   "id": "POSTID",
   "published": "2016-11-11T02:21:00-08:00",
   "url": "http://TESTACCOUNT.blogspot.com/2016/11/new-post.html",
   "title": "New post"
  },
  {
   "id": "POSTID2",
   "published": "2016-11-10T05:22:00-08:00",
   "url": "http://TESTACCOUNT.blogspot.com/2016/11/post-2.html",
   "title": "Post 2"
  },
  {
   "id": "POSTID3",
   "published": "2016-11-10T05:21:00-08:00",
   "url": "http://TESTACCOUNT.blogspot.com/2016/11/post-3.html",
   "title": "Post 3"
  }
 ]
}

Much better. So now, I needed to define some Java classes to hold this data. Clearly, from the response above, all we need is one class with an "items" field, which is a list of post objects.

(Note that these classes are simplified for the sake of this article - one minor frustration is that you will need to define setter methods for all private fields, so that Jackson deserialisation will work properly.)

public class BlogPostsResponse {
    private List<BlogPost> items;

    public void setItems(List<BlogPost> items) {
        this.items = items;
    }

    public List<BlogPost> getItems() {
        return items;
    }
}

Now for the BlogPost class:

public class BlogPost {
    private long id;
    private String title;
    private OffsetDateTime published;
    private String url;

    /* ... getters and setters ... */
}

That's all we need to handle the API response. I then set up a new Controller in my application for testing, then added the following code to actually make the request.

RestTemplate restTemplate = new RestTemplate();
BlogPostsResponse response = restTemplate.getForObject(POSTS_URL, BlogPostsResponse.class);
(where POSTS_URL is a constant holding the full Blogger API URL for retrieving posts)

Very simply, that's all you need to do to retrieve the latest blog posts. You call

model.addAttribute("blogPosts", response.getItems());

to push it to your template, and then you can iterate over the list using your templating engine of choice. You will have to do something with the published date to render it nicely, but I'll leave that to you.
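
If you'd like a starting point for the date, here is a minimal sketch using java.time. The helper name and the "d MMMM yyyy" pattern are just my assumptions about what "nicely" might mean; swap in whatever suits your site.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical helper: turns the "published" timestamp into display text. */
final class PublishedDates {
    // Day, full month name, year; the locale is fixed so month names are English.
    private static final DateTimeFormatter DISPLAY =
            DateTimeFormatter.ofPattern("d MMMM yyyy", Locale.ENGLISH);

    private PublishedDates() {}

    /** Formats the date as it reads in the post's own UTC offset. */
    static String display(OffsetDateTime published) {
        return published.format(DISPLAY);
    }
}
```

For the first post in the response above, PublishedDates.display(OffsetDateTime.parse("2016-11-11T02:21:00-08:00")) returns "11 November 2016".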

Limitations

As always, the simplest way of implementing a feature isn't necessarily sufficient for production use. There are a number of limitations of the basic approach above.

  1. No error handling. If anything changes, or goes wrong (and it will, this is the internet), it will blow up in fun ways. You should wrap the restTemplate call in a try/catch, and implement this whole feature in such a way that if it breaks, it doesn't stop the whole page from rendering. This is known as graceful degradation. In this case, the "Latest Blog Posts" section won't show, but the rest of the page will load as normal.
  2. Related to the above, no timeout handling. What if Blogger is down? Does your entire website now take 30 seconds/forever to load? You should set fairly strict timeouts on your RestTemplate and handle the error appropriately. I've set my timeouts to 2 seconds.
  3. Excessive API use. Every time someone loads your homepage, you're making a largish API call to Blogger. Ok, this is unlikely to really cause any issues unless you get regular traffic, but it should be a consideration. Simply caching the response in your application isn't sufficient though - how does it get updated when you publish a new post?
    The approach I went with is to cache the posts locally on application startup. Then, for each page view, I'm making a tiny API call to Blogger (with a single line response) to get the time the blog was last updated. If it matches what I have, I just use the cached blog posts. If it doesn't, I update the local cache. This isn't ideal, as it still involves making API calls on each page view (albeit much smaller), so I'll think about this a little more. It also makes my application "stateful", which is generally best avoided if at all possible.
  4. No new post highlighting. This basic implementation doesn't highlight new posts that a returning visitor hasn't yet seen.
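
To make points 1 and 3 a bit more concrete, here is a simplified sketch of the "check the last-updated time, reload only if it changed" idea, with a fallback if anything throws. It is not my actual code: the class name is invented, the two Supplier arguments stand in for the small "last updated" call and the larger "list posts" call, and it loads lazily on first use rather than at startup. Timeouts aren't shown, as they belong on the RestTemplate itself.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical cache: reloads posts only when the blog's "updated" stamp changes. */
final class LatestPostsCache<T> {
    private final Supplier<String> lastUpdatedCall; // small API call; may throw
    private final Supplier<List<T>> postsCall;      // larger API call; may throw

    private String cachedStamp;                     // stamp the cached posts belong to
    private List<T> cachedPosts = List.of();        // empty until the first successful load

    LatestPostsCache(Supplier<String> lastUpdatedCall, Supplier<List<T>> postsCall) {
        this.lastUpdatedCall = lastUpdatedCall;
        this.postsCall = postsCall;
    }

    /** Never throws: on any failure it returns whatever was cached last. */
    synchronized List<T> get() {
        try {
            String stamp = Objects.requireNonNull(lastUpdatedCall.get());
            if (!stamp.equals(cachedStamp)) {
                // Blog changed (or nothing cached yet): fetch the posts again.
                List<T> fresh = List.copyOf(postsCall.get());
                cachedPosts = fresh;
                cachedStamp = stamp;
            }
        } catch (RuntimeException e) {
            // Graceful degradation: keep the old list; an empty list hides the section.
        }
        return cachedPosts;
    }
}
```

The template can then simply skip the "Latest Blog Posts" section when the list is empty, and the rest of the page renders as normal.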

New Post Highlighting

How can this be done?

Well, the first thing that comes to mind is to make use of a persistent cookie.

  1. On first view, no cookie exists, so there's no point highlighting all posts as New. Store a persistent cookie containing a CSV of the post IDs seen.
  2. On subsequent views, load the cookie, split the CSV of post IDs, and compare against the blog posts we have loaded. If there is a loaded blog post that isn't in the cookie, that's new.
  3. Add something to the template to render any "New" highlights where necessary.
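
The steps above can be sketched roughly as follows. This is an illustration rather than my exact code: the class name is invented, post IDs are treated as plain strings, and reading or writing the actual HTTP cookie is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical per-visitor view of the "seen posts" cookie. */
final class SeenPosts {
    private final boolean firstVisit;
    private final Set<String> seenIds;

    private SeenPosts(boolean firstVisit, Set<String> seenIds) {
        this.firstVisit = firstVisit;
        this.seenIds = seenIds;
    }

    /** cookieValue is null when the visitor has no cookie yet (step 1). */
    static SeenPosts fromCookie(String cookieValue) {
        if (cookieValue == null) {
            return new SeenPosts(true, Set.of());
        }
        // Step 2: split the CSV of post IDs, ignoring blanks.
        Set<String> ids = Arrays.stream(cookieValue.split(","))
                .map(String::trim)
                .filter(id -> !id.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
        return new SeenPosts(false, ids);
    }

    boolean hasPostId(String postId) {
        return seenIds.contains(postId);
    }

    /** A post is "new" only for a returning visitor who hasn't seen it. */
    boolean isNew(String postId) {
        return !firstVisit && !hasPostId(postId);
    }

    /** CSV of the currently loaded post IDs, to store back in the cookie. */
    static String toCookieValue(List<String> loadedPostIds) {
        return String.join(",", loadedPostIds);
    }
}
```

One thing to watch: a raw comma isn't a valid cookie character under RFC 6265, so depending on your server you may need to encode the value (or pick a different separator) before writing it out.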

This is pretty straightforward, but I fell into a trap on my first implementation! Initially, I rather naively added an "isNew" flag to the BlogPost object, which I set to true if it wasn't referenced in the visitor's cookie. I then simply checked the "isNew()" method when displaying the post, and set the highlight accordingly.

Did you spot the problem with this?

Because the list of BlogPosts is cached, whenever I set the "isNew" flag, I was setting it globally. For everyone viewing the page. This became clear to me when I tested the page in a different browser, and still saw the "New" highlight. And when I saw that I needed to set the "isNew" flag to false, I realised my mistake.

Thankfully, once I noticed this, it was quite easy to fix. I removed the "isNew" flag from the BlogPost object. Because I had already extracted the cookie handling code into its own class, I simply passed that object to the template. I had already added a "hasPostId" method to this class, so I checked that method for each blog post, rendering the highlight if necessary. Because this cookie class is generated fresh for each visitor, the highlighting is now unique for everyone. Sorted.

I didn't want to complicate this system any further, so I simply rewrite the cookie with the new IDs after new posts have been discovered. This does mean that on the next page load, all highlighting disappears, but for this use case I don't think that's a problem.

Where is it then?

You may've noticed that (at the time of writing) there isn't yet a "Latest Blog Posts" section on the homepage.

While I implemented everything mentioned above in a single afternoon, I now have a bigger challenge ahead... visually designing it!

I can really struggle with visual design, which is why I'm a server-side developer. I just need to figure out where and how it should go on the homepage, while keeping the content and aesthetic neat and appropriate... watch this space!

Update: this is now live on the homepage!
