Monday, June 9, 2014

Building a Customized News Service with Google Apps Script

A big part of my job involves staying on top of current technology events and news. There are several technology-related news sites that I read daily, and typically I scan them first thing in the morning for anything I consider to be important before reading other items of interest. It’s a bit tedious to open each site and read all the individual stories for the ones I care about, and even the blog readers I use aren't well-suited for a quick morning summary of important items.

I've been playing around with Google Apps Script for a while now, and I’m pretty impressed with the versatility and power available via the APIs coupled with the simplicity of creating and running scripts. So, I decided to see if I could write a script that would scan the news sites I read each morning and deliver a customized news feed to my Gmail inbox, along with a summary of my day’s calendar. I was originally going to try the same thing with AppEngine and cron tasks, but it turns out Apps Script is more than enough to handle this job. I’m pretty happy with the results, and in this post I’ll discuss how I created the script.

Overview

The script itself is pretty simple: it scans a set of news feeds, creates links to the stories, scans the headlines for specific words, and calls them out separately from the rest of the news. It also reads my calendar for the day’s events and tells me if I have any upcoming events that I have not yet RSVP’d to.

Getting started with Google Apps Script development is pretty easy; just make sure that you have a Google account, are signed in, and visit script.google.com.

To get this script to work, there are a couple of things you'll need to do:
  1. Enable the Calendar API. From within the script editor, select Resources -> Advanced Google Services, then enable Calendar API.
  2. You also need to enable the Calendar API from within the Google Developers Console. There's a link to do this in the dialog in step 1.

What It Looks Like

You create scripts using the Apps Script IDE, which is of course hosted in the cloud. It's not exactly Eclipse or Visual Studio, but it's very functional and gets the job done:

[click image for a larger view]

When the script is run, it delivers an email to the address of whoever the person running the script is. The screen shot below shows what the resulting email looks like:

[click image for a larger view]

At the top, I have my daily events including ones that need my response, followed by my curated top stories, then the rest of the news items.

The script relies on some global settings to work, which I've listed here:

The main function of this script is called deliverNews(). It looks like this:

The deliverNews function does several things:
  1. Reads the Calendar for the day's upcoming events
  2. Checks the week ahead to see if there are any events I have not RSVP'd to
  3. Reads the news feeds and builds the Top Stories list
  4. Puts the rest of the news stories into their own sections

Reading the Calendar

Let’s start by looking at the code to generate the calendar summary. The Google Apps Script API has an advanced service called Calendar. This service is a very precise wrapping of the actual Calendar REST API itself.

To get the events for the upcoming day, I use the Calendar service to retrieve the events using time bounds and a setting to expand recurring events into single instances:

The getEventsForToday() function returns a list of Event resources. I then just need to scan each event for the title, URL, and date and build the resulting list:


You’ll probably notice that I wrote my own convertDate function to generate a Date. Why? Because Apps Script returns dates as ISO Date strings. In ECMAScript 5, you can pass ISO Date strings directly to the Date() constructor to create a date, but as of this writing the version that Apps Script appears to be using doesn’t support this, so I made my own:


Next, I need to find events that I have not RSVP’d to. I use the Calendar service to retrieve events for the current day, then scan each one to find the attendee record that corresponds to my email address. If found, I check the responseStatus feed, which will be "needsAction" if I haven't responded to it:

Getting the News

There are two parts to this portion of the script - one is getting the news headlines and generating the categorized results, the other is extracting the headlines that are important and elevating them to the Top Stories section. To do this, I use the UrlFetchApp service to retrieve the data content, then the XmlService to parse each feed.

The code uses UrlFetchApp to read each feed and XmlService to parse the result into a document tree that I can use to extract each item’s title and link. Right now, the script only parses RSS 2.0 feeds, but since practically everyone supports that format, it’s what I decided to code to. Adding support for ATOM would be easy enough as well.

Now, by itself, this would result is a pretty large set of headlines, and I don’t want to have to scan each one to see if it is important. Instead, I have the script look at each headline for a set of keywords that I’ve defined. If a headline contains a keyword, then that headline is put into a special array. If the headline didn’t contain a keyword, it gets added to the feed’s category normally.

Sending the Email

Now all that’s left to do is send the email. The GmailApp service is used for this, and it provides the ability to send an email with HTML content. Let's revisit the last part of the deliverNews function:

At this point in the script, the string variables calEventsStr, topStoriesStr, and feedsStr contain the HTML code for each of the sections. All that’s left to do is put them together into one result and send that content via the GmailApp.

Automating the Script

Now I have a script that can be manually run whenever I want to generate the email, but that’s obviously not ideal. What I really want is a script that runs for me on a pre-set schedule. Google Apps Script makes this possible by enabling what are called “Triggers”.

I want my email delivered to me every morning before I wake up. To do so, under the Resources menu, select “Project Triggers”. This will bring up a dialog that lets me set a timed trigger:


[click image for a larger view]

In this case, I have my Trigger set to run daily between 6 and 7 in the morning. That’s really all there is to it - you just choose the function you want executed and when you want it run. Apps Script does the rest!

The full script code is available here if you want to download it and modify it for your own needs.

5 comments:

  1. I get a DNS-error : "var feedSrc = UrlFetchApp.fetch(feedUrl).getContentText();"
    this is the message:
    http://undefined. What does it mean?

    ReplyDelete
    Replies
    1. Did you modify the contents of the feeds array? It probably means that you're not passing in a valid URL to the fetch() function. Try stepping through the code in the debugger.

      Delete
  2. Is there a way to only show the new feed items, particularly if the items don't have pubdate

    ReplyDelete
    Replies
    1. I suppose you could add each link to a DB and check to see if the link was already listed before adding it to the result, and then clean the DB each week to prevent runaway growth. That would eliminate the need to rely on pubDate.

      Delete
  3. Great script, if i want to cc n bcc colleagues or add a few more email jnto the maoling list, how can i do so in the script?

    thanks

    ReplyDelete