Madison Smith is an intern at PubNub who’ll be cooking up awesome demos and projects all summer, including this realtime RSS feed with Python and JavaScript.
Social news websites have changed the way we find, read, and share content. However, these websites are static, and fresh content is not presented as it happens. I set out to change this, and I picked Hacker News as my news source. By leveraging the power of PubNub’s global data stream network and scraping a little RSS, nobody will ever miss a new Hacker News article again.
I built a news and social content feed that updates automatically in realtime as new content is posted to Hacker News. The same approach can be applied to pretty much any social news website (all you need is an RSS feed). This otherwise static content is now pushed to the browser in realtime.
If you want to see it working live, check out the Hacker News Feed Demo. It uses the PubNub JavaScript SDK to display updates to the Hacker News feed. To see it in action locally, clone the source from GitHub and run the Python scraper from the command line.
RSS Scraping
You’ll first need to sign up for a PubNub account. Once you sign up, you can get your unique PubNub keys in the PubNub Developer Portal.
The first task is to grab the RSS feed from Hacker News. There are plenty of ways to do this, and you can quickly write your own RSS scraper if you want, but I decided to use Python and feedparser. After a quick “pip install feedparser”, we can parse the RSS feed.
With no customization, you’ll get every field of every post in the feed. However, I decided the most interesting information was the rank of the post, the title of the post, the link to the article, and the comments link.
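As a rough sketch of that filtering step: the code below assumes the feed lives at https://news.ycombinator.com/rss, that each entry exposes title, link, and comments fields, and that a post’s rank is simply its position in the feed. The function names are illustrative and not taken from the original scraper.

```python
# Assumed feed URL; swap in the RSS feed of whichever site you want to follow.
HN_RSS_URL = "https://news.ycombinator.com/rss"


def extract_posts(entries):
    """Keep only rank, title, link, and comments link for each feed entry.

    `entries` is any sequence of dict-like objects, such as the `entries`
    list returned by feedparser. Rank is assumed to be the 1-based position
    of the entry in the feed.
    """
    posts = []
    for rank, entry in enumerate(entries, start=1):
        posts.append({
            "rank": rank,
            "title": entry.get("title", ""),
            "link": entry.get("link", ""),
            "comments": entry.get("comments", ""),
        })
    return posts


def fetch_posts(url=HN_RSS_URL):
    """Download and parse the feed, then reduce it to the fields we care about."""
    # Third-party dependency (pip install feedparser), imported here so that
    # extract_posts can be used and tested without it.
    import feedparser

    return extract_posts(feedparser.parse(url).entries)
```

Calling fetch_posts() returns a list of small dictionaries, one per post, ready to be compared against the previous poll or sent on as messages.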
Python Command Line
The scraper uses Python’s argparse module, which provides robust command-line options with very little code.
You can run “python hn.py --help” to see descriptions of all the options from the command line. The options let you specify how often to poll Hacker News for changes, and whether to get the entire page after every change to the site or just the new posts that appear. For instance, you could poll every five seconds and fetch the entire page each time; the help output lists the exact flags for both settings.
Argparse also supplies defaults, so to use them simply run “python hn.py” with no options.
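The original commands are not reproduced here, so the following is only a sketch of how such options might be declared with argparse and used to decide what to send after each poll. The flag names (--interval, --mode), their defaults, and the select_updates helper are all hypothetical; check “python hn.py --help” for the real ones.

```python
import argparse


def build_parser():
    """Declare the two options described above (flag names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Poll Hacker News and report changes.")
    parser.add_argument(
        "--interval", type=float, default=30.0,
        help="seconds to wait between polls (assumed default: 30)")
    parser.add_argument(
        "--mode", choices=["page", "new"], default="new",
        help="'page' sends the whole page after any change, "
             "'new' sends only posts not seen in the previous poll")
    return parser


def select_updates(previous, current, mode):
    """Return the posts to send, given the previous and current poll results.

    Posts are dicts with at least a "link" key, which is assumed to
    identify a post uniquely.
    """
    if mode == "page":
        # Send everything, but only when something actually changed.
        return current if current != previous else []
    seen = {post["link"] for post in previous}
    return [post for post in current if post["link"] not in seen]
```

With these assumed flags, build_parser().parse_args(["--interval", "5", "--mode", "page"]) would describe a run that polls every five seconds and sends the entire page on each change.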
Go Global
Now that we have the information that is important to us, and know how to run the scraper locally, it’s time to send it global. PubNub provides an incredibly simple API for publishing messages. Quickly “pip install Pubnub” and we can publish our information from Hacker News.
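A minimal publishing sketch follows. It assumes a recent PubNub Python SDK (version 4 or later, whose builder-style API differs from older releases), the public “demo” keys, and posts represented as plain dictionaries. The helper names and the client identifier are illustrative.

```python
def make_client(publish_key="demo", subscribe_key="demo", client_id="hn-scraper"):
    """Build a PubNub client (assumes the 4.x-or-later Python SDK)."""
    # Imported lazily so publish_posts can be exercised without the SDK installed.
    from pubnub.pnconfiguration import PNConfiguration
    from pubnub.pubnub import PubNub

    config = PNConfiguration()
    config.publish_key = publish_key      # "demo" keys are for experiments only
    config.subscribe_key = subscribe_key
    config.uuid = client_id               # illustrative identifier for this publisher
    return PubNub(config)


def publish_posts(client, posts, channel="hacker-news"):
    """Publish each post as its own message and return how many were sent."""
    for post in posts:
        # sync() blocks until PubNub acknowledges the publish.
        client.publish().channel(channel).message(post).sync()
    return len(posts)
```

Passing the client in as an argument keeps the publishing step independent of how the keys are configured, so swapping the demo keys for your own only touches make_client.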
Now it’s up to you. PubNub offers over 50 different SDKs, so take your pick. To consume the information, simply subscribe to the channel (in our case “hacker-news”) and you’re off. There are publicly available publish and subscribe keys to use for demos.
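The live demo consumes the channel from JavaScript; for consistency with the scraper, here is a hedged Python sketch of the same idea. It assumes the 4.x-or-later Python SDK, an already configured PubNub instance passed in as client, and messages shaped as dictionaries with rank, title, and link keys. The function names are illustrative.

```python
def format_post(post):
    """Render one post dictionary as a single display line."""
    return "#{} {} ({})".format(
        post.get("rank", "?"), post.get("title", ""), post.get("link", ""))


def listen(client, channel="hacker-news"):
    """Print every post published to the channel (assumes SDK 4.x or later)."""
    # Imported lazily so format_post can be used without the SDK installed.
    from pubnub.callbacks import SubscribeCallback

    class PostListener(SubscribeCallback):
        def status(self, pubnub, status):
            pass  # connection status events are ignored in this sketch

        def presence(self, pubnub, presence):
            pass  # presence events are not used here

        def message(self, pubnub, message):
            # message.message holds the payload that was published.
            print(format_post(message.message))

    client.add_listener(PostListener())
    client.subscribe().channels([channel]).execute()
```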
And the final result should look something like this: