Relaxed RSS with sfeed
While the golden age of huge platforms offering RSS feeds may be over, lots of webpages still have feeds, especially niche blogs. Over the years, I have tried many RSS readers, but have always given up because they did not work for me. However, by switching to the more obscure sfeed, I found a workflow that worked (and flowed) for me.
Why RSS and what for?
But before getting into it, I think I need to clarify what I use RSS for. And just to say it once, this post does not distinguish between RSS and Atom, as the difference is a technicality. To stay on that technical level just for a moment, RSS is only a file format to describe feeds. Major news outlets use it to announce multiple news items per hour, some social media sites allow each post to be read via RSS, code forges publish releases or even commits via RSS, and it is the secret ingredient in the podcasting sauce.
The bottom line is that RSS may be out of sight, but it is still everywhere. But if one is going to subscribe to everything, one might end up with hundreds or thousands of posts in no time. For me at least, this would be an information overload, resulting in me not reading anything at all. Besides, for daily news, I just go to one or several of the big news sites anyway.
So I mainly subscribe to smaller private blogs or niche news sites with a few articles per month, supplemented by a careful selection of meta-aggregators and mailing list archives, both further filtered. At the moment, there must be roughly 130 feeds. This number is growing slowly but steadily, as I find something new more often than I decide to drop a source. Depending on the day of the week, this results in about twenty posts each day.
User Experience Matters
In my experience, most RSS readers follow an inbox architecture that expects me to interact with each new post (or to give up and mark them all as read). So even after pre-filtering, the inbox would fill up quickly. While this inbox concept works well for things which are actually important, like the one in a hundred email you need to respond to, it makes reading feeds uncomfortably stressful, like working against an ever-growing to-do list. This is an information overload and a FOMO scenario all over again.
Using sfeed and some degree of customization, I was able to build myself a private news feed. This feed is generated daily and is then completely static: there is no unread counter, and if I skip some days, I can read back later, but no software bullies me into it.
In particular, there are two (or three) ways I consume my feeds. First, there is a custom web feed that shows the last 256 entries, grouped and ordered by date - called the bytefeed. This has become one of the first pages I open in the morning, usually before checking to see if anything of world importance has happened. From there, I can follow each aggregated feed to my archive, listing all known feeds, also as a webpage.
Furthermore, the previous day’s posts are also sent to a private IRC channel. This may seem a bit quirky, but personally I am an excessive IRC user, not only for communication, but also as a message broker. Using my favorite IRC client WeeChat, I am able to receive both monitoring and news events in WeeChat itself and on my smartphone.
00:31 -- Notice(xdf) -> #feeds: HTB: Editorial https://0xdf.gitlab.io/2024/10/19/htb-editorial.html
00:31 -- Notice(analognow) -> #feeds: unpatchable fourth wall breaking sentience https://analognowhere.com/_/ghtmnt
00:31 -- Notice(analognow) -> #feeds: illegal book fair https://analognowhere.com/_/rnituc
00:31 -- Notice(dragasitn) -> #feeds: Outdated Infrastructure and the Cloud Illusion https://it-notes.dragas.net/2024/10/19/outdated-infrastructure-and-the-cloud-illusion/
00:31 -- Notice(fsfeplane) -> #feeds: TSDgeos' blog: KDE Gear 24.12 release schedule https://tsdgeos.blogspot.com/2024/10/kde-gear-2412-release-schedule.html
00:31 -- Notice(grumpyweb) -> #feeds: nikitonsky is being grumpy https://grumpy.website/1582
[...]
As initially stated, the user experience of a tool matters a lot. While this particular UX may not be appealing to some (or most), it does a pretty decent job for me: getting daily news digests without having to work through them or actively interact with an “app”.
sfeed?
So far I have only talked about my fear of software and my obsessions with workflows. But what exactly is sfeed?
In a nutshell, sfeed is a collection of small tools to convert RSS feeds into a TAB-separated values (TSV) file and then to present these TSVs in another human- or machine-friendly way. These tools are written either in C or as portable POSIX shell scripts, so they can be used on most operating systems. Everything comes with a well-written man page and an exhaustive README.
While this may sound boring at first, sfeed’s simple architecture makes it easy to build a custom RSS reader. Representing feeds as TSV instead of some weird XML allows writing filters or further output generators in almost any scripting language, like awk.
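As a taste of what that buys you, a throwaway one-liner is often all it takes. The following, for example, counts the stored entries per feed file; it assumes the ~/.sfeed/feeds directory that will show up later in this post.
awk -F '\t' '{ n[FILENAME]++ } END { for (f in n) print n[f], f }' ~/.sfeed/feeds/*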
While sfeed is not restricted to servers, I configured a pipeline like the following to be run nightly via cron on a small VM:
- Call sfeed_update to refresh all configured RSS feeds.
- Create two HTML files served by httpd: the feed archive via sfeed_html and the bytefeed shown above via a custom sfeed_bytefeed script.
- Send the day’s posts to IRC via my custom sfeed_irc script.
All of this is triggered from one small script, configured as a cron job. Afterwards, everything is static until the next iteration starts.
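To make this less hand-wavy before diving into the details, a minimal sketch of such a glue script could look like the following. The file names and paths are just examples, and the real sfeed_run presented later does considerably more checking.
#!/bin/sh
# Minimal sketch of a nightly pipeline, assuming the sfeed-contrib scripts are in PATH.
# The real sfeed_run does more preflight checks; paths and file names are examples.
set -eu

wwwroot="/var/www/htdocs/rss.example.internal"

# 1. Refresh all feeds configured in ~/.sfeed/sfeedrc.
sfeed_update

# 2. Render the static HTML pages: the full archive and the bytefeed.
sfeed_html "$HOME/.sfeed/feeds/"* > "$wwwroot/index.html"
sfeed_bytefeed "$HOME/.sfeed/feeds/"* > "$wwwroot/bytefeed.html"

# 3. Announce the day's posts on IRC.
sfeed_irc "$HOME/.sfeed/feeds/"*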
How the sfeed sausage is made
To get started, the essential sfeed tools are required. Several package managers provide an sfeed package. Otherwise, compiling sfeed yourself should be a walk in the park due to its minimal dependencies.
In an attempt to make it all less abstract, I will add commands usable on OpenBSD. A certain degree of cognitive flexibility is assumed.
user@openbsd:~> doas pkg_add sfeed
While most of the sfeed tools come with a pledge(2) promise by default, sanity and reason recommend creating a custom user.
user@openbsd:~> doas useradd -g =uid -m -s /sbin/nologin _rss
After switching to this unprivileged _rss user, building on the sfeed example configuration is a good starting point. One can find an example either installed by the package manager or upstream.
_rss@openbsd:~> mkdir ~/.sfeed
_rss@openbsd:~> cp \
/usr/local/share/examples/sfeed/sfeedrc.example \
~/.sfeed/sfeedrc
A quick look at the .sfeed/sfeedrc file along with skimming sfeedrc(5) should explain the basics. In a nutshell, the feeds function contains multiple feed function calls, representing the remote RSS feeds to be fetched. How the fetching is done can be overridden by a custom fetch function. Furthermore, a custom filter function allows manipulating the fetched data, but more on that later. To demonstrate this, let’s start with this very minimal .sfeed/sfeedrc and fetch the feeds via sfeed_update(1).
_rss@openbsd:~> cat .sfeed/sfeedrc
feeds() {
feed "undeadly" "https://undeadly.org/cgi?action=rss"
feed "xkcd" "https://xkcd.com/rss.xml"
}
_rss@openbsd:~> sfeed_update
[20:25:48] xkcd OK
[20:25:48] undeadly OK
Each feed is stored in its own file, named after the feed, within ~/.sfeed/feeds. So you might end up with something like this. As mentioned above, sfeed works on TSV, and these files are already in the TAB-separated format described in sfeed(5). For example, one can extract the title of each XKCD post.
_rss@openbsd:~> ls -l .sfeed/feeds/
-rw-r--r-- 1 _rss _rss 13727 Oct 27 20:25 undeadly
-rw-r--r-- 1 _rss _rss 1740 Oct 27 20:25 xkcd
_rss@openbsd:~> awk -F '\t' '{ print $2 }' .sfeed/feeds/xkcd
Sandwich Helix
RNAWorld
Temperature Scales
Experimental Astrophysics
But in most cases there is no need to manually inspect a feed file. There is an sfeed tool for that! sfeed_plain(1) gives a nice terminal listing, while sfeed_html(1) creates rendered HTML output. This works for both single and multiple feeds.
_rss@openbsd:~> sfeed_plain .sfeed/feeds/xkcd
2024-10-25 06:00 xkcd Sandwich Helix https://xkcd.com/3003/
2024-10-23 06:00 xkcd RNAWorld https://xkcd.com/3002/
2024-10-21 06:00 xkcd Temperature Scales https://xkcd.com/3001/
2024-10-18 06:00 xkcd Experimental Astrophysics https://xkcd.com/3000/
_rss@openbsd:~> sfeed_html .sfeed/feeds/*
<!DOCTYPE HTML>
<html>
[. . .]
Custom Scripts
While sfeed comes with a multitude of tools to build your RSS reader with, it is also a very hackable ecosystem, as both mentioned and demonstrated. To build my RSS reader, the following tools have emerged over time. Some of them were C programs derived from sfeed_plain, but to celebrate this year’s awktober, they were rewritten in awk. For this blog post, I have cleaned up my local clone of the sfeed repository and moved them to sfeed-contrib. There are two types of scripts in this repo: custom sfeed formatters like sfeed_bytefeed, and helper scripts, especially for automation.
_rss@openbsd:~> git clone https://codeberg.org/oxzi/sfeed-contrib.git
_rss@openbsd:~> ls -l sfeed-contrib/
total 32
drwxr-xr-x 2 _rss _rss 512 Oct 26 22:35 LICENSES
-rw-r--r-- 1 _rss _rss 1995 Oct 26 22:35 README.md
-rwxr-xr-x 1 _rss _rss 1858 Oct 26 22:35 sfeed_bytefeed
-rwxr-xr-x 1 _rss _rss 307 Oct 26 22:35 sfeed_edit
-rwxr-xr-x 1 _rss _rss 1405 Oct 26 22:35 sfeed_irc
-rwxr-xr-x 1 _rss _rss 1104 Oct 26 22:35 sfeed_run
-rwxr-xr-x 1 _rss _rss 192 Oct 26 22:35 sfeed_test
-rw-r--r-- 1 _rss _rss 1169 Oct 26 22:35 style.css
Both sfeed_bytefeed and sfeed_irc have already been roughly explained above, but I am going to repeat myself. They take sfeed feed files as parameters and either create an HTML feed of the last 256 entries or post the day’s posts to an IRC channel, respectively. The style.css file is a customization of the upstream stylesheet to support both sfeed_html and sfeed_bytefeed.

The entire workflow of updating feeds and utilizing formatters was glued together in sfeed_run. This script does exactly that, after some preflight checks for the configuration expected in the .sfeed/sfeedrc. In order to share this script (command listing would be the better word) with the world, some potentially sensitive values have been moved elsewhere. Thus, sfeed_run expects sfeedpath to be set according to sfeedrc(5), sfeedwwwroot to point to a directory where the _rss user has write access to dump the HTML files and a gzipped version, and sfeedirchost and sfeedircport to point to an open IRC server.
_rss@openbsd:~> head -n 4 .sfeed/sfeedrc
sfeedpath="$HOME/.sfeed/feeds"
sfeedwwwroot="/var/www/htdocs/rss.example.internal"
sfeedirchost="irc.example.internal"
sfeedircport="6667"
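To demystify the IRC delivery a bit: conceptually, it is nothing more than a handful of raw IRC protocol lines piped into nc(1). The following is a purely illustrative sketch of the idea - not the actual sfeed_irc - reusing the host, port, and #feeds channel from above.
#!/bin/sh
# Illustrative sketch only; the real sfeed_irc handles errors and timing more carefully.
# Field 1 of the sfeed TSV is the UNIX timestamp, field 2 the title, field 3 the link.
since=$(( $(date +%s) - 86400 ))

{
    printf 'NICK sfeedbot\r\nUSER sfeedbot 0 * :sfeed\r\nJOIN #feeds\r\n'
    sleep 5
    awk -F '\t' -v since="$since" \
        '$1 >= since { printf "NOTICE #feeds :%s %s\r\n", $2, $3 }' \
        "$HOME/.sfeed/feeds/"*
    sleep 2
    printf 'QUIT\r\n'
} | nc irc.example.internal 6667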
Since sfeed_run executes the whole pipeline, it is run as a nightly cron job. As some feeds may fail, the output - sfeed_run only prints errors - is sent to another user, who may log in from time to time.
_rss@openbsd:~> crontab -l
MAILTO=user
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin
30 0 * * * /home/_rss/sfeed-contrib/sfeed_run
Configuring sfeed
While a .sfeed/sfeedrc file may look as short as the example posted above, it can grow over time. However, the most common changes are updated feed entries in the feeds function. To verify that a URL to some feed works, I built the little sfeed_test helper script. It sources the .sfeed/sfeedrc file and uses the fetch function to retrieve the remote content. For information on fetch, consult sfeedrc(5). In the example above, no such function was defined, so curl is used by default. In my setup, I use OpenBSD’s ftp(1) command, which is also present in the example configuration of the OpenBSD port.
_rss@openbsd:~> grep -A 2 '^fetch' .sfeed/sfeedrc
fetch() {
ftp -M -V -w 15 -o - "$2"
}
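For systems without OpenBSD’s ftp(1), a curl-based fetch along these lines should do the same job; the exact flags are my own guess, and the upstream sfeedrc.example ships a canonical variant.
fetch() {
    # -f: fail on HTTP errors, -s: silent, -L: follow redirects, -m 15: 15 second timeout
    curl -f -s -L -m 15 -o - "$2"
}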
This allows a simple test like the following, where the latest posts will be shown, formatted with sfeed_plain(1).
_rss@openbsd:~> ./sfeed-contrib/sfeed_test 'https://lobste.rs/t/openbsd.rss'
2024-10-13 11:14 OpenBSD is Hard to Show Off https://atthis.link/blog/2024/16379.html
2024-10-07 22:35 OpenBSD 7.6 https://www.openbsd.org/76.html
2024-10-04 15:20 I Solve Problems https://it-notes.dragas.net/2024/10/03/i-solve-problems-eurobsdcon/
[. . .]
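There is not much magic to it. A stripped-down version of such a tester could look roughly like this - a sketch, not the actual sfeed_test - assuming a fetch function like the ftp(1) one above is defined in the sfeedrc.
#!/bin/sh
# Sketch: fetch one feed URL with the configured fetch(), convert it, pretty-print it.
. "$HOME/.sfeed/sfeedrc"
fetch "test" "$1" | sfeed | sfeed_plain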
Thus, after verifying that the URL to a new feed actually works, I use sfeed_edit to edit the .sfeed/sfeedrc file. Again, this script is mostly a wrapper around opening the configuration file with vim. This script’s magic is keeping track of configuration file changes via rcs(1) - yes, the Revision Control System, that single-file CVS thingy. If changes are detected, one must commit them and sfeed_run is executed; otherwise, nothing happens. The rcs(1) part is mostly there because I personally like to put my configurations into some version control system. Doing so gives me backups, at least to some extent, and a wonderful history via rlog(1).
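For those unfamiliar with rcs(1), the housekeeping that sfeed_edit automates boils down to roughly the following commands; the script merely wires them up with vim and sfeed_run.
_rss@openbsd:~> cd ~/.sfeed
_rss@openbsd:~/.sfeed> rcsdiff -u sfeedrc               # any uncommitted changes?
_rss@openbsd:~/.sfeed> ci -l -m'add new feed' sfeedrc   # check in, keep the working copy
_rss@openbsd:~/.sfeed> rlog sfeedrc                     # browse the revision history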
Filtering and Transforming
As mentioned initially, I prefer to filter some feeds. Fortunately, sfeed supports this via sfeedrc(5)’s filter function, which is described as follows.
filter(name, url)
Filter sfeed(5) data from stdin and write it to stdout, its
arguments are:
name Feed name.
url URL of the feed.
Thus, this function is called for every feed: it receives the incoming TSV data on stdin, and processing continues with the TSV data it writes back to stdout. This architecture allows stream-based filtering, even chaining multiple filters. However, since the output is in the domain of this function, it also allows rewriting of feeds, e.g., to enrich titles.
Filters that proved helpful, at least for me, include removing replies on mixed announcement and discussion mailing lists so that only the original posts are shown. When subscribing to meta-aggregators, it was useful to prefix the title with the origin, based on the URL. In the past, I have also subscribed to a YouTube channel where I was only interested in certain videos, allowing the others to be dropped based on their titles.
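The YouTube case, for instance, boils down to a single awk pattern: keep only entries whose title contains a keyword. A hypothetical stand-alone test - both the keyword and the feed file name are made up - might look like this.
awk -F '\t' 'tolower($2) ~ /openbsd/' ~/.sfeed/feeds/some-youtube-channel | sfeed_plain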
Since these are just some ideas of what is possible, maybe posting a stripped-down version of my current filter will make it more concrete.
# filter(name, url)
filter() {
    case "$1" in
    "freifunk community news")
        # Prefix title with domain
        awk '
            BEGIN { FS=OFS="\t" }
            match($3, /:\/\/[a-z0-9.-]+\//) {
                $2 = "[" substr($3, RSTART+3, RLENGTH-4) "] " $2
                print $0
            }
        ' ;;
    "oss-sec")
        # Drop replies; first posts only.
        awk -F '\t' '$2 !~ /^Re: /' ;;
    *)
        cat ;;
    esac |\
    # Use URL as title if title is missing.
    awk 'BEGIN { FS=OFS="\t" } { sub(/^$/, $3, $2); print $0 }' |\
    # Prefix YouTube links with [VIDEO].
    awk 'BEGIN { FS=OFS="\t" } $3 ~ /https:\/\/www.youtube/ { $2 = "[VIDEO] " $2 } //'
}
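Such a filter does not have to be exercised through a full sfeed_update run. Since it is just a shell function reading TSV on stdin, it can be poked at directly, assuming a feed file matching one of the cases above already exists.
_rss@openbsd:~> . ~/.sfeed/sfeedrc
_rss@openbsd:~> filter "oss-sec" < ~/.sfeed/feeds/oss-sec | sfeed_plain | head -n 3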
But why?
Right now, this blog post clocks in at over 2k words. That is a lot of text to describe how to use an RSS reader. Why even bother?
As I said at the beginning, good software should not get in your way. Not only should it work, but it should work for you, without forcing you to bend to its design. This is not limited to RSS readers, of course.
However, in the RSS domain, sfeed has done the trick for me. So I just wanted to say “thank you” for this wonderful piece of software. I also wanted to showcase how to get started with it and how easy it is to extend it. If the normal RSS reader workflow does not work for you, I would encourage you to give sfeed a shot.