Hacker Public Radio

Your ideas, projects, opinions - podcasted.

New episodes every weekday Monday through Friday.

hpr3082 :: RFC 5005 Part 1 – Paged and archived feeds? Who cares?

An interview with two passionate RFC 5005 fans on how to handle big Atom feeds

Hosted by clacke on Tuesday, 2020-05-26. Flagged as Clean and released under a CC-BY-SA license.
Tags: rss, atom, rfc, interview, feedreader, podcatcher.

Duration: 00:35:08

This conversation took almost an hour, so I split it into two shows:

  • Part 1 talks mostly about the RFC itself, what it means and why.
  • Part 2 goes into personal experiences with the RFC and with syndication in general, in particular in the context of web comics.

This is part 1.

The why

When serving most RSS/Atom feed readers today, you have to choose: Do you make a complete feed with all the things you ever published, or do you make a shorter feed with just the latest entries?

This is a trade-off with pros and cons, and it seems like one you have to make. But a solution that lets your Atom feed have its cake and eat it too already existed 13 years ago, if only our feed readers would adhere to it: RFC 5005, Feed Paging and Archiving.

The what

RFC 5005 (https://tools.ietf.org/html/rfc5005) was published in September 2007.

  • The XML namespace for RFC 5005 elements is http://purl.org/syndication/history/1.0, aliased as fh below.
  • Section 2 defines the complete feed: It is one document (Atom file) that contains the entire set the feed describes. The document is marked with an fh:complete element.
  • Section 3 defines the paged feed: It is a series of documents connected with Atom link elements with rel set to the link relations first, last, previous or next.
  • Section 4 defines the archived feed: It has a subscription document that may change at any time, and a series of archive documents that are expected to have stable contents and URIs. The link relations defined are current, prev-archive and next-archive. The semantics are clearer than for paged feeds: prev-archive points to the immediately preceding archive document, which holds earlier entries, and because the contents are stable you can stop when you see a URI to a document you already have. Archive documents are marked with the fh:archive element.
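
To make the Section 4 model concrete, here is a minimal sketch of how a reader could walk an archived feed: start at the subscription document, follow prev-archive links, and stop at the first archive it has already processed. The example.org URIs, the DOCS dictionary and the names link_href, collect_entries, fetch and seen_archives are all made up for illustration. The sketch also assumes that link hrefs are absolute URIs and that entries are identified by their atom:id.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
FH = "http://purl.org/syndication/history/1.0"

# Hypothetical documents standing in for what a server would return.
DOCS = {
    "http://example.org/feed": f"""<feed xmlns="{ATOM}">
  <link rel="self" href="http://example.org/feed"/>
  <link rel="prev-archive" href="http://example.org/2020/04"/>
  <entry><id>urn:example:3</id></entry>
</feed>""",
    "http://example.org/2020/04": f"""<feed xmlns="{ATOM}" xmlns:fh="{FH}">
  <fh:archive/>
  <link rel="current" href="http://example.org/feed"/>
  <link rel="prev-archive" href="http://example.org/2020/03"/>
  <entry><id>urn:example:3</id></entry>
  <entry><id>urn:example:2</id></entry>
</feed>""",
    "http://example.org/2020/03": f"""<feed xmlns="{ATOM}" xmlns:fh="{FH}">
  <fh:archive/>
  <link rel="current" href="http://example.org/feed"/>
  <link rel="next-archive" href="http://example.org/2020/04"/>
  <entry><id>urn:example:1</id></entry>
</feed>""",
}


def link_href(feed, rel):
    """Return the href of the first feed-level atom:link with this rel, or None."""
    for link in feed.findall(f"{{{ATOM}}}link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def collect_entries(subscription_uri, fetch, seen_archives=frozenset()):
    """Walk from the subscription document back through prev-archive links.

    fetch is a hypothetical callable that maps a URI to the document text.
    seen_archives holds archive URIs that an earlier run already processed.
    Returns (entry ids in the order encountered, set of archive URIs visited).
    """
    entry_ids = []
    visited = []  # also guards against link loops
    uri = subscription_uri
    while uri is not None and uri not in seen_archives and uri not in visited:
        feed = ET.fromstring(fetch(uri))
        visited.append(uri)
        for entry in feed.findall(f"{{{ATOM}}}entry"):
            entry_id = entry.findtext(f"{{{ATOM}}}id")
            if entry_id not in entry_ids:  # the same entry may appear in two documents
                entry_ids.append(entry_id)
        uri = link_href(feed, "prev-archive")
    # The first visited URI is the subscription document, not an archive.
    return entry_ids, set(visited[1:])


if __name__ == "__main__":
    ids, archives = collect_entries("http://example.org/feed", DOCS.__getitem__)
    print(ids)
    print(sorted(archives))
```

On a later run, passing the returned archive set back in as seen_archives makes the walk stop right after the subscription document. A real reader would also need to handle relative hrefs, network errors and updated entries, all of which this sketch leaves out.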

The who

In this show I’m talking to:

fluffy

Jamey

Conversation notes

  • Google Reader was terminated on 2013-07-01, and all subscription data was permanently gone on 2013-07-15:
    https://www.google.com/reader/about/
  • Mastodon had Atom feeds with paging, but the feeds went away when OStatus went away:
    https://github.com/tootsuite/mastodon/pull/11247
  • HTML4 does indeed define the HTML link relations:
    https://www.w3.org/TR/html4/types.html#h-6.12
    It has prev rather than the previous of RFC 5005, but mentions that some browsers support previous as an alias.
  • HTML5 also defines the HTML link relations:
    https://html.spec.whatwg.org/multipage/links.html
    Here, treating previous as a synonym for prev is a lower-case "must", kept for historical reasons.
  • IANA manages the Registry of Link Relations:
    https://www.iana.org/assignments/link-relations/link-relations.xhtml
    It references RFC 5005 for the Section 4 relations, but not the Section 3 ones.
  • RFC 5005 singles out its own Section 3 (Paged Feeds) as the best-effort, loose, discouraged model.
    • Section 3:
      Therefore, clients SHOULD NOT present paged feeds as coherent or complete, or make assumptions to that effect.
    • Section 4:
      Unlike paged feeds, archived feeds enable clients to do this without losing entries.
  • I’m confused about it in the show, but the RFC is clear that an archived feed has one dynamic subscription document, which points to a chain of immutable archive documents.
  • Back in 2002, Aaron Swartz published his joke MIME-header-based RSS 3:
    https://www.aaronsw.com/weblog/000574
    The cultural context at the time and the rivalry between RSS 0.91+, RSS 1.0, RSS 2.0 and Atom deserve a show of their own.
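
Going back to the Section 3 versus Section 4 distinction in the notes above, a reader first has to work out which kind of document it is looking at. The following is a rough heuristic sketch, not something the RFC prescribes: the function name describe_feed and the returned labels are made up for illustration, and a subscription document that has no archives yet will simply look like a plain feed.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
FH = "http://purl.org/syndication/history/1.0"

# Link relations from RFC 5005 Section 3 (paged feeds).
PAGING_RELS = {"first", "last", "previous", "next"}


def describe_feed(xml_text):
    """Guess which RFC 5005 model a single Atom document belongs to."""
    feed = ET.fromstring(xml_text)
    # Collect the rel values of the feed-level links only, not entry links.
    rels = {link.get("rel") for link in feed.findall(f"{{{ATOM}}}link")}
    if feed.find(f"{{{FH}}}complete") is not None:
        return "complete feed"
    if feed.find(f"{{{FH}}}archive") is not None:
        return "archive document"
    if "prev-archive" in rels:
        return "subscription document of an archived feed"
    if rels & PAGING_RELS:
        return "page of a paged feed"
    return "plain feed"


if __name__ == "__main__":
    sample = f"""<feed xmlns="{ATOM}">
  <link rel="next" href="http://example.org/feed?page=2"/>
</feed>"""
    print(describe_feed(sample))
```

The checks run from the strongest signal to the weakest: the fh:complete and fh:archive markers are explicit, while the link relations only hint at the model in use.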

Comments

Comment #1 posted on 2020-06-02 00:52:51 by clacke

Atom "tombstones" RFC

fluffy mentioned Atom "tombstones", defined in 'The Atom "deleted-entry" Element', https://tools.ietf.org/html/rfc6721
