HCF EP 005: Cursor based pagination

2024.12.16

Jan 04, 2025

This is Episode 5 of HandCraftedForum.

In these first few episodes we're doing an introduction to our homegrown mini framework and data storage layer that we'll be using throughout the project.

In the previous episode, we wrote the backend code for creating users and posts and indexing posts by hashtag, but we did not implement the UI.

In between the last episode and this episode I implemented the basic UI and fixed a minor bug along the way.

There isn't anything special in the UI code that we haven't previously covered, so I didn't think it was worth explaining in detail. The code will be attached at the end of the article as usual, so make sure to check it out. Here's a quick demo of the UI.

Creating a user account
Creating a post on behalf of the user
Finding posts by hashtags
Finding posts by users

One small change I made to the backend code is to read entries off the index in reverse. We want to see the newest posts first, but the index sorts by creation time in ascending order.

To make this happen, I had to update VBolt to support iterating the index in reverse order.

When the client creates a new post, it just pushes it to the top of the list:

The topic of this episode was not planned, but I think it's a neat little thing that falls out of using B-Trees for indexing and being able to make direct use of their properties. It would probably be quite painful to mimic using higher-level concepts like relational tables.

In the previous episode I did a quick overview of the concept behind the "Index" storage and explained that it stores tuples in a list that gets sorted by the underlying B-Tree.

Here's one way to visualize how the sorted term -> target mapping is stored in the B-Tree. It's basically a list of `[]byte` keys, where each key is internally composed of a three-tuple: (term, priority, target). The fields are arranged such that when the B-Tree sorts these keys in byte order, the result is equivalent to sorting the list of tuples.

To find targets for term 'A', we iterate the B-Tree on keys that start with "A" by seeking to the first key that has the prefix (A ...), then iterating one by one until we find an entry that does not have that prefix.

We can stop the iteration after N steps, and ask the B-Tree to return the full byte representation of the key it stopped at.

This will serve as a "cursor". We can encode it to base64 (or any ASCII representation, really) and use it later to continue the iteration where it stopped.
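
The seek, iterate, stop, and resume steps can be sketched roughly as follows. A sorted in-memory slice stands in for the B-Tree, keys are in ascending order for simplicity, and `scanPrefix` is a hypothetical function rather than anything from VBolt. Here the cursor is the first key that was not returned.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"sort"
)

// scanPrefix reads up to limit keys that start with prefix from a sorted
// slice, which stands in for the B-Tree here. If cursor is non-empty the scan
// resumes there instead of at the start of the prefix range. It returns the
// page plus the key where the scan stopped (nil when the range is exhausted).
func scanPrefix(sorted [][]byte, prefix, cursor []byte, limit int) (page [][]byte, next []byte) {
	start := prefix
	if len(cursor) > 0 {
		start = cursor
	}
	// "Seek": binary search for the first key >= start.
	i := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i], start) >= 0
	})
	for ; i < len(sorted) && bytes.HasPrefix(sorted[i], prefix); i++ {
		if len(page) == limit {
			return page, sorted[i] // first unread key becomes the cursor
		}
		page = append(page, sorted[i])
	}
	return page, nil
}

func main() {
	keys := [][]byte{[]byte("A:1"), []byte("A:2"), []byte("A:3"), []byte("B:1")}

	page, next := scanPrefix(keys, []byte("A:"), nil, 2)
	token := base64.URLEncoding.EncodeToString(next) // safe to embed in JSON or a URL
	fmt.Printf("%s cursor=%s\n", page, token)

	// Later: decode the token and continue where the scan stopped.
	cursor, _ := base64.URLEncoding.DecodeString(token)
	page, next = scanPrefix(keys, []byte("A:"), cursor, 2)
	fmt.Printf("%s done=%v\n", page, next == nil)
}
```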

Compared to pagination by page number, this has some advantages and some disadvantages. The main advantage is that it performs better when there are many elements to skip: with page numbers, the computer has to read every skipped entry from the B-Tree only to drop it.

The B-Tree is really good at jumping to a particular key, but it's not very good at jumping N keys ahead. It has to move there one step at a time.

On the other hand, using a cursor means that as an end-user, we do not have random access: we can only grab the next page. If for some reason the user wants to skip to page 20, they have to hit "next page" 20 times, which is a lot more wasteful than the computer skipping 20 pages when reading the B-Tree.

Another advantage of the cursor is that it's more stable. It's like you left a bookmark where you were reading so you can come grab it again next time to continue where you left off. Even if items got added or deleted somewhere else in the list, you will still continue exactly where you left off.
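
Here is a small sketch of that stability difference, using a sorted slice of strings in place of the index. Both `pageByOffset` and `pageByCursor` are hypothetical helpers written for this illustration, and the cursor is again the first item the reader has not seen yet.

```go
package main

import (
	"fmt"
	"sort"
)

// pageByOffset skips offset items and returns the next limit items.
func pageByOffset(sorted []string, offset, limit int) []string {
	if offset > len(sorted) {
		offset = len(sorted)
	}
	end := offset + limit
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[offset:end]
}

// pageByCursor seeks to the first item >= cursor and returns up to limit items.
func pageByCursor(sorted []string, cursor string, limit int) []string {
	start := sort.SearchStrings(sorted, cursor)
	end := start + limit
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end]
}

func main() {
	items := []string{"b", "c", "d", "e"}
	first := pageByOffset(items, 0, 2) // the reader sees b and c
	cursor := items[2]                 // the scan stopped at "d"

	// Someone inserts "a" before the reader asks for the second page.
	items = append([]string{"a"}, items...)

	fmt.Println("first page:      ", first)
	fmt.Println("offset, page two:", pageByOffset(items, 2, 2))
	fmt.Println("cursor, page two:", pageByCursor(items, cursor, 2))
}
```

After the insert, the offset-based second page starts at "c", which the reader has already seen, while the cursor-based page starts at "d" as expected.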

If you are interested in the topic, you can look up "cursor pagination vs offset pagination". Here's a Grok summary:

https://x.com/i/grok/share/6pA5jISvoRiKDFrdwQerQkiNV

The use case best suited to cursor pagination is "infinite scrolling".

Now, here's how we can incorporate cursor pagination into our UI:

Add a cursor field to the response and the request `Cursor: []byte`
Pass the cursor from the request to the query function `vbolt.ReadTermTargets` and pass the output cursor back in the response.
UI retains the cursor value
"fetch more" button sends the cursor value

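As a rough illustration of this flow (not the actual code from this episode), the sketch below threads a cursor through a request and a response. Only the `Cursor []byte` field comes from the steps above; `PostsQuery`, `PostsResponse`, `PostsByUser`, and `readTargets` are hypothetical names, and `readTargets` is an in-memory stand-in for the real index query, whose signature is not reproduced here.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
	"sort"
)

// Hypothetical request and response shapes for this sketch.
type PostsQuery struct {
	UserId int
	Cursor []byte
}

type PostsResponse struct {
	PostIds []int
	Cursor  []byte // empty when there is nothing more to fetch
}

const pageSize = 2

// readTargets is a stand-in for the real index query. The cursor holds the
// first unread post id as a big-endian uint64, and resuming seeks to it.
func readTargets(sortedIds []int, cursor []byte, limit int) (ids []int, next []byte) {
	start := 0
	if len(cursor) == 8 {
		// Seek to the first id >= the id stored in the cursor.
		start = sort.SearchInts(sortedIds, int(binary.BigEndian.Uint64(cursor)))
	}
	end := start + limit
	if end >= len(sortedIds) {
		return sortedIds[start:], nil
	}
	return sortedIds[start:end], binary.BigEndian.AppendUint64(nil, uint64(sortedIds[end]))
}

// PostsByUser passes the request cursor in and the output cursor back out.
func PostsByUser(index map[int][]int, req PostsQuery) PostsResponse {
	var resp PostsResponse
	resp.PostIds, resp.Cursor = readTargets(index[req.UserId], req.Cursor, pageSize)
	return resp
}

func main() {
	index := map[int][]int{1: {10, 11, 12}}

	first := PostsByUser(index, PostsQuery{UserId: 1})
	// The UI keeps first.Cursor and sends it back when "fetch more" is clicked.
	more := PostsByUser(index, PostsQuery{UserId: 1, Cursor: first.Cursor})

	out, _ := json.Marshal(first)
	fmt.Println(string(out), more.PostIds)
}
```

One convenient detail if the payload travels as JSON: Go's `encoding/json` writes `[]byte` fields as base64 strings and decodes them back, so the cursor needs no manual encoding step in that case.
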
Here are the changes to the backend code:

We also update the PostsByHashtag function in the same way (not shown for brevity).

Here are the changes to the UI code:

Notice how the callbacks for fetching by userid and fetching by hashtags are almost identical.

This is not very good. We'll discuss how to collapse these duplicate functions in the next episode. Although it was not planned, it's a very good topic that deserves its own treatment rather than being shoved in as a side quest in an article about cursor pagination.

Now, if you watch closely, there's a bug: fetching more does not seem to be working for hashtags! Even though the response does contain the correct data!

What gives?

A little bit of debugging reveals that the code for listing the posts was using `data.Posts`, which is the posts from the initial page fetch, when it should have been using `form.posts`, which is the list we retain in the UI state.

This is one of those problems that happen when you have mostly duplicate code doing the same things with slight variations! As you keep making changes, you have to remember to update both places, and it's easy to forget things. Things are still small now, so the mistakes are harmless and easy to find, but you can imagine how things would get out of hand as the number of code path combinations grows exponentially.

In the next episode, we'll show how we collapse the code paths without resorting to OOP-like abstractions. The solution will be so much better and so much simpler! Stay tuned!

Here's the code for today's episode.

Note: I updated vbeam and vbolt. If you are pulling from github, make sure to run `go mod tidy` after pulling.

Download the code:

EP005.zip

View the code online:

HandCraftedForum/tree/EP005

Hasen Judi
