API Pagination Problem

When we think of pagination – we think SELECT * FROM MYTABLE LIMIT {skip}, {take} – ahhhhhh… ignorance is bliss.

Let’s go through some of the ways we expose pagination through a Web API and dig a little deeper.

Standard Pagination

This is probably the most common type of pagination you’ll see, used by the majority of Web APIs. You know, {baseUrl}?p=1 or {baseUrl}/page/1.

The response from an API like this includes a page count and/or the total number of items, and page numbering may start at either 1 or 0.

So a full response might look something like this:

{
  "result": [
    { "title": "Title 1", "description": "Description of Item 1" },
    { "title": "Title 2", "description": "Description of Item 2" }
  ],
  "pageCount": 25
}

A slightly different variation is the hypermedia style, where each response links you to the next page.

{
  "result": [
    { "title": "Title 1", "description": "Description of Item 1" },
    { "title": "Title 2", "description": "Description of Item 2" }
  ],
  "next": "https://api.mydomain.com/items?p=2"
}

This is probably the simplest pagination you can implement. It’s easy to understand, and it satisfies the majority of use cases.
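To make the two response shapes above concrete, here is a minimal Python sketch. The function name paginate, the default page size of 2, and the choice of 1-based pages are my assumptions for illustration; a real API would read from a database rather than a list.

```python
import math


def paginate(items, page, page_size=2, base_url="https://api.mydomain.com/items"):
    """Return one page of `items`. Pages are 1-based in this sketch."""
    page_count = math.ceil(len(items) / page_size)
    start = (page - 1) * page_size
    body = {
        "result": items[start:start + page_size],
        "pageCount": page_count,
    }
    # Hypermedia variant: only link to a next page if there is one.
    if page < page_count:
        body["next"] = f"{base_url}?p={page + 1}"
    return body


items = [{"title": f"Title {n}"} for n in range(1, 6)]  # 5 hypothetical items
first = paginate(items, page=1)
last = paginate(items, page=3)
```

With five items and two per page, the first response carries a next link to ?p=2, and the third (final) page has no next link at all, which is how a client knows to stop.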

Offset Pagination

Another way of solving pagination is to use an offset rather than going through content page by page, like a book. This is basically a slightly lower-level version of the standard pagination solution, but it can give us a bit more flexibility.

The idea is that we use some property as our pointer, or cursor.

For example:

{baseUrl}?skip=15&take=15

This example uses the positional index as our cursor.

This would start from the 16th item (index 15 with 0-based indexing) and take 15 items, which gives you items 16 through 30 (indexes 15 to 29).
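The arithmetic is easy to get off by one, so here is a small Python sketch of it. The helper names take_window and page_to_window are made up for this example, and the list stands in for whatever storage sits behind the API.

```python
def take_window(items, skip, take):
    """Return `take` items starting at 0-based index `skip`."""
    return items[skip:skip + take]


def page_to_window(page, page_size):
    """Translate a 1-based page number into the equivalent (skip, take)."""
    return (page - 1) * page_size, page_size


items = list(range(1, 101))  # hypothetical items numbered 1..100
window = take_window(items, skip=15, take=15)
```

Here window starts at item 16 and ends at item 30, and page_to_window(2, 15) gives the same (15, 15) pair, which shows that page-based pagination is just a fixed-size special case of skip and take.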

But why would you design an API like this? Isn’t it just a lower-level version of the standard pagination approach, which means more work for consumers, who now have to keep track of the number of items shown so far, their index and so on, just to figure out how to paginate your content? Yes, this is all true, it’s a pretty bad API design… but bear with me.

Let’s see another example that might make a bit more sense.

{baseUrl}?from=db3f73dc-0acc-4382-99b3-0240e0e497b3

Now, using a natural id as a cursor, we can paginate through our dataset by just keeping track of the last item that we have in the client. Sound familiar? This is a great solution for infinite scroll.
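Here is a rough Python sketch of that idea. I’m assuming the from value is the id of the last item the client already has, so the server returns the items that come after it. The function name page_after and the short ids are invented for the example, and in a real database the lookup would usually be a filtered query on an ordered column rather than a list scan.

```python
def page_after(items, from_id=None, take=2):
    """Return up to `take` items that come after the item with id `from_id`."""
    start = 0
    if from_id is not None:
        ids = [item["id"] for item in items]
        start = ids.index(from_id) + 1  # raises ValueError for an unknown id
    return items[start:start + take]


# Simulated infinite scroll: the client only remembers the last id it received.
items = [{"id": "a"}, {"id": "b"}, {"id": "c"}, {"id": "d"}, {"id": "e"}]
seen = []
cursor = None
while True:
    batch = page_after(items, cursor)
    if not batch:
        break
    seen.extend(batch)
    cursor = batch[-1]["id"]
```

The client never has to count items or track indexes. It just hands back the last id it saw until an empty batch tells it there is nothing left.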

What About Highly Volatile Data?

What if your data changes every second, affecting the results between one API call and the next?

One solution to this problem is to use some form of session, whereby the response to your initial request contains some type of session token, which is then passed along with subsequent requests.

Another way of doing this would be to use the hypermedia style, with next and previous tokens that know about the session. This is what Twitter and Facebook do.

The session could be implemented in various ways, depending on how you want to go about things. The simplest implementation would be to timestamp when the initial request was performed, then filter the data in each paginated request to only include items created before that timestamp.
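A minimal Python sketch of the timestamp approach might look like this. The names snapshot_page and created_at, the sample dates, and the newest-first ordering are all assumptions of mine; the session token here is nothing more than the timestamp itself.

```python
from datetime import datetime, timezone


def at(second):
    """Build a hypothetical UTC timestamp for the sample data."""
    return datetime(2024, 1, 1, 0, 0, second, tzinfo=timezone.utc)


def snapshot_page(items, session_started_at, skip, take):
    """Page only through items created before the session started."""
    visible = [i for i in items if i["created_at"] < session_started_at]
    return visible[skip:skip + take]


# Items are listed newest first, as in a typical feed.
items = [
    {"title": "C", "created_at": at(3)},
    {"title": "B", "created_at": at(2)},
    {"title": "A", "created_at": at(1)},
]
session = at(10)  # taken when the first request arrives
page1 = snapshot_page(items, session, skip=0, take=2)

# A new item arrives at the top of the feed between the two requests.
items.insert(0, {"title": "D", "created_at": at(20)})
page2 = snapshot_page(items, session, skip=2, take=2)
unfiltered_page2 = items[2:4]
```

Without the filter, the second request returns B again because the new item pushed everything down by one. With the filter, the second page contains only A, so the client sees each item exactly once.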

Caveat

If you’re using some form of ranking algorithm based on the items you’re paginating over, then you’re pretty much out of luck – order stability becomes impossible.

For example, if you were paginating through search results, a change in an item’s data could shift its ranking from page 1 to page 2, so as you navigate from page 1 to page 2, you get presented with the same item twice.
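To see the problem in miniature, here is a toy Python sketch. The scores and the ranked_page function are invented for illustration and stand in for whatever ranking a real search engine computes.

```python
def ranked_page(scores, page, page_size=2):
    """Rank item names by descending score, then return one 1-based page."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    start = (page - 1) * page_size
    return ranked[start:start + page_size]


scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
page1 = ranked_page(scores, page=1)

# Item "b" changes between the two requests and drops below "c".
scores["b"] = 0.65
page2 = ranked_page(scores, page=2)
```

The first page shows a and b. After the score change the order becomes a, c, b, d, so the second page shows b and d. The client sees b twice and never sees c at all.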

Thoughts?

What are your thoughts? Have you seen other solutions out in the wild? Do you know of any papers that discuss ranking algorithms and order stability?

Hit me up 🙂
