When we think of pagination, we think SELECT * FROM MYTABLE LIMIT {skip}, {take}. Ahhhhhh… ignorance is bliss.
Let’s go through some of the ways we expose pagination through a Web API and dig a little deeper.
Standard Pagination
This is probably the most common type of pagination you’ll see with the majority of Web APIs. You know, {baseUrl}?p=1, or {baseUrl}/page/1.
The response from an API like this includes a page count and/or a number of items, and the first page may be numbered 1 or 0.
So a full response might look something like this:
{
  "result": [
    { "title": "Title 1", "description": "Description of Item 1" },
    { "title": "Title 2", "description": "Description of Item 2" }
  ],
  "pageCount": 25
}
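To make the shape of that response concrete, here is a minimal sketch of the server side, assuming the whole dataset is an in-memory list and pages are numbered from 1. The function name paginate and the page_size parameter are made up for illustration; a real API would push the slicing down to the database.

```python
import math

def paginate(items, page, page_size):
    """Return one 1-based page of items plus the total page count."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    page_count = math.ceil(len(items) / page_size)
    start = (page - 1) * page_size
    # Slicing past the end of the list simply yields an empty page.
    return {"result": items[start:start + page_size], "pageCount": page_count}

# 50 hypothetical items at 2 per page gives the 25 pages from the sample response.
items = [{"title": f"Title {i}"} for i in range(1, 51)]
response = paginate(items, page=1, page_size=2)
```

Asking for a page past the end returns an empty result here rather than an error, which is one of several reasonable choices.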
The other, slightly different variation is the hypermedia style, where each response links you to the next page.
{
  "result": [
    { "title": "Title 1", "description": "Description of Item 1" },
    { "title": "Title 2", "description": "Description of Item 2" }
  ],
  "next": "https://api.mydomain.com/items?p=2"
}
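A rough sketch of how a server might build that next link, again assuming an in-memory list and 1-based pages. The helper name paginate_with_link is hypothetical, and leaving out next on the last page is my own assumption about how to signal the end.

```python
from urllib.parse import urlencode

def paginate_with_link(items, page, page_size, base_url):
    """Return a page of items and, if more items remain, a link to the next page."""
    start = (page - 1) * page_size
    body = {"result": items[start:start + page_size]}
    if start + page_size < len(items):
        # Only advertise a next page when there is something left to fetch.
        body["next"] = f"{base_url}?{urlencode({'p': page + 1})}"
    return body

items = [{"title": f"Title {i}"} for i in range(1, 6)]
first = paginate_with_link(items, 1, 2, "https://api.mydomain.com/items")
last = paginate_with_link(items, 3, 2, "https://api.mydomain.com/items")
```

The nice part for consumers is that they never compute page numbers themselves; they just follow next until it disappears.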
This is probably the simplest pagination you can implement. It’s easy to understand, and the majority of use cases can be satisfied with this solution.
Offset Pagination
Another way of solving pagination is to use an offset rather than paginating through content page by page like a book. This is basically a slightly lower-level version of the standard pagination solution, but it can give us a bit more flexibility.
The idea is that we use some property as our pointer / cursor.
For example:
{baseUrl}?skip=15&take=15
This example uses the positional index as our cursor.
This would start from the 16th item (index 15 with a 0-based index) and take 15 items, which would result in items 16 through 30 (indices 15 to 29).
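The arithmetic is easy to get off by one, so here is a tiny sketch that checks it, assuming a plain list standing in for the dataset. The function name skip_take is made up for illustration.

```python
def skip_take(items, skip, take):
    """Skip the first `skip` items, then return the next `take` items."""
    return items[skip:skip + take]

# Items numbered 1..100, so the value of each item is its 1-based position.
items = list(range(1, 101))
window = skip_take(items, skip=15, take=15)
# window holds the 16th through 30th items, i.e. indices 15 to 29.
```

In SQL terms this corresponds to the LIMIT {skip}, {take} query from the top of the post.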
But why would you design an API like this? Isn’t it just a lower-level version of the standard pagination approach, which means more work for consumers, who now have to keep track of the number of items shown so far, their indexes and so on, just to figure out how to paginate your content? Yes, this is all true, it’s a pretty bad API design… but bear with me.
Let’s see another example that might make a bit more sense.
{baseUrl}?from=db3f73dc-0acc-4382-99b3-0240e0e497b3
Now, using a natural id as a cursor, we can paginate through our dataset by just keeping track of the last item that we have in the client. Sound familiar? This is a great solution for infinite scroll.
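Here is a minimal sketch of what the server could do with that id, using an in-memory SQLite table. It assumes results are ordered by the id itself, which is the simplest case; the table layout, the short ids and the helper name page_after are all hypothetical. If you sort by something else (say, creation date), the cursor has to encode that sort key as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(f"id-{i:03d}", f"Title {i}") for i in range(1, 8)],
)

def page_after(conn, cursor_id, take):
    """Return up to `take` rows whose id sorts after cursor_id (None means start)."""
    if cursor_id is None:
        rows = conn.execute(
            "SELECT id, title FROM items ORDER BY id LIMIT ?", (take,)
        )
    else:
        rows = conn.execute(
            "SELECT id, title FROM items WHERE id > ? ORDER BY id LIMIT ?",
            (cursor_id, take),
        )
    return rows.fetchall()

first = page_after(conn, None, 3)
# The client only remembers the id of the last item it has seen.
second = page_after(conn, first[-1][0], 3)
```

Because the query is anchored to an item rather than a position, a row inserted earlier in the ordering does not shift the next page the way it would with skip and take.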
What About Highly Volatile Data?
What if your data changes every second, affecting the results between one API call and the next?
One solution to this problem is to use some form of session, whereby the response to your initial request contains some type of session token, and this token is passed along with the subsequent requests.
Another way of doing this would be to use the hypermedia style, with tokens for next and previous that know about the session. This is what Twitter and Facebook do.
The session could be implemented in various ways, depending on how you want to go about things. The simplest implementation would be to timestamp when the initial request was performed and filter the data included in the paginated requests to only those items created before the timestamp.
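A minimal sketch of the timestamp idea, assuming each item carries a created_at field and that the session token is nothing more than the time of the first request serialised as a string. The names start_session and page_in_session are made up, and the filter keeps items created at or before the snapshot.

```python
from datetime import datetime, timedelta, timezone

def start_session(now):
    """The 'session token' is just the time of the first request."""
    return now.isoformat()

def page_in_session(items, session_token, skip, take):
    """Page through only the items that existed when the session started."""
    snapshot = datetime.fromisoformat(session_token)
    visible = [i for i in items if i["created_at"] <= snapshot]
    visible.sort(key=lambda i: i["created_at"], reverse=True)  # newest first
    return visible[skip:skip + take]

t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
items = [
    {"title": f"Title {n}", "created_at": t0 - timedelta(minutes=n)}
    for n in range(1, 5)
]
token = start_session(t0)
page1 = page_in_session(items, token, 0, 2)

# A new item arrives between the two requests...
items.append({"title": "Late arrival", "created_at": t0 + timedelta(seconds=30)})
# ...but the second page is unaffected because it is filtered by the token.
page2 = page_in_session(items, token, 2, 2)
```

Without the filter, the late arrival would push everything down by one and the second request would return Title 2 again. Note that this only protects against inserts; edits and deletes of existing items need more than a timestamp.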
Caveat
If you’re using some form of ranking algorithm based on the items you’re paginating over, then you’re pretty much out of luck – order stability becomes impossible.
For example, if you were paginating through search results, a change in an item’s data could shift its ranking from page 1 to page 2, so as you navigate from page 1 to 2, you get presented with the same item twice.
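To see the problem in miniature, here is a small sketch with made-up items and scores, assuming results are re-ranked on every request. The function name ranked_page is hypothetical.

```python
def ranked_page(scores, page, page_size):
    """Rank items by score (highest first) and return one 1-based page of names."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    start = (page - 1) * page_size
    return ranked[start:start + page_size]

scores = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}
page1 = ranked_page(scores, 1, 2)  # ['A', 'B']

# B's data changes between the two requests and its score drops below C's.
scores["B"] = 0.65
page2 = ranked_page(scores, 2, 2)  # ['B', 'D']
```

B shows up on both pages, and C, which moved up to page 1 in the meantime, is never shown at all.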
Thoughts?
What are your thoughts? Have you seen other solutions out in the wild? Know of any papers that discuss ranking algorithms and order stability?
Hit me up 🙂