What’s the Best Way to Handle Pagination and Large Datasets?

When you work with a large number of mentions, the BrandMentions API returns results in pages. Instead of sending all data at once, the API splits it into smaller chunks that you retrieve page by page.

Handling pagination correctly is essential if you:

  • Need to fetch all mentions for a project

  • Work with large datasets

  • Want a reliable and scalable integration

How pagination works in the BrandMentions API

When you call an endpoint that returns lists, such as GetProjectMentions, the response contains:

  • A single page of results (for example up to 100 mentions)

  • Metadata about pagination (for example current page and total pages, depending on the endpoint)

You control which page you get with the page parameter:

  • page=1 → first page

  • page=2 → second page

  • And so on, until you reach the last page
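
For illustration, here is a minimal Python sketch of requesting a single page. The endpoint URL, the command and authentication parameter names, and the data field are assumptions for this example; check the API reference for the exact values your account uses.

import requests

API_URL = "https://api.brandmentions.com/command.php"  # placeholder endpoint URL

# Hypothetical parameter names; confirm them against the API reference.
params = {
    "command": "GetProjectMentions",
    "api_key": "YOUR_API_KEY",
    "project_id": "YOUR_PROJECT_ID",
    "page": 2,        # which page of results to fetch
    "per_page": 100,  # how many mentions per page
}

response = requests.get(API_URL, params=params, timeout=30)
response.raise_for_status()
mentions = response.json().get("data", [])  # the mentions on this page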

Your integration needs to:

  1. Fetch the first page

  2. Read the pagination info from the response

  3. Loop over all remaining pages until you have retrieved all mentions you need

Best practices for handling pagination and large datasets

1. Use a loop to iterate through pages

Use a for or while loop in your code to step through all pages.

Basic pattern:

  1. Start with page 1

  2. Request each following page until the current page returns an empty data array

  3. For each page, process or store the mentions

This ensures you do not miss any results.
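
A minimal sketch of this loop in Python, using the same assumed endpoint URL, parameter names, and data field as the example above:

import requests

API_URL = "https://api.brandmentions.com/command.php"  # placeholder endpoint URL

def fetch_all_mentions(api_key, project_id, per_page=100):
    """Fetch every page of mentions until an empty page signals the end."""
    mentions = []
    page = 1
    while True:
        params = {
            "command": "GetProjectMentions",  # hypothetical parameter names
            "api_key": api_key,
            "project_id": project_id,
            "page": page,
            "per_page": per_page,
        }
        response = requests.get(API_URL, params=params, timeout=30)
        response.raise_for_status()
        data = response.json().get("data", [])
        if not data:           # empty page -> all mentions retrieved
            break
        mentions.extend(data)  # or process / store each page here instead
        page += 1
    return mentions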

2. Respect rate limits

When you fetch many pages, you might send many requests in a short time. To avoid rate limit or abuse errors:

  • Add a small delay between requests (for example 200–500 ms)

  • Avoid tight loops that hammer the API

  • Implement retry logic with exponential backoff if you get transient errors

This keeps your integration stable and friendly to the API.
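
One way to do this, sketched below, is a small wrapper (the helper name get_page_with_backoff is hypothetical) that pauses briefly after each request and retries transient failures with exponential backoff:

import time
import requests

def get_page_with_backoff(url, params, max_retries=5, delay=0.3):
    """Fetch one page politely: pause between requests, back off on transient errors."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, params=params, timeout=30)
            response.raise_for_status()
            time.sleep(delay)  # small pause (200-500 ms) between successive requests
            return response.json()
        except requests.RequestException:
            time.sleep(2 ** attempt)  # exponential backoff: 1 s, 2 s, 4 s, ...
    raise RuntimeError("Page could not be fetched after several retries")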

3. Store data incrementally, not all in memory

Large datasets can easily contain tens of thousands of mentions. Loading everything into memory at once can:

  • Slow down your script

  • Cause memory issues in long running jobs

Instead:

  • Stream each page into a database, data warehouse, CSV, or another storage layer

  • Optionally process and aggregate as you go (for example counting mentions by date)

This makes your integration more scalable.
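
For example, you could append each page to a CSV file as it arrives instead of collecting everything in one list. The sketch below assumes each mention is a dictionary, and the field names are hypothetical:

import csv

def append_page_to_csv(mentions, path="mentions.csv", write_header=False):
    """Append one page of mentions to a CSV file instead of holding all pages in memory."""
    fieldnames = ["id", "title", "url", "date"]  # hypothetical fields; adjust to your data
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        writer.writerows(mentions)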

4. Handle errors gracefully

When you make many API calls in sequence, some will eventually fail.

Good practices:

  • Wrap each request in try/except (or equivalent)

  • On failure:

    • Log the error code and page number

    • Retry a few times with exponential backoff

    • Optionally skip the page and continue, if appropriate for your use case

Your pagination loop should not crash completely just because one page failed once.
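
A rough sketch of this pattern, again with hypothetical request parameters and a data response field:

import logging
import time
import requests

def fetch_page_safely(url, params, page, max_retries=3):
    """Try one page a few times; log and skip it rather than crashing the whole loop."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, params={**params, "page": page}, timeout=30)
            response.raise_for_status()
            return response.json().get("data", [])
        except requests.RequestException as exc:
            logging.warning("Page %s failed on attempt %s: %s", page, attempt + 1, exc)
            time.sleep(2 ** attempt)  # exponential backoff before the next attempt
    logging.error("Giving up on page %s after %s attempts", page, max_retries)
    return []  # skip this page and let the pagination loop continue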

5. Tune page size with per_page

Endpoints like GetProjectMentions support a per_page parameter that controls how many mentions are returned per page.

  • Larger per_page → fewer requests, more data per response

  • Smaller per_page → more requests, but lighter responses

A good starting strategy:

  • Use the maximum page size allowed (typically 100) for batch jobs or exports

  • Use smaller sizes only when you have strict memory or latency constraints
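
For instance, assuming the parameter names from the sketches above:

# Batch export: fewer requests, larger responses (assuming 100 is the maximum allowed).
export_params = {"command": "GetProjectMentions", "page": 1, "per_page": 100}

# Memory- or latency-sensitive job: lighter responses, more requests.
light_params = {"command": "GetProjectMentions", "page": 1, "per_page": 25}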

In summary, to handle pagination and large datasets with the BrandMentions API:

  • Use a loop to iterate through all pages

  • Respect rate limits with small delays between requests

  • Store data incrementally, not all in memory

  • Implement robust error handling and retries

  • Use per_page to balance request count and response size

With these practices in place, your integration will handle even very large projects efficiently, reliably, and safely.
