Building a Mailchimp Campaign Extractor with Python

Recently, I needed to extract campaign data from Mailchimp for the extraction phase of a data engineering training project. With a data integration platform such as Airbyte this is straightforward: I had all the tables moved into Microsoft Azure Blob Storage within the hour. The more involved part was calling the API with Python and working out how to update the data incrementally. I ended up writing a pair of Python scripts: one for the historical data, and one to run daily as an incremental update.

The Goal

The goal was simple:

  • Fetch all past campaigns every week
  • Fetch only yesterday’s campaigns every day going forward

Both sets of data are saved to a local JSON file for further processing.

 

Step 1: Setting Up the Environment

Working in Visual Studio Code, and to keep secrets out of my codebase, I used python-dotenv to manage my Mailchimp API key. After adding it to a .env file, the script loads it at runtime:

from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv('MAILCHIMP_API_KEY')

 

With that in place, I initialized the Mailchimp client. Note that the SDK also expects the server (data centre) prefix, which is the part after the dash in your API key:

import mailchimp_marketing as MailchimpMarketing

client = MailchimpMarketing.Client()
client.set_config({
    "api_key": api_key,
    "server": "us21"  # replace with your own data centre prefix
})

As the API key is confidential, it's important to create a .gitignore file and add the .env file to it, so the key is never committed.
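For reference, the two files look like this (the key shown is a placeholder, not a real one):

```text
# .env  (kept out of version control)
MAILCHIMP_API_KEY=your-api-key-here-us21

# .gitignore
.env
```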

 

Step 2: Full Load - Getting All Campaigns

For the full load script, I wanted to retrieve every campaign Mailchimp had on record up to yesterday. The API supports pagination, so I used a loop to handle offset-based paging, fetching 1000 campaigns at a time:

all_campaigns = []
offset = 0
while True:
    response = client.campaigns.list(
        count=1000,
        offset=offset,
        before_create_time=yesterday_str
    )
    batch = response.get("campaigns", [])
    all_campaigns.extend(batch)
    if len(batch) < 1000:  # a short page means we've reached the end
        break
    offset += 1000

All campaigns were aggregated into a list and saved as mailchimp_campaigns.json. The API returns at most 1000 records per request, so pagination was necessary to ensure more than just the first 1000 campaigns were retrieved.

 

Step 3: Incremental Load - Yesterday's Campaigns Only

For the daily run, I filtered the API query using since_create_time and before_create_time, targeting just the 24-hour window of "yesterday." This avoids duplicate data and keeps each day’s run lightweight:

response = client.campaigns.list(
    since_create_time="2025-05-11T00:00:00+00:00",
    before_create_time="2025-05-11T23:59:59+00:00"
)
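In the real script those timestamps are computed rather than hard-coded, so the job can run unattended each day. A small helper using only the standard library:

```python
from datetime import datetime, timedelta, timezone

def yesterday_window(now=None):
    """Return (since, before) ISO-8601 strings covering all of yesterday in UTC."""
    now = now or datetime.now(timezone.utc)
    start = (now - timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    end = start.replace(hour=23, minute=59, second=59)
    return start.isoformat(), end.isoformat()
```

The two returned strings plug straight into since_create_time and before_create_time.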

The script checks if any campaigns were found and logs the result before saving.

 

Error Handling and Logging

I wrapped the API calls in a try/except block to catch and print ApiClientError exceptions from the Mailchimp SDK. Basic console logging gives feedback on how many campaigns were fetched and whether the output file was written.
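A stripped-down version of that pattern. Here fetch_page is an illustrative stand-in for the real client.campaigns.list call, and the real script catches the SDK's ApiClientError rather than a bare Exception:

```python
def fetch_with_logging(fetch_page, **kwargs):
    """Call a Mailchimp endpoint, log the result, and return [] on failure."""
    try:
        response = fetch_page(**kwargs)
        campaigns = response.get("campaigns", [])
        print(f"Fetched {len(campaigns)} campaigns")
        return campaigns
    except Exception as err:  # real script: except ApiClientError as err
        print(f"Mailchimp API error: {err}")
        return []
```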

 

Final Thoughts

This was a simple but useful project. Writing clean, reusable scripts for both full and incremental data loads ensures flexibility for any ETL or analytics pipeline down the line.

You can find the code on GitHub.

Author:
Asha Daniels
Powered by The Information Lab
© 2025 The Information Lab