From API to Snowflake in 3 steps (part 2)

In this 3-part blog series, we’ll walk through the process of extracting data from the Amplitude API using Python, uploading the output files to an Amazon S3 bucket, and finally loading that data into a Snowflake table.

Part two: Testing your Python connection to your S3 bucket

Before building your full data pipeline, it’s a good idea to test your connection to AWS S3. This helps ensure your credentials are working, your bucket permissions are correct, and your file upload logic is sound.

Here’s a step-by-step outline to get you started:

Step 1: Create and Activate a Virtual Environment

Create an isolated Python environment for dependency management:

python -m venv venv
venv\Scripts\activate

(On macOS/Linux, activate the environment with source venv/bin/activate instead.)
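With the environment active, install the packages this walkthrough relies on: boto3 for talking to S3, python-dotenv for reading the .env file, and requests if you plan to include the optional API check further down.

pip install boto3 python-dotenv requests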

Step 2: Set Up Environment Variables

Create a .env file to store your AWS credentials securely:

AWS_ACCESS_KEY_ID=your_key_id
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=your_region

Step 3: Update .gitignore

To avoid pushing sensitive files to GitHub, update your .gitignore:

.env
venv/

Step 4: Create a Git Branch for Testing

Create and commit your setup in a dedicated Git branch:

git checkout -b test-s3-connection
git add .
git commit -m "Initial setup for S3 connection test"

Python Script to Test S3 Upload

Now, let’s write a Python script to test your S3 connection and upload functionality.

Step 1: Import Required Packages

import boto3
import os
from dotenv import load_dotenv

Step 2: Load AWS Credentials

load_dotenv()

s3 = boto3.client(
    's3',
    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
    region_name=os.getenv('AWS_REGION')
)
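Before uploading anything, it's worth a quick sanity check that the client can actually reach your bucket. Here's a minimal sketch, assuming your credentials have permission to access the bucket and that "my-bucket-name" is replaced with your own bucket name:

from botocore.exceptions import ClientError

bucket = "my-bucket-name"  # placeholder: your bucket name
try:
    # head_bucket is a lightweight call that fails if the bucket is missing or inaccessible
    s3.head_bucket(Bucket=bucket)
    print(f"Connected to s3://{bucket}")
except ClientError as e:
    print(f"Could not reach bucket: {e}")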

Step 3: (Optional) Test External API Before Upload

If you're retrieving data from an API before uploading to S3, validate the response:

import requests

response = requests.get("your_api_url")
if response.status_code != 200:
    raise Exception("API request failed with status code: " + str(response.status_code))

Step 4: Upload a Single File to S3 (Without Saving Locally)

def upload_to_s3(bucket, key, content):
    """Upload a string or bytes object straight to S3 without writing a local file."""
    s3.put_object(Bucket=bucket, Key=key, Body=content)

Test the function with a sample file:

upload_to_s3("my-bucket-name", "test/test_file.json", '{"test": "data"}')
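To confirm the upload landed where you expected, you can read the object straight back. A quick check, using the same bucket and key as the call above:

obj = s3.get_object(Bucket="my-bucket-name", Key="test/test_file.json")
print(obj["Body"].read().decode("utf-8"))  # should print {"test": "data"}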

Step 5: Upload Multiple Files (Scale Up)

def upload_multiple_data_to_s3(bucket, data_list):
    for key, content in data_list:
        print(f"Uploading to s3://{bucket}/{key}")
        s3.put_object(Bucket=bucket, Key=key, Body=content)
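To test the function, pass it a small list of (key, content) pairs. The file names and contents here are just placeholders for whatever your API extract produces:

sample_files = [
    ("test/events_2024-01-01.json", '{"day": 1}'),
    ("test/events_2024-01-02.json", '{"day": 2}'),
]
upload_multiple_data_to_s3("my-bucket-name", sample_files)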

Author:
Archie Boswell