Retrieving Data from Cloudflare for Reporting: Python SDK, API, and CLI Comparison
READER BEWARE: THE FOLLOWING WRITTEN ENTIRELY BY AI WITHOUT HUMAN EDITING.
Introduction
Cloudflare is a leading content delivery network (CDN) and security platform that protects and accelerates millions of websites worldwide. When managing Cloudflare infrastructure, you often need to extract configuration data, analytics metrics, and operational information for reporting, monitoring, and auditing purposes. Cloudflare provides three primary methods for retrieving this data:
- Python SDK (cloudflare) - Official Python library for programmatic access
- REST API - Direct HTTP requests to Cloudflare’s API endpoints
- CLI (Wrangler) - Command-line tool for Cloudflare operations
This guide explores all three approaches, demonstrating authentication methods, comparing their strengths and weaknesses, and showing practical examples for common reporting scenarios.
Why Extract Cloudflare Data Programmatically?
Common Use Cases
- Analytics Reporting: Extract traffic metrics, bandwidth usage, and threat analytics
- Security Auditing: Monitor firewall rules, rate limits, and security events
- Configuration Management: Export DNS records, page rules, and worker configurations
- Compliance: Document security settings and access controls for audits
- Cost Analysis: Track usage patterns to optimize Cloudflare plans and features
- Multi-Account Management: Aggregate data across multiple Cloudflare accounts
- Incident Response: Quickly query logs and analytics during security incidents
- Capacity Planning: Analyze historical trends to predict future resource needs
Benefits of Automation
- Consistency: Automated data extraction eliminates manual errors
- Speed: Retrieve data in seconds instead of clicking through the dashboard
- Scalability: Handle multiple zones and accounts simultaneously
- Integration: Connect Cloudflare data with your existing monitoring and BI tools
- Scheduling: Set up automated reports with cron jobs or CI/CD pipelines
- Historical Analysis: Programmatically collect data over time for trend analysis
Authentication Methods
All three approaches require authentication using API tokens or API keys. Cloudflare recommends using API tokens (scoped, more secure) over legacy API keys.
API Tokens vs API Keys
API Tokens (Recommended):
- Scoped permissions (read/write access to specific resources)
- Can be restricted to specific zones or accounts
- Can expire automatically
- Can be regenerated without affecting other integrations
- More secure and follows principle of least privilege
API Keys (Legacy):
- Full account access
- Cannot be scoped to specific resources
- No expiration
- Should be avoided for new integrations
Creating an API Token
- Log in to the Cloudflare Dashboard
- Go to My Profile → API Tokens
- Click Create Token
- Choose a template or create a custom token:
- Read All Resources: For reporting and analytics (recommended for read-only access)
- Edit Zone DNS: For DNS management
- Custom Token: Define specific permissions
- Set token permissions:
- Zone → Zone → Read: Access zone configuration
- Zone → Analytics → Read: Access analytics data
- Account → Account Settings → Read: Access account information
- Optional: Restrict by zone, account, or IP address
- Click Continue to summary → Create Token
- Important: Copy the token immediately; you won’t be able to see it again
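Once a token has been created, it can be checked before it goes into a reporting job. The sketch below calls Cloudflare's token verification endpoint (`/user/tokens/verify`) using only the standard library. The helper names `build_verify_request` and `token_is_active` are illustrative, not part of any Cloudflare library, and the response handling assumes the usual `success`/`result` envelope.

```python
import json
import urllib.request

VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify"


def build_verify_request(token):
    """Build (but do not send) the token verification request."""
    return urllib.request.Request(
        VERIFY_URL,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def token_is_active(token, timeout=10):
    """Send the request and report whether the token is reported as active.

    urlopen raises urllib.error.HTTPError for rejected tokens (for example
    HTTP 401), so callers may want to catch that as "not valid".
    """
    with urllib.request.urlopen(build_verify_request(token), timeout=timeout) as resp:
        body = json.load(resp)
    result = body.get("result") or {}
    return bool(body.get("success")) and result.get("status") == "active"
```

Calling `token_is_active(os.environ["CLOUDFLARE_API_TOKEN"])` at the start of a script gives an early, readable failure instead of an error halfway through a report.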
Storing Credentials Securely
Environment Variables (Recommended):
# For API Token (recommended)
export CLOUDFLARE_API_TOKEN="your_token_here"
# For API Key (legacy)
export CLOUDFLARE_API_KEY="your_api_key_here"
export CLOUDFLARE_EMAIL="your_email@example.com"
Configuration File:
# ~/.cloudflare/credentials
[default]
api_token = your_token_here
# Or for legacy API key
[default]
email = your_email@example.com
api_key = your_api_key_here
Never commit credentials to version control. Always use environment variables, secure credential storage, or secret management systems.
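The preference for tokens over legacy keys can also be made explicit in code. The following is a minimal sketch that turns the environment variables shown above into request headers, falling back to the email and key pair only when no token is set. `build_auth_headers` is a hypothetical helper name chosen for this example.

```python
import os


def build_auth_headers(env=None):
    """Return Cloudflare auth headers, preferring a scoped API token.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env

    token = env.get("CLOUDFLARE_API_TOKEN")
    if token:
        return {"Authorization": f"Bearer {token}"}

    email = env.get("CLOUDFLARE_EMAIL")
    api_key = env.get("CLOUDFLARE_API_KEY")
    if email and api_key:
        # Legacy key pair: full account access, so use it only as a fallback.
        return {"X-Auth-Email": email, "X-Auth-Key": api_key}

    raise RuntimeError(
        "Set CLOUDFLARE_API_TOKEN, or CLOUDFLARE_EMAIL and CLOUDFLARE_API_KEY"
    )
```

Failing loudly when nothing is configured avoids sending unauthenticated requests that would only surface later as a confusing API error.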
Method 1: Python SDK (cloudflare)
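The official Cloudflare Python SDK provides a high-level, Pythonic interface to the Cloudflare API. The examples in this guide use the legacy 2.x interface (import CloudFlare). Releases from 3.0 onward expose a different client (from cloudflare import Cloudflare) and will not run this code unchanged.
Installation
# Pin to the 2.x line so that "import CloudFlare" works
pip install "cloudflare==2.19.*"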
Basic Setup and Authentication
Using API Token (Recommended):
import CloudFlare
import os
# Initialize with API token from environment variable
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
Using API Key (Legacy):
import CloudFlare
import os
# Initialize with email and API key
cf = CloudFlare.CloudFlare(
email=os.getenv('CLOUDFLARE_EMAIL'),
key=os.getenv('CLOUDFLARE_API_KEY')
)
Configuration File:
# Reads from ~/.cloudflare/cloudflare.cfg by default
cf = CloudFlare.CloudFlare()
Example 1: List All Zones (Domains)
import CloudFlare
import os
def list_zones():
"""List all zones in the account."""
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
try:
zones = cf.zones.get()
print(f"Total zones: {len(zones)}")
print("\nZone Details:")
print("-" * 80)
for zone in zones:
print(f"Name: {zone['name']}")
print(f"ID: {zone['id']}")
print(f"Status: {zone['status']}")
print(f"Plan: {zone['plan']['name']}")
print(f"Name Servers: {', '.join(zone['name_servers'])}")
print("-" * 80)
except CloudFlare.exceptions.CloudFlareAPIError as e:
print(f"API Error: {e}")
if __name__ == "__main__":
list_zones()
Example 2: Retrieve DNS Records
import CloudFlare
import os
import json
def export_dns_records(zone_name):
"""Export all DNS records for a zone."""
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
try:
# Get zone ID
zones = cf.zones.get(params={'name': zone_name})
if not zones:
print(f"Zone {zone_name} not found")
return
zone_id = zones[0]['id']
# Get all DNS records
# Note: this call returns a single page of results; see Pagination under Best Practices
dns_records = cf.zones.dns_records.get(zone_id)
print(f"DNS Records for {zone_name}:")
print(json.dumps(dns_records, indent=2))
# Export to file
with open(f'{zone_name}_dns_records.json', 'w') as f:
json.dump(dns_records, f, indent=2)
print(f"\nExported {len(dns_records)} DNS records to {zone_name}_dns_records.json")
# Summary by type
record_types = {}
for record in dns_records:
record_type = record['type']
record_types[record_type] = record_types.get(record_type, 0) + 1
print("\nRecord Type Summary:")
for record_type, count in sorted(record_types.items()):
print(f" {record_type}: {count}")
except CloudFlare.exceptions.CloudFlareAPIError as e:
print(f"API Error: {e}")
if __name__ == "__main__":
export_dns_records("example.com")
Example 3: Retrieve Analytics Data
import CloudFlare
import os
from datetime import datetime, timedelta, timezone
import json

def get_zone_analytics(zone_name, days=7):
    """Retrieve analytics data for a zone."""
    cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
    try:
        # Get zone ID
        zones = cf.zones.get(params={'name': zone_name})
        if not zones:
            print(f"Zone {zone_name} not found")
            return
        zone_id = zones[0]['id']
        # Calculate the time range in UTC. The trailing 'Z' means UTC, so a
        # local-time datetime.now() would mislabel the window.
        now = datetime.now(timezone.utc).replace(microsecond=0)
        since = (now - timedelta(days=days)).strftime('%Y-%m-%dT%H:%M:%SZ')
        until = now.strftime('%Y-%m-%dT%H:%M:%SZ')
        # Get analytics dashboard
        analytics = cf.zones.analytics.dashboard.get(
            zone_id,
            params={
                'since': since,
                'until': until,
                'continuous': 'true'
            }
        )
        # Extract key metrics
        totals = analytics['totals']
        timeseries = analytics['timeseries']
        print(f"Analytics for {zone_name} (Last {days} days):")
        print("=" * 80)
        print(f"\nTotal Requests: {totals.get('requests', {}).get('all', 0):,}")
        print(f"Cached Requests: {totals.get('requests', {}).get('cached', 0):,}")
        print(f"Uncached Requests: {totals.get('requests', {}).get('uncached', 0):,}")
        print(f"Bandwidth (bytes): {totals.get('bandwidth', {}).get('all', 0):,}")
        print(f"Cached Bandwidth: {totals.get('bandwidth', {}).get('cached', 0):,}")
        print(f"Threats Blocked: {totals.get('threats', {}).get('all', 0):,}")
        print(f"Unique Visitors: {totals.get('uniques', {}).get('all', 0):,}")
        # Cache hit ratio (skipped when there was no traffic)
        total_requests = totals.get('requests', {}).get('all', 0)
        cached_requests = totals.get('requests', {}).get('cached', 0)
        if total_requests > 0:
            cache_hit_ratio = (cached_requests / total_requests) * 100
            print(f"Cache Hit Ratio: {cache_hit_ratio:.2f}%")
        # Export to file
        report = {
            'zone': zone_name,
            'period': f'Last {days} days',
            'generated': now.isoformat(),
            'totals': totals,
            'timeseries': timeseries
        }
        filename = f'{zone_name}_analytics_{days}days.json'
        with open(filename, 'w') as f:
            json.dump(report, f, indent=2)
        print(f"\nDetailed analytics exported to {filename}")
    except CloudFlare.exceptions.CloudFlareAPIError as e:
        print(f"API Error: {e}")

if __name__ == "__main__":
    get_zone_analytics("example.com", days=30)
Example 4: Firewall Rules Audit
import CloudFlare
import os
import csv
from datetime import datetime
def audit_firewall_rules(zone_name):
"""Audit firewall rules for compliance and reporting."""
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
try:
# Get zone ID
zones = cf.zones.get(params={'name': zone_name})
if not zones:
print(f"Zone {zone_name} not found")
return
zone_id = zones[0]['id']
# Get firewall rules
firewall_rules = cf.zones.firewall.rules.get(zone_id)
print(f"Firewall Rules for {zone_name}:")
print("=" * 100)
# Export to CSV
filename = f'{zone_name}_firewall_rules_{datetime.now().strftime("%Y%m%d")}.csv'
with open(filename, 'w', newline='') as csvfile:
fieldnames = ['Rule ID', 'Description', 'Action', 'Expression', 'Enabled', 'Priority']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
for rule in firewall_rules:
print(f"\nRule ID: {rule['id']}")
print(f"Description: {rule.get('description', 'N/A')}")
print(f"Action: {rule['action']}")
print(f"Expression: {rule['filter']['expression']}")
print(f"Enabled: {not rule.get('paused', False)}")
print(f"Priority: {rule.get('priority', 'N/A')}")
writer.writerow({
'Rule ID': rule['id'],
'Description': rule.get('description', 'N/A'),
'Action': rule['action'],
'Expression': rule['filter']['expression'],
'Enabled': 'Yes' if not rule.get('paused', False) else 'No',
'Priority': rule.get('priority', 'N/A')
})
print(f"\n{'=' * 100}")
print(f"Exported {len(firewall_rules)} firewall rules to {filename}")
# Summary by action
action_summary = {}
for rule in firewall_rules:
action = rule['action']
action_summary[action] = action_summary.get(action, 0) + 1
print("\nAction Summary:")
for action, count in sorted(action_summary.items()):
print(f" {action}: {count}")
except CloudFlare.exceptions.CloudFlareAPIError as e:
print(f"API Error: {e}")
if __name__ == "__main__":
audit_firewall_rules("example.com")
Example 5: Multi-Zone Report
import CloudFlare
import os
import pandas as pd
from datetime import datetime
def generate_multi_zone_report():
"""Generate a report across all zones."""
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
try:
zones = cf.zones.get()
report_data = []
for zone in zones:
zone_name = zone['name']
zone_id = zone['id']
# Get analytics for each zone
analytics = cf.zones.analytics.dashboard.get(
zone_id,
params={'since': -10080} # Last 7 days in minutes
)
totals = analytics['totals']
report_data.append({
'Zone': zone_name,
'Status': zone['status'],
'Plan': zone['plan']['name'],
'Total Requests': totals.get('requests', {}).get('all', 0),
'Cached Requests': totals.get('requests', {}).get('cached', 0),
'Bandwidth (GB)': totals.get('bandwidth', {}).get('all', 0) / (1024**3),
'Threats Blocked': totals.get('threats', {}).get('all', 0),
'Unique Visitors': totals.get('uniques', {}).get('all', 0)
})
# Create DataFrame
df = pd.DataFrame(report_data)
# Calculate cache hit ratio
df['Cache Hit Ratio (%)'] = (
(df['Cached Requests'] / df['Total Requests']) * 100
).round(2)
# Sort by total requests
df = df.sort_values('Total Requests', ascending=False)
# Display report
print("\nMulti-Zone Analytics Report (Last 7 Days)")
print("=" * 120)
print(df.to_string(index=False))
# Export to CSV
filename = f'cloudflare_multi_zone_report_{datetime.now().strftime("%Y%m%d")}.csv'
df.to_csv(filename, index=False)
print(f"\nReport exported to {filename}")
# Summary statistics
print("\n" + "=" * 120)
print("Summary Statistics:")
print(f"Total Zones: {len(df)}")
print(f"Total Requests: {df['Total Requests'].sum():,.0f}")
print(f"Total Bandwidth: {df['Bandwidth (GB)'].sum():.2f} GB")
print(f"Total Threats Blocked: {df['Threats Blocked'].sum():,.0f}")
print(f"Average Cache Hit Ratio: {df['Cache Hit Ratio (%)'].mean():.2f}%")
except CloudFlare.exceptions.CloudFlareAPIError as e:
print(f"API Error: {e}")
if __name__ == "__main__":
generate_multi_zone_report()
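One detail in the multi-zone report deserves care: a zone with no traffic in the window makes the cache hit ratio 0/0, which pandas reports as NaN. A small guard keeps that case explicit. This is a sketch only, and `cache_hit_ratio` is a hypothetical helper rather than part of the SDK.

```python
def cache_hit_ratio(cached_requests, total_requests):
    """Return the cache hit ratio as a percentage, or None when there was no traffic."""
    if total_requests <= 0:
        return None
    return round(cached_requests / total_requests * 100, 2)
```

Returning `None` rather than `0` separates "nothing was cached" from "there was nothing to cache", which matters when the values are averaged across zones.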
Method 2: Direct API Calls (HTTP Requests)
Using direct HTTP requests gives you complete control and works with any programming language or tool that can make HTTP requests.
API Documentation
Cloudflare’s API documentation: https://developers.cloudflare.com/api/
Base URL
https://api.cloudflare.com/client/v4/
Authentication Headers
Using API Token:
Authorization: Bearer YOUR_API_TOKEN
Using API Key:
X-Auth-Email: your_email@example.com
X-Auth-Key: YOUR_API_KEY
Example 1: List Zones with cURL
# Using API Token
curl -X GET "https://api.cloudflare.com/client/v4/zones" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-H "Content-Type: application/json" | jq '.'
# Using API Key
curl -X GET "https://api.cloudflare.com/client/v4/zones" \
-H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
-H "X-Auth-Key: $CLOUDFLARE_API_KEY" \
-H "Content-Type: application/json" | jq '.'
Example 2: Python with requests Library
import requests
import os
import json
class CloudflareAPI:
"""Direct API access using requests library."""
def __init__(self, api_token=None, email=None, api_key=None):
self.base_url = "https://api.cloudflare.com/client/v4"
if api_token:
self.headers = {
"Authorization": f"Bearer {api_token}",
"Content-Type": "application/json"
}
elif email and api_key:
self.headers = {
"X-Auth-Email": email,
"X-Auth-Key": api_key,
"Content-Type": "application/json"
}
else:
raise ValueError("Must provide either api_token or (email and api_key)")
def get(self, endpoint, params=None):
"""Make GET request to API."""
url = f"{self.base_url}/{endpoint}"
# A timeout keeps a stalled connection from hanging the report indefinitely
response = requests.get(url, headers=self.headers, params=params, timeout=30)
response.raise_for_status()
return response.json()
def list_zones(self):
"""List all zones."""
return self.get("zones")
def get_zone_analytics(self, zone_id, since=-10080):
"""Get zone analytics.
Args:
zone_id: Zone identifier
since: Minutes from now (negative) or ISO 8601 timestamp
"""
endpoint = f"zones/{zone_id}/analytics/dashboard"
params = {"since": since}
return self.get(endpoint, params=params)
def get_dns_records(self, zone_id):
"""Get all DNS records for a zone."""
endpoint = f"zones/{zone_id}/dns_records"
return self.get(endpoint)
def get_firewall_rules(self, zone_id):
"""Get firewall rules for a zone."""
endpoint = f"zones/{zone_id}/firewall/rules"
return self.get(endpoint)
def main():
# Initialize with API token
api = CloudflareAPI(api_token=os.getenv('CLOUDFLARE_API_TOKEN'))
# List zones
zones_response = api.list_zones()
if not zones_response['success']:
print(f"Error: {zones_response['errors']}")
return
zones = zones_response['result']
print(f"Found {len(zones)} zones:\n")
for zone in zones:
print(f"Zone: {zone['name']} (ID: {zone['id']})")
print(f"Status: {zone['status']}")
print(f"Plan: {zone['plan']['name']}")
# Get analytics for this zone
analytics_response = api.get_zone_analytics(zone['id'])
if analytics_response['success']:
totals = analytics_response['result']['totals']
print(f"Requests (7d): {totals.get('requests', {}).get('all', 0):,}")
print(f"Bandwidth (7d): {totals.get('bandwidth', {}).get('all', 0):,} bytes")
print("-" * 80)
if __name__ == "__main__":
main()
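The class above reads only the first page of each list endpoint. Cloudflare's list responses carry a `result_info` object with paging fields, so a generic loop can collect every page. The sketch below assumes the envelope contains `success`, `result`, and `result_info.total_pages`. `fetch_all_pages` and its `fetch_page` callback are names invented for this example.

```python
def fetch_all_pages(fetch_page, per_page=100):
    """Collect `result` items from every page of a list endpoint.

    `fetch_page(page, per_page)` must return the decoded JSON envelope
    for one page, for example by calling CloudflareAPI.get with
    params={"page": page, "per_page": per_page}.
    """
    items = []
    page = 1
    while True:
        envelope = fetch_page(page, per_page)
        if not envelope.get("success"):
            raise RuntimeError(f"Cloudflare API error: {envelope.get('errors')}")
        items.extend(envelope.get("result") or [])
        info = envelope.get("result_info") or {}
        # Stop when the last page is reached, or when no paging info is given.
        if page >= info.get("total_pages", page):
            break
        page += 1
    return items
```

Passing the page fetcher in as a callback keeps the loop independent of the HTTP client, so the same function works with `requests`, `urllib`, or a test double.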
Example 3: GraphQL Analytics API
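Cloudflare also provides a GraphQL API for advanced analytics queries. It is a single endpoint that accepts a POST request with the query in the JSON body. Cloudflare's documentation presents it as the replacement for the older zone analytics dashboard endpoint, so prefer it for new reporting work.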
import requests
import os
import json
from datetime import datetime, timedelta
def query_graphql_analytics(zone_tag, days=7):
"""Query analytics using GraphQL API for more advanced queries."""
url = "https://api.cloudflare.com/client/v4/graphql"
headers = {
"Authorization": f"Bearer {os.getenv('CLOUDFLARE_API_TOKEN')}",
"Content-Type": "application/json"
}
# Calculate date range
end_date = datetime.now()
start_date = end_date - timedelta(days=days)
# GraphQL query
query = """
query {
viewer {
zones(filter: {zoneTag: "%s"}) {
httpRequests1dGroups(
filter: {
date_geq: "%s"
date_leq: "%s"
}
limit: 1000
) {
sum {
requests
bytes
cachedRequests
cachedBytes
threats
}
dimensions {
date
}
}
}
}
}
""" % (zone_tag, start_date.strftime('%Y-%m-%d'), end_date.strftime('%Y-%m-%d'))
response = requests.post(
url,
headers=headers,
json={"query": query}
)
if response.status_code == 200:
data = response.json()
print(json.dumps(data, indent=2))
return data
else:
print(f"Error: {response.status_code}")
print(response.text)
return None
# Usage
query_graphql_analytics("your_zone_id_here", days=30)
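A GraphQL response can report query errors in its body even when the HTTP status is 200, so checking the status code alone is not enough. The sketch below checks the `errors` field and flattens the nested result of the `httpRequests1dGroups` query above into one row per day. `flatten_daily_groups` is a hypothetical helper, and the field names are taken from that query.

```python
def flatten_daily_groups(response):
    """Turn an httpRequests1dGroups response into one flat dict per day."""
    errors = response.get("errors")
    if errors:
        # GraphQL reports query problems in the body, often alongside HTTP 200.
        raise RuntimeError(f"GraphQL errors: {errors}")

    viewer = (response.get("data") or {}).get("viewer") or {}
    rows = []
    for zone in viewer.get("zones") or []:
        for group in zone.get("httpRequests1dGroups", []):
            totals = group.get("sum", {})
            rows.append({
                "date": group["dimensions"]["date"],
                "requests": totals.get("requests", 0),
                "bytes": totals.get("bytes", 0),
                "cached_requests": totals.get("cachedRequests", 0),
                "cached_bytes": totals.get("cachedBytes", 0),
                "threats": totals.get("threats", 0),
            })
    # ISO dates sort correctly as plain strings.
    return sorted(rows, key=lambda row: row["date"])
```

The flat rows can be passed directly to `csv.DictWriter` or `pandas.DataFrame` for reporting.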
Example 4: Batch API Requests
import requests
import os
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
def fetch_zone_data(zone_id, api_token):
"""Fetch several data types for one zone, one request at a time (the caller runs zones in parallel)."""
base_url = "https://api.cloudflare.com/client/v4"
headers = {
"Authorization": f"Bearer {api_token}",
"Content-Type": "application/json"
}
endpoints = {
'analytics': f'zones/{zone_id}/analytics/dashboard?since=-10080',
'dns': f'zones/{zone_id}/dns_records',
'firewall': f'zones/{zone_id}/firewall/rules',
'settings': f'zones/{zone_id}/settings'
}
results = {}
for name, endpoint in endpoints.items():
url = f"{base_url}/{endpoint}"
try:
response = requests.get(url, headers=headers)
response.raise_for_status()
results[name] = response.json()['result']
except Exception as e:
results[name] = {'error': str(e)}
return results
def generate_comprehensive_report():
"""Generate comprehensive report for all zones."""
api_token = os.getenv('CLOUDFLARE_API_TOKEN')
# Get all zones
response = requests.get(
"https://api.cloudflare.com/client/v4/zones",
headers={
"Authorization": f"Bearer {api_token}",
"Content-Type": "application/json"
}
)
zones = response.json()['result']
# Fetch data for all zones in parallel
with ThreadPoolExecutor(max_workers=5) as executor:
future_to_zone = {
executor.submit(fetch_zone_data, zone['id'], api_token): zone
for zone in zones
}
for future in as_completed(future_to_zone):
zone = future_to_zone[future]
try:
data = future.result()
print(f"\n{'=' * 80}")
print(f"Zone: {zone['name']}")
print(f"{'=' * 80}")
# Analytics
if 'analytics' in data and 'totals' in data['analytics']:
totals = data['analytics']['totals']
print(f"Requests: {totals.get('requests', {}).get('all', 0):,}")
print(f"Bandwidth: {totals.get('bandwidth', {}).get('all', 0):,} bytes")
# DNS Records
if 'dns' in data and not isinstance(data['dns'], dict):
print(f"DNS Records: {len(data['dns'])}")
# Firewall Rules
if 'firewall' in data and not isinstance(data['firewall'], dict):
print(f"Firewall Rules: {len(data['firewall'])}")
except Exception as exc:
print(f"{zone['name']} generated an exception: {exc}")
if __name__ == "__main__":
generate_comprehensive_report()
Method 3: Cloudflare CLI (Wrangler)
Wrangler is Cloudflare’s official CLI tool, primarily designed for Workers development but also useful for general Cloudflare operations.
Installation
npm (recommended):
npm install -g wrangler
Homebrew (macOS):
brew install cloudflare-wrangler2
Cargo (Rust; installs the deprecated Wrangler v1, not the current npm-distributed releases):
cargo install wrangler
Authentication
Interactive Login (OAuth):
wrangler login
This opens a browser window for authentication and stores credentials securely.
API Token:
# Set environment variable
export CLOUDFLARE_API_TOKEN="your_token_here"
# Or set it for a single command
CLOUDFLARE_API_TOKEN="your_token_here" wrangler <command>
Configuration File:
# Create ~/.wrangler/config/default.toml
[default]
api_token = "your_token_here"
Example Commands
1. Account Information:
# View account details
wrangler whoami
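# Wrangler has no subcommand for listing zones; use the REST API
curl -s "https://api.cloudflare.com/client/v4/zones" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq '.result[] | {id, name}'
2. DNS Management:
# Wrangler has no DNS subcommands either; export records through the REST API
curl -s "https://api.cloudflare.com/client/v4/zones/<zone_id>/dns_records" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" > dns_records.json
3. Workers Management:
# List Worker scripts in an account (REST API; Wrangler has no equivalent listing command)
curl -s "https://api.cloudflare.com/client/v4/accounts/<account_id>/workers/scripts" \
-H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq -r '.result[].id'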
# Get worker logs
wrangler tail <worker_name>
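4. KV Namespaces:
# List KV namespaces (newer Wrangler releases use spaces: wrangler kv namespace list)
wrangler kv:namespace list
# List keys in namespace (newer releases: wrangler kv key list)
wrangler kv:key list --namespace-id=<namespace_id>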
Scripting with Wrangler
#!/bin/bash
# Script to export Cloudflare configuration
# Require the API token from the environment instead of hardcoding it
: "${CLOUDFLARE_API_TOKEN:?Set CLOUDFLARE_API_TOKEN before running this script}"
# Create output directory
OUTPUT_DIR="cloudflare_export_$(date +%Y%m%d_%H%M%S)"
mkdir -p "$OUTPUT_DIR"
echo "Exporting Cloudflare configuration to $OUTPUT_DIR"
# Export account info
echo "Exporting account information..."
wrangler whoami > "$OUTPUT_DIR/account_info.txt"
# Export zones (Wrangler has no zone commands, so call the REST API; first page only)
echo "Exporting zones..."
curl -s "https://api.cloudflare.com/client/v4/zones?per_page=50" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq '.result' > "$OUTPUT_DIR/zones.json"
# Read each zone's ID and name, then export its DNS records
echo "Exporting DNS records..."
jq -r '.[] | "\(.id) \(.name)"' "$OUTPUT_DIR/zones.json" | while read -r ZONE_ID ZONE_NAME; do
  echo " - Exporting DNS records for $ZONE_NAME"
  curl -s "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/dns_records?per_page=100" \
    -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq '.result' > "$OUTPUT_DIR/dns_${ZONE_NAME}.json"
done
echo "Export complete!"
echo "Files saved to $OUTPUT_DIR"
Limitations of Wrangler for Reporting
While Wrangler is powerful, it has some limitations for data extraction:
- Limited Analytics Access: Wrangler doesn’t provide direct analytics queries
- Primarily Workers-Focused: Many features are specific to Cloudflare Workers
- No Zone, DNS, or Firewall Commands: Cannot list zones or export DNS records or firewall rules
- Limited Batch Operations: Not optimized for bulk data extraction
For comprehensive reporting, the Python SDK or direct API calls are more suitable.
Comparison of Methods
Feature Comparison
| Feature | Python SDK | Direct API | Wrangler CLI |
|---|---|---|---|
| Ease of Use | ⭐⭐⭐⭐⭐ Pythonic, intuitive | ⭐⭐⭐ Requires HTTP knowledge | ⭐⭐⭐⭐ Simple commands |
| Installation | pip install cloudflare | None beyond an HTTP client (curl, requests) | npm install -g wrangler |
| Documentation | Good, with examples | Comprehensive API docs | Workers-focused |
| Analytics Access | ⭐⭐⭐⭐⭐ Full access | ⭐⭐⭐⭐⭐ Full access | ⭐⭐ Limited |
| DNS Management | ⭐⭐⭐⭐⭐ Full CRUD | ⭐⭐⭐⭐⭐ Full CRUD | ❌ None |
| Firewall Rules | ⭐⭐⭐⭐⭐ Full access | ⭐⭐⭐⭐⭐ Full access | ❌ None |
| GraphQL Support | ⭐⭐ Indirect | ⭐⭐⭐⭐⭐ Native | ❌ None |
| Error Handling | ⭐⭐⭐⭐⭐ Excellent | ⭐⭐⭐ Manual | ⭐⭐⭐ Good |
| Pagination | ⭐⭐⭐ Manual with the 2.x interface (page/per_page params) | ⭐⭐⭐ Manual | ⭐⭐⭐⭐ Automatic |
| Rate Limiting | ⭐⭐⭐⭐ Built-in handling | ⭐⭐ Manual | ⭐⭐⭐ Handled |
| Scripting | ⭐⭐⭐⭐⭐ Excellent for Python | ⭐⭐⭐⭐⭐ Any language | ⭐⭐⭐⭐ Shell scripts |
| Multi-Zone Operations | ⭐⭐⭐⭐⭐ Easy to iterate | ⭐⭐⭐⭐ Manual loops | ⭐⭐⭐ Manual loops |
| Data Export | ⭐⭐⭐⭐⭐ JSON, CSV, Pandas | ⭐⭐⭐⭐ JSON output | ⭐⭐⭐ JSON, text |
| Learning Curve | ⭐⭐⭐⭐ Moderate | ⭐⭐⭐ Moderate | ⭐⭐⭐⭐ Easy |
Authentication Comparison
| Method | Python SDK | Direct API | Wrangler CLI |
|---|---|---|---|
| API Token | ✅ Recommended | ✅ Recommended | ✅ Recommended |
| API Key + Email | ✅ Supported | ✅ Supported | ⚠️ Legacy |
| OAuth | ❌ Not supported | ❌ Not supported | ✅ Yes (wrangler login) |
| Config File | ✅ ~/.cloudflare/cloudflare.cfg | ⚠️ Manual | ✅ ~/.wrangler/config/ |
| Env Variables | ✅ Full support | ✅ Full support | ✅ Full support |
| Scoped Tokens | ✅ Yes | ✅ Yes | ✅ Yes |
Use Case Recommendations
Use Python SDK when:
- Building data pipelines or automation scripts in Python
- Need complex data transformations with pandas/numpy
- Integrating with existing Python applications
- Want built-in error handling and simple page/per_page pagination
- Need to process large amounts of data efficiently
Use Direct API calls when:
- Working in a non-Python environment
- Need maximum flexibility and control
- Using GraphQL for advanced analytics
- Building microservices in other languages (Go, Node.js, etc.)
- Want to minimize dependencies
Use Wrangler CLI when:
- Performing quick, one-off queries
- Managing Cloudflare Workers
- Writing simple shell scripts
- Interactive exploration and debugging
- Prefer command-line tools over programming
Performance Considerations
Python SDK:
- ✅ Efficient connection pooling
- ✅ Automatic retry logic
- ✅ Built-in rate limit handling
- ⚠️ Overhead of Python runtime
Direct API:
- ✅ Minimal overhead
- ✅ Can use async/await for parallelism
- ✅ Full control over connections
- ⚠️ Must implement retry and rate limiting manually
Wrangler CLI:
- ✅ Simple subprocess calls
- ⚠️ Higher overhead per command
- ⚠️ Not optimized for bulk operations
- ⚠️ Limited parallelism
Best Practices
1. Security
# ✅ DO: Use environment variables
import os
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
# ❌ DON'T: Hardcode credentials
cf = CloudFlare.CloudFlare(token='hardcoded_token_here')
2. Error Handling
import os
import CloudFlare

try:
    cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
    zones = cf.zones.get()
except CloudFlare.exceptions.CloudFlareAPIError as e:
    # The exception converts to its numeric code with int() and its message with str()
    print(f"API Error {int(e)}: {e}")
except CloudFlare.exceptions.CloudFlareInternalError as e:
    print(f"Internal Error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
3. Rate Limiting
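Cloudflare's API enforces a global rate limit of 1,200 requests per five minutes per user. The limit applies per user rather than per plan, and requests beyond it are rejected with HTTP 429 until the window resets. Individual endpoints may enforce their own, separate limits. A retry wrapper with exponential backoff handles the occasional 429: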
import time
from CloudFlare.exceptions import CloudFlareAPIError

def safe_api_call(func, *args, **kwargs):
    """Make API call with retry logic."""
    max_retries = 3
    retry_delay = 5
    for attempt in range(max_retries):
        try:
            return func(*args, **kwargs)
        except CloudFlareAPIError as e:
            # int(e) is the error code carried by the exception. This assumes a
            # rate-limited call surfaces as 429; adjust for your SDK version.
            if int(e) == 429 and attempt < max_retries - 1:
                print(f"Rate limit hit, waiting {retry_delay}s...")
                time.sleep(retry_delay)
                retry_delay *= 2  # Exponential backoff
            else:
                raise
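Retrying after a 429 is reactive. A script that issues many requests can also pace itself so that it rarely hits the limit at all. The sketch below is a simple sliding-window limiter, with defaults based on the figure of roughly 1,200 requests per 5 minutes. The class name and the injectable `clock` and `sleep` parameters are choices made for this example, and it is not thread-safe as written.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period` seconds."""

    def __init__(self, max_calls=1200, period=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
```

Calling `limiter.wait()` immediately before each API request keeps a long-running export under the limit, and the retry wrapper still covers anything that slips through.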
4. Pagination
def get_all_dns_records(cf, zone_id):
"""Get all DNS records with pagination."""
all_records = []
page = 1
per_page = 100
while True:
records = cf.zones.dns_records.get(
zone_id,
params={'page': page, 'per_page': per_page}
)
if not records:
break
all_records.extend(records)
if len(records) < per_page:
break
page += 1
return all_records
5. Caching Results
import json
import os
from datetime import datetime, timedelta
def cached_api_call(cache_file, ttl_hours=24):
"""Decorator to cache API responses."""
def decorator(func):
def wrapper(*args, **kwargs):
if os.path.exists(cache_file):
mtime = os.path.getmtime(cache_file)
age = datetime.now() - datetime.fromtimestamp(mtime)
if age < timedelta(hours=ttl_hours):
with open(cache_file, 'r') as f:
print(f"Using cached data from {cache_file}")
return json.load(f)
result = func(*args, **kwargs)
with open(cache_file, 'w') as f:
json.dump(result, f, indent=2)
return result
return wrapper
return decorator
@cached_api_call('zones_cache.json', ttl_hours=6)
def get_zones(cf):
"""Get zones with caching."""
return cf.zones.get()
6. Logging
import logging
# Configure logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
handlers=[
logging.FileHandler('cloudflare_api.log'),
logging.StreamHandler()
]
)
logger = logging.getLogger('cloudflare_reporting')
def fetch_analytics(cf, zone_id):
"""Fetch analytics with logging."""
logger.info(f"Fetching analytics for zone {zone_id}")
try:
analytics = cf.zones.analytics.dashboard.get(zone_id)
logger.info(f"Successfully retrieved analytics for zone {zone_id}")
return analytics
except Exception as e:
logger.error(f"Failed to fetch analytics for zone {zone_id}: {e}")
raise
Advanced Reporting Examples
Automated Daily Report
#!/usr/bin/env python3
"""
Automated daily Cloudflare report generator.
Run with cron: 0 6 * * * /path/to/daily_report.py
"""
import CloudFlare
import os
import json
from datetime import datetime, timedelta
import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
def generate_daily_report():
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
zones = cf.zones.get()
report = []
report.append("# Cloudflare Daily Report")
report.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
report.append(f"Period: Last 24 hours\n")
total_requests = 0
total_bandwidth = 0
total_threats = 0
for zone in zones:
zone_id = zone['id']
zone_name = zone['name']
# Get 24-hour analytics
analytics = cf.zones.analytics.dashboard.get(
zone_id,
params={'since': -1440} # Last 24 hours in minutes
)
totals = analytics['totals']
requests = totals.get('requests', {}).get('all', 0)
bandwidth = totals.get('bandwidth', {}).get('all', 0)
threats = totals.get('threats', {}).get('all', 0)
total_requests += requests
total_bandwidth += bandwidth
total_threats += threats
report.append(f"## {zone_name}")
report.append(f"- Requests: {requests:,}")
report.append(f"- Bandwidth: {bandwidth / (1024**3):.2f} GB")
report.append(f"- Threats Blocked: {threats:,}")
report.append("")
report.append("## Summary")
report.append(f"- Total Requests: {total_requests:,}")
report.append(f"- Total Bandwidth: {total_bandwidth / (1024**3):.2f} GB")
report.append(f"- Total Threats Blocked: {total_threats:,}")
report_text = "\n".join(report)
# Save to file
filename = f"cloudflare_report_{datetime.now().strftime('%Y%m%d')}.txt"
with open(filename, 'w') as f:
f.write(report_text)
print(report_text)
# Optionally send email
send_email_report(report_text)
def send_email_report(report_text):
"""Send report via email."""
sender = os.getenv('EMAIL_SENDER')
recipient = os.getenv('EMAIL_RECIPIENT')
smtp_server = os.getenv('SMTP_SERVER')
smtp_port = int(os.getenv('SMTP_PORT', 587))
smtp_user = os.getenv('SMTP_USER')
smtp_pass = os.getenv('SMTP_PASS')
if not all([sender, recipient, smtp_server, smtp_user, smtp_pass]):
print("Email configuration incomplete, skipping email")
return
msg = MIMEMultipart()
msg['From'] = sender
msg['To'] = recipient
msg['Subject'] = f"Cloudflare Daily Report - {datetime.now().strftime('%Y-%m-%d')}"
msg.attach(MIMEText(report_text, 'plain'))
try:
with smtplib.SMTP(smtp_server, smtp_port) as server:
server.starttls()
server.login(smtp_user, smtp_pass)
server.send_message(msg)
print("Email sent successfully")
except Exception as e:
print(f"Failed to send email: {e}")
if __name__ == "__main__":
generate_daily_report()
Cost Analysis Report
import CloudFlare
import os
from datetime import datetime, timedelta
import pandas as pd
def analyze_bandwidth_costs():
"""Analyze bandwidth usage to estimate costs."""
cf = CloudFlare.CloudFlare(token=os.getenv('CLOUDFLARE_API_TOKEN'))
zones = cf.zones.get()
# Pricing tiers (example - adjust for your plan)
bandwidth_pricing = {
'free': {'limit_gb': 'unlimited', 'cost_per_gb': 0},
'pro': {'limit_gb': 'unlimited', 'cost_per_gb': 0},
'business': {'limit_gb': 'unlimited', 'cost_per_gb': 0},
'enterprise': {'limit_gb': 'custom', 'cost_per_gb': 'negotiated'}
}
data = []
for zone in zones:
zone_id = zone['id']
zone_name = zone['name']
plan = zone['plan']['name'].lower()
# Get 30-day analytics
analytics = cf.zones.analytics.dashboard.get(
zone_id,
params={'since': -43200} # 30 days in minutes
)
totals = analytics['totals']
bandwidth_bytes = totals.get('bandwidth', {}).get('all', 0)
bandwidth_gb = bandwidth_bytes / (1024**3)
requests = totals.get('requests', {}).get('all', 0)
# Calculate costs (if applicable)
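pricing = bandwidth_pricing.get(plan, {})
rate = pricing.get('cost_per_gb', 0)
# Non-numeric rates such as 'negotiated' cannot be multiplied; treat them as no estimate
cost = bandwidth_gb * rate if isinstance(rate, (int, float)) else 0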
data.append({
'Zone': zone_name,
'Plan': plan,
'Bandwidth (GB)': round(bandwidth_gb, 2),
'Requests': requests,
'Avg Request Size (KB)': round((bandwidth_bytes / requests) / 1024, 2) if requests > 0 else 0,
'Estimated Cost': f"${cost:.2f}" if cost > 0 else "Included"
})
df = pd.DataFrame(data)
df = df.sort_values('Bandwidth (GB)', ascending=False)
print("\nCloudflare Bandwidth Analysis (Last 30 Days)")
print("=" * 100)
print(df.to_string(index=False))
print("\n" + "=" * 100)
print(f"Total Bandwidth: {df['Bandwidth (GB)'].sum():.2f} GB")
print(f"Total Requests: {df['Requests'].sum():,}")
# Export
filename = f"cloudflare_cost_analysis_{datetime.now().strftime('%Y%m%d')}.csv"
df.to_csv(filename, index=False)
print(f"\nReport saved to {filename}")
if __name__ == "__main__":
analyze_bandwidth_costs()
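The pricing lookup above only matches when the lowercased plan name equals a key in `bandwidth_pricing`. The API may return a longer display name (this sketch assumes something like "Pro Website", which you should verify against your own zones), in which case a small normalizer makes the lookup more forgiving. `plan_tier` and `KNOWN_TIERS` are hypothetical names for this example.

```python
KNOWN_TIERS = ("free", "pro", "business", "enterprise")


def plan_tier(plan_name):
    """Map a plan display name onto a pricing-table key, or None if unrecognized."""
    words = plan_name.lower().split()
    for tier in KNOWN_TIERS:
        if tier in words:
            return tier
    return None
```

With this helper, the lookup becomes `bandwidth_pricing.get(plan_tier(zone['plan']['name']), {})`, and unrecognized plans fall through to the empty default instead of silently matching nothing.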
Conclusion
Retrieving data from Cloudflare for reporting purposes can be accomplished through three main methods, each with its own strengths:
Quick Reference
Choose Python SDK if:
- You’re comfortable with Python
- Need comprehensive data extraction
- Want built-in error handling and simple page/per_page pagination
- Building automated reporting systems
Choose Direct API if:
- Using languages other than Python
- Need maximum flexibility
- Want to use GraphQL for analytics
- Minimizing dependencies is important
Choose Wrangler CLI if:
- Performing quick ad-hoc queries
- Writing simple shell scripts
- Managing Cloudflare Workers
- Prefer command-line interfaces
Key Takeaways
- Authentication: Always use API tokens (not legacy API keys) with appropriate scoped permissions
- Rate Limits: Implement retry logic and respect Cloudflare’s rate limits
- Error Handling: Always wrap API calls in try-except blocks
- Caching: Cache frequently accessed data to reduce API calls
- Logging: Implement comprehensive logging for debugging and audit trails
- Security: Never commit credentials; use environment variables or secure vaults
Resources
- Cloudflare API Documentation: https://developers.cloudflare.com/api/
- Python SDK (cloudflare): https://github.com/cloudflare/python-cloudflare
- Wrangler Documentation: https://developers.cloudflare.com/workers/wrangler/
- GraphQL Analytics API: https://developers.cloudflare.com/analytics/graphql-api/
By leveraging these tools and techniques, you can build robust, automated reporting systems that provide valuable insights into your Cloudflare infrastructure, helping you optimize performance, enhance security, and control costs.