Retrieving Data from Okta for Reporting: Python SDK, REST API, and CLI Comparison
READER BEWARE: THE FOLLOWING WRITTEN ENTIRELY BY AI WITHOUT HUMAN EDITING.
Introduction
Okta is a powerful identity and access management (IAM) platform that stores valuable data about users, groups, applications, authentication events, and system configurations. For security teams, compliance officers, and system administrators, extracting this data for reporting, auditing, and analytics is essential. Whether you need to generate compliance reports, monitor authentication patterns, audit application access, or analyze user lifecycle events, Okta provides multiple methods to retrieve this information.
This comprehensive guide explores three primary approaches for retrieving data from Okta:
- Okta Python SDK - Official Python library for programmatic access
- Okta REST API - Direct HTTP API calls for maximum flexibility
- Okta CLI - Command-line interface for quick queries and automation
We’ll cover authentication methods for each approach, compare their strengths and weaknesses, and provide practical examples for common reporting scenarios.
Why Extract Data from Okta?
Common Use Cases
- Compliance Reporting: Generate reports for SOC 2, ISO 27001, HIPAA, or other compliance frameworks
- Security Auditing: Track authentication events, failed login attempts, and suspicious activities
- User Lifecycle Management: Monitor user provisioning, deprovisioning, and status changes
- Application Access Reviews: Audit who has access to which applications
- System Configuration Audits: Document Okta policies, rules, and network zones
- Analytics and Insights: Analyze authentication patterns, adoption rates, and user behavior
- Incident Response: Investigate security incidents and extract forensic data
- Access Certification: Periodic review of user access rights
Method 1: Okta Python SDK
The official Okta Python SDK provides a high-level, Pythonic interface to Okta’s API. It handles authentication and rate-limit retries, exposes pagination helpers, and returns typed model objects for Okta resources.
Installation
pip install okta
Authentication Setup
The Python SDK supports multiple authentication methods:
API Token Authentication (Recommended for Scripts)
First, create an API token in your Okta admin console:
- Navigate to Security → API → Tokens
- Click Create Token
- Give it a descriptive name (e.g., “Reporting Script”)
- Copy the token immediately (it won’t be shown again)
Store the token securely:
# config.py - Never commit this file!
OKTA_ORG_URL = "https://your-domain.okta.com"
OKTA_API_TOKEN = "your-api-token-here"
Configure the SDK client:
from okta.client import Client as OktaClient
config = {
    'orgUrl': 'https://your-domain.okta.com',
    'token': 'your-api-token-here'
}
client = OktaClient(config)
OAuth 2.0 Private Key Authentication (Recommended for Production)
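For production applications, OAuth 2.0 with private key JWT is more secure than a static API token: access is limited to the scopes you grant, and the access tokens it issues are short-lived: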
- Create an OAuth 2.0 application in Okta
- Generate a public/private key pair
- Upload the public key to Okta
- Configure the SDK with private key authentication
import asyncio
from okta.client import Client as OktaClient
config = {
    'orgUrl': 'https://your-domain.okta.com',
    'authorizationMode': 'PrivateKey',
    'clientId': 'your-client-id',
    'scopes': ['okta.users.read', 'okta.groups.read', 'okta.apps.read', 'okta.logs.read'],
    'privateKey': 'path/to/private-key.pem'
}
client = OktaClient(config)
Basic Data Retrieval Examples
Retrieving All Users
import asyncio
from okta.client import Client as OktaClient
async def get_all_users():
    config = {
        'orgUrl': 'https://your-domain.okta.com',
        'token': 'your-api-token-here'
    }
    client = OktaClient(config)
    # SDK calls return (items, response, error) rather than raising
    users, resp, err = await client.list_users()
    if err:
        raise RuntimeError(f"Failed to list users: {err}")
    all_users = []
    while True:
        for user in users:
            all_users.append({
                'id': user.id,
                'email': user.profile.email,
                'firstName': user.profile.first_name,
                'lastName': user.profile.last_name,
                'status': user.status,
                'created': user.created,
                'lastLogin': user.last_login
            })
        # Fetch the next page until none remain
        if resp.has_next():
            users, err = await resp.next()
            if err:
                raise RuntimeError(f"Failed to fetch next page: {err}")
        else:
            break
    await client.close()
    return all_users
# Run the async function
users = asyncio.run(get_all_users())
print(f"Retrieved {len(users)} users")
Retrieving Users with Filters
from datetime import datetime, timedelta, timezone

async def get_filtered_users():
    config = {
        'orgUrl': 'https://your-domain.okta.com',
        'token': 'your-api-token-here'
    }
    client = OktaClient(config)
    # Get active users only
    query_params = {'filter': 'status eq "ACTIVE"'}
    users, resp, err = await client.list_users(query_params)
    # Get users created in the last 30 days. The timestamp must be UTC, and
    # `created` is queried with `search` because `filter` supports only a few properties
    thirty_days_ago = (datetime.now(timezone.utc) - timedelta(days=30)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
    query_params = {'search': f'created gt "{thirty_days_ago}"'}
    recent_users, resp, err = await client.list_users(query_params)
    # Users by email domain: `sw` means "starts with", so it cannot match a
    # domain suffix. Filter client-side instead (first page of active users only)
    company_users = [
        u for u in (users or [])
        if u.profile.email.lower().endswith('@example.com')
    ]
    await client.close()
    return users, recent_users, company_users
asyncio.run(get_filtered_users())
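The date arithmetic in these queries is easy to get subtly wrong, for example by formatting local time as if it were UTC. A minimal sketch of a helper follows; `okta_timestamp` and `created_since_expression` are hypothetical names introduced here for illustration, not part of the Okta SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def okta_timestamp(dt: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 UTC with milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"


def created_since_expression(days: int, now: Optional[datetime] = None) -> str:
    """Build a `created gt "<timestamp>"` expression covering the last `days` days."""
    now = now or datetime.now(timezone.utc)
    return f'created gt "{okta_timestamp(now - timedelta(days=days))}"'
```

Passing `now` explicitly keeps the helper deterministic in tests; in a report script it can simply be omitted.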
Retrieving Groups and Members
async def get_groups_and_members():
    config = {
        'orgUrl': 'https://your-domain.okta.com',
        'token': 'your-api-token-here'
    }
    client = OktaClient(config)
    # Get all groups (first page only; paginate with resp.has_next()/resp.next() for more)
    groups, resp, err = await client.list_groups()
    groups_data = []
    for group in groups or []:
        # Get group members. The SDK returns a plain list, so use a regular for loop
        members, _, err = await client.list_group_users(group.id)
        member_list = []
        for user in members or []:
            member_list.append({
                'email': user.profile.email,
                'name': f"{user.profile.first_name} {user.profile.last_name}"
            })
        groups_data.append({
            'id': group.id,
            'name': group.profile.name,
            'description': group.profile.description,
            # Counts only the members on the first page of results
            'memberCount': len(member_list),
            'members': member_list
        })
    await client.close()
    return groups_data
groups = asyncio.run(get_groups_and_members())
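Each SDK list call returns one page, so member counts are only accurate once every page has been read. The sketch below wraps the `has_next()` / `next()` loop from the earlier user example into one reusable coroutine. `collect_all` is a hypothetical helper name, and `FakeResponse` is an offline stand-in written for this example so the block runs without an Okta org.

```python
import asyncio


async def collect_all(first_call):
    """Await an SDK list call and follow pagination until the last page.

    `first_call` is the un-awaited coroutine, e.g. client.list_groups().
    """
    items, resp, err = await first_call
    if err:
        raise RuntimeError(f"Okta request failed: {err}")
    results = list(items or [])
    while resp is not None and resp.has_next():
        items, err = await resp.next()
        if err:
            raise RuntimeError(f"Okta request failed: {err}")
        results.extend(items or [])
    return results


class FakeResponse:
    """Stand-in that mimics has_next()/next() for an offline demonstration."""

    def __init__(self, remaining_pages):
        self._pages = list(remaining_pages)

    def has_next(self):
        return bool(self._pages)

    async def next(self):
        return self._pages.pop(0), None


async def fake_list_call():
    # Mimics the (items, response, error) tuple returned by SDK list methods
    return ["a", "b"], FakeResponse([["c"], ["d", "e"]]), None


demo_items = asyncio.run(collect_all(fake_list_call()))
```

With a real client, the same helper would be used as `members = await collect_all(client.list_group_users(group.id))`.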
Retrieving Applications and Assignments
async def get_applications_report():
    config = {
        'orgUrl': 'https://your-domain.okta.com',
        'token': 'your-api-token-here'
    }
    client = OktaClient(config)
    # Get all applications (first page only; paginate with resp.has_next()/resp.next() for more)
    apps, resp, err = await client.list_applications()
    apps_data = []
    for app in apps or []:
        # Get users assigned to this application. The SDK returns a plain list
        assignments, _, err = await client.list_application_users(app.id)
        assigned_users = []
        for assignment in assignments or []:
            # The assignment id is the Okta user id
            user, _, err = await client.get_user(assignment.id)
            assigned_users.append({
                'email': user.profile.email,
                'assignedDate': assignment.created
            })
        apps_data.append({
            'id': app.id,
            'name': app.label,
            'status': app.status,
            'created': app.created,
            'userCount': len(assigned_users),
            'assignedUsers': assigned_users
        })
    await client.close()
    return apps_data
apps = asyncio.run(get_applications_report())
Retrieving System Logs
async def get_system_logs():
    config = {
        'orgUrl': 'https://your-domain.okta.com',
        'token': 'your-api-token-here'
    }
    client = OktaClient(config)
    # Get logs from the last 24 hours (UTC)
    from datetime import datetime, timedelta, timezone
    since = (datetime.now(timezone.utc) - timedelta(days=1)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
    query_params = {
        'since': since,
        'limit': 1000
    }
    logs, resp, err = await client.get_logs(query_params)
    log_entries = []
    # The SDK returns a plain list for the first page, so use a regular for loop
    for log in logs or []:
        log_entries.append({
            'timestamp': log.published,
            'eventType': log.event_type,
            'actor': log.actor.display_name if log.actor else 'System',
            'target': log.target[0].display_name if log.target else 'N/A',
            'outcome': log.outcome.result,
            'clientIp': log.client.ip_address if log.client else 'N/A'
        })
    await client.close()
    return log_entries
logs = asyncio.run(get_system_logs())
Complete Reporting Example with Python SDK
import asyncio
import csv
from datetime import datetime, timedelta, timezone
from okta.client import Client as OktaClient

async def generate_user_access_report():
    """
    Generate comprehensive user access report including:
    - User details
    - Group memberships
    - Application assignments
    - Recent authentication activity
    """
    config = {
        'orgUrl': 'https://your-domain.okta.com',
        'token': 'your-api-token-here'
    }
    client = OktaClient(config)
    print("Fetching users...")
    users, users_resp, err = await client.list_users()
    if err:
        raise RuntimeError(f"Failed to list users: {err}")
    since = (datetime.now(timezone.utc) - timedelta(days=30)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
    report_data = []
    while True:
        for user in users:
            print(f"Processing {user.profile.email}...")
            # Get user's groups (SDK list calls return plain lists)
            groups, _, err = await client.list_user_groups(user.id)
            group_names = [group.profile.name for group in groups or []]
            # Get user's application assignments.
            # NOTE: confirm this method exists in your SDK version; some
            # releases expose list_app_links(user.id) instead
            apps, _, err = await client.list_assigned_applications_for_user(user.id)
            app_names = [app.label for app in apps or []]
            # Get recent login activity, newest first so the first entry is the last login
            query_params = {
                'filter': f'actor.id eq "{user.id}" and eventType eq "user.session.start"',
                'since': since,
                'sortOrder': 'DESCENDING',
                'limit': 1000  # The login count below is capped at this page size
            }
            logs, _, err = await client.get_logs(query_params)
            logs = logs or []
            login_count = len(logs)
            last_login = logs[0].published if logs else None
            report_data.append({
                'Email': user.profile.email,
                'First Name': user.profile.first_name,
                'Last Name': user.profile.last_name,
                'Status': user.status,
                'Created': user.created,
                'Last Login': last_login or 'Never',
                'Login Count (30d)': login_count,
                'Groups': ', '.join(group_names),
                'Applications': ', '.join(app_names),
                'Group Count': len(group_names),
                'App Count': len(app_names)
            })
        # Follow pagination on the user list
        if users_resp.has_next():
            users, err = await users_resp.next()
            if err:
                raise RuntimeError(f"Failed to fetch next page: {err}")
        else:
            break
    await client.close()
    # Write to CSV (skipped when no users were returned)
    output_file = f'user_access_report_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
    if report_data:
        with open(output_file, 'w', newline='') as csvfile:
            fieldnames = list(report_data[0].keys())
            writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(report_data)
        print(f"\nReport generated: {output_file}")
    print(f"Total users: {len(report_data)}")
    return report_data

# Run the report
if __name__ == "__main__":
    asyncio.run(generate_user_access_report())
Advantages of Python SDK
✅ Typed Models: Model classes with named attributes reduce key-lookup mistakes
✅ Pagination Helpers: resp.has_next() and resp.next() follow page links, though you still write the loop
✅ Rate Limiting: Built-in rate limit handling
✅ Error Handling: Every call returns an error value alongside its result
✅ Documentation: IntelliSense and type hints in IDEs
✅ Maintainability: High-level abstractions make code cleaner
✅ Best Practices: Follows Python conventions and patterns
Disadvantages of Python SDK
❌ Learning Curve: Need to learn SDK-specific APIs
❌ Async Only: Requires understanding of asyncio
❌ Updates Needed: SDK must be updated for new API features
❌ Overhead: Additional layer between your code and API
❌ Limited Flexibility: Some advanced API features may not be exposed
Method 2: Okta REST API
The Okta REST API provides direct HTTP access to all Okta functionality. This approach offers maximum flexibility and control, making it ideal for custom integrations, edge cases, and when the SDK doesn’t support a specific feature.
Authentication
The REST API supports two primary authentication methods:
API Token Authentication
import requests
OKTA_ORG_URL = "https://your-domain.okta.com"
API_TOKEN = "your-api-token-here"
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'Authorization': f'SSWS {API_TOKEN}'
}
response = requests.get(f'{OKTA_ORG_URL}/api/v1/users', headers=headers)
response.raise_for_status()  # Fail loudly on 401/403/429 instead of parsing an error body
users = response.json()
OAuth 2.0 Bearer Token Authentication
import requests
import jwt
import time
def get_access_token():
    """
    Get OAuth 2.0 access token using private key JWT
    """
    with open('private-key.pem', 'r') as key_file:
        private_key = key_file.read()
    token_url = 'https://your-domain.okta.com/oauth2/v1/token'
    # Create JWT; the audience is the token endpoint itself
    now = int(time.time())
    payload = {
        'aud': token_url,
        'iss': 'your-client-id',
        'sub': 'your-client-id',
        'iat': now,
        'exp': now + 3600
    }
    client_assertion = jwt.encode(payload, private_key, algorithm='RS256')
    # Request access token
    data = {
        'grant_type': 'client_credentials',
        'scope': 'okta.users.read okta.groups.read okta.apps.read okta.logs.read',
        'client_assertion_type': 'urn:ietf:params:oauth:client-assertion-type:jwt-bearer',
        'client_assertion': client_assertion
    }
    response = requests.post(token_url, data=data)
    response.raise_for_status()
    return response.json()['access_token']

# Use the access token
access_token = get_access_token()
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {access_token}'
}
Basic Data Retrieval Examples
Retrieving Users
import requests
OKTA_ORG_URL = "https://your-domain.okta.com"
API_TOKEN = "your-api-token-here"
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'Authorization': f'SSWS {API_TOKEN}'
}

def get_all_users():
    """
    Retrieve all users with pagination
    """
    users = []
    url = f'{OKTA_ORG_URL}/api/v1/users'
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        users.extend(response.json())
        # Handle pagination via Link header
        links = response.links
        url = links['next']['url'] if 'next' in links else None
    return users

def get_filtered_users():
    """
    Retrieve users with filters
    """
    # Active users only
    params = {'filter': 'status eq "ACTIVE"'}
    response = requests.get(f'{OKTA_ORG_URL}/api/v1/users', headers=headers, params=params)
    response.raise_for_status()
    active_users = response.json()
    # Search by email
    params = {'search': 'profile.email eq "user@example.com"'}
    response = requests.get(f'{OKTA_ORG_URL}/api/v1/users', headers=headers, params=params)
    response.raise_for_status()
    search_results = response.json()
    # Users with a specific attribute. `filter` supports only a few properties,
    # so other profile attributes such as department go through `search`
    params = {'search': 'profile.department eq "Engineering"'}
    response = requests.get(f'{OKTA_ORG_URL}/api/v1/users', headers=headers, params=params)
    response.raise_for_status()
    dept_users = response.json()
    return active_users, search_results, dept_users
users = get_all_users()
print(f"Retrieved {len(users)} users")
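The Link-header loop in `get_all_users` recurs for every collection endpoint, so it can be factored into a generator. This is a minimal sketch under stated assumptions: `iter_collection` is a hypothetical helper name, and `get` is any callable shaped like `requests.get` or `Session.get` (a session with the auth headers preset is the expected caller).

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_collection(get: Callable[..., Any], url: str,
                    params: Optional[Dict[str, Any]] = None) -> Iterator[dict]:
    """Yield items from a paginated collection one at a time.

    Follows the `next` relation that requests parses from the Link header.
    """
    while url:
        response = get(url, params=params)
        response.raise_for_status()
        yield from response.json()
        url = response.links.get("next", {}).get("url")
        params = None  # the next link already encodes the original query
```

Because it yields lazily, a caller can stop early (for example after the first match) without fetching the remaining pages.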
Retrieving Groups
def get_groups_with_members():
    """
    Retrieve all groups and their members
    """
    groups = []
    url = f'{OKTA_ORG_URL}/api/v1/groups'
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        for group in response.json():
            # Get group members (first page only; large groups need the same Link-header loop)
            members_url = f'{OKTA_ORG_URL}/api/v1/groups/{group["id"]}/users'
            members_response = requests.get(members_url, headers=headers)
            members_response.raise_for_status()
            members = members_response.json()
            groups.append({
                'id': group['id'],
                'name': group['profile']['name'],
                'description': group['profile'].get('description', ''),
                'memberCount': len(members),
                'members': [
                    {
                        'email': m['profile']['email'],
                        'name': f"{m['profile']['firstName']} {m['profile']['lastName']}"
                    }
                    for m in members
                ]
            })
        links = response.links
        url = links['next']['url'] if 'next' in links else None
    return groups
groups = get_groups_with_members()
Retrieving Applications
def get_applications_with_assignments():
    """
    Retrieve all applications and their user assignments
    """
    apps = []
    url = f'{OKTA_ORG_URL}/api/v1/apps'
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        for app in response.json():
            # Get application users (first page only; paginate for large assignments)
            app_users_url = f'{OKTA_ORG_URL}/api/v1/apps/{app["id"]}/users'
            app_users_response = requests.get(app_users_url, headers=headers)
            app_users_response.raise_for_status()
            app_users = app_users_response.json()
            apps.append({
                'id': app['id'],
                'name': app['label'],
                'status': app['status'],
                'created': app['created'],
                'userCount': len(app_users),
                'users': [
                    {'id': u['id'], 'username': u.get('credentials', {}).get('userName', 'N/A')}
                    for u in app_users
                ]
            })
        links = response.links
        url = links['next']['url'] if 'next' in links else None
    return apps
apps = get_applications_with_assignments()
Retrieving System Logs
from datetime import datetime, timedelta, timezone

def get_system_logs(hours=24, event_type=None):
    """
    Retrieve system logs with filters
    """
    since = (datetime.now(timezone.utc) - timedelta(hours=hours)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
    params = {
        'since': since,
        'limit': 1000,
        'sortOrder': 'DESCENDING'
    }
    if event_type:
        params['filter'] = f'eventType eq "{event_type}"'
    logs = []
    url = f'{OKTA_ORG_URL}/api/v1/logs'
    while url:
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()
        page = response.json()
        if not page:
            break  # An empty page means there is nothing more to read
        logs.extend(page)
        url = response.links.get('next', {}).get('url')
        params = None  # The next link already carries the query parameters
    return logs

# Get all authentication events
auth_logs = get_system_logs(hours=24, event_type='user.session.start')
# Get failed login attempts by filtering the events already retrieved
failed_logins = [log for log in auth_logs if log['outcome']['result'] == 'FAILURE']
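Once the failed events are in memory, a report usually needs them grouped by who failed and from where. The following sketch assumes log events shaped like the dictionaries above (`outcome.result`, `actor.displayName`, `client.ipAddress`); `summarize_failures` is a hypothetical name, and the sample events are invented for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize_failures(events: Iterable[dict],
                       top: int = 5) -> Dict[str, List[Tuple[str, int]]]:
    """Count FAILURE events per actor and per client IP, most frequent first."""
    by_actor: Counter = Counter()
    by_ip: Counter = Counter()
    for event in events:
        if (event.get("outcome") or {}).get("result") != "FAILURE":
            continue
        actor = (event.get("actor") or {}).get("displayName") or "unknown"
        ip = (event.get("client") or {}).get("ipAddress") or "unknown"
        by_actor[actor] += 1
        by_ip[ip] += 1
    return {"actors": by_actor.most_common(top), "ips": by_ip.most_common(top)}


# Invented sample events, not real Okta output
sample_events = [
    {"outcome": {"result": "FAILURE"}, "actor": {"displayName": "Ann"}, "client": {"ipAddress": "203.0.113.7"}},
    {"outcome": {"result": "FAILURE"}, "actor": {"displayName": "Ann"}, "client": {"ipAddress": "203.0.113.9"}},
    {"outcome": {"result": "SUCCESS"}, "actor": {"displayName": "Bob"}, "client": {"ipAddress": "203.0.113.7"}},
    {"outcome": {"result": "FAILURE"}, "actor": {"displayName": "Bob"}, "client": {"ipAddress": "203.0.113.7"}},
]
summary = summarize_failures(sample_events)
```

The `or {}` guards matter because some events carry a null `client` or `actor` rather than omitting the key.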
Complete Reporting Example with REST API
import requests
import csv
import json
from datetime import datetime, timedelta, timezone
from collections import defaultdict

OKTA_ORG_URL = "https://your-domain.okta.com"
API_TOKEN = "your-api-token-here"
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json',
    'Authorization': f'SSWS {API_TOKEN}'
}

def fetch_all(url, params=None):
    """
    Follow Link-header pagination and return every item
    """
    items = []
    while url:
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()
        page = response.json()
        if not page:
            break
        items.extend(page)
        url = response.links.get('next', {}).get('url')
        params = None  # The next link already carries the query parameters
    return items

def generate_security_audit_report():
    """
    Generate comprehensive security audit report
    """
    print("Generating security audit report...")
    # 1. Get failed login attempts
    print("Analyzing authentication failures...")
    since = (datetime.now(timezone.utc) - timedelta(days=7)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
    params = {
        'since': since,
        'filter': 'eventType eq "user.session.start"',
        'sortOrder': 'DESCENDING',
        'limit': 1000
    }
    logs = fetch_all(f'{OKTA_ORG_URL}/api/v1/logs', params)
    failed_logins = defaultdict(int)
    suspicious_ips = defaultdict(int)
    for log in logs:
        if log['outcome']['result'] == 'FAILURE':
            actor = (log.get('actor') or {}).get('displayName', 'Unknown')
            failed_logins[actor] += 1
            ip = (log.get('client') or {}).get('ipAddress', 'Unknown')
            suspicious_ips[ip] += 1
    # 2. Get users with excessive privileges
    print("Analyzing user privileges...")
    users = fetch_all(f'{OKTA_ORG_URL}/api/v1/users')
    privileged_users = []
    for user in users:
        # Get user's groups
        groups = fetch_all(f'{OKTA_ORG_URL}/api/v1/users/{user["id"]}/groups')
        # Check for admin groups (a name-based heuristic, not a role lookup)
        admin_groups = [g for g in groups if 'admin' in g['profile']['name'].lower()]
        if admin_groups:
            privileged_users.append({
                'email': user['profile']['email'],
                'groups': [g['profile']['name'] for g in admin_groups]
            })
    # 3. Get inactive users
    print("Identifying inactive users...")
    thirty_days_ago = (datetime.now(timezone.utc) - timedelta(days=30)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
    inactive_users = []
    for user in users:
        last_login = user.get('lastLogin')
        # Both values are UTC strings in the same format, so they compare lexicographically
        if not last_login or last_login < thirty_days_ago:
            inactive_users.append({
                'email': user['profile']['email'],
                'status': user['status'],
                'lastLogin': last_login or 'Never',
                'created': user['created']
            })
    # 4. Generate report
    report = {
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'summary': {
            'totalUsers': len(users),
            'failedLoginAttempts': sum(failed_logins.values()),
            'usersWithFailedLogins': len(failed_logins),
            'suspiciousIPs': len([ip for ip, count in suspicious_ips.items() if count > 10]),
            'privilegedUsers': len(privileged_users),
            'inactiveUsers': len(inactive_users)
        },
        'failedLogins': dict(sorted(failed_logins.items(), key=lambda x: x[1], reverse=True)[:20]),
        'suspiciousIPs': dict(sorted(suspicious_ips.items(), key=lambda x: x[1], reverse=True)[:10]),
        'privilegedUsers': privileged_users,
        'inactiveUsers': inactive_users[:50]
    }
    # Save to JSON
    output_file = f'security_audit_{datetime.now().strftime("%Y%m%d_%H%M%S")}.json'
    with open(output_file, 'w') as f:
        json.dump(report, f, indent=2)
    print(f"\nSecurity audit report generated: {output_file}")
    print("Summary:")
    print(f" - Total users: {report['summary']['totalUsers']}")
    print(f" - Failed login attempts: {report['summary']['failedLoginAttempts']}")
    print(f" - Suspicious IPs: {report['summary']['suspiciousIPs']}")
    print(f" - Privileged users: {report['summary']['privilegedUsers']}")
    print(f" - Inactive users: {report['summary']['inactiveUsers']}")
    return report

if __name__ == "__main__":
    generate_security_audit_report()
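The inactive-user check in the report compares timestamps as strings, which works only while both sides share exactly the same format. A sketch of the same check on parsed datetimes follows; `parse_okta_time` and `find_inactive` are hypothetical names, and the parser assumes the millisecond `...000Z` format used throughout this guide.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


def parse_okta_time(value: Optional[str]) -> Optional[datetime]:
    """Parse timestamps such as 2025-11-01T10:00:00.000Z; empty values stay None."""
    if not value:
        return None
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def find_inactive(users: Iterable[dict], now: datetime, days: int = 30) -> List[dict]:
    """Return users who never logged in or whose last login is older than `days`."""
    cutoff = now - timedelta(days=days)
    inactive = []
    for user in users:
        last_login = parse_okta_time(user.get("lastLogin"))
        if last_login is None or last_login < cutoff:
            inactive.append(user)
    return inactive
```

Taking `now` as a parameter keeps the cutoff reproducible, which helps when the same report is regenerated for an audit.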
Advantages of REST API
✅ Maximum Flexibility: Access all API features directly
✅ No Dependencies: Only requires HTTP client (requests)
✅ Language Agnostic: Easy to translate to other languages
✅ Fine-Grained Control: Complete control over requests
✅ Immediate Updates: Access new API features immediately
✅ Debugging: Easy to test with curl or Postman
✅ Lightweight: No SDK overhead
Disadvantages of REST API
❌ More Code: Need to handle pagination, rate limiting manually
❌ Error Prone: No type safety, easy to make mistakes
❌ Boilerplate: More repetitive code for common operations
❌ Maintenance: Breaking changes require code updates
❌ Documentation: Need to reference API docs constantly
Method 3: Okta CLI
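The Okta CLI is a command-line tool for scripting and automation against an Okta org. While not as feature-rich as the SDK or API, it’s useful for rapid queries and one-off reports. Subcommands and flags differ between CLI releases, so confirm each command shown in this section against the built-in help of your installed version before scripting around it.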
Installation
macOS:
brew install okta/tap/okta-cli
Linux:
curl -L https://cli.okta.com/install.sh | bash
Windows:
# Using Chocolatey
choco install okta-cli
# Or using Scoop
scoop bucket add okta https://github.com/okta/scoop-okta-cli
scoop install okta-cli
Authentication Setup
Initialize the CLI with your Okta credentials:
# Interactive setup
okta login
# You'll be prompted for:
# - Okta domain (e.g., your-domain.okta.com)
# - Choose authentication method (browser or API token)
The CLI stores configuration in ~/.okta/okta.yaml.
Using API Token
# Set environment variable
export OKTA_CLIENT_TOKEN="your-api-token-here"
export OKTA_CLIENT_ORGURL="https://your-domain.okta.com"
# Or configure in profile
okta login --token your-api-token-here --url https://your-domain.okta.com
Using OAuth 2.0
# Configure OAuth app
okta apps create
# Login with OAuth
okta login --org https://your-domain.okta.com
Basic Data Retrieval Examples
Listing Users
# List all users
okta users list
# List users with filters
okta users list --filter 'status eq "ACTIVE"'
# Search users
okta users list --search 'profile.email sw "example.com"'
# Get specific user
okta users get user@example.com
# Export users to JSON
okta users list --format json > users.json
# List users in CSV format
okta users list --format csv > users.csv
Listing Groups
# List all groups
okta groups list
# Get group details
okta groups get "Engineering Team"
# List group members
okta groups list-users "Engineering Team"
# Export group membership
okta groups list-users "Engineering Team" --format json > group-members.json
Listing Applications
# List all applications
okta apps list
# Get application details
okta apps get "Salesforce"
# List users assigned to an application
okta apps list-users "Salesforce"
# Export application assignments
okta apps list-users "Salesforce" --format json > app-assignments.json
Retrieving Logs
# Get recent logs (last 24 hours by default)
okta logs get
# Get logs with date range
okta logs get --since 2025-11-01T00:00:00Z --until 2025-11-19T23:59:59Z
# Filter by event type
okta logs get --filter 'eventType eq "user.session.start"'
# Export logs to JSON
okta logs get --since 2025-11-01T00:00:00Z --format json > logs.json
# Get failed login attempts
okta logs get --filter 'eventType eq "user.session.start" and outcome.result eq "FAILURE"'
Shell Scripting Examples
User Access Report
#!/bin/bash
# user-access-report.sh - Generate user access report
OUTPUT_DIR="./reports/$(date +%Y%m%d)"
mkdir -p "$OUTPUT_DIR"
echo "Generating user access report..."
# Export all users
echo "Exporting users..."
okta users list --format json > "$OUTPUT_DIR/users.json"
# Count active vs inactive users
ACTIVE=$(okta users list --filter 'status eq "ACTIVE"' --format json | jq '. | length')
INACTIVE=$(okta users list --filter 'status ne "ACTIVE"' --format json | jq '. | length')
# Export groups
echo "Exporting groups..."
okta groups list --format json > "$OUTPUT_DIR/groups.json"
# Generate summary report
cat > "$OUTPUT_DIR/summary.txt" << EOF
User Access Report
Generated: $(date)
==================================
User Statistics:
- Active Users: $ACTIVE
- Inactive Users: $INACTIVE
- Total Users: $((ACTIVE + INACTIVE))
Groups: $(okta groups list --format json | jq '. | length')
Applications: $(okta apps list --format json | jq '. | length')
EOF
echo "Report generated in $OUTPUT_DIR"
cat "$OUTPUT_DIR/summary.txt"
Security Audit Script
#!/bin/bash
# security-audit.sh - Daily security audit
DATE=$(date +%Y%m%d)
OUTPUT_FILE="security_audit_$DATE.txt"
{
echo "Security Audit Report"
echo "Date: $(date)"
echo "========================================"
echo ""
echo "Failed Login Attempts (Last 24 hours):"
okta logs get --filter 'eventType eq "user.session.start" and outcome.result eq "FAILURE"' \
--format json | jq -r '.[] | "\(.published) - \(.actor.displayName) - \(.client.ipAddress)"'
echo ""
echo "Users with Admin Access:"
okta groups list-users "Administrators" --format json | jq -r '.[].profile.email'
echo ""
echo "Recently Created Users (Last 7 days):"
# GNU date syntax; on macOS/BSD use: date -u -v-7d +%Y-%m-%dT%H:%M:%SZ
SEVEN_DAYS_AGO=$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)
okta users list --search "created gt \"$SEVEN_DAYS_AGO\"" --format json | \
jq -r '.[] | "\(.profile.email) - Created: \(.created)"'
echo ""
echo "Suspended Users:"
okta users list --filter 'status eq "SUSPENDED"' --format json | jq -r '.[].profile.email'
} > "$OUTPUT_FILE"
echo "Security audit complete: $OUTPUT_FILE"
# Email the report
if [ -s "$OUTPUT_FILE" ]; then
mail -s "Daily Security Audit - $DATE" security-team@company.com < "$OUTPUT_FILE"
fi
Application Access Review
#!/bin/bash
# app-access-review.sh - Generate application access review
APP_NAME="${1:-Salesforce}"
OUTPUT="app_access_review_$(echo "$APP_NAME" | tr ' ' '_')_$(date +%Y%m%d).csv"
echo "Application Access Review: $APP_NAME"
echo "Generated: $(date)"
echo ""
# Get application ID
# Pass the name with --arg so quotes in $APP_NAME cannot break the jq program
APP_ID=$(okta apps list --format json | jq -r --arg name "$APP_NAME" '.[] | select(.label == $name) | .id')
if [ -z "$APP_ID" ]; then
echo "Application not found: $APP_NAME"
exit 1
fi
# Get assigned users
echo "Email,First Name,Last Name,Status,Assigned Date" > "$OUTPUT"
# @csv quotes each field, so names containing commas do not shift columns
okta apps list-users "$APP_NAME" --format json | \
jq -r '.[] | [.profile.email, .profile.firstName, .profile.lastName, .status, .created] | @csv' \
>> "$OUTPUT"
echo "Access review exported to: $OUTPUT"
echo "Total users with access: $(tail -n +2 "$OUTPUT" | wc -l)"
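When the same export is produced from Python instead of the shell, the standard `csv` module takes care of quoting. This sketch assumes assignment records shaped like the JSON the script above reads (`profile.email`, `profile.firstName`, `profile.lastName`, `status`, `created`); `assignments_to_csv` is a hypothetical helper name.

```python
import csv
import io
from typing import Iterable

COLUMNS = ["Email", "First Name", "Last Name", "Status", "Assigned Date"]


def assignments_to_csv(app_users: Iterable[dict]) -> str:
    """Render application assignments as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for entry in app_users:
        profile = entry.get("profile") or {}
        writer.writerow([
            profile.get("email", ""),
            profile.get("firstName", ""),
            profile.get("lastName", ""),
            entry.get("status", ""),
            entry.get("created", ""),
        ])
    return buffer.getvalue()
```

Fields that contain the delimiter, such as a last name with a comma, are quoted automatically under the writer's default quoting mode.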
Advantages of Okta CLI
✅ Quick & Easy: Fast for one-off queries
✅ Shell Integration: Works seamlessly with bash/zsh scripts
✅ No Coding Required: Simple command-line interface
✅ Multiple Formats: JSON, CSV, table output
✅ Interactive: Browser-based authentication option
✅ Scriptable: Easy to automate with cron
Disadvantages of Okta CLI
❌ Limited Features: Not all API endpoints available
❌ Less Control: Can’t customize requests as much
❌ Performance: Slower for large-scale operations
❌ Error Handling: Limited error handling options
❌ Complex Logic: Difficult for complex data processing
❌ Dependencies: Requires CLI installation and updates
Comparison Matrix
| Feature | Python SDK | REST API | Okta CLI |
|---|---|---|---|
| Learning Curve | Medium | Low | Low |
| Setup Complexity | Medium | Low | Low |
| Type Safety | ✅ High | ❌ None | ❌ None |
| Pagination | ⚠️ Helper methods | ⚠️ Manual | ✅ Automatic |
| Rate Limiting | ✅ Built-in | ⚠️ Manual | ✅ Built-in |
| Error Handling | ✅ Excellent | ⚠️ Basic | ⚠️ Basic |
| Performance | ⚡ Fast | ⚡ Fast | ⚠️ Moderate |
| Flexibility | ⚠️ Good | ✅ Excellent | ❌ Limited |
| API Coverage | ⚠️ Good | ✅ Complete | ❌ Partial |
| Scripting | ✅ Excellent | ✅ Excellent | ✅ Good |
| Complex Logic | ✅ Excellent | ✅ Excellent | ❌ Limited |
| Debugging | ✅ Good | ✅ Excellent | ⚠️ Basic |
| Dependencies | Python + SDK | Python + requests | CLI binary |
| Update Frequency | SDK releases | Always current | CLI releases |
| Best For | Production apps | Custom integrations | Quick queries |
When to Use Each Method
Use Python SDK When:
- Building production applications
- Need type safety and IDE support
- Want pagination helpers and built-in rate-limit handling
- Working in Python ecosystem already
- Building long-term maintainable code
- Need comprehensive error handling
Example scenarios:
- Automated user provisioning system
- Compliance reporting dashboard
- Identity governance application
- User lifecycle automation
Use REST API When:
- Need access to newest API features immediately
- Building in non-Python language
- Require maximum flexibility and control
- SDK doesn’t support specific endpoint
- Building custom integration
- Need to minimize dependencies
Example scenarios:
- Custom webhook handlers
- Microservices integration
- Multi-language environments
- Edge cases not covered by SDK
Use Okta CLI When:
- Performing ad-hoc queries
- Writing quick shell scripts
- Automating simple tasks
- Learning Okta API
- Troubleshooting issues
- One-off data exports
Example scenarios:
- Daily email reports
- Manual audits
- Quick data exports
- Cron job automation
Best Practices
Security
Never hardcode credentials
# ❌ Bad
API_TOKEN = "00abc123def456..."
# ✅ Good
import os
API_TOKEN = os.environ.get('OKTA_API_TOKEN')
Use OAuth 2.0 for production
- More secure than API tokens
- Supports scoped access
- Better audit trail
Rotate credentials regularly
- Set expiration on API tokens
- Rotate OAuth keys quarterly
- Monitor for compromised credentials
Implement least privilege
- Request only necessary scopes
- Use read-only tokens when possible
- Create separate tokens per application
Performance
Implement rate limiting
import time
from ratelimit import limits, sleep_and_retry

@sleep_and_retry
@limits(calls=100, period=60)  # 100 calls per minute
def call_okta_api():
    # Your API call
    pass
Use pagination efficiently
# Process in batches
BATCH_SIZE = 200
params = {'limit': BATCH_SIZE}
Cache when appropriate
from functools import lru_cache

@lru_cache(maxsize=100)
def get_user_groups(user_id):
    # Cached for repeated calls
    pass
Parallelize independent requests
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(get_user, user_id) for user_id in user_ids]
    results = [f.result() for f in concurrent.futures.as_completed(futures)]
Error Handling
Implement retry logic
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
def fetch_users():
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()
Log errors comprehensively
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
try:
    users = fetch_users()
except Exception as e:
    logger.error(f"Failed to fetch users: {e}", exc_info=True)
Handle rate limits gracefully
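if response.status_code == 429:
    # X-Rate-Limit-Reset is a UTC epoch timestamp, not a delay in seconds
    reset_at = int(response.headers.get('X-Rate-Limit-Reset', time.time() + 60))
    time.sleep(max(0, reset_at - time.time()))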
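Okta reports the reset time in the `X-Rate-Limit-Reset` header as a UTC epoch timestamp, so the wait has to be computed rather than read directly. A small sketch of that calculation follows; `seconds_until_reset` is a hypothetical helper name, and the 60-second fallback is an assumption for responses that lack the header.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None,
                        default: float = 60.0) -> float:
    """Return how long to wait before retrying a rate-limited request."""
    now = time.time() if now is None else now
    raw = headers.get("X-Rate-Limit-Reset")
    try:
        reset_at = float(raw)
    except (TypeError, ValueError):
        # Header missing or malformed: fall back to a fixed delay
        return default
    # Never return a negative delay if the reset time has already passed
    return max(0.0, reset_at - now)
```

A caller would then use `time.sleep(seconds_until_reset(response.headers))` after receiving a 429 response.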
Data Management
Export to multiple formats
import pandas as pd

df = pd.DataFrame(users)
df.to_csv('users.csv', index=False)
df.to_excel('users.xlsx', index=False)
df.to_json('users.json', orient='records')
Validate data quality
# Check for required fields
for user in users:
    assert 'email' in user['profile'], f"Missing email for user {user['id']}"
Archive reports
import gzip
import shutil

# Compress old reports
with open('report.json', 'rb') as f_in:
    with gzip.open('report.json.gz', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
Automation Examples
Scheduled Daily Report
# daily_report.py
import asyncio
import os
from datetime import datetime, timezone
from okta.client import Client as OktaClient
import pandas as pd

async def generate_daily_report():
    config = {
        'orgUrl': os.environ['OKTA_ORG_URL'],
        'token': os.environ['OKTA_API_TOKEN']
    }
    client = OktaClient(config)
    # Get active users, then keep those created today (UTC, matching Okta timestamps)
    users, resp, err = await client.list_users({'filter': 'status eq "ACTIVE"'})
    if err:
        raise RuntimeError(f"Failed to list users: {err}")
    today = datetime.now(timezone.utc).date()
    new_users = []
    while True:
        # SDK list calls return plain lists, so use a regular for loop
        for user in users:
            created_date = datetime.fromisoformat(user.created.replace('Z', '+00:00')).date()
            if created_date == today:
                new_users.append({
                    'Email': user.profile.email,
                    'Name': f"{user.profile.first_name} {user.profile.last_name}",
                    'Created': user.created
                })
        if resp.has_next():
            users, err = await resp.next()
            if err:
                raise RuntimeError(f"Failed to fetch next page: {err}")
        else:
            break
    await client.close()
    # Generate report
    if new_users:
        df = pd.DataFrame(new_users)
        filename = f'new_users_{today}.csv'
        df.to_csv(filename, index=False)
        print(f"Report generated: {filename}")
        # Send email (using your email service)
        # send_email(filename)
    else:
        print("No new users today")

if __name__ == "__main__":
    asyncio.run(generate_daily_report())
Crontab entry:
# Run daily at 6 AM
0 6 * * * cd /path/to/scripts && python daily_report.py
Compliance Report Generator
# compliance_report.py
import asyncio
import json
import os
from datetime import datetime, timedelta, timezone

from okta.client import Client as OktaClient


async def paginate(first_page):
    """Yield every item from a paginated SDK call, following next links."""
    items, resp, err = await first_page
    while True:
        if err:
            raise RuntimeError(f"Okta API error: {err}")
        for item in items or []:
            yield item
        # Stop on an empty page as well, in case a next link is still offered
        if not items or not resp.has_next():
            break
        items, err = await resp.next()


async def generate_compliance_report():
    """
    Generate SOC 2 compliance report covering:
    - User access reviews
    - Administrative access
    - Authentication logs
    - Configuration changes (not implemented in this example)
    """
    config = {
        'orgUrl': os.environ['OKTA_ORG_URL'],
        'token': os.environ['OKTA_API_TOKEN']
    }
    now = datetime.now(timezone.utc)
    report = {
        'generated': now.isoformat(),
        'period': '30 days',
        'sections': {}
    }
    # The context manager reuses one HTTP session and closes it on exit
    async with OktaClient(config) as client:
        # 1. User Access Review
        print("Generating user access review...")
        user_summary = {
            'total': 0,
            'active': 0,
            'suspended': 0,
            'deprovisioned': 0
        }
        async for user in paginate(client.list_users()):
            user_summary['total'] += 1
            user_summary[user.status.lower()] = user_summary.get(user.status.lower(), 0) + 1
        report['sections']['user_access'] = user_summary
        # 2. Administrative Access
        print("Auditing administrative access...")
        admin_users = []
        async for group in paginate(client.list_groups()):
            if 'admin' in group.profile.name.lower():
                async for member in paginate(client.list_group_users(group.id)):
                    admin_users.append({
                        'email': member.profile.email,
                        'group': group.profile.name
                    })
        report['sections']['administrative_access'] = {
            'count': len(admin_users),
            'users': admin_users
        }
        # 3. Authentication Events
        print("Analyzing authentication events...")
        since = (now - timedelta(days=30)).strftime('%Y-%m-%dT%H:%M:%S.000Z')
        query_params = {
            'filter': 'eventType eq "user.session.start"',
            'since': since,
            'limit': 1000
        }
        auth_summary = {
            'total_attempts': 0,
            'successful': 0,
            'failed': 0
        }
        async for log in paginate(client.get_logs(query_params)):
            auth_summary['total_attempts'] += 1
            if log.outcome.result == 'SUCCESS':
                auth_summary['successful'] += 1
            else:
                auth_summary['failed'] += 1
        report['sections']['authentication'] = auth_summary
    # Save report
    filename = f'compliance_report_{datetime.now().strftime("%Y%m%d")}.json'
    with open(filename, 'w') as f:
        json.dump(report, f, indent=2)
    print(f"\nCompliance report generated: {filename}")
    return report


if __name__ == "__main__":
    asyncio.run(generate_compliance_report())
Troubleshooting
Common Issues
Authentication Errors
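# Check token validity
import os
import requests

API_TOKEN = os.environ['OKTA_API_TOKEN']

response = requests.get(
    'https://your-domain.okta.com/api/v1/users/me',
    headers={'Authorization': f'SSWS {API_TOKEN}'}
)
if response.status_code == 401:
    print("Token is invalid or expired")
elif response.status_code == 403:
    print("Token lacks required permissions")
elif response.ok:
    print("Token is valid")
else:
    # e.g. 429 or a 5xx says nothing about the token itself
    print(f"Unexpected response: {response.status_code}")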
Rate Limiting
# Check rate limit headers
print(f"Rate limit: {response.headers.get('X-Rate-Limit-Limit')}")
print(f"Remaining: {response.headers.get('X-Rate-Limit-Remaining')}")
print(f"Reset: {response.headers.get('X-Rate-Limit-Reset')}")
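The raw header values are easier to act on once converted: the reset value is an epoch timestamp, and the limit and remaining counts together show how much of the budget is already spent. The sketch below turns the three headers into a small summary dictionary. describe_rate_limit is a hypothetical helper name, and the shape of the returned dictionary is an assumption made for illustration.

```python
from datetime import datetime, timezone


def describe_rate_limit(headers):
    """Summarize Okta rate-limit headers; values are None for any header that is absent."""
    def to_int(name):
        value = headers.get(name)
        return int(value) if value is not None else None

    limit = to_int("X-Rate-Limit-Limit")
    remaining = to_int("X-Rate-Limit-Remaining")
    reset = to_int("X-Rate-Limit-Reset")
    return {
        "limit": limit,
        "remaining": remaining,
        # The reset header is epoch seconds; convert to an aware UTC datetime.
        "resets_at": datetime.fromtimestamp(reset, tz=timezone.utc) if reset is not None else None,
        # Fraction of the current window's budget already consumed.
        "used_fraction": (limit - remaining) / limit if limit and remaining is not None else None,
    }
```

A script could, for example, pause voluntarily once used_fraction passes a threshold of its choosing rather than waiting to receive a 429.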
Pagination Issues
# Ensure you're following links correctly
if 'next' in response.links:
    next_url = response.links['next']['url']
    # Continue pagination
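A complete loop built on that check might look like the sketch below. To keep it testable without network access, the HTTP call is passed in as a parameter: get is assumed to behave like requests.get (or a requests.Session().get), returning an object with raise_for_status(), json(), and links. The function name fetch_all_pages is illustrative only.

```python
def fetch_all_pages(get, url, headers=None, params=None):
    """Follow rel="next" links until exhausted and return all items in one list.

    `get` is any callable shaped like requests.get; each response must expose
    .raise_for_status(), .json() (a list of items) and .links (parsed Link header).
    """
    items = []
    while url:
        response = get(url, headers=headers, params=params)
        response.raise_for_status()
        page = response.json()
        items.extend(page)
        # The next URL already carries the cursor and query string, so the
        # original params must not be sent again.
        params = None
        next_link = response.links.get("next")
        # Stop on an empty page too, since some endpoints may keep offering
        # a next link (for example, when polling the System Log).
        url = next_link["url"] if next_link and page else None
    return items
```

With requests installed, a call would look like fetch_all_pages(requests.get, f"{org_url}/api/v1/users", headers=auth_headers, params={"limit": 200}), where org_url and auth_headers are placeholders for your own values.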
Conclusion
Extracting data from Okta for reporting and analytics is essential for security, compliance, and operational excellence. Each of the three methods (Python SDK, REST API, and CLI) has its strengths:
- Python SDK: Best for production applications requiring robust error handling and type safety
- REST API: Ideal for custom integrations and maximum flexibility
- Okta CLI: Perfect for quick queries and simple automation
Choose the method that best fits your use case, technical requirements, and team expertise. For many organizations, a combination of all three provides the best balance of capabilities:
- Use the CLI for ad-hoc queries and troubleshooting
- Use the Python SDK for production reporting systems
- Use the REST API for custom integrations and edge cases
Key Takeaways
- Security First: Always use secure authentication methods (OAuth 2.0 in production)
- Handle Errors: Implement retry logic and comprehensive error handling
- Respect Limits: Implement rate limiting and pagination
- Automate: Schedule regular reports for consistency
- Document: Keep your scripts well-documented and maintainable
- Monitor: Track your reporting scripts and alert on failures
Next Steps
- Set up authentication for your chosen method
- Start with simple queries to understand the data
- Build your first reporting script
- Automate your reports
- Share insights with your team