Python Integration Guide
Learn how to instrument your Python applications with OpenTelemetry to send distributed traces to TraceKit.
Using Django?
See our dedicated Django observability guide for framework-specific setup, pain points, and best practices.
Python Distributed Tracing Guide
Go deeper with our Python distributed tracing guide -- covering common pain points, production patterns, and code examples.
New: Frontend Observability
Looking for browser-side error tracking? Check out the new Browser SDK with React, Vue, Angular, Next.js, and Nuxt integrations via Framework Wrappers.
90% Automatic Tracing!
With the right libraries, most of your application will be traced automatically with minimal setup. No need to manually instrument every function.
Prerequisites
- Python 3.8 or higher
- A TraceKit account (create one free)
- A generated API key from the API Keys page
What Gets Traced Automatically?
With proper setup, these operations are traced automatically with zero manual instrumentation:
| Component | Span Type | Captured Attributes | Auto-Traced? |
|---|---|---|---|
| HTTP Endpoints | SERVER | method, route, status_code, duration, client_ip | Yes |
| Database Queries | DB | db.system, db.statement, db.name, duration | Yes |
| HTTP Client Calls | CLIENT | method, url, status_code, duration, peer.service | Yes |
| Redis Operations | DB | db.system (redis), db.statement, duration | Yes |
| Celery Tasks | Custom | task.name, task.id, status, duration | Yes |
| LLM (OpenAI/Anthropic) | CLIENT | gen_ai.system, model, tokens, cost, finish_reason, latency | Yes |
| Custom Business Logic | Custom | user-defined attributes | Manual |
Installation
Install the required OpenTelemetry packages:
# Install TraceKit Python SDK
pip install tracekit-apm
# Framework-specific installation
pip install tracekit-apm[flask] # For Flask
pip install tracekit-apm[django] # For Django
pip install tracekit-apm[fastapi] # For FastAPI
pip install tracekit-apm[all] # All frameworks
Basic Setup
Create a tracing initialization module in your application:
Create tracing.py
# Simple initialization with TraceKit SDK
import tracekit
import os
# Initialize TraceKit with code monitoring enabled
client = tracekit.init(
api_key=os.getenv('TRACEKIT_API_KEY'),
service_name=os.getenv('SERVICE_NAME', 'my-python-app'),
endpoint=os.getenv('TRACEKIT_ENDPOINT', '{ appURL }'),
enable_code_monitoring=True # Enable live debugging
)
# That's it! TraceKit handles all OpenTelemetry setup automatically
Verify It Works
Start your application and make a few requests. Then open the Traces page in your TraceKit dashboard. You should see:
- SERVER spans for each incoming HTTP request (Django, Flask, or FastAPI)
- CLIENT spans for outgoing HTTP calls (if instrumented with requests/httpx)
- DB spans for database queries (if instrumented with SQLAlchemy/psycopg2)
Traces typically appear within 30 seconds. If you don't see them, check the Troubleshooting section.
Framework Integration
TraceKit works seamlessly with popular Python web frameworks through OpenTelemetry instrumentation.
Flask
# app.py
from flask import Flask, request, jsonify
import tracekit
from tracekit.middleware.flask import init_flask_app
import os
# Create Flask app
app = Flask(__name__)
# Initialize TraceKit with code monitoring
client = tracekit.init(
api_key=os.getenv("TRACEKIT_API_KEY"),
service_name="my-flask-app",
endpoint=os.getenv("TRACEKIT_ENDPOINT", "{ appURL }"),
enable_code_monitoring=True # Enable live debugging
)
# Add TraceKit middleware (auto-traces all routes!)
init_flask_app(app, client)
@app.route("/api/users/<int:user_id>")
def get_user(user_id):
# Capture snapshot for debugging (optional)
if client.get_snapshot_client():
client.capture_snapshot('get-user', {
'user_id': user_id,
'request_path': request.path,
'request_method': request.method
})
return jsonify({"id": user_id, "name": "Alice"})
if __name__ == "__main__":
app.run(port=5000)
Django
# settings.py
from tracing import init_tracer
import os
# Initialize tracing at Django startup
init_tracer(
service_name="my-django-app",
endpoint=os.getenv("TRACEKIT_ENDPOINT", "{ appURL }"),
api_key=os.getenv("TRACEKIT_API_KEY")
)
# Add Django instrumentation to INSTALLED_APPS (not required but recommended)
# Then use:
# opentelemetry-instrument python manage.py runserver
# Or instrument manually in your WSGI/ASGI file:
from opentelemetry.instrumentation.django import DjangoInstrumentor
DjangoInstrumentor().instrument()
FastAPI
# main.py
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
import os
from tracing import init_tracer
# Initialize tracing
init_tracer(
service_name="my-fastapi-app",
endpoint=os.getenv("TRACEKIT_ENDPOINT", "{ appURL }"),
api_key=os.getenv("TRACEKIT_API_KEY")
)
# Create FastAPI app
app = FastAPI()
# Auto-instrument FastAPI
FastAPIInstrumentor.instrument_app(app)
@app.get("/api/users")
async def get_users():
return {"users": ["alice", "bob", "charlie"]}
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
Middleware API Reference
Framework integration functions and classes for automatic request tracing:
| Function / Class | Parameters | Returns | Description |
|---|---|---|---|
| create_flask_middleware() | client: TracekitClient, snapshot_enabled: bool = False | FlaskMiddleware | Lower-level Flask middleware factory. Returns WSGI middleware that wraps each request in a SERVER span. |
| init_flask_app() | app: Flask, client: TracekitClient | None | High-level Flask setup. Registers middleware, error handlers, and request hooks on the Flask app. |
| TracekitDjangoMiddleware | Django middleware class | — | Django middleware class. Add 'tracekit.middleware.django.TracekitDjangoMiddleware' to your MIDDLEWARE list in settings.py. |
| create_fastapi_middleware() | client: TracekitClient | ASGIMiddleware | FastAPI/Starlette ASGI middleware factory. Returns middleware that wraps each request in a SERVER span. |
| init_fastapi_app() | app: FastAPI, client: TracekitClient | None | High-level FastAPI setup. Registers ASGI middleware, exception handlers, and shutdown hooks. |
Django Middleware Setup
# settings.py
MIDDLEWARE = [
'tracekit.middleware.django.TracekitDjangoMiddleware',
# ... other middleware
]
FastAPI Middleware Setup
from tracekit.middleware.fastapi import init_fastapi_app
app = FastAPI()
init_fastapi_app(app, client)
Code Monitoring (Live Debugging)
TraceKit includes production-safe code monitoring for live debugging without redeployment. Capture variable state and stack traces at any point in your code.
Key Features
- Synchronous API: no await needed (works in both sync and async code)
- Auto-Registration: breakpoints are created automatically on first call
- Background Sync: the SDK polls for active breakpoints every 30 seconds
- Rate Limited: max 1 capture per second per breakpoint
- Production Safe: no performance impact when inactive
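As a mental model for the per-breakpoint rate limit above, here is a small illustrative sketch. The class name and internals are hypothetical, not the SDK's actual implementation:

```python
import time

class SnapshotRateLimiter:
    """Illustrative only: at most one capture per interval per breakpoint label."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last = {}  # label -> time of last accepted capture

    def allow(self, label, now=None):
        """Return True if a capture for this label may proceed."""
        now = time.monotonic() if now is None else now
        last = self._last.get(label)
        if last is not None and now - last < self.min_interval:
            return False  # too soon: drop this capture
        self._last[label] = now
        return True
```

Under this model, a second capture for the same label within the same second is simply dropped rather than queued.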
Enable Code Monitoring
Enable code monitoring when initializing TraceKit:
import tracekit
import os
# Option 1: Direct init with keyword argument
client = tracekit.init(
api_key=os.getenv("TRACEKIT_API_KEY"),
service_name="my-flask-app",
enable_code_monitoring=True, # default: False
)
# Option 2: Using TracekitConfig dataclass
from tracekit import TracekitConfig
config = TracekitConfig(
api_key=os.getenv("TRACEKIT_API_KEY"),
service_name="my-flask-app",
enable_code_monitoring=True,
)
client = tracekit.init(config=config)
Adding Snapshots
@app.route("/api/checkout")
def checkout():
cart = request.get_json()
user_id = cart['user_id']
# Capture snapshot at this point (synchronous - no await)
if client.get_snapshot_client():
client.capture_snapshot('checkout-validation', {
'user_id': user_id,
'cart_items': len(cart.get('items', [])),
'total_amount': cart.get('total', 0),
})
# Process payment...
result = process_payment(cart)
# Another checkpoint
if client.get_snapshot_client():
client.capture_snapshot('payment-complete', {
'user_id': user_id,
'payment_id': result['payment_id'],
'success': result['success'],
})
return jsonify({'status': 'success', 'result': result})
What Gets Captured
- Variable values at the capture point
- Full call stack with file/line numbers
- Request context (HTTP method, URL, headers)
- Execution timestamp
Production Safety
The Python SDK includes multiple layers of protection to ensure code monitoring is safe for production environments:
PII Scrubbing
13 built-in patterns detect sensitive data (passwords, tokens, API keys, etc.). Enabled by default with typed [REDACTED:type] markers for easy identification.
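To picture what typed redaction produces, here is a toy scrubber with two illustrative patterns. The SDK's real set of 13 patterns is not reproduced here, and these regexes are assumptions for demonstration only:

```python
import re

# Two illustrative patterns; the real SDK ships 13 built-in ones.
PATTERNS = {
    "api_key": re.compile(r"\bctxio_[A-Za-z0-9]+\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def scrub(text):
    """Replace sensitive substrings with typed [REDACTED:type] markers."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text
```

The typed marker tells you what kind of value was removed without revealing the value itself.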
Crash Isolation
All SDK entry points are wrapped in try/except blocks. Snapshot failures are silently caught and logged — your application code is never affected.
Circuit Breaker
After 3 consecutive failures within 60 seconds, the circuit opens and snapshot capture pauses for a 5-minute cooldown. Auto-recovers once the cooldown expires.
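The described behavior can be sketched as follows. Names and structure are hypothetical; the SDK's internal implementation may differ:

```python
import time

class SnapshotCircuitBreaker:
    """Illustrative sketch: 3 failures in 60s opens the circuit for 5 minutes."""

    def __init__(self, max_failures=3, window=60.0, cooldown=300.0):
        self.max_failures = max_failures
        self.window = window
        self.cooldown = cooldown
        self.failures = []      # timestamps of recent failures
        self.opened_at = None   # when the circuit opened, if open

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        # Keep only failures inside the sliding window, then add this one.
        self.failures = [t for t in self.failures if now - t < self.window]
        self.failures.append(now)
        if len(self.failures) >= self.max_failures:
            self.opened_at = now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown:
            self.opened_at = None  # cooldown expired: auto-recover
            self.failures = []
            return True
        return False
```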
Remote Kill Switch
Disable all snapshot capture instantly from the TraceKit dashboard. The kill switch state propagates to SDKs via SSE — no redeployment required.
Real-Time Updates
The Python SDK receives real-time configuration updates via Server-Sent Events (SSE). When you toggle breakpoints or activate the kill switch from the dashboard, changes propagate to all connected SDK instances immediately — no polling delay and no redeployment needed.
Code Monitoring API
Programmatic API for code monitoring and snapshot capture:
| Method / Class | Parameters | Returns | Description |
|---|---|---|---|
| capture_snapshot() | label: str (breakpoint label), variables: dict (variables to capture) | None | Captures a snapshot of variable state at the call site. Synchronous API (no await needed). Auto-registers the breakpoint on first call. Rate-limited to 1 capture/sec per label. |
| get_snapshot_client() | none | Optional[SnapshotClient] | Returns the snapshot client if code monitoring is enabled, or None if disabled. Use it to guard snapshot calls. |
| SnapshotClient | class | — | Advanced / internal. Manages breakpoint polling and snapshot uploads. Created automatically when enable_code_monitoring=True. |
capture_config Options
Fine-tune snapshot capture behavior via the capture_config dict passed to tracekit.init():
| Key | Type | Default | Description |
|---|---|---|---|
| capture_depth | int | 3 | Maximum depth for nested object serialization. Deeper structures are truncated to [...]. |
| max_payload | int | 16384 | Maximum payload size in bytes per snapshot. Payloads exceeding this limit are truncated. |
| capture_timeout | float | 5.0 | Timeout in seconds for sending a snapshot to the server. Prevents slow networks from blocking capture. |
| pii_scrubbing | bool | True | Enable PII detection and redaction. When enabled, 13 built-in patterns scrub sensitive values and replace them with typed [REDACTED:type] markers. |
| circuit_breaker | bool | True | Enable the circuit breaker. After 3 failures in 60 seconds, capture pauses for a 5-minute cooldown then auto-recovers. |
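Putting the table together, a full capture_config might look like this. The values mirror the documented defaults; the commented-out init() call shows where the dict is passed:

```python
# All values shown are the documented defaults.
capture_config = {
    "capture_depth": 3,       # truncate nested objects beyond 3 levels
    "max_payload": 16384,     # 16 KiB cap per snapshot payload
    "capture_timeout": 5.0,   # seconds before a send attempt is abandoned
    "pii_scrubbing": True,    # redact sensitive values with [REDACTED:type]
    "circuit_breaker": True,  # pause capture after repeated failures
}

# client = tracekit.init(api_key="...", capture_config=capture_config)
```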
End-to-End Workflow
Enable Code Monitoring
Code monitoring defaults to disabled in the Python SDK. You must explicitly enable it with enable_code_monitoring=True. Alternatively, use the TracekitConfig dataclass.
import tracekit
client = tracekit.init(
api_key=os.getenv("TRACEKIT_API_KEY"),
service_name="my-flask-app",
enable_code_monitoring=True, # default: False
)
Add Capture Points
Place capture_snapshot calls at code paths you want to debug. The Python API is synchronous — no await needed.
@app.route("/orders", methods=["POST"])
def create_order():
order = request.get_json()
user = get_current_user()
client.capture_snapshot("order-processing", {
"order_id": order["id"],
"total": order["total"],
"user_id": user.id,
})
# Business logic continues...
return jsonify({"status": "ok"})
Deploy and Verify Traces
Deploy your application and confirm traces are flowing. Check the TraceKit dashboard at /traces to see incoming requests.
# Start your Flask/Django app
flask run --host=0.0.0.0
# Verify traces appear in dashboard at /traces
Navigate to Code Monitoring
Go to /snapshots and click the "Browse Code" tab. You'll see auto-discovered files and functions from your traces.
Set Breakpoints
Breakpoints are auto-registered on the first capture_snapshot call. A background thread handles polling for active breakpoints. You can also manually add breakpoints via the UI "Set Breakpoint" button.
Trigger a Capture
Make a request that hits a code path with capture_snapshot. The SDK uses inspect.currentframe() for file/line detection. Capture is synchronous; data is sent in the background.
View Snapshot
Go to /snapshots and click the captured snapshot. View the captured variables, call stack, request context, security flags, and trace link.
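The file/line detection mentioned in the capture step can be pictured with the standard inspect module. This is an illustrative sketch, not the SDK's code:

```python
import inspect

def call_site():
    """Report the caller's file and line, the way a capture call can locate itself."""
    frame = inspect.currentframe().f_back  # one frame up: the caller
    return frame.f_code.co_filename, frame.f_lineno
```

Because the frame is read at call time, no source annotation or decorator is needed to know where a snapshot was taken.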
LLM Instrumentation
TraceKit automatically instruments OpenAI and Anthropic SDK calls when detected. LLM calls appear as spans with OTel GenAI semantic convention attributes.
Zero-config auto-instrumentation
If openai or anthropic is installed, TraceKit patches them automatically at init. Both sync and async methods are covered.
Captured Attributes
| Attribute | Description |
|---|---|
| gen_ai.system | Provider name (openai, anthropic) |
| gen_ai.request.model | Model name (gpt-4o, claude-sonnet-4-20250514, etc.) |
| gen_ai.usage.input_tokens | Prompt token count |
| gen_ai.usage.output_tokens | Completion token count |
| gen_ai.response.finish_reasons | Finish reason (stop, end_turn, tool_calls) |
Configuration
import tracekit
client = tracekit.init(
api_key=os.getenv('TRACEKIT_API_KEY'),
service_name='my-service',
endpoint='https://app.tracekit.dev/v1/traces',
# LLM instrumentation (enabled by default)
instrument_llm={
'enabled': True, # Master toggle
'openai': True, # OpenAI instrumentation
'anthropic': True, # Anthropic instrumentation
'capture_content': False, # Capture prompts/completions (off by default)
},
)
Environment Variable Override
Use TRACEKIT_LLM_CAPTURE_CONTENT=true to enable prompt/completion capture without code changes. Useful for enabling in staging but not production.
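If you prefer to wire flags like this up in your own configuration code, a small hypothetical helper for parsing boolean environment variables looks like:

```python
import os

def env_flag(name, default=False):
    """Treat common truthy strings as True; anything else as False."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")
```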
Streaming Support
Streaming responses (both sync and async) produce a single span covering the entire stream. Token counts are accumulated from the final event. No special configuration needed.
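As a mental model (illustrative, not the SDK's actual code), taking token usage from the final event of a stream looks like:

```python
def usage_from_stream(events):
    """Return the usage payload from the last event that carries one."""
    usage = {"input_tokens": 0, "output_tokens": 0}
    for event in events:
        if event.get("usage"):
            usage = event["usage"]  # the final event carries cumulative totals
    return usage
```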
LLM Dashboard
View LLM cost, token usage, and latency metrics on the dedicated LLM Observability dashboard at /ai/llm in your TraceKit instance.
Nested Spans (Parent-Child Relationships)
Create nested spans to track operations within requests. Parent spans are created automatically by the Flask middleware, and child spans link automatically.
from opentelemetry import trace
@app.route("/api/users/<int:user_id>")
def get_user(user_id):
# Get the tracer
tracer = trace.get_tracer(__name__)
# Parent span is auto-created by Flask middleware
# Create child span using context manager
with tracer.start_as_current_span('db.query.user') as span:
span.set_attributes({
'db.system': 'postgresql',
'db.operation': 'SELECT',
'db.table': 'users',
'db.statement': 'SELECT * FROM users WHERE id = ?',
'user.id': user_id
})
user = fetch_user_from_db(user_id)
span.set_attributes({
'user.found': user is not None,
'user.role': user.get('role') if user else None
})
return jsonify(user)
Trace Hierarchy
GET /api/users/1 (parent - auto-created)
└─ db.query.user (child - manually created)
Automatic Instrumentation Libraries
These libraries automatically create child spans for common operations. Set them up once, and every call is traced automatically.
Database Queries
Automatically trace all database operations:
SQLAlchemy
from sqlalchemy import create_engine
from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor
# Create engine
engine = create_engine("postgresql://user:pass@localhost/mydb")
# Instrument SQLAlchemy (one line!)
SQLAlchemyInstrumentor().instrument(engine=engine)
# Now all queries are automatically traced
from sqlalchemy.orm import Session
with Session(engine) as session:
users = session.query(User).all() # This query is traced!
HTTP Client Calls
Automatically trace all outgoing HTTP requests:
from opentelemetry.instrumentation.requests import RequestsInstrumentor
import requests
# Instrument requests library (one line!)
RequestsInstrumentor().instrument()
# Now all HTTP requests are automatically traced
response = requests.get("https://api.example.com/users") # Traced!
Redis Operations
Trace Redis commands automatically:
from redis import Redis
from opentelemetry.instrumentation.redis import RedisInstrumentor
# Instrument Redis
RedisInstrumentor().instrument()
# Create Redis client
redis_client = Redis(host='localhost', port=6379)
# All operations are now traced!
redis_client.set("key", "value") # Traced!
value = redis_client.get("key") # Traced!
Auto-Instrumented HTTP Client Libraries
The following HTTP client libraries are automatically instrumented when auto_instrument_http_client=True (default):
| Library | Span Type | Captured Attributes | Auto-Instrumented? |
|---|---|---|---|
| requests | CLIENT | http.method, http.url, http.status_code, peer.service | ✓ Yes |
| urllib | CLIENT | http.method, http.url | ✓ Yes |
| urllib3 | CLIENT | http.method, http.url, http.status_code, peer.service | ✓ Yes |
Note: These libraries are instrumented automatically via OpenTelemetry when the SDK is initialized. No additional setup required.
Local UI (Development Mode)
Debug your Python application locally without creating an account. TraceKit Local UI runs on your machine at http://localhost:9999 and automatically receives traces when you run your app in development mode.
Automatic Detection
The Python SDK automatically detects when Local UI is running on port 9999 and sends traces to both Local UI and cloud (if you have an API key configured).
Quick Start
1. Install the Local UI: npm install -g @tracekit/local-ui
2. Start the Local UI: tracekit-local
3. Run your app in development mode: ENV=development python app.py
4. Open your browser: http://localhost:9999
Features
Auto-Detection
SDK checks for Local UI at localhost:9999 on startup
Real-Time Updates
See traces instantly with WebSocket live updates
Development Only
Only activates when ENV=development
Works Offline
No internet connection required - everything runs locally
Benefits
- See your first trace in under 60 seconds
- Debug locally without switching to the cloud dashboard
- Stay in your flow - everything runs on your machine
- Works completely offline
- Perfect for development and demos
Troubleshooting
If traces aren't appearing in Local UI, check:
- Local UI is running (curl http://localhost:9999/api/health)
- ENV=development is set
- SDK version is v0.3.2 or higher
- The console shows the "Local UI detected" message
Service Discovery
TraceKit automatically instruments outgoing HTTP calls to create service dependency graphs. When your service makes an HTTP request to another service, TraceKit creates CLIENT spans and injects trace context headers.
Supported HTTP Clients
- requests - Python requests library
- httpx - Modern HTTP client
- aiohttp - Async HTTP client
- urllib3 - Low-level HTTP client
Custom Service Name Mappings
For local development or when service names can't be inferred from hostnames, configure service name mappings:
import tracekit
import os
client = tracekit.init(
api_key=os.getenv('TRACEKIT_API_KEY'),
service_name='my-service',
# Map localhost URLs to actual service names
service_name_mappings={
'localhost:8082': 'payment-service',
'localhost:8083': 'user-service',
'localhost:8084': 'inventory-service',
}
)
# Now requests to localhost:8082 will show as "payment-service"
import requests
response = requests.get('http://localhost:8082/charge')
# -> Creates CLIENT span with peer.service = "payment-service"
This maps localhost:8082 to "payment-service" in your service graph.
Service Name Detection
TraceKit intelligently extracts service names from URLs:
| URL | Extracted Service Name |
|---|---|
| http://payment-service:3000 | payment-service |
| http://payment.internal | payment |
| http://payment.svc.cluster.local | payment |
| https://api.example.com | api.example.com |
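The rules in the table can be approximated with a short function. This is a sketch of the heuristic, not the SDK's exact logic:

```python
from urllib.parse import urlparse

# Suffixes treated as infrastructure noise rather than part of the name.
INTERNAL_SUFFIXES = (".svc.cluster.local", ".internal")

def extract_service_name(url):
    """Approximate the service-name heuristic from the table above."""
    host = urlparse(url).hostname or ""
    for suffix in INTERNAL_SUFFIXES:
        if host.endswith(suffix):
            return host[: -len(suffix)].split(".")[0]
    return host  # public hostnames are kept as-is
```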
Viewing Service Dependencies
Visit your TraceKit dashboard to see the Service Map - a visual graph showing which services call which, with health metrics and latency data.
Manual Instrumentation (Optional)
For custom business logic that isn't covered by auto-instrumentation libraries, you can manually create spans. This is optional and only needed for specific operations you want to measure.
from opentelemetry import trace
@app.route("/api/users/<int:user_id>")
def get_user(user_id):
# Get the tracer
tracer = trace.get_tracer(__name__)
# Parent span is auto-created by Flask middleware
# Create child span using context manager
with tracer.start_as_current_span('db.query.user') as span:
span.set_attributes({
'db.system': 'postgresql',
'db.operation': 'SELECT',
'db.table': 'users',
'db.statement': 'SELECT * FROM users WHERE id = ?',
'user.id': user_id
})
user = fetch_user_from_db(user_id)
span.set_attributes({
'user.found': user is not None,
'user.role': user.get('role') if user else None
})
return jsonify(user)
# Creates a nested trace:
# GET /api/users/1 (parent span - auto-created)
# └─ db.query.user (child span - manually created)
Core API
Module-level exports and primary classes:
| Symbol | Type | Returns | Description |
|---|---|---|---|
| get_client() | function | TracekitClient | Returns the global TracekitClient singleton. Call after tracekit.init(). |
| TracekitClient | class | — | Main SDK client. Created via tracekit.init(), accessed via tracekit.get_client(). Provides all tracing, lifecycle, and code monitoring methods. |
| TracekitConfig | dataclass | — | Configuration dataclass with fields: api_key, service_name, endpoint, traces_path, metrics_path, enabled, sample_rate, enable_code_monitoring, auto_instrument_http_client, service_name_mappings. |
| extract_client_ip_from_headers() | function | Optional[str] | Extracts client IP from request headers (X-Forwarded-For, X-Real-IP). Returns None if no IP found. |
TracekitClient Tracing API
The TracekitClient provides these methods for manual span creation and management:
| Method | Parameters | Returns | Description |
|---|---|---|---|
| start_trace() | operation_name: str, attributes: Optional[Dict] = None | Span | Creates a new root trace. Use for top-level operations not tied to an incoming HTTP request. |
| start_server_span() | operation_name: str, attributes: Optional[Dict] = None, parent_context = None | Span | Creates a SERVER span for incoming requests. Extracts trace context from request headers for distributed tracing. |
| start_span() | operation_name: str, attributes: Optional[Dict] = None | Span | Creates a child span linked to the current active span from context. |
| end_span() | span: Span, final_attributes: Optional[Dict] = None, status: str = "OK" | None | Ends a span, recording its duration. Optionally sets final attributes and a status code. |
| add_event() | span: Span, name: str, attributes: Optional[Dict] = None | None | Records a timestamped event on a span. Use for significant occurrences during an operation. |
| record_exception() | span: Span, exception: Exception | None | Records an exception on a span with message, type, and traceback. Automatically sets the span status to ERROR. |
Lifecycle Methods
Control the SDK lifecycle and query its state:
| Method | Parameters | Returns | Description |
|---|---|---|---|
| flush() | none | None | Forces an immediate flush of all pending spans to the backend. Call before process exit or after critical operations. |
| shutdown() | none | None | Gracefully shuts down the SDK: flushes remaining data, releases resources, and stops background tasks. |
| is_enabled() | none | bool | Returns whether the SDK is currently enabled and actively tracing requests. |
| should_sample() | none | bool | Returns whether the current request should be sampled based on the configured sample rate. |
Complete Tracing Lifecycle Example
from tracekit import get_client
client = get_client()
span = client.start_trace("process-order", {"order.id": order_id})
try:
child = client.start_span("validate-payment")
client.add_event(child, "payment-validated", {"amount": total})
client.end_span(child)
client.end_span(span, {"order.status": "completed"})
except Exception as e:
client.record_exception(span, e)
client.end_span(span, status="ERROR")
finally:
client.flush()
Environment Variables
Best practice: Store sensitive configuration in environment variables:
# .env
TRACEKIT_API_KEY=ctxio_your_generated_api_key_here
TRACEKIT_ENDPOINT={ appURL }
SERVICE_NAME=my-python-app
All configuration options available for the Python SDK:
| Option | Type | Default | Env Variable | Description |
|---|---|---|---|---|
| api_key | str | required | TRACEKIT_API_KEY | Your TraceKit API key for authentication |
| service_name | str | "python-app" | TRACEKIT_SERVICE_NAME | Name of your service in the trace dashboard |
| endpoint | str | "app.tracekit.dev" | TRACEKIT_ENDPOINT | TraceKit collector endpoint URL |
| traces_path | str | "/v1/traces" | TRACEKIT_TRACES_PATH | HTTP path for trace data export |
| metrics_path | str | "/v1/metrics" | TRACEKIT_METRICS_PATH | HTTP path for metrics data export |
| enabled | bool | True | TRACEKIT_ENABLED | Enable or disable tracing globally |
| sample_rate | float | 1.0 | TRACEKIT_SAMPLE_RATE | Trace sampling rate (0.0 to 1.0, where 1.0 = 100%) |
| enable_code_monitoring | bool | False | TRACEKIT_CODE_MONITORING_ENABLED | Enable live code debugging / snapshot capture |
| auto_instrument_http_client | bool | True | - | Auto-instrument outgoing HTTP calls (requests, httpx) |
| service_name_mappings | Optional[Dict] | None | - | Map host:port to service names for service discovery |
Note: The Python SDK does not auto-read environment variables. Read them with os.getenv() and pass to init().
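Since the SDK does not read the environment itself, a typical pattern is to assemble the values in your own code. This is a sketch; the resulting kwargs would be passed as tracekit.init(**config):

```python
import os

# Read each documented variable explicitly and convert it to the right type.
config = {
    "api_key": os.getenv("TRACEKIT_API_KEY", ""),
    "service_name": os.getenv("TRACEKIT_SERVICE_NAME", "python-app"),
    "endpoint": os.getenv("TRACEKIT_ENDPOINT", "app.tracekit.dev"),
    "sample_rate": float(os.getenv("TRACEKIT_SAMPLE_RATE", "1.0")),
    "enabled": os.getenv("TRACEKIT_ENABLED", "true").lower() == "true",
}

# client = tracekit.init(**config)
```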
Production Configuration
Production Checklist
- Use HTTPS/TLS for the OTLP endpoint
- Store API keys in a secrets manager (AWS Secrets Manager, HashiCorp Vault)
- Set appropriate service names and versions
- Configure resource attributes (deployment.environment, host.name, etc.)
- Adjust sampling rates if needed for high-traffic services
Troubleshooting
Traces not appearing?
Cause: The SDK is misconfigured or not initialized before request handling.
Fix:
- Call tracekit.init() at the top of your application, before any framework setup
- Verify the API key: print(os.getenv('TRACEKIT_API_KEY'))
- Check that enabled is True
- Review stderr for OpenTelemetry export errors
Connection refused errors?
Cause: The TraceKit endpoint is unreachable.
Fix:
- Test with curl -X POST https://app.tracekit.dev/v1/traces (expect 401)
- Check that your Python environment can reach external HTTPS endpoints (proxy, firewall)
- Verify the endpoint URL
Code monitoring not working?
Cause: Code monitoring defaults to disabled in the Python SDK.
Fix:
- Set enable_code_monitoring=True in init()
- Add capture_snapshot() calls in target code paths
- Ensure the polling interval has elapsed before expecting snapshots
- Verify API key permissions
Authentication errors (401/403)?
Cause: The API key is invalid or malformed.
Fix:
- Strip whitespace: api_key=os.getenv('TRACEKIT_API_KEY', '').strip()
- Regenerate the key in the dashboard
- Confirm the key matches the target environment
Missing dependency errors?
Cause: Python OTel packages must be explicitly installed -- they are not bundled with the SDK.
Fix:
- Install all required packages: pip install tracekit opentelemetry-api opentelemetry-sdk opentelemetry-instrumentation-flask (or the framework-specific package)
- Run pip list | grep opentelemetry to verify installed versions
Complete Example
Here's a complete working example with Flask:
# Complete Flask example with TraceKit SDK
from flask import Flask, request, jsonify
from opentelemetry import trace
import tracekit
from tracekit.middleware.flask import init_flask_app
import os
import time
# Create Flask app
app = Flask(__name__)
# Initialize TraceKit with code monitoring
client = tracekit.init(
api_key=os.getenv("TRACEKIT_API_KEY"),
service_name=os.getenv("SERVICE_NAME", "flask-api"),
endpoint=os.getenv("TRACEKIT_ENDPOINT", "{ appURL }"),
enable_code_monitoring=True
)
# Add TraceKit middleware
init_flask_app(app, client)
@app.route("/api/users")
def get_users():
return jsonify({"users": ["alice", "bob", "charlie"]})
@app.route("/api/users/<int:user_id>")
def get_user(user_id):
# Capture snapshot for debugging
if client.get_snapshot_client():
client.capture_snapshot('get-user', {
'user_id': user_id,
'request_path': request.path,
'request_method': request.method
})
# Create nested span for database query
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span('db.query.user') as span:
span.set_attributes({
'db.system': 'postgresql',
'db.operation': 'SELECT',
'db.table': 'users',
'user.id': user_id
})
time.sleep(0.01) # Simulate DB query
return jsonify({"id": user_id, "name": "Alice"})
if __name__ == "__main__":
print("Flask app starting with TraceKit tracing and code monitoring enabled")
app.run(host="0.0.0.0", port=5000)
You're all set!
Your Python application is now sending traces to TraceKit. Visit the Dashboard to see your traces.
Custom Metrics
Track custom metrics like request counts, queue sizes, and response times using the TraceKit metrics API.
Counter
Track monotonically increasing values (requests, events):
import tracekit
client = tracekit.init(
api_key="your-api-key",
service_name="my-service"
)
# Create a counter with optional tags
counter = client.counter("http.requests.total", tags={"service": "api"})
# Increment by 1
counter.inc()
# Add a specific value
counter.add(5)
Gauge
Track values that can go up or down (queue size, connections):
# Create a gauge
gauge = client.gauge("http.connections.active")
# Set to specific value
gauge.set(42)
# Increment/decrement
gauge.inc()
gauge.dec()
Histogram
Track value distributions (latencies, sizes):
# Create a histogram with tags
histogram = client.histogram("http.request.duration", tags={"unit": "ms"})
# Record values
histogram.record(45.2)
histogram.record(123.5)
🔄 Migrating from OpenTelemetry
TraceKit wraps OpenTelemetry internally, so you get the same standards-based trace data with significantly less setup code. Here's how to migrate from a raw OpenTelemetry setup to TraceKit.
Before vs After
# 5 imports required
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
resource = Resource.create({"service.name": "my-service"})
provider = TracerProvider(resource=resource)
exporter = OTLPSpanExporter(
endpoint="https://api.tracekit.io/v1/traces",
headers={"Authorization": f"Bearer {api_key}"},
)
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("my-service")
# + Flask/Django middleware setup
# + manual span creation for each route
import tracekit
tracekit.init(
api_key=os.environ["TRACEKIT_API_KEY"],
service_name="my-service",
)
# Flask middleware: one line
app = tracekit.flask_app(__name__)
Migration Steps
1. Install the SDK: pip install tracekit
2. Replace init code: remove all opentelemetry imports and provider setup; replace with tracekit.init()
3. Replace middleware: for Flask, use tracekit.flask_app(); for Django, add tracekit.DjangoMiddleware to MIDDLEWARE
4. Remove OTel packages: pip uninstall opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
5. Verify: start your app and check the Traces page for incoming data
Key Migration Benefits
- 25 lines to 5 lines: no more boilerplate for exporters, resources, processors
- No OTel dependency management: TraceKit handles version pinning internally
- Built-in code monitoring: not available with raw OpenTelemetry
- Built-in security scanning: automatic variable redaction on snapshots
- Auto-instrumentation included: HTTP, DB, and external calls traced automatically
⚡ Performance Overhead
TraceKit is built on OpenTelemetry's efficient batch processing pipeline. The SDK adds minimal overhead to your Python application.
Request Tracing
< 2ms per request
Spans are batched and exported asynchronously.
Code Monitoring (Idle)
Zero overhead
No performance impact when no active breakpoints.
Code Monitoring (Capture)
< 5ms per snapshot
Includes variable serialization and security scanning.
Memory Footprint
~15-25 MB
SDK runtime and span buffer.
SDK Initialization
< 500ms one-time
One-time cost at application startup. Includes OpenTelemetry provider setup and auto-instrumentation registration.
Note: Performance characteristics are typical for production workloads and may vary with application complexity, request volume, and number of instrumented libraries. Use sampling configuration to reduce overhead in high-traffic services.
✅ Best Practices
✓ DO: Initialize before Flask/Django app creation
Call tracekit.init() before creating your Flask or Django app so auto-instrumentation patches are applied to all modules.
import tracekit
tracekit.init() # Before app creation
from flask import Flask
app = Flask(__name__)
✓ DO: Use environment variables for API keys
Store your API key in TRACEKIT_API_KEY rather than hardcoding it, and read it with os.getenv() when calling tracekit.init().
✓ DO: Use atexit for cleanup in non-framework apps
For scripts and CLI tools that don't use Flask/Django, register atexit to flush pending spans on exit.
import atexit
import tracekit
sdk = tracekit.init()
atexit.register(sdk.shutdown)
✓ DO: Enable code monitoring in staging first
Test breakpoint capture and snapshot behavior in a staging environment before rolling out to production.
✓ DO: Use sampling in high-traffic services
Set TRACEKIT_SAMPLE_RATE to a value below 1.0 for services handling thousands of requests per second to reduce overhead without losing visibility.
✓ DO: Set meaningful service names
Use TRACEKIT_SERVICE_NAME to give your service a descriptive name that makes it easy to identify in the trace viewer.
✗ DON'T: Import route modules before tracekit.init()
Auto-instrumentation patches libraries at import time. If you import requests, psycopg2, or route modules before calling tracekit.init(), those modules won't be instrumented.
✗ DON'T: Create spans for every function
Trace boundaries like HTTP handlers, database calls, and external service calls. Instrumenting internal helper functions adds noise and overhead without useful insight.
✗ DON'T: Add high-cardinality attributes
Avoid using user IDs, request IDs, or session tokens as span attributes. These create excessive unique time series and degrade query performance.
✗ DON'T: Disable TLS in production
The TRACEKIT_INSECURE flag is for local development only. Always use TLS when sending traces to TraceKit in production.
Next Steps
- Add auto-instrumentation libraries for components you use (Redis, Celery, MongoDB, etc.)
- Explore your traces on the Traces page to identify performance bottlenecks
- Optionally add custom spans for specific business logic you want to measure
- Configure sampling for high-traffic services to reduce overhead
- Set up alert rules to get notified when issues occur