OpenTelemetry Migration Guide
A phased approach to migrating from proprietary APM agents to OpenTelemetry, with code examples for Go, Node.js, and Python.
Assess Your Current Setup
Before writing any migration code, take inventory of what you're migrating from. Most teams underestimate the surface area of their existing instrumentation.
Audit Your Current Instrumentation
- Vendor SDK initialization -- Identify every service that imports a vendor SDK (Datadog dd-trace, New Relic agent, Dynatrace OneAgent). Note the version and configuration for each.
- Custom spans and metrics -- Search for vendor-specific API calls (e.g., `tracer.StartSpan()`, `newrelic.StartTransaction()`). These require manual conversion to OTel equivalents.
- Proprietary agents -- Some vendors install OS-level agents that auto-instrument without code changes. These must be replaced with OTel auto-instrumentation libraries.
Identify Vendor Lock-in Points
- Custom dashboards and alerts -- Document every dashboard and alert rule. You'll need to recreate these in your new backend or use the OTel Collector to fan out to both old and new backends during migration.
- Proprietary query languages -- If your team writes DQL, NRQL, or similar queries, plan time to convert these to your new backend's query format.
- Vendor-specific attributes -- Some vendors add custom span attributes (e.g., `dd.service`, `nr.transactionName`) that downstream systems depend on. Map these to OTel semantic conventions.
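One way to handle that attribute mapping is in the Collector rather than in code. As a sketch, the Collector's attributes processor can copy a legacy vendor key onto its OTel equivalent during the transition (the `dd.service` mapping below is an assumption; use the keys your vendor actually emits):

```yaml
processors:
  attributes/vendor-map:
    actions:
      # Copy the legacy Datadog service attribute onto the OTel key.
      # "insert" only sets service.name if it is not already present.
      - key: service.name
        from_attribute: dd.service
        action: insert
```

Add the processor to your traces pipeline so both migrated and unmigrated services report consistent attribute names.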
Choose Your Migration Strategy
There are two approaches to migration. We strongly recommend the gradual approach for production systems.
Big-Bang Migration (Not Recommended)
Replace all vendor SDKs simultaneously in a single release. This is faster but high-risk: if anything breaks, every service is affected and rollback is complex.
Gradual Migration with OTel Collector Proxy (Recommended)
Deploy the OpenTelemetry Collector as a proxy that accepts telemetry in both vendor-native and OTLP formats. During migration, both old and new instrumentation coexist:
```yaml
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  # Accept vendor-native formats during transition
  datadog:
    endpoint: 0.0.0.0:8126
exporters:
  otlp:
    endpoint: your-backend:4317
service:
  pipelines:
    traces:
      receivers: [otlp, datadog]
      exporters: [otlp]
```
This lets you migrate one service at a time while maintaining full trace visibility. Services using the old SDK and services using OTel appear in the same traces because the Collector normalizes everything to OTLP.
Install OpenTelemetry SDKs
Replace vendor SDK initialization with OTel SDK initialization. Here are before/after examples for the three most common languages.
Go: Before (Vendor SDK)
```go
import "gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"

func main() {
	tracer.Start(tracer.WithServiceName("my-service"))
	defer tracer.Stop()
}
```
Go: After (OpenTelemetry)
```go
import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.24.0"
)

func main() {
	ctx := context.Background()
	exp, err := otlptracegrpc.New(ctx)
	if err != nil {
		log.Fatalf("failed to create OTLP exporter: %v", err)
	}
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp),
		sdktrace.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String("my-service"),
		)),
	)
	otel.SetTracerProvider(tp)
	defer func() { _ = tp.Shutdown(ctx) }()
}
```
Node.js: Before (Vendor SDK)
```js
const tracer = require('dd-trace').init({
  service: 'my-service'
});
```
Node.js: After (OpenTelemetry)
```js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');

const sdk = new NodeSDK({
  serviceName: 'my-service',
  traceExporter: new OTLPTraceExporter(),
});
sdk.start();
```
Python: Before (Vendor SDK)
```python
from ddtrace import tracer

tracer.configure(hostname='localhost', port=8126)
```
Python: After (OpenTelemetry)
```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource

resource = Resource.create({"service.name": "my-service"})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)
```
Migrate Instrumentation
After replacing SDK initialization, convert your custom instrumentation and update propagation headers.
Converting Custom Spans
Replace vendor-specific span creation with OTel API calls. The concepts map directly:
- `tracer.StartSpan("operation")` becomes `tracer.Start(ctx, "operation")`
- `span.SetTag("key", value)` becomes `span.SetAttributes(attribute.String("key", value))`
- `span.SetError(err)` becomes `span.RecordError(err); span.SetStatus(codes.Error, err.Error())`
Updating Propagation Headers
Most vendor SDKs use proprietary or B3 propagation headers. OTel defaults to W3C Trace Context (traceparent/tracestate). During migration, configure the OTel SDK to accept both formats:
```go
// Go: Accept both B3 and W3C during migration
import (
	"go.opentelemetry.io/contrib/propagators/b3"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/propagation"
)

otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(
	propagation.TraceContext{}, // W3C (new)
	b3.New(),                   // B3 (legacy)
))
```
Dual-Write Period
During migration, some services send to the old backend and some to the new. The OTel Collector can fan out to both backends simultaneously, ensuring no data gaps:
```yaml
exporters:
  otlp/new-backend:
    endpoint: new-backend:4317
  datadog:
    api:
      key: ${DD_API_KEY}
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/new-backend, datadog]
```
Validating Trace Continuity
After migrating each service, verify that traces still connect across boundaries. Look for orphan spans (spans with no parent in the trace) -- these indicate broken context propagation at the boundary between migrated and unmigrated services.
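The orphan check can be automated against exported trace data. This is a minimal sketch with a simplified span shape; a real check would pull OTLP data from your backend's API:

```go
package main

import "fmt"

type span struct {
	SpanID   string
	ParentID string // empty for the root span
}

// findOrphans returns spans whose parent ID is set but absent from the
// trace -- the signature of broken context propagation.
func findOrphans(trace []span) []span {
	ids := make(map[string]bool, len(trace))
	for _, s := range trace {
		ids[s.SpanID] = true
	}
	var orphans []span
	for _, s := range trace {
		if s.ParentID != "" && !ids[s.ParentID] {
			orphans = append(orphans, s)
		}
	}
	return orphans
}

func main() {
	trace := []span{
		{SpanID: "a"},                      // root
		{SpanID: "b", ParentID: "a"},       // child, parent present
		{SpanID: "c", ParentID: "missing"}, // orphan: parent never arrived
	}
	fmt.Println(len(findOrphans(trace))) // prints 1
}
```

Running this over traces that cross a migrated/unmigrated boundary tells you immediately whether the composite propagator is doing its job.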
Validate and Cut Over
Once all services are migrated, validate data quality and decommission the old pipeline.
Compare Trace Data
- Span counts -- Compare total span counts between old and new backends for the same time period. A significant drop indicates missing instrumentation.
- Latency distribution -- P50/P95/P99 latencies should match within 5%. Large discrepancies suggest missing spans or incorrect timing.
- Error rates -- Error classification may differ between vendors. Verify that error spans in OTel match error events in the old backend.
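The latency comparison above can be scripted. A sketch, assuming you have sampled latencies (in milliseconds) from each backend; fetching the samples is left out:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0-100) using the nearest-rank method.
func percentile(ms []float64, p float64) float64 {
	s := append([]float64(nil), ms...)
	sort.Float64s(s)
	rank := int(math.Ceil(p/100*float64(len(s)))) - 1
	if rank < 0 {
		rank = 0
	}
	return s[rank]
}

// withinTolerance reports whether observed is within tol (e.g. 0.05 for 5%)
// of the baseline value.
func withinTolerance(base, observed, tol float64) bool {
	return math.Abs(observed-base) <= tol*base
}

func main() {
	oldP95 := percentile([]float64{12, 15, 18, 22, 250}, 95)
	newP95 := percentile([]float64{11, 16, 19, 21, 240}, 95)
	fmt.Println(withinTolerance(oldP95, newP95, 0.05)) // prints true (4% drift)
}
```

Run the same check for P50 and P99; a pass at P50 but a failure at P99 usually means the slowest code paths lost instrumentation.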
Verify No Data Loss
Run both pipelines in parallel for at least one week. Sample 100 traces per day and manually verify they're complete in both backends. Focus on traces that cross 3+ service boundaries -- these are most likely to have propagation issues.
Decommission Old Agents
- Remove vendor SDK imports and dependencies from every service
- Remove vendor agent sidecar containers or host agents
- Remove vendor-specific environment variables (`DD_AGENT_HOST`, `NEW_RELIC_LICENSE_KEY`, etc.)
- Remove the vendor receiver from the OTel Collector config
- Update CI/CD pipelines to remove vendor SDK installation steps
Update Dashboards and Alerts
Recreate critical dashboards using OTel semantic convention attribute names. The naming differs from vendor conventions: for example, http.request.method (OTel) vs http.method (some vendors). Update alert queries to use the new attribute names and verify alert thresholds still trigger correctly.
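If redeploying every service to fix attribute names is impractical, the rename can also be bridged at the Collector. A sketch using the attributes processor, assuming the `http.method` to `http.request.method` case from above:

```yaml
processors:
  attributes/semconv-rename:
    actions:
      # Copy the legacy key onto the OTel semantic-convention key.
      # Drop the old key once all dashboards and alerts are updated.
      - key: http.request.method
        from_attribute: http.method
        action: insert
```

This keeps new dashboards working while older services still emit the legacy attribute name.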