aboutsummaryrefslogtreecommitdiff
path: root/docs/tracing/opentracing.md
diff options
context:
space:
mode:
Diffstat (limited to 'docs/tracing/opentracing.md')
-rw-r--r--docs/tracing/opentracing.md112
1 files changed, 112 insertions, 0 deletions
diff --git a/docs/tracing/opentracing.md b/docs/tracing/opentracing.md
new file mode 100644
index 00000000..a2110bc0
--- /dev/null
+++ b/docs/tracing/opentracing.md
@@ -0,0 +1,112 @@
+Opentracing
+===========
+
+Dendrite extensively uses the [opentracing.io](http://opentracing.io) framework
+to trace work across the different logical components.
+
+At its most basic opentracing tracks "spans" of work; recording start and end
+times as well as any parent span that caused the piece of work.
+
+A typical example would be a new span being created on an incoming request that
+finishes when the response is sent. When the code needs to hit out to a
+different component a new span is created with the initial span as its parent.
+This would end up looking roughly like:
+
+```
+Received request Sent response
+ |<───────────────────────────────────────>|
+ |<────────────────────>|
+ RPC call RPC call returns
+```
+
+This is useful to see where the time is being spent processing a request on a
+component. However, opentracing allows tracking of spans across components. This
+makes it possible to see exactly what work goes into processing a request:
+
+
+```
+Component 1 |<─────────────────── HTTP ────────────────────>|
+ |<──────────────── RPC ─────────────────>|
+Component 2 |<─ SQL ─>| |<── RPC ───>|
+Component 3 |<─ SQL ─>|
+```
+
+This is achieved by serializing span information during all communication
+between components. For HTTP requests, this is achieved by the sender
+serializing the span into a HTTP header, and the receiver deserializing the span
+on receipt. (Generally a new span is then immediately created with the
+deserialized span as the parent).
+
+A collection of spans that are related is called a trace.
+
+
+Spans are passed through the code via contexts, rather than manually. It is
+therefore important that all spans that are created are immediately added to the
+current context. Thankfully the opentracing library gives helper functions for
+doing this:
+
+```golang
+span, ctx := opentracing.StartSpanFromContext(ctx, spanName)
+defer span.Finish()
+```
+
+This will create a new span, adding any span already in `ctx` as a parent to the
+new span.
+
+
+Adding Information
+------------------
+
+Opentracing allows adding information to a trace via three mechanisms:
+- "tags" ─ A span can be tagged with a key/value pair. This is typically
+ information that relates to the span, e.g. for spans created for incoming HTTP
+ requests could include the request path and response codes as tags, spans for
+ SQL could include the query being executed.
+- "logs" ─ Key/value pairs can be looged at a particular instance in a trace.
+ This can be useful to log e.g. any errors that happen.
+- "baggage" ─ Arbitrary key/value pairs can be added to a span to which all
+ child spans have access. Baggage isn't saved and so isn't available when
+ inspecting the traces, but can be used to add context to logs or tags in child
+ spans.
+
+
+See
+[specification.md](https://github.com/opentracing/specification/blob/master/specification.md)
+for some of the common tags and log fields used.
+
+
+Span Relationships
+------------------
+
+Spans can be related to each other. The most common relation is `childOf`, which
+indicates the child span somehow depends on the parent span ─ typically the
+parent span cannot complete until all child spans are completed.
+
+A second relation type is `followsFrom`, where the parent has no dependence on
+the child span. This usually indicates some sort of fire and forget behaviour,
+e.g. adding a message to a pipeline or inserting into a kafka topic.
+
+
+Jaeger
+------
+
+Opentracing is just a framework. We use
+[jaeger](https://github.com/jaegertracing/jaeger) as the actual implementation.
+
+Jaeger is responsible for recording, sending and saving traces, as well as
+giving a UI for viewing and interacting with traces.
+
+To enable jaeger a `Tracer` object must be instansiated from the config (as well
+as having a jaeger server running somewhere, usually locally). A `Tracer` does
+several things:
+- Decides which traces to save and send to the server. There are multiple
+ schemes for doing this, with a simple example being to save a certain fraction
+ of traces.
+- Communicating with the jaeger backend. If not explicitly specified uses the
+ default port on localhost.
+- Associates a service name to all spans created by the tracer. This service
+ name equates to a logical component, e.g. spans created by clientapi will have
+ a different service name than ones created by the syncapi. Database access
+ will also typically use a different service name.
+
+ This means that there is a tracer per service name/component.