Three Pillars of Observability

Thundra lets you monitor your serverless system and keep it running. With monitoring, you stay aware of known issues, such as a drop in a metric, so that when something goes wrong in your system you can see it and fix it. Observability goes further: it helps you understand problems that you were not aware of in advance. Thundra helps you operate your serverless system through the three pillars of observability: metrics, logs and traces.

Metrics

To give you full observability into your serverless system, Thundra collects metrics from your Lambda functions. With these metrics you can observe the behavior of a Lambda function over time intervals. The following metrics are available once you plug Thundra into your Lambda function. The set of metrics differs according to the runtime of your function:

  • Invocation Counts - Shows total invocations, invocations that ended with an error, and invocations that had a cold start.

  • Invocation Durations - Shows average, p99 and p95 duration of invocations for each time interval.

  • Memory Usages - Shows the average usage of the allocated memory within each time interval.

  • CPU Percentages - Shows average process CPU usage of selected function for each time interval.

  • Disk IO Bytes - Shows average disk IO bytes for selected function within each time interval.

  • Process Memory Usages - Shows average process memory usage of the selected function within each time interval.

  • Thread Count - Shows the active thread count for each time interval.

  • GC Counts - Shows how many garbage collections were executed for generation 0, generation 1 and generation 2.

  • Number of Goroutines (Go) - Shows the average number of goroutines in execution.

  • GC Pause (Go) - Shows the cumulative time spent in GC stop-the-world pauses since the program started, in milliseconds (ms).

  • Heap Stats (Go) - Shows heap stats on average:

    • Heap Allocation: MBs of allocated heap objects. "Allocated" heap objects include all reachable objects, as well as unreachable objects that the garbage collector has not yet freed.

    • Heap In Use: MBs in in-use spans. In-use spans have at least one object in them. These spans can only be used for other objects of roughly the same size.

  • Number of Allocated Heap Objects (Go) - Shows average number of allocated heap objects within each time interval.

  • Network IO Stats (Go) - Shows the average network IO, in KBs sent and received.

  • Network IO error counts (Go) - Shows the total number of errors that occurred while sending and receiving packets.

  • Loaded Class Counts (Java) - Shows the average of currently loaded class counts and the maximum of total loaded class counts for the selected function for each time interval.

  • Memory usages by pools (Java) - Shows the average memory usage of each JVM memory region in MB for the selected function for each time interval. The following memory regions are shown:

    • Eden space

    • Survivor space

    • Tenured generation

    • Metaspace

    • Code cache

  • GC durations (Java) - Shows the total minor and major GC durations in milliseconds for the selected function for each time interval.
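To make the Invocation Durations figures in the list above more concrete, the sketch below groups invocation durations into fixed time intervals and computes the average, p95 and p99 for each interval. It is only an illustration: the function names, the 60-second interval and the nearest-rank percentile method are assumptions made for this example, and Thundra's own aggregation may differ.

```python
from collections import defaultdict
from statistics import mean


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list (pct is an int, 1-100)."""
    # Integer ceiling of pct * n / 100, clamped to at least 1.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarize_durations(invocations, interval_seconds=60):
    """Group (timestamp_seconds, duration_ms) pairs by interval and summarize each group."""
    buckets = defaultdict(list)
    for timestamp, duration_ms in invocations:
        bucket_start = int(timestamp // interval_seconds) * interval_seconds
        buckets[bucket_start].append(duration_ms)

    summary = {}
    for bucket_start, durations in sorted(buckets.items()):
        durations.sort()
        summary[bucket_start] = {
            "avg": mean(durations),
            "p95": percentile(durations, 95),
            "p99": percentile(durations, 99),
        }
    return summary
```

A p99 value that sits far above the average in the same interval usually points to a small number of slow invocations, for example cold starts, that the average alone would hide.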

To see the metrics of a function, click the function on the Functions List page and select the Metrics tab on the Function Details page.
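For a sense of where runtime-level figures such as Thread Count and GC Counts can come from, the following sketch reads comparable values from the Python standard library. It does not show how Thundra's agents collect their metrics; the function name `runtime_snapshot` and the shape of the returned dictionary are assumptions for illustration only.

```python
import gc
import threading


def runtime_snapshot():
    """Collect a few process-level figures using only the standard library."""
    gc_stats = gc.get_stats()  # one dict per garbage-collector generation
    return {
        # Number of Thread objects currently alive in this process.
        "thread_count": threading.active_count(),
        # How many times each generation has been collected so far.
        "gc_collections": {
            f"generation_{index}": stats["collections"]
            for index, stats in enumerate(gc_stats)
        },
    }
```

Taking such a snapshot at the start and at the end of an invocation and subtracting the two gives per-invocation counts rather than totals since process start.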

Traces

Thundra lets you trace your serverless system end to end. Traces help you detect problematic areas and bottlenecks in your serverless architecture, so you can find and fix issues more easily.

OpenTracing Compatibility

Thundra agents support the OpenTracing API. If you are not familiar with OpenTracing, you can find detailed information in the OpenTracing documentation. When you plug a Thundra agent into your Lambda function, it automatically creates a root span for your function. You can access Thundra's tracer via OpenTracing's GlobalTracer instance to build new spans and add business information that increases observability.

You can also add your own spans around individual parts of your function that you want to monitor.
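As a rough sketch of adding a custom span with business information, the Python example below uses the `opentracing` package's `global_tracer()` and `start_active_span()` calls. It assumes that a tracer has been registered as the global tracer for the runtime in use; the handler name, the operation name `charge_customer`, the tag `order.id` and the event layout are hypothetical. The import is optional so that the sketch still runs, untraced, when the package is not installed.

```python
import contextlib

try:
    import opentracing  # OpenTracing API package for Python
except ImportError:  # keep the sketch runnable without the package
    opentracing = None


@contextlib.contextmanager
def business_span(operation_name, tags):
    """Open a span under the currently active span, if a tracer is available."""
    if opentracing is None:
        yield None
        return
    tracer = opentracing.global_tracer()
    # The new span becomes a child of whatever span is active, e.g. a root span.
    with tracer.start_active_span(operation_name, tags=tags) as scope:
        yield scope.span


def charge_handler(event, context):
    # "order_id" and "items" are hypothetical fields of the incoming event.
    order_id = event["order_id"]
    with business_span("charge_customer", {"order.id": order_id}):
        total = sum(item["price"] for item in event["items"])
    return {"order_id": order_id, "total": total}
```

Tags such as the order identifier make it possible to search for the traces that belong to one business entity, which is the kind of business information the paragraph above refers to.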

Example Trace Map

Distributed Traces

Thundra provides a "full tracing" capability, which means that it includes both distributed and local tracing. With distributed tracing, you can observe the interactions of your Lambda functions with other resources. In addition to this higher-level view, local tracing lets you observe your Lambda function at a deeper level, line by line.

Distributed tracing is supported with two unique capabilities in Thundra:

  • Multiple upstream transactions - A single downstream invocation can be the result of multiple upstream invocations. For example, a batch of messages may arrive through multiple invocations and be written to DynamoDB in one transaction. In Thundra, you can display the link from each upstream invocation to the downstream invocation.

  • Business transactions - Some invocations are related to each other through logical interactions, such as writing a message to a DynamoDB table after an approval. You can link such logically related flows with Thundra's distributed tracing.

Local Tracing

Local tracing is one of the unique features that Thundra provides. With local tracing, you can trace the methods in your Lambda function. The Trace Chart page displays the spans of an invocation, and the methods in your Lambda function are marked with the Method tag. Click a method to display its details. On the Summary tab, local tracing displays the local variables and parameters of a method within a Lambda function.

Thundra lists all traces in your serverless system on the Traces page. You can search through your traces using Thundra's detailed query capability, and you can display the trace map of a specific trace by clicking it in the list.

Using this trace map, you can see the health of the interactions between your Lambda function and other resources. By clicking your Lambda function, you get detailed insight into it: you can go through the function line by line, and for each line its logs, local variables and tags are displayed.

Logs

In addition to metrics and traces, logs are one of the most efficient ways to observe your serverless system and detect bottlenecks in your applications.

You can display your logs as a list on the Logs page and investigate them using Thundra's detailed query capability. In addition, logs, traces and metrics are aggregated, so you can display the metrics of a function and then the logs of any invocation of that function to get more detail. Furthermore, logs can be displayed for each span in a trace.

You can add logs to your function using console.log() or the Thundra logger. For more detailed information, see the runtime-specific pages: Java, Python, .NET, Go, NodeJS.
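As an illustration of what log statements inside a function can look like, here is a minimal Python sketch that uses the standard `logging` module. It is an assumption that these records reach the Logs page; whether and how they are captured depends on the runtime-specific setup described in the pages listed above. The logger name, the handler name and the event fields are hypothetical.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name
logger.setLevel(logging.INFO)


def log_handler(event, context):
    records = event.get("records", [])
    logger.info("received %d records", len(records))
    try:
        total = sum(record["amount"] for record in records)
    except KeyError:
        # logger.exception records the message together with the stack trace.
        logger.exception("record without an 'amount' field")
        raise
    logger.info("processed records, total=%s", total)
    return {"total": total}
```

Logging once at the start, once at the end and on every error path keeps the volume low while still leaving enough context to follow a single invocation.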