Root Cause Analysis of Outliers

Thundra helps users monitor their serverless applications, which are the epitome of highly distributed, small black boxes. Finding and troubleshooting problems in your functions is hard. Thundra provides details for each Lambda function and displays its invocation metrics so that you can locate problems. With the Performance Analysis page, you can detect outlier invocations and check their downstream services in two clicks.

To analyze your monitoring data, first select the function whose outliers you want to detect on the Functions page, then navigate to the Performance Analysis tab.

The Performance Analysis tab summarizes the invocations of your Lambda function. The information is presented as a heatmap and graphs that make the invocation data easier to read and explore. From these charts, you can detect problematic invocations, dive into their performance, and isolate the issue.

In the heatmap section, the invocations of a function are plotted as time against duration. Each invocation falls into a square cell according to the time interval and the duration interval it belongs to. As shown in the legend, a darker cell indicates a higher invocation count within that interval.
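The binning behind such a heatmap can be sketched in a few lines of Python. This is only an illustration of the idea, not Thundra's implementation: the function name `heatmap_cells`, the bucket sizes, and the sample data are all assumptions.

```python
from collections import Counter

def heatmap_cells(invocations, time_bucket_s=60, duration_bucket_ms=100):
    """Count invocations per (time bucket, duration bucket) cell.

    `invocations` is an iterable of (start time in seconds, duration in ms).
    The bucket sizes are illustrative defaults, not Thundra's values.
    """
    cells = Counter()
    for start_s, duration_ms in invocations:
        cell = (int(start_s // time_bucket_s), int(duration_ms // duration_bucket_ms))
        cells[cell] += 1  # a higher count corresponds to a darker cell
    return cells

# Hypothetical sample: (start time in seconds, duration in milliseconds)
sample = [(5, 40), (20, 70), (30, 950), (65, 45), (70, 60), (110, 80)]
cells = heatmap_cells(sample)
```

In this sample, the single invocation that lands in a high duration bucket is the kind of sparse, far-off cell that stands out as an outlier.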

When it comes to identifying outliers and diving deep into root causes, the selection mode of the heatmap is very helpful. Choose selection mode and select an area with your cursor to gain more insight into the outliers you find interesting. The charts, invocation list, and resource usages are updated to reflect the invocations in the selected area of the heatmap.
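Conceptually, selecting an area amounts to filtering invocations by a time range and a duration range. The sketch below assumes a hypothetical record shape (`start_s`, `duration_ms`) and a hypothetical helper name, and does not reflect how Thundra stores or queries data.

```python
def select_invocations(invocations, t_min, t_max, d_min, d_max):
    """Return the invocations inside the selected rectangle (bounds inclusive)."""
    return [
        inv for inv in invocations
        if t_min <= inv["start_s"] <= t_max and d_min <= inv["duration_ms"] <= d_max
    ]

# Hypothetical invocation records
invocations = [
    {"id": "a", "start_s": 10, "duration_ms": 45},
    {"id": "b", "start_s": 30, "duration_ms": 950},
    {"id": "c", "start_s": 50, "duration_ms": 1200},
    {"id": "d", "start_s": 200, "duration_ms": 980},
]

# Select the slow invocations in the first minute
selected = select_invocations(invocations, t_min=0, t_max=60, d_min=900, d_max=2000)
selected_ids = [inv["id"] for inv in selected]
```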

The resource chart section shows at a glance the services used by a specific function and its invocations, along with their usage. Using the resource chart, you can see the breakdown of service usage and which services consume the most time relative to the total time.
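The breakdown described here is a per-service share of the total invocation time. The following is a minimal sketch under assumed inputs: the span tuples, service names, and function name are hypothetical and are not taken from Thundra's data model.

```python
def service_breakdown(spans, total_ms):
    """Return each service's share of the total time, largest first.

    `spans` is an iterable of (service name, time spent in ms).
    """
    totals = {}
    for service, ms in spans:
        totals[service] = totals.get(service, 0) + ms
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {service: ms / total_ms for service, ms in ranked}

# Hypothetical spans for one 1000 ms invocation
spans = [("DynamoDB", 600), ("S3", 100), ("DynamoDB", 100), ("SQS", 50)]
shares = service_breakdown(spans, total_ms=1000)
slowest = next(iter(shares))  # the service that consumed the most time
```

The shares need not sum to 1, because the remainder is time spent in the function's own code rather than in downstream services.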

In the duration and count chart, you can often see why the selected outlier happened just by looking at the error and cold start counts or the duration changes within a time range.
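The same reasoning can be sketched by aggregating invocations per time bucket and comparing error counts, cold start counts, and average duration. The field names (`error`, `cold_start`) and the bucket size are assumptions made for illustration only.

```python
from collections import defaultdict

def summarize_by_bucket(invocations, bucket_s=60):
    """Aggregate count, errors, cold starts, and average duration per time bucket."""
    buckets = defaultdict(lambda: {"count": 0, "errors": 0, "cold_starts": 0, "total_ms": 0})
    for inv in invocations:
        b = buckets[inv["start_s"] // bucket_s]
        b["count"] += 1
        b["errors"] += int(inv["error"])
        b["cold_starts"] += int(inv["cold_start"])
        b["total_ms"] += inv["duration_ms"]
    return {
        key: {**b, "avg_ms": b["total_ms"] / b["count"]}
        for key, b in sorted(buckets.items())
    }

# Hypothetical invocation records
invocations = [
    {"start_s": 10, "duration_ms": 50, "error": False, "cold_start": False},
    {"start_s": 40, "duration_ms": 70, "error": False, "cold_start": False},
    {"start_s": 70, "duration_ms": 900, "error": False, "cold_start": True},
    {"start_s": 95, "duration_ms": 1100, "error": True, "cold_start": False},
]
summary = summarize_by_bucket(invocations)
```

Here the second bucket shows both a jump in average duration and nonzero error and cold start counts, which is the pattern to look for when explaining an outlier.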

All the invocations in the selected area are displayed at the bottom. To dive deeper into an invocation and analyze the outlier, click on it. You can then see the trace data and logs, among other detailed information, for the selected invocation.

The following picture shows how errors and resource usages are displayed in a single view for an outlier invocation.

How to configure your serverless system for metrics?

You can use the performance summary for outlier detection and root cause analysis. To learn how to configure your system in each environment, visit our documentation pages for Java, Python, Node.js, .NET, and Go.