Digging Observability Data with Thundra Queries

As an observability tool with automated instrumentation that also lets application teams add their own manual instrumentation, Thundra hosts rich data that can be used to understand applications better. Developers can ask questions of this data using Thundra's configurable and extensible query language.

Using Thundra queries, developers can:

  • Dig into the enriched function, invocation, trace, operation and log data and ask meaningful questions.

  • Save queries for later use, or share them with colleagues to foster a common view of the application.

  • Set up detailed alerts and assign a proper severity to each of them.

Let's go over the queries that you can write with Thundra's query language on different data models. Note that these example queries can save you time, but you can also build your own with Thundra's Query Helper without needing to know the syntax.

Applications

In this section, we'll go over the different queries that you can use to explore your functions from different aspects.

You can use the query helper for applications to build the queries visually. See the related docs for Query Helper for Functions.

To list the applications in a specific region:

Region=eu-west-1 ORDER BY LastInvocationTime

To list the applications with at least a given number of timeouts (here, 10):

COUNT(Timeout) >=10 ORDER BY LastInvocationTime DESC

To list the applications by custom tags that you can set for your functions (serviceName is used as a custom tag here):

appTags.serviceName IN (user,team)
ORDER BY LastInvocationTime DESC

To learn how to use tags for applications, see this blog.

To combine two (or more) conditions, use the AND keyword. For example, the following query returns the applications whose serviceName tag is "user" and whose health, the percentage of healthy invocations ((normal - erroneous) / normal * 100), is below 95 percent.

appTags.serviceName=user AND
Health < 95
ORDER BY LastInvocationTime DESC
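
As a quick check of what the Health < 95 condition means, here is a minimal sketch of the health arithmetic. It reads the formula in the text as (normal - erroneous) / normal * 100; the function name and the sample counts are made up for illustration, and this is not Thundra's own implementation.

```python
def health_percent(normal: int, erroneous: int) -> float:
    """Health as read from the text: (normal - erroneous) / normal * 100.

    The parameter names mirror the formula in the text; how Thundra
    counts "normal" invocations is an assumption here.
    """
    if normal <= 0:
        raise ValueError("normal must be a positive invocation count")
    # Multiply before dividing to keep the arithmetic exact for whole counts.
    return (normal - erroneous) * 100 / normal


# Hypothetical counts: 200 invocations, 14 of them erroneous.
health = health_percent(200, 14)
print(health, health < 95)
```

With these sample counts the application would satisfy Health < 95 and therefore appear in the query result.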

You can sort the applications by various fields such as health, last invocation time, number of errors or invocations, and even cost. The following example runs the previous query but sorts the applications by their estimated cost in the period.

appTags.serviceName=user AND
Health < 95
ORDER BY EstimatedCost DESC
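
If you generate queries from scripts, the pattern above (conditions joined with AND, followed by an optional ORDER BY clause) can be assembled as a plain string. The sketch below is only an illustration: build_query is a hypothetical helper, not part of any Thundra SDK, and it assumes a query written on a single line is equivalent to the multi-line form shown above.

```python
from typing import Optional, Sequence


def build_query(
    conditions: Sequence[str],
    order_by: Optional[str] = None,
    descending: bool = True,
) -> str:
    """Join conditions with AND and append an optional ORDER BY clause.

    Hypothetical helper for illustration; it only builds a string and
    does not validate field names or talk to Thundra.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    query = " AND ".join(condition.strip() for condition in conditions)
    if order_by:
        query += f" ORDER BY {order_by}"
        if descending:
            query += " DESC"
    return query


print(build_query(["appTags.serviceName=user", "Health < 95"], order_by="EstimatedCost"))
# appTags.serviceName=user AND Health < 95 ORDER BY EstimatedCost DESC
```

Keeping the conditions as a list makes it easy to add or drop a filter without editing the rest of the query by hand.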

Invocations

In this section, we'll go over the different queries that you can use to explore the invocations of an application from different aspects.

You can use the query helper for invocations to build the queries visually. See the related docs for Query Helper for Invocations.

To list the latest erroneous invocations:

Erroneous=true ORDER BY LastInvocationTime DESC

To list the invocations with a specific error type:

ErrorType=DemoIllegalAccessException
ORDER BY LastInvocationTime DESC

To list the invocations by business tags that you can set for your invocations (user.id is used as a custom tag here):

tags.user.id="1"
ORDER BY LastInvocationTime DESC

To learn how to use tags for invocations, see this blog.

To combine two (or more) conditions, use the AND keyword. For example, the following query returns the invocations whose "user.id" tag is "1" and whose duration is more than a second (1000 ms).

tags.user.id="1" AND
Duration > 1000
ORDER BY LastInvocationTime DESC

You can sort the invocations by various fields such as duration and last invocation time. The following example runs the previous query but sorts the invocations by duration, from longest to shortest.

tags.user.id="1" AND
Duration > 1000
ORDER BY Duration DESC

Unique Traces

As you may know, a unique trace represents a distinct business flow that occurs multiple times in a distributed architecture. For example, the same 5 Lambda functions can be triggered asynchronously whenever an e-commerce user adds an item to the cart. The asynchronous chain of invocations that results is defined as a "unique trace" in Thundra. In this section, we'll go over the different queries that you can use to explore unique traces from different aspects.
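
To make the idea concrete, the sketch below groups individual traces by the chain of applications they pass through, so that repeated occurrences of the same flow collapse into one entry with a count. This is only a conceptual illustration: the record layout and application names are made up, and using the ordered list of application names as the grouping key is an assumption, not a description of how Thundra identifies unique traces.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical trace records: each trace lists the applications it invoked, in order.
traces: List[Dict[str, object]] = [
    {"id": "t1", "apps": ["cart-api", "inventory", "notifier"]},
    {"id": "t2", "apps": ["cart-api", "inventory", "notifier"]},
    {"id": "t3", "apps": ["checkout-api", "payment"]},
]


def count_unique_traces(records: List[Dict[str, object]]) -> Counter:
    """Count how many traces share the same chain of applications."""
    signatures: List[Tuple[str, ...]] = [tuple(record["apps"]) for record in records]
    return Counter(signatures)


for signature, occurrences in count_unique_traces(traces).items():
    print(" -> ".join(signature), occurrences)
```

Here the two add-to-cart traces count as a single flow that occurred twice, which is the kind of aggregate that conditions such as COUNT(Error) or AVG(Duration) operate on.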

You can use the query helper for unique traces to build the queries visually. See the related docs for Query Helper for Unique Traces.

To list the unique traces that contain an invocation of a specific application:

Name=user-team-api-java-lab ORDER BY COUNT(Error) DESC

To list the unique traces that have more errors than a threshold (in this case, 20):

COUNT(Error) >20 ORDER BY COUNT(Error) DESC

To list the unique traces in which there is at least one SQS interaction whose duration is longer than 100 ms:

resource.AWS-SQS.duration>100
ORDER BY COUNT(Trace) DESC

To combine two (or more) conditions, use the AND keyword. For example, the following query returns the unique traces that took longer than 2 seconds on average and that have at least one SQS interaction whose duration is longer than 100 ms.

resource.AWS-SQS.duration>100
AND AVG(Duration) >2000
ORDER BY COUNT(Error) DESC

Traces

A trace is one occurrence of a business flow in the system. In this section, we'll go over the different queries that you can use to explore traces from different aspects.

You can use the query helper for traces to build the queries visually. See the related docs for Query Helper for Traces.

To list the traces that have at least one cold-started invocation:

HasColdStart=true ORDER BY StartTime DESC

To list the traces whose duration is more than a second (1000 ms):

Duration > 1000 ORDER BY StartTime DESC

To combine two (or more) conditions, use the AND keyword. For example, the following query returns the traces that have at least one erroneous invocation and whose duration is more than 10 seconds (10000 ms).

HasError=true
AND Duration > 10000
ORDER BY StartTime DESC

Operations

An operation means a single interaction between a compute resource and any other resource. These can be referenced in the application from here. In this section, we will go over different queries that you can use to explore operations from different aspects.

You can use the query helper for operations to build the queries visually. See the related docs for Query Helper for Operations.

To list the erroneous operations:

Erroneous=true ORDER BY StartTime DESC

To list the operations that took more than a second (1000 ms):

Duration > 1000 ORDER BY StartTime DESC

To combine two or more conditions, use the AND keyword. For example, the following query returns the DynamoDB operations whose duration is longer than 1000 ms.

ResourceType=AWS-DynamoDB
AND
Duration > 1000 ORDER BY StartTime DESC

Logs

Logs are largely self-explanatory: they are every log line that your applications print. If you have not done so already, you can check this page.

You can use the query helper for logs to build the queries visually. See the related docs for Query Helper for Logs.

To list the log items whose log level is "INFO":

Level=INFO ORDER BY Time DESC

To list the log items whose message contains a given string, you can use a wildcard search:

Message=*err* ORDER BY Time DESC

To combine two or more conditions, use the AND keyword. For example, the following query returns the logs whose message contains the string "err" and whose log level is "INFO".

Message=*err* AND Level=INFO ORDER BY Time DESC
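
To illustrate what a pattern such as *err* selects, the sketch below applies the same kind of wildcard match to a few made-up log records in Python. It assumes that * stands for any run of characters and that matching is case-sensitive; the record layout is hypothetical, and Thundra's actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# Hypothetical log records, used only to show the effect of the pattern.
logs: List[Dict[str, str]] = [
    {"level": "INFO", "message": "retrying after error"},
    {"level": "ERROR", "message": "unhandled error in handler"},
    {"level": "INFO", "message": "request completed"},
]


def matching_logs(
    records: List[Dict[str, str]],
    pattern: str,
    level: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Keep records whose message matches the wildcard pattern and,
    if a level is given, whose level equals it."""
    return [
        record
        for record in records
        if fnmatchcase(record["message"], pattern)
        and (level is None or record["level"] == level)
    ]


# Local analogue of: Message=*err* AND Level=INFO
for record in matching_logs(logs, "*err*", level="INFO"):
    print(record["message"])
```

Only the first record passes both conditions: the second matches the pattern but has a different level, and the third has no "err" in its message.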