Introduction

One of the most time-consuming parts of application development is debugging. This already hard part of our day-to-day life can easily become a nightmare if the application crashes only in production and you cannot reproduce the crash on your local machine. Fortunately, there are .NET tools that come in really handy in these scenarios.

The Problem

To demonstrate what I am talking about, imagine that we have an API that serves weather data. Yes, the default one from the .NET Web API project template.

[Screenshot: Visual Studio initial screen]

This application crashes after a week or two on the production machine and you cannot figure out why.

Dotnet Tools

In case you have been living under a rock, the .NET team released the so-called .NET tools. A .NET tool is a special NuGet package that you can install and use as a normal CLI application. If you install it as a global tool, it is placed in a directory that is on your PATH, so you can run it from anywhere on your machine. How cool is that?

Find the Problem in Our Application

We will use some .NET tools to figure out why our application behaves the way it does. This approach works even if you do not have the source code of the application, although having it certainly helps.

1. Install dotnet-counters Tool

dotnet-counters is a performance monitoring tool. It can display performance counter values that are published via the EventCounter API. For example, you can quickly monitor things like CPU usage.

However, we will only use it to identify our process. If you already know your process ID, you can skip this step.

Open your terminal and run:

dotnet tool install --global dotnet-counters

This is how you install .NET tools. Pay attention to the --global flag: it tells .NET to install the package as a global tool.

You can run:

dotnet-counters ps

This command lists every .NET application that is currently running on the machine. In my case there is only one.

[Screenshot: dotnet-counters ps output listing the running .NET process]

As we can see, the PID of our crashing application is 3552.
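If you want to script this step instead of reading the PID off the screen, a small filter can pull it out of the listing. This is only a minimal sketch: pid_of is a hypothetical helper name, and it assumes that each line of the dotnet-counters ps output starts with the PID followed by the process name, which may differ between tool versions.

```shell
# Hypothetical helper: print the PID of every listed process whose name equals $1.
# Assumption: each input line looks like "<pid> <process name> <path> ...".
pid_of() {
    awk -v name="$1" '$1 ~ /^[0-9]+$/ && $2 == name { print $1 }'
}

# Usage (needs dotnet-counters on the PATH):
#   dotnet-counters ps | pid_of DebugAppExample
```

Depending on how the application was started, the process name may be dotnet rather than the application name, so check the raw listing first.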

2. Create a Memory Dump

Once we have identified the application that we want to debug, we can create a memory dump. A memory dump takes all the information in RAM that is associated with our application and writes it to a file so that we can analyze it later.

Before taking a memory dump we need to install another tool:

dotnet tool install --global dotnet-dump

dotnet-dump is a way to collect and analyze Windows and Linux dumps without involving a native debugger such as lldb on Linux.

After installing the tool, we can run it against our process:

dotnet-dump collect -p 3552

When it finishes writing, a file is generated and dotnet-dump outputs its location.

[Screenshot: dotnet-dump collect output showing the path of the dump file]

In my case the location is /mnt/c/Users/hasan/core_20210216_001752

The location looks weird because I am using WSL2. If you want more information, you can read about Windows Subsystem for Linux.
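By default dotnet-dump picks the file name and location itself. If you would rather control where the dump ends up, the collect command accepts an -o option with an output path. Here is a minimal sketch, assuming a POSIX shell. The collect_dump helper, the DOTNET_DUMP override variable and the file naming scheme are all hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: dump the process with the given PID into a chosen directory.
#   $1 = PID, $2 = output directory (defaults to the current directory)
# DOTNET_DUMP is an illustrative override for the executable to call.
collect_dump() {
    pid="$1"
    out_dir="${2:-.}"
    # Put the PID and a timestamp in the name so repeated dumps do not collide.
    out_file="$out_dir/core_${pid}_$(date +%Y%m%d_%H%M%S)"
    "${DOTNET_DUMP:-dotnet-dump}" collect -p "$pid" -o "$out_file"
}

# Usage (needs dotnet-dump on the PATH):
#   collect_dump 3552 /tmp
```

Keeping the executable overridable makes it possible to dry-run the helper, for example with DOTNET_DUMP=echo, before pointing it at a production process.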

3. Analyze the Memory Dump

Besides creating the dump, dotnet-dump can also analyze it:

dotnet-dump analyze /mnt/c/Users/hasan/core_20210216_001752

The last part of the command, after analyze, is the path to the dump file from the previous step. This should give us a nice pointy arrow prompt where we can execute some commands.

[Screenshot: the dotnet-dump analyze prompt]

This tool is actually the SOS.dll debugging extension from WinDbg, integrated as a .NET tool. You can execute many of the same commands that you can execute from WinDbg itself with SOS loaded.

Here is the documentation of SOS.dll.

Basically, SOS lets you view information about code that is running inside the .NET Core runtime. So, what can we do with this? We can print the entire heap and check what is in there:

dumpheap

This command can take quite some time if your heap is large. Unfortunately, printing the entire heap is not that useful. We only need the exceptions. Fortunately, we can apply a filter that prints only the objects whose type name contains "Exception":

dumpheap -type "Exception"

This results in far fewer objects than the first command. From here on we can do some detective work. We know that our application throws this error occasionally, so it has happened more than once, which means that we can ignore all exceptions that have a Count of 1.

[Screenshot: dumpheap statistics, exception types with a Count of 1]

And focus only on the exceptions that have a Count of 2 or greater.

[Screenshot: dumpheap statistics, exception types with a Count of 2 or more]
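The same "ignore everything with a Count of 1" rule can be applied mechanically instead of by eye. This is a rough sketch, not part of the original workflow: repeated_types is a hypothetical helper, and it assumes the statistics rows have the columns MT, Count, TotalSize and Class Name in that order. Check the output of your own dotnet-dump version, since the layout can change.

```shell
# Hypothetical helper: keep only the statistics rows whose Count is 2 or more.
# Assumption: rows look like "<MT> <Count> <TotalSize> <Class Name>".
# Header lines, the totals line and per-object rows (3 columns) are dropped.
repeated_types() {
    awk 'NF >= 4 && $1 ~ /^[0-9a-fA-F]+$/ {
        count = $2
        gsub(/,/, "", count)   # tolerate thousands separators such as 1,024
        if (count ~ /^[0-9]+$/ && count + 0 >= 2) print
    }'
}

# Usage, assuming your dotnet-dump version supports the -c option of analyze:
#   dotnet-dump analyze <dump file> -c "dumpheap -stat -type Exception" -c "exit" | repeated_types
```

The filter only narrows the list. Deciding which of the remaining types comes from application code is still a judgment call.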

We can check each of these objects individually, since there are not that many, or we can continue our detective work. From here on everything boils down to experience. Which of these exceptions looks like it was thrown by application code? Can you guess? If I did not already know the answer, I would start my investigation with System.ArgumentOutOfRangeException, which appears 6 times. The other types look like internal .NET types.

[Screenshot: the System.ArgumentOutOfRangeException row with a Count of 6]

Let’s see all 6 objects of this type that are in the heap:

dumpheap -mt 00007fb2b17dcb70

With -mt (short for method table) we can display all objects that are associated with a given method table. In this case that is every object of type System.ArgumentOutOfRangeException.

[Screenshot: dumpheap -mt output listing the 6 objects]

Each object has the same size, from which we can deduce that they are the same and that it does not matter which one we pick. Let’s just pick the first one and see what is inside:

dumpobj 00007fae5c00b2c8

[Screenshot: dumpobj output for the selected object]

This should look familiar. These are the fields of the System.ArgumentOutOfRangeException type. We can drill down further into each field to see what is inside, but since this is an Exception type, SOS has a dedicated command for it, printexception (pe for short), which takes the object address and prints the relevant information about the exception:

[Screenshot: exception details printed by SOS]

Here we can see the exception message and the stack trace showing where our exception happened. It clearly says that the exception is thrown in DebugAppExample -> Controllers -> WeatherForecastController -> Get method. Let’s check the code:

[HttpGet]
public IEnumerable<WeatherForecast> Get()
{
    var rng = new Random();

    // Deliberate fault for the demo: throws on roughly one request in five.
    if (rng.Next(0, 100) % 5 == 0)
    {
        throw new ArgumentOutOfRangeException("Specified argument was out of range of valid values");
    }

    return Enumerable.Range(1, 5).Select(index => new WeatherForecast
    {
        Date = DateTime.Now.AddDays(index),
        TemperatureC = rng.Next(-20, 55),
        Summary = Summaries[rng.Next(Summaries.Length)]
    })
    .ToArray();
}

We now know which method to look in for the error, and we know the exception message. From here on it is straightforward.

Conclusion

We took a look at .NET tools and used some of them to debug our live application. Thanks to the integration of SOS.dll, we can analyze memory dumps on both Linux and Windows.

You can check the full source code of the demo application on my GitHub.

What are your ways of debugging live applications whose behavior you cannot recreate locally? Share with me in the comments or send a tweet.