Observability

RE: Good Logging

Most people do logs wrong. Here's my opinion on how to do them correctly.

I read an excellent article today that covers logging standards quite well. However, a few points are missing (ones that come up for me regularly) that I would like to address. In addition to the points it outlines, I'd like to include:

Use appropriate log levels
TRACE can never contain too many log messages
Do not log non-text data or secrets to logs

Use appropriate log levels

When selecting log levels, it is essential to choose the correct one. My definitions are as follows:

PANIC: a critical error has occurred, and the program has stopped executing.
ERROR: a critical error has occurred and the currently running process has failed to complete correctly, but the program is still running.
WARN: a specific operation has been unable to complete, but the process has continued. Some remediation may need to take place as a result.
INFO: a message describing a critical event or state change within the program (e.g. a record had a CRUD operation performed on it, or a batch has finished).
DEBUG: messages generated from inside a function that contain variable information or indicate a critical point in the function (e.g. "24 records returned" or "generating id").
TRACE: messages generated from inside a function that indicate a known step has occurred; may contain non-important variable information.
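
To make the definitions concrete, here is a minimal sketch of how they might map onto Python's standard `logging` module. Python has no built-in TRACE or PANIC level, so the numeric value 5 for TRACE and the renaming of CRITICAL to PANIC are my assumptions, and the `orders` logger, `import_batch` and `run` are hypothetical names used only for illustration.

```python
import logging

# Assumptions: TRACE sits below DEBUG (10); CRITICAL (50) is displayed as PANIC
# and WARNING (30) as WARN to match the level names used in this post.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")
logging.addLevelName(logging.WARNING, "WARN")
logging.addLevelName(logging.CRITICAL, "PANIC")

log = logging.getLogger("orders")  # hypothetical logger name


def import_batch(records):
    """Hypothetical batch job showing which level fits which event."""
    log.log(TRACE, "import_batch: entered")         # TRACE: a known step occurred
    log.debug("%d records received", len(records))  # DEBUG: variable information
    skipped = 0
    for record in records:
        if "id" not in record:
            # WARN: one operation could not complete, the process continues
            log.warning("record skipped: missing id")
            skipped += 1
            continue
        log.log(TRACE, "import_batch: record validated")
    imported = len(records) - skipped
    # INFO: a state change worth knowing about in Production
    log.info("batch finished: %d imported, %d skipped", imported, skipped)
    return imported


def run(records):
    """Run one batch; a failure ends this batch but not the program."""
    try:
        return import_batch(records)
    except Exception:
        # ERROR: the current process failed, the program is still running
        log.exception("batch failed")
        return None
```

PANIC would be reserved for the outermost handler, logged with `log.critical(...)` immediately before the program exits.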

Errors, warnings, and informational entries should always be in Production logs. Use these entries to determine when the program is running abnormally, or alternatively when it is running correctly. You should be able to understand the current state of your application (or at least a thread of it) by reading the logs. If you cannot, they are not useful. You should only enable debug logs in Production when there is a critical issue and you do not have enough context to troubleshoot correctly; otherwise, they should be off. Trace logs should never be enabled in Production and should only be used in development or testing to ensure behaviour is correct (i.e. as an aid to manual testing).
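
One way to enforce this policy is to resolve the log level from the environment at startup. The sketch below is only an illustration: the `APP_ENV` and `LOG_LEVEL` environment variables and the function names are hypothetical, and clamping a TRACE request to DEBUG in Production (rather than rejecting it) is a design choice of mine, not a rule.

```python
import logging
import os

TRACE = 5  # assumption: custom level below logging.DEBUG (10)
logging.addLevelName(TRACE, "TRACE")

# Only these names can lower the level; anything else falls back to INFO,
# so INFO, WARN and ERROR entries are always emitted.
_VERBOSITY = {"TRACE": TRACE, "DEBUG": logging.DEBUG, "INFO": logging.INFO}


def resolve_level(environment, requested):
    """Return the effective log level for a hypothetical environment name."""
    level = _VERBOSITY.get(requested.upper(), logging.INFO)
    if environment == "production":
        # TRACE is never allowed in Production; DEBUG only when asked for.
        level = max(level, logging.DEBUG)
    return level


def configure_logging():
    """Configure the root logger from hypothetical environment variables."""
    environment = os.environ.get("APP_ENV", "development")
    requested = os.environ.get("LOG_LEVEL", "INFO")
    level = resolve_level(environment, requested)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    return level
```

With this in place, Production runs at INFO by default, and switching to DEBUG during an incident is a deliberate configuration change rather than a code change.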

TRACE can never contain too many log messages

The whole point of TRACE is that you need to be able to track precisely where execution has failed if an error or warning triggers in a non-obvious way. Ergo, TRACE messages should be very verbose, because you need them in exactly the circumstances where a breadcrumb is required. TRACE is not a verbose DEBUG, because DEBUG is used to display information that can change. TRACE shows information that doesn't change: it marks a static path through the code, which gives you a way to trace the source of an error.
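
As a rough sketch of the difference, the function below leaves a static TRACE breadcrumb after each step and uses DEBUG only for a value that changes between calls. The `invoice` logger, `build_invoice` and the shape of `order` are hypothetical, and the TRACE level value of 5 is an assumption, since Python's `logging` has no built-in TRACE.

```python
import logging

TRACE = 5  # assumption: custom level below logging.DEBUG (10)
logging.addLevelName(TRACE, "TRACE")

log = logging.getLogger("invoice")  # hypothetical logger name


def build_invoice(order):
    """Hypothetical function instrumented with static TRACE breadcrumbs."""
    log.log(TRACE, "build_invoice: start")
    customer = order["customer"]
    log.log(TRACE, "build_invoice: customer loaded")
    lines = order["lines"]
    # DEBUG carries information that changes from call to call.
    log.debug("%d invoice lines", len(lines))
    total = sum(line["qty"] * line["price"] for line in lines)
    log.log(TRACE, "build_invoice: total calculated")
    return {"customer": customer, "total": total}
```

If an order line is missing its price, the last TRACE entry in the log is "build_invoice: customer loaded", which narrows the failure to the total calculation without needing to inspect any variable values.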

Do not log non-text data or secrets to logs

This one is very straightforward: log files are not a dumping ground. They have a specific purpose, and that purpose is to help you understand the state of your application and to pinpoint where errors occur. Do not log secrets; log whether they exist. Do not log non-text files into log files, because they are not readable there anyway. Instead:

store them on a filesystem
name them appropriately
open the file if you need to see its contents
archive them for posterity if you need to check something later

All of these solutions make your job more comfortable, and also make it easier to troubleshoot and investigate issues. Polluting log files makes everyone's jobs harder (including your own). Your logs need to be easy to read; otherwise, they become problems in their own right instead of aids to help you fix issues.
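
Both rules in this section can be sketched in a few lines. This is a minimal illustration rather than a complete solution: `describe_secret`, `store_payload`, the `uploads` logger and the example file name are all hypothetical.

```python
import logging
from pathlib import Path

log = logging.getLogger("uploads")  # hypothetical logger name


def describe_secret(name, value):
    """Return a log-safe description: whether the secret exists, never its value."""
    return f"{name} is {'set' if value else 'missing'}"


def store_payload(data: bytes, directory: Path, name: str) -> Path:
    """Write non-text data to the filesystem and log only where it went."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / name  # name the file so it can be found again later
    path.write_bytes(data)
    log.info("payload stored: path=%s bytes=%d", path, len(data))
    return path
```

A call such as `log.info(describe_secret("API_TOKEN", token))` records that the token was present without leaking it, and the log entry from `store_payload` tells you which file to open when you need to see the contents.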

Have I missed anything?

If you think I've missed anything off the list, or you have anything to add, feel free to contact me on Twitter @stophamotime.

Thanks for reading!
