A snapshot contains information about the distributed state of execution of a service at a point in time. It includes the stack traces of all the goroutines currently alive across all the processes that are part of the snapshot (both goroutines that are currently running on-CPU and the many more goroutines that are usually blocked waiting for some asynchronous condition), plus variable data captured from different stack frames (i.e. functions currently being executed by some goroutine).
Snapshots can answer questions about the current state of a distributed system:
– what requests are in flight and how long has each of them been running?
– what’s the status of my background jobs?
– what is a particular request blocked on?
A Side-Eye snapshot is like a goroutine dump (think output of /debug/pprof/goroutine?debug=2) on steroids. Besides enumerating all the threads of execution in the system, you can also ask for arbitrary data to be collected whenever specific functions are encountered on a stack. And, as with all the other data collected by Side-Eye, the tool generates links between semantically related parts of a snapshot. For example, Side-Eye automatically links a client-side goroutine blocked on a network request with the server-side handler. Or, you can navigate from a request that has been blocked for a while to the execution trace of the respective goroutine to see details on the execution of the request before and after the snapshot was taken. That way you can see, for example, which other goroutine was holding the lock that the request contended on and what that goroutine was working on at the time.
Side-Eye organizes the captured data in a SQLite or DuckDB database that you can explore in the web application or download and use locally.
Btw, capturing a Side-Eye snapshot is an order of magnitude faster than the stop-the-world event caused by the goroutine profile (i.e. /debug/pprof/goroutine?debug=2), which can stop your programs for hundreds of milliseconds. Also, a snapshot works across processes, whereas a pprof goroutine dump works at the level of one process.
Besides production environments, Side-Eye snapshots can also be useful in tests/CI. The Side-Eye client library lets you programmatically capture a snapshot when some condition is met, for example when a test is about to time out. This can be very useful when trying to figure out why some operation that should have ended by now is still in progress.