Dear fellow tech enthusiasts, I’m @vuongvm812, an Intern Software Engineer at CyberAgent Hanoi DevCenter. In this blog post, I’ll walk through how I improved our observability system through error handling!

Feel free to play around with the middleware playground!

Background

About CySche

Our team, CySche, grew out of an idea from an internal Generative AI contest.

CySche is an application that quickly adjusts schedules through interaction with a generative AI. Unlike traditional scheduling tools that primarily rely on a calendar interface, CySche offers an interactive experience: employees simply enter the participants’ names and preferred dates and times in the Slack app. The AI then suggests the most suitable time and, once it is confirmed, automatically notifies the participants of the finalized schedule.

The project includes these components:

  • api (written in Golang): communicates with Slack’s API.
  • inhouse (written in Golang): retrieves internal information.
  • llm (written in Python): interacts with the LLM.

 

Golang’s HTTP Middleware and gRPC Interceptor

 

HTTP / gRPC middleware / interceptor

Both gRPC interceptors and HTTP middleware are powerful tools for handling cross-cutting concerns in their respective ecosystems: they wrap around request and response handlers. This allows us to add features such as:

  • Authentication and authorization
  • Logging
  • Metrics collection
  • Request and response transformation
  • Error handling

gRPC Interceptors are integral to gRPC applications, allowing you to interact with proto messages or context either before or after they are sent or received by the client or server. They work by defining a function that wraps around the request handler, providing direct access to the request, context, and response. This enables you to modify responses—such as adjusting the Status object for status codes or altering the response message—before they are returned.

Similarly, HTTP Middleware in Golang is a chainable function layer that wraps around an HTTP handler to process requests and responses before or after the main logic executes. Middleware functions are composable, making them ideal for implementing reusable and modular solutions to enhance application behavior.

Both patterns follow a similar philosophy of extending application functionality by wrapping core logic, enabling developers to maintain cleaner and more modular codebases.
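
To make the wrapping idea concrete, here is a minimal sketch using only the standard library. The name WithRequestLog and the logf callback are made up for illustration and are not part of CySche; the point is only that the middleware runs code before and after the wrapped handler.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// WithRequestLog is a hypothetical middleware: it wraps next and runs
// code before and after the wrapped handler.
func WithRequestLog(logf func(format string, args ...any), next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		logf("before %s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
		logf("after %s %s", r.Method, r.URL.Path)
	})
}

// demo sends one in-memory request through the wrapped handler and
// returns the recorded events in the order they happened.
func demo() []string {
	var events []string
	logf := func(format string, args ...any) {
		events = append(events, fmt.Sprintf(format, args...))
	}
	core := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		events = append(events, "handler")
		fmt.Fprintln(w, "ok")
	})

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/ping", nil)
	WithRequestLog(logf, core).ServeHTTP(rec, req)
	return events
}

func main() {
	for _, e := range demo() {
		fmt.Println(e)
	}
}
```

The recorded events come out as "before GET /ping", "handler", "after GET /ping", which is the same before/after shape a gRPC unary interceptor has around its handler call.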

 

Challenges

Our monitoring system currently runs on Datadog. It not only tracks our services’ logs and traces and monitors crucial errors, but also keeps track of the infrastructure, such as containers’ resources, the network, and database health.

However, our project’s monitoring was primitive and did not provide much information when debugging an error through the logs. The root cause was that our services did not return clear status codes. Without them, it was hard to trace a failed request, which led to a poor customer experience.

Because the issue was so deep-rooted, I decided to address it by refactoring our services’ error-handling mechanism.

Our First Implementation

After investigating the http.ResponseWriter interface, I found that WriteHeader() only lets us set the status code once: any subsequent call to WriteHeader() has no effect. Additionally, Write() automatically triggers WriteHeader() if it has not been called yet, defaulting the status code to 200.

func (w *response) WriteHeader(code int) {
    ...
 
    if w.wroteHeader {
        caller := relevantCaller()
        w.conn.server.logf("http: superfluous response.WriteHeader call from %s (%s:%d)", caller.Function, path.Base(caller.File), caller.Line)
        return
    }

    ...

    w.wroteHeader = true
    w.status = code
 
    ...
}

func (w *response) Write(data []byte) (n int, err error) {
	return w.write(len(data), data, "")
}

// either dataB or dataS is non-zero.
func (w *response) write(lenData int, dataB []byte, dataS string) (n int, err error) {
	...
	
	if !w.wroteHeader {
		w.WriteHeader(StatusOK)
	}
	
	...
}
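
The effect of these two rules can be observed with a small sketch. It uses httptest.ResponseRecorder from the standard library, which follows the same first-call-wins rule as the server's writer; the handler name lateStatus is made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// lateStatus writes the body first and only then tries to set a 500 status.
func lateStatus(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("body first\n"))               // implicitly commits status 200
	w.WriteHeader(http.StatusInternalServerError) // too late: this call is ignored
}

// observedStatus runs the handler against an in-memory recorder and
// reports the status code and body that were actually recorded.
func observedStatus() (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	lateStatus(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := observedStatus()
	fmt.Printf("status=%d body=%q\n", code, body)
}
```

The recorded status stays 200 even though the handler asked for 500 afterwards, which is exactly the situation an error-handling middleware has to work around.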

The first problem to tackle was how to call WriteHeader() inside the middleware even when the HTTP handler has already called Write(). After some research, I settled on a second ResponseWriter that embeds the original http.ResponseWriter (Go has no inheritance; embedding promotes the original’s methods, and the wrapper overrides the ones it needs).

type ResponseWriter struct { 	
    http.ResponseWriter 	
    statusCode int 	
    body *bytes.Buffer 
} 

func (w *ResponseWriter) WriteHeader(code int) { 	
    w.statusCode = code 
} 

func (w *ResponseWriter) Write(b []byte) (int, error) { 	
    return w.body.Write(b) 
}

This new writer has two additional fields, statusCode and body, that act as temporary storage for the status code and response body. The middleware later passes those values to the original http.ResponseWriter to send the response to the client. Another aspect to consider is getting access to the error returned by our HTTP handler.
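
As a rough sketch of how such a buffering writer could be completed: the constructor and accessors below mirror the names used later in this post (NewResponseWriter, StatusCode, Body), but their bodies are my assumption, and Commit is a hypothetical helper that does not appear in the original code. Defaulting statusCode to 200 is also an assumption, chosen to match what net/http does when a handler never calls WriteHeader().

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ResponseWriter buffers the status code and body instead of sending them.
type ResponseWriter struct {
	http.ResponseWriter
	statusCode int
	body       *bytes.Buffer
}

// NewResponseWriter wraps w; the default status of 200 is an assumption.
func NewResponseWriter(w http.ResponseWriter) *ResponseWriter {
	return &ResponseWriter{ResponseWriter: w, statusCode: http.StatusOK, body: &bytes.Buffer{}}
}

func (w *ResponseWriter) WriteHeader(code int)        { w.statusCode = code }
func (w *ResponseWriter) Write(b []byte) (int, error) { return w.body.Write(b) }
func (w *ResponseWriter) StatusCode() int             { return w.statusCode }
func (w *ResponseWriter) Body() *bytes.Buffer         { return w.body }

// Commit (hypothetical) replays the buffered status and body on the real writer.
func (w *ResponseWriter) Commit() error {
	w.ResponseWriter.WriteHeader(w.statusCode)
	_, err := w.ResponseWriter.Write(w.body.Bytes())
	return err
}

// demo writes the body first and sets the status afterwards; thanks to
// the buffer, the late status still reaches the underlying writer.
func demo() (int, string) {
	rec := httptest.NewRecorder()
	bw := NewResponseWriter(rec)
	bw.Write([]byte("not here\n"))
	bw.WriteHeader(http.StatusNotFound)
	if err := bw.Commit(); err != nil {
		panic(err)
	}
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := demo()
	fmt.Printf("status=%d body=%q\n", code, body)
}
```

One trade-off worth noting: buffering the whole body in memory is fine for small JSON or text responses, but it is not suited to streaming or very large payloads.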

The standard http.Handler interface has no way to return an error. After weighing the options, I changed the middleware’s input type from http.Handler to a custom handler type that returns an error for the middleware to handle:

type HandlerWithError func(w http.ResponseWriter, req *http.Request) error

With every piece of the puzzle in place, this is the middleware’s first form:

func SimpleMiddleware(handler HandlerWithError) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
        // Buffer the handler's status code and body instead of sending them right away.
        wrapper := common.NewResponseWriter(w)
        if err := handler(wrapper, req); err != nil {
            // cerr is the project's custom error value obtained from err
            // (the conversion is not shown in this excerpt).
            http.Error(w, "Internal", int(cerr.StatusCode().HTTPStatusCode()))
            return // the error response has been sent; skip the buffered one
        }

        // Replay the buffered status code and body on the real writer.
        w.WriteHeader(wrapper.StatusCode())
        if _, err := w.Write(wrapper.Body().Bytes()); err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
        }

        w.Write([]byte("Hello from Simple Middleware!\n"))
    })
}

func Handler(w http.ResponseWriter, r *http.Request) error {
    testString := []byte("This is Go's handler!\n")
    w.Write(testString)
    return nil
}

...

http.Handle("/simple", SimpleMiddleware(Handler))

After sending a request to the “/simple” endpoint, we expect to receive a response with two parts:

  • “This is Go’s handler”: this message came from our HTTP handler.
  • “Hello from Simple Middleware”: this message came from our middleware.

$ curl -v localhost:8080/simple
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> GET /simple HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Mon, 23 Dec 2024 07:46:44 GMT
< Content-Length: 52
< Content-Type: text/plain; charset=utf-8
<
This is Go's handler!
Hello from Simple Middleware!
* Connection #0 to host localhost left intact
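
The request above only exercises the success path. For the error path, the sketch below shows one way a middleware could translate a returned error into a status code. StatusError is a hypothetical stand-in for the project's custom error type, which this post does not show, and the sketch assumes the handler has not written anything before returning an error.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type HandlerWithError func(w http.ResponseWriter, r *http.Request) error

// StatusError is a hypothetical error type that carries an HTTP status code.
type StatusError struct {
	Code int
	Msg  string
}

func (e *StatusError) Error() string { return e.Msg }

// ErrorMiddleware maps the handler's error to a status code: a wrapped
// StatusError decides the code, anything else becomes a 500.
func ErrorMiddleware(h HandlerWithError) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		err := h(w, r)
		if err == nil {
			return
		}
		code := http.StatusInternalServerError
		var se *StatusError
		if errors.As(err, &se) {
			code = se.Code
		}
		http.Error(w, http.StatusText(code), code)
	})
}

func notFound(w http.ResponseWriter, r *http.Request) error {
	return fmt.Errorf("lookup user: %w", &StatusError{Code: http.StatusNotFound, Msg: "user not found"})
}

func plain(w http.ResponseWriter, r *http.Request) error { return errors.New("boom") }

func ok(w http.ResponseWriter, r *http.Request) error {
	fmt.Fprintln(w, "fine")
	return nil
}

// statusFor runs h behind the middleware and returns the recorded status.
func statusFor(h HandlerWithError) int {
	rec := httptest.NewRecorder()
	ErrorMiddleware(h).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(notFound), statusFor(plain), statusFor(ok))
}
```

Using errors.As means the status survives even when intermediate layers wrap the error with extra context, so handlers can keep adding detail for the logs without losing the code.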

Upgraded Implementation

My initial implementation had a flaw: each new middleware makes the code harder to read, since HTTP middleware are essentially functions wrapping one another:

http.Handle("/",FirstMiddleware(SecondMiddleware(handler)))

Middleware Abstraction

This can be fixed by following Go’s philosophy of simplicity: reduce the complexity of the code, or hide it behind an abstraction such as an interface.

In our case this helps tremendously: by abstracting the middleware implementation, we can share common dependencies across all middleware, in this case the logger.

type middleware struct {
	logger *log.Logger
}

func NewMiddleware(logger *log.Logger) *middleware {
	return &middleware{
		logger: logger,
	}
}

func (m *middleware) FirstChainingMiddleware(handler HandlerWithError) HandlerWithError {
	return func(w http.ResponseWriter, req *http.Request) error {
		...
	}
}

func (m *middleware) SecondChainingMiddleware(handler HandlerWithError) HandlerWithError {
	return func(w http.ResponseWriter, req *http.Request) error {
		...
	}
}

Middleware Chaining

Another problem with middleware in Go is the function wrapping itself. When integrating multiple middleware, nesting them directly causes two issues:

  • Readability: multiple wrappers make the code hard to read, for example, FirstMiddleware(SecondMiddleware(…NthMiddleware(handler))).
  • Reusability: in our case, the HTTP middleware also depends on other packages such as the logger. Without decoupling, we would have to pass the logger to every middleware each time we call it: FirstMiddleware(SecondMiddleware(handler, logger), logger).

We therefore needed an implementation that addresses both issues.

The middleware chaining below takes advantage of variadic functions, which receive any number of arguments as a slice.

The Chain function takes a variadic number of Option functions and returns a new function that wraps an HTTP handler. An Option is a function that takes a HandlerWithError (our custom handler type) and returns another HandlerWithError. Chain builds the middleware chain by applying each Option in reverse order, starting from the last one, so the first Option in the list ends up as the outermost wrapper. In each step, the current handler (next) is wrapped by the middleware, and the result becomes the new next. Finally, the last next, which represents the handler with all middleware applied, is executed. This pattern allows middleware to be composed in a flexible and modular way.

type Option func(HandlerWithError) HandlerWithError

func Chain(mw ...Option) func(HandlerWithError) http.Handler {
	return func(handler HandlerWithError) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			next := handler
			for k := len(mw) - 1; k >= 0; k-- {
				currHandler := mw[k]
				nextHandler := next

				next = currHandler(nextHandler)
			}

			next(w, r)
		})
	}
}
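
As a variation on the same idea, the sketch below composes the chain once, when the handler is built, instead of on every request, and it also checks the error returned by the outermost handler. This is my own adaptation rather than the project's code; the tag helper is hypothetical and exists only to make the execution order visible.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

type HandlerWithError func(w http.ResponseWriter, r *http.Request) error
type Option func(HandlerWithError) HandlerWithError

// Chain wraps handler with mw in reverse order, so mw[0] runs outermost.
// The chain is composed once, not on every request.
func Chain(mw ...Option) func(HandlerWithError) http.Handler {
	return func(handler HandlerWithError) http.Handler {
		next := handler
		for k := len(mw) - 1; k >= 0; k-- {
			next = mw[k](next)
		}
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if err := next(w, r); err != nil {
				http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
			}
		})
	}
}

// tag returns a hypothetical middleware that writes a line before and
// after calling the next handler.
func tag(name string) Option {
	return func(next HandlerWithError) HandlerWithError {
		return func(w http.ResponseWriter, r *http.Request) error {
			fmt.Fprintf(w, "%s: before\n", name)
			err := next(w, r)
			fmt.Fprintf(w, "%s: after\n", name)
			return err
		}
	}
}

// demo runs one in-memory request through the chain and returns the body.
func demo() string {
	h := Chain(tag("first"), tag("second"))(func(w http.ResponseWriter, r *http.Request) error {
		fmt.Fprintln(w, "handler")
		return nil
	})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/chain", nil))
	return rec.Body.String()
}

func main() {
	fmt.Print(demo())
}
```

The body shows "first" entering before "second" and leaving after it, which confirms that the first option in the list is the outermost wrapper.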

Our final implementation looks like this:

func Handler(w http.ResponseWriter, r *http.Request) error {
	testString := []byte("This is Go's handler!\n")
	w.Write(testString)
	return nil
}

func main() {
	logger := &log.Logger{}
	m := middleware.NewMiddleware(logger)
	// The method and function names match the definitions shown earlier.
	middlewares := []middleware.Option{
		m.FirstChainingMiddleware,
		m.SecondChainingMiddleware,
	}
	handler := middleware.Chain(middlewares...)(Handler)
	http.Handle("/chain", handler)
	http.ListenAndServe(":8080", nil)
}

After another request to “/chain”, we get the following response:

$ curl -v localhost:8080/chain                                                                                                                                                                       
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> GET /chain HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Mon, 23 Dec 2024 07:46:59 GMT
< Content-Length: 81
< Content-Type: text/plain; charset=utf-8
<
This is Go's handler!
Hello from first middleware!
Hello from second middleware!
* Connection #0 to host localhost left intact

Results and Miscellaneous Features

With my error-handling mechanism completed, it’s time to fully utilize and integrate it into my monitoring system.

First of all, with these status codes and error messages, we can now categorize our errors using Datadog monitors. By creating new monitors that track our system’s health, we were able to identify the errors that occur most frequently: UNAVAILABLE, NOT_FOUND, and DynamoDB.Error.

Another feature that I implemented using the middleware chaining method was attaching Slack’s conversation ID to the logs’ metadata. This makes it easier to debug a conversation between CySche and its users.

Thread-id in logs' metadata
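
A minimal sketch of this kind of middleware, using the standard log/slog package, is shown below. Everything specific here is an assumption for illustration: the X-Thread-Id header, the thread_id attribute name, and the helper names are hypothetical, and the real service may take the ID from the Slack event payload and use a different logger.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
)

type loggerKey struct{}

// WithThreadID stores a request-scoped logger carrying the thread ID in
// the request context. The header name is a made-up example.
func WithThreadID(base *slog.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		l := base.With("thread_id", r.Header.Get("X-Thread-Id"))
		ctx := context.WithValue(r.Context(), loggerKey{}, l)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// LoggerFrom returns the request-scoped logger, or the default logger.
func LoggerFrom(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.Default()
}

// demo logs one line from inside a handler and returns the thread_id
// attribute found in the emitted JSON log record.
func demo() (string, error) {
	var buf bytes.Buffer
	base := slog.New(slog.NewJSONHandler(&buf, nil))
	h := WithThreadID(base, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		LoggerFrom(r.Context()).Info("scheduling request received")
	}))

	req := httptest.NewRequest(http.MethodPost, "/events", nil)
	req.Header.Set("X-Thread-Id", "1734939999.000100")
	h.ServeHTTP(httptest.NewRecorder(), req)

	var record struct {
		ThreadID string `json:"thread_id"`
	}
	err := json.Unmarshal(buf.Bytes(), &record)
	return record.ThreadID, err
}

func main() {
	id, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("thread_id in log record:", id)
}
```

Because the logger travels in the request context, every log line written further down the call chain carries the same ID without each function having to pass it along explicitly.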

Centralizing all of our services’ metrics was also crucial, since it reduces the time developers and project managers need to grasp the health, resource usage, and errors of our infrastructure and backend services.

Datadog Dashboard

Future Improvements

One improvement I have in mind is refactoring the middleware abstraction. Our current implementation puts all of the middleware methods behind a single interface. This increases coupling in our system and makes the middleware harder to mock.

As we continue to refine our observability system, a key area of focus will be building comprehensive monitoring for user-centric metrics. Understanding user behavior is critical for driving engagement and improving the overall user experience. In the future, I plan to implement tracking for metrics such as churn rate, retention rate, bounce rate, and other user activity indicators.

Conclusion

Improving the observability system of an application is essential for ensuring reliability, scalability, and seamless user experiences. By implementing a robust error-handling mechanism, adding targeted monitors, and creating a centralized dashboard, we can gain deeper insights into system performance, quickly identify and resolve issues, and optimize overall operations. Observability is a continuous journey, and these upgrades lay the foundation for maintaining a high-performing, resilient application as demands evolve.