What You Need To Know About gRPC Error Handling

October 20, 2023

The official gRPC documentation hardly mentions error handling. It recommends using Google’s status package if you need a richer error model, but are there any alternatives?

This article came about when I started wondering how error handling works in gRPC. For example, what does the client receive if the server returns a standard library error value such as errors.New("foo")? And what happens if the server returns a custom error type such as:

type Misdirection struct {
  Culprit string
  Scapegoat string
}

func (m *Misdirection) Error() string {
  return m.Scapegoat
}
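
To make the question concrete, here is a minimal sketch of the type in a runnable program. The blame function and its field values are hypothetical, invented only for illustration. Within a single process, the caller can still reach the Culprit field through a type check; whether anything beyond the Error() string survives a gRPC hop is what the rest of the article looks into.

```go
package main

import (
	"errors"
	"fmt"
)

// Misdirection is the custom error type from the article.
type Misdirection struct {
	Culprit   string
	Scapegoat string
}

func (m *Misdirection) Error() string {
	return m.Scapegoat
}

// blame is a hypothetical function standing in for a failing server method.
func blame() error {
	return &Misdirection{Culprit: "cache", Scapegoat: "network"}
}

func main() {
	err := blame()

	// The error interface exposes only the string returned by Error().
	fmt.Println(err.Error())

	// In-process, errors.As recovers the concrete type and its fields.
	var m *Misdirection
	if errors.As(err, &m) {
		fmt.Println(m.Culprit)
	}
}
```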

Does it make sense for the server to define sentinel errors the client can use to handle different error cases?

gRPC Is a Framework

A framework is a piece of code that calls code written by users of the framework.

Google has written a gRPC server framework that does all the heavy lifting. When a client sends a request, the framework finds the correct server method and calls it. So what does the framework code look like at the point where the server method is invoked?

The gRPC protocol supports four types of communication between a server and a client: unary RPC calls (synchronous client-server request/response), server streaming RPCs, client streaming RPCs, and bidirectional streaming.

There must be multiple call sites in the framework to the server methods depending on the type of communication, but to keep this as simple as possible, I looked only at the code for unary RPCs. The following answer is based on version 1.58.2 of Google’s gRPC package for Go. On line 1346, inside Server.processUnaryRPC, the gRPC server framework calls the server method.

An excerpt is shown below:

reply, appErr := md.Handler(info.serviceImpl, ctx, df, s.opts.unaryInt)
if appErr != nil {
  ...
}

The reply and appErr variables are the response and error values returned by the server method. The type of appErr is the standard library’s error interface. Note this is not what the client receives from the server but rather what the gRPC framework receives before it sends it over the wire.

So, what does the framework do with it? If appErr is non-nil, the framework has an if branch for handling it. How it handles it depends on what the server method returns.

You’ll learn more about this in the next section.

What Is Sent Over the Wire?

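First, the framework attempts to convert appErr into a *status.Status value using status.FromError. The conversion succeeds only if the error carries a gRPC status, as a *status.Error does (errors that implement a GRPCStatus() *status.Status method, or wrap one that does, also qualify). It fails for anything else, such as the standard library’s errors.New(…) or a custom error like my Misdirection example above.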

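The code comment in the framework code below explains how it handles these non-status errors. To elaborate, status.FromContextError initialises a *status.Status value whose message field is assigned the string value appErr.Error(). Its code is Unknown, except for context errors: context.DeadlineExceeded maps to DeadlineExceeded and context.Canceled maps to Canceled. This is the answer to the question of what the client receives: a status code and a message string. The fields of a custom type such as Misdirection do not cross the wire.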

appStatus, ok := status.FromError(appErr)
if !ok {
  // Convert non-status application error to a status error with code
  // Unknown, but handle context errors specifically.
  appStatus = status.FromContextError(appErr)
  appErr = appStatus.Err()
}

In the case where the conversion of appErr succeeds, appStatus holds the *status.Status value the server method returned. Either way, appStatus is sent over the wire to the client in the following code snippet.

if e := t.WriteStatus(stream, appStatus); e != nil {
  channelz.Warningf(logger, s.channelzID, "grpc: Server.processUnaryRPC failed to write status: %v", e)
}

Sentinel Errors

Before diving into sentinel errors, I’d say the gRPC error model is designed in such a way as to remove many of the use cases for sentinel errors. For example, when calling an “update resource” operation, it might fail because the subject of the update might not exist in the system.

In such a case, the server method can return status.Errorf(codes.NotFound, "failed to update resource by id = %s", id). For the client, it’s probably enough to check the received status code (NotFound) to determine the cause of the failed operation since it knows which resource it requested to update.

The alternative would be to have a sentinel error for each specific case, for example ErrPersonNotFound, but that may lead to many similar but different sentinel errors, including ErrDogNotFound, ErrOrderNotFound, etc.

If you need specific errors the client can use to handle a received error correctly, then sentinel errors add value. There’s a catch, though. To make it easy for clients to compare their received errors to your sentinel errors, they must be constructed using the status.Errorf function.

One example is var ErrPersonNotFound = status.Errorf(codes.NotFound, "failed to …").

But, why is this important?

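It’s important because status.Errorf returns an error whose concrete type is *status.Error, not a *status.Status. The former is comparable using errors.Is(err, ErrPersonNotFound); the latter is not. This is because *status.Error implements the Is(target error) bool method, which errors.Is calls under the hood. The error the client receives is a different value from the server’s sentinel, so a plain pointer comparison would never match; the Is method is what makes the comparison work.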

The type *status.Status does not implement this method.
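
The mechanism can be shown with the standard library alone. The sketch below does not use the grpc package: codedError is a hypothetical stand-in for a status-style error with an Is method, plainError is the same thing without one, and received simulates a client rebuilding an error from the code and message it got over the wire. All of these names are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a hypothetical stand-in for a status-style error.
type codedError struct {
	code int
	msg  string
}

func (e *codedError) Error() string { return fmt.Sprintf("code %d: %s", e.code, e.msg) }

// Is is consulted by errors.Is, so two distinct values with the same
// contents are treated as the same error.
func (e *codedError) Is(target error) bool {
	t, ok := target.(*codedError)
	return ok && e.code == t.code && e.msg == t.msg
}

// plainError carries the same data but has no Is method.
type plainError struct {
	code int
	msg  string
}

func (e *plainError) Error() string { return fmt.Sprintf("code %d: %s", e.code, e.msg) }

// ErrPersonNotFound is the sentinel shared with clients (5 is the NotFound code).
var ErrPersonNotFound = &codedError{code: 5, msg: "person not found"}

// received simulates the client side: a fresh value built from wire data.
func received() error { return &codedError{code: 5, msg: "person not found"} }

// matchesSentinel reports whether err counts as ErrPersonNotFound.
func matchesSentinel(err error) bool { return errors.Is(err, ErrPersonNotFound) }

func main() {
	// Different pointer, same contents: the Is method makes this match.
	fmt.Println(matchesSentinel(received()))

	// Without an Is method, errors.Is falls back to pointer equality.
	a := &plainError{code: 5, msg: "person not found"}
	b := &plainError{code: 5, msg: "person not found"}
	fmt.Println(errors.Is(a, b))
}
```

The first comparison succeeds and the second does not, which is the difference between a sentinel clients can match with errors.Is and one they cannot.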

Conclusion

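In cases where server methods return non-nil errors, the gRPC server framework in Go always sends a status (the code and message of a *status.Status value) over the wire, regardless of the error type the server method returned. A Go client then receives it as a *status.Error.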

Sentinel errors can be used in gRPC implementations as well, but make sure to construct them using the status.Errorf function.

What You Need To Know About gRPC Error Handling was originally published in Better Programming on Medium, where people are continuing the conversation by highlighting and responding to this story.
