2018-05-04T10:14:05-07:00

Go CBOR encoder: Episode 4, reflect and pointers

In the previous episode we encoded positive integers and learned how to write a CBOR item with a variable size. Our CBOR encoder can now encode nil, true, false, and unsigned integers. cbor.Encoder has grown strong, but type switches have their limits, we need more powerful weapons for the battles ahead: we’re about to take on pointers, and reflect will be our sword.

In the first episode of the series we encoded of the nil value since it was the easiest value to start with, but we aren’t finished with the nil we still got work to do to cover all cases. That’s because our encoder only handles the “naked” nil value, but not typed pointers that are nil. Whaaat? There are two kinds of nil pointers? Yep, that’s because nil by itself is special. Consider the following code:

var p *int = nil
var v interface{} = p
switch v.(type) {
case nil:
    fmt.Println("nil")
case *int:
    fmt.Println("int pointer")
}

The example above prints “int pointer”, because v isn’t a regular value but an interface that points to a int pointer value. Go interfaces are pairs of 32 or 64bits addresses: one for the type and one for the value. So in the type switch above we match the *int case because p’s type is *int. If we replaced the v definition with var v interface{} = nil, the program would print “nil”. That’s because the type of a nil value is itself nil, but typed-pointers’ type aren’t. Russ Cox’s article Go Data Structures: Interfaces is a superb introduction to how Go interfaces work if you’d like to learn more.

Let’s exhibit the problem in our code and add a test for typed nil pointers:

func TestNilTyped(t *testing.T) {
    var i *int = nil
    testEncoder(t, i, nil, []byte{0xf6})

    var v interface{} = nil
    testEncoder(t, v, nil, []byte{0xf6})
}

And run our tests with go test to see what happens:

--- FAIL: TestNilTyped (0.00s)
    cbor_test.go:18: err: &errors.errorString{s:"Not Implemented"} != <nil> with (*int)(nil)

The *int(nil) value isn’t recognized. So why did plain nil worked? Because it’s special: both its type and its value are nil. The Encode function matches the naked nil with the case nil statement in the type switch, this means only interfaces with a nil type will be matched. Therefor the code only works with the naked nil value, but not with typed pointers.

It turns out there’s a package to address that: reflect introspects the type system and let us match pointer types individually. The Law of reflection is a great introduction to reflection and the use of this package.

So we want to know if a value is a pointer. How does reflect help us? Consider this snippet:

fmt.Println(reflect.ValueOf(nil).Kind())
var i *int = nil
fmt.Println(reflect.ValueOf(i).Kind())

It prints:

invalid
ptr

What happens here? First we convert each Go value to a reflect.Value, then we query its type with the method Kind that returns a reflect.Kind enumeration. reflect.Kind represents the specific kind of type that a Type represents. Kinds are families of types. For example there is a kind for structs —reflect.Struct—, for functions —reflect.Func—, and for pointers —reflect.Pointer.

We see above that the naked nil value and a nil pointer to integer have different kinds: invalid, and ptr. We’ll have to handle the two cases separately.

Refactoring time! We replace the type switch with a switch statement on the Kind of our value. In the example below x.Kind() allows us to distinguish types the same way the type switch x.(type) did:

func (e *Encoder) Encode(v interface{}) error {
    x := reflect.ValueOf(v)
    switch x.Kind() {
    case reflect.Invalid:
        // naked nil value == invalid type
        return e.writeHeader(majorSimpleValue, simpleValueNil)
    case reflect.Bool:
        var minor byte
        if x.Bool() {
            minor = simpleValueTrue
        } else {
            minor = simpleValueFalse
        }
        return e.writeHeader(majorSimpleValue, minor)
    case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
        return e.writeInteger(x.Uint())
    }
    return ErrNotImplemented
}

To identify pointer types reflect has a Kind called reflect.Ptr. We add a another case clause for reflect.Ptr, then if the pointer is nil we write the encoded nil value to the output:

case reflect.Ptr:
	if x.IsNil() {
		return e.writeHeader(majorSimpleValue, simpleValueNil)
	}

After we add that, a quick run of go test confirms that TestNilTyped works.

Splendid! We solved nil pointers. How about non-nil pointers? They are relatively easy to handle: if we detect a pointer we can fetch the value it refers to via reflect.Indirect. So when we get a pointer we get the value it references instead of the memory address. Here’s an example of how reflect.Indirect works:

var i = 1
var p = &i
var reflectValue = reflect.ValueOf(p)
fmt.Println(reflectValue.Kind())
fmt.Println(reflect.Indirect(reflectValue).Kind())

It prints:

int
ptr

When we find a non-nil pointer type, we call the Indirect function to retrieve the pointed value and we recursively call the Encode method on that value. We add a new test: TestPointer that verifies pointer referencing works as intended:

func TestPointer(t *testing.T) {
    i := uint(10)
    pi := &i  // pi is a *uint

    // should output the number 10
    testEncoder(t, pi, nil, []byte{0x0a})
}

With our test written let’s add the code necessary to handle valid pointers in our case clause:

case reflect.Ptr:
	if x.IsNil() {
		return e.writeHeader(majorSimpleValue, simpleValueNil)
	} else {
		return e.Encode(reflect.Indirect(x).Interface())
	}

reflect.Indirect(x).Interface()) retrieves an interface to x’s underlying value, we pass it recursively to Encode and return the result. So if we passed a pointer to pointer to pointer to integer (***int) we’d have 3 recursive calls to Encode. TestPointer now passes, we are done with pointers!

There’s a repository with the code for this episode.

The reflect package will help us to handle more complex types in subsequent episodes. Next time we will encode string types: byte string, and Unicode strings.