Custom converters

While using the GenerateShapeAttribute is by far the simplest way to make an entire type graph serializable, some types may not be compatible with automatic serialization. In such cases, you can define and register your own custom converter for the incompatible type.

Define your own converter

Consider a type Foo that cannot be serialized automatically.

Declare a class that derives from MessagePackConverter<T>:

using Nerdbank.MessagePack;

public record Foo(int MyProperty1, string? MyProperty2);

class FooConverter : MessagePackConverter<Foo?>
{
    public override Foo? Read(ref MessagePackReader reader, SerializationContext context)
    {
        if (reader.TryReadNil())
        {
            return null;
        }

        context.DepthStep();
        int property1 = 0;
        string? property2 = null;

        int count = reader.ReadMapHeader();
        for (int i = 0; i < count; i++)
        {
            string? key = reader.ReadString();
            switch (key)
            {
                case "MyProperty1":
                    property1 = reader.ReadInt32();
                    break;
                case "MyProperty2":
                    property2 = reader.ReadString();
                    break;
                default:
                    // Skip the value, as we don't know where to put it.
                    reader.Skip(context);
                    break;
            }
        }

        return new Foo(property1, property2);
    }

    public override void Write(ref MessagePackWriter writer, in Foo? value, SerializationContext context)
    {
        if (value is null)
        {
            writer.WriteNil();
            return;
        }

        context.DepthStep();
        writer.WriteMapHeader(2);

        writer.Write("MyProperty1");
        writer.Write(value.MyProperty1);

        writer.Write("MyProperty2");
        writer.Write(value.MyProperty2);
    }
}
Caution

It is imperative that each Write and Read method write and read exactly one msgpack structure.

A converter that reads or writes more than one msgpack structure may appear to work correctly, but will result in invalid, unparseable msgpack. Msgpack is a structured, self-describing format similar to JSON. In JSON, an individual array element or object property value must be described as a single element or the JSON would be invalid.

If you have more than one value to serialize or deserialize (e.g. multiple fields on an object), you MUST use a map or array header with the number of elements you intend to serialize. In the Write method, use WriteMapHeader or WriteArrayHeader. In the Read method, use ReadMapHeader or ReadArrayHeader.

Custom converters are encouraged to override MessagePackConverter<T>.GetJsonSchema to support the MessagePackSerializer.GetJsonSchema methods.
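As a sketch, a GetJsonSchema override for the FooConverter above might describe the msgpack map that its Write method produces. The exact override signature and helper types are defined by MessagePackConverter<T>, so consult the API reference; this example assumes the schema is expressed with System.Text.Json.Nodes.JsonObject:

```csharp
using System.Text.Json.Nodes;

public override JsonObject? GetJsonSchema(JsonSchemaContext context, ITypeShape typeShape)
{
    // Describe the map written by Write: two named properties.
    return new JsonObject
    {
        ["type"] = "object",
        ["properties"] = new JsonObject
        {
            ["MyProperty1"] = new JsonObject { ["type"] = "integer" },
            ["MyProperty2"] = new JsonObject { ["type"] = "string" },
        },
    };
}
```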

Security considerations

Any custom converter should call SerializationContext.DepthStep on the SerializationContext argument provided to it to ensure that the depth of the msgpack structure is within acceptable bounds. This call should be made before reading or writing any msgpack structure (other than nil).

This is important to prevent maliciously crafted msgpack from causing a stack overflow or other denial-of-service attack. A stack overflow tends to crash the process, whereas DepthStep throws a typical MessagePackSerializationException that an application can catch and handle.

While checking the depth only guards against exploits during deserialization, converters should call DepthStep during serialization as well, to help an application avoid serializing a data structure that it would later be unable to deserialize. Calling it also engages the cancellation token checks built into depth tracking.

Applications that have a legitimate need to exceed the default stack depth limit can adjust it by setting SerializationContext.MaxDepth to a higher value.
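For example, assuming a data model that legitimately nests deeper than the default limit, the limit can be raised through the StartingContext applied to the serializer (a sketch; choose a value appropriate for your data):

```csharp
MessagePackSerializer serializer = new()
{
    StartingContext = new SerializationContext
    {
        // Raise the depth limit for a legitimately deep data model.
        MaxDepth = 128,
    },
};
```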

Delegating to sub-values

The SerializationContext.GetConverter method may be used to obtain a converter to use for members of the type your converter is serializing or deserializing.

public override void Write(ref MessagePackWriter writer, in Foo? value, SerializationContext context)
{
    if (value is null)
    {
        writer.WriteNil();
        return;
    }

    context.DepthStep();
    writer.WriteMapHeader(2);

    writer.Write("MyProperty1");
    SomeOtherType? propertyValue = value.MyProperty1;
    context.GetConverter<SomeOtherType>().Write(ref writer, propertyValue, context);
    writer.Write("MyProperty2");
    writer.Write(value.MyProperty2);
}

The above assumes that SomeOtherType is a type that you declare and can apply GenerateShapeAttribute to. If that is not the case, you may provide the type shape yourself through a witness type. For convenience, you may want to apply the witness attribute directly to your custom converter:

// Imagine SomeOtherType is declared outside your assembly and is not attributed.
public partial record SomeOtherType;

[GenerateShape<SomeOtherType>] // allow FooConverter to provide the shape for SomeOtherType
partial class FooConverter : MessagePackConverter<Foo?>
{
    public override Foo? Read(ref MessagePackReader reader, SerializationContext context)
    {
        // ...
        context.GetConverter<SomeOtherType, FooConverter>().Read(ref reader, context);
        // ...
    }
}

The GenerateShapeAttribute<T> is what enables FooConverter to be a "provider" for the shape of SomeOtherType.

Arrays of a type require a shape of their own. So even if you define your type MyType with GenerateShapeAttribute applied, serializing MyType[] would require a witness type and attribute. For example:

// Imagine SomeOtherType is declared outside your assembly and is not attributed.
public partial record SomeOtherType;

[GenerateShape<SomeOtherType[]>]
partial class FooConverter : MessagePackConverter<Foo?>
{
    public override Foo? Read(ref MessagePackReader reader, SerializationContext context)
    {
        // ...
        context.GetConverter<SomeOtherType[], FooConverter>().Read(ref reader, context);
        // ...
    }
}

Version compatibility

Important

Consider forward and backward version compatibility in your serializer. Assume that your converter will deserialize values that a newer or older version of your converter serialized.

Version compatibility may take several forms. Most typically, it means being prepared to skip values that you don't recognize: when reading maps, skip entries whose property name you don't recognize; when reading arrays, read every element in the array, even if you expect fewer elements than are present.

The sample above demonstrates reading all map entries and values, including explicitly skipping entries and values that the converter does not recognize. If you serialize property values as an array instead, it is equally important to deserialize every array element, even when the array contains more elements than you expect. For example:

context.DepthStep();
int property1 = 0;
string? property2 = null;
int count = reader.ReadArrayHeader();
for (int i = 0; i < count; i++)
{
    switch (i)
    {
        case 0:
            property1 = reader.ReadInt32();
            break;
        case 1:
            property2 = reader.ReadString();
            break;
        default:
            // Skip the value, as we don't know where to put it.
            reader.Skip(context);
            break;
    }
}

return new Foo(property1, property2);

Note that the structure uses a switch statement, which allows 'holes' in the array to develop over time as properties are removed. The default case skips the value at any unrecognized array index, guaranteeing that every array element is read.

Performance considerations

Cancellation handling

A custom converter should honor the SerializationContext.CancellationToken. This is mostly automatic because most converters should already be calling SerializationContext.DepthStep(), which will throw OperationCanceledException if the token is canceled.

For particularly expensive converters, it may be beneficial to check the token periodically through the conversion process.
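For instance, a converter for a large collection might check the token every so many elements rather than on every iteration. This sketch uses a hypothetical LargeData type with an Items list:

```csharp
// Sketch: LargeData and its Items property are hypothetical.
public override void Write(ref MessagePackWriter writer, in LargeData? value, SerializationContext context)
{
    if (value is null)
    {
        writer.WriteNil();
        return;
    }

    context.DepthStep();
    writer.WriteArrayHeader(value.Items.Count);
    for (int i = 0; i < value.Items.Count; i++)
    {
        // Check for cancellation periodically instead of on every element.
        if (i % 1000 == 0)
        {
            context.CancellationToken.ThrowIfCancellationRequested();
        }

        writer.Write(value.Items[i]);
    }
}
```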

Memory pressure

The built-in converters take special care to avoid allocating, encoding, and deallocating strings for property names. This reduces GC pressure and avoids redundant CPU time spent repeatedly decoding UTF-8 encoded property names into strings. Your custom converters may follow similar patterns if tuning the serialization performance of your particular type is important.

The following sample demonstrates using the MessagePackString class to avoid allocations and repeated encoding operations for strings used for property names:

[MessagePackConverter(typeof(MyCustomTypeConverter))]
public class MyCustomType
{
    public string? Message1 { get; set; }

    public string? Message2 { get; set; }
}

public class MyCustomTypeConverter : MessagePackConverter<MyCustomType>
{
    private static readonly MessagePackString Message1 = new(nameof(MyCustomType.Message1));
    private static readonly MessagePackString Message2 = new(nameof(MyCustomType.Message2));

    public override MyCustomType? Read(ref MessagePackReader reader, SerializationContext context)
    {
        if (reader.TryReadNil())
        {
            return null;
        }

        context.DepthStep();
        string? message1 = null;
        string? message2 = null;

        int count = reader.ReadMapHeader();

        // It is critical that we read or skip every element of the map, even if we don't recognize the key.
        for (int i = 0; i < count; i++)
        {
            // Compare the key to those we recognize such that we don't decode or allocate strings unnecessarily.
            if (Message1.TryRead(ref reader))
            {
                message1 = reader.ReadString();
            }
            else if (Message2.TryRead(ref reader))
            {
                message2 = reader.ReadString();
            }
            else
            {
                // We don't recognize the key, so skip both the key and the value.
                reader.Skip(context);
                reader.Skip(context);
            }
        }

        return new MyCustomType
        {
            Message1 = message1,
            Message2 = message2,
        };
    }

    public override void Write(ref MessagePackWriter writer, in MyCustomType? value, SerializationContext context)
    {
        if (value is null)
        {
            writer.WriteNil();
            return;
        }

        context.DepthStep();
        writer.WriteMapHeader(2);

        // Write the pre-encoded msgpack for the property names to avoid repeatedly paying encoding costs.
        writer.WriteRaw(Message1.MsgPack.Span);
        writer.Write(value.Message1);

        writer.WriteRaw(Message2.MsgPack.Span);
        writer.Write(value.Message2);
    }
}

Stateful converters

Converters are usually stateless, meaning that they have no fields and serialize and deserialize based strictly on the inputs provided via their parameters.

When converters carry state in fields, they cannot be used concurrently with different values in those fields. Using multiple converter instances configured differently requires a unique MessagePackSerializer instance for each, and every serializer incurs a startup cost while it creates and caches the rest of the converters needed for your data model.

For higher performance, configure one MessagePackSerializer instance with one set of converters. Your converters can be stateful by accessing state in the SerializationContext parameter instead of fields on the converter itself.

For example, suppose your custom converter serializes data bound for a particular RPC connection and must access state associated with that connection. This can be achieved as follows:

  1. Store the state in the SerializationContext via its SerializationContext.this[object] indexer.
  2. Apply that SerializationContext to a MessagePackSerializer by setting its StartingContext property.
  3. Your custom converter can then retrieve that state during serialization/deserialization via that same SerializationContext.this[object] indexer.
class Program
{
    static void Main()
    {
        MessagePackSerializer serializer = new()
        {
            StartingContext = new SerializationContext
            {
                ["ValueMultiplier"] = 3,
            },
        };
        SpecialType original = new(5);
        Console.WriteLine($"Original value: {original}");
        byte[] msgpack = serializer.Serialize(original);
        Console.WriteLine(MessagePackSerializer.ConvertToJson(msgpack));
        SpecialType deserialized = serializer.Deserialize<SpecialType>(msgpack);
        Console.WriteLine($"Deserialized value: {deserialized}");
    }
}

class StatefulConverter : MessagePackConverter<SpecialType>
{
    public override SpecialType Read(ref MessagePackReader reader, SerializationContext context)
    {
        int multiplier = (int)context["ValueMultiplier"]!;
        int serializedValue = reader.ReadInt32();
        return new SpecialType(serializedValue / multiplier);
    }

    public override void Write(ref MessagePackWriter writer, in SpecialType value, SerializationContext context)
    {
        int multiplier = (int)context["ValueMultiplier"]!;
        writer.Write(value.Value * multiplier);
    }
}

[GenerateShape]
[MessagePackConverter(typeof(StatefulConverter))]
partial record struct SpecialType(int Value);

When the state object stored in the SerializationContext is a mutable reference type, the converters may mutate it such that they or others can observe those changes later. Consider the thread-safety implications of doing this if that same mutable state object is shared across multiple serializations that may happen on different threads in parallel.

Converters that change the state dictionary itself (by using SerializationContext.this[object]) can expect those changes to propagate only to their callees.

Strings can serve as convenient keys, but may collide with the same string used by another part of the data model for another purpose. Make your strings sufficiently unique to avoid collisions, or use a static readonly object MyKey = new object() field that you expose so that all interested parties can reference a key that is guaranteed to be unique.
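For example, a shared key object might be exposed like this (the class and field names are illustrative):

```csharp
// Expose a single key object so the code that configures the serializer
// and the converters that read the state agree on the same unique key.
public static class SerializationStateKeys
{
    public static readonly object ValueMultiplier = new object();
}
```

Both the StartingContext initialization and the converter's indexer lookups would then reference SerializationStateKeys.ValueMultiplier instead of a string literal.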

Async converters

MessagePackConverter<T> is an abstract class that requires a derived converter to implement synchronous Write and Read methods. The base class also declares virtual async alternatives to these methods (WriteAsync and ReadAsync, respectively) which a derived class may optionally override. These default async implementations are correct; they essentially buffer the whole msgpack representation while deferring the actual serialization work to the synchronous methods.

For types that may represent a great deal of data (e.g. arrays and maps), overriding the async methods in order to read or flush msgpack in smaller portions may reduce memory pressure and/or improve performance. When a derived type overrides the async methods, it should also override PreferAsyncSerialization to return true so that callers know that you have optimized async paths.

The built-in converters, including those that serialize your custom data types by default, already override the async methods with optimal implementations.

Register your custom converter

There are two ways to get the serializer to use your custom converter.

Note that if your custom type is used as the top-level data type to be serialized, it must still have GenerateShapeAttribute applied as usual.
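For example, if MyCustomType from the earlier sample is itself passed directly to Serialize or Deserialize, it still needs its own shape (a sketch combining the two attributes):

```csharp
[GenerateShape] // required when MyCustomType is the top-level serialized type
[MessagePackConverter(typeof(MyCustomTypeConverter))]
public partial class MyCustomType
{
    public string? Message1 { get; set; }

    public string? Message2 { get; set; }
}
```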

Attribute approach

To get your converter to be automatically used wherever the data type that it formats needs to be serialized, apply a MessagePackConverterAttribute to your custom data type that points to your custom converter.

[MessagePackConverter(typeof(MyCustomTypeConverter))]
public class MyCustomType { }

Runtime registration

For precise runtime control of where your converter is used and/or how it is instantiated and configured, you may register an instance of your custom converter with an instance of MessagePackSerializer using its RegisterConverter method.

MessagePackSerializer serializer = new();
serializer.RegisterConverter(new MyCustomTypeConverter());