Avoiding allocations without allowing default zero values for value types

Value types in C# can't have a parameterless ctor (and even in language versions that allow one, default(T) bypasses it): the CLR's default behaviour when creating an instance without invoking a constructor is to just zero all the bits.

Assume that I have a use case where I want a type that enforces some invariants on an underlying value, and those invariants disallow the default all-zero state. This forces me into using a reference type. Assume now that I'm making a lot, and I mean a lot, of instances of this type. Performance is critical, and those allocations add up and put a lot of pressure on the GC. I'd very much like to avoid those allocations, and using a value type is the first thing that comes to mind. But alas, I can't, since the default value is invalid.

Assume that the type satisfies all the other criteria you'd want from a value type, i.e. it represents a single value, has value semantics, and takes up at most 16 bytes. The only catch is that the all-zeroes value is an invalid state for this type.

How would I achieve my performance goals, which are verifiably constrained by the GC pressure in the outlined scenario, without sacrificing the invariants contracted by my type?

EDIT:

A very quick example, assume I hold a sequence of 64 bits along with an index of the first non-zero bit.

public <struct/class> BitSequence64
{
    private long _bits;
    private int _firstNonZero;

    public IEnumerable<byte> Bytes => ...

    // A bunch of helper properties.

    public BitSequence64(long value)
    {
        // Set the _firstNonZero, etc.
        ...
    }

    // Methods that let you twiddle the bits while maintaining
    // the invariant of always having at least one non-zero bit.
}

So obviously setting _bits to all-zero makes no sense, especially since _firstNonZero would then be 0, pointing at the first bit, which isn't non-zero. There are a lot of these individual sequences, and I would very much like dependants of this type to be able to use it safely without validating that it isn't a default value every time it's passed to a public-facing API.



Solution 1:[1]

A technique I've used with a good deal of success is to alter how the internal state is interpreted so that default(T) for value type T is valid when zero is not valid or a non-zero default is desired.

Exactly how you accomplish that will depend on what constitutes the type's internal state. Generally speaking, at least one of the member fields will store a value that is not the same as what the consumer sees. Instead, the type's public interface will account for the differences coming into and going out of the type.

To determine how to do that, it helps to understand the natural properties (mathematical or otherwise) of the types that make up the internal state.

The first thing to do is select a reasonable default for the type. Obviously, the natural default(T) has been deemed unreasonable, so something else needs to be selected. For example, that might be the minimum value that's within the valid range. Whatever it is, though, it will inform how inputs need to be adjusted before storing them, and how internal values need to be adjusted (by the inverse operation) before returning them.

A Starter Example

A very contrived and rudimentary example of this technique is the following Year wrapper type.

Caution: DO NOT use this example as-is; it's for demonstration purposes only.

public readonly struct Year
{
    private const int Delta = 2000;

    private readonly int _value;

    public Year(int value)
    {
        _value = value - Delta;
    }

    public int Value => _value + Delta;
}

Here, the default is the Delta constant used to adjust the internal state. In default(Year), _value will be 0 but the Value property will return 2000. Similarly, new Year(2000) will translate 2000 to 0 on input and back to 2000 on output. Another way to think of it is that _value represents an offset from the default.
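
To make the round trip concrete, here's a minimal usage sketch (assuming the Year struct above):

var zeroed = default(Year);
Console.WriteLine(zeroed.Value);   // 2000 -- the chosen default, not 0

var year = new Year(2024);
Console.WriteLine(year.Value);     // 2024 -- adjusted on the way in, restored on the way out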

As you build functionality around this internal representation, it's important to remember that only the constructor and the Value property should access the backing field. Everything else, even private members, should go through the Value property to ensure consistency. Likewise, code that creates new instances should use the constructor and pass it consumer-facing values. Using the backing field anywhere else invites subtle bugs, so it's best to avoid doing that. Unit tests are crucial to ensure consistency.
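
For instance, a member that produces the following year -- a hypothetical Next method, not part of the original example -- would be written against the public surface:

public Year Next() => new Year(Value + 1);   // reads via Value, writes via the constructor; never touches _value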

The Type in Question

The case of BitSequence64 is a little trickier because of that particular invariant. In the comments, it looked like a default of 1 -- only bit 0 set -- might be a reasonable default for the type. From here on out, I'll operate under the assumption that it is.

This can be accomplished by XOR'ing the actual value with 1. That's pretty nice because XOR'ing with 1 is its own inverse operation: for any value v, (v ^ 1) ^ 1 == v, so the same adjustment both stores and recovers the value.

Now, default(BitSequence64) is valid because it represents the same value as new BitSequence64(1L), which is also valid.

using System;
using System.Collections.Generic;
using System.Numerics;

public struct BitSequence64
{
    private const long DefaultBit = 1L;
    private long _value;
    private int _firstNonZero;

    public BitSequence64(long value)
    {
        if (value == 0)
            throw new ArgumentException("At least one bit must be set.", nameof(value));

        _value = value ^ DefaultBit;
        // Computed from the consumer-facing input, not the adjusted _value.
        _firstNonZero = GetFirstNonZero(value);
    }

    public long Value => _value ^ DefaultBit;

    public int FirstNonZero => _firstNonZero;

    // Note that this property uses the post-adjustment, consumer-facing value.
    public IEnumerable<byte> Bytes => BitConverter.GetBytes(Value);

    private static int GetFirstNonZero(long value)
    {
        // One way to find the index of the lowest set bit; swap in your own implementation if needed.
        return BitOperations.TrailingZeroCount((ulong)value);
    }

    // And, of course, let's not forget the members that do bit-twiddling
    // while maintaining the invariants.
    // ...
}
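
A quick sanity check of that equivalence, as a usage sketch:

var a = default(BitSequence64);
var b = new BitSequence64(1L);

Console.WriteLine(a.Value == b.Value);                 // True: both expose the value 1
Console.WriteLine(a.FirstNonZero == b.FirstNonZero);   // True: both report bit 0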

A default of 1 is convenient, but what if you needed the default to be 0x8000_0000_0000_0000 (only the most significant bit set)? A zeroed _firstNonZero would no longer be correct, since the first non-zero bit of that default is bit 63, not bit 0.

It's easy enough to account for that in the internal state. We can redefine _firstNonZero as the distance from bit 63 "downward" instead of the distance from bit 0 "upward". In addition to XOR'ing the value with the most significant bit instead, modify the constructor and the FirstNonZero property to perform that translation on _firstNonZero.
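
A minimal sketch of that variant, assuming the same shape as the type above (only the constant and the translation change):

public struct BitSequence64
{
    // The MSB of a long; written with unchecked since the literal overflows a signed long.
    private const long DefaultBit = unchecked((long)0x8000_0000_0000_0000);
    private long _value;
    private int _firstNonZero;   // stored as 63 - (index of the first non-zero bit)

    public BitSequence64(long value)
    {
        if (value == 0)
            throw new ArgumentException("At least one bit must be set.", nameof(value));

        _value = value ^ DefaultBit;
        _firstNonZero = 63 - GetFirstNonZero(value);   // 0 when defaulted => bit 63, as required
    }

    public long Value => _value ^ DefaultBit;

    public int FirstNonZero => 63 - _firstNonZero;

    private static int GetFirstNonZero(long value) =>
        System.Numerics.BitOperations.TrailingZeroCount((ulong)value);
}

Now default(BitSequence64) exposes Value == long.MinValue (only bit 63 set) and FirstNonZero == 63, both of which are consistent.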

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1