(More) Portable BitFields Using C++11

There are lots of reasons to using C++’s bit field feature. Perhaps you need a more compact way to represent your data structures, maybe you need to use them interact with hardware, or if you’re writing an emulator, maybe you want to use them to efficiently represent the hardware that you’re emulating. The list goes on…

NOTE: There has been some discussion in the comments about degree of portability of this technique. There seems to be a growing consensus that this is in fact undefined behavior. For system level code, using techniques that are technically UB is pretty much standard practice, but I wanted to fully disclose that unfortunately, this technique is likely UB and it is possible that some compilers/architectures could emit code that has unexpected results.


So, I often see code that looks like this:

union Register {
    uint8_t value;
    struct {
        uint8_t field1 : 2;
        uint8_t field2 : 3;
        uint8_t field3 : 3;
    } bits;
};

The idea is simple. Given an object of type Register, we can easily get the whole 8-bit value. Or you access any of the 3 sub-fields using easy to read, intuitive code, like this:

Register r = { 0 };
r.field1 = 1;
r.field2 = 7;

Instead of the less clear:

uint8_t r = { 0 };
r = (r & ~0x03) | 1;
r = (r & ~0x1c) | (7 << 2);

Bit fields sound infinitely better, right? Well, there is a downside. The ordering of the bit fields is implementation defined, meaning that if you really care about the actual bit positions, then you have to double check that your compiler will lay out the bits the way that you expect. Not only that, but if you ever change compilers you may be chasing a bug that is very hard to spot due to the code looking correct. The bit twiddling version may less clear, and harder to maintain. But at least it is absolutely clear which bits are being set to which. Can’t we have the best of both worlds? Can’t we do that without sacrificing efficiency? The answer is YES!

To maximize flexibility, we’ll want a way to select the underlying type based on the “highest” bit we need to touch. For example, if we have a bit field that starts at bit #5 and it 8 bits wide, we’ll need at least a uint16_t to represent it correctly. Fortunately, C++11 <type_traits> offers a nice and terse way to implement this. std::conditional. You can think of std::conditional as a ternary if, but for a static condition, and will result in 1 of 2 types instead of values. Using this, we can make the following helper class:

template <size_t LastBit>
struct MinimumTypeHelper {
    typedef /* depends on LastBit */ type;
}

We are going to create a structure with a single member type. This type, depending on the integer constant will be one of 5 types. This is implemented using the previously mentioned std::conditional, which works like this:

template <bool B, class T, class F>
struct conditional;

If B resolves to true, the member type will be T. Otherwise, it will be F. For example: std::conditional<true, int, float>::type will equal int. Building on this, we can write the following:

template <size_t LastBit>
struct MinimumTypeHelper {
    typedef
        typename std::conditional<LastBit == 0 , void,
        uint8_t>::type type;
};

which simply makes MinimumTypeHelper<N>::type equal void if N is zero, and uint8_t otherwise. But now we want to support 16-bit types. So we replace the false condition with another std::conditional statement, like this:

template <size_t LastBit>
struct MinimumTypeHelper {
    typedef
        typename std::conditional<LastBit == 0 , void,
        typename std::conditional<LastBit <= 8 , uint8_t,
        uint16_t>::type>::type type
};

We need all of the ::type‘s in there because each std::conditional results in a class which has a type member that is set to one of the parameters given to std::conditional. For each one, we need that type to be passed as a parameter to “parent” conditional. Continuing with this pattern through uint64_t, we end up with the following complete and extensible implementation:

template <size_t LastBit>
struct MinimumTypeHelper {
    typedef
        typename std::conditional<LastBit == 0 , void,
        typename std::conditional<LastBit <= 8 , uint8_t,
        typename std::conditional<LastBit <= 16, uint16_t,
        typename std::conditional<LastBit <= 32, uint32_t,
        typename std::conditional<LastBit <= 64, uint64_t,
        void>::type>::type>::type>::type>::type type;
};

So, now given a bit width, we can easily select a type with MinimumTypeHelper<N>::type. We can finally create our bit field.

template <size_t Index, size_t Bits = 1>
class BitField {
private:
    enum {
        Mask = (1u << Bits) - 1u
    };

    typedef typename MinimumTypeHelper<Index + Bits>::type T;
public:
    template <class T2>
    BitField &operator=(T2 value) {
        value_ = (value_ & ~(Mask << Index)) | ((value & Mask) << Index);
        return *this;
    }

    operator T() const             { return (value_ >> Index) & Mask; }
    explicit operator bool() const { return value_ & (Mask << Index); }
    BitField &operator++()         { return *this = *this + 1; }
    T operator++(int)              { T r = *this; ++*this; return r; }
    BitField &operator--()         { return *this = *this - 1; }
    T operator--(int)              { T r = *this; --*this; return r; }

private:
    T value_;
};

Let’s break this down bit-by-bit (oh I crack myself up :-P).

  • First we have the enum Mask. For a given bit count, this will create a mask of that many bits. So if Bits == 7, then it will result in 0x7f, a pattern of 7 set bits.
  • Next we have typedef typename MinimumTypeHelper<Index + Bits>::type T;, which looks complex, but isn’t too bad once we look more closely. All this is doing is defining a member type based on a type which can support Index + Bits bits. We need to add them because a bit field 3 bits big at position 11 requires a type of at least 14 bits to represent.
  • From this point forward, we actually have pretty straight forward code. The constructor will take a value of any type, mask it, and do traditional bit twiddling to insert it into the value. Using SFINAE, we could restrict the parameter to an integer type if desired.
  • We allow conversion to the base type, so it’s trivial to extract the integer value of just this field using operator T() const.
  • We allow testing of the field as a boolean using explicit operator bool() const
  • Finally, for convenience, we have both increment and decrement operators.

Ok, so we have our class! How the heck do we use it? It’s actually very simple. Once we have all of this framework in place. Re-hashing the example we started with. We could re-write it like this:

union Register {
    uint8_t value;
    BitField<0,2> field1; // field1 is bits [0-2)
    BitField<2,3> field2; // field2 is bits [2-5)
    BitField<5,3> field3; // field3 is bits [5-8)
};

Using is just as simple as the original code:

Register r = { 0 };
r.field1 = 1;
r.field2 = 7;

But we now have full control over how the bits will laid out in memory, and don’t have to worry about portability between compiler implementations and versions. In fact, we just gained a feature previously unmentioned. What if we want a “meta-field” which will alias one or more fields? Using this approach we can have overlapping fields! Let’s use a hardware register as an example:

union Video {
    uint8_t raw;
    BitField<0,1>  background_enabled;
    BitField<1,1>  sprite_enabled;
    // some more fields...

    // meta-fields which don't occupy any space :-)
    BitField<0,2> screen_enabled;
};

In this example, we can independently test if the background or sprites are enabled. But what if we want to know if either are enabled? We we could write:

bool screen_enabled = v.background_enabled || v.sprite_enabled;

But that’s annoying. Because screen_enabled occupies the same space as both of those fields, we can just write:

bool screen_enabled = v.screen_enabled

Using this approach, any set of contiguous bits can be aliased without any size or performance penalty.

What about efficiency? We looking at what GCC and Clang emit, this approach appears to cause them to emit nearly identical code (often exactly identical). There is one last tweak that we could choose to do. A bit field of size one is really just a boolean, so we can specialize for that to make it clear we have boolean semantics and ensure that the compiler will emit optimal code:

template <size_t Index>
class BitField<Index, 1> {
private:
    enum {
        Bits = 1,
        Mask = 0x01
    };

    typedef typename MinimumTypeHelper<Index + Bits>::type T;
public:
    BitField &operator=(bool value) {
        value_ = (value_ & ~(Mask << Index)) | (value << Index);
        return *this;
    }

    explicit operator bool() const { return value_ & (Mask << Index); }

private:
    T value_;
};

This code is very similar, but a bit simpler since we just want to be able to get or set the boolean value of the field. Putting it all together into a ready to use header, we have this:


#ifndef BITFIELD_H_
#define BITFIELD_H_

#include <cstdint>
#include <cstddef>
#include <type_traits>

using std::uint8_t;
using std::uint16_t;
using std::uint32_t;
using std::uint64_t;

namespace {

template <size_t LastBit>
struct MinimumTypeHelper {
    typedef
        typename std::conditional<LastBit == 0 , void,
        typename std::conditional<LastBit <= 8 , uint8_t,
        typename std::conditional<LastBit <= 16, uint16_t,
        typename std::conditional<LastBit <= 32, uint32_t,
        typename std::conditional<LastBit <= 64, uint64_t,
        void>::type>::type>::type>::type>::type type;
};

}

template <size_t Index, size_t Bits = 1>
class BitField {
private:
    enum {
        Mask = (1u << Bits) - 1u
    };

    typedef typename MinimumTypeHelper<Index + Bits>::type T;
public:
    template <class T2>
    BitField &operator=(T2 value) {
        value_ = (value_ & ~(Mask << Index)) | ((value & Mask) << Index);
        return *this;
    }

    operator T() const             { return (value_ >> Index) & Mask; }
    explicit operator bool() const { return value_ & (Mask << Index); }
    BitField &operator++()         { return *this = *this + 1; }
    T operator++(int)              { T r = *this; ++*this; return r; }
    BitField &operator--()         { return *this = *this - 1; }
    T operator--(int)              { T r = *this; --*this; return r; }

private:
    T value_;
};


template <size_t Index>
class BitField<Index, 1> {
private:
    enum {
        Bits = 1,
        Mask = 0x01
    };

    typedef typename MinimumTypeHelper<Index + Bits>::type T;
public:
    BitField &operator=(bool value) {
        value_ = (value_ & ~(Mask << Index)) | (value << Index);
        return *this;
    }

    explicit operator bool() const { return value_ & (Mask << Index); }

private:
    T value_;
};

#endif

Happy bit twiddling!

19 thoughts on “(More) Portable BitFields Using C++11

  1. Wilfried van Asten

    Thanks very much for this! Now my emulator’s code will be that much more readable and easy to maintain. I wrote a lot of the code before I knew about the useful union of bitfield and uint8_t, but am slowly transitioning to that format. Now I can do so and still be certain the code will remain portable πŸ˜€

  2. Marco Nilsson

    How does this do with gcc-4.4.7 since it’s lacking in C++11 support? I’m trying it now an I’m getting a lot of ambiguity errors:


    error: conversion from ‘Base::BitField’ to ‘uint8_t’ is ambiguous

    src/SensorReader.cpp:371: error: ambiguous overload for ‘operator==’ in ‘other.::ExhubAddress::can_addr == exhub.::ExhubAddress::can_addr’
    src/SensorReader.cpp:371: note: candidates are: operator==(long unsigned int, long unsigned int)
    src/SensorReader.cpp:371: note: operator==(long unsigned int, int)
    src/SensorReader.cpp:371: note: operator==(int, long unsigned int)
    src/SensorReader.cpp:371: note: operator==(int, int)

    Is it fixable or does it need a more modern compiler?

  3. Evan Teran

    @Marco,

    I beleive that it is possible to get it to work with C++03, but it’s going to need a LOT of helper code. In particular, `MinimumTypeHelper` will probably have to be done completely differently and just have a specialization for each size you wish to support. It would be a lot less flexible, but it would work (i think).

  4. Wil

    This looks nice, though it seems it does not handle all of the possible issues with mapping hardware registers. In particular, if the Register setup above were mapped to a hardware register,

    a = field1;
    b = field2;

    will result in 2 reads of the register. If the read of this hardware register clears its value, then field2 may not get the correct/expected value.

    Also, I’m wondering what happens when you assign a value of 8 (1000b) to a bit field of width 3 as you show? πŸ™‚ (I know, it will get masked to 0 – which is perhaps what you meant to demonstrate?)

  5. Evan Teran Post author

    So, you are correct, this does read the whole “register” at a time, which if a read can cause side effects, is potentially bad for some use case. That’s a pretty noteworthy caveat, so thanks for pointing it out.

    The 8 was definitely a typo, I meant to place a 7 there. But yes, it would get masked to 0.

  6. db

    “Well, there is a downside. The ordering of the bit fields is implementation defined, meaning that if you really care about the actual bit positions, then you have to double check that your compiler will lay out the bits the way that you expect.”

    Unfortunately, it’s very important to note that another thing that is implementation-defined / undefined behaviour is precisely what you’re doing here – type punning via unions. Technically, the Standard only guarantees the state of the active member of union, i.e. the last member that was accessed before the current operation. It does not specify that if you read into the uint8_t, then write to bits 0~2 using a Bitfield instance, and then try to read 3~5 using another Bitfield, that you will get what you expect. Technically. a compiler is well within its rights to return random values for the 2nd and 3rd of these accesses.

    Sadly, I’m not aware of a way around this (without messes like, e.g. using an external class that refers to the raw byte without a union). MOST compilers/implementations will conveniently behave how you imply – as I can attest to that after extensive usage of a much more sophisticated version of this pattern in my own programs. However, this extremely convenient behaviour is, sadly, very much NOT guaranteed, so you should have a very clear caveat about that prominently in your post.

    I only just learned this and am sort of wishing I hadn’t embedded a technically undefined operation so deeply in several of my current projects. Your post gave me no warning that I’d be doing so.

  7. db

    Having said all that, this might well provide some encouragement for us:
    http://stackoverflow.com/a/11906997

    This quote from the standard implies that, although as mentioned, packing multiple Bitfield structs into a union and accessing them vs another type might be UB…
    …accessing the Bitfields themselves might not, so long as they are “structs that share a common initial sequence [, so] it is permitted to inspect the common initial part of any of them.”

    So, if like your Bitfield here (but not all of mine), they are all structs/classes beginning with the same sequence (member uint8_t), maybe this is standard guaranteed! πŸ™‚ I really hope so, otherwise I have a tonne of rewriting to do… I have to produce portable, defined-behaviour code as far as possible.

  8. AWJ

    Hi,

    This is very clever code, but it isn’t actually portable when the “raw” type underlying the bitfields is larger than a uint8_t. Take this example, pretty typical for an emulated machine’s scroll registers:

    union ScrollRegister {
    uint16_t raw;
    BitField<0,3> fine;
    BitField<3,13> coarse;
    };

    Here the “raw” member is a uint16_t, but the metaprogram-determined type for “fine” is uint8_t. When you access the “fine” member, you’re type-punning a uint16_t as a uint8_t. This will probably work when compiled for Intel or little-endian ARM targets, but it’s compiler-dependent at best and definitely won’t work on a big-endian target (not that there are many of those around now that Intel and ARM have taken over the world… but anyway)

    Rather than calculating the minimum type for each bitfield using template metaprogramming, portability requires that you explicitly specify the same type as the underlying “raw” data for each of the bitfields in the union.

  9. AWJ

    Remember that you can make it truly portable (and, AFAIK, completely standards-compliant C++11!) just by explicitly specifying the underlying integer type.

    Another thing these bitfields are terrifically useful for is decoding instructions in a CPU emulator or a disassembler, especially DSP/RISC/VLIW architectures.

  10. db

    AWJ – “This is very clever code, but it isn’t actually portable when the β€œraw” type underlying the bitfields is larger than a uint8_t.”

    Actually, when it’s not a [unsigned/signed/plain] char. Something that a lot of people don’t realise is that uint8_t is not guaranteed to be a typedef for unsigned char and, under some bizarre exotic implementation, might generate implementation-/undefined behaviour too. Unlikely? Sure. Worth the risk? Nope.

    I had never considered using anything other than unsigned char in such templates. Even when I need to refer to multi-byte things, like my endian class, I use multiple unsigned char and do all the manipulation myself.

    And yes, this sort of class can be incredibly useful, especially when combined with the ‘common initial sequence’ allowance. I’d have a near impossible task if these things weren’t well-defined behaviour. (jinx, touch wood, etc.)

  11. D

    I’m now leaning towards the opinion that this is indeed UB.

    My growing concern was that the ‘read from any union member in the common initial sequence’ (CIS) allowance seemed formally only to apply to allowances made via a named union instance and a named CIS-having struct therein… NOT accesses made via such a union member’s __own_ member function, which has no idea it’s in a union.

    Due to my growing doubts, I asked about this following thread on the C++ std-discussion mailing list. A convincing – and even worse – counterargument has been raised: Whereas the allowance is made for simply reading members of the CIS via the union type – in contrast – calling a member function on a union member that isn’t active is UB, even if it _doesn’t_ read any data member…

    Your ‘portable bitfields’ involve such calls, where the call itself is UB. So it’s worse than a question of whether the call can portably reinterpret a CIS member last written by a different union member. The call simply _cannot be made_.

    I mean, draw your own conclusions… but I would not recommend using this pattern as it seems to be formally UB per the Standard. And now I have to totally rethink my whole program so as to avoid it. Thanks…

    https://groups.google.com/a/isocpp.org/forum/#!topic/std-discussion/mc4LuH7Hx7A/unsubscribe

  12. Evan Teran Post author

    @D,

    Hmm, it seems that your conclusions about this being UB are correct. Unfortunately, there doesn’t seem to be a particularly good solution for this use case that isn’t UB in one form or another.

    Personally, I would love it if the C++ standards comity decided to change the rules such that unions consisting of only POD types can have any members read/written regardless of the “active” state.

    This would make a lot of code like this simpler. Sorry if this entry led you in the wrong direction :-/

  13. D

    Evan: Ah well, we live and learn.* Thank you for listening and adding the warning, which is the most important thing. I appreciate that.

    *What we mostly learn in C/C++ Land is that intuition isn’t worth squat! As this all made perfect ‘intuitive’ sense to me till recently, when I started ponderning what the “common initial sequence” allowance really means (ignoring the dangerously vague wording added to C only by N685) and becoming aware of how nebulous the concept of lifetime is in C++ (and, but to a lesser extent, C). By the rules, we can’t have what we imagined – several objects living at and using the same address – despite how safe and easy it seems.

    And sure, our intuitions matched how a non-prankster compiler handles such code, but for how long? It’s documented that g++ – currently implementation-defines (same as C99) union type-punning to do the bytewise reinterpretation you want.. But I’m not aware that g++ defines the result of calling methods on inactive objects… and anyway, it’s fatal to rely on things that are NOW implementation-defined but which, by that very fact, need not work in the next compiler version. If changes to the optimisation process break this, whether intentionally or not, then we’d better hope we’ve already completed our replacement of it all.

    Speaking of which, my refactoring is going fairly well, and perhaps ironically, I’m ending up with code very similar to my own original idea, before I ever found this… πŸ˜‰ Having said that, I’m also migrating over the various enhancements I ended up making to your ideas here, so it’s not been entirely without benefit πŸ˜‰

  14. Brian

    Small typo; the post-decrement operator in the final snippet is calling the increment operator. Looks like a copy/paste bug.

  15. Moz in Oz

    Just in terms of debugging on trickster compilers, couldn’t you assert the assumptions? I’m sure there’s some trickery possible to declare, set then test in a static_assert, but it definitely seems to work as C-style dynamic asserts. I split the const out to help make it obvious what’s going on):

    assert((Register{1<<5 + 1<<2 + 1}).bits.field1 == 1); // field1 in LSB
    assert((Register{1<<5 + 1<<2 + 1}).bits.field2 == 1);
    assert((Register{1<<5 + 1<<2 + 1}).bits.field3 == 1);
    assert((Register{1<<5 + 1<<2 + 1}).bits.field3 == 3); // should fail!

    dem: main.cpp:34: int main(): Assertion `(Register{32 + 4 + 1}).bits.field3 == 2' failed.

Leave a Reply

Your email address will not be published. Required fields are marked *