Will we get rid of the 32 bit flag variables?


Bandobras
December 20, 2007, 19:26
Here is another old rant of mine (from the same post as the one about indentation). Any news? I know it's not so essential for V, because it does not expand (so much), but since V is a base for variants...

One more question --- was there any attempt to rewrite the ugly multiple 32 bit flag variables (bitfields) into some decent scheme, like packed bit arrays or even byte arrays or linked lists of flags, etc.? I mean, any attempt other than the new TOME engine. If not, why is it not the first priority of the V maintainer? (I know one answer --- because V does not need more flags. :))

Edit: I've found two implementations of bit vectors license-compatible with Angband: publib and libbit-vector-perl (which is a C library under the hood, despite the name). I guess linked lists would be much more efficient and understandable, but bit vectors would induce the fewest code changes (in particular, no problems with serialization).

Edit2: Here is a relevant r.g.r.a thread: http://groups.google.pl/group/rec.games.roguelike.angband/browse_thread/thread/7baa7846eee76469/fe132c46d9afa5d7?q=%22bitfield%22&lnk=ol&hl=en&
Several years and several variants later, people still expand the number of TR1_, TR2_, TR3_, ... bitfields by hand over and over again. BTW, one of the advocates of bit vectors in that thread was the current V maintainer, so there is still hope...

P.S. I have no clue which licence or which maintainer I had in mind...

konijn_
December 20, 2007, 22:25
Here is another old rant of mine (from the same post as the one about indentation). Any news? I know it's not so essential for V, because it does not expand (so much), but since V is a base for variants...

P.S. I have no clue which licence or which maintainer I had in mind...

I just don't know; there was stuff out there in 2001 that made the bitflags look saintly. So I am assuming that was really a pet peeve instead of a serious problem ;)

I would say the number one issues for V are mouse code and big-screen support for all platforms, small-screen support for portable devices, makefiles for RISC OS, makefiles that make nice OS X applications, makefiles that can create .deb files, etc.

But that's me of course ;)

Cheers,
T.

zaimoni
December 21, 2007, 01:13
"Edit: ... I guess linked lists would be much more efficient and understandable, but bit vectors will induce the least code changes (in particular, no problems with serialization)."

In this particular application, linked lists are automatically bloated. (Generally, any of the one-datum one-node data structures are automatically bloated and should be avoided whenever either RAM efficiency or memory manager efficiency dominates other considerations. For V, it's RAM efficiency.)

Also: C vs. C++ makes a huge difference. Efficient bitvectors with the high-level flags proposed in the Google-indexed thread are much easier to do in C++.

Bandobras
December 21, 2007, 06:54
I just don't know; there was stuff out there in 2001 that made the bitflags look saintly. So I am assuming that was really a pet peeve instead of a serious problem ;)

As a person who added a couple of stats, and so a few bits to the bitfields all around the code in a certain variant, I assure you this is a nightmare. And I didn't even have the ambition to add a whole new TR17_* bitfield, though the bits I added didn't quite fit in the old bitfields. That was just too horrible to attempt. (As a result, a certain variant has rings that sustain two stats and mushrooms that restore two stats, because one bit had to be used for two stats in many cases --- which turned out to be a good game-play idea, surprisingly. :)) Moreover, variants have different numbers of TR* bitfields, which makes their code unnecessarily divergent even in places that don't use the new features.

In this particular application, linked lists are automatically bloated.

This is not so automatic.

(Generally, any of the one-datum one-node data structures are automatically bloated and should be avoided whenever either RAM efficiency or memory manager efficiency dominates other considerations. For V, it's RAM efficiency.)

In that case, lists may be better than bitvectors, because items and monsters usually have only a few of the tens or hundreds of flags, depending on variant. Then, for an average item, you have, say, one two-word list node with the only flag the item actually has. The alternative is a set of fixed 4- or 6- or 10-word long bitvectors with all but one flag at 0.

If the optional properties were integers instead of booleans, the lists would be unbeatable. But there are still only a few such properties, e.g. monster mana, which many monsters just don't have.
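To make the size comparison concrete, a minimal sketch of the two layouts on a 32-bit machine; the names (flag_node, flag_set, MAX_FLAGS) are invented for illustration:

#include <stdint.h>

/* Linked-list layout: one node per flag the item actually has.
 * Each node is two words (flag + next) before any allocator overhead. */
struct flag_node {
    uint32_t flag;            /* which flag this node carries */
    struct flag_node *next;   /* NULL-terminated singly linked list */
};

/* Bit-vector layout: a fixed block of words per item, mostly zero bits
 * for a typical item. 160 flags -> 5 words = 20 bytes, always. */
#define MAX_FLAGS  160
#define FLAG_WORDS ((MAX_FLAGS + 31) / 32)

struct flag_set {
    uint32_t bits[FLAG_WORDS];
};

Under these assumptions, an item with a single flag costs 8 bytes of payload plus allocator overhead in the first layout and a flat 20 bytes in the second, so the trade-off depends on how many flags a typical item carries.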

Also: C vs. C++ makes a huge difference. Efficient bitvectors with the high-level flags proposed in the Google-indexed thread are much easier to do in C++.

I think the main issue is ease of use. If we employ an external library, we don't even have to worry about the details (until we try to port the library to some obscure mobile phone, that is...). I wonder if there are any simple, tried solutions...

konijn_
December 21, 2007, 13:42
As a person who added a couple of stats, and so a few bits to the bitfields all around the code in a certain variant, I assure you this is a nightmare. And I didn't even have the ambition to add a whole new TR17_* bitfield, though the bits I added didn't quite fit in the old bitfields. That was just too horrible to attempt. (As a result, a certain variant has rings that sustain two stats and mushrooms that restore two stats, because one bit had to be used for two stats in many cases --- which turned out to be a good game-play idea, surprisingly. :)) Moreover, variants have different numbers of TR* bitfields, which makes their code unnecessarily divergent even in places that don't use the new features.

Well, I added several bitfields to a certain variant and it worked out just great.

Cheers,
T.

Bandobras
December 21, 2007, 14:35
Well, everybody knows primitive evil variants are easy to expand. ;P

Seriously, it may be easier with no 4GAI and no extended (pseudo-)ID logic. Or perhaps you are just a boy-genius with a nice IDE or 'grep' mastery. Or perhaps you are a variant maintainer, which means you keep the whole code in your head all the time anyway. ;) Me, I'm just a sidekick.

zaimoni
December 21, 2007, 16:04
Seriously, it may be easier with no 4GAI and no extended (pseudo-)ID logic.

Let's not give me yet more excuses for reimplementing the 4GAI.

Or perhaps you are just a boy-genius with a nice IDE or 'grep' mastery.

It would seem to me that average intellect is well up to

grep -n -F ____ * > Test.txt

where ____ is any fixed string of the exact case of interest.

Bandobras
December 21, 2007, 16:31
Well,

grep -n -F "&f4" * > Test.txt

as in (but also in several other related functions' invocations)

/* Extract the flags */
object_flags(o_ptr, &f1, &f2, &f3, &f4);

produced 58 matches in the Un code, and this is just one randomly chosen pattern out of all those that have to be checked when adding f5... ;<

zaimoni
December 21, 2007, 16:47
In that case, lists may be better than bitvectors, because items and monsters usually have only a few of the tens or hundreds of flags, depending on variant. Then, for an average item, you have, say, one two-word list node with the only flag the item actually has. The alternative is a set of fixed 4- or 6- or 10-word long bitvectors with all but one flag at 0.

This is truer for the savefile size than the in-memory size. Any moderately-full level's going to have enough nodes (about a thousand should be enough) to take minutes to load if those linked lists are ultimately malloc'd, even with a non-debugging memory manager. And if you're using an alternate method to allocate memory for the linked lists -- any "simplicity gains" are a mirage, even if they're wrapped in an external library.

With a linked list, the RAM cost per node includes
* at least one pointer to the next node [one for a singly-linked list]
* memory manager overhead for the node itself. I suspect the typical minimum is 8 bytes on a 32-bit pointer system with a traditional *NIX tree-indexed free memory heap and no automatic debugging checks. My reimplementation for Windows costs 16 bytes in overhead.

Using a C external library doesn't change any of this. And I don't see how a C external library can get you the efficient AND/OR testing of bitwise conditions.
I think the main issue is ease of use.

Within C:
* Yes, tracking which flag goes with which bitmap bucket is a pain.
* The simplest way to handle that as a bitmap
** convert the bitmap flag holders to an array of uint32_t
** provide an alternate set of macros that doesn't have the TR mess, just the logical names
** tests then become (bitmap[FLAG/32] & (1<<(FLAG%32))), which a competent optimizing compiler should handle as efficiently as the current code.

Compile-time detecting when logical AND/OR tests can be merged is a bit trickier; we may have to retain the TR macros to support this transparently. The Boost preprocessor library wipes out at 256, which isn't really enough for all variants.
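A minimal sketch of that scheme, with invented names (OF_*, of_has, OBJ_FLAG_WORDS) rather than anything from an actual patch:

#include <stdint.h>

/* One flat index per logical flag, replacing the TR1_/TR2_/... split. */
enum {
    OF_SLAY_EVIL,
    OF_RES_CONF,
    OF_RES_CHAOS,
    /* ... */
    OF_MAX
};

#define OBJ_FLAG_WORDS ((OF_MAX + 31) / 32)

/* The flag holder becomes an array of uint32_t instead of f1..f4. */
typedef uint32_t object_flags_t[OBJ_FLAG_WORDS];

/* The test described in the bullet above. */
#define of_has(flags, FLAG) \
    (((flags)[(FLAG) / 32] & ((uint32_t)1 << ((FLAG) % 32))) != 0)

#define of_on(flags, FLAG) \
    ((flags)[(FLAG) / 32] |= ((uint32_t)1 << ((FLAG) % 32)))

With constant flag indexes the /32 and %32 fold away at compile time, so a test like of_has(flags, OF_RES_CONF) should compile down to the same single AND as today's TR-style check.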

zaimoni
December 21, 2007, 16:55
Within C:
* Yes, tracking which flag goes with which bitmap bucket is a pain.
* The simplest way to handle that as a bitmap
** convert the bitmap flag holders to an array of uint32_t
** provide an alternate set of macros that doesn't have the TR mess, just the logical names
** tests then become (bitmap[FLAG/32] & (1<<(FLAG%32))), which a competent optimizing compiler should handle as efficiently as the current code.

Compile-time detecting when logical AND/OR tests can be merged is a bit trickier; we may have to retain the TR macros to support this transparently. The Boost preprocessor library wipes out at 256, which isn't really enough for all variants.

Once V starts seeing SVN commits again, I'd be comfortable providing this patch (to see what it does to the source code readability).

Bandobras
December 21, 2007, 17:14
Once V starts seeing SVN commits again, I'd be comfortable providing this patch (to see what it does to the source code readability).

Wow, great!

Any moderately-full level's going to have enough nodes (about a thousand should be enough) to take minutes to load if those linked lists are ultimately malloc'd, even with a non-debugging memory manager.

This sounds awful. I have lots of experience with lists, trees, etc. in languages with fast garbage collection (no, I don't mean Java or Lisp) and I construct and destruct (Edit: not destruct, but abandon; destruction is done by the garbage collector) millions of nodes per second, I think. One bit of overhead per node, too. (Edit: yeah, I've just tested: my program, a complex compiler, consumes around 1MB of memory per second on my puny 2GHz 32-bit processor, which is around a half or a third of a million nodes; but during that second many more nodes are constructed and abandoned during very complex computations.)

And if you're using an alternate method to allocate memory for the linked lists -- any "simplicity gains" are a mirage, even if they're wrapped in an external library.

Once I worked with C code that had a long cycle of nodes preallocated at initialization, used and released from the cyclic pool. It was very error-prone and looked stupid, but now I begin to understand... You are right that such schemes detract from the simplicity and efficiency of lists.

** tests then become (bitmap[FLAG/32] & (1<<(FLAG%32))), which a competent optimizing compiler should handle as efficiently as the current code.


You mean this is how it looks after macro expansion? But even then it looks better than bitfield variables. :)

Compile-time detecting when logical AND/OR tests can be merged is a bit trickier; we may have to retain the TR macros to support this transparently.

I'd be surprised if it were a bottleneck, but even then we have low-level access to bitmap[], so we can test whole words of it at will or make specialized macros that depend on the location of similar flag groups inside bitmap[].

zaimoni
December 21, 2007, 18:13
Any moderately-full level's going to have enough nodes (about a thousand should be enough) to take minutes to load if those linked lists are ultimately malloc'd, even with a non-debugging memory manager.
This sounds awful.

Agreed. I ran into a much worse version of this while working on a chemical engineering project. (Building a 5,300+ node red-black binary tree was taking hours at 1GHz going through MinGW32's standard malloc implementation; ~800 nodes was still taking ~10 minutes.

Unfortunately, the red-black binary tree was backing a std::set, from which I needed the key-deduplication effect. This process was moved from the main flygram suite to a moderately configurable data table distiller.)

The data in these nodes was a pair of long doubles -- trivial construction/destruction. I'm pretty sure that less than one second went into everything other than juggling memory management.
** tests then become (bitmap[FLAG/32] & (1<<(FLAG%32))), which a competent optimizing compiler should handle as efficiently as the current code.
You mean this is how it looks after macro expansion? But even then it looks better than bitfield variables. :)

Yes.

zaimoni
December 22, 2007, 02:36
Once V starts seeing SVN commits again, I'd be comfortable providing this patch (to see what it does to the source code readability).
Wow, great!

Takkaria: yes or no on the proposed patch? (Would need to discuss structure on angband-dev if yes. There are some low-level dependencies that need consideration in the design stage.)

takkaria
December 22, 2007, 03:29
Takkaria: yes or no on the proposed patch? (Would need to discuss structure on angband-dev if yes. There are some low-level dependencies that need consideration in the design stage.)

I'm willing to see people's ideas, but I'm not committing yet. If you put together an example of the kind of code it'll result in, then I can give you a much better answer.

zaimoni
December 22, 2007, 08:24
** tests then become (bitmap[FLAG/32] & (1<<(FLAG%32))), which a competent optimizing compiler should handle as efficiently as the current code.

You mean this is how it looks after macro expansion? But even then it looks better than bitfield variables. :)

Regrettably, that is not a concrete gain I could use to rationalize sinking the effort into Zaiband. There are some low-level dependencies (the A_... stat constants and corresponding pval/sustain flags) that are noticeably harder to hand-verify post-change. [Ideally, these dependencies should be checked explicitly in debug builds by assert(). In C++, make that static asserts so that it won't even compile if gotten wrong.]

The concrete gains I've seen so far in prototype:
* using C_COPY and C_WIPE from z-virt.h in a few places
* iterating over loops controlled by array size to future-proof load/save code and the flag abstraction code in a few places
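For illustration only, a sketch of the kind of array-size-driven code these two bullets refer to; OBJ_FLAG_WORDS is a placeholder, and wr_u32b stands in for a savefile writer in the style of save.c:

#include <stdint.h>
#include <string.h>

#define OBJ_FLAG_WORDS 5                 /* placeholder word count */

extern void wr_u32b(uint32_t v);         /* assumed savefile writer, declared for the sketch */

/* Wiping a whole flag array; the real tree would use C_WIPE from z-virt.h,
 * which wraps memset much like this. */
static void flags_wipe(uint32_t flags[OBJ_FLAG_WORDS])
{
    memset(flags, 0, OBJ_FLAG_WORDS * sizeof(uint32_t));
}

/* Save loop driven by the array size: adding a sixth word later means
 * changing OBJ_FLAG_WORDS, not this function. */
static void wr_flags(const uint32_t flags[OBJ_FLAG_WORDS])
{
    int i;
    for (i = 0; i < OBJ_FLAG_WORDS; i++)
        wr_u32b(flags[i]);
}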

Pete Mack
December 22, 2007, 09:58
various people write:
** tests then become (bitmap[FLAG/32] & (1<<(FLAG%32))), which a competent optimizing compiler should handle as efficiently as the current code.


I don't understand what you want to do with this. It (or something vaguely like it) will work to store flag values, but it won't let you do ordinary bit-vector math.

In particular, what happens if you want to match more than one flag variable at a time? You may want to do things logically like:

flags & (RCONF | RCHAOS) /* Example from NPP */

You can't do this if the high-order bits and low-order bits are dependent the way they would be in your example.

You just can't OR together array indexes to get valid array indexes. They are ordinal numbers, not orthogonal bits.
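To spell that out with a small sketch (the RF_*_MASK and RF_*_IDX names are made up here): mask-style constants occupy distinct bits of one word and can be OR'd together, while index-style constants are just positions in an array and cannot.

#include <stdint.h>

/* Mask style: each constant is a distinct bit of the same word. */
#define RF_CONF_MASK  0x00000001UL
#define RF_CHAOS_MASK 0x00000002UL

/* Index style: each constant is a position in a larger bit array. */
#define RF_CONF_IDX   0
#define RF_CHAOS_IDX  1

/* Valid: both masks address the same word, so one AND tests both. */
static int resists_either_mask(uint32_t flags)
{
    return (flags & (RF_CONF_MASK | RF_CHAOS_MASK)) != 0;
}

/* With indexes, (RF_CONF_IDX | RF_CHAOS_IDX) is just another number (here 1),
 * not "both flags"; each index has to be converted to a mask first. */
static int resists_either_idx(const uint32_t *bitmap)
{
    return ((bitmap[RF_CONF_IDX / 32]  & ((uint32_t)1 << (RF_CONF_IDX % 32))) != 0)
        || ((bitmap[RF_CHAOS_IDX / 32] & ((uint32_t)1 << (RF_CHAOS_IDX % 32))) != 0);
}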

Bandobras
December 22, 2007, 14:02
The original example wasn't mine, but I couldn't ignore that one:

flags & (RCONF | RCHAOS) /* Example from NPP */

This is one of the many bad things induced by bit arithmetic. Such coding is very error-prone, especially when you extend your variant. If the 'flags' bitfield is not the one variable (of a few) that holds the RCONF and RCHAOS bits, you have an error that people may not notice for years. If the RCONF and RCHAOS bits land on opposite sides of a bitfield boundary you can get another error, and you cannot write the code in this way at all, so you get irregular code.

This should be written

object_has_flag(o_ptr, F_CONF) || object_has_flag(o_ptr, F_CHAOS)

and you are safe against all errors I mentioned and then some.

Edit: corrected my code :)

Edit2: the F_CONF, etc. flags can be the same constant for object flags, monster flags, character flags, etc. because there is plenty of space in a bit vector, so no problem with unused bits. The macros can catch inappropriate flags, though, to point out errors, or they can translate the flags to save space in the bit vector, when really needed.

Edit3: actually, bit vectors are not necessary for the high-level approach I proposed above; bitfields would do, with much messier macros, but even the following mid-level code is quite safe:

check_flag(vector, F_CONF) || check_flag(vector, F_CHAOS)
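A minimal sketch of what such a wrapper could look like over a flag array; every name here (the flags member, flag_has, object_has_flag, MAX_FLAGS) is hypothetical:

#include <stdint.h>

#define MAX_FLAGS   160                       /* hypothetical total flag count */
#define FLAG_WORDS  ((MAX_FLAGS + 31) / 32)

typedef struct object_type {
    uint32_t flags[FLAG_WORDS];               /* replaces f1..f4 */
    /* ... other fields ... */
} object_type;

/* True if bit 'flag' is set in a flag array. */
#define flag_has(arr, flag) \
    (((arr)[(flag) / 32] & ((uint32_t)1 << ((flag) % 32))) != 0)

/* Object-level wrapper, so call sites never name a raw word. */
#define object_has_flag(o_ptr, flag)  flag_has((o_ptr)->flags, (flag))

With this, the test above works no matter which word F_CONF and F_CHAOS land in, which is exactly the boundary problem the raw (RCONF | RCHAOS) form cannot survive.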

zaimoni
December 22, 2007, 14:39
I don't understand what you want to do with this. It (or something vaguely like it) will work to store flag values, but it won't let you do ordinary bit-vector math.

I didn't say this was a good idea, merely that it was possible. The experiment of constructing the patch is worthwhile, even though I cannot do a great job because I'm prejudiced against it.

In particular, what happens if you want to match more than one flag variable at a time? You may want to do things logically like:

flags & (RCONF | RCHAOS) /* Example from NPP */

You can't do this if the high-order bits and low-order bits are dependent the way they would be in your example.

In C, you're stuck with a high-level interface. Thus, use macro-wrappers so you can do this.

If I were making the final decision for V (I'm not; I am merely willing to provide the patch as a code experiment), based on what I'm seeing in prototype I'd stop with:
* sanity-checking the flags for low-level dependencies with assert() [V probably needs this regardless]
* making a policy decision on whether converting to bitvector arrays is worth it for future-proofing, and on C_COPY/C_WIPE applicability.
* making a policy decision on whether the trivial syntax change for newbie bit test/set/update is worth supporting two macro sets. If it is, make a policy decision on whether to use enum or #define for the flag index version. C dictates going from flag index to bitmap. See angband-dev for what the #define changes will look like for object flags in the enum case.
* Design requirement: leave in the bitvector math capability on array elements, as it's moronic to ditch that.

In C++ with Boost libraries, "just" use a template function with two parameters and a conditional type definition to detect whether or not the two flags are in the same bucket. You lose all C++ compilers too archaic for Substitution Failure Is Not An Error (SFINAE). [While not relevant to this construction, a number of Boost libraries do imply not using MSVC++'s ISO-breaking Safe STL.]
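In plain C, the merged same-word test alluded to above can also be written by hand when the two flag indexes are known to share a word; a hedged sketch with invented names:

#include <stdint.h>

#define FLAG_WORD(flag)  ((flag) / 32)
#define FLAG_MASK(flag)  ((uint32_t)1 << ((flag) % 32))

/* General form: two independent bit tests, valid for any pair of flags. */
#define flags_have_any(arr, f1, f2) \
    (((arr)[FLAG_WORD(f1)] & FLAG_MASK(f1)) || ((arr)[FLAG_WORD(f2)] & FLAG_MASK(f2)))

/* Merged form: one AND against an OR of masks, valid only when the caller
 * knows FLAG_WORD(f1) == FLAG_WORD(f2) -- the condition a compile-time
 * check (assert or static assert) would have to verify. */
#define flags_have_any_same_word(arr, f1, f2) \
    (((arr)[FLAG_WORD(f1)] & (FLAG_MASK(f1) | FLAG_MASK(f2))) != 0)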

zaimoni
December 22, 2007, 14:54
This is one of the many bad things induced by bit arithmetic. Such coding is very error-prone, especially when you extend your variant. If the 'flags' bitfield is not the one variable (of a few) that holds the RCONF and RCHAOS bits, you have an error that people may not notice for years.

This is what automatic unit tests are for -- in any language, not just C. C just makes it easier by standardizing the NDEBUG flag for omitting them in release builds.

If the RCONF and RCHAOS bits land on opposite sides of a bitfield boundary you can get another error, and you cannot write the code in this way at all

Wrong. Pete's example works perfectly fine as long as the two flags really are in the same bitfield bucket. (EDIT: Assumed "another error" was accurate. As a bitfield boundary is well-defined only within a bitvector, this is the same error as your first example.)

The macros can catch inappropriate flags, though, to point out errors, or they can translate the flags to save space in the bit vector, when really needed.

Almost wrong. You need Boost Preprocessor to do this cleanly at all (the *one* Boost library that can live in C), and the standard release version wipes out at 256 (hard-limiting you to a 256-bit bitvector).