Sunday, December 29, 2013

Define TRUE and FALSE in C

What are TRUE and FALSE in C?

The C standard does not define TRUE and FALSE macros (C99's <stdbool.h> provides the lowercase true and false, but the uppercase names are a convention you must define yourself somewhere in your headers). What is the best practice for defining them?
It's not such a dumb question, because you may fall into this case:
  #define MASKBIT_0    0x01
  #define MASKBIT_1    0x02
  #define MASKBIT_2    0x04
  unsigned int bits;
  if (bits & MASKBIT_2) {
The example above works correctly, but if you are momentarily distracted and accidentally write:
  #define FALSE         0
  #define TRUE          1
  if ((bits & MASKBIT_2) == TRUE) {
The "if" test will fail even if MASKBIT_2 is set. This is because (bits & MASKBIT_2) evaluates to 4, which is not 1 (your defined TRUE).

To be honest, the real problem is that writing if ((bits & MASKBIT_2) == TRUE) is incorrect in the first place (you should just write if (bits & MASKBIT_2)), but its intent is obvious, and I'd prefer a way to define TRUE and FALSE such that this kind of test is supported.

Approach 1

  #define FALSE         0
  #define TRUE          1

As shown above, it's not such a good approach, although it works in most cases.

Approach 2

Some experienced programmers ("The Linux Programming Interface", not exactly a johnny-come-lately...) suggest the following:
  typedef enum { FALSE, TRUE } Boolean;

The reasons for this are good: the enum adds some static checking of assignments to the declared Boolean variables. Not that much checking, though: an enum is not a great type-checker in C. It can't check dynamically assigned variables, and enum variables are typically just ints.

Approach 3

Other experienced programmers (e.g. GNOME GLib) suggest this:
  #define FALSE         0
  #define TRUE          (!(FALSE))

This is another good approach: a FALSE condition in C/C++ is ALWAYS == 0; a TRUE condition, on the contrary, is not a single value but any "not 0" condition. On the other hand, the if ((bits & MASKBIT_2) == TRUE) test will still fail: (!(FALSE)) expands to a constant (1). And there is no static assignment checking at all.

Besides that, some suggest:
  #define TRUE         (1==1)
  #define FALSE        (!TRUE)

Rather than that, I prefer the reverse:
  #define FALSE        (1==0)
  #define TRUE         (!(FALSE))
In this way, FALSE automatically gets the right value even if (in some strange, rare case) your compiler did not use 0 as its FALSE value, and TRUE is defined as its negation. (In practice, a conforming compiler must use 0, and (1==0) is guaranteed to evaluate to 0.) This is "logically" pleasing, but it still does not solve the if ((bits & MASKBIT_2) == TRUE) problem.


So, what can I do to combine the best of these practices?
Here it is:
  typedef enum { FALSE = (1==0), TRUE = (1==1) } boolean;

In order to avoid the if ((bits & MASKBIT_2) == TRUE) problem, I suggest the following solutions:
1) Use the following function or a similar macro:
  boolean test (int val)
  {
      return (val != 0);
  }
usage: if (test(bits & MASKBIT_2))

2) Do not use bitmask #defines such as MASKBIT_2; instead, use the following function or a similar macro:
  boolean test (unsigned int bitpool, int bitnr)
  {
      return ((bitpool & (1u << bitnr)) != 0);
  }
usage: if (test(bits, 2))
I like solution (2) more, because by not defining any MASKBIT you are forced to use the test() function. You can still define verbose bit names such as:
  #define MYBIT   2

Using solution (2), you may also want the following small functions/macros to set and clear bits:
  unsigned int set (unsigned int bitpool, int bitnr)
  {
      return (bitpool | (1u << bitnr));
  }
  unsigned int clr (unsigned int bitpool, int bitnr)
  {
      return (bitpool & ~(1u << bitnr));
  }

(or you can pass a pointer to bitpool, if you prefer).

(!(FALSE)) or (1==1)?

Is it better to use this?:
  #define TRUE     (!(FALSE))
or this?:
  #define TRUE     (1==1)

They are effectively the same: the result is identical. I think #define TRUE (1==1) is slightly better, only because reading (!(FALSE)) may suggest that the result is "anything other than FALSE", which is not what it expands to.

Theoretically, you may wonder how the compiler expands (!(FALSE)). In fact, there is no mystery: the C standard guarantees that both the ! operator and the == operator yield exactly 0 or 1 (as int), so (!(FALSE)) and (1==1) are both guaranteed to expand to 1 on any conforming compiler; a result like -1 or 0xFFFFFFFF is not allowed.

Best Practices

I found the following "Best Practices" on a forum discussion:

Given the de facto rules that zero is interpreted as FALSE and any non-zero value is interpreted as TRUE, you should never compare boolean-looking expressions to TRUE or FALSE. Examples:
  if (thisValue == FALSE)  // Don't do this!
  if (thatValue == TRUE)   // Or this!
  if (otherValue != TRUE)  // Whatever you do, don't do this!
Why? Because many programmers use the shortcut of treating ints as bools. They aren't the same, but compilers generally allow it. So, for example, it's perfectly legal to write
  if (strcmp(yourString, myString) == TRUE)  // Wrong!!!
That looks legitimate, and the compiler will happily accept it, but it probably doesn't do what you'd want. That's because the return value of strcmp() is
   0 if yourString == myString
  <0 if yourString < myString
  >0 if yourString > myString
So the line above is true only when strcmp() happens to return exactly 1 (the value of TRUE), which may or may not occur when yourString > myString.
The right way to do this is either
  // Valid, but still treats int as bool.
  if (strcmp(yourString, myString))
  // Better: linguistically clear, compiler will optimize.
  if (strcmp(yourString, myString) != 0)
  if (someBoolValue == FALSE)     // Redundant.
  if (!someBoolValue)             // Better.
  return (x > 0) ? TRUE : FALSE;  // You're fired.
  return (x > 0);                 // Simpler, clearer, correct.
  if (ptr == NULL)                // Perfect: compares pointers.
  if (!ptr)                       // Sleazy, but short and valid.
  if (ptr == FALSE)               // Whatisthisidonteven.
You'll often find some of these "bad examples" in production code, and many experienced programmers swear by them: they work, some are shorter than their (pedantically?) correct alternatives, and the idioms are almost universally recognized. But consider: the "right" versions are no less efficient, they're guaranteed to be portable, they'll pass even the strictest linters, and even new programmers will understand them.
Isn't that worth it?


I'm not sure that my considerations are definitely right; I may be overlooking something important on this topic. Discussion is open.
