Recently, we wanted to make a test and see how we could find the maximum value between two variables using bitwise operations.

We ended up with the following peculiar way to get the biggest value between two variables using bitwise operations

r = a ^ ((a ^ b) & -(a < b));

The above formula has two modes:

- When
`a < b`

- When
`a >= b`

### When `a < b`

then the formula will change as follows:

r = a ^ ((a ^ b) & 0xFFFFFFFF);

As we all (should) know, when one of the operators on a bitwise `AND`

operation is composed only from `1`

s, then the result is whatever value the other operator was holding.

So, the formula then simplifies as follows:

r = a ^ (a ^ b);

which is equal to

r = b;

because we when we apply twice the same value using `XOR`

on another value, we revert back to the original value (so the second `^a`

nullifies the first `^a`

)

### When `a >= b`

then the formula will change as follows:

r = a ^ ((a ^ b) & 0x00000000);

When one of the operators on a bitwise `AND`

operation is composed only from `0`

s, then the result is always `0`

no matter what value the other operator was holding.

So, the formula then simplifies as follows:

r = a ^ (0x00000000);

which is equal to

r = a;

because when one of the operators in a `XOR`

operation is only composed from `0`

s then the result will be the value of the other operator, no matter what it was.

### Full example

Below you will find a full example that compares the execution speed of the two methods by executing each several thousands of time on the same random data.

[Download not found]

#include <stdio.h> #include <time.h> #include <stdlib.h> int main() { { const clock_t start = clock(); srand(10); unsigned long int i; unsigned int max = 0; for (i = 0; i < 1000000000; i++) { const int a = rand(); max = max < a ? a : max; } const clock_t end = clock(); const float seconds = (float) (end - start) / CLOCKS_PER_SEC; printf("Seconds elapsed %f\tIf statement. Overall max value = %u\n", seconds, max); } { const clock_t start = clock(); srand(10); unsigned long int i; unsigned int max = 0; for (i = 0; i < 1000000000; i++) { const int a = rand(); max = a ^ ((a ^ max) & -(a < max)); } const clock_t end = clock(); const float seconds = (float) (end - start) / CLOCKS_PER_SEC; printf("Seconds elapsed %f\tBitwise operation. Overall max value = %u\n", seconds, max); } return 0; }

### Results

Our results show that using the traditional if statement with assignment is faster than using our formula as expected.

Which makes sense as there is an if statement in the formula as well and then additional operations to get the result, instead of just the assignment.

Seconds elapsed 5.770000 If statement. Overall max value = 2147483647 Seconds elapsed 6.180000 Bitwise operation. Overall max value = 2147483647

10 times bigger input

Seconds elapsed 57.450001 If statement. Overall max value = 2147483647 Seconds elapsed 63.869999 Bitwise operation. Overall max value = 2147483647

This post is also available in: Greek

Did you try to get the mean time elapsed ? I think in theory, branching is a time consuming thing. Don’t know why did you get vice-versa results.

When branching can be predicted, the CPU (with other hardware close to the CPU) will assume the result of the branch.

Then it will continue to compute the code that is available on the predicted side before verifying the result of the branch.

If the prediction is wrong, the CPU will throw away the results and compute the correct side of the branch.

In this example, it appears that the CPU was able to assume which the value of the branch would be (in most cases), and that is why it was fast.

It would be interesting to see if the conclusion (“if” statement wins) still holds when you pre-populate 1000000000 random values in an array before starting the timer (and entering the loop where you evaluate the bitwise-op expression). The downside of the method you are currently using is that, 1) the timer is taking into account the time you spend generating the random number and 2) depending on the compiler’s configuration, the rand() function call can be inlined, and if that is the case, either the compiler can omit some non-dependent instructions or the CPU’s OoO (Out-of-Order execution) mechanism can “bypass” those instructions and find the max value. Basically, the current method you are using is coupling the rand() function with the real expression to be evaluated, and this could lead to inaccurate results.