## Testing Equality

One omission that’s particularly irritating in current versions is that you can’t test vectors for equality directly. This is a shame, because it’s surprisingly difficult to write this test by hand. With AltiVec, the comparison instructions set a condition register, which can be branched on directly, while SSE requires a bit more ingenuity.

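The original SSE instruction set doesn’t have instructions for testing equality on integers, only on floating-point values (128-bit integer comparisons arrived with SSE2). We can use the floating-point comparison anyway, because for most values floating-point equality just means that the bits are all the same, as it does for integers. The exceptions are bit patterns that decode as NaN, which never compare equal (even to themselves), and the patterns for `+0.0` and `-0.0`, which compare equal even though their sign bits differ.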
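A small scalar sketch makes it easy to check how a floating-point comparison treats integer bit patterns, including the awkward cases of NaN and signed zero. The helper names `bits_as_float` and `float_compare_equal` are hypothetical, and the code assumes a 32-bit `int` and IEEE 754 single-precision `float`:

```c
#include <limits.h>
#include <string.h>

/* Reinterpret the bits of an int as a float.
 * Assumes sizeof(int) == sizeof(float) == 4 and IEEE 754 floats. */
static float bits_as_float(int bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Compare two ints the way a floating-point equality test sees them. */
static int float_compare_equal(int a, int b)
{
    return bits_as_float(a) == bits_as_float(b);
}
```

Under those assumptions, `float_compare_equal(-1, -1)` is false, because the all-ones pattern is a NaN, and `float_compare_equal(INT_MIN, 0)` is true, because `0x80000000` is `-0.0`. Ordinary patterns such as `0x3F800000` compare as expected.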

When we perform a comparison with SSE, we get a vector of four `int`s, one per element. Each of these has either all bits set to `1` (the elements matched) or all bits set to `0`, so we can take the top bit from each one and pack the four bits into a scalar. The vectors are equal only if all four bits are set, that is, if the scalar is `0xF`; testing for any nonzero value would report a match as soon as a single element matched. With AltiVec, the effort is a little easier, because we can just use the `vec_all_eq` intrinsic to test whether all of the values are the same.

Finally, we have to implement a scalar version, in case someone tries to compile it on a machine without AltiVec or SSE. We end up with the following function:

```c
INLINE int equal(v4si v1, v4si v2)
{
#if defined(__ALTIVEC__)
    return vec_all_eq((vector int)v1, (vector int)v2);
#elif defined(__SSE__)
    /* Each lane of the result is all ones where the elements matched. */
    v4si compare = (v4si)__builtin_ia32_cmpeqps((v4sf)v1, (v4sf)v2);
    /* Gather the four top bits; every lane matched only if all are set. */
    return __builtin_ia32_movmskps((v4sf)compare) == 0xF;
#else
    int *s1 = (int*)&v1;
    int *s2 = (int*)&v2;
    return (s1[0] == s2[0] && s1[1] == s2[1]
         && s1[2] == s2[2] && s1[3] == s2[3]);
#endif
}
```
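Where SSE2 can be assumed, an integer comparison sidesteps the reinterpretation as floating point altogether. The following is only a sketch of that alternative, not the article's function: the name `equal_sse2` is hypothetical, the `v4si` typedef is assumed to be a 16-byte vector of four `int`s, and it uses the `<emmintrin.h>` intrinsics `_mm_cmpeq_epi32` and `_mm_movemask_epi8` instead of the `__builtin_ia32_*` builtins:

```c
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Assumed definition: a 16-byte vector of four ints. */
typedef int v4si __attribute__((vector_size(16)));

/* Returns 1 if all four elements match, 0 otherwise. */
static int equal_sse2(v4si v1, v4si v2)
{
#if defined(__SSE2__)
    /* Each 32-bit lane becomes all ones where the ints are equal. */
    __m128i cmp = _mm_cmpeq_epi32((__m128i)v1, (__m128i)v2);
    /* One bit per byte; all 16 bits are set only if every lane matched. */
    return _mm_movemask_epi8(cmp) == 0xFFFF;
#else
    /* Scalar fallback using vector subscripting (GCC 4.6+ and Clang). */
    for (int i = 0; i < 4; i++)
        if (v1[i] != v2[i])
            return 0;
    return 1;
#endif
}
```

Because the comparison is done on integers, a vector whose elements are all `-1` compares equal to itself, and a vector of `INT_MIN` values does not compare equal to a vector of zeros.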

We can use this function anywhere we would otherwise use `==`, if we
were dealing with scalar quantities. The `equal.c` example uses this
function; try changing the values of the two vectors to make sure it works:

```sh
$ gcc -std=c99 -msse equal.c
$ ./a.out
$ gcc -arch ppc -std=c99 -faltivec equal.c
$ ./a.out
$ gcc -std=c99 equal.c && ./a.out
```

In each instance, `foo` and `bar` had the same values.
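When experimenting with vectors that differ in only some elements, it can help to look at the raw mask rather than just the final answer. The sketch below is an illustration under a few assumptions: `lane_equal_mask` and `all_lanes_equal` are hypothetical names, `v4si` is assumed to be a 16-byte vector of four `int`s, and the SSE path uses the `<xmmintrin.h>` intrinsics `_mm_cmpeq_ps` and `_mm_movemask_ps` in place of the `__builtin_ia32_*` builtins:

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Assumed definition: a 16-byte vector of four ints. */
typedef int v4si __attribute__((vector_size(16)));

/* Returns a 4-bit mask; bit i is set when element i of a and b matched. */
static int lane_equal_mask(v4si a, v4si b)
{
#if defined(__SSE__)
    /* Reinterpret the ints as floats and compare lane by lane. */
    __m128 cmp = _mm_cmpeq_ps((__m128)a, (__m128)b);
    /* Collect the top bit of each lane into bits 0..3 of an int. */
    return _mm_movemask_ps(cmp);
#else
    /* Scalar fallback using vector subscripting (GCC 4.6+ and Clang). */
    int mask = 0;
    for (int i = 0; i < 4; i++)
        if (a[i] == b[i])
            mask |= 1 << i;
    return mask;
#endif
}

/* The vectors are equal only when every bit of the mask is set. */
static int all_lanes_equal(v4si a, v4si b)
{
    return lane_equal_mask(a, b) == 0xF;
}
```

For example, comparing `{1, 2, 3, 4}` with `{1, 0, 3, 0}` gives the mask `0x5`: elements 0 and 2 match, so the mask is nonzero even though the vectors are not equal.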

Note that the `-arch` switch is needed only when cross-compiling. In this example, I compiled the AltiVec/PowerPC code path on an Intel Mac, and it then ran in emulation under Rosetta. This approach is quite convenient for testing multiple code paths on the same machine, but emulation won’t give an accurate performance metric, so always measure performance on a real system.
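When switching between builds like this, it is easy to lose track of which path a given binary actually contains. A minimal sketch of a helper that reports it, using the same preprocessor tests as `equal` (the function name `vector_code_path` is hypothetical):

```c
#include <stdio.h>

/* Names the code path selected at compile time. */
static const char *vector_code_path(void)
{
#if defined(__ALTIVEC__)
    return "AltiVec";
#elif defined(__SSE__)
    return "SSE";
#else
    return "scalar";
#endif
}

/* Prints the selected path, e.g. at the start of main(). */
static void report_code_path(void)
{
    printf("code path: %s\n", vector_code_path());
}
```

Calling `report_code_path()` at the top of a test program confirms that the compiler flags selected the branch you meant to exercise.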