Keeping roughly the same code in C and OpenCL has lots of advantages when maximum optimisations and vectors are not needed. One thing I ran into myself was that rounding in C++ works differently, so I decided to implement the OpenCL functions for rounding in C.
The OpenCL-page for rounding describes many, many functions with this line:
destType convert_destType<_sat><_roundingMode>(sourceType)
So for each sourceType-destType combination there is a set of functions: 4 rounding modes, each with optional saturation. Defining all of these is easy in Ruby, but takes a lot more time in C.
The 4 rounding modes are:
Modifier | Rounding Mode Description |
---|---|
_rte | Round to nearest even |
_rtz | Round towards zero |
_rtp | Round toward positive infinity |
_rtn | Round toward negative infinity |
The pieces of code below should also make clear what the functions actually do.
Round to nearest even
This means that numbers get rounded to the closest integer. When a number lies exactly halfway, it goes to the even neighbour: 3.5 and 4.5 both round to 4. Thanks to Dithermaster for pointing out my wrong assumption and clarifying how it should work.
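inline int convert_int_rte (float number) {
    int truncated = (int)number;                   // the cast rounds towards zero
    int sign = (int)((number > 0) - (number < 0));
    int odd = (truncated % 2 != 0);                // odd -> 1, even -> 0
    float fraction = (number - truncated) * sign;  // distance from the truncated value, 0 <= fraction < 1
    // step away from zero when past the halfway point, or exactly on it with an odd integer part
    return truncated + sign * (fraction > 0.5f || (fraction == 0.5f && odd));
}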
I’m sure there is a more optimal implementation. You can fix that in Github (see below).
Round to zero
This means that positive numbers are rounded down and negative numbers are rounded up. 1.6 becomes 1, -1.6 becomes -1.
inline int convert_int_rtz (float number) { return ((int)(number)); }
Effectively, this just removes everything after the decimal point.
Round to positive infinity
1.4 becomes 2, -1.6 becomes -1.
inline int convert_int_rtp (float number) { return ((int)ceil(number)); }
Round to negative infinity
1.6 becomes 1, -1.4 becomes -2.
inline int convert_int_rtn (float number) { return ((int)floor(number)); }
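ceil() and floor() are the simplest route. Purely to illustrate what they do, a cast-only version has to correct for the fact that a C cast truncates towards zero. The names convert_int_rtp_cast and convert_int_rtn_cast are hypothetical, and the sketch assumes the input fits in an int.

```c
/* Hypothetical cast-only variants, valid only while number fits in an int. */

/* Round toward positive infinity without calling ceil(). */
static inline int convert_int_rtp_cast(float number) {
    int truncated = (int)number;                    /* the cast rounds toward zero */
    return truncated + ((float)truncated < number); /* bump up if truncation went down */
}

/* Round toward negative infinity without calling floor(). */
static inline int convert_int_rtn_cast(float number) {
    int truncated = (int)number;                    /* the cast rounds toward zero */
    return truncated - ((float)truncated > number); /* step down if truncation went up */
}
```

The comparison only fires on the side where truncation moved the value the wrong way, so whole numbers such as 6.0f pass through unchanged.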
Saturation
Saturation clamps the result to the range of the destination type. It makes sure that numbers stay between INT_MIN and INT_MAX, and that NaN returns 0. If it is not used, the outcome for such input can be anything (-2147483648 in case of convert_int_rtz(NAN) on my computer). Saturation is more expensive, so it’s optional.
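inline float saturate_int(float number) {
    if (isnan(number)) return 0.0f; // check if the number was already NaN
    // clamp to the int range (INT_MAX and INT_MIN come from limits.h)
    return (number > INT_MAX ? (float)INT_MAX : (number < INT_MIN ? (float)INT_MIN : number));
}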
With saturation, the other functions effectively become:
inline int convert_int_sat_rtz (float number) { return ((int)(saturate_int(number))); }
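One detail to watch when clamping in float: INT_MAX is not exactly representable as a float, so (float)INT_MAX typically rounds up to 2^31, which no longer fits in an int. A minimal sketch that sidesteps this by returning the int directly could look like the code below. It assumes a 32-bit int, and convert_int_sat_rtz_sketch is a hypothetical name, not the project's function.

```c
#include <limits.h>
#include <math.h>

/* Hypothetical saturating round-towards-zero conversion.
   Assumes a 32-bit int, so 2147483648.0f (2^31) is the first float above INT_MAX. */
static inline int convert_int_sat_rtz_sketch(float number) {
    if (isnan(number)) return 0;                   /* NaN saturates to 0 */
    if (number >= 2147483648.0f) return INT_MAX;   /* too large, also catches +infinity */
    if (number <= -2147483648.0f) return INT_MIN;  /* too small, also catches -infinity */
    return (int)number;                            /* in range: plain truncation */
}
```

Doing the range check before the cast means the cast itself only ever sees values that fit.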
Doubles, longs and getting started.
Yes, you need to make functions for all of these. But you could of course also check out the project on Github (BSD licence, rudimentary first implementation).
You’re free to make a double-version of it.
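As a rough idea of how a double version could sit next to the float one: C has no overloading, so each source type needs its own function name, but a small macro can stamp the variants out and C11's _Generic can pick one from the argument type. Everything named here (DEFINE_CONVERT_INT_RTZ, the _f and _d suffixes) is an assumption for the sake of the example, and the convert_int_rtz macro would take the place of the float-only function of the same name.

```c
/* Hypothetical generator: DEFINE_CONVERT_INT_RTZ(float, f) expands to
   a function convert_int_rtz_f(float) that truncates towards zero. */
#define DEFINE_CONVERT_INT_RTZ(src, suffix)                      \
    static inline int convert_int_rtz_##suffix(src number) {     \
        return (int)number;                                      \
    }

DEFINE_CONVERT_INT_RTZ(float, f)   /* defines convert_int_rtz_f */
DEFINE_CONVERT_INT_RTZ(double, d)  /* defines convert_int_rtz_d */

/* C11 _Generic selects the variant from the argument type, which mimics
   the way OpenCL overloads convert_int_rtz on sourceType. */
#define convert_int_rtz(number) _Generic((number),  \
        float:  convert_int_rtz_f,                  \
        double: convert_int_rtz_d)(number)
```

Calling convert_int_rtz(1.6f) then goes to the float variant and convert_int_rtz(-1.6) to the double one. The same pattern would extend to the other rounding modes and to long.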
Vincent, I’m afraid you got these wrong.
_rte, “round to nearest even”, should be thought of as “round to nearest integer, tie-breaking 0.5 to go towards the even number”. So (with your example numbers) 6.5 becomes 6, 7.5 becomes 8, but 7 stays 7 (does NOT become 8!).
_rtz is correct (could also be called truncate to zero). 6.3 and 6.9 become 6, -6.3 and -6.9 become -6.
_rtp always rounds up towards positive infinity. 6.3 and 6.9 become 7, -6.3 and -6.9 become -6. It’s the same as the C function ceil(). Done with a cast in C, it would need to check the sign, since a cast typically behaves like _rtz.
Likewise, _rtn (whose section header you have incorrectly labelled “Round to positive infinity” when it should be “Round to negative infinity”) rounds down towards negative infinity. 6.3 and 6.9 become 6, -6.3 and -6.9 become -7. It’s the same as the C function floor(). Done with a cast in C, it would need to check the sign.
Some reference: http://developer.amd.com/resources/documentation-articles/articles-whitepapers/new-round-to-even-technique-for-large-scale-data-and-its-application-in-integer-scaling/
So, only _rte is wrong, whereas _rtp and _rtn could be replaced by a function? I’m going to check _rte. I based that one on the definition and did not use it in my current code (only the other 3).
The code you wrote for _rtp and _rtn includes 0.5, which makes them incorrect. They would be more like “(number < 0) ? int(number-1) : int(number)” but even that is not correct for the integers. Just use ceil() and floor().
Now I get you! It simply worked in the cases I used it in my own code. Bad luck or good luck? I’m pretty sure I’m going to write out all functions in normal C, as I find the specs very unclear.
Thanks for taking the time to explain what I did wrong. I feel slightly ashamed, but it’s better to learn than to stay wrong.