Basic Concepts: OpenCL Convenience Methods for Vector Elements and Type Conversions

In the Basic Concepts series I try to give an alternative description to what is said everywhere else. This time my eye fell on the alternative convenience methods in two areas, which were introduced to be nice to developers with, for example, a C/C++ and/or graphics background. Too often I see these explained starting from the convenience functions, with the “preferred” functions presented as a sort of bonus for the cases the old functions cannot handle. Below I do it the other way around, and I hope that gives a better understanding. I assume you have already read another explanation, so you are not seeing this material for the first time but from a different angle.



Vector Elements

Vectors can be seen as structs on which a computation can be applied to all the elements at the same time. Each element can be accessed by .sX, with X being one of the hexadecimal digits 0 to 9 and A to F, depending on the number of elements in the vector: a 16-vector has all 16 (.s0 to .sF), an 8-vector only .s0 to .s7. The convenience methods are .x, .y, .z, .w, .hi, .lo, .even and .odd. Below are the methods as defined in the standard. The abbreviation N.D. stands for not defined. For 3-vectors some methods are not explicitly undefined, but it is vague how they should be implemented, so I marked those with “??”.

        vector2   vector3   vector4   vector8   vector16
.x      .s0       .s0       .s0       N.D.      N.D.
.y      .s1       .s1       .s1       N.D.      N.D.
.z      N.D.      .s2       .s2       N.D.      N.D.
.w      N.D.      N.D.      .s3       N.D.      N.D.
.hi     .s1       ??        .s23      .s4567    .s89ABCDEF
.lo     .s0       ??        .s01      .s0123    .s01234567
.even   .s0       ??        .s02      .s0246    .s02468ACE
.odd    .s1       ??        .s13      .s1357    .s13579BDF
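To make the vector4 column of the table concrete, here is a minimal sketch in plain host-side C (not OpenCL C) that emulates what .lo, .hi, .even and .odd select. The types vec2 and vec4 and the helper names are hypothetical stand-ins for float2 and float4; in a kernel you would simply write v.lo, v.hi and so on.

```c
#include <assert.h>

/* Hypothetical host-side stand-ins for OpenCL's float2 and float4. */
typedef struct { float s[2]; } vec2;
typedef struct { float s[4]; } vec4;

/* Sanity check: a vec4 is exactly four floats, without padding. */
static_assert(sizeof(vec4) == 4 * sizeof(float), "vec4 should have no padding");

/* .lo of a 4-vector is the lower half, .s01 */
static vec2 vec4_lo(vec4 v)   { vec2 r = {{ v.s[0], v.s[1] }}; return r; }

/* .hi of a 4-vector is the upper half, .s23 */
static vec2 vec4_hi(vec4 v)   { vec2 r = {{ v.s[2], v.s[3] }}; return r; }

/* .even of a 4-vector picks the even-numbered elements, .s02 */
static vec2 vec4_even(vec4 v) { vec2 r = {{ v.s[0], v.s[2] }}; return r; }

/* .odd of a 4-vector picks the odd-numbered elements, .s13 */
static vec2 vec4_odd(vec4 v)  { vec2 r = {{ v.s[1], v.s[3] }}; return r; }
```

The same pattern extends to 8- and 16-vectors: .lo and .hi split the vector in two halves, .even and .odd interleave.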

To get an idea of what a float4 is, here is an (incomplete) description:

struct float4 {
    float s0, s1, s2, s3;      /* the elements themselves */
    float x, y, z, w;          /* the same elements under other names */
    float2 hi, lo, odd, even;  /* halves of a float4 are float2 */
    float2 s01, s02, s03, s10, s12, s13, s20, s21, s23, s30, s31, s32;
    float2 xy, xz, xw, yx, yz, yw, zx, zy, zw, wx, wy, wz;
    float3 s012, s021, s023, s032, s031, s013, … /* etc */
    float3 xyz, xzy, xzw, … /* etc */
    float4 s0123, s0132, … /* etc */
    /* etc – see remark below */
};

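We are missing, e.g., members with repeated elements such as a float8 s10123322, but those are quite hard to enumerate in a struct. The specification allows repeated elements when a vector is read, but not when it is assigned to, so .s0011 and .xxyy should work on the right-hand side of an assignment. Just try whether they work with your drivers.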
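The idea that all these names refer to the same four floats can be sketched in plain host-side C (again not OpenCL C) with a union. The type float4_sketch and the helper swizzle4 are hypothetical names for illustration; swizzle4 emulates reading a swizzle such as v.s0011 or v.xxyy, where, as far as I know, the specification allows repeated elements on the reading side.

```c
#include <assert.h>

/* Hypothetical sketch of a float4: three views of the same four floats. */
typedef union {
    struct { float s0, s1, s2, s3; };  /* numbered names */
    struct { float x, y, z, w; };      /* convenience names */
    float s[4];                        /* indexable view used by swizzle4 */
} float4_sketch;

static_assert(sizeof(float4_sketch) == 4 * sizeof(float),
              "all views must overlap exactly");

/* Emulates reading a 4-element swizzle: each index (0..3) picks a source
   element, and the same index may be used more than once. */
static float4_sketch swizzle4(float4_sketch v, int i0, int i1, int i2, int i3)
{
    float4_sketch r;
    r.s[0] = v.s[i0];
    r.s[1] = v.s[i1];
    r.s[2] = v.s[i2];
    r.s[3] = v.s[i3];
    return r;
}
```

With this sketch, swizzle4(v, 0, 0, 1, 1) plays the role of v.s0011 (or v.xxyy) and swizzle4(v, 3, 2, 1, 0) that of v.wzyx.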


Type Conversions

Next are conversions between types. The fully specified function is convert_destType<_sat><_roundingMode>, where the saturation and rounding suffixes are optional. Most developers are familiar with explicit conversions like:

float a = 5.6f;
int b = (int) a; // = 5

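In OpenCL this cast is the convenience method: it only works with scalars, and only with one rounding mode and without saturation. For a float-to-integer conversion like the one above, the explicit conversion ‘(destType)’ can be described as ‘convert_destType_rtz’ (or simply ‘convert_destType’, since rounding toward zero is the default when converting to an integer type).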
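What saturation adds on top of a plain cast can be sketched in host-side C for the integer-to-uchar case. The two helper names below are hypothetical; the first emulates what convert_uchar_sat does with an int, the second the plain cast, which for integer-to-integer conversions follows the usual C wrap-around rule.

```c
#include <assert.h>
#include <limits.h>

/* OpenCL's uchar is 8 bits wide; make sure the host agrees. */
static_assert(UCHAR_MAX == 255, "sketch assumes an 8-bit unsigned char");

/* Emulates convert_uchar_sat(int): out-of-range values are clamped
   to the nearest representable value. */
static unsigned char sketch_convert_uchar_sat(int x)
{
    if (x < 0)         return 0;
    if (x > UCHAR_MAX) return UCHAR_MAX;
    return (unsigned char)x;
}

/* Emulates the plain cast (uchar)x: out-of-range values wrap around
   modulo 256. */
static unsigned char sketch_cast_uchar(int x)
{
    return (unsigned char)x;
}
```

For an input of 300 the saturated version gives 255, while the cast wraps around to 44.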

You do use (type) when you want to widen a scalar to a vector. For example:

float8 f = (float8) 1.0f;

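If you get used to convert_ then you don’t have to think about which method to use depending on whether it is a scalar or a vector, whether you need the default rounding or another rounding mode, and whether you need saturation or not. As a bonus, here are the rounding modes (rte: to nearest even, rtz: toward zero, rtp: toward positive infinity, rtn: toward negative infinity) with two example values, each with both signs.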

float    convert_int_rte   convert_int_rtz   convert_int_rtp   convert_int_rtn
+1.6f    2                 1                 2                 1
-1.6f    -2                -1                -1                -2
+1.4f    1                 1                 2                 1
-1.4f    -1                -1                -1                -2
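The four rounding modes in the table can be emulated in plain host-side C to check the values by hand. This is only a sketch: the function names are hypothetical, a 32-bit int is assumed, and inputs must lie within the int range (in OpenCL you would use the _sat variants to handle out-of-range values).

```c
#include <assert.h>

/* rtz: round toward zero, which is what a C cast does. */
static int sketch_convert_int_rtz(float x)
{
    /* Assumes a 32-bit int; the cast is undefined outside this range. */
    assert(x > -2147483648.0f && x < 2147483648.0f);
    return (int)x;
}

/* rtp: round toward positive infinity (ceiling). */
static int sketch_convert_int_rtp(float x)
{
    int t = sketch_convert_int_rtz(x);
    return (x > (float)t) ? t + 1 : t;
}

/* rtn: round toward negative infinity (floor). */
static int sketch_convert_int_rtn(float x)
{
    int t = sketch_convert_int_rtz(x);
    return (x < (float)t) ? t - 1 : t;
}

/* rte: round to nearest, ties go to the even neighbour. */
static int sketch_convert_int_rte(float x)
{
    int lo = sketch_convert_int_rtn(x);
    float frac = x - (float)lo;          /* in [0, 1) */
    if (frac > 0.5f) return lo + 1;
    if (frac < 0.5f) return lo;
    return (lo % 2 == 0) ? lo : lo + 1;  /* exact tie: pick the even one */
}
```

The difference between rte and the other modes only shows at exact ties: 2.5f rounds to 2 and 3.5f rounds to 4 under rte.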

Thank you

Thank you for your time; I hope you liked the alternative view. Check out the rest of the series while it is still small.
