Basic Concepts: OpenCL Convenience Methods for Vector Elements and Type Conversions

In the Basic Concepts series I try to give an alternative description to what is said everywhere else. This time my eye fell on two sets of convenience methods that were introduced to be nice to developers with, for example, C/C++ and/or graphics backgrounds. Too often I see the explanation start from the convenience functions, with the “preferred” functions presented as a sort of bonus for the cases the convenience functions cannot handle. Below I do it the other way around, and I hope that gives a better understanding. I assume you have already read another explanation, so this is a second view on the subject rather than a first introduction.

 

 

Vector Elements

Vectors can be seen as structs on which computations are applied to all elements at the same time. Each element can be accessed with .sX, where X is a hexadecimal digit 0 to 9 or A to F, depending on the number of elements in the vector: a 16-vector has all sixteen, an 8-vector only 0 to 7. The convenience methods are .x, .y, .z, .w, .hi, .lo, .even and .odd. The table below lists the methods defined in the standard. The abbreviation N.D. stands for not defined. For 3-vectors some methods are not explicitly undefined, but it is vague how they should be implemented, so I marked those with “??”.

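| Convenience alternative | vector2 | vector3 | vector4 | vector8 | vector16 |
|---|---|---|---|---|---|
| .x | .s0 | .s0 | .s0 | N.D. | N.D. |
| .y | .s1 | .s1 | .s1 | N.D. | N.D. |
| .z | N.D. | .s2 | .s2 | N.D. | N.D. |
| .w | N.D. | N.D. | .s3 | N.D. | N.D. |
| .hi | .s1 | ?? | .s23 | .s4567 | .s89ABCDEF |
| .lo | .s0 | ?? | .s01 | .s0123 | .s01234567 |
| .even | .s0 | ?? | .s02 | .s0246 | .s02468ACE |
| .odd | .s1 | ?? | .s13 | .s1357 | .s13579BDF |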
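To make the index patterns in the table concrete, here is a minimal sketch in plain C (not OpenCL C) that models a vector as an array of n floats. The helper names vec_lo, vec_hi, vec_even and vec_odd are made up for this illustration and assume an even n, so the vague 3-vector case is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C emulation, not OpenCL C: a "vector" is an array of n floats.
   Each helper writes n/2 elements to out. The names are hypothetical. */

/* .lo: the first half, e.g. .s0123 of an 8-vector */
static void vec_lo(const float *v, size_t n, float *out) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; ++i) out[i] = v[i];
}

/* .hi: the second half, e.g. .s4567 of an 8-vector */
static void vec_hi(const float *v, size_t n, float *out) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; ++i) out[i] = v[n / 2 + i];
}

/* .even: elements at even positions, e.g. .s0246 of an 8-vector */
static void vec_even(const float *v, size_t n, float *out) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; ++i) out[i] = v[2 * i];
}

/* .odd: elements at odd positions, e.g. .s1357 of an 8-vector */
static void vec_odd(const float *v, size_t n, float *out) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; ++i) out[i] = v[2 * i + 1];
}
```

Calling vec_hi on an 8-element array copies elements 4 to 7, which matches the .s4567 entry in the table.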

To get an idea of what a float4 is, here is an (incomplete) description:

/* Conceptual pseudo-code: every name refers to the same four floats */
struct float4 {
    float s0, s1, s2, s3;
    float x, y, z, w;              /* other names for s0, s1, s2, s3 */
    float2 hi, lo, odd, even;      /* half of a float4 is a float2 */
    float2 s01, s02, s03, s10, s12, s13, s20, s21, s23, s30, s31, s32;
    float2 xy, xz, xw, yx, yz, yw, zx, zy, zw, wx, wy, wz;
    float3 s012, s021, s023, s032, s031, s013, … /* etc */
    float3 xyz, xzy, xzw, … /* etc */
    float4 s0123, s0132, … /* etc */
    /* etc; see remark below */
};

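This description leaves out swizzles that repeat elements, for example a float8 built with .s10123422, because those are quite hard to define in a struct. As far as I can tell, the specification allows repeated elements when reading a value (its own examples include .xxyy) but not on the left-hand side of an assignment. Just try whether .s0011 and .xxyy work with your drivers.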
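The “many names for the same storage” idea can be sketched in plain C with a union. This is only an emulation: float4_emu and swizzle4 are names invented for this example, the anonymous struct needs a C11 compiler, and OpenCL C’s built-in float4 is not implemented this way.

```c
#include <assert.h>

/* Hypothetical emulation type; OpenCL C's float4 is built in. */
typedef union {
    float s[4];                      /* s[0]..s[3] play the role of .s0 .. .s3 */
    struct { float x, y, z, w; };    /* C11 anonymous struct sharing the same storage */
} float4_emu;

/* Emulates reading a 4-element swizzle such as v.s0011 or v.xxyy.
   Indices may repeat because the result is a new value, not an assignment target. */
static float4_emu swizzle4(float4_emu v, int a, int b, int c, int d) {
    assert(a >= 0 && a < 4 && b >= 0 && b < 4);
    assert(c >= 0 && c < 4 && d >= 0 && d < 4);
    float4_emu r = { { v.s[a], v.s[b], v.s[c], v.s[d] } };
    return r;
}
```

With v holding (1, 2, 3, 4), swizzle4(v, 0, 0, 1, 1) plays the role of v.s0011 (or v.xxyy) and yields (1, 1, 2, 2).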

Conversions

Next are conversions between types. The specified and complete function is using convert_destType<_sat><_roundingMethod>. Most developers are familiar with explicit conversions like:

float a = 5.6f;
int b = (int) a; // = 5

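In OpenCL this cast is the convenience function. It only works on scalars, with one rounding mode and without saturation; for a float-to-integer conversion, an explicit ‘(destType)’ cast behaves like ‘convert_destType_rtz’ (or plain ‘convert_destType’, which rounds toward zero by default when the destination is an integer type).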
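To show what the _sat suffix adds over a plain cast, here is a small plain-C stand-in for what I understand convert_uchar_sat to do on a float. The function name convert_uchar_sat_emu is made up. Rounding toward zero when no rounding suffix is given, and mapping NaN to 0, are my reading of the specification rather than something verified on a device.

```c
#include <limits.h>

/* Plain-C stand-in (hypothetical name) for a saturated float-to-uchar conversion.
   Out-of-range values clamp to the nearest representable value instead of
   giving an undefined result, as a bare cast would in C. */
static unsigned char convert_uchar_sat_emu(float x) {
    if (x != x) return 0;                          /* NaN: assumed to become 0 */
    if (x <= 0.0f) return 0;                       /* clamp at the lower bound */
    if (x >= (float)UCHAR_MAX) return UCHAR_MAX;   /* clamp at the upper bound */
    return (unsigned char)x;                       /* in range: the cast truncates toward zero */
}
```

So 300.7f ends up as 255 and -5.0f as 0, while 5.6f still becomes 5, just like the cast above.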

You do use (type) when you want to widen a scalar to a vector. For example:

float8 f = (float8) 1.0f;

Once you get used to convert_, you don’t have to think about which method to use depending on whether it’s a scalar or a vector, whether you need the default rounding or another rounding mode, and whether you need saturation or not. As a bonus, here are the rounding modes with a few examples.

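| float | convert_int_rte | convert_int_rtz | convert_int_rtp | convert_int_rtn |
|---|---|---|---|---|
| +1.6f | 2 | 1 | 2 | 1 |
| -1.6f | -2 | -1 | -1 | -2 |
| +1.4f | 1 | 1 | 2 | 1 |
| -1.4f | -1 | -1 | -1 | -2 |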
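The four rounding modes can be reproduced in plain C without an OpenCL device. The sketch below is an emulation with made-up _emu names, and it assumes the input fits comfortably in an int; it is not how a driver implements these conversions.

```c
/* Plain-C emulation (hypothetical names) of the four rounding modes.
   Assumes x is finite and well within the range of int. */

/* rtz: round toward zero, which is what a C cast does */
static int convert_int_rtz_emu(float x) {
    return (int)x;
}

/* rtn: round toward negative infinity (floor) */
static int convert_int_rtn_emu(float x) {
    int t = (int)x;
    return (x < (float)t) ? t - 1 : t;
}

/* rtp: round toward positive infinity (ceiling) */
static int convert_int_rtp_emu(float x) {
    int t = (int)x;
    return (x > (float)t) ? t + 1 : t;
}

/* rte: round to nearest, ties go to the even neighbour */
static int convert_int_rte_emu(float x) {
    int lo = convert_int_rtn_emu(x);
    float frac = x - (float)lo;          /* distance above the floor, in [0, 1) */
    if (frac > 0.5f) return lo + 1;
    if (frac < 0.5f) return lo;
    return (lo % 2 == 0) ? lo : lo + 1;  /* exact tie: pick the even value */
}
```

Feeding the four inputs from the table into these functions gives the same values as listed there. The tie rule only shows up for inputs such as 2.5f, which rounds to 2, and 3.5f, which rounds to 4.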

Thank you

Thank you for your time; I hope you liked the alternative view. Check out the rest of the series while it is still small.

 
