How to vectorize code in C++?
I would like to know how to vectorize code in C++, because the material I found on the internet only covers it superficially.
By vectorizing I mean not only using vectors, but performing a whole sequence of steps in a single step, that is, computing d = (c+e)/2; at once
instead of repeating that step for each position of the matrix: d[i][j] = (c[i][j]+e[i][j])/2;
For example, how can the following program be vectorized?
#include <iostream>
using namespace std;
int main(){
    int d[4][4],c[4][4],e[4][4];
    for(int i=0;i<4;i++){
        for(int j=0;j<4;j++){
            c[i][j] =i+j;
            e[i][j] = 4*i;
        }
    }
    for(int i=0;i<4;i++){
        for(int j=0;j<4;j++){
            d[i][j] = (c[i][j]+e[i][j])/2;
            if(d[i][j]<3){
                d[i][j]=3;
            }
        }
    }
    for(int i=0;i<4;i++){
        for(int j=0;j<4;j++){
            cout << d[i][j] << " ";
        }
        cout << endl;
    }
    return 0;
}
When I compile with -O2 -ftree-vectorize -fopt-info-vec-optimized to see how many loops are being vectorized,
the compiler reports "vectorized loop", i.e. only one loop has been vectorized, and if I use -all instead of -optimized
it reports that many parts of the program have not been vectorized.
1 answer
The problem is that the if statement inside the second loop prevents the compiler from vectorizing that loop:
for(int i=0;i<4;i++){
    for(int j=0;j<4;j++){
        d[i][j] = (c[i][j]+e[i][j])/2;
        if(d[i][j]<3){
            d[i][j]=3;
        }
    }
}
One solution is to replace the if statement with the conditional (ternary) operator, for example:
for(int i=0;i<4;i++){
    for(int j=0;j<4;j++){
        d[i][j] = (c[i][j]+e[i][j])/2;
        d[i][j] = ( d[i][j] < 3 ) ? 3 : d[i][j];
    }
}
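The same idea can also be written with std::max, which expresses the clamp as a value selection rather than a conditional store. The following is only a minimal sketch: the names N, Matrix, fill_inputs and average_clamped are made up for illustration and are not part of the original program, and whether a given compiler actually vectorizes the loop still depends on its version and flags, so it is worth checking with -fopt-info-vec-optimized.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical names, introduced only for this sketch.
constexpr std::size_t N = 4;
using Matrix = std::array<std::array<int, N>, N>;

// Fill c and e with the same values as the program in the question.
void fill_inputs(Matrix& c, Matrix& e) {
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            c[i][j] = static_cast<int>(i + j);
            e[i][j] = static_cast<int>(4 * i);
        }
    }
}

// Average c and e element by element and clamp each result to at least 3.
// The loop body contains no if statement: std::max selects between two values.
void average_clamped(const Matrix& c, const Matrix& e, Matrix& d) {
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            d[i][j] = std::max((c[i][j] + e[i][j]) / 2, 3);
        }
    }
}
```

Calling fill_inputs(c, e) followed by average_clamped(c, e, d) should produce the same matrix d as the loops in the question.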
Build test with GCC:
$ g++ -v -O2 -ftree-vectorize -fopt-info-vec-optimized vect.cpp -o vect
Output:
[...]
Analyzing loop at vect.cpp:21
Analyzing loop at vect.cpp:14
vect.cpp:14: note: vect_recog_divmod_pattern: detected:
vect.cpp:14: note: pattern recognized: patt_3 = patt_4 >> 1;
Analyzing loop at vect.cpp:15
vect.cpp:15: note: vect_recog_divmod_pattern: detected:
vect.cpp:15: note: pattern recognized: patt_77 = patt_1 >> 1;
Vectorizing loop at vect.cpp:15
vect.cpp:15: note: LOOP VECTORIZED.
Analyzing loop at vect.cpp:8
Analyzing loop at vect.cpp:9
Vectorizing loop at vect.cpp:9
vect.cpp:9: note: LOOP VECTORIZED.
vect.cpp:4: note: vectorized 2 loops in function.
[...]
References:
https://locklessinc.com/articles/vectorize/
https://gcc.gnu.org/projects/tree-ssa/vectorization.html
EDIT:
The answer above applies only to GCC 4.8. GCC 7.0 can already vectorize such loops without replacing the if statements with ternary operators, by means of the optimization option -fsplit-loops.
Reference: https://clearlinux.org/blogs/gcc-7-importance-cutting-edge-compiler
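As for the whole-array notation d = (c+e)/2 mentioned in the question, the standard library's std::valarray accepts that syntax directly. The sketch below assumes the 4x4 matrices are stored flattened in row-major order, and average_clamped_valarray is a hypothetical helper name. Note that std::valarray only provides the notation; whether the generated code uses SIMD instructions still depends on the compiler and the library implementation.

```cpp
#include <valarray>

// Hypothetical helper: element-wise average of c and e, with every element
// below `floor` raised to `floor`. Both inputs must have the same size.
std::valarray<int> average_clamped_valarray(const std::valarray<int>& c,
                                            const std::valarray<int>& e,
                                            int floor) {
    // Whole-array expression: no explicit loop over the elements.
    std::valarray<int> d((c + e) / 2);

    // Boolean mask of the elements that are below the floor value.
    const std::valarray<bool> below = d < floor;

    // Masked assignment: only the selected elements are overwritten.
    d[below] = floor;
    return d;
}
```

For the program in the question, c and e would each hold 16 elements, with element (i, j) stored at index i*4 + j, and the call would be average_clamped_valarray(c, e, 3).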