Data alignment inside a structure in Intel Fortran
I'm trying to align in memory the following type of data:
type foo
real, allocatable, dimension(:) :: bar1, bar2
!dir$ attributes align:64 :: bar1
!dir$ attributes align:64 :: bar2
end type foo
type(foo), allocatable, dimension(:) :: my_foo
allocate(my_foo(1))
allocate(my_foo(1)%bar1(100))
allocate(my_foo(1)%bar2(100))
! somewhere here I need to tell the compiler that data is aligned
! for a simple array with name `bar` I would just do:
!dir$ assume_aligned bar1: 64
!dir$ assume_aligned bar2: 64
! but what do I do for the data type I have, something like this?
!dir$ assume_aligned my_foo(1)%bar1: 64
!dir$ assume_aligned my_foo(1)%bar2: 64
do i = 1, 100
my_foo(1)%bar1(i) = 10.
my_foo(1)%bar2(i) = 10.
end do
As you can see, it's an array of foo type structures, each holding two large arrays, bar1 and bar2, that I need aligned on cache-line boundaries (64 bytes here) in memory.
I kind of know how to do that for simple arrays, but I have no idea how to do it for this sort of complex data structure. And what if my_foo wasn't of size 1 but of size, say, 100? Do I loop through them?
Solution 1:[1]
OK, case semi-closed. The solution turned out to be pretty straightforward: point pointers at the components and apply assume_aligned to the pointers. That should take care of it.
type foo
real, allocatable, dimension(:) :: bar1, bar2
!dir$ attributes align:64 :: bar1
!dir$ attributes align:64 :: bar2
end type foo
type(foo), target, allocatable, dimension(:) :: my_foo
real, pointer, contiguous :: pt_bar1(:)
real, pointer, contiguous :: pt_bar2(:)
allocate(my_foo(1))
allocate(my_foo(1)%bar1(100))
allocate(my_foo(1)%bar2(100))
! pointer association (=>), not value assignment (=)
pt_bar1 => my_foo(1)%bar1
pt_bar2 => my_foo(1)%bar2
!dir$ assume_aligned pt_bar1:64, pt_bar2:64
pt_bar1 = 10.
pt_bar2 = 10.
Explicit do loops are still not vectorized, though. For example, if I write the same thing like this
do i = 1, 100
pt_bar1(i) = 10.
pt_bar2(i) = 10.
end do
it won't be vectorized.
UPD.
OK, this does the job (you also need to pass the -qopenmp-simd flag to the compiler):
!$omp simd
!dir$ vector aligned
do i = 1, 100
pt_bar1(i) = 10.
pt_bar2(i) = 10.
end do
Also, if you're looping through my_foo(j)%..., make sure to disassociate the pointers after each iteration with pt_bar1 => null(), etc. (this only resets the pointers; it does not deallocate the arrays).
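For the case where my_foo has more than one element, the pieces above could be combined roughly as follows. This is only a sketch under assumptions: the program name align_loop and the size parameters n_foo and n_bar are made up for illustration, and the directive placement simply mirrors the snippets above. Whether the inner loop actually gets vectorized depends on the compiler version and flags, so check the optimization report rather than take it on faith.

```fortran
program align_loop
  implicit none

  type foo
    real, allocatable, dimension(:) :: bar1, bar2
    !dir$ attributes align:64 :: bar1
    !dir$ attributes align:64 :: bar2
  end type foo

  ! n_foo and n_bar are hypothetical sizes chosen for this sketch
  integer, parameter :: n_foo = 100
  integer, parameter :: n_bar = 100

  type(foo), target, allocatable, dimension(:) :: my_foo
  real, pointer, contiguous :: pt_bar1(:), pt_bar2(:)
  integer :: i, j

  allocate(my_foo(n_foo))
  do j = 1, n_foo
    allocate(my_foo(j)%bar1(n_bar))
    allocate(my_foo(j)%bar2(n_bar))
  end do

  do j = 1, n_foo
    ! associate the pointers with the components of the j-th element
    pt_bar1 => my_foo(j)%bar1
    pt_bar2 => my_foo(j)%bar2
    !dir$ assume_aligned pt_bar1:64, pt_bar2:64
    !$omp simd
    !dir$ vector aligned
    do i = 1, n_bar
      pt_bar1(i) = 10.
      pt_bar2(i) = 10.
    end do
    ! disassociate before moving on to the next element
    pt_bar1 => null()
    pt_bar2 => null()
  end do

  ! quick sanity check that the writes went through the pointers
  print *, my_foo(1)%bar1(1), my_foo(n_foo)%bar2(n_bar)
end program align_loop
```

Compilers that don't recognize the !dir$ lines treat them as plain comments, so the sketch should still build and run there, just without the alignment hints.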
PS. Thanks to BW from our department for this help. :) Sometimes personal communication > stackoverflow (not always, only sometimes).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow