AMDGPU has provided a fixed-frequency clock for several generations.
However, the frequency varies by card and must be looked up at
runtime. This patch adds a new device environment entry for the clock
frequency so that we can use it in the same way as NVPTX. This is the
correct implementation and the version in ASO should be replaced.
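
As a rough illustration of why the frequency has to be available at runtime, the sketch below converts a tick delta from a fixed-frequency counter into seconds. The function name and parameters are hypothetical and are not taken from the patch; the frequency is assumed to have been queried from the device beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: convert the difference between
 * two readings of a fixed-frequency counter into seconds. freq_hz is assumed
 * to be the per-card clock frequency looked up at runtime. */
double ticks_to_seconds(uint64_t start, uint64_t end, uint64_t freq_hz) {
  assert(freq_hz != 0 && "clock frequency must be known before converting");
  return (double)(end - start) / (double)freq_hz;
}
```

With a single frequency value exposed per device, the same conversion can serve any target whose counter runs at a fixed rate.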
Repository: rG LLVM Github Monorepo
Event Timeline
openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp | ||
---|---|---|
833 | This doesn't need to be 1000000000UL? |
openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp | ||
---|---|---|
833 | A signed 32-bit integer holds values up to about 2.1 billion, so one billion fits and the literal does not need a suffix. |
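
The point about the suffix can be checked at compile time. This is a minimal sketch, assuming the value under discussion is a plain literal of one billion stored in a wider unsigned field; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One billion is below INT32_MAX (2147483647), so the unsuffixed literal is
 * representable as a signed 32-bit integer. */
static_assert(1000000000 <= INT32_MAX,
              "one billion fits in a signed 32-bit integer");

/* Hypothetical accessor: the literal converts to the wider unsigned return
 * type without loss, so no UL suffix is required. */
uint64_t one_billion(void) { return 1000000000; }
```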
openmp/libomptarget/test/offloading/wtime.c | ||
---|---|---|
31 | A more reliable way is probably to check the difference between the duration and 0.0. As long as it is greater than epsilon, we are good. |
openmp/libomptarget/test/offloading/wtime.c | ||
---|---|---|
31 | The check will fail if the time is 0.01, right? You could just compile and run, since you put an assertion there. |
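
The epsilon suggestion above could look roughly like the following. This is a sketch only: the actual contents of wtime.c are not shown here, the helper names are hypothetical, and the tolerance of 1e-9 is an arbitrary assumption. The timestamps are assumed to be seconds as a double, such as values returned by `omp_get_wtime`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: the elapsed time counts as valid when it exceeds a
 * small tolerance, rather than being compared against a printed value. */
bool elapsed_is_positive(double start, double end, double epsilon) {
  return (end - start) > epsilon;
}

/* Hypothetical wrapper: abort the test if the measured duration is not
 * positive. The tolerance is an assumed value, not one from the patch. */
void require_elapsed(double start, double end) {
  assert(elapsed_is_positive(start, end, 1e-9));
}
```

Checking only that the duration is positive avoids depending on the exact value printed by the host or the device.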
Updated the test. I think the host and device might use different deltas, so it failed sometimes.
It's taken from the device libs. When I tested it myself, I got a number that was in line with what I got on the host, and I tested it on two cards with different frequencies.