- Expand GK_*s (i.e. GFX6 -> GFX600, GFX601, etc.)
- This allows us to choose features correctly in some cases (for example, fast fmaf is available on gfx600, but not gfx601)
- Move HasFMAF, HasFP64, HasLDEXPF to GPUInfo tables
- Add HasFastFMA, HasFastFMAF to GPUInfo tables
- Add missing tests
Details
Diff Detail
Event Timeline
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
357–360 | No. All targets that support double precision should report FAST_FMA. Only targets with full rate v_fma_f32 should report FAST_FMAF |
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
357–360 | It is unfortunate that clang does not have access to the processor features defined in the td files which gives the settings for each target. |
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
357–360 | Now that the compiler knows the target it seems the clang options that specify fast_fma et al should be removed and the runtimes changed to not set them. The implementation of when fast fma[f] is present should match the amdgcn td files which have all gfx9 and some pre-gfx9 targets supporting fast fmaf (the ones that have full rate double precision). |
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
159 | What does this mean? Has it now been addressed by this patch? | |
275–276 | To be consistent should this be: if (isAMDGCN(getTriple())) Similar comment elsewhere. | |
282 | This was incorrect in the old code. Only full rate FP64 gcn targets have fast FMAF. | |
lib/Basic/Targets/AMDGPU.h | ||
88–91 | Would it be better to position these at the beginning/end of the respective enumerations so it is more obvious that they must be updated when adding a new target? | |
99–103 | Suggest reordering to be in a logical groups of FP32 and FP64: bool HasFMAF; bool HasFastFMAF; bool HasLDEXPF; bool HasFP64; bool HasFastFMA; | |
109 | Suggest adding a comment here which is a header for the columns to make it easier to check if the settings are right. For example: // Name Canonical Kind HasFMAF HasFP64 HasLDEXPF HasFastFMA HasFastFMAF | |
140 | Same comment as above. Also, I think the fast_fma and fast_fmaf columns are reversed. All gcn has fast_fma, it is fast_fmaf that varies. | |
151 | gfx702 should have true for fast fmaf. Does the TD file need correcting too? | |
332 | I found the original operand order easier to read:-) |
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
349–353 | I'm not sure why this is here. No languages we support have this AFAIK. We should probably add a comment that this is deprecated and remove it in a year or so. |
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
159 | This class has a member called *GPU*. We are using the *CPU* that is passed as an argument.
No. | |
lib/Basic/Targets/AMDGPU.h | ||
140 | fast_fmaf is the last column. I think those are in the correct order. | |
151 | Not according to the TD files in our BE: https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/AMDGPU.td#L545 | |
332 | I was trying to match the style above. But I can change it back. |
lib/Basic/Targets/AMDGPU.h | ||
---|---|---|
103 | I guess the HasFastFMA is for simplicity? It is a synonym for HasFP64. |
lib/Basic/Targets/AMDGPU.cpp | ||
---|---|---|
346–347 | Sorry, I was referring to all of the __HAS_* macros. |
Would it be better to position these at the beginning/end of the respective enumerations so it is more obvious that they must be updated when adding a new target?