Leonc on Apr 13 2022, 10:39 AM.Authored by
There are a very large number of changes, so older changes are hidden. Show Older Changes
Right, but why can't global_atomic_fadd and generic_atomic_fadd be functions that just contain a call to the corresponding builtin and have an appropriate "target-cpu"= attribute? I guess you would need to do inlining in the backend once you know what subtarget you are compiling for. Is that viable?
The problem is in the called function with the appropriate attributes. The function isn't deleted and we still need to produce something. The current situation is it works in cases where the wrong subtarget also happens to have the same encoding, but breaks on targets that never had an equivalent instruction encoded
For the relevant cases in the library, we key off the subtarget feature in target-features. The problem is in the final codegen we've propagated the real cpu to the leaf function, which then artificially has an incompatible target-features. We would have to have some code fixup the target-cpu to some random other device with compatible features
Is there a generic way to do that without adding maintenance overhead, or is it better that we don't implicitly support new subtargets/generations in case they have different encodings of v_illegal?