AArch64 requires CI to be aligned to 8 bytes due to access instructions
restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM but I'm not an AArch64 expert. Rafael is on a vacation, so please tag someone knowledgeable from LLVM AArch64 community (e.g. git blame llvm/unittests/Support/Host.cpp in TEST(getLinuxHostCPUName, AArch64)) to double-check if it's a correct and the only constant island access restriction that needs to be enforced.
bolt/test/runtime/AArch64/constant-island-alignment.s | ||
---|---|---|
1 | nit: bolt tool -> BOLT |
Please add a PR fixing bzip2 AArch64 test: https://github.com/rafaelauler/bolt-tests/blob/main/test/AArch64/bzip2.test
Hello @Amir ! I've prepared the fix https://github.com/rafaelauler/bolt-tests/pull/15
Hello @t.p.northover ! I would we thankful I you will review the patch while the @rafaelauler is on vacation. The very main part is located in BinaryEmitter.cpp, that is we will always emit constant island on 8 bytes alignment since currently we don't track here in BOLT is the instruction that will load data from constant island may support unaligned access or not - we will just always aligne it on 8 bytes boundaries. Thank you for your time in advance!
Accepting to unblock dependent diffs.
We'd still appreciate @t.p.northover or someone from ARM community to have a glance.
@yota9, similar to my concerns on other diffs, I think it's better for regression tests to check for a specific condition and not to rely on the correct execution of the test binary. This way you will be able to run tests on other architectures.
Regarding the change itself, have you considered aligning constant islands so that they straddle a minimum number of cache lines? The trivial way to achieve this is to align them at cache line boundary, but it's not always the optimal way.
Hello @maksfb !
This way you will be able to run tests on other architectures.
Sure, but the testing is currently does not run aarch64 tests at all.
I don't really mind, but sometimes it is just inconvenient to make such tests. And I still would like some tests to be runnable.. I think we need to discuss the testing policy.
Regarding the change itself, have you considered aligning constant islands so that they straddle a minimum number of cache lines?
This could be discussed, but we need to check if there will be any winning in this, I don't think this is a good way just to align to cache line size, it should be more clever system, but this is beyond the goals of this review. But thank you for you suggestion anyway :)
It's fair to say that sometimes creating non-runnable test is way easier. However, in this case, will it be sufficient to locate the island in the output binary (e.g. with llvm-objdump) and make sure the address is aligned ({{0|8}})?
Regarding the change itself, have you considered aligning constant islands so that they straddle a minimum number of cache lines?
This could be discussed, but we need to check if there will be any winning in this, I don't think this is a good way just to align to cache line size, it should be more clever system, but this is beyond the goals of this review. But thank you for you suggestion anyway :)
Totally, I don't mean it for this diff. However, could you create a constant for the alignment (instead of sizeof(uint64_t)) that could be easily changed for running experiments?
Overall LGTM.
I have a general concern that the estimation of the size for islands and the actual emission are decoupled and could get out of sync. At the same time, I don't have a good suggestion on how to fix it. For functions, to calculate the size we create a temporary emitter and run the same code used for real emission, get the size of the emitted code and then discard the code.
bolt/test/AArch64/constant-island-alignment.s | ||
---|---|---|
12 ↗ | (On Diff #418228) | nit: place the CHECK next to the asm directive that corresponds to the island being checked. I don't know how the names are assigned for islands and if it's possible to give it a distinctive name. |
nit: bolt tool -> BOLT