Somehow UBSan would only report the unaligned load in TestLinuxCore.py when running the tests with reproducers. This patch fixes the issue by using a memcpy in GetDouble and GetFloat method.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
lldb/include/lldb/Utility/DataExtractor.h | ||
---|---|---|
25 | What platforms are we not expecting to have __builtin_is_aligned on? |
lldb/include/lldb/Utility/DataExtractor.h | ||
---|---|---|
25 | nvm, I just realized this is coming from clang not llvm. |
There is no requirement that data be aligned when stored in a buffer that DataExtractor points to, so this patch doesn't seem correct. Since GetData() calls can return unaligned pointers, the usual way to make this work is to to make a local variable and do a memcpy into that local variable. See the ReadSwapInt32() function to see how it is done for integers. Then we need to swap the bytes depending on the byte size. So maybe a function that can do byte swapping for any sized data can be made. Then we can call it from the Get<T> function. More comments inline.
lldb/include/lldb/Utility/DataExtractor.h | ||
---|---|---|
25–29 | Remove this because we don't require alignment for anything in a DataExtractor. | |
984 | We should probably pass in the fail value if we want this to work for more than floats? template <typename T> T Get(lldb::offset_t *offset_ptr, const T fail_value) const { | |
985 | Make a local "T" here: T val = fail_value; constexpr size_t src_size = sizeof(T); | |
992 | remove this, no alignment is required. | |
993 | Add a memcpy call: memcpy(&val, src, src_size); | |
995 | remove since "val" was created above | |
996–1000 | Might be nice to place this into a method function that swaps data of any size. Then other functions can re-use it if needed. | |
1000 | return "return val;" here | |
1003 | We can't rely on alignment, so return our local T value: return val; |
lldb/include/lldb/Utility/DataExtractor.h | ||
---|---|---|
996–1000 | I left it inline here so we can copy the data in place instead copy & swapping. |
Just need some DataExtractorTest.cpp additions to test the unaligned loads for float and doubles?
lldb/source/Utility/DataExtractor.cpp | ||
---|---|---|
626 | nit: "0.0f" instead of "0.0" |
FYI: values like 4.0 or 2.25 work well for float values in tests as they encode perfectly on all systems.
Here are the next values for 32 bit and 64 bit floats:
float a = 2.25; 0x40100000
double b = 2.25; 0x4002000000000000
You might disable this on windows as I believe they have 10 byte floats for doubles?
Just need some sanity checks to make sure the sizeof(float) == 4 and sizeof(double) == 8 for the tests to make sure this doesn't fail on systems where that isn't true.
lldb/unittests/Utility/DataExtractorTest.cpp | ||
---|---|---|
308 | You might want to make check if the size of a float is 4 bytes here and return without testing anything if it isn't or this will fail. | |
332 | You might want to make check if the size of a float is 4 bytes here and return without testing anything if it isn't or this will fail. | |
356 | You might want to make check if the size of a double is 8 bytes here and return without testing anything if it isn't or this will fail. Windows used 10 byte floats for doubles I believe? | |
380 | You might want to make check if the size of a double is 8 bytes here and return without testing anything if it isn't or this will fail. Windows used 10 byte floats for doubles I believe? |
I was stil checking the Windows float thing but skipping them based on the sizeof() seems more robust.
The approach is right, but can be simplified.
I'd also remove the sizeof checks in the tests. We have a lot of code depending on the fact that floats and doubles are consistent across targets (e.g. the swapByteOrder function I mention). If that ever becomes untrue it would be probably nice to get an early warning about it.
long doubles are a different story, as they behave differently on pretty much every platform. That's why GetLongDouble is full of ifdefs, and is still wrong, because normally we want to read the targets notion of a "long double" not that of a host. That's why I would ideally want to replace these float accessors with a single unified accessor like `llvm::APFloat GetFloat(uint64_t *, llvm::fltSemantics), but that's a story for another day.
lldb/include/lldb/Utility/DataExtractor.h | ||
---|---|---|
992–1002 | memcpy(&val, src, src_size); if (m_byte_order != endian::InlHostByteOrder()) llvm::sys::swapByteOrder(val); return val; will only work float and double (and all integers..) but it doesn't like you need anything else anyway. Long double is broken enough to not care about it. | |
lldb/source/Utility/DataExtractor.cpp | ||
630 | In this case the f suffix is actually not appropriate. | |
lldb/unittests/Utility/DataExtractorTest.cpp | ||
380 | Windows doesn't use 10 bytes for doubles (definitely not on x86, but I would be very surprised if it was true elsewhere too). You're probably confusing this with x86 "extended" floats, which some ABIs map to "long double". These contain 10 bytes of useful data, but due to alignment considerations their sizeof is usually 12 or 16. |
What platforms are we not expecting to have __builtin_is_aligned on?