This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Use SmallVector for handling register data
ClosedPublic

Authored by DavidSpickett on Jun 23 2023, 5:56 AM.

Details

Summary

Previously lldb was using arrays of size kMaxRegisterByteSize to handle
registers. This was set to 256 because the largest possible register
we support is Arm's scalable vectors (SVE) which can be up to 256 bytes long.

This means for most operations aside from SVE, we're wasting 192 bytes
of it. Which is ok given that we don't have to pay the cost of a heap
alocation and 256 bytes isn't all that much overall.

With the introduction of the Arm Scalable Matrix extension there is a new
array storage register, ZA. This register is essentially a square made up of
SVE vectors. Therefore ZA could be up to 64kb in size.

https://developer.arm.com/documentation/ddi0616/latest/

"The Effective Streaming SVE vector length, SVL, is a power of two in the range 128 to 2048 bits inclusive."

"The ZA storage is architectural register state consisting of a two-dimensional ZA array of [SVLB × SVLB] bytes."

99% of operations will never touch ZA and making every stack frame 64kb+ just
for that slim chance is a bad idea.

Instead I'm switching register handling to use SmallVector with a stack allocation
size of kTypicalRegisterByteSize. kMaxRegisterByteSize will be used in places
where we can't predict the size of register we're reading (in the GDB remote client).

The result is that the 99% of small register operations can use the stack
as before and the actual ZA operations will move to the heap as needed.

I tested this by first working out -wframe-larger-than values for all the
libraries using the arrays previously. With this change I was able to increase
kMaxRegisterByteSize to 256*256 without hitting those limits. With the
exception of the GDB server which needs to use a max size buffer.

Diff Detail

Event Timeline

DavidSpickett created this revision.Jun 23 2023, 5:56 AM
DavidSpickett requested review of this revision.Jun 23 2023, 5:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2023, 5:56 AM

This assumes that the usual use case is:

  • Make a small vector.
  • Resize it to what you need.
  • Use the content like an array.

Every case I found matched that but still, it's not the safest API ever. So I've tested this on arm64 of course, and did try to run the test suite with asan but issues with leaks in Python prevented that. No amount of filtering seemed to help.

I see Green Dragon runs an lldb sanitizers bot, so my plan would be to rely on that to catch any tricky issues prior to any SME changes going in.

lldb/include/lldb/Utility/RegisterValue.h
36

When the ZA register is added, this will change to 256*256. For now nothing should change apart from the slight overhead of SmallVector.

JDevlieghere accepted this revision.Jun 23 2023, 10:25 AM

This assumes that the usual use case is:

  • Make a small vector.
  • Resize it to what you need.
  • Use the content like an array.

Every case I found matched that but still, it's not the safest API ever.

Yeah, I don't see how we can make this any safer without introducing something like the DataExtractor but for writing rather than reading, which would be overkill, and if we care about the cost of heap allocations, probably too slow as well. The current approach doesn't seem any less safe than the buffer we had before so I think it's fine.

I see Green Dragon runs an lldb sanitizers bot, so my plan would be to rely on that to catch any tricky issues prior to any SME changes going in.

The sanitized bot on GreenDragon runs on Intel and I assume the "risky' changes only apply to arm64 as that's the only architecture that needs to scale beyond the default 256? Anyway I haven't seen the leaks issue you've mentioned locally so I'm happy to run a sanitized build on arm64.

This revision is now accepted and ready to land.Jun 23 2023, 10:25 AM
Matt added a subscriber: Matt.Jun 23 2023, 9:17 PM

The sanitized bot on GreenDragon runs on Intel and I assume the "risky' changes only apply to arm64 as that's the only architecture that needs to scale beyond the default 256? Anyway I haven't seen the leaks issue you've mentioned locally so I'm happy to run a sanitized build on arm64.

The risk for existing targets is that .size() won't reflect the size we're about to write to it. With it defaulting to 256 bytes of stack space and us using it like an array, we'd probably skip a lot of the SmallVector asserts around that.

Potentially I could "proxy" .resize and assert that it had been called at some point but it's still possible to forget to resize it again if it was needed. It would make more sense to look at the methods we pass the SmallVector's storage to instead. If they could take the container directly, then they could do some checks.

I will look into that, and I will also try the sanitized build again because now I wonder if lit's env cleaning was removing the options I needed.

Make a resize in EmulateInstructionARM64::EmulateLDPSTP uncondintional.

Though in this case the register for an ldp/stp could only be an x register
which is 8 bytes, so there would have been enough allocated already.

Drop {} in one place now that the resize is not conditional. The logic
there also seems faulty, but in other ways I will stay out of for now.

I did attempt another sanitizers build and realised I should have been
setting leak checker options. That got me to missing symbols, and
no amount of preloading seemed to help there. Not sure what went wrong.

I looked at asserting or adding some checker methods. Given that you're
going to resize to X then pass X as the size to the subsequent call,
there's not much value in it. It would prevent some typos maybe.

On the same theme I looked at adding overloads that took smallvector
directly, but with only 3/4 calls to each one it just didn't seem worth it.

The existing uses are compact enough to check manually and/or in a
sanitized build.

This revision was landed with ongoing or failed builds.Jun 27 2023, 2:15 AM
This revision was automatically updated to reflect the committed changes.