This is an archive of the discontinued LLVM Phabricator instance.

scudo: Use DC GZVA instruction in storeTags().
Closed, Public

Authored by pcc on Apr 20 2021, 4:26 PM.

Details

Summary

DC GZVA can operate on multiple granules at a time (corresponding to
the CPU's cache line size) so we can generally expect it to be faster
than STZG in a loop.
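
The saving can be illustrated with a portable sketch that only counts operations instead of issuing the real instructions. Everything here is an assumption made for illustration: the names TagOpCounts, storeTagsPerGranule and storeTagsBlocked are hypothetical, the block size is passed in as a parameter rather than read from the CPU, and the "at least two blocks" threshold is just one way to guarantee that a full aligned block exists. It is not the code from this revision.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counters standing in for the real instructions.
struct TagOpCounts {
  uint64_t GranuleStores = 0; // stands in for one STZG
  uint64_t BlockStores = 0;   // stands in for one DC GZVA
};

constexpr uint64_t kGranule = 16; // MTE tag granule size in bytes

// Baseline: one store per 16-byte granule.
TagOpCounts storeTagsPerGranule(uint64_t Begin, uint64_t End) {
  assert(Begin % kGranule == 0 && End % kGranule == 0 && Begin <= End);
  TagOpCounts C;
  for (uint64_t Cur = Begin; Cur < End; Cur += kGranule)
    ++C.GranuleStores;
  return C;
}

// Blocked variant: granule stores up to the first block boundary, block
// stores over the aligned middle, granule stores for the tail.
// BlockSize is an assumed parameter: a power of two, at least kGranule.
TagOpCounts storeTagsBlocked(uint64_t Begin, uint64_t End,
                             uint64_t BlockSize) {
  assert(Begin % kGranule == 0 && End % kGranule == 0 && Begin <= End);
  assert(BlockSize >= kGranule && (BlockSize & (BlockSize - 1)) == 0);
  TagOpCounts C;
  uint64_t Cur = Begin;
  // Assumed threshold: with at least two blocks of data, an aligned full
  // block is guaranteed to lie inside [Begin, End).
  if (End - Begin >= 2 * BlockSize) {
    uint64_t FirstBlock = (Cur + BlockSize - 1) & ~(BlockSize - 1);
    for (; Cur < FirstBlock; Cur += kGranule)
      ++C.GranuleStores; // unaligned head
    uint64_t LastBlock = End & ~(BlockSize - 1);
    for (; Cur < LastBlock; Cur += BlockSize)
      ++C.BlockStores; // aligned middle, one operation per block
  }
  for (; Cur < End; Cur += kGranule)
    ++C.GranuleStores; // tail, or the whole range on the fallback path
  return C;
}
```

Under these assumptions, tagging the range [16, 1040) with a 64-byte block takes 4 granule stores plus 15 block stores, against 64 granule stores for the plain loop. The sketch counts operations only; it says nothing about how long each operation takes on real hardware.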

Event Timeline

pcc requested review of this revision. Apr 20 2021, 4:26 PM
pcc created this revision.
eugenis accepted this revision. Apr 21 2021, 10:42 AM

LGTM
I wonder if doing the size check before the DCZID check could speed up small allocations, and maybe raising the threshold value could help.
But we can worry about that later.
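
As a rough sketch of the reordering suggested here (not code from this revision): a cheap comparison against a fixed minimum size runs first, so small allocations never pay for the DCZID query. FakeCpu, queryBlockOps, choosePathSizeFirst and the MinSize parameter are all hypothetical names introduced for illustration; the query is a stand-in that just counts how often it is called.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the DCZID query; counts how often it is read.
struct FakeCpu {
  bool BlockOpsProhibited;
  uint64_t BlockSize;
  uint64_t Reads;
  FakeCpu(bool Prohibited, uint64_t Size)
      : BlockOpsProhibited(Prohibited), BlockSize(Size), Reads(0) {}
  // Returns true if block operations are usable and reports the block size.
  bool queryBlockOps(uint64_t &SizeOut) {
    ++Reads;
    SizeOut = BlockSize;
    return !BlockOpsProhibited;
  }
};

enum class Path { PerGranule, Blocked };

// Size check first: small requests return before the query is made.
// MinSize is an assumed, tunable threshold (raising it is the second idea
// in the comment above).
Path choosePathSizeFirst(FakeCpu &Cpu, uint64_t Size, uint64_t MinSize) {
  if (Size < MinSize)
    return Path::PerGranule; // no query needed
  uint64_t BlockSize;
  if (!Cpu.queryBlockOps(BlockSize))
    return Path::PerGranule; // block operations unavailable
  // Still require two blocks so that a full aligned block must exist.
  return Size >= 2 * BlockSize ? Path::Blocked : Path::PerGranule;
}
```

With a 128-byte MinSize, a 32-byte request takes the per-granule path with zero queries, while a 1024-byte request makes one query and takes the blocked path. Whether this is a net win on real hardware would need measurement.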

This revision is now accepted and ready to land. Apr 21 2021, 10:42 AM
This revision was landed with ongoing or failed builds. Apr 21 2021, 1:54 PM
This revision was automatically updated to reflect the committed changes.