This is an archive of the discontinued LLVM Phabricator instance.

Add sanitizer-specific GlobalValue attributes.
ClosedPublic

Authored by hctim on May 20 2022, 4:09 PM.

Details

Summary

Plan is the migrate the global variable metadata for sanitizers, that's
currently carried around generally in the 'llvm.asan.globals' section,
onto the global variable itself.

This patch adds the attribute and plumbs it through the LLVM IR and
bitcode formats, but is a no-op other than that so far.

Diff Detail

Event Timeline

hctim created this revision.May 20 2022, 4:09 PM
hctim requested review of this revision.May 20 2022, 4:09 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 20 2022, 4:09 PM
hctim updated this revision to Diff 431508.May 23 2022, 3:56 PM

Add an extra Global creation path from the frontend, and fix up a comment from Kirill on the other revision.

hctim updated this revision to Diff 431810.May 24 2022, 3:06 PM

Remove extra creation path, turns out it's a duplicate of elsewhere.

(friendly ping)

kstoimenov accepted this revision.May 26 2022, 2:58 PM

LGTM, but get one from vitalybuka@ too.
Sorry about delayed review, I missed it. Next time ping me if I don't respond within a day.

llvm/lib/IR/Globals.cpp
225

Do we expect that 'this' is in the map? If so could you add an assert here using 'find' for example?

This revision is now accepted and ready to land.May 26 2022, 2:58 PM
vitalybuka requested changes to this revision.EditedMay 27 2022, 5:10 PM

all stuff under llvm/... here needs testing, Bitcode and AsmParser have many tests usable as examples

Also probably better to separate from clang changes:
Patch 1: extends LLVM IR
Patch 2: Integrate new LLVM IR feature into clang

LangRef.rst part will probably requires wider review. I suspect there will be strong opinions.
So maybe you can extract just a document into a 3rd patch.

Also it does not really document what new attributes do.

For review context you can organize them into Stack.

This revision now requires changes to proceed.May 27 2022, 5:10 PM
hctim updated this revision to Diff 433889.Jun 2 2022, 2:30 PM
hctim marked an inline comment as done.

Add round-trip tests, fork out the clang and langref changes for further commits.

hctim added a comment.Jun 2 2022, 2:33 PM

Also, the bitset now doesn't have positive-inclusions for ASan/HWASan, and no mention of MTE yet (which will come later). Sanitizers are applied to all GVs implicitly, so we only need to carry around the exclude masks. Added a comment to the struct that clarifies this.

vitalybuka added inline comments.
llvm/include/llvm/AsmParser/LLToken.h
440

it should be next to other kw_ stuff

llvm/include/llvm/IR/GlobalValue.h
123

It's subclass data, but you keep this data in this class.

316–328
331

get/set for each flag?

llvm/lib/AsmParser/LLParser.cpp
55

you should put these inside of fuctions where it's used.

1263

it would be nice to keep existing "else if" style one per kw
with separate set/get it should be easy

1280

why it's not a part of parseFnAttributeValuePairs ?

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
3547

Why DynInit has own record?

hctim marked 7 inline comments as done.Jun 3 2022, 6:12 PM
hctim added inline comments.
llvm/include/llvm/AsmParser/LLToken.h
440

hoisted them up

llvm/include/llvm/IR/GlobalValue.h
123

this is the spare bits over for the subclass to use. because we keep an additional bit in this class, we give one less bit out to the subclass.

331

as discussed offline, no point with our nice new members.

llvm/lib/AsmParser/LLParser.cpp
1263

i think there's enough commonality with the hasSanitizerMetadata() -> getSanitizerMetadata -> set flag -> setSanitizerMetadata -> lex that i'd normally opt to factor this out anyway. cleaned up the helper function using the new method of setting attributes though.

1280

as discussed offline, global attributes have different semantics / no generalized parsing, unlike functions

hctim updated this revision to Diff 434228.Jun 3 2022, 6:12 PM
hctim marked 4 inline comments as done.

Update from Vitaly's comments.

vitalybuka added inline comments.Jun 3 2022, 7:19 PM
llvm/include/llvm/AsmParser/LLToken.h
181

Have you considered in Attributes.inc ?

/// Can be used as global attribute.
def GvAttr : AttrProperty;

llvm/include/llvm/IR/GlobalValue.h
329

serialization should not be in this header, this bit stuff is /Bitcode/ specific, both should be there

llvm/lib/AsmParser/LLLexer.cpp
583

redundant comment, obvious from the name

llvm/lib/IR/AsmWriter.cpp
3540

AsmWriter/Parser should be tested in llvm-project/llvm/test/Assembler/
e.g. Assembler/diglobalvariable.ll

LGTM, with Asm test and bitcode stuff moved as explained.

hctim updated this revision to Diff 434618.Jun 6 2022, 2:35 PM
hctim marked 5 inline comments as done.

Update w/ Vitaly's comments.

llvm/include/llvm/AsmParser/LLToken.h
181

I took a look at this, but adding GV support to Attributes.inc requires a lot more plumbing. There's bits to be added to Verifier.cpp, LLParser.cpp, LLLexer.cpp, Attributes.cpp, etc. I think it's better to leave that as a future refactoring exercise, along with the rest of the GV attributes.

llvm/lib/IR/AsmWriter.cpp
3540

/s/diglobalvariable/globalvariable-attributes, done

also tested in compatibility.ll

vitalybuka added inline comments.Jun 6 2022, 7:17 PM
llvm/include/llvm/IR/GlobalValue.h
329

serialization should not be in this header, this bit stuff is /Bitcode/ specific, both should be there

I don't see it's done

llvm/lib/IR/AsmWriter.cpp
3543–3552

even if it looks redundant I would not recommend to optimize here,
AsmWriter should not care about semantics. if you let SanitizerMetadata assign this way, then AsmWriter should store it as is.

llvm/lib/IR/Globals.cpp
70

It should copy unconditionally.
E.g. if "this" had Metadata, but src does not , it must be reset here.

231
vitalybuka added inline comments.Jun 7 2022, 1:16 PM
llvm/include/llvm/AsmParser/LLToken.h
397

do we need no_sanitize on IR level?
We should map it to particular no_sanitize_nnnnn

vitalybuka updated this revision to Diff 434955.EditedJun 7 2022, 2:21 PM

fix rebase (mostly undo)
I didn't plan to update anything here, and somehow uploaded D126929 here by mistake

hctim marked 5 inline comments as done.Jun 8 2022, 9:45 AM
hctim added inline comments.
llvm/include/llvm/AsmParser/LLToken.h
397

Removed and mapped it to the no_sanitize_*.

llvm/include/llvm/IR/GlobalValue.h
329

hmm, yeah, sorry bout that. bad upload i guess.

llvm/lib/IR/AsmWriter.cpp
3543–3552

roger, done

llvm/lib/IR/Globals.cpp
70

thanks for the catch.

we do have an invariant about "don't call getSanitizerMetadata when HasSanitizerMetada is false", and i'd rather not force these structs to be stored if there's no metadata, but i have fixed the issue.

231

as above, rather not waste memory on structs if they don't exist, so keeping the invariant

hctim updated this revision to Diff 435219.Jun 8 2022, 9:45 AM
hctim marked 5 inline comments as done.

Update.

vitalybuka accepted this revision.Jun 8 2022, 1:26 PM
vitalybuka added inline comments.
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
1379

I guess VBR encoding does not like UINT_MAX, it will occupy more space then unconditional pushback.
Please use the same approach as: Vals.push_back(GV.hasComdat() ? VE.getComdatID(GV.getComdat()) : 0);

This revision is now accepted and ready to land.Jun 8 2022, 1:26 PM
This revision was landed with ongoing or failed builds.Jun 10 2022, 12:45 PM
This revision was automatically updated to reflect the committed changes.
hctim marked an inline comment as done.