This is an archive of the discontinued LLVM Phabricator instance.

[StackSafety] Make full LTO to attach metadata if MTE is enabled
Changes PlannedPublic

Authored by vitalybuka on May 15 2020, 4:17 PM.

Details

Reviewers
eugenis
Summary

Right now need StackSafety info only for -fsanitize=memtag and
target is arch64. In future we will need that for asan, hwasa on
any target.

I assume the only way LTO knows about nothing about sanitizers,
so we should always to insert the pass. However we'd like skip
actual work if not sanitizers where enabled. To do so we can scan
function attributes, but more efficient is just to add a module flag.

Depends on D80039.

Diff Detail

Event Timeline

vitalybuka created this revision.May 15 2020, 4:17 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 15 2020, 4:17 PM

I think this is an OK approach.
Perhaps a module flag would be even better, then it can be decoupled from "sanitize_memtag", and it would not require iterating over the entire module. You can check how it is used in CFI: !"CFI Canonical Jump Tables".

It could also be better, instead of iterating over all functions, simply run the global pass, but make sure that it does not do too much work on non-sanitize_memtag functions. I.e. the function pass should bail out on such functions, and then the data flow will have no data to run on, so that should also be pretty fast.

vitalybuka added a comment.EditedMay 15 2020, 5:12 PM

It could also be better, instead of iterating over all functions, simply run the global pass, but make sure that it does not do too much work on non-sanitize_memtag functions. I.e. the function pass should bail out on such functions, and then the data flow will have no data to run on, so that should also be pretty fast.

Actually I don't think it's right this to do. Function pass is analysis. If it's not needed, just don't query it.

module flag

Harbormaster failed remote builds in B56945: Diff 264398!
Harbormaster failed remote builds in B56944: Diff 264397!

If I'm understanding this correctly, the stack-safe metadata on allocas is both produced and consumed at LTO-time. So at the point where the stack-safe metadata would be produced, we can compute whether any later passes will query it.

Given that, why do you need a module flag?

If I'm understanding this correctly, the stack-safe metadata on allocas is both produced and consumed at LTO-time. So at the point where the stack-safe metadata would be produced, we can compute whether any later passes will query it.

Given that, why do you need a module flag?

I assumed a pass knows nothing about consumers. Can you point to some examples how this can be done?

vitalybuka added a comment.EditedMay 18 2020, 4:57 PM

If I'm understanding this correctly, the stack-safe metadata on allocas is both produced and consumed at LTO-time. So at the point where the stack-safe metadata would be produced, we can compute whether any later passes will query it.

Given that, why do you need a module flag?

I assumed a pass knows nothing about consumers. Can you point to some examples how this can be done?

Oh, I assume the idea to take that from AArch64TargetMachine.
Still AArch64TargetMachine is going to use this info only if sanitizers where enabled. Also right now SSI is used for memtag. but asan, hwasan may follow.
I assume LTO PM knows nothing about if clang was using sanitizers, so we have always to insert the pass. Then we'd like to return from pass if nothing to do there.
The pass may check if there are functions with corresponding sanitizer attributes. However with module flag we can make it more efficient and avoid scanning all functions.

BTW. function attribute was used in the first snapshot, but @eugenis proposed the flag and I agree with him that now it's nicer.

vitalybuka edited the summary of this revision. (Show Details)May 18 2020, 5:03 PM

Okay, I think we're mostly on the same page, then.

I have a few issues here:

  1. Whether the backend wants this information is really a per-function decision, not a per-module decision; using module-level metadata is sort of weird.
  2. Having weird rules like this makes it harder to write a non-clang frontend producing LLVM IR.
  3. All metadata produced and used by in-tree code needs to be documented in LangRef.
  4. Is there some reason this analysis needs to run in the middle of the pass pipeline, as opposed to running it closer to where we actually use the information? Along those lines, do we need to encode this as metadata at all?

Okay, I think we're mostly on the same page, then.

Probably most important is Q no. 4.

I have a few issues here:

  1. Whether the backend wants this information is really a per-function decision, not a per-module decision; using module-level metadata is sort of weird.

Correct. Seems like redundancy.
BTW. Even if the function is not sanitized, it's stack safety analysis can be useful for sanitized callers.

  1. Having weird rules like this makes it harder to write a non-clang frontend producing LLVM IR.

So it's traidoff , quit pass early vs more computations. I prefer the flag, but not strongly.

  1. All metadata produced and used by in-tree code needs to be documented in LangRef.

I'll update if we go this way

  1. Is there some reason this analysis needs to run in the middle of the pass pipeline, as opposed to running it closer to where we actually use the information? Along those lines, do we need to encode this as metadata at all?

In the current design StackSafety is ModulePass and AArch64StackTagging is FunctionPass.
We have some difficulty of using ModulePass from FunctionPass without significant reordering passes. But I suspect it's possible to find good solution. I'll try to get more details.
If we go that way then this patch is not needed.

vitalybuka planned changes to this revision.May 26 2020, 2:20 PM

I'll probably abandon this. I've converted the pass into pure analysis and I need to upstream that.