This is an archive of the discontinued LLVM Phabricator instance.

`ForceFunctionAttrs`: support overriding attributes via csv file
ClosedPublic

Authored by Puneeth-A-R on Jul 18 2023, 9:06 AM.

Details

Summary

Update ForceFunctionAttrs pass to optionally take its input from a csv file, for example, function-level optimization attributes. A subsequent patch will enable the pass pipeline to be aware of these attributes, and this pass will be used to test that is the case. Eventually, the annotations would be driven by an agent, e.g. a machine learning-based policy.

This patch is a part of GSoC 2023, more details can be found here

Diff Detail

Event Timeline

Puneeth-A-R created this revision.Jul 18 2023, 9:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2023, 9:06 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
Puneeth-A-R requested review of this revision.Jul 18 2023, 9:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2023, 9:06 AM
This comment was removed by Puneeth-A-R.

Updated revision

mtrofin added inline comments.Jul 18 2023, 4:21 PM
llvm/include/llvm/Transforms/Utils/FunctionAnnotator.h
17

you need to run clang-format on your changes - (e.g. this #endif should be lined up at the beginning of the line; the class statement should be lined up with namespace

you can for example do git clang-format HEAD~1 at this point.

llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
11

remove commented code - if it's part of a future change, you can have that in your local repo (uncommented :) ).

35

I think we generally don't want to use std::ifstream. Use MemoryBuffer. Kind of like this:

auto BufferOrError = MemoryBuffer::getFileOrSTDIN(CSVFilePath);
  if (!BufferOrError) {
    Ctx.emitError("Error opening output specs file: " + CSVFilePath + " : " +
                  BufferOrError.getError().message());
    return std::nullopt;
  }
...
now use BufferOrError.get()->getBuffer()

Then, you can traverse the StringRef with split and so on.

The main value of the StringRef approach (and, IIUC, the reason it's preferred in this codebase) is it doesn't allocate lots of little string objects on the heap.

43

nit: const Function &F

Puneeth-A-R marked an inline comment as done.Jul 19 2023, 7:33 AM
Puneeth-A-R added inline comments.
llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
43

Turns out, there isn't an overload of Function that has addFnAttr function in it. So that can't be made a const.

Puneeth-A-R marked an inline comment as done.Jul 20 2023, 8:35 AM
Puneeth-A-R updated this revision to Diff 546108.EditedAug 1 2023, 9:55 AM

Got rid of std::ifstream. Now using MemoryBuffer. And made other requested changes.

Puneeth-A-R marked 2 inline comments as done.Aug 1 2023, 9:59 AM
Puneeth-A-R added inline comments.Aug 1 2023, 10:04 AM
llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
21

@mtrofin Does this need to be closed? If so, how to go about doing it? From several examples that I saw, it seems this doesn't need to be closed.

mtrofin added inline comments.Aug 2 2023, 10:01 AM
llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
12
12

I'm a bit confused how the test works since it's not passing -opt-level-attribute-name?

That aside, maybe have the default value not be "" but instead the actual attribute name. If there can only be one value, maybe just make this a const char* for now.

17

could you give this flag a less general name like -func-annotator-csv-file-path

21

See the implementation - the file is loaded in entirety in the memory buffer, then closed.

27

naming - first letter capital

but more importantly, doesn't BufferOrError.get() give you the memory buffer?

28

nit: why SkipBlanks=false? (fine by me, the csv is for test - just curious)

28

Itr (first letter must be capital)

we tend to call iterators "I" or "It"

34

This traversal assumes the order in which you'll visit the functions in the module coincides with the order they are in the csv. While this is for test, it's also fragile and easily avoidable: you can iterate through the csv, and use Module::getFunction(StringRef) to fetch the corresponding Function* - which will be nullptr if it's not defined/declared.

35

I think the preferred style is to use "empty()", i.e. if (!It->split(',').second.empty())

also - while at it. Don't call operator*(), just dereference, so It->split() should work.
(this last comment applies in a few more places)

71

newline

llvm/test/Transforms/Annotations/FunctionAnnotation.csv
7

newline

llvm/test/Transforms/Annotations/FunctionAnnotator.ll
2

more canonical is to have -csv-file-path=%S/FunctionAnnotation.csv

%S will be substituted to the directory of the .ll you're running.

4

this .ll should define some functions - specifically, those you name in the csv. can be as easy as

define void @first_function() {
  ret
}

etc.

Puneeth-A-R marked 10 inline comments as done.Aug 7 2023, 12:30 PM
Puneeth-A-R added inline comments.
llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
12

Will be talking about the test case on call.

21

Oh yeah! Silly me.

27

BufferOrError.get() seems to return llvm::ErrorOr<std::__1::unique_ptr<llvm::MemoryBuffer>> and not a plain memory buffer.

28

I wonder why... just a convention or is there a specific reason?

Puneeth-A-R marked 3 inline comments as done.Aug 7 2023, 12:37 PM

Order of functions appearing in CSV file can now be independent of their
sequence of appearance in code/IR + more.

mtrofin added inline comments.Aug 7 2023, 1:00 PM
llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
28

The "first letter capital" is coding convention (https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly)

The name itself is more like what became, arguably, convention - and it's easier to read by others if we follow it.

Updated test case.

Puneeth-A-R marked 2 inline comments as done.Aug 9 2023, 8:57 AM
mtrofin added inline comments.Aug 9 2023, 2:42 PM
llvm/lib/Transforms/Utils/FunctionAnnotator.cpp
33

Nit: I'd avoid having this as a flag. Let's just say the first row must be header.

56

how about instead of a while loop:

for(; !It.is_at_end(); ++It) {

then no need for ++It; everywhere

61

I think you want to use errs() here. Probably could use LLVMContext::diagnose, too, but since this is for test, errs() should be fine. In any case, not outs().

llvm/test/Transforms/Annotations/FunctionAnnotator.ll
3

you want to check also that first_function has associated with it O1, while second_function has O3... etc. If you run your opt command line over this .ll file, you should see how the annotations appear in the output text.

also check stderr for the messages saying that fourth and fifth functions weren't found.

For the latter, just add 2>&1 to the end of the opt command line, so you can combine stdout and stderr together to what FileCheck sees. the error messages for fourth and fifth functions should come in order - because of how you traverse the csv.

Puneeth-A-R marked 5 inline comments as done.Aug 16 2023, 8:54 AM

Final patch

Puneeth-A-R retitled this revision from [WIP] GSoC 2023: Pass to annotate functions with appropriate optimization level. to Pass to annotate functions with appropriate optimization level..Aug 16 2023, 10:41 AM
Puneeth-A-R edited the summary of this revision. (Show Details)
Puneeth-A-R marked an inline comment as done.

it seems like it'd be better to extend ForceFunctionAttrsPass rather than add a separate pass

Updated patch

Final working patch

mtrofin added inline comments.Aug 23 2023, 11:53 AM
llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
84 ↗(On Diff #552830)

I'd check explicitly that CSVFilePath.empty(). Right now you're saying "we'll do the other behavior if the csv file path is invalid", but that's surprising to a user that means to use a csv file and mis-types it. So it should be:

  • if the CSVFilePath is empty (== not given), then do the old thing
  • otherwise, do the new thing, which includes emitting error if the path is invalid.

Final working patch 2.0

Puneeth-A-R marked an inline comment as done.Aug 23 2023, 12:07 PM

Updated patch 3.0

Updated patch 4.0

mtrofin accepted this revision.Aug 23 2023, 12:33 PM
This revision is now accepted and ready to land.Aug 23 2023, 12:33 PM

I think this hardcodes a bit too much about the "opt-level" attribute. I was hoping that we could make force-function-attrs generic enough without knowing specifically about opt-level.

e.g. the csv would look like

first_function,opt-level=O1
second_function,opt-level=O3
third_function,opt-level=O1
fourth_function,opt-level=O2
fifth_function,opt-level=O3

and it could contain other attributes like

second_function,cold
fifth_function,foo=bar
aeubanks requested changes to this revision.Aug 23 2023, 1:10 PM
This revision now requires changes to proceed.Aug 23 2023, 1:10 PM

if you were planning on submitting opt-level-specific things, I'd be hesitant to land it unless there was proof that it would actually be used

but if this patch just made force-function-attrs more generic, that'd be good to land

Final Updated Patch.

Final Updated Patch.

lgtm, some nits you can easily address.

@aeubanks, ptal - this update should address your concerns, iiuc?

technically you would need a test for the error case, but maybe that's overkill because this is a test-only pass.

llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
35 ↗(On Diff #555632)

you don't need this anymore, since it's taken care of by the csv format.

107 ↗(On Diff #555632)

Attrkind (style guide)

also, since you're setting it right after from the getAttrKindFromName function call, just initialize it from that rather than setting it to None, then calling that function.

115 ↗(On Diff #555632)

Nit: it is more readable, I think, if you did this:

for(++It... etc)
  auto SplitPair = ..
  if (SplitPair.second.empty()) {
    errs() << ...
    break;
  }
  ...

because you don't need an else (since you break), and thus less indentation.

Also break because... why bother continuing. It's an error anyway.

Puneeth-A-R marked 2 inline comments as done.Sep 4 2023, 9:37 AM
Puneeth-A-R added inline comments.
llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
115 ↗(On Diff #555632)

Well, the if (!SplitPair.second.empty()) { does not have a corresponding else. I think, the else that you are referring to actually reports the error if the function name is present in the CSV file, but is not found in the module's IR. Because, if the function doesn't exist in the IR, adding any attributes to it would not be possible. But the upcoming lines in the CSV might have a correct function name. So that's what the continue does on the next line. We skip the current iteration and move on to the next.

Final updated patch.

Final Updated Patch.

Puneeth-A-R marked an inline comment as done.Sep 4 2023, 10:53 AM

lgtm, some nits you can easily address.

@aeubanks, ptal - this update should address your concerns, iiuc?

yeah this is what I was imagining, thanks!

technically you would need a test for the error case, but maybe that's overkill because this is a test-only pass.

should be pretty straightforward

; RUN: not opt -passes=forceattrs -func-annotator-csv-file-path=doesntexist.csv < %s | FileCheck %s --check-prefix=CSVNOTFOUND
; CSVNOTFOUND: Cannot open CSV file
llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
36 ↗(On Diff #555772)

I'd shorten this to something like forceattrs-csv-path

37 ↗(On Diff #555772)
Path to CSV file containing lines of function names and attributes to add to them in the form of `f1,attr1` or `f2,attr2=str`
78 ↗(On Diff #555772)

still unused, let's revert this

80 ↗(On Diff #555772)

nit: remove random empty lines

83 ↗(On Diff #555772)

nit: remove braces for one line if statements (but not ones with else statements that aren't also one line)
https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

105 ↗(On Diff #555772)

you can have string attributes without a value, but we can leave that for a future patch, please add a TODO

106 ↗(On Diff #555772)

should also report errors on this

113 ↗(On Diff #555772)

should probably report_fatal_error if the function doesn't exist (at the end of the pass so we can report all missing functions)

116 ↗(On Diff #555772)

since the flags for this pass are orthogonal, we shouldn't only run the csv code if both the csv flag and the other force-function-attrs flags are passed, we should do both

llvm/test/Transforms/ForcedFunctionAttrs/FunctionAnnotation.csv
1 ↗(On Diff #555772)

remove?

aeubanks added inline comments.Sep 4 2023, 9:38 PM
llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
111 ↗(On Diff #555772)

this should also have a test

you can reuse the .ll test but with a different .csv that contains a function name not in the module

Puneeth-A-R marked 6 inline comments as done.Sep 5 2023, 7:25 AM
Puneeth-A-R added inline comments.
llvm/test/Transforms/ForcedFunctionAttrs/FunctionAnnotation.csv
1 ↗(On Diff #555772)

@mtrofin? We had decided upon having the header, should I go ahead & remove it?

Puneeth-A-R marked 2 inline comments as done.Sep 5 2023, 8:14 PM
Puneeth-A-R added inline comments.
llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
113 ↗(On Diff #555772)

@aeubanks is it okay to have this as a TODO for now?

aeubanks added inline comments.Sep 6 2023, 9:03 AM
llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
113 ↗(On Diff #555772)

sure

Puneeth-A-R marked 2 inline comments as done.Sep 7 2023, 9:35 AM
mtrofin added inline comments.Sep 7 2023, 9:59 AM
llvm/test/Transforms/ForcedFunctionAttrs/FunctionAnnotation.csv
1 ↗(On Diff #555772)

I'm ambivalent, I kind of like it's there because it's easy to read (in the test, esp.), but if @aeubanks really wants it out, it's also easy to remove.

aeubanks added inline comments.Sep 7 2023, 10:08 AM
llvm/test/Transforms/ForcedFunctionAttrs/FunctionAnnotation.csv
1 ↗(On Diff #555772)

this patch isn't specific to optimization levels anymore, so I don't see the point in keeping it. and it'll break if we start checking that function names here are actually in the IR

Puneeth-A-R marked 4 inline comments as done.Sep 7 2023, 10:31 AM

Final patch.

almost looks good, just a couple of suggestions

llvm/lib/Transforms/IPO/ForceFunctionAttrs.cpp
118 ↗(On Diff #556175)

we can only return this if we haven't made any changes to the IR.

the simplest way is have a bool Changed = false; at the top, set it to true in the !CSVFilePath.empty() block, and change the below to

if (hasForceAttributes()) {
  for (Function &F : M.functions())
    forceAttributes(F);
  Changed = true;
}
// Just conservatively invalidate analyses if we've made any changes, this isn't likely to be important.
return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
llvm/test/Transforms/ForcedFunctionAttrs/FunctionAnnotator.ll
35 ↗(On Diff #556175)

we typically put all RUN lines at the very top

Final Patch.

Final Patch.

Puneeth-A-R marked 2 inline comments as done.Sep 7 2023, 11:20 AM
aeubanks accepted this revision.Sep 7 2023, 11:21 AM

lgtm, I actually really like this addition to the pass, thanks!

This revision is now accepted and ready to land.Sep 7 2023, 11:21 AM

please update the commit title and description before pushing

mtrofin retitled this revision from Pass to annotate functions with appropriate optimization level. to `ForceFunctionAttrs`: support overriding attributes via csv file.Sep 7 2023, 3:29 PM
mtrofin edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Sep 7 2023, 3:40 PM
This revision was automatically updated to reflect the committed changes.