This is an archive of the discontinued LLVM Phabricator instance.

LowerBitSets: Extend pass to support functions as bitset members.
ClosedPublic

Authored by pcc on Aug 7 2015, 6:30 PM.

Details

Summary

This change extends the bitset lowering pass to support bitsets that may
contain either functions or global variables. A function bitset is lowered to
a jump table that is laid out before one of the functions in the bitset.

Also add support for non-string bitset identifier names. This allows for
distinct metadata nodes to stand in for names with internal linkage,
as done in D11857.

Diff Detail

Event Timeline

pcc updated this revision to Diff 31567.Aug 7 2015, 6:30 PM
pcc retitled this revision from to LowerBitSets: Extend pass to support functions as bitset members..
pcc updated this object.
pcc added a reviewer: kcc.
pcc added a subscriber: llvm-commits.
pcc updated this revision to Diff 31870.Aug 11 2015, 2:59 PM
  • Permit non-string metadata as a bitset identifier
pcc updated this revision to Diff 31872.Aug 11 2015, 3:06 PM
  • Update docs to reflect metadata identifier change
pcc added a reviewer: jfb.Aug 17 2015, 1:01 PM
pcc added a subscriber: jfb.
pcc updated this object.Aug 17 2015, 1:05 PM
jfb added inline comments.Aug 18 2015, 10:58 AM
lib/Transforms/IPO/LowerBitSets.cpp
226–227

Comment is out of date.

576

I'm not sure I understand padding elements. Could you explain further?

584

Does this work with thread-local globals?
I'm also a bit worried about visibility / section / linkage type, but I may be misunderstanding the implications of what this patch adds.

594

Could this be a virtual function instead? report_fatal_error by default, and only implemented for x86. I'm assuming the creation will also need to be specialized.

657

Can Globals ever be empty? It seems a bit silly, but not entirely unlikely if someone just runs the pass.

672

Give it some dummy internal name starting with __ so it's easier to figure things out. I'm assuming there's a test that'll check for that function name too.

701

clang-format, here and a few other places.

1017

IIUC duplicate indices can't occur?

1034

Eek, the above lines make me cry. LLVM doesn't have a way to create ArrayRef taking into account its magical type hierarchy? It would be nice to avoid the reinterpret_cast here, since it relies heavily on the GlobalObject / GlobalVariable / Function class hierarchy layout...

pcc updated this revision to Diff 32783.Aug 20 2015, 7:03 PM
  • Address review comments
lib/Transforms/IPO/LowerBitSets.cpp
226–227

Updated.

576

I've added a little more explanation to the comments above. Regarding the purpose of the padding elements, this is already documented at http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html#padding-to-powers-of-2

584

Does this work with thread-local globals?

It does not. I've added a couple of checks below that ensure that no thread-local globals or globals with an explicit section can be members of bitsets.

I'm also a bit worried about visibility / section / linkage type

Conceptually, linkage and visiblity are both properties of the symbol, rather than of its position in the object file (as thread-localness and section are), so they can be copied from the original entity to the alias.

Linkage is already copied to the alias. I'm also copying visibility now.

I may be misunderstanding the implications of what this patch adds.

This isn't actually new code; see line 579 of the old code below. For some reason the diff doesn't match up correctly here. But thanks all the same for raising these issues :)

594

It could be, but I would prefer to do this when we add support for another architecture (which is mostly blocked on an implementation of http://lists.llvm.org/pipermail/llvm-dev/2015-July/088748.html or something similar). Once something like that goes in, we should be able to simplify this code significantly as well.

657

It cannot. The buildBitSetsFromGlobalVariables function handles any empty global sets -- see line 894 of the new code. I've added a comment here to make this clear.

672

Done, but it isn't too useful as we end up renaming all functions appearing in bitsets anyway. Not sure if there's much we can do about that without making this code even more complicated.

701

Done

1017

Correct, clarified in the comments.

1034

LLVM doesn't have a way to create ArrayRef taking into account its magical type hierarchy?

I'm not sure if there can be a general way to do this due to covariance/contravariance issues.

It would be nice to avoid the reinterpret_cast here, since it relies heavily on the GlobalObject / GlobalVariable / Function class hierarchy layout...

I decided to just rebuild the array separately on each branch of this conditional. It's a little duplicative, but the code is probably clearest this way.

jfb accepted this revision.Aug 20 2015, 7:15 PM
jfb edited edge metadata.

A few comments, but lgtm overall. It would be nice to get another review on this as it's somewhat tricky stuff :-)

lib/Transforms/IPO/LowerBitSets.cpp
594

That sounds fine, though the documentation should be updated the the appropriate x86 limitations, and potentially evolution out of this.

657

Could you just assert at the function entry? Code tends to evolve past comments ;-)

This revision is now accepted and ready to land.Aug 20 2015, 7:15 PM
pcc updated this revision to Diff 32789.Aug 20 2015, 8:04 PM
pcc edited edge metadata.
  • Address comments
kcc added inline comments.Aug 24 2015, 3:57 PM
docs/BitSets.rst
13–23

maybe a bullet/numbered list?

33

You introduce the concept of jump tables w/o explaining them.
Maybe add a paragraph telling what jump table is?

lib/Transforms/IPO/LowerBitSets.cpp
939

quite a bit of code in a single function. Maybe split it into several?

1017

Two pieces of very similar code.
Can we use some C++ magic to not duplicate the code (template? use the parent class?)

pcc updated this revision to Diff 33036.Aug 24 2015, 6:39 PM
  • Address kcc comments
docs/BitSets.rst
13–23

Done

33

Done

lib/Transforms/IPO/LowerBitSets.cpp
939

This and one other part of the code is now split out.

1017

We could, but I felt that it would be clearest to write the code out on each branch. It's not a lot of code.

kcc edited edge metadata.Aug 25 2015, 6:37 PM

I am still trying to wrap around the code, failing so far.
Maybe you could add a high-level algorithm description in the comments?
Also, two more minor request for more comments.

docs/BitSets.rst
33

jump table feels like a key concept here and deserves a separate paragraph with some more details,
maybe even an illustration

test/Transforms/LowerBitSets/function-ext.ll
2

here and below, please add tests explaining what we are testing for

pcc updated this revision to Diff 33176.Aug 25 2015, 9:10 PM
pcc edited edge metadata.
  • Add more docs
pcc added a comment.Aug 25 2015, 9:11 PM

Maybe you could add a high-level algorithm description in the comments?

Done

docs/BitSets.rst
33

Done,

test/Transforms/LowerBitSets/function-ext.ll
2

Done

kcc added inline comments.Aug 26 2015, 4:10 PM
lib/Transforms/IPO/LowerBitSets.cpp
725

Why do we do this?

Yes, this will reduce the size of the jump table by 8 bytes,
but is this optimization worth the extra complexity?
We must try to be as simple as possible

pcc added inline comments.Aug 26 2015, 4:21 PM
lib/Transforms/IPO/LowerBitSets.cpp
725

The motivation wasn't so much optimization but the fact that the jump table is guaranteed to appear in an executable section if we do this. However, if it would work just to set the section name it may be best to do that instead (and fix it if not). Let me take a look.

krasin added a subscriber: krasin.Aug 26 2015, 5:06 PM

A few shallow comments...

lib/Transforms/IPO/LowerBitSets.cpp
609

What is 0xe9?

781

What happens, if the last function isDiscardableIfUnused? Could it be that a consequent pass would drop the whole jump table?

785

May be a lame question: what if __function_bitset_dummy is taken?

test/Transforms/LowerBitSets/function-ext.ll
4

a minor typo: s/aare/are/

test/Transforms/LowerBitSets/nonstring.ll
21

Other newly added tests have CHECK statements inside foo / bar / baz functions, and this test does not. Is it intentional?

pcc updated this revision to Diff 33286.Aug 26 2015, 5:59 PM
  • Address comments
lib/Transforms/IPO/LowerBitSets.cpp
609

This would be the opcode for the PC-relative jmp instruction. I thought this was clear enough from the variable name and how it is used, but maybe not?

725

Okay, done. The code is much simpler now.

781

This comment is obsoleted by the move to a global variable for the jump table. (But any uses of the jump table would also count as uses of the function, so this shouldn't matter.)

785

Also obsolete

test/Transforms/LowerBitSets/function-ext.ll
4

Fixed and a few other typos here

test/Transforms/LowerBitSets/nonstring.ll
21

I suppose we could do that here as well; done.

kcc added a comment.Aug 31 2015, 5:27 PM

I think I will soon run out of comments. :)

lib/Transforms/IPO/LowerBitSets.cpp
701

instead of 0xe9, 8, 5, and 0xcc in this function please define something like

static const int kIntCode = 0xcc;
static const int kJmpCode = 0xe9;

static const int kOffsetDispSize = 4;
...

use getJumpTableEntrySize instead of '8'

732

Should this be named "Functions"?

768

Is this part of comment still relevant?

pcc updated this revision to Diff 34100.Sep 4 2015, 8:01 PM
  • Update documentation
  • Address review comments
pcc added inline comments.Sep 4 2015, 8:02 PM
lib/Transforms/IPO/LowerBitSets.cpp
950

Done

981

Done

1017

Done

kcc accepted this revision.Sep 8 2015, 12:21 PM
kcc edited edge metadata.

LGTM

I am out of useful comments (except one minor one below)

lib/Transforms/IPO/LowerBitSets.cpp
986

static, please (or anon namespace, or a class member )

This revision was automatically updated to reflect the committed changes.