This is an archive of the discontinued LLVM Phabricator instance.

[RFC/WIP] CodeGen: Initial support for references to absolute symbols in immediate operands.
Needs ReviewPublic

Authored by pcc on Sep 30 2016, 6:22 PM.

Details

Summary

Our support for references to absolute symbols is not very good. The symbol
will be resolved accurately in non-PIC code, but suboptimally: the symbol
reference cannot currently appear as the immediate operand of an instruction,
and the code generator cannot make any assumptions about the value of the
symbol (so for example, it could not use a R_X86_64_8 relocation if the
value is known to be in the range 0..255).

In PIC mode, if the reference is not known to be DSO-local, the value is loaded
from the GOT (or a synthetic GOT entry), which again means suboptimal code. If
the reference is known to be DSO-local, the symbol will be referenced with a PC
relative relocation and therefore cannot be resolved properly to an absolute
value (c.f. D19844). The latter case in particular would seem to indicate
that a representational change is required for correctness to distinguish
references to absolute symbols from references to regular symbols. The
specific change I have in mind is to add an "absolute" flag to GlobalObject.

To allow the code generator to narrow absolute global references appearing
as immediates, we would allow !range metadata on GlobalObjects. This would
be similar to existing !range metadata, but it would apply to the "address"
of the attached GlobalObject, rather than any value loaded from it. See
for example the test case included with this patch, which involves a testb
instruction with an absolute immediate operand.

This patch implements part of this change. Rather than introducing the
"absolute" flag, it just checks for !range metadata on GlobalObjects
(switching to the correct representation is a TODO). At the SDAG level,
if a global reference has !range metadata and the target/object format
supports the required relocations (ELF only for now), we combine it into a
special ConstantSDNode which holds a reference to the symbol. Like any other
ConstantSDNode, this ConstantSDNode can appear as an immediate operand. This
means that we now have ConstantSDNodes which are not in fact constant integers,
but to a certain extent this seems like the least worst approach. I considered
other approaches before arriving at this one:

  • Introduce a new opcode for absolute symbol constants. This intuitively seemed like the least risky approach, as individual instructions could "opt in" to the new absolute symbol references. However, this seems hard to fit into the existing SDAG pattern matching engine, as the engine expects each "variable" to have a specific opcode. I tried adding special support for "either of the two constant opcodes" to the matcher, but I could not see a good way to do it without making fundamental changes to how patterns are matched.
  • Use the ISD::Constant opcode for absolute symbol constants, but introduce a separate class for them. This also seemed problematic, as there is a strong assumption (both in existing SDAG code and in generated code) of a many-to-one mapping from opcodes to classes.

In the end I decided that because

  • this ConstantSDNode is morally similar in certain respects to a regular constant that can appear as an immediate, and SDAG already has an opcode which in principle inhibits looking at the constant value (i.e. TargetConstant)
  • we can consider documenting absolute symbol references as an experimental feature
  • we can avoid exposing this feature to C and so we have control over how it is used

it may be reasonable to proceed even in light of potential type confusion
problems.

Event Timeline

pcc updated this revision to Diff 73184.Sep 30 2016, 6:22 PM
pcc retitled this revision from to [RFC/WIP] CodeGen: Initial support for references to absolute symbols in immediate operands..
pcc updated this object.
pcc added reviewers: chandlerc, eugenis, mehdi_amini, sanjoy.
pcc added subscribers: krasin, llvm-commits.
chandlerc edited edge metadata.Sep 30 2016, 11:38 PM

I'm *really* excited to see this fixed. Our handling of these references has been a source of a lot of frustration.

A small comment on the overall design:

This patch implements part of this change. Rather than introducing the
"absolute" flag, it just checks for !range metadata on GlobalObjects
(switching to the correct representation is a TODO).

So you're still planning on adding the absolute bit here? What will that gain us over the range metadata?

At the SDAG level,
if a global reference has !range metadata and the target/object format
supports the required relocations (ELF only for now), we combine it into a
special ConstantSDNode which holds a reference to the symbol. Like any other
ConstantSDNode, this ConstantSDNode can appear as an immediate operand. This
means that we now have ConstantSDNodes which are not in fact constant integers,
but to a certain extent this seems like the least worst approach.

Oof. I see your explanation of why, but this does seem painful. We have both "target" and "opaque" constants already though (although i have *no* idea what "opaque" constants really are), so this doesn't really seem to make things much worse.

In some ways, modeling relocation constants as "target constants" makes a lot of sense...

pcc added a comment.Oct 4 2016, 12:39 PM

So you're still planning on adding the absolute bit here? What will that gain us over the range metadata?

After thinking about this a little more, I think we don't need the absolute bit if we just define !range to mean "the address of this symbol is fixed at link time, and this is its range". That would seem to be the correct definition -- it would override any assumptions about the symbol value that may be derived from the relocation model, code model and symbol visibility rather than being a strict limitation (e.g. consider non-PIC code compiled with the medium code model -- this would be equivalent to a !range of [0..2^32) on every global but the metadata could in principle provide a wider range on 64-bit targets).

Oof. I see your explanation of why, but this does seem painful. We have both "target" and "opaque" constants already though (although i have *no* idea what "opaque" constants really are), so this doesn't really seem to make things much worse.

Agreed, though this has become more painful as I've been extending this patch to handle globals in more places. One possibility for lessening the pain may be to introduce derived classes of ConstantSDNode (one for ConstantInt and another for globals). I'll continue experimenting to see if I can find the best approach.

sanjoy resigned from this revision.Jan 29 2022, 5:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 29 2022, 5:36 PM