This is an archive of the discontinued LLVM Phabricator instance.

[globalisel] Allow SrcOp to convert an APInt and render it as an immediate operand (MO.isImm() == true)
ClosedPublic

Authored by dsanders on Apr 30 2019, 9:43 AM.

Event Timeline

dsanders created this revision.Apr 30 2019, 9:43 AM

For the use cases, I think this should just be int64_t instead of APInt

For the use cases, I think this should just be int64_t instead of APInt

int64_t was my first thought for it but I got rather worried about the potential mistakes due to registers being unsigned. For example, is 1 a register or an immediate? It turned out that my worries were moot as the compiler doesn't allow both an unsigned and an int64_t overload anyway (https://godbolt.org/z/zO8LhX)

For the use cases, I think this should just be int64_t instead of APInt

int64_t was my first thought for it but I got rather worried about the potential mistakes due to registers being unsigned. For example, is 1 a register or an immediate? It turned out that my worries were moot as the compiler doesn't allow both an unsigned and an int64_t overload anyway (https://godbolt.org/z/zO8LhX)

For the use cases, I think this should just be int64_t instead of APInt

int64_t was my first thought for it but I got rather worried about the potential mistakes due to registers being unsigned. For example, is 1 a register or an immediate? It turned out that my worries were moot as the compiler doesn't allow both an unsigned and an int64_t overload anyway (https://godbolt.org/z/zO8LhX)

I vaguely remember trying to do something like this for SrcOp before, and hit this problem but didn't continue working on it. If you disambiguate the constant type, it works: https://godbolt.org/z/aLyc1T
I don't think using needing to use INT64_C is really worse than having to construct an APInt. I really despise using a plain unsigned for registers, but fixing that would be a hopelessly large project

ehhh

I agree it would be nice to use an int64_t because it matches up with the MachineOperand API. (e.g. getImm() returns an int64_t)

But, OTOH, I can see the benefit of disambiguating the type using an APInt.

Overall, the patch looks fine to me. I'd be fine with either solution. @arsenm, do you feel strongly about this?

How terrible would it be to use APInt as the interface type, but then store it as int64_t?

How terrible would it be to use APInt as the interface type, but then store it as int64_t?

That seems like a good compromise to me.

How terrible would it be to use APInt as the interface type, but then store it as int64_t?

That seems like a good compromise to me.

I don't think I agree. APInt as an interface type won't really save any typing, so what advantage would this compromise have? It's about the same to write SrcOp(APInt(x)) vs SrcOp(int64_t(x)), but the APInt version is more expensive and if we're not actually handling arbitrary precision it's quite a bit more confusing.

How terrible would it be to use APInt as the interface type, but then store it as int64_t?

That seems like a good compromise to me.

I don't think I agree. APInt as an interface type won't really save any typing, so what advantage would this compromise have? It's about the same to write SrcOp(APInt(x)) vs SrcOp(int64_t(x)), but the APInt version is more expensive and if we're not actually handling arbitrary precision it's quite a bit more confusing.

I think it'd be nice to not worry about picking the wrong overloads. For eg, if you have an unsigned constant and you forget to explicitly cast to int64_t it will still compile (and pick the register version even though we wanted the immediate) - where as if the only way to get immediate is APInt and we forget to cast, it will not compile.

We could try to be heroic and use something stronger than unsigned for registers

We could try to be heroic and use something stronger than unsigned for registers

We could try to be heroic and use something stronger than unsigned for registers

Are you talking about using stronger types for registers just in the Builder interface? What would the usage look like? My big concern (besides the massive change this would take) is how verbose it would be to use the APIs.

We could try to be heroic and use something stronger than unsigned for registers

We could try to be heroic and use something stronger than unsigned for registers

Are you talking about using stronger types for registers just in the Builder interface? What would the usage look like? My big concern (besides the massive change this would take) is how verbose it would be to use the APIs.

I mean everywhere, but that's a huge project. I imagine a struct Register { unsigned X; isVirtual(); isPhysical() }

We could try to be heroic and use something stronger than unsigned for registers

We could try to be heroic and use something stronger than unsigned for registers

Are you talking about using stronger types for registers just in the Builder interface? What would the usage look like? My big concern (besides the massive change this would take) is how verbose it would be to use the APIs.

I mean everywhere, but that's a huge project. I imagine a struct Register { unsigned X; isVirtual(); isPhysical() }

I'm sure Aditya could cook up one of his clang plugins to automate it.

We could try to be heroic and use something stronger than unsigned for registers

Are you talking about using stronger types for registers just in the Builder interface? What would the usage look like? My big concern (besides the massive change this would take) is how verbose it would be to use the APIs.

I mean everywhere, but that's a huge project. I imagine a struct Register { unsigned X; isVirtual(); isPhysical() }

FWIW, that would be my preferred solution if it weren't for the sheer amount of time required to achieve it. It's always felt wrong that we use unsigned to represent registers.

We could try to be heroic and use something stronger than unsigned for registers

Are you talking about using stronger types for registers just in the Builder interface? What would the usage look like? My big concern (besides the massive change this would take) is how verbose it would be to use the APIs.

I mean everywhere, but that's a huge project. I imagine a struct Register { unsigned X; isVirtual(); isPhysical() }

FWIW, that would be my preferred solution if it weren't for the sheer amount of time required to achieve it. It's always felt wrong that we use unsigned to represent registers.

I tried prototyping it, and it seems to not be entirely infeasible: D63496

I think managing the migration inside GlobalISel should be manageable enough to avoid having an unsigned overload of SrcOp

I'm currently rebasing this patch switching APInt to uint64_t and one thing I'm a bit worried about is the case where the implicit conversion from Register to unsigned causes registers to be passed where immediates are expected. At the moment our plan guards against the case where Register is correctly used but not against trivial misuse like:

unsigned Reg = Register();
SrcOp(Reg)

and with the current unsigned constructor, this:

SrcOp(1)

will be treated as a register.

I had a look into the feasibility of eliminating the implicit conversions and that's way too big a task at the moment to block this on fixing that (I'll post some patches that propagate Register a bit more tomorrow, they were a side-effect of looking into it). I have a potential alternative though. We can add some deleted constructors like in https://godbolt.org/z/0zQOJX. The downside of this is plain constants like 0 won't work, you need to use 0ul. Does that limitation sound ok if it makes mixing up registers and immediates harder?

I'm currently rebasing this patch switching APInt to uint64_t and one thing I'm a bit worried about is the case where the implicit conversion from Register to unsigned causes registers to be passed where immediates are expected. At the moment our plan guards against the case where Register is correctly used but not against trivial misuse like:

unsigned Reg = Register();
SrcOp(Reg)

and with the current unsigned constructor, this:

SrcOp(1)

will be treated as a register.

I had a look into the feasibility of eliminating the implicit conversions and that's way too big a task at the moment to block this on fixing that (I'll post some patches that propagate Register a bit more tomorrow, they were a side-effect of looking into it). I have a potential alternative though. We can add some deleted constructors like in https://godbolt.org/z/0zQOJX. The downside of this is plain constants like 0 won't work, you need to use 0ul. Does that limitation sound ok if it makes mixing up registers and immediates harder?

Of course, the moment I post I notice a case I'd missed. AArch64::WZR is an unsigned, we'd have to pass it as Register(AArch64::WZR) to avoid hitting the deleted constructor

arsenm added a comment.Aug 1 2019, 8:52 PM

I'm currently rebasing this patch switching APInt to uint64_t and one thing I'm a bit worried about is the case where the implicit conversion from Register to unsigned causes registers to be passed where immediates are expected. At the moment our plan guards against the case where Register is correctly used but not against trivial misuse like:

unsigned Reg = Register();
SrcOp(Reg)

and with the current unsigned constructor, this:

SrcOp(1)

will be treated as a register.

I had a look into the feasibility of eliminating the implicit conversions and that's way too big a task at the moment to block this on fixing that (I'll post some patches that propagate Register a bit more tomorrow, they were a side-effect of looking into it). I have a potential alternative though. We can add some deleted constructors like in https://godbolt.org/z/0zQOJX. The downside of this is plain constants like 0 won't work, you need to use 0ul. Does that limitation sound ok if it makes mixing up registers and immediates harder?

Of course, the moment I post I notice a case I'd missed. AArch64::WZR is an unsigned, we'd have to pass it as Register(AArch64::WZR) to avoid hitting the deleted constructor

This is already an issue I wasn’t able to come up with a good solution for independent of adding the new imm SrcOp

I'm currently rebasing this patch switching APInt to uint64_t and one thing I'm a bit worried about is the case where the implicit conversion from Register to unsigned causes registers to be passed where immediates are expected. At the moment our plan guards against the case where Register is correctly used but not against trivial misuse like:

unsigned Reg = Register();
SrcOp(Reg)

and with the current unsigned constructor, this:

SrcOp(1)

will be treated as a register.

I had a look into the feasibility of eliminating the implicit conversions and that's way too big a task at the moment to block this on fixing that (I'll post some patches that propagate Register a bit more tomorrow, they were a side-effect of looking into it). I have a potential alternative though. We can add some deleted constructors like in https://godbolt.org/z/0zQOJX. The downside of this is plain constants like 0 won't work, you need to use 0ul. Does that limitation sound ok if it makes mixing up registers and immediates harder?

Of course, the moment I post I notice a case I'd missed. AArch64::WZR is an unsigned, we'd have to pass it as Register(AArch64::WZR) to avoid hitting the deleted constructor

This is already an issue I wasn’t able to come up with a good solution for independent of adding the new imm SrcOp

The main thing I need to establish to update and land a version of this patch is what our preference is. Do we prefer?:

  1. having immediates wrapped in APInt and registers being unsigned/Register. This is more compatible with the current code.
  2. immediates being APInt/uint64_t/int64_t, immediate constants being 0ul, registers being wrapped in Register, and unsigned being invalid. Later when the implicit Register->unsigned conversion is deleted re-allow unsigned as immediates
  3. something else
  1. is less invasive to the current code and is much easier to land but 2. better aligns with our long term thoughts on eliminating unsigned for registers. My opinion depends on difficulty with 2 being my first choice and 1 being my fallback if changing unsigned->Register significantly exceeds my expectations in terms of how much porting work to the existing code is needed.

What are your preferences?

I would vote for option 2

If option 2 is possible then go for it. If it's too much for work for now, then fallback to 1.

Since option 2 is more in line with the long-term vision, I think it would be better to go down that route. Better to do it sooner than later.

Looks like we have agreement on option 2. Thanks. I'll update the patch in that direction

dsanders updated this revision to Diff 213173.Aug 2 2019, 9:49 PM

Rebase and update to use uint64_t (and ban registers in unsigned)

dsanders edited the summary of this revision. (Show Details)Aug 2 2019, 9:54 PM
arsenm added inline comments.Aug 2 2019, 10:17 PM
llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
119

Usually arbitrary immediates use signed int64_t, as MachineOperand does

212

Ditto

dsanders updated this revision to Diff 213424.Aug 5 2019, 12:00 PM
  • uint64_t -> int64_t
arsenm accepted this revision.Aug 5 2019, 12:10 PM

LGTM

This revision is now accepted and ready to land.Aug 5 2019, 12:10 PM
This revision was automatically updated to reflect the committed changes.