programmerjake (Jacob Lifshay)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 20 2018, 4:47 PM (92 w, 1 d)

Recent Activity

Feb 4 2020

programmerjake added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

From what I recall, the plan is to implement this by using fixed-size vector types combined with VL-based ops. MVL would be the size of those vector types.
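A rough way to picture "fixed-size vector types combined with VL-based ops" is a strip-mined loop in which each step handles at most MVL elements and VL says how many of them are live. The sketch below is only an illustration in plain Python: `strip_mined_add` and its parameters are hypothetical names, not part of the patch or of any ISA.

```python
def strip_mined_add(a, b, mvl):
    """Add two equal-length lists in chunks of at most mvl elements.

    mvl plays the role of MVL (the fixed vector size); the per-step vl
    plays the role of VL. Both names are assumptions for illustration.
    Returns the summed list and the VL used at each step.
    """
    if mvl <= 0:
        raise ValueError("mvl must be positive")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    vls = []
    i = 0
    while i < len(a):
        # Clamp VL so it never exceeds MVL or the remaining element count.
        vl = min(mvl, len(a) - i)
        out.extend(a[i + k] + b[i + k] for k in range(vl))
        vls.append(vl)
        i += vl
    return out, vls
```

With ten elements and an MVL of 4, the loop takes three steps, and only the last one runs with a VL smaller than MVL.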

Feb 4 2020, 10:12 AM · Restricted Project
programmerjake added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

From what I recall, the plan is to implement this by using fixed-size vector types combined with VL-based ops. MVL would be the size of those vector types.

Feb 4 2020, 10:03 AM · Restricted Project

Feb 2 2020

programmerjake added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

We (Libre-SoC, provisionally renamed from Libre-RISCV) are currently building a processor that supports variable-length vector operations. Each operation specifies its starting register in a flat register file, and VL tells it how many elements to operate on; dividing VL by the number of elements per register gives the number of registers the operation touches. So, if VL is out of bounds, an instruction can overwrite registers past the end of the range assigned by the register allocator and/or trap. This would probably force use of option #1 above, at least for our processor. Our ISA design is still incomplete, so we might add (or already have) a mechanism allowing use of option #2 or #3 if there is a sufficient reason (we will have to see what the rest of Libre-SoC thinks).

Presumably you have an efficient way to somehow force the VL into the intended range to support strip-mining of loops? The exact strategy doesn't matter; anything that avoids VL being "out of bounds" should make the other options work just fine. (Assuming there aren't other, larger problems with mapping VP operations to your ISA.)
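To make the register arithmetic in this exchange concrete, here is a small sketch of how VL could map to a span of registers in a flat register file, and how an oversized VL would spill past an allocation. The function names, the rounding up to whole registers, and the example numbers are all assumptions for illustration; the comment above does not give these details for the Libre-SoC design.

```python
def registers_touched(start_reg, vl, elems_per_reg):
    """Return the range of register numbers an operation would touch.

    Assumes a partially filled register still counts as touched, so the
    count is VL divided by elements per register, rounded up.
    """
    if vl < 0 or elems_per_reg <= 0:
        raise ValueError("vl must be >= 0 and elems_per_reg must be > 0")
    count = -(-vl // elems_per_reg)  # ceiling division
    return range(start_reg, start_reg + count)


def overruns_allocation(start_reg, vl, elems_per_reg, allocated_regs):
    """True if the operation would touch more registers than were allocated."""
    return len(registers_touched(start_reg, vl, elems_per_reg)) > allocated_regs
```

For example, with 4 elements per register and 2 registers allocated starting at register 8, a VL of 8 stays inside the allocation, while a VL of 10 would also touch register 10.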

Feb 2 2020, 1:07 PM · Restricted Project
programmerjake added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

(This was gonna be an inline comment on D69891, but it's more of a general conceptual issue, so I decided to move it here.)

Right now, LangRef changes in D69891 describe the restriction on the EVL value as this:

The explicit vector length (%evl) is only effective if it is non-negative, and when that is the case, its value is in the range:

0 <= %evl <= W,   where W is the vector length.

The restriction is good, but this wording doesn't specify what happens when %evl is not in that range. Some sort of undefined behavior, I assume, but this must be explicitly stated, especially since there are many ways in which it could be undefined. I don't recall previous discussion of this detail and I don't know what you have in mind, but some possibilities I see:

  1. The instruction has capital-UB undefined behavior. This gives the greatest flexibility to backends (e.g., allows generation of code that traps if %evl is too large) but I don't know of any architecture that needs this much flexibility and it constrains IR optimizations (code hoisting etc.) the most.
  2. The instruction returns poison (i.e., all result lanes are poison) and all lanes are (potentially, non-deterministically) enabled regardless of the mask parameter. This is less restrictive for IR optimizations (e.g., integer vp.add can unconditionally be speculated) but still allows backends to unconditionally use SETVL-style "stripmining" instructions that are not generally consistent (across architectures) w.r.t. which lanes become active when a vector length greater than the hardware vector length is requested.
  3. %EVLmask is undef, that's all. As a consequence, lanes disabled by the %mask argument definitely stay disabled, but for other lanes (where the mask has a 1 or an undef) it's non-deterministic whether they are active. As far as I can see, this has pretty much the same implications for IR optimizations and backends (excluding hypothetical pathological architectures) but is less of a special case to specify and directly captures the diversity of hardware behavior that (presumably) motivates this restriction on EVL.

Off the cuff, I would suggest the last option.
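For reference, the in-range case that all three options agree on can be modelled as a lane being active only when its mask bit is set and its index is below %evl. The Python sketch below is an informal model, not LangRef wording: `effective_mask` and `vp_add` are hypothetical helpers, `None` stands in for a disabled lane whose result is unspecified, and the out-of-range case simply raises, since that is exactly the behavior the options above leave open.

```python
def effective_mask(mask, evl):
    """Combine a per-lane mask with an explicit vector length.

    Lane i is active only if mask[i] is set and i < evl. The vector
    length W is taken to be len(mask).
    """
    w = len(mask)
    if not 0 <= evl <= w:
        # Out-of-range %evl is the undecided case; this model refuses it.
        raise ValueError("evl outside 0 <= evl <= W")
    return [bool(m) and i < evl for i, m in enumerate(mask)]


def vp_add(a, b, mask, evl):
    """Model of a predicated add: active lanes are summed, others are None."""
    active = effective_mask(mask, evl)
    return [x + y if on else None for x, y, on in zip(a, b, active)]
```

With mask `[1, 0, 1, 1]` and an EVL of 3 on a 4-lane vector, lanes 0 and 2 are active; lane 1 is disabled by the mask and lane 3 by the vector length.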

Feb 2 2020, 10:25 AM · Restricted Project

Oct 9 2019

programmerjake accepted D68055: Add -fgnuc-version= to control __GNUC__ and other GCC macros.
Oct 9 2019, 2:12 PM · Restricted Project
programmerjake requested changes to D68055: Add -fgnuc-version= to control __GNUC__ and other GCC macros.
Oct 9 2019, 1:38 PM · Restricted Project

Oct 1 2019

programmerjake added a comment to D68055: Add -fgnuc-version= to control __GNUC__ and other GCC macros.

The __GNUG__ macro is defined to be 4 rather than matching __GNUC__.

Oct 1 2019, 11:08 PM · Restricted Project

Sep 30 2019

programmerjake added a comment to D68055: Add -fgnuc-version= to control __GNUC__ and other GCC macros.

Shouldn't __GNUG__ match __GNUC__?

Sep 30 2019, 1:58 PM · Restricted Project

Feb 13 2019

programmerjake resigned from D57504: RFC: Prototype & Roadmap for vector predication in LLVM.
Feb 13 2019, 11:55 AM · Restricted Project

Feb 1 2019

programmerjake added inline comments to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.
Feb 1 2019, 10:45 AM · Restricted Project
programmerjake added a comment to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.

We will also need to adjust gather/scatter and possibly other load/store kinds to allow the address vector length to be a divisor of the main vector length (similar to mask vector length). I didn't check if there are intrinsics for strided load/store, those will need to be changed too, to allow, for example, storing <scalable 3 x float> to var.v in:

Feb 1 2019, 5:06 AM · Restricted Project
programmerjake requested changes to D57504: RFC: Prototype & Roadmap for vector predication in LLVM.
Feb 1 2019, 2:28 AM · Restricted Project