This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Disable <vscale x 1 x *> types with Zve32x or Zve32f.
ClosedPublic

Authored by craig.topper on Jun 21 2022, 9:44 AM.

Details

Summary

According to the vector spec, mf8 is not supported for i8 if ELEN
is 32. Similarly, mf4 is not supported for i16/f16, nor mf2 for i32/f32.

Since RVVBitsPerBlock is 64 and LMUL is calculated as
((MinNumElements * ElementSize) / RVVBitsPerBlock) this means we
need to disable any type with MinNumElements==1.
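
The LMUL formula above can be sketched as a small compile-time helper. This is only an illustration, not the code from the patch: the `LMul` struct and `computeLMul` function are hypothetical names, and the sketch assumes that MinNumElements and ElementSize are powers of two so the result reduces to either N/1 or 1/N.

```cpp
// Assumption: same value as the RVVBitsPerBlock constant named in the summary.
constexpr unsigned RVVBitsPerBlock = 64;

// Hypothetical helper type: LMUL as a fraction Num/Den.
struct LMul {
  unsigned Num;
  unsigned Den;
};

// LMUL = (MinNumElements * ElementSize) / RVVBitsPerBlock, kept as a
// fraction so that mf2, mf4 and mf8 can be represented.
constexpr LMul computeLMul(unsigned MinNumElements, unsigned ElementSizeInBits) {
  return MinNumElements * ElementSizeInBits >= RVVBitsPerBlock
             ? LMul{MinNumElements * ElementSizeInBits / RVVBitsPerBlock, 1}
             : LMul{1, RVVBitsPerBlock / (MinNumElements * ElementSizeInBits)};
}

// With MinNumElements == 1: i8 maps to mf8, i16 to mf4 and i32 to mf2.
// These are exactly the combinations the summary lists as unsupported
// when ELEN is 32.
static_assert(computeLMul(1, 8).Den == 8, "<vscale x 1 x i8> is mf8");
static_assert(computeLMul(1, 16).Den == 4, "<vscale x 1 x i16> is mf4");
static_assert(computeLMul(1, 32).Den == 2, "<vscale x 1 x i32> is mf2");
```

Under these assumptions, every type with MinNumElements==2 lands one LMUL step higher (for example `<vscale x 2 x i8>` is mf4), which is why only the MinNumElements==1 types need to be disabled.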

For generic IR, these types will now be widened in type legalization.
For RVV intrinsics, we'll probably hit a fatal error somewhere. I plan
to work on disabling the intrinsics in the riscv_vector.h header.

Event Timeline

craig.topper created this revision. Jun 21 2022, 9:44 AM
Herald added a project: Restricted Project. Jun 21 2022, 9:44 AM
craig.topper requested review of this revision. Jun 21 2022, 9:44 AM

Add basic sanity test that type legalization will at least try to handle these types.

arcbbb accepted this revision. Jun 23 2022, 1:12 AM

It's great that MinElts can be obtained from RISCV::RVVBitsPerBlock / Subtarget.getELEN().
LGTM. Thanks!

This revision is now accepted and ready to land. Jun 23 2022, 1:12 AM

Warning, I may be misunderstanding the problem you're solving here, but...

The cases you mention appear to be specific to VLEN=32, right? If so, a cleaner way of phrasing the illegal cases would seem to be to compute the effective vector length after LMUL (i.e. VLEN/8 for mf8), and then disallow any case where the implied vector length is shorter than a single element of the element type.

A case to consider: what happens if VLEN=64? Should we be disallowing e.g. mf2 e64? (Seems like we should, right?) If so, can we approach the problem the same way?

The use of RVVBitsPerBlock feels like a red herring here. In particular, it's not clear to me why the value remains 64 on a VLEN=32 configuration.

Anyway, this is purely a non-blocking comment. Not sure if I've actually understood this or not. :)

This revision was landed with ongoing or failed builds. Jun 23 2022, 8:49 AM
This revision was automatically updated to reflect the committed changes.

> Warning, I may be misunderstanding the problem you're solving here, but...
>
> The cases you mention appear to be specific to VLEN=32, right? If so, a cleaner way of phrasing the illegal cases would seem to be to compute the effective vector length after LMUL (i.e. VLEN/8 for mf8), and then disallow any case where the implied vector length is shorter than a single element of the element type.

The problem is still valid for VLEN>=64 with ELEN=32. While you are correct that SEW=8 with LMUL=1/8 would fit in a register for that config, there is no requirement in the spec for hardware to support it. The spec says that the smallest fractional LMUL is SEWMIN/ELEN, which would be 8/32 or 1/4 in this config. The spec goes on to say "For a given supported fractional LMUL setting, implementations must support SEW settings between SEWMIN and LMUL * ELEN, inclusive." So if ELEN is 32, LMUL=1/4 is only required to support SEW=8 and LMUL=1/2 is only required to support SEW=8 or 16.
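
The two spec rules quoted in this reply can be written down as a predicate. This is a rough sketch for illustration only: `isRequiredFractionalConfig` is a hypothetical name, not an LLVM API, and it only answers what the spec text above requires hardware to support, not what a particular implementation may additionally accept.

```cpp
// SEWMIN is 8 in the config discussed above.
constexpr unsigned SEWMin = 8;

// Returns true if the spec text quoted above requires an implementation
// with the given ELEN to support SEW together with LMUL = 1/LMulDen.
constexpr bool isRequiredFractionalConfig(unsigned SEW, unsigned LMulDen,
                                          unsigned ELEN) {
  // Rule 1: the smallest fractional LMUL is SEWMIN/ELEN,
  //         so the denominator can be at most ELEN/SEWMIN.
  // Rule 2: SEWMIN <= SEW <= LMUL * ELEN, where LMUL * ELEN = ELEN / LMulDen.
  return LMulDen <= ELEN / SEWMin && SEW >= SEWMin && SEW <= ELEN / LMulDen;
}

// ELEN=32: mf8 is not required at all, mf4 only with SEW=8,
// and mf2 only with SEW=8 or SEW=16.
static_assert(!isRequiredFractionalConfig(8, 8, 32), "mf8 e8 not required");
static_assert(isRequiredFractionalConfig(8, 4, 32), "mf4 e8 required");
static_assert(!isRequiredFractionalConfig(16, 4, 32), "mf4 e16 not required");
static_assert(isRequiredFractionalConfig(16, 2, 32), "mf2 e16 required");
static_assert(!isRequiredFractionalConfig(32, 2, 32), "mf2 e32 not required");
```

Note that VLEN does not appear anywhere in the predicate, which matches the point of the reply: the restriction depends on ELEN, not on whether the elements would physically fit in a register.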

> A case to consider: what happens if VLEN=64? Should we be disallowing e.g. mf2 e64? (Seems like we should, right?) If so, can we approach the problem the same way?

mf2 e64 was already implicitly disabled. <vscale x 1 x i64> is LMUL=1.

> The use of RVVBitsPerBlock feels like a red herring here. In particular, it's not clear to me why the value remains 64 on a VLEN=32 configuration.

You are correct that RVVBitsPerBlock should change with Zve32 in order to support VLEN=32. Unfortunately, that would change the type mapping from vscale to LMUL and require substantial changes to the tablegen patterns. Naively, I think it would roughly double the size of the isel table. Since we use MVT types to pick instruction register classes, we would need two sets of patterns.

If we did change RVVBitsPerBlock for Zve32, the problem I'm trying to fix here would go away.

> The problem is still valid for VLEN>=64 with ELEN=32. While you are correct that SEW=8 with LMUL=1/8 would fit in a register for that config, there is no requirement in the spec for hardware to support it. The spec says that the smallest fractional LMUL is SEWMIN/ELEN, which would be 8/32 or 1/4 in this config. The spec goes on to say "For a given supported fractional LMUL setting, implementations must support SEW settings between SEWMIN and LMUL * ELEN, inclusive." So if ELEN is 32, LMUL=1/4 is only required to support SEW=8 and LMUL=1/2 is only required to support SEW=8 or 16.

I think I can still phrase that in the "does this fit" manner. I just need to introduce a MINVLEN (which is simply ELEN), and ask whether the implied vector length for a given LMUL contains at least one element.

craig.topper added a comment. Edited Jun 23 2022, 9:50 AM

> I think I can still phrase that in the "does this fit" manner. I just need to introduce a MINVLEN (which is simply ELEN), and ask whether the implied vector length for a given LMUL contains at least one element.

Let's run through the math.

vscale is defined as (VLEN / RVVBitsPerBlock).
So a <vscale x 1 x i8> type is ((VLEN / RVVBitsPerBlock) * 1 * 8) bits.

Let's replace VLEN with MINVLEN (or ELEN):
size = ((ELEN / RVVBitsPerBlock) * 1 * 8) bits.

If we want to know how many 8-bit elements that holds, we get
minnumelts = ((ELEN / RVVBitsPerBlock) * 1).

For ELEN=32 and RVVBitsPerBlock=64, that value is 1/2, which is less than 1.

We want to exclude all types <vscale x y x i8> that can't hold a whole element, that is, types where ((ELEN / RVVBitsPerBlock) * y) < 1.

This is true when y < (RVVBitsPerBlock / ELEN).

If RVVBitsPerBlock followed ELEN, the definition of vscale would also follow ELEN and this would go away.
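
The end result of the math above fits in two one-line helpers. This is a minimal sketch under the thread's assumptions (RVVBitsPerBlock fixed at 64, ELEN a power of two no larger than 64); `minLegalElts` and `isExcludedType` are hypothetical names, and the division mirrors the `RISCV::RVVBitsPerBlock / Subtarget.getELEN()` expression mentioned in the acceptance comment.

```cpp
// Assumption: RVVBitsPerBlock stays at 64 regardless of ELEN, as discussed above.
constexpr unsigned RVVBitsPerBlock = 64;

// Smallest y for which <vscale x y x ty> still holds a whole element
// when VLEN is replaced by MINVLEN (= ELEN).
constexpr unsigned minLegalElts(unsigned ELEN) { return RVVBitsPerBlock / ELEN; }

// A type <vscale x Y x ty> is excluded when Y < RVVBitsPerBlock / ELEN.
constexpr bool isExcludedType(unsigned Y, unsigned ELEN) {
  return Y < minLegalElts(ELEN);
}

// ELEN=32 (Zve32x/Zve32f): <vscale x 1 x *> is excluded, <vscale x 2 x *> is not.
static_assert(isExcludedType(1, 32), "vscale x 1 excluded for ELEN=32");
static_assert(!isExcludedType(2, 32), "vscale x 2 allowed for ELEN=32");
// ELEN=64: nothing is excluded by this rule.
static_assert(!isExcludedType(1, 64), "vscale x 1 allowed for ELEN=64");
```

If RVVBitsPerBlock were equal to ELEN, `minLegalElts` would always return 1 and the exclusion would never trigger, which is the last point made above.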