This is an archive of the discontinued LLVM Phabricator instance.

[Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h
ClosedPublic

Authored by probinson on Jun 28 2023, 11:27 AM.

Diff Detail

Event Timeline

probinson created this revision.Jun 28 2023, 11:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2023, 11:27 AM
probinson requested review of this revision.Jun 28 2023, 11:27 AM

+ cfe-commits which didn't get added automatically.

pengfei added inline comments.Jun 28 2023, 11:11 PM
clang/lib/Headers/avx2intrin.h
1324

j

1352

j

1407

j

1483

j

1509

j

1560

j

3474

A more intrinsic guide format is MEM[__X+j:j]

3506

ditto.

3538

ditto.

3570

ditto.

3602

MEM[j+31:j] := __Y[j+31:j]

3632

ditto.

3662

ditto.

3692

ditto.

probinson added inline comments.Jun 29 2023, 7:47 AM
clang/lib/Headers/avx2intrin.h
3474

LoadXX is the syntax in the gather intrinsics, e.g. _mm_mask_i32gather_pd. I'd prefer to be consistent.

probinson updated this revision to Diff 535804.Jun 29 2023, 7:55 AM

s/:7/:j/ correcting bit references.

probinson marked 6 inline comments as done.Jun 29 2023, 7:55 AM
pengfei added inline comments.Jun 29 2023, 9:31 AM
clang/lib/Headers/avx2intrin.h
3474

I think the problem here is the measurement is easily confusing.
From C point of view, __X is a int pointer, so we should + i rather than i * 4
From the other part of the code, we are measuring in bits, but here i * 4 is a byte offset.

probinson added inline comments.Jun 29 2023, 9:49 AM
clang/lib/Headers/avx2intrin.h
3474

Well, the pseudo-code is clearly not C. If you look at the gather code, it computes a byte address using an offset multiplied by an explicit scale factor. I am doing exactly the same here.

The syntax MEM[__X+j:j] is mixing a byte address with a bit offset, which I think is more confusing. To be fully consistent, using [] with bit offsets only, it should be

k := __X*8 + i*32
result[j+31:j] := MEM[k+31:k]

which I think obscures more than it explains.

pengfei added inline comments.Jun 30 2023, 12:09 AM
clang/lib/Headers/avx2intrin.h
3474

Yeah, it's not C code here. But we are easy to fall into C concepts, e.g., why assuming __X is measuring in bytes?
That's why I think it's clear to make both in bits.
I made a mistake here, I wanted to propose MEM[__X+j+31: __X+j]. It matches with Intrinsic Guide.

probinson added inline comments.Jun 30 2023, 7:16 AM
clang/lib/Headers/avx2intrin.h
3474

We assume __X is in bytes because that's how addresses work on X86. Adding a bit offset to a byte address makes no sense. I see that is how existing Intel documentation works, which does not make it correct.

To "make both in bits" means multiplying __X by 8, as in the example in my previous comment. Or coming up with a different syntax that makes the difference clear.
MEM(__X)[j+31:j] or even MEM[__X][j+31:j] would be preferable.

pengfei accepted this revision.Jun 30 2023, 8:15 AM

LGTM.

clang/lib/Headers/avx2intrin.h
3474

My intention is to match with Intrinsic Guide as possible. Multiplying by 8 cannot achive it, but I cannot deny __X in bytes makes sense.
So I'm fine to use a different syntax.

This revision is now accepted and ready to land.Jun 30 2023, 8:15 AM
This revision was landed with ongoing or failed builds.Jun 30 2023, 8:31 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2023, 8:31 AM