- Use computed VF for stress testing.
- If the computed VF does not produce vector code (VF smaller than 2), force VF to be 4.
- Test vectorization of i64 data on AArch64 to make sure we generate VF != 4 (on X86 that was already tested on AVX).
Differential D59952
[VPLAN] Minor improvement to testing and debug messages. fpetrogalli on Mar 28 2019, 11:50 AM. Authored by
Details
Diff Detail
Event TimelineComment Actions Thanks Francesco. I think we can only remove the arbitrary picking of VF = 4 with VPlanStressTest. I think we still need the other parts for testing, to build VPlans for loops without the pragma, which are then not vectorized.
Comment Actions Hi all, by the comment so far, I think we can abandon this change-set, because:
If you agree, I will abandon this patch. Francesco Comment Actions Yep, I agree on that we should keep the stress testing mechanism. It will be very useful to make sure that the construction and predication (and maybe other transformation) are robust enough since we can run it on loop nests that are not necessarily vectorizable. Comment Actions I suggested removing setting VF = 4 for VPlanStressTest now that we programmatically determine the VF, to streamline things a bit. In a way, just have VPlanStressTest mean: build a VPlan for any loop you can, with the automatically chosen VF. I thought the point was to just have a VF to build a VPlan with and as it currently stands we build the same VPlans with VF = 4 or any other VF I think. With the programmatically determined VF, we would be a little closer to reality in the stress testing. As Hideki mentioned, having a constant factor for stress testing might have other benefits later on though Comment Actions yeah, this patch was a follow up on the comment from @fhahn : https://reviews.llvm.org/D57598#1444238 I might have misunderstood his request. I will abandon the patch. Thank you, Francesco Comment Actions
Yes, that sounds good to me. In the worst case, we could always use stress testing + user VF to achieve that. Comment Actions I have updated the patch to address all the concerns. Now VF = 4 for stress testing only if the computed VF does not produce vector code (VF < 2). Thank you all. Comment Actions Should we add a small lit test that covers the new expected behavior? Probably reusing one from D57598 with the stress testing flag would suffice. Comment Actions Good idea. Something that invokes stress testing and sets VF = 4 when the computed VF is 1. Did I get it right? Comment Actions
I believe that the test in test/Transforms/LoopVectorize/X86/outer_loop_test1_no_explicit_vect_width.ll is already doing that, as the AVX output is producing code with VF = 8. Or are you saying this needs to be done explicitly for stress testing? Comment Actions I have addressed all concern raised in this review. In particular, I have added a test for computed VF != 4 on AArch64 (on X86 that was just the case for AVX). Comment Actions
Yes, something that covers the new stress testing behavior that this patch is introducing, to make sure that we preserve it. It could be a test using the stress testing flag and VF!=4 coming from determinePlanVF and/or a test using stress testing flag and where the default VF=4 is taken. Comment Actions Thanks Francesco! I think the test looks good now. If you need someone to commit it, just let me know. Comment Actions Thank you all. @fhahn , did you say you volunteered for committing the patch? Please proceed. :) Francesco Comment Actions Hi! One of the tests was missing "REQUIRES: asserts", so it failed on a non-assert build. I took the liberty to add that to the test in r358057. I hope that was okay! Comment Actions Hi - thank your for taking care of this, apologies for breaking your runs! Do you need a review? Francesco |