This is an archive of the discontinued LLVM Phabricator instance.

[RFC][DAG] Initially add nodes in the worklist in topological order.
Needs ReviewPublic

Authored by deadalnix on Jun 14 2023, 8:52 AM.

Details

Summary

In the same vein as D127115, this is step toward processing the dag in topological order.

This is a bit of a shotgun diff and will require many issue to be fixed before proceeding.

Diff Detail

Event Timeline

deadalnix created this revision.Jun 14 2023, 8:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 8:52 AM
deadalnix requested review of this revision.Jun 14 2023, 8:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2023, 8:52 AM
deadalnix retitled this revision from [RFX][DAG] Initially add nodes in the worklist in topological order. to [RFC][DAG] Initially add nodes in the worklist in topological order..Jun 14 2023, 8:53 AM
n-omer added a subscriber: n-omer.Jun 14 2023, 12:39 PM

oh no, not again :) A lot of these seem to be related to changes in extension/truncations - although I haven't noticed any common pattern

oh no, not again :) A lot of these seem to be related to changes in extension/truncations - although I haven't noticed any common pattern

This one should be the most disruptive one. After that, it's mostly in topological order, and there will be a couple of patch up to ensure this is always the case, but disturbance should be minimal.

deadalnix updated this revision to Diff 532126.Jun 16 2023, 6:41 AM

Fix numerous tests

foad added a comment.Jun 20 2023, 3:40 AM

This causes many interesting regressions for AMDGPU!

RKSimon added inline comments.Jun 20 2023, 4:28 AM
llvm/test/CodeGen/X86/abds.ll
31
t0: ch,glue = EntryToken
            t2: i32,ch = CopyFromReg t0, Register:i32 %0
          t3: i8 = truncate t2
        t7: i64 = sign_extend t3
            t5: i32,ch = CopyFromReg t0, Register:i32 %1
          t6: i8 = truncate t5
        t8: i64 = sign_extend t6
      t9: i64 = sub t7, t8
    t10: i64 = abs t9
  t11: i8 = truncate t10

This has regressed because the sign_extend(truncate()) have already been folded to sign_extend_inreg(any_extend()) before foldABSToABD is called via visitTRUNCATE, meaning we only call it later via visitABS directly and lose the smaller types.

deadalnix updated this revision to Diff 534335.Jun 25 2023, 7:18 AM

Another batch of tests

deadalnix updated this revision to Diff 534352.Jun 25 2023, 8:58 AM

More tests, notably a lot of RISCV ones.

As I go on, the changeset is becoming so big that phabricator is unable to display it. Maybe this isn't the right approach, but then, what's the right approach?

I'd probably focus initially on x86 only as that will have the most test changes and the most overlap with other targets.

deadalnix updated this revision to Diff 534517.Jun 26 2023, 6:18 AM

Revert to showing X86 tests only for now.

RKSimon added inline comments.Jun 28 2023, 6:13 AM
llvm/test/CodeGen/X86/avx2-shift.ll
416 ↗(On Diff #534517)

Many of the vector regressions are due to X86's combineVectorTruncation fold that is destroying truncate patterns too soon. We should be able to get rid of it with a little more work.

deadalnix updated this revision to Diff 537407.Jul 5 2023, 10:07 AM

rebase on top of D154522

deadalnix updated this revision to Diff 538066.Jul 7 2023, 3:54 AM

Rebase on top of D154592

deadalnix updated this revision to Diff 540381.Jul 14 2023, 5:36 AM

Rebase on top of variosu patches by @RKSimon

deadalnix updated this revision to Diff 543234.Jul 22 2023, 4:24 PM

Rebase and regenrate a bunch of tests

@deadalnix Please can you rebase? A lot of the vector truncate issues should be fixed now

RKSimon added inline comments.Aug 20 2023, 9:14 AM
llvm/test/CodeGen/X86/vselect.ll
5

We need to add AVX1/AVX2 prefixes back here:

; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx | FileCheck %s --check-prefix=AVX1,AVX1
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefixes=AVX,AVX2

Given that this is not going to be resolved anytime soon and we're relatively early into the review, would you be open to moving this to a GitHub PR?

Given that this is not going to be resolved anytime soon and we're relatively early into the review, would you be open to moving this to a GitHub PR?

If this is what it takes, but I'm really not thrilled by the move to be honest. It's way harder to keep track of changes, notification are basically useless, and there is no kind of useful dashboard of what one need to pay attention to either.

Given that this is not going to be resolved anytime soon and we're relatively early into the review, would you be open to moving this to a GitHub PR?

If this is what it takes, but I'm really not thrilled by the move to be honest. It's way harder to keep track of changes, notification are basically useless, and there is no kind of useful dashboard of what one need to pay attention to either.

Yes its as much fun as a tooth extraction :| But I'm worried that Phab will not get much love going forward and we're months away from getting the x86 regressions fixed, let alone any other backend.

I've started my own WIP branch of this patch at https://github.com/RKSimon/llvm-project/tree/perf/D152928 so I can more easily work though addressing issues; I'll be rebasing + forcing pushes so don't rely on git history if you track it. But wherever you decide to keep the main patch will remain the reference.