LuoYuanke (LuoYuanke)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 24 2018, 10:28 PM (142 w, 6 d)

Recent Activity

Today

LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

You can take my patch

Mon, Jun 21, 7:04 PM · Restricted Project
LuoYuanke updated the summary of D104678: [X86] Selecting fld0 for undefined value in fast ISEL..
Mon, Jun 21, 7:03 PM · Restricted Project
LuoYuanke updated the diff for D104678: [X86] Selecting fld0 for undefined value in fast ISEL..

Edit commit message.

Mon, Jun 21, 7:02 PM · Restricted Project
LuoYuanke requested review of D104678: [X86] Selecting fld0 for undefined value in fast ISEL..
Mon, Jun 21, 7:00 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

I'm not sure. To do it in the stackifier you need to do it any time the undef flag is present regardless of whether StackTop is 0. You instead would need to check whether the register is already present in the stack and only insert if it isn't. But there may be some complications with removing it from the stack later. I think removing things from the stack is based on kill flags, but I don't know if the undef would have a kill flag. So I guess you'd have to remember you inserted it and immediately remove it after the instruction? You would need to do this for any FP instruction not just ArgFPRW.

Mon, Jun 21, 5:05 AM · Restricted Project
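
The bookkeeping described in the comment above can be illustrated with a toy model. This is only a sketch and not the code in LLVM's X86 FP stackifier: `ToyFPStack`, `visitUse`, and the plain register numbers are hypothetical names invented for illustration. It shows the suggested rule: for an undef operand, insert a placeholder only if the register is not already on the stack, remember that it was inserted, and remove it right after the instruction, because an undef operand may carry no kill flag.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model only: a "stack" of virtual FP register numbers, top at the back.
// Hypothetical type; it does not mirror LLVM's real data structures.
struct ToyFPStack {
  std::vector<unsigned> Regs;

  bool contains(unsigned Reg) const {
    return std::find(Regs.begin(), Regs.end(), Reg) != Regs.end();
  }
  void push(unsigned Reg) { Regs.push_back(Reg); }
  void remove(unsigned Reg) {
    Regs.erase(std::remove(Regs.begin(), Regs.end(), Reg), Regs.end());
  }
};

// Models one instruction that reads Reg.
// Returns true if a placeholder had to be inserted for an undef operand.
bool visitUse(ToyFPStack &Stack, unsigned Reg, bool IsUndef) {
  bool Inserted = false;
  // The check is "is the register present?", not "is the stack empty?".
  if (IsUndef && !Stack.contains(Reg)) {
    Stack.push(Reg); // stands in for materializing a dummy value
    Inserted = true;
  }
  assert(Stack.contains(Reg) && "operand must be on the stack");
  // ... the instruction itself would be emitted here ...
  if (Inserted)
    Stack.remove(Reg); // no kill flag to rely on, so clean up immediately
  return Inserted;
}
```

In this model the rule applies to any instruction that reads an FP register, which matches the point that the fix would not be specific to one-argument instructions.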

Yesterday

LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

Does this fix your test

diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp
index 44670a9..3e5d45b 100644
--- a/llvm/lib/Target/X86/X86FastISel.cpp
+++ b/llvm/lib/Target/X86/X86FastISel.cpp
@@ -3842,6 +3842,30 @@ unsigned X86FastISel::fastMaterializeConstant(const Constant *C) {
     return X86MaterializeFP(CFP, VT);
   else if (const GlobalValue *GV = dyn_cast<GlobalValue>(C))
     return X86MaterializeGV(GV, VT);
+  else if (isa<UndefValue>(C)) {
+    unsigned Opc = 0;
+    switch (VT.SimpleTy) {
+    default:
+      break;
+    case MVT::f32:
+      if (!X86ScalarSSEf32)
+        Opc = X86::LD_Fp032;
+      break;
+    case MVT::f64:
+      if (!X86ScalarSSEf64)
+        Opc = X86::LD_Fp064;
+      break;
+    case MVT::f80:
+      Opc = X86::LD_Fp080;
+      break;
+    }
+
+    if (Opc) {
+      Register ResultReg = createResultReg(TLI.getRegClassFor(VT));
+      BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(Opc), ResultReg);
+      return ResultReg;
+    }
+  }
 
   return 0;
 }
Sun, Jun 20, 6:24 PM · Restricted Project
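
The opcode choice in the patch above can be read in isolation as a small pure function. The sketch below is a standalone approximation, not LLVM code: `ToyVT`, `ToyOpc`, and `selectUndefOpcode` are hypothetical stand-ins for `MVT::SimpleValueType`, the `X86::LD_Fp0*` opcodes, and the new branch in `fastMaterializeConstant`. It assumes, as the patch does, that f32 and f64 only need an x87 load-zero when they are not held in SSE registers, while f80 always does.

```cpp
#include <cassert>

// Hypothetical stand-ins for LLVM's value types and X86 opcodes.
enum class ToyVT { i32, f32, f64, f80 };
enum class ToyOpc { None, LD_Fp032, LD_Fp064, LD_Fp080 };

// Mirrors the switch in the patch: choose a "load zero" opcode for an undef
// FP value only when that type is handled on the x87 stack.
ToyOpc selectUndefOpcode(ToyVT VT, bool ScalarSSEf32, bool ScalarSSEf64) {
  switch (VT) {
  case ToyVT::f32:
    return ScalarSSEf32 ? ToyOpc::None : ToyOpc::LD_Fp032;
  case ToyVT::f64:
    return ScalarSSEf64 ? ToyOpc::None : ToyOpc::LD_Fp064;
  case ToyVT::f80:
    return ToyOpc::LD_Fp080; // f80 is always an x87 type
  default:
    return ToyOpc::None; // not handled here; the caller falls through
  }
}
```

A result of `ToyOpc::None` corresponds to the patch leaving `Opc` at 0 and falling through to `return 0`.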
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

I think because I use the

Sun, Jun 20, 3:21 AM · Restricted Project

Sat, Jun 19

LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

With -O0, the small case can also generate "IMPLICIT_DEF" and "CHS_Fp80". I think we are close to the root cause. Stay tuned.

Sat, Jun 19, 11:15 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

Are you running the large and small cases the same way? Have you looked at the SelectionDAG debug logs for the affected basic block in the large case?

I was able to trigger the error with llc -O2 -fast-isel on a simple test. So that is a path to create this but you haven’t answered if fast isel is being used in your case.

Sat, Jun 19, 10:53 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

BTW, do you think the patch to handle the undef case in the stackify pass is reasonable?

No, I do not. I have no reason to believe fneg or one-arg instructions are the only things that could be affected. We need to understand why this is happening, because isel was trying to prevent this.

Sat, Jun 19, 10:47 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

BTW, do you think the patch to handle the undef case in the stackify pass is reasonable?

Sat, Jun 19, 10:39 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

I don’t know what llvm-extract --recursive does. I’ve only used -func to extract a single function I knew caused a compiler crash.

Converting IMPLICIT_DEF to undef flag is correct.

IMPLICIT_DEF can get created from ISD::UNDEF, but as far as I could see, ISD::UNDEF for fp80 is supposed to Expand to ConstantFP 0.

I also think fneg of undef should be folded by getNode when it is created in SelectionDAGBuilder. Is this going through fast isel or something?

Sat, Jun 19, 10:37 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

Here is IR.

Sat, Jun 19, 9:27 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

Did you run llvm-extract to isolate the broken function first? bugpoint is not good at that.

Sat, Jun 19, 8:49 PM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

Can you get IR and use bugpoint to reduce it? I'd really like to see the backend codegen that led to this case.

Sat, Jun 19, 6:26 PM · Restricted Project

Fri, Jun 18

LuoYuanke updated the diff for D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

Address Craig's comments.

Fri, Jun 18, 5:51 AM · Restricted Project
LuoYuanke added inline comments to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..
Fri, Jun 18, 5:49 AM · Restricted Project
LuoYuanke added a comment to D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..

How did you get a CHS with an undef input?

Fri, Jun 18, 5:48 AM · Restricted Project

Thu, Jun 17

LuoYuanke added a reviewer for D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW.: wxiao3.
Thu, Jun 17, 6:37 AM · Restricted Project
LuoYuanke added reviewers for D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW.: craig.topper, pengfei, RKSimon.
Thu, Jun 17, 12:40 AM · Restricted Project
LuoYuanke requested review of D104440: [X86] Fix bug when X86 stackify pass handle one ArgFPRW..
Thu, Jun 17, 12:39 AM · Restricted Project

Sun, Jun 13

LuoYuanke committed rG5be314f79ba7: [X86] Check immediate before get it. (authored by LuoYuanke).
[X86] Check immediate before get it.
Sun, Jun 13, 1:28 AM
LuoYuanke closed D104037: [X86] Check immediate before get it..
Sun, Jun 13, 1:27 AM · Restricted Project
LuoYuanke updated the diff for D104037: [X86] Check immediate before get it..

Update test case to pass expensive check.

Sun, Jun 13, 12:38 AM · Restricted Project
LuoYuanke reopened D104037: [X86] Check immediate before get it..
Sun, Jun 13, 12:37 AM · Restricted Project

Sat, Jun 12

LuoYuanke added a reverting change for rG9eb2f723c245: [X86] Check immediate before get it.: rG1e72b9d52f9c: Revert "[X86] Check immediate before get it.".
Sat, Jun 12, 10:57 PM
LuoYuanke committed rG1e72b9d52f9c: Revert "[X86] Check immediate before get it." (authored by LuoYuanke).
Revert "[X86] Check immediate before get it."
Sat, Jun 12, 10:57 PM
LuoYuanke added a reverting change for D104037: [X86] Check immediate before get it.: rG1e72b9d52f9c: Revert "[X86] Check immediate before get it.".
Sat, Jun 12, 10:57 PM · Restricted Project
LuoYuanke committed rG9eb2f723c245: [X86] Check immediate before get it. (authored by LuoYuanke).
[X86] Check immediate before get it.
Sat, Jun 12, 6:32 PM
LuoYuanke closed D104037: [X86] Check immediate before get it..
Sat, Jun 12, 6:31 PM · Restricted Project

Fri, Jun 11

LuoYuanke added a comment to D104037: [X86] Check immediate before get it..

I don't know what Intel's original failure looked like, but here's a test that should reproduce this with -run-pass=machinelicm: https://reviews.llvm.org/P8267. It needs more cleanup.

I hacked the MIR just before machinelicm by sinking the CMP64mi32 and SETCCr into the loop. That makes MachineLICM want to unfold it since the load part is invariant being from a constant global.

Fri, Jun 11, 7:58 PM · Restricted Project
LuoYuanke updated the diff for D104037: [X86] Check immediate before get it..

Apply Craig's test case. Many thanks to Craig. :)

Fri, Jun 11, 7:55 PM · Restricted Project

Thu, Jun 10

LuoYuanke added a comment to D104037: [X86] Check immediate before get it..

Thanks, @lebedev.ri. This is triggered by our internal code; the same test case passes with llvm trunk code. BTW, do you think we need to check the immediate before getting it?

Thu, Jun 10, 10:00 PM · Restricted Project
LuoYuanke added reviewers for D104037: [X86] Check immediate before get it.: craig.topper, pengfei.
Thu, Jun 10, 7:59 AM · Restricted Project
LuoYuanke requested review of D104037: [X86] Check immediate before get it..
Thu, Jun 10, 7:58 AM · Restricted Project
LuoYuanke committed rG63233da7230a: [X86][NFC] Fix typo. (authored by LuoYuanke).
[X86][NFC] Fix typo.
Thu, Jun 10, 7:49 AM

Wed, Jun 9

LuoYuanke accepted D103784: [X86] Support __tile_stream_loadd intrinsic for new AMX interface.

LGTM. Thank you!

Wed, Jun 9, 1:03 AM · Restricted Project, Restricted Project

Tue, Jun 8

LuoYuanke added inline comments to D103784: [X86] Support __tile_stream_loadd intrinsic for new AMX interface.
Tue, Jun 8, 7:51 AM · Restricted Project, Restricted Project

Thu, Jun 3

LuoYuanke added a comment to D99675: [llvm][clang] Create new intrinsic llvm.arithmetic.fence to control FP optimization at expression level.

We may want to add a description of the intrinsic to docs/LangRef.rst.

Thu, Jun 3, 12:59 AM · Restricted Project

Wed, May 26

LuoYuanke committed rG4ed2b6cccdef: [X86][AMX] Fix a bug on tile config. (authored by LuoYuanke).
[X86][AMX] Fix a bug on tile config.
Wed, May 26, 6:59 AM
LuoYuanke closed D103145: [X86][AMX] Fix a bug on tile config..
Wed, May 26, 6:59 AM · Restricted Project
LuoYuanke updated the diff for D103145: [X86][AMX] Fix a bug on tile config..

Address Pengfei's comments.

Wed, May 26, 5:55 AM · Restricted Project
LuoYuanke added inline comments to D103145: [X86][AMX] Fix a bug on tile config..
Wed, May 26, 5:17 AM · Restricted Project
LuoYuanke added reviewers for D103145: [X86][AMX] Fix a bug on tile config.: pengfei, wxiao3, xiangzhangllvm.
Wed, May 26, 2:48 AM · Restricted Project
LuoYuanke updated the diff for D103145: [X86][AMX] Fix a bug on tile config..

Remove tab.

Wed, May 26, 2:48 AM · Restricted Project
LuoYuanke requested review of D103145: [X86][AMX] Fix a bug on tile config..
Wed, May 26, 2:46 AM · Restricted Project

Apr 27 2021

LuoYuanke committed rGd6c6db2feaab: [X86][AMX] Add description for AMX new interface. (authored by LuoYuanke).
[X86][AMX] Add description for AMX new interface.
Apr 27 2021, 1:06 AM
LuoYuanke closed D101059: [X86][AMX] Add description for AMX new interface..
Apr 27 2021, 1:05 AM · Restricted Project

Apr 22 2021

LuoYuanke accepted D101124: [X86][AMX][NFC] Avoid assert for the same immidiate value.

LGTM.

Apr 22 2021, 8:20 PM · Restricted Project
LuoYuanke added a reviewer for D101059: [X86][AMX] Add description for AMX new interface.: fhahn.
Apr 22 2021, 6:07 AM · Restricted Project
LuoYuanke added reviewers for D101059: [X86][AMX] Add description for AMX new interface.: pengfei, LiuChen3, xiangzhangllvm.
Apr 22 2021, 6:06 AM · Restricted Project
LuoYuanke updated the diff for D101059: [X86][AMX] Add description for AMX new interface..

Fix some descriptions.

Apr 22 2021, 6:05 AM · Restricted Project
LuoYuanke requested review of D101059: [X86][AMX] Add description for AMX new interface..
Apr 22 2021, 5:52 AM · Restricted Project
LuoYuanke accepted D101039: [X86][AMX][NFC] Remove assert for comparison between different BBs..

LGTM.

Apr 22 2021, 4:53 AM · Restricted Project
LuoYuanke added a comment to D101039: [X86][AMX][NFC] Remove assert for comparison between different BBs..

Is there any test case for it?

Apr 22 2021, 3:06 AM · Restricted Project

Apr 20 2021

LuoYuanke committed rGbcdaccfe3466: [X86][AMX] Verify illegal types or instructions for x86_amx. (authored by LuoYuanke).
[X86][AMX] Verify illegal types or instructions for x86_amx.
Apr 20 2021, 1:15 AM
LuoYuanke closed D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 20 2021, 1:14 AM · Restricted Project

Apr 19 2021

LuoYuanke committed rG519cf6e80781: [X86][AMX] Add description of x86_amx to LangRef. (authored by LuoYuanke).
[X86][AMX] Add description of x86_amx to LangRef.
Apr 19 2021, 11:30 PM
LuoYuanke closed D100032: [X86][AMX] Add description of x86_amx to LangRef..
Apr 19 2021, 11:30 PM · Restricted Project
LuoYuanke added inline comments to D100026: [X86] Support AMX fast register allocation.
Apr 19 2021, 8:13 PM · Restricted Project
LuoYuanke added inline comments to D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 19 2021, 8:06 PM · Restricted Project
LuoYuanke updated the diff for D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..

Address Craig's comments.

Apr 19 2021, 8:05 PM · Restricted Project
LuoYuanke added inline comments to D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 19 2021, 7:01 AM · Restricted Project
LuoYuanke added inline comments to D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 19 2021, 6:59 AM · Restricted Project
LuoYuanke added inline comments to D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 19 2021, 6:45 AM · Restricted Project

Apr 14 2021

LuoYuanke added inline comments to D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 14 2021, 5:45 AM · Restricted Project
LuoYuanke updated the diff for D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..

Address Roman's comments.

Apr 14 2021, 5:44 AM · Restricted Project
LuoYuanke added a reviewer for D100472: [X86][AMX] Verify illegal types or instructions for x86_amx.: lebedev.ri.
Apr 14 2021, 5:40 AM · Restricted Project
LuoYuanke updated the diff for D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..

Remove attribute in test case.

Apr 14 2021, 5:26 AM · Restricted Project
LuoYuanke added reviewers for D100472: [X86][AMX] Verify illegal types or instructions for x86_amx.: fhahn, craig.topper, pengfei.
Apr 14 2021, 5:23 AM · Restricted Project
LuoYuanke added inline comments to D100032: [X86][AMX] Add description of x86_amx to LangRef..
Apr 14 2021, 5:22 AM · Restricted Project
LuoYuanke requested review of D100472: [X86][AMX] Verify illegal types or instructions for x86_amx..
Apr 14 2021, 5:20 AM · Restricted Project

Apr 13 2021

LuoYuanke updated the diff for D100032: [X86][AMX] Add description of x86_amx to LangRef..

Address Pengfei's comments. Add amx description to BitCodeFormat.html.

Apr 13 2021, 6:46 PM · Restricted Project
LuoYuanke accepted D99966: [X86][AMX] Refactor for PostRA ldtilecfg pass..

LGTM. Thank you!

Apr 13 2021, 6:19 PM · Restricted Project
LuoYuanke added inline comments to D99966: [X86][AMX] Refactor for PostRA ldtilecfg pass..
Apr 13 2021, 7:30 AM · Restricted Project

Apr 12 2021

LuoYuanke updated the diff for D100032: [X86][AMX] Add description of x86_amx to LangRef..

Address Florian and Pengfei's comments.

Apr 12 2021, 11:46 PM · Restricted Project
LuoYuanke added inline comments to D100032: [X86][AMX] Add description of x86_amx to LangRef..
Apr 12 2021, 11:44 PM · Restricted Project
LuoYuanke accepted D99708: [X86] Enable compilation of user interrupt handlers..

LGTM. But wait one or two days to see if there are more comments from Craig and HJ.

Apr 12 2021, 10:17 PM · Restricted Project, Restricted Project
LuoYuanke accepted D99010: [X86][AMX] Hoist ldtilecfg.

LGTM. Thanks!

Apr 12 2021, 6:22 AM · Restricted Project

Apr 11 2021

LuoYuanke added inline comments to D99010: [X86][AMX] Hoist ldtilecfg.
Apr 11 2021, 1:09 AM · Restricted Project

Apr 8 2021

LuoYuanke added inline comments to D99010: [X86][AMX] Hoist ldtilecfg.
Apr 8 2021, 5:23 AM · Restricted Project
LuoYuanke edited reviewers for D100032: [X86][AMX] Add description of x86_amx to LangRef., added: fhahn; removed: Florian.
Apr 8 2021, 12:17 AM · Restricted Project
LuoYuanke added reviewers for D100032: [X86][AMX] Add description of x86_amx to LangRef.: Florian, craig.topper, pengfei.
Apr 8 2021, 12:08 AM · Restricted Project

Apr 7 2021

LuoYuanke updated the diff for D100032: [X86][AMX] Add description of x86_amx to LangRef..

Reformat it.

Apr 7 2021, 11:41 PM · Restricted Project
LuoYuanke added inline comments to D99010: [X86][AMX] Hoist ldtilecfg.
Apr 7 2021, 6:33 AM · Restricted Project
LuoYuanke requested review of D100032: [X86][AMX] Add description of x86_amx to LangRef..
Apr 7 2021, 5:59 AM · Restricted Project

Apr 6 2021

LuoYuanke added a comment to D99708: [X86] Enable compilation of user interrupt handlers..

LGTM. Thank you!

Apr 6 2021, 2:23 AM · Restricted Project, Restricted Project

Apr 2 2021

LuoYuanke added a comment to D99010: [X86][AMX] Hoist ldtilecfg.

Perhaps we need more comments and more test cases (maybe in a separate file) to cover those scenarios.

Apr 2 2021, 6:52 AM · Restricted Project

Apr 1 2021

LuoYuanke updated subscribers of D99708: [X86] Enable compilation of user interrupt handlers..
Apr 1 2021, 8:11 AM · Restricted Project, Restricted Project
LuoYuanke added a comment to D99708: [X86] Enable compilation of user interrupt handlers..

A user interrupt is different than a regular interrupt, right? It doesn't make sense that we would change the behavior of the interrupt calling convention just because the user interrupt instructions are enabled. That would occur just from passing a -march for a newer CPU, wouldn't it?

Apr 1 2021, 8:10 AM · Restricted Project, Restricted Project
LuoYuanke added a reviewer for D99708: [X86] Enable compilation of user interrupt handlers.: hjl.tools.
Apr 1 2021, 8:08 AM · Restricted Project, Restricted Project
LuoYuanke added inline comments to D99010: [X86][AMX] Hoist ldtilecfg.
Apr 1 2021, 8:06 AM · Restricted Project

Mar 31 2021

LuoYuanke added a reviewer for D99010: [X86][AMX] Hoist ldtilecfg: wxiao3.
Mar 31 2021, 7:58 PM · Restricted Project
LuoYuanke added a comment to D99152: [AMX] Prototype for vector and amx bitcast..

Unfortunately it is not possible to use an opaque type with the AMX intrinsics at the moment, because of the way they are defined. It is possible to use opaque types with intrinsics in general though, e.g. see https://llvm.godbolt.org/z/Ezhf6535c

My point is, you should be able to adjust the definitions of the AMX intrinsics and then just replace all occurrences of x86_amx in your examples with an opaque type you define in the module. But as I said initially, you don't need to do everything at once (and you probably shouldn't). I'd start with addressing the bitcast issue and tackle the x86_amx type itself once that is done.

(And I am also not saying that it definitely needs to be removed, only that if it should be kept in the long run, it would be good to specify it in the LangRef and should have a good justification, especially if there are no instructions that do anything meaningful with values of the type other than take it as arguments and return values. Opaque types are a suggestion for an alternative that *may* be viable without a dedicated first-class type)

Mar 31 2021, 6:25 AM · Restricted Project, Restricted Project

Mar 30 2021

LuoYuanke added a reviewer for D99565: [X86] Support replacing aligned vector moves with unaligned moves when avx is enabled.: RKSimon.
Mar 30 2021, 12:51 AM · Restricted Project, Restricted Project
LuoYuanke added reviewers for D99565: [X86] Support replacing aligned vector moves with unaligned moves when avx is enabled.: smaslov, lebedev.ri.
Mar 30 2021, 12:50 AM · Restricted Project, Restricted Project

Mar 29 2021

LuoYuanke added a comment to D99152: [AMX] Prototype for vector and amx bitcast..

Whether further optimizations are correct is a different problem, but we need a specification for the builtins, intrinsics and the type before going any further in that direction.

I think you need to set the input to LLVM IR: https://gcc.godbolt.org/z/WexMjsas9

You should be able to use opaque types with overloaded intrinsics. I don't think you can define an intrinsic to take a specific opaque type (because it's not known up front).

Mar 29 2021, 8:01 AM · Restricted Project, Restricted Project
LuoYuanke added a comment to D99152: [AMX] Prototype for vector and amx bitcast..

I think that point was not really clear during the discussion. Using load <256 x i32> to lower __tile_loadd() would indeed be incorrect. But I don't think that's happening at the moment, at least going from a simple example https://gcc.godbolt.org/z/KT5rczn8j

The load/store <256 x i32> is generated by the front-end, because in the C language a tile is a vector <256 x i32>. The load/store <256 x i32> is transformed to llvm.x86.tileloadd64.internal/llvm.x86.tilestored64.internal in lib/Target/X86/X86LowerAMXType.cpp if the load result is an operand of an AMX intrinsic or the stored value is returned from an AMX intrinsic.

Mar 29 2021, 6:43 AM · Restricted Project, Restricted Project
LuoYuanke updated subscribers of D99152: [AMX] Prototype for vector and amx bitcast..
Mar 29 2021, 6:16 AM · Restricted Project, Restricted Project
LuoYuanke updated subscribers of D99152: [AMX] Prototype for vector and amx bitcast..
Mar 29 2021, 6:15 AM · Restricted Project, Restricted Project