Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

adriantong1024 (Adrian Tong)
User

Projects

User does not belong to any projects.

User Details

User Since
May 11 2021, 10:18 PM (123 w, 4 d)

Recent Activity

Nov 1 2022

adriantong1024 committed rGb124295ef60f: Implement support for AArch64ISD::MOVI in computeKnownBits (authored by adriantong1024).
Implement support for AArch64ISD::MOVI in computeKnownBits
Nov 1 2022, 8:57 AM · Restricted Project, Restricted Project
adriantong1024 closed D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.
Nov 1 2022, 8:57 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.

Rebase.

Nov 1 2022, 8:51 AM · Restricted Project, Restricted Project
adriantong1024 committed rG4a8d1daadd47: Precommit test cases for D137108 (Support for AArch64ISD::MOVI in… (authored by adriantong1024).
Precommit test cases for D137108 (Support for AArch64ISD::MOVI in…
Nov 1 2022, 8:39 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.

Address test case comment from dmgreen. Thanks !

Nov 1 2022, 8:06 AM · Restricted Project, Restricted Project
adriantong1024 added a comment to D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.

There are other MOVI nodes created in ISel, have you considered adding support for those too?

MOVI,
MOVIshift,
MOVIedit,
MOVImsl,
FMOV,
MVNIshift,
MVNImsl,
Nov 1 2022, 7:55 AM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.
Nov 1 2022, 7:45 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.

Address comments from reviewers. Thanks !

Nov 1 2022, 7:43 AM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.
Nov 1 2022, 7:29 AM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.
Nov 1 2022, 7:20 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.

Address comments from reviewers.

Nov 1 2022, 7:18 AM · Restricted Project, Restricted Project

Oct 31 2022

adriantong1024 added a reviewer for D137108: Implement support for AArch64ISD::MOVI in computeKnownBits: Carrot.
Oct 31 2022, 2:05 PM · Restricted Project, Restricted Project
adriantong1024 requested review of D137108: Implement support for AArch64ISD::MOVI in computeKnownBits.
Oct 31 2022, 12:53 PM · Restricted Project, Restricted Project

Aug 9 2022

adriantong1024 accepted D111237: [TypePromotion] Search from ZExt + PHI.
Aug 9 2022, 10:41 AM · Restricted Project, Restricted Project

Jul 21 2022

adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

The primary issue with range extension thunks is that they clobber x16 (and in theory are allowed to clobber x17). So we'd need to ensure that all asm goto blocks clobber x16 and x17.

I can't think of any other issue with depending on range extension thunks, I guess. (See also https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#call-and-jump-relocations .)

I guess we could error if the inline asm used x16 or x17 and we did do MFS? IIRC, we already error for 32b r7 being reserved (under some conditions, which I've forgotten). Or maybe avoid MFS if the inline asm had x16 or x17 in its clobber list?

Yes, I think this is a good option. Thanks !

Jul 21 2022, 10:44 AM · Restricted Project, Restricted Project

Jul 14 2022

adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

AArch64 "b" has a range of +-128MB. Which isn't enough for arbitrary programs. So in general, you need a sequence like the following (assuming small code model):

adrp x0, dest
add x0, x0, :lo12:dest
blr x0

That is, unless you're okay with the restriction that your binary is at most 128MB. Which might be reasonable for the Linux kernel, I guess. But again, something you'd want to document...


That said, I'm surprised machine function splitting on aarch64 works without any other changes; currently, branch relaxation isn't aware of section markings at all. Or do you have some other out-of-tree patches?

Jul 14 2022, 2:52 PM · Restricted Project, Restricted Project
adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

The problem this patch is trying to fix is discovered on AArch64 where the conditional branch b.ge is too short after MFS places hot and cold blocks apart.

Isn't it straightforward to change b.ge to b.lt to the other branch destination, then b (no condition) to the label?

We can do this. I am slightly worried about performance as this is in a performance critical part of the code.

Generally, asm goto is modeled as the indirect branches being taken are the exceptional cases. So the indirect branch targets should be treated as if they were cold. If they're moved far away...good.

Otherwise if that's surprising for that code, it sounds like machine function splitting should be disabled for that function.

Jul 14 2022, 2:33 PM · Restricted Project, Restricted Project
adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

The problem this patch is trying to fix is discovered on AArch64 where the conditional branch b.ge is too short after MFS places hot and cold blocks apart.

Isn't it straightforward to change b.ge to b.lt to the other branch destination, then b (no condition) to the label?

Jul 14 2022, 2:23 PM · Restricted Project, Restricted Project
adriantong1024 abandoned D129677: Disable machine function splitting for functions with inline asm br.

It also seems like such a case is easy to fix; in the inline asm the user should just use the wider encoding.

There's always going to be some limit to the distance unless you use an indirect branch. Well, except on weird targets like x86, where a direct branch can reach everywhere. I guess on targets like AArch64, you could mark the branch destinations as functions, and let the linker could insert a stub. But people writing inline asm probably don't expect a branch to clobber x16/x17...

Once we start imposing any restriction on the distance, we need a patch like this: if the destination is in a different section, we can't promise anything about the distance.

I wouldn't expect this to be an issue on x86, though; even on 64-bit, people normally use the "small" code model, so a "jmp" should reach anywhere in the binary. So I'm not sure what the testcase is supposed to be testing.

Jul 14 2022, 1:27 PM · Restricted Project, Restricted Project

Jul 13 2022

adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

In the inline assembly, there may be branch that can only jump a limited distance, if we run MFS on the function, the resulting distance maybe to far to encode into the instruction.

If there's a rule like this, we should explicitly state in LangRef which branches are/are not allowed.

Jul 13 2022, 3:00 PM · Restricted Project, Restricted Project
adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

The ones in the inline assembly cannot.

This should be fixable on top of D129288. But an incremental fix separate from that would be okay.

Not sure the extra flag carries its weight; it's already hard enough to keep track of all the various flags we put on MachineFunction , and you can already easily check IsInlineAsmBrIndirectTarget() while you're iterating over basic blocks.

Jul 13 2022, 2:18 PM · Restricted Project, Restricted Project
adriantong1024 added a comment to D129677: Disable machine function splitting for functions with inline asm br.

The ones in the inline assembly cannot.

This should be fixable on top of D129288. But an incremental fix separate from that would be okay.

Err, actually, now I'm confused. What exactly do you mean by "recoded"? Do you mean there's a rule that imposes a maximum distance between an INLINEASM_BR and its indirect destinations? Is that rule written down anywhere?

Jul 13 2022, 2:15 PM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D129677: Disable machine function splitting for functions with inline asm br.

Update test case.

Jul 13 2022, 1:44 PM · Restricted Project, Restricted Project
adriantong1024 requested review of D129677: Disable machine function splitting for functions with inline asm br.
Jul 13 2022, 11:44 AM · Restricted Project, Restricted Project

Jun 22 2022

adriantong1024 committed rG4e555a3df456: Fix a misspell. NFC (authored by adriantong1024).
Fix a misspell. NFC
Jun 22 2022, 2:23 PM · Restricted Project, Restricted Project

Jun 17 2022

adriantong1024 accepted D128039: [MachineCopyPropagation] Fix D125335 accidentally change control flow..

LGTM

Jun 17 2022, 1:29 PM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D128039: [MachineCopyPropagation] Fix D125335 accidentally change control flow..
Jun 17 2022, 7:16 AM · Restricted Project, Restricted Project

Jun 16 2022

adriantong1024 committed rG55311801f06d: Allow bitwidth difference when checking for isOneOrOneSplat. (authored by adriantong1024).
Allow bitwidth difference when checking for isOneOrOneSplat.
Jun 16 2022, 9:05 AM · Restricted Project, Restricted Project
adriantong1024 closed D127354: Allow bitwidth difference when checking for isOneOrOneSplat..
Jun 16 2022, 9:05 AM · Restricted Project, Restricted Project

Jun 15 2022

adriantong1024 added a comment to D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

I ran ninja check-lld check-libc check-flang check-llvm check-polly check-bolt check-mlir check-all check-clang check-clang-tools check-openmp locally and things are good.

Jun 15 2022, 12:59 PM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

Address comment from Simon. Thanks!

Jun 15 2022, 10:13 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

Rebase after rGadfcdb0d0d4af1194121211c125c291162b43ccd

Jun 15 2022, 9:39 AM · Restricted Project, Restricted Project

Jun 14 2022

adriantong1024 added a comment to D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

I am unable to reproduce the failures in check-openmp locally.

Jun 14 2022, 10:16 PM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

Address the comments from @RKSimon. Thanks!
And rebase.

Jun 14 2022, 12:20 PM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D127354: Allow bitwidth difference when checking for isOneOrOneSplat..
Jun 14 2022, 11:26 AM · Restricted Project, Restricted Project
adriantong1024 added a comment to D25374: [DAGCombiner] Update most ADD combines to support general vector combines.

Updated isNonOpaqueConstantOrConstantVector to isNonOpaqueConstantOrNonOpaqueConstantVector to check that no vector elements are opaque constants. Also added isConstantOrConstantVector helper that doesn't care about opaque. Both of these allow undef vector elements.

Updated the helpers to not allow implicit truncation in build vectors - this will be necessary for the reuse of these helpers in other combines (e.g. SUB and MUL). This isn't necessary for the isNullConstantOrNullSplatConstant case.

Hi @RKSimon

I know this CR is quite old, I am trying to remove the "not allow implicit truncation" restriction for one optimization. I am somewhat new to this part of the compiler, can you please explain to me why this is necessary for SUB and MUL combines. An example would be great. Thanks !

I can't remember tbh - best way to find out would be for you to remove the restriction and run check-llvm-codegen to see what changes/regresses/asserts/explodes....

Jun 14 2022, 11:25 AM · Restricted Project, Restricted Project
Herald added projects to D25374: [DAGCombiner] Update most ADD combines to support general vector combines: Restricted Project, Restricted Project.

Updated isNonOpaqueConstantOrConstantVector to isNonOpaqueConstantOrNonOpaqueConstantVector to check that no vector elements are opaque constants. Also added isConstantOrConstantVector helper that doesn't care about opaque. Both of these allow undef vector elements.

Updated the helpers to not allow implicit truncation in build vectors - this will be necessary for the reuse of these helpers in other combines (e.g. SUB and MUL). This isn't necessary for the isNullConstantOrNullSplatConstant case.

Jun 14 2022, 9:34 AM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D127354: Allow bitwidth difference when checking for isOneOrOneSplat..
Jun 14 2022, 8:53 AM · Restricted Project, Restricted Project
adriantong1024 retitled D127354: Allow bitwidth difference when checking for isOneOrOneSplat. from Implement capability to optimize add negative into subtract positive in AArch64. to Allow bitwidth difference when checking for isOneOrOneSplat..
Jun 14 2022, 8:32 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

Address comments from Craig. Thanks for the pointer, this does simplify things.

Jun 14 2022, 8:31 AM · Restricted Project, Restricted Project

Jun 8 2022

adriantong1024 added a comment to D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

This shoudl have been picked up by foldAddSubMasked1 in the generic DAG combiner but wasn't because the BUILD_VECTOR has i8 element type and i32 constant operands. We should fix that to not require the widths to match.

Jun 8 2022, 4:04 PM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D127354: Allow bitwidth difference when checking for isOneOrOneSplat..

Address comments.

Jun 8 2022, 3:32 PM · Restricted Project, Restricted Project
adriantong1024 requested review of D127354: Allow bitwidth difference when checking for isOneOrOneSplat..
Jun 8 2022, 3:10 PM · Restricted Project, Restricted Project

May 31 2022

adriantong1024 updated adriantong1024.
May 31 2022, 3:43 PM

May 26 2022

adriantong1024 committed rG7c13ae6490b1: Give option to use isCopyInstr to determine which MI is (authored by adriantong1024).
Give option to use isCopyInstr to determine which MI is
May 26 2022, 11:48 AM · Restricted Project, Restricted Project
adriantong1024 closed D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 26 2022, 11:48 AM · Restricted Project, Restricted Project

May 24 2022

adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

I compiled gcc.c from https://people.csail.mit.edu/smcc/projects/single-file-programs/ and here are the results. This is compiling for ARM64 on an ARM64 machine.

May 24 2022, 10:07 AM · Restricted Project, Restricted Project
adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

I hadn't expected MachineCopyPropgation would be too expensive - it sounds like a simple concept and seems to already be run twice in the default pipeline. This is only adding an extra at -O3, so hopefully won't be too bad for compile time.

Perhaps it is worth running some quick tests though? And if it is a problem adding disablePass(&MachineCopyPropagationID); for AArch64, to run just at the new pass location, not the old. I'm not sure if that would have any knock-on effects, disabling the two existing runs of the pass earlier in the pipeline.

May 24 2022, 6:41 AM · Restricted Project, Restricted Project

May 19 2022

adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

@craig.topper @dmgreen @arsenm Please advise if there is anything else we want to change ?

May 19 2022, 9:54 AM · Restricted Project, Restricted Project

May 18 2022

adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Can you avoid adding a second run by moving the current run? The ad-hoc physreg liveness tracking after allocation is really expensive

May 18 2022, 10:05 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Address comment from @dmgreen. Thanks for the review.

May 18 2022, 9:04 AM · Restricted Project, Restricted Project

May 17 2022

adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

The problem (my concern) here is some architecture may not be ready to use isCopyInstr to identify what a COPY is.

IIUC, isCopyInstrImpl hasn't been override by many targets and returns None, I think it doesn't affect these targets in MCP. Even when MI.isCopy() returns false, and then isCopyInstr returns true, you still have chance to decide if it's feasible to perform propagation based on this MI's operands' flags and etc. As I have said, you can implement conservative checks in your first patch(sufficient to cover your cases) and enhance it in following patches.

May 17 2022, 10:14 AM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 17 2022, 9:43 AM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 17 2022, 9:24 AM · Restricted Project, Restricted Project
adriantong1024 updated the summary of D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 17 2022, 9:15 AM · Restricted Project, Restricted Project
adriantong1024 retitled D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP from Pass lamba to MachineCopyProp to determine which MI is treated as Copy instruction. to Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 17 2022, 8:42 AM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Address comment. Thanks @craig.topper for the time and review.

May 17 2022, 8:34 AM · Restricted Project, Restricted Project

May 16 2022

adriantong1024 updated the diff for D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Address comments.

May 16 2022, 3:44 PM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 16 2022, 2:28 PM · Restricted Project, Restricted Project
adriantong1024 added inline comments to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 16 2022, 1:58 PM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Address comments

May 16 2022, 1:18 PM · Restricted Project, Restricted Project

May 11 2022

adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Thanks for the review @dmgreen.

May 11 2022, 7:50 AM · Restricted Project, Restricted Project

May 10 2022

adriantong1024 added inline comments to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.
May 10 2022, 9:59 PM · Restricted Project, Restricted Project
adriantong1024 added a reviewer for D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP: craig.topper.
May 10 2022, 9:53 PM · Restricted Project, Restricted Project
adriantong1024 added a comment to D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Thanks for the review @craig.topper. I've updated the patch.

May 10 2022, 9:52 PM · Restricted Project, Restricted Project
adriantong1024 updated the diff for D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP.

Address Craig comment. Thanks for taking a look.

May 10 2022, 9:52 PM · Restricted Project, Restricted Project
adriantong1024 published D125335: Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP for review.
May 10 2022, 1:32 PM · Restricted Project, Restricted Project

Mar 4 2022

adriantong1024 accepted D118905: [TypePromotion] Avoid unnecessary trunc zext pairs.
Mar 4 2022, 12:30 PM · Restricted Project, Restricted Project
Herald added a project to D118905: [TypePromotion] Avoid unnecessary trunc zext pairs: Restricted Project.

This LGTM

Mar 4 2022, 12:30 PM · Restricted Project, Restricted Project

Feb 2 2022

adriantong1024 added inline comments to D111237: [TypePromotion] Search from ZExt + PHI.
Feb 2 2022, 4:09 PM · Restricted Project, Restricted Project

Jan 19 2022

adriantong1024 committed rGb6a7ae2c5ddc: Optimize shift and accumulate pattern in AArch64. (authored by adriantong1024).
Optimize shift and accumulate pattern in AArch64.
Jan 19 2022, 5:58 PM
adriantong1024 closed D114405: Optimize shift and accumulate pattern in AArch64..
Jan 19 2022, 5:58 PM · Restricted Project

Jan 18 2022

adriantong1024 committed rGea27adb45b78: [NFC] Test commit. (authored by adriantong1024).
[NFC] Test commit.
Jan 18 2022, 11:01 AM

Jan 17 2022

adriantong1024 updated the diff for D114405: Optimize shift and accumulate pattern in AArch64..

Address Dave's comments. Thanks!

Jan 17 2022, 8:03 AM · Restricted Project

Jan 14 2022

adriantong1024 updated the diff for D114405: Optimize shift and accumulate pattern in AArch64..

Address Dave's comment. Thanks!

Jan 14 2022, 10:51 AM · Restricted Project

Jan 12 2022

adriantong1024 updated the diff for D114405: Optimize shift and accumulate pattern in AArch64..

Add 2 more test cases, testing different datatypes.

Jan 12 2022, 1:48 PM · Restricted Project

Jan 11 2022

adriantong1024 updated the diff for D114405: Optimize shift and accumulate pattern in AArch64..

Address Dave's comment.

Jan 11 2022, 9:52 PM · Restricted Project

Dec 29 2021

adriantong1024 updated the diff for D114405: Optimize shift and accumulate pattern in AArch64..

Address Dave's comment. Moving the rules into tablegen.

Dec 29 2021, 9:16 PM · Restricted Project

Nov 24 2021

adriantong1024 added a reviewer for D114405: Optimize shift and accumulate pattern in AArch64.: dmgreen.
Nov 24 2021, 3:32 PM · Restricted Project
adriantong1024 added a comment to D114405: Optimize shift and accumulate pattern in AArch64..

Could this be handled with a tablegen pattern? Something like the or_is_add pattern from X86 (but using haveNoCommonBitsSet). Maybe a PatFrag that accepts an add or an add-like-or?

Thanks for the comment David. I will take a look at handling in tablegen pattern as well. Will update when I have more.

Nov 24 2021, 3:31 PM · Restricted Project

Nov 22 2021

adriantong1024 requested review of D114405: Optimize shift and accumulate pattern in AArch64..
Nov 22 2021, 4:48 PM · Restricted Project