
syzaara (Zaara Syeda)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 24 2016, 11:42 AM (334 w, 4 h)

Recent Activity

Dec 1 2022

syzaara added inline comments to D138719: Expand loop peeling phi computation to handle binary ops and casts.
Dec 1 2022, 11:00 AM · Restricted Project, Restricted Project

Nov 2 2022

syzaara added inline comments to D136415: [LSR] Check if terminating value is safe to expand before transformation.
Nov 2 2022, 9:21 AM · Unknown Object (Project), Restricted Project, Restricted Project

Sep 19 2022

syzaara added a comment to D97735: [Globals] Treat nobuiltin fns as maybe-derefined..

ping
Hi @fhahn , could you please provide your thoughts on the earlier comment?

@fhahn
Hello. I am uncertain whether having the ‘nobuiltin’ attribute on a function declaration should imply that mayBeRedefined returns true for it. The description of ‘nobuiltin’ in the LLVM reference does not imply that the function definition may change. We should still be able to enable IPO transformations for functions with nobuiltin if the transformations do not make assumptions about the semantics of the function. There is also no statement that ‘nobuiltin’ applies only to library functions. If the attribute is used on a non-library user function, all IPO transformations are then blocked on that function. One suggested workaround could be to remove the ‘nobuiltin’ attribute from user functions that are not library calls in InferFunctionAttrs.cpp, with something like:

+++ b/llvm/lib/Transforms/IPO/InferFunctionAttrs.cpp
@@ -21,7 +21,7 @@ static bool inferAllPrototypeAttributes(
   Module &M, function_ref<TargetLibraryInfo &(Function &)> GetTLI) {
  bool Changed = false;
 
- for (Function &F : M.functions())
+ for (Function &F : M.functions()) {
   // We only infer things using the prototype and the name; we don't need
   // definitions. This ensures libfuncs are annotated and also allows our
   // CGSCC inference to avoid needing to duplicate the inference from other
@@ -32,7 +32,12 @@ static bool inferAllPrototypeAttributes(
     Changed |= inferNonMandatoryLibFuncAttrs(F, GetTLI(F));
    Changed |= inferAttributesFromOthers(F);
   }
-
+  if (!F.isDeclaration()) {
+   LibFunc LibFn;
+   if (F.hasFnAttribute(Attribute::NoBuiltin) && !GetTLI(F).getLibFunc(F, LibFn))
+    F.removeFnAttr(Attribute::NoBuiltin);
+  }
+ }
  return Changed;
 }
Sep 19 2022, 7:03 AM · Restricted Project, Restricted Project

Sep 12 2022

syzaara updated subscribers of D97735: [Globals] Treat nobuiltin fns as maybe-derefined..

@fhahn
Hello. I am uncertain whether having the ‘nobuiltin’ attribute on a function declaration should imply that mayBeRedefined returns true for it. The description of ‘nobuiltin’ in the LLVM reference does not imply that the function definition may change. We should still be able to enable IPO transformations for functions with nobuiltin if the transformations do not make assumptions about the semantics of the function. There is also no statement that ‘nobuiltin’ applies only to library functions. If the attribute is used on a non-library user function, all IPO transformations are then blocked on that function. One suggested workaround could be to remove the ‘nobuiltin’ attribute from user functions that are not library calls in InferFunctionAttrs.cpp, with something like:

Sep 12 2022, 10:34 AM · Restricted Project, Restricted Project

Aug 16 2022

syzaara added a comment to D129653: isInductionPHI - Add some safety checks.

ping

Aug 16 2022, 12:57 PM · Restricted Project, Restricted Project

Jul 25 2022

syzaara added inline comments to D129653: isInductionPHI - Add some safety checks.
Jul 25 2022, 6:44 AM · Restricted Project, Restricted Project

Jul 20 2022

syzaara added a comment to D129650: [InstCombine] change conditions for transform of sub to xor.

Note that we do not recognize the xor as a subtract of 8 SCEV expression. This prevents further simplifications on expressions using _sub_tmp4, because it is now represented as a symbol rather than a recognized SCEV operation.
Would it be possible to revert this until we teach SCEV how to handle the XOR?

Please correct me if I've misunderstood: would this patch solve the problem that you are seeing? If we want to revert, that would be 79bb915fb60b ?
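
To make the sub/xor relationship in this thread concrete, here is a minimal standalone C++ sketch (not LLVM code). It assumes the fold under discussion rests on the identity that C - X equals X ^ C whenever every set bit of X is also set in C, so the subtraction never borrows. The function name subMatchesXor is hypothetical and only serves the illustration.

```cpp
#include <cstdint>

// Hypothetical helper: reports whether the subtraction c - x yields the
// same bits as x ^ c. This holds whenever x's set bits are a subset of
// c's set bits (no borrow occurs), and can fail otherwise.
bool subMatchesXor(std::uint32_t c, std::uint32_t x) {
  return (c - x) == (x ^ c);
}
```

With c = 7 and x = 5 both forms give 2, so the rewrite is value-preserving. With c = 8 and x = 1 the subtraction gives 7 while the xor gives 9, which is why such a fold needs a known-bits precondition. The two forms compute the same value under that precondition, but an analysis that only pattern-matches subtraction will not see through the xor.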

Jul 20 2022, 10:40 AM · Restricted Project, Restricted Project
syzaara added a comment to D129650: [InstCombine] change conditions for transform of sub to xor.

This change is causing issues with creating a SCEV expression for the subtract. As a result, some expressions can no longer be delinearized, which can affect loop passes like loop interchange. For example, with the attached IR {F23852262}, we are no longer able to compute a simple SCEV expression for the subtract, which is now an xor:
Output from -passes=print<scalar-evolution>

Jul 20 2022, 9:36 AM · Restricted Project, Restricted Project

Jul 18 2022

syzaara updated the diff for D129653: isInductionPHI - Add some safety checks.
Jul 18 2022, 11:39 AM · Restricted Project, Restricted Project
syzaara updated the diff for D129653: isInductionPHI - Add some safety checks.
Jul 18 2022, 11:35 AM · Restricted Project, Restricted Project
syzaara added inline comments to D129653: isInductionPHI - Add some safety checks.
Jul 18 2022, 7:01 AM · Restricted Project, Restricted Project

Jul 13 2022

syzaara added inline comments to D129653: isInductionPHI - Add some safety checks.
Jul 13 2022, 1:07 PM · Restricted Project, Restricted Project
syzaara added inline comments to D129653: isInductionPHI - Add some safety checks.
Jul 13 2022, 10:37 AM · Restricted Project, Restricted Project
syzaara added inline comments to D129653: isInductionPHI - Add some safety checks.
Jul 13 2022, 9:47 AM · Restricted Project, Restricted Project
syzaara added a comment to D129653: isInductionPHI - Add some safety checks.

Untested, but LGTM; thanks for this cleanup!

Jul 13 2022, 9:08 AM · Restricted Project, Restricted Project
syzaara requested review of D129653: isInductionPHI - Add some safety checks.
Jul 13 2022, 8:46 AM · Restricted Project, Restricted Project

Jul 7 2022

syzaara added inline comments to D129297: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI.
Jul 7 2022, 12:30 PM · Restricted Project, Restricted Project
syzaara committed rG58b9666dc1a0: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI (authored by syzaara).
[LSR] Fix bug - check if loop has preheader before calling isInductionPHI
Jul 7 2022, 12:14 PM · Restricted Project, Restricted Project
syzaara closed D129297: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI.
Jul 7 2022, 12:13 PM · Restricted Project, Restricted Project
syzaara updated the diff for D129297: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI.

Address review

Jul 7 2022, 11:57 AM · Restricted Project, Restricted Project
syzaara updated the diff for D129297: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI.
Jul 7 2022, 11:45 AM · Restricted Project, Restricted Project
syzaara updated the diff for D129297: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI.

Addressed review comments, added test.

Jul 7 2022, 11:22 AM · Restricted Project, Restricted Project
syzaara added a comment to D125990: [LSR] Fix bug for optimizing unused IVs to final values.

I bisected an assertion failure while building the Linux kernel for PowerPC to this patch:

# bad: [20962c1240691d25b21ce425313c81eed0b1b358] [SimplifyCFG] Don't split predecessors of callbr terminator
# good: [dc969061c68e62328607d68215ed8b9ef4a1e4b1] [SimplifyCFG] Thread all predecessors with same value at once
git bisect start '20962c1240691d25b21ce425313c81eed0b1b358' 'dc969061c68e62328607d68215ed8b9ef4a1e4b1'
# bad: [a6e63e35ede4b9f23b58437263eaac9a2926c9bf] [NFC][HLSL] Add tests for vector alias. Remove dead code.
git bisect bad a6e63e35ede4b9f23b58437263eaac9a2926c9bf
# good: [5493f8fc59ca9cc6fc14da8e6aafe6a52fb9ebc0] [VectorCombine] Improve shuffle select shuffle-of-shuffles
git bisect good 5493f8fc59ca9cc6fc14da8e6aafe6a52fb9ebc0
# bad: [b170d856a3a303ab826f6896812bfd0ce05ec706] [SimplifyCFG] Skip hoisting common instructions that return token type
git bisect bad b170d856a3a303ab826f6896812bfd0ce05ec706
# bad: [7b1ff859feaaf1316932d0be113d762626009fd6] [gn build] Port b8dbc6ffea93
git bisect bad 7b1ff859feaaf1316932d0be113d762626009fd6
# good: [a2158374ba1a6f81f4cce3eb54d0bc44f3ab75e0] [mlir][LLVMIR] Apply CallOp/CallableInterface on suitable operations
git bisect good a2158374ba1a6f81f4cce3eb54d0bc44f3ab75e0
# bad: [dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e] [LSR] Fix bug for optimizing unused IVs to final values
git bisect bad dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e
# good: [b8dbc6ffea93976dc0d8569c9d23e9c21e33e317] [HLSL] Add ExternalSemaSource & vector alias
git bisect good b8dbc6ffea93976dc0d8569c9d23e9c21e33e317
# first bad commit: [dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e] [LSR] Fix bug for optimizing unused IVs to final values

A simplified C reproducer:

enum { true } typedef size_t;
struct kernfs_node {
  struct kernfs_node *parent;
} kernfs_path_from_node_locked_kn_to;
size_t kernfs_path_from_node_locked___trans_tmp_1;
void kernfs_path_from_node() {
  asm goto("" : : : : __label_warn_on);
__label_warn_on:;
  struct kernfs_node *to = &kernfs_path_from_node_locked_kn_to;
  size_t depth;
  while (to) {
    depth++;
    to = to->parent;
  }
  kernfs_path_from_node_locked___trans_tmp_1 = depth;
}
$ clang --target=powerpc64-linux-gnu -O2 -c -o /dev/null dir.i
clang: /home/nathan/cbl/src/llvm-project/llvm/include/llvm/IR/Instructions.h:2838: llvm::Value *llvm::PHINode::getIncomingValueForBlock(const llvm::BasicBlock *) const: Assertion `Idx >= 0 && "Invalid basic block argument!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang --target=powerpc64-linux-gnu -O2 -c -o /dev/null dir.i
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'dir.i'.
4.      Running pass 'Loop Pass Manager' on function '@kernfs_path_from_node'
5.      Running pass 'Loop Strength Reduction' on basic block '%while.body'
 #0 0x000055926ab23473 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3445473)
 #1 0x000055926ab2140e llvm::sys::RunSignalHandlers() (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x344340e)
 #2 0x000055926aaa7893 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #3 0x000055926aaa7a0e CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #4 0x00007fbd976d8b90 __restore_rt (/lib64/libc.so.6+0x38b90)
 #5 0x00007fbd97728c4c __pthread_kill_implementation (/lib64/libc.so.6+0x88c4c)
 #6 0x00007fbd976d8ae6 gsignal (/lib64/libc.so.6+0x38ae6)
 #7 0x00007fbd976c27f4 abort (/lib64/libc.so.6+0x227f4)
 #8 0x00007fbd976c271b _nl_load_domain.cold (/lib64/libc.so.6+0x2271b)
 #9 0x00007fbd976d1696 (/lib64/libc.so.6+0x31696)
#10 0x0000559269ca1c1d llvm::InductionDescriptor::isInductionPHI(llvm::PHINode*, llvm::Loop const*, llvm::ScalarEvolution*, llvm::InductionDescriptor&, llvm::SCEV const*, llvm::SmallVectorImpl<llvm::Instruction*>*) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x25c3c1d)
#11 0x000055926abe97dc llvm::rewriteLoopExitValues(llvm::Loop*, llvm::LoopInfo*, llvm::TargetLibraryInfo*, llvm::ScalarEvolution*, llvm::TargetTransformInfo const*, llvm::SCEVExpander&, llvm::DominatorTree*, llvm::ReplaceExitVal, llvm::SmallVector<llvm::WeakTrackingVH, 16u>&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x350b7dc)
#12 0x000055926a96f3d8 ReduceLoopStrength(llvm::Loop*, llvm::IVUsers&, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::LoopInfo&, llvm::TargetTransformInfo const&, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::MemorySSA*) LoopStrengthReduce.cpp:0:0
#13 0x000055926a99d154 (anonymous namespace)::LoopStrengthReduce::runOnLoop(llvm::Loop*, llvm::LPPassManager&) LoopStrengthReduce.cpp:0:0
#14 0x0000559269efcaeb llvm::LPPassManager::runOnFunction(llvm::Function&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x281eaeb)
#15 0x000055926a3a7197 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x2cc9197)
#16 0x000055926a3aec41 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x2cd0c41)
#17 0x000055926a3a7ba5 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x2cc9ba5)
#18 0x000055926b332f5d clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3c54f5d)
#19 0x000055926b6eafde clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) CodeGenAction.cpp:0:0
#20 0x000055926bfbafc4 clang::ParseAST(clang::Sema&, bool, bool) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x48dcfc4)
#21 0x000055926b62d8f0 clang::FrontendAction::Execute() (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3f4f8f0)
#22 0x000055926b5a1f5f clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3ec3f5f)
#23 0x000055926b6e4682 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x4006682)
#24 0x00005592696f6f4a cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x2018f4a)
#25 0x00005592696f4cdf ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#26 0x000055926b423572 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_1>(long) Job.cpp:0:0
#27 0x000055926aaa77a7 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x33c97a7)
#28 0x000055926b4230cf clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3d450cf)
#29 0x000055926b3e2dde clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3d04dde)
#30 0x000055926b3e308e clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3d0508e)
#31 0x000055926b3ff030 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x3d21030)
#32 0x00005592696f43a1 clang_main(int, char**) (/home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin/clang-15+0x20163a1)
#33 0x00007fbd976c3550 __libc_start_call_main (/lib64/libc.so.6+0x23550)
#34 0x00007fbd976c3609 __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x23609)
#35 0x00005592696f1945 _start /build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:117:0
clang-15: error: clang frontend command failed with exit code 134 (use -v to see invocation)
ClangBuiltLinux clang version 15.0.0 (https://github.com/llvm/llvm-project dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e)
Target: powerpc64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/nathan/tmp/install/llvm/dbf6ab5ef9ae0e1f4706917c2b3f98a67c35826e/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

At the parent commit, there is no crash.

Jul 7 2022, 8:39 AM · Restricted Project, Restricted Project
syzaara requested review of D129297: [LSR] Fix bug - check if loop has preheader before calling isInductionPHI.
Jul 7 2022, 8:37 AM · Restricted Project, Restricted Project

Jul 5 2022

syzaara committed rGdbf6ab5ef9ae: [LSR] Fix bug for optimizing unused IVs to final values (authored by syzaara).
[LSR] Fix bug for optimizing unused IVs to final values
Jul 5 2022, 9:31 AM · Restricted Project, Restricted Project
syzaara closed D125990: [LSR] Fix bug for optimizing unused IVs to final values.
Jul 5 2022, 9:31 AM · Restricted Project, Restricted Project

Jun 24 2022

syzaara abandoned D123521: [PowerPC] Legalize masked gather/scatter intrinsics.
Jun 24 2022, 8:46 AM · Restricted Project, Restricted Project

Jun 16 2022

syzaara added a comment to D125990: [LSR] Fix bug for optimizing unused IVs to final values.

@Meinersbur Hi Michael, could you please take another look at the addressed review comments?

Jun 16 2022, 9:09 AM · Restricted Project, Restricted Project

Jun 13 2022

syzaara abandoned D123408: [InstCombine] Limit folding of cast into PHI.
Jun 13 2022, 7:03 AM · Restricted Project, Restricted Project
syzaara added a comment to D123408: [InstCombine] Limit folding of cast into PHI.

afa192cfb604

That commit causes breakage for me - one file which used to compile in around a second now hangs (doesn't complete in many minutes at least).

To reproduce, download https://martin.st/temp/remap-preproc.c and try to compile it with clang -target x86_64-w64-mingw32 -w -c -O2 remap-preproc.c.

I reverted that commit. I do have an updated patch that accounts for the problem pattern that caused the infinite loop (blame constant expressions...again). Between that and D127499, I think it's safe to abandon this patch.

Jun 13 2022, 7:00 AM · Restricted Project, Restricted Project

Jun 9 2022

syzaara added a comment to D123408: [InstCombine] Limit folding of cast into PHI.

@spatel Could you please provide some feedback about this change to InstCombine.

In general, making exceptions to transforms will not solve the general problem. I'm not sure if this is enough, but instcombine seems to miss a narrowing transform like this:
https://alive2.llvm.org/ce/z/hRy3rE

We do already have a variation of that fold, so it should be a small enhancement. I can take a shot at that.
A demanded bits solution within instcombine might be better, but I'm not seeing how to make that work in general since we have to create 3 instructions from the pattern.

Jun 9 2022, 1:39 PM · Restricted Project, Restricted Project
syzaara added a comment to D123408: [InstCombine] Limit folding of cast into PHI.

@spatel Could you please provide some feedback about this change to InstCombine.

In general, making exceptions to transforms will not solve the general problem. I'm not sure if this is enough, but instcombine seems to miss a narrowing transform like this:
https://alive2.llvm.org/ce/z/hRy3rE

We do already have a variation of that fold, so it should be a small enhancement. I can take a shot at that.
A demanded bits solution within instcombine might be better, but I'm not seeing how to make that work in general since we have to create 3 instructions from the pattern.

Jun 9 2022, 12:57 PM · Restricted Project, Restricted Project
syzaara added a comment to D123408: [InstCombine] Limit folding of cast into PHI.

The cleanest way I can think of to teach LoopVectorizer about this would be to introduce a whole new set of composite reduction operations of the form <op>-then-<lop> (e.g. RecurKind::AddThenAnd, RecurKind::MulThenAnd, RecurKind::OrThenAnd, and so on). And that's just for combining logical and with the known integer reduction ops, so if we want to support, e.g., or, we'd need to double the number of additional recurrence kinds (and the extra logic that comes with them) again. The identity value would be determined from the <op>, and the <lop> has to be applied when reducing the final vector into a single scalar upon loop exit.

@lebedev.ri is this what you had in mind or is there a better way to do it?

@lebedev.ri Can you please advise if the above described way is how we would implement this within the LoopVectorizer?

Sorry, lost track here. I'm not familiar enough with LV to recommend the solution, but it sounds vaguely reasonable to me. But do you need the whole <op>-then-<lop> generality?
The only reason <op>-then-and is useful is that the and specifies the effective bitwidth of the reduction, but if the high bits aren't demanded, arithmetic/logic ops can be losslessly performed in narrower bit widths:
i32 65535 + i32 65535 = i32 131070 = 0x1FFFE, while (trunc(i32 65535) to i8) + (trunc(i32 65535) to i8) = 255 + 255 = 510, which wraps to i8 254 = 0xFE; note how the low 8 bits are the same.
Perhaps the solution should be around tracking the demanded bit width?
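
The arithmetic in the comment above can be checked with a small standalone C++ sketch. It is only an illustration of the demanded-bits argument, not anything from LLVM; the function names addWideThenTrunc and truncThenAddNarrow are hypothetical.

```cpp
#include <cstdint>

// Add in 32 bits, then keep only the low 8 bits of the result.
std::uint8_t addWideThenTrunc(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint8_t>(a + b);
}

// Truncate both operands to 8 bits first, then add with 8-bit wraparound.
std::uint8_t truncThenAddNarrow(std::uint32_t a, std::uint32_t b) {
  const std::uint8_t na = static_cast<std::uint8_t>(a);
  const std::uint8_t nb = static_cast<std::uint8_t>(b);
  // The operands promote to int for the addition; the cast wraps modulo 256.
  return static_cast<std::uint8_t>(na + nb);
}
```

For a = b = 65535 both functions return 0xFE, matching the example: when only the low 8 bits are demanded, the add can be done in the narrow type.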

If we think about this problem as an issue with determination of demanded bitwidth, I'm not sure why it would be the job of LV to understand it and treat it in a special way, when InstCombine could have determined it and generated code without the extra widening/truncating instructions. After all, InstCombine is meant to simplify the IR for downstream passes (not LV or other non-canonicalization passes). The one complication is the effect of nsw on that add. I'm not sure whether InstCombine aims at keeping that flag and knowingly treats it as more beneficial than making other simplifications. I suspect it does not, but it would be good to verify.

Jun 9 2022, 9:24 AM · Restricted Project, Restricted Project
syzaara updated the diff for D123408: [InstCombine] Limit folding of cast into PHI.

Fix testcase missing target datalayout.

Jun 9 2022, 9:19 AM · Restricted Project, Restricted Project
syzaara added a comment to D125990: [LSR] Fix bug for optimizing unused IVs to final values.

ping

Jun 9 2022, 6:06 AM · Restricted Project, Restricted Project

Jun 2 2022

syzaara updated the diff for D125990: [LSR] Fix bug for optimizing unused IVs to final values.

Address review comments.

Jun 2 2022, 1:28 PM · Restricted Project, Restricted Project

May 26 2022

syzaara updated the diff for D125990: [LSR] Fix bug for optimizing unused IVs to final values.

Shift handling into rewriteLoopExitValues so we can check each individual phi. Added a new entry to enum ReplaceExitVal called UnusedIndVar which will allow rewriteLoopExitValues to check for exit values that can be replaced when they are used as induction variables in the loop with no other uses in the loop.
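
As a source-level analogy for what replacing an unused induction variable with its final value means, here is a minimal C++ sketch. It is not the rewriteLoopExitValues implementation; it assumes a simple counted loop with a known trip count, and the function names are hypothetical.

```cpp
#include <cstdint>

// The induction variable i is updated in the loop but only read after it.
std::uint64_t finalValueByLoop(std::uint64_t start, std::uint64_t step,
                               std::uint64_t tripCount) {
  std::uint64_t i = start;
  for (std::uint64_t n = 0; n < tripCount; ++n)
    i += step;
  return i;
}

// The same exit value computed directly, which is what lets the in-loop
// update be dropped when nothing else in the loop uses i.
std::uint64_t finalValueClosedForm(std::uint64_t start, std::uint64_t step,
                                   std::uint64_t tripCount) {
  return start + step * tripCount;
}
```

Both functions agree for any inputs because unsigned arithmetic wraps consistently; for start = 3, step = 5, tripCount = 10 each returns 53.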

May 26 2022, 9:55 AM · Restricted Project, Restricted Project

May 19 2022

syzaara requested review of D125990: [LSR] Fix bug for optimizing unused IVs to final values.
May 19 2022, 10:03 AM · Restricted Project, Restricted Project

May 17 2022

syzaara added a comment to D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

Hi,

The following starts crashing with this patch:

llc -O1 -o /dev/null bbi-69670_x86.ll

I get:

llc: ../lib/Transforms/Utils/LoopUtils.cpp:1385: int llvm::rewriteLoopExitValues(llvm::Loop *, llvm::LoopInfo *, llvm::TargetLibraryInfo *, llvm::ScalarEvolution *, const llvm::TargetTransformInfo *, llvm::SCEVExpander &, llvm::DominatorTree *, llvm::ReplaceExitVal, SmallVector<llvm::WeakTrackingVH, 16> &): Assertion `EVL->contains(L) && "LCSSA breach detected!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: ../../main-github/llvm/build-all/bin/llc -O1 -o /dev/null bbi-69670_x86.ll
1.	Running pass 'Function Pass Manager' on module 'bbi-69670_x86.ll'.
2.	Running pass 'Loop Pass Manager' on function '@func_1'
3.	Running pass 'Loop Strength Reduction' on basic block '%for.cond6418'
 #0 0x0000000002abded3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (../../main-github/llvm/build-all/bin/llc+0x2abded3)
 #1 0x0000000002abbb4e llvm::sys::RunSignalHandlers() (../../main-github/llvm/build-all/bin/llc+0x2abbb4e)
 #2 0x0000000002abe256 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007fcb1f00e630 __restore_rt sigaction.c:0:0
 #4 0x00007fcb1c755387 raise (/lib64/libc.so.6+0x36387)
 #5 0x00007fcb1c756a78 abort (/lib64/libc.so.6+0x37a78)
 #6 0x00007fcb1c74e1a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #7 0x00007fcb1c74e252 (/lib64/libc.so.6+0x2f252)
 #8 0x0000000002b80be3 llvm::rewriteLoopExitValues(llvm::Loop*, llvm::LoopInfo*, llvm::TargetLibraryInfo*, llvm::ScalarEvolution*, llvm::TargetTransformInfo const*, llvm::SCEVExpander&, llvm::DominatorTree*, llvm::ReplaceExitVal, llvm::SmallVector<llvm::WeakTrackingVH, 16u>&) (../../main-github/llvm/build-all/bin/llc+0x2b80be3)
 #9 0x00000000024c0a1e ReduceLoopStrength(llvm::Loop*, llvm::IVUsers&, llvm::ScalarEvolution&, llvm::DominatorTree&, llvm::LoopInfo&, llvm::TargetTransformInfo const&, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::MemorySSA*) LoopStrengthReduce.cpp:0:0
#10 0x00000000024ec121 (anonymous namespace)::LoopStrengthReduce::runOnLoop(llvm::Loop*, llvm::LPPassManager&) LoopStrengthReduce.cpp:0:0
#11 0x0000000001b2d17b llvm::LPPassManager::runOnFunction(llvm::Function&) (../../main-github/llvm/build-all/bin/llc+0x1b2d17b)
#12 0x000000000230895f llvm::FPPassManager::runOnFunction(llvm::Function&) (../../main-github/llvm/build-all/bin/llc+0x230895f)
#13 0x000000000230f398 llvm::FPPassManager::runOnModule(llvm::Module&) (../../main-github/llvm/build-all/bin/llc+0x230f398)
#14 0x0000000002308f2d llvm::legacy::PassManagerImpl::run(llvm::Module&) (../../main-github/llvm/build-all/bin/llc+0x2308f2d)
#15 0x000000000074a073 main (../../main-github/llvm/build-all/bin/llc+0x74a073)
#16 0x00007fcb1c741555 __libc_start_main (/lib64/libc.so.6+0x22555)
#17 0x0000000000747670 _start (../../main-github/llvm/build-all/bin/llc+0x747670)
Abort

I wouldn't be surprised if it's the dead bb jumping to non-dead code that trips up something:

for.cond6403:                                     ; preds = %dead, %cont5825
  %1 = phi i32 [ %.lcssa221, %dead ], [ 0, %cont5825 ]
  br label %for.cond6418
[...]
dead:                                             ; No predecessors!
  br label %for.cond6403

I wrote
https://github.com/llvm/llvm-project/issues/55529
about the crash.

May 17 2022, 8:43 AM · Restricted Project, Restricted Project, Unknown Object (Project)

May 3 2022

syzaara added a comment to D123408: [InstCombine] Limit folding of cast into PHI.

The cleanest way I can think of to teach LoopVectorizer about this would be to introduce a whole new set of composite reduction operations of the form <op>-then-<lop> (e.g. RecurKind::AddThenAnd, RecurKind::MulThenAnd, RecurKind::OrThenAnd, and so on). And that's just for combining logical and with the known integer reduction ops, so if we want to support, e.g., or, we'd need to double the number of additional recurrence kinds (and the extra logic that comes with them) again. The identity value would be determined from the <op>, and the <lop> has to be applied when reducing the final vector into a single scalar upon loop exit.

@lebedev.ri is this what you had in mind or is there a better way to do it?
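
A rough standalone C++ sketch of the add-then-and idea follows. It is not LoopVectorizer code: it simulates four vector lanes with an array, starts each lane at the identity of add (zero), and applies the and only once when folding the lanes into a scalar. The function names are hypothetical, and the equivalence with the scalar loop is only assumed to hold when mask is a low-bit mask of the form 2^n - 1.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar form of the reduction: the mask is applied on every iteration.
std::uint32_t scalarAddThenAnd(const std::vector<std::uint32_t>& v,
                               std::uint32_t mask) {
  std::uint32_t acc = 0;
  for (std::uint32_t x : v)
    acc = (acc + x) & mask;
  return acc;
}

// Lane-wise form: plain adds per lane, mask applied once at loop exit.
// Assumes mask == 2^n - 1 so that masking commutes with the additions.
std::uint32_t lanesAddThenAnd(const std::vector<std::uint32_t>& v,
                              std::uint32_t mask) {
  std::array<std::uint32_t, 4> lanes{};  // identity value of add
  for (std::size_t i = 0; i < v.size(); ++i)
    lanes[i % lanes.size()] += v[i];
  std::uint32_t total = 0;
  for (std::uint32_t lane : lanes)
    total += lane;
  return total & mask;
}
```

For the input {65535, 65535, 7, 300, 1} with mask 0xFF both forms return 50.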

May 3 2022, 12:19 PM · Restricted Project, Restricted Project

Apr 11 2022

syzaara requested review of D123521: [PowerPC] Legalize masked gather/scatter intrinsics.
Apr 11 2022, 9:55 AM · Restricted Project, Restricted Project

Apr 8 2022

syzaara added a comment to D123408: [InstCombine] Limit folding of cast into PHI.

This seems like a vectorizer bug.

Apr 8 2022, 10:23 AM · Restricted Project, Restricted Project
syzaara requested review of D123408: [InstCombine] Limit folding of cast into PHI.
Apr 8 2022, 10:16 AM · Restricted Project, Restricted Project
syzaara committed rG07005440ae14: [LSR] Optimize unused IVs to final values in the exit block (authored by syzaara).
[LSR] Optimize unused IVs to final values in the exit block
Apr 8 2022, 8:18 AM · Restricted Project, Restricted Project
syzaara closed D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.
Apr 8 2022, 8:18 AM · Restricted Project, Restricted Project, Unknown Object (Project)

Mar 28 2022

syzaara updated the diff for D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.
Mar 28 2022, 12:01 PM · Restricted Project, Restricted Project, Unknown Object (Project)

Mar 9 2022

syzaara updated the diff for D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

Updated to use rewriteLoopExitValues from LoopUtils.

Mar 9 2022, 11:42 AM · Restricted Project, Restricted Project, Unknown Object (Project)
syzaara added a comment to D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

IndVarSimplify calls rewriteLoopExitValues from LoopUtils to do this optimization. I tried to call the same function in LoopStrengthReduce; however, rewriteLoopExitValues asserts that it requires: L->isRecursivelyLCSSAForm(*DT, *LI).

When calling rewriteLoopExitValues from LoopStrengthReduce, this assert is not met.

I am confused: is the assert from rewriteLoopExitValues met or not when calling from LoopStrengthReduce? Does LoopStrengthReduce currently rely on LCSSA form? Does the loop only stop being in LCSSA form after LoopStrengthReduce changes the code?
Can we first invoke formLCSSA to form LCSSA before calling rewriteLoopExitValues?

Mar 9 2022, 11:40 AM · Restricted Project, Restricted Project, Unknown Object (Project)

Mar 3 2022

syzaara updated the diff for D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.
Mar 3 2022, 8:40 AM · Restricted Project, Restricted Project, Unknown Object (Project)
Herald added a project to D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV: Restricted Project.

Is it possible to abstract the portion of the code in IndVarSimplify to simplify this code pattern, and call that function instead of creating your own ReplaceExitPHIsWithFinalVal?

Mar 3 2022, 8:37 AM · Restricted Project, Restricted Project, Unknown Object (Project)

Feb 15 2022

syzaara updated subscribers of D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

@nikic Hi Nikita, could you help me with assessing the compile time impact of adding the IndVarSimplifyPass after LoopStrengthReduce?

Feb 15 2022, 8:30 AM · Restricted Project, Restricted Project, Unknown Object (Project)

Feb 14 2022

syzaara added a comment to D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

PING

Feb 14 2022, 8:56 AM · Restricted Project, Restricted Project, Unknown Object (Project)

Feb 2 2022

syzaara added a comment to D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

This is normally an IndVars problem, but I guess LSR runs so late that there is no IndVars scheduled after it?
Do you have a phase-ordering test showing this missing simplification?

Feb 2 2022, 11:55 AM · Restricted Project, Restricted Project, Unknown Object (Project)
syzaara added a comment to D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.

This is normally an IndVars problem, but I guess LSR runs so late that there is no IndVars run scheduled after it?
Do you have a phase-ordering test showing this missing simplification?

Feb 2 2022, 10:16 AM · Restricted Project, Restricted Project, Unknown Object (Project)
syzaara updated the diff for D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.
Feb 2 2022, 9:53 AM · Restricted Project, Restricted Project, Unknown Object (Project)
syzaara added a reviewer for D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV: dmgreen.
Feb 2 2022, 9:24 AM · Restricted Project, Restricted Project, Unknown Object (Project)
syzaara requested review of D118808: Loop Strength Reduce - Optimize unused IVs to final values in the exit block with SCEV.
Feb 2 2022, 9:21 AM · Restricted Project, Restricted Project, Unknown Object (Project)

Dec 15 2021

syzaara committed rGe0669931afdb: [LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam (authored by syzaara).
[LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam
Dec 15 2021, 7:58 AM

Dec 14 2021

syzaara committed rGdd245bab9fbb: [LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam (authored by syzaara).
[LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam
Dec 14 2021, 8:47 AM
syzaara closed D114886: [LoopOptWG][LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam.
Dec 14 2021, 8:47 AM · Unknown Object (Project), Restricted Project
syzaara committed rG3f066ac64893: Test commit (authored by syzaara).
Test commit
Dec 14 2021, 7:38 AM

Dec 1 2021

syzaara updated the diff for D114886: [LoopOptWG][LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam.
Dec 1 2021, 11:05 AM · Unknown Object (Project), Restricted Project
syzaara added a reviewer for D114886: [LoopOptWG][LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam: reames.
Dec 1 2021, 10:41 AM · Unknown Object (Project), Restricted Project
syzaara retitled D114886: [LoopOptWG][LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam from [LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam to [LoopOptWG][LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam.
Dec 1 2021, 9:11 AM · Unknown Object (Project), Restricted Project
syzaara requested review of D114886: [LoopOptWG][LoopUnroll] Disable loop unroll when user explicitly asks for unroll-and-jam.
Dec 1 2021, 9:08 AM · Unknown Object (Project), Restricted Project

Jun 4 2019

syzaara accepted D62823: [AIX] Implement call lowering with parameters could pass onto GPRs.

LGTM

Jun 4 2019, 10:06 AM · Restricted Project
syzaara added inline comments to D62532: [AIX] Implement function descriptor on SDAG.
Jun 4 2019, 10:00 AM · Restricted Project
syzaara added inline comments to D62823: [AIX] Implement call lowering with parameters could pass onto GPRs.
Jun 4 2019, 7:17 AM · Restricted Project

Jan 15 2019

syzaara committed rL351193: [SimpleLoopUnswitch] Increment stats counter for unswitching switch instruction.
[SimpleLoopUnswitch] Increment stats counter for unswitching switch instruction
Jan 15 2019, 7:12 AM
syzaara closed D56408: [SimpleLoopUnswitch] Increment stats counter for unswitching switch instruction.
Jan 15 2019, 7:11 AM

Nov 20 2018

syzaara added inline comments to D54720: [PPC64] toc-indirect to toc-relative relaxation..
Nov 20 2018, 6:47 AM · Restricted Project

Nov 12 2018

syzaara added a comment to D54433: [PowerPC][NFC] Macro for register set defs for the Asm Parser.

We have similar definitions in PPCDisassembler.cpp, it would be good to condense those definitions as well.

Nov 12 2018, 10:16 AM

Nov 9 2018

syzaara committed rL346512: [Power9] Allow gpr callee saved spills in prologue to vectors registers.
[Power9] Allow gpr callee saved spills in prologue to vectors registers
Nov 9 2018, 8:39 AM
syzaara closed D39386: [Power9] Allow gpr callee saved spills in prologue to vector registers rather than stack.
Nov 9 2018, 8:38 AM · Restricted Project

Nov 5 2018

syzaara committed rL346148: [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics.
[Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics
Nov 5 2018, 9:33 AM
syzaara closed D53581: [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics.
Nov 5 2018, 9:33 AM

Nov 2 2018

syzaara edited reviewers for D53581: [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics, added: nemanjai, lei, sfertile, stefanp, amyk; removed: power-llvm-team.
Nov 2 2018, 11:49 AM

Oct 23 2018

syzaara created D53581: [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics.
Oct 23 2018, 8:28 AM

Oct 9 2018

syzaara added inline comments to D39386: [Power9] Allow gpr callee saved spills in prologue to vector registers rather than stack.
Oct 9 2018, 6:26 AM · Restricted Project

Sep 24 2018

syzaara committed rL342882: [PowerPC] Support operand modifier 'x' in inline asm.
[PowerPC] Support operand modifier 'x' in inline asm
Sep 24 2018, 7:25 AM
syzaara closed D52244: [PowerPC] Support operand modifier 'x' in inline asm.
Sep 24 2018, 7:25 AM
syzaara updated the diff for D52244: [PowerPC] Support operand modifier 'x' in inline asm.
Sep 24 2018, 7:20 AM

Sep 18 2018

syzaara created D52244: [PowerPC] Support operand modifier 'x' in inline asm.
Sep 18 2018, 12:23 PM

Aug 31 2018

syzaara added inline comments to D39386: [Power9] Allow gpr callee saved spills in prologue to vector registers rather than stack.
Aug 31 2018, 8:30 AM · Restricted Project

Aug 30 2018

syzaara updated the diff for D39386: [Power9] Allow gpr callee saved spills in prologue to vector registers rather than stack.
Aug 30 2018, 8:26 AM · Restricted Project

Aug 28 2018

syzaara updated the diff for D39386: [Power9] Allow gpr callee saved spills in prologue to vector registers rather than stack.
Aug 28 2018, 10:44 AM · Restricted Project

Aug 21 2018

syzaara committed rL340281: [PPC64] Add TLS initial exec to local exec relaxation.
[PPC64] Add TLS initial exec to local exec relaxation
Aug 21 2018, 8:14 AM
syzaara committed rLLD340281: [PPC64] Add TLS initial exec to local exec relaxation.
[PPC64] Add TLS initial exec to local exec relaxation
Aug 21 2018, 8:14 AM
syzaara closed D48091: [PPC64] Add TLS initial exec to local exec relaxation.
Aug 21 2018, 8:14 AM

Aug 8 2018

syzaara committed rL339260: [PowerPC] Improve codegen for vector loads using scalar_to_vector.
[PowerPC] Improve codegen for vector loads using scalar_to_vector
Aug 8 2018, 8:21 AM
syzaara closed D48950: [PowerPC] Improve codegen for vector loads using scalar_to_vector.
Aug 8 2018, 8:21 AM

Jul 25 2018

syzaara updated the diff for D49507: [Power9] Add __float128 support in the backend for bitcast to a i128.
Jul 25 2018, 1:01 PM
syzaara updated the diff for D48091: [PPC64] Add TLS initial exec to local exec relaxation.
Jul 25 2018, 8:10 AM

Jul 24 2018

syzaara added inline comments to D48950: [PowerPC] Improve codegen for vector loads using scalar_to_vector.
Jul 24 2018, 2:22 PM
syzaara added inline comments to D48950: [PowerPC] Improve codegen for vector loads using scalar_to_vector.
Jul 24 2018, 2:13 PM
syzaara added inline comments to D48950: [PowerPC] Improve codegen for vector loads using scalar_to_vector.
Jul 24 2018, 2:12 PM
syzaara commandeered D49507: [Power9] Add __float128 support in the backend for bitcast to a i128.
Jul 24 2018, 1:30 PM
syzaara updated the diff for D48091: [PPC64] Add TLS initial exec to local exec relaxation.
Jul 24 2018, 12:59 PM

Jul 18 2018

syzaara added a comment to D48091: [PPC64] Add TLS initial exec to local exec relaxation.

Just a question. Does this patch support the optimization that fills in the GOT slot when a group of initial-exec relocations are known to be link-time constants?

__attribute__((tls_model("initial-exec")))
static __thread DTLS dtls;
48:   00 00 62 3c     addis   r3,r2,0
                      48: R_PPC64_GOT_TPREL16_HA      _ZN11__sanitizerL4dtlsE
4c:   00 00 63 e8     ld      r3,0(r3)
                      4c: R_PPC64_GOT_TPREL16_LO_DS   _ZN11__sanitizerL4dtlsE
50:   14 6a 83 7c     add     r4,r3,r13
                      50: R_PPC64_TLS _ZN11__sanitizerL4dtlsE

https://github.com/llvm-mirror/lld/tree/master/ELF/Relocations.cpp#L747

template <class ELFT> static void addGotEntry(Symbol &Sym) {
...
  bool IsLinkTimeConstant =
      !Sym.IsPreemptible && (!Config->Pic || isAbsolute(Sym));
  /// If R_PPC64_GOT_TPREL16_HA and R_PPC64_GOT_TPREL16_LO_DS are link-time constants, they can be filled, but `Target->GotRel` should not be used here as it is a 64-bit value.
  /// Target->GotRel is R_PPC64_GLOB_DAT, but I think it should not be used.
  if (IsLinkTimeConstant) {
    InX::Got->Relocations.push_back({Expr, Target->GotRel, Off, 0, &Sym});
    return;
  }
Jul 18 2018, 8:07 AM
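
The link-time-constant condition quoted in the snippet above can be isolated as a small predicate. The sketch below restates only that boolean expression. `SymbolInfo` and `LinkConfig` are hypothetical stand-ins introduced for illustration and are not lld's actual `Symbol` or `Config` types.

```cpp
// Hypothetical, simplified stand-ins for the linker state the quoted
// condition reads; the real lld types carry far more information.
struct SymbolInfo {
  bool isPreemptible; // may be overridden by another module at run time
  bool isAbsolute;    // value does not depend on the load address
};

struct LinkConfig {
  bool pic; // producing position-independent output
};

// Mirrors the quoted expression:
//   !Sym.IsPreemptible && (!Config->Pic || isAbsolute(Sym))
// A GOT slot for such a symbol can be resolved by the static linker
// instead of being left to a dynamic relocation.
bool isLinkTimeConstant(const SymbolInfo &sym, const LinkConfig &cfg) {
  return !sym.isPreemptible && (!cfg.pic || sym.isAbsolute);
}
```

In this model a non-preemptible symbol in a non-PIC link is a link-time constant, while the same symbol in a PIC link qualifies only if it is absolute.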

Jul 12 2018

syzaara updated the diff for D49237: [PPC64] Optimize redundant instructions using R_PPC64_TOC16_HA in nop.
Jul 12 2018, 8:00 AM