This is an archive of the discontinued LLVM Phabricator instance.

[HLSL] [DirectX backend] Move generateGlobalCtorDtorCalls into DirectX backend.
Needs ReviewPublic

Authored by python3kgae on Oct 7 2022, 12:53 AM.

Details

Summary

Move generateGlobalCtorDtorCalls out of clang CodeGen.
A new transform pass GlobalCtorDtorCalls is created to do the job.
GlobalCtorDtorCalls is registed in registerPipelineEarlySimplificationEPCallback.

Diff Detail

Event Timeline

python3kgae created this revision.Oct 7 2022, 12:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 7 2022, 12:53 AM
python3kgae requested review of this revision.Oct 7 2022, 12:53 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 7 2022, 12:53 AM

You might actually want to run this twice, once early, and once in the backend. Early to get high-quality optimization, late in case optimizations don't run for some reason. (Not sure how that works if T.getEnvironment() == Triple::EnvironmentType::Library, though, since we don't actually erase the constructors?)

python3kgae added a comment.EditedOct 7 2022, 2:46 PM

You might actually want to run this twice, once early, and once in the backend. Early to get high-quality optimization, late in case optimizations don't run for some reason. (Not sure how that works if T.getEnvironment() == Triple::EnvironmentType::Library, though, since we don't actually erase the constructors?)

The pass will be always enabled in registerPipelineEarlySimplificationEPCallback (Not controlled by optimization level), is it possible that things registered in registerPipelineEarlySimplificationEPCallback be skipped?

For the library case, we'll have to erase the constructors after generating calls for them in the entry function when linking.
In case the target profile is still library when linking, we'll have to merge the constructors.
These should be done in the linker. Created https://github.com/llvm/llvm-project/issues/58238 to track it.

The pass will be always enabled in registerPipelineEarlySimplificationEPCallback (Not controlled by optimization level), is it possible that things registered in registerPipelineEarlySimplificationEPCallback be skipped?

I think we won't end up running it at -O0.

The pass will be always enabled in registerPipelineEarlySimplificationEPCallback (Not controlled by optimization level), is it possible that things registered in registerPipelineEarlySimplificationEPCallback be skipped?

I think we won't end up running it at -O0.

Oh, hmm, just checked, and it looks like we have a special codepath specifically to ensure PipelineEarlySimplificationEPCallbacks runs at -O0. I guess you're fine in that respect.

The pass will be always enabled in registerPipelineEarlySimplificationEPCallback (Not controlled by optimization level), is it possible that things registered in registerPipelineEarlySimplificationEPCallback be skipped?

I think we won't end up running it at -O0.

Oh, hmm, just checked, and it looks like we have a special codepath specifically to ensure PipelineEarlySimplificationEPCallbacks runs at -O0. I guess you're fine in that respect.

I see. Thanks for the update.

Add end of file newline

beanz added a comment.Oct 19 2022, 6:32 AM

I don’t think you should be deleting all those frontend tests. Those tests existed before your change to remove the global constructor global variable, they provide valuable coverage of the frontend constructor generation.

beanz added a comment.Oct 19 2022, 7:47 AM

In addition to restoring the test cases that you deleted, I think you should also reduce and simplify the test cases you're adding. These new tests are very large blobs of IR generated by clang. They include a lot of extra instructions that aren't needed to exercise the code you wrote in this patch.

beanz added inline comments.Oct 19 2022, 7:57 AM
clang/lib/CodeGen/CGHLSLRuntime.cpp
193

I think you've been a bit too aggressive about what code you're moving. The code that generates the global constructor calls should not be removed.

We should generate correct IR from clang, and part of correct IR for HLSL is that the global constructors are called from the entries.

clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl