Emit references to __do_global_ctors and __do_global_dtors to allow constructor/destructor routines to run.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I roughly followed the example of how __do_copy_data and __do_clear_bss are linked, in AVRTargetStreamer::finish().
We need a test case to show this change, please refer to tests in llvm/test/CodeGen or llvm/test/MC .
llvm/test/CodeGen/AVR/ctors.ll | ||
---|---|---|
1 ↗ | (On Diff #363193) | Please use llvm's builtin script to make your test case better formatted. python3 llvm-project/llvm/utils/update_llc_test_checks.py llvm-project/llvm/test/CodeGen/AVR/ctors.ll |
Updated test with update_llc_test_checks.py. Note that update_llc_test_checks.py doesn't actually support AVR at the moment! I had to apply this fix first:
diff --git a/llvm/utils/UpdateTestChecks/asm.py b/llvm/utils/UpdateTestChecks/asm.py index 2a18899d55f0..6c878dd742bc 100644 --- a/llvm/utils/UpdateTestChecks/asm.py +++ b/llvm/utils/UpdateTestChecks/asm.py @@ -360,6 +360,7 @@ def get_triple_from_march(march): 'mips': 'mips', 'sparc': 'sparc', 'hexagon': 'hexagon', + 'avr': 'avr', } for prefix, triple in triples.items(): if march.startswith(prefix):
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
64 | It would not be good to use such a flag. But would be better to put your code into MCTargetDesc/AVRTargetStreamer.cpp where __do_copy_data is emitted, and make a check like if (current module has constructor) emit do_global_ctors if (current module has destructor) emit do_global_dtors | |
206 | Do we need to emit some comments for __do_global_dtors as __do_copy_data has ? |
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
64 | I'm not sure how to know, from AVRTargetStreamer, whether there are constructors/destructors. If we don't want to keep EmittedStructorSymbolAttrs, then I suppose an alternative would be for AVRAsmPrinter::emitXXStructor() *always* to emit the global symbols. It won't hurt anything for them to be emitted more than once -- it's just theoretically inefficient. Can you explain more about your concern with EmittedStructorSymbolAttrs? Is it bad to keep state inside AVRAsmPrinter? | |
206 | Good idea -- I'll do that now. |
You're right, AVR does have a section (.ctors) for the *table* of constructors. But there has to be some code that actually calls the functions in that table. That's what __do_global_ctors does. When present, it gets linked into the boot path (after things like __do_copy_data and __do_clear_bss) to do its work. But if nothing requests it, it won't be linked, and the constructor table will just sit around, never to be consulted.
I see this code as similar to the code that links in __do_copy_data, which I think would also fall into the category of "magic symbols"—symbols that are necessary to make the program run correctly on AVR.
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
64 | Yes. I do concern keep any state inside AVRAsmPrinter ? How about Dylan's opinion? |
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
64 | I really do not like using a flag to keep the emitted state. How about only directly emit them in MCTargetDesc/AVRTargetStreamer.cpp, and add a FIXME, as for __do_copy_data ? void AVRTargetStreamer::finish() { MCStreamer &OS = getStreamer(); MCContext &Context = OS.getContext(); MCSymbol *DoCopyData = Context.getOrCreateSymbol("__do_copy_data"); MCSymbol *DoClearBss = Context.getOrCreateSymbol("__do_clear_bss"); // FIXME: We can disable __do_copy_data if there are no static RAM variables. OS.emitRawComment(" Declaring this symbol tells the CRT that it should"); OS.emitRawComment("copy all variables from program memory to RAM on startup"); OS.emitSymbolAttribute(DoCopyData, MCSA_Global); OS.emitRawComment(" Declaring this symbol tells the CRT that it should"); OS.emitRawComment("clear the zeroed data section on startup"); OS.emitSymbolAttribute(DoClearBss, MCSA_Global); } |
Sure, I'm open to that.
It's rare for a program not to use any global variables, whereas programs that don't use any constructor/destructor functions are pretty common. Luckily, though, __do_global_ctors is pretty cheap (just a few instructions) if the constructor table is empty.
So I think your suggestion would be fine.
But can you explain more the reason why we don't want this state in AVRAsmPrinter? If I'm reading correctly, the superclass, AsmPrinter, keeps mutable state, and (AFAICT) it's not unusual for other targets' AsmPrinter subclasses to keep state, either. I'm admittedly not an expert here though, so I'd like to learn why it's a problem.
Thanks!
I see your idea. It is OK to keep states in AsmPrinter. But I still would like to see if there is a unique way for all such '.globl' symbol declaration.
Maybe either temporarily put them all in AVRTargetStreamer::finish and add FIXMEs, or put all in AsmPrinter with flags ( a bit flag variable or something else)?
What about Dylan's idea?
Just brainstorming. Another option might be to move the code in AVRTargetStreamer::finish() into a new override AVRAsmPrinter::doFinalization(). That would make it so that both bits of .globl code are in AVRAsmPrinter.
Here's what that would look like (atop the change currently here):
diff --git a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp index f76034e7f115..7b7ede62a00f 100644 --- a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp +++ b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp @@ -59,6 +59,8 @@ public: void emitXXStructor(const DataLayout &DL, const Constant *CV) override; + bool doFinalization(Module &M) override; + private: const MCRegisterInfo &MRI; bool EmittedStructorSymbolAttrs = false; @@ -217,6 +219,23 @@ void AVRAsmPrinter::emitXXStructor(const DataLayout &DL, const Constant *CV) { AsmPrinter::emitXXStructor(DL, CV); } +bool AVRAsmPrinter::doFinalization(Module &M) { + MCSymbol *DoCopyData = OutContext.getOrCreateSymbol("__do_copy_data"); + MCSymbol *DoClearBss = OutContext.getOrCreateSymbol("__do_clear_bss"); + + // FIXME: We can disable __do_copy_data if there are no static RAM variables. + + OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should"); + OutStreamer->emitRawComment("copy all variables from program memory to RAM on startup"); + OutStreamer->emitSymbolAttribute(DoCopyData, MCSA_Global); + + OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should"); + OutStreamer->emitRawComment("clear the zeroed data section on startup"); + OutStreamer->emitSymbolAttribute(DoClearBss, MCSA_Global); + + return AsmPrinter::doFinalization(M); +} + } // end of namespace llvm extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAVRAsmPrinter() { diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp index eccd343d79ab..56e0e7810466 100644 --- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp +++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp @@ -21,23 +21,4 @@ AVRTargetStreamer::AVRTargetStreamer(MCStreamer &S) : MCTargetStreamer(S) {} AVRTargetAsmStreamer::AVRTargetAsmStreamer(MCStreamer &S) : AVRTargetStreamer(S) {} -void AVRTargetStreamer::finish() { - MCStreamer &OS = getStreamer(); - MCContext &Context = OS.getContext(); - - MCSymbol *DoCopyData = Context.getOrCreateSymbol("__do_copy_data"); - MCSymbol *DoClearBss = Context.getOrCreateSymbol("__do_clear_bss"); - - // FIXME: We can disable __do_copy_data if there are no static RAM variables. - - OS.emitRawComment(" Declaring this symbol tells the CRT that it should"); - OS.emitRawComment("copy all variables from program memory to RAM on startup"); - OS.emitSymbolAttribute(DoCopyData, MCSA_Global); - - OS.emitRawComment(" Declaring this symbol tells the CRT that it should"); - OS.emitRawComment("clear the zeroed data section on startup"); - OS.emitSymbolAttribute(DoClearBss, MCSA_Global); -} - } // end namespace llvm - diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h index 5c4d1a22f6c6..b8b1454a2b8d 100644 --- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h +++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h @@ -18,8 +18,6 @@ class MCStreamer; class AVRTargetStreamer : public MCTargetStreamer { public: explicit AVRTargetStreamer(MCStreamer &S); - - void finish() override; }; /// A target streamer for textual AVR assembly code.
Maybe—though note that GCC does handle this in the compiler, and LLVM itself handles the similar symbols __do_copy_data and __do_clear_bss in the compiler (as we've been discussing here).
Adding such functionality to the GNU linker probably isn't straightforward. The linker has no concept of constructor functions, AFAIK. The only thing that directs the .ctors table to be linked in is the AVR linker script.
Assuming no such fix in the linker is imminent, could we agree that supporting this in LLVM would be a good improvement for now?
If GCC does emit __do_global_ctors, having parity is fine.
If AVR supports COMDAT, make sure you test the case when the third field references a function in a comdat.
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
203 | delete period | |
207 | delete ELF doesn't need setExternal | |
llvm/test/CodeGen/AVR/ctors.ll | ||
11 ↗ | (On Diff #363373) | This is insufficient. It should additionally check whether the label is defined. |
llvm/test/CodeGen/AVR/ctors.ll | ||
---|---|---|
11 ↗ | (On Diff #363373) | The label isn't expected to be defined. The symbol comes from libgcc.a. This directive simply tells the assembler to emit an undefined-symbol reference to __do_global_ctors. The actual implementation comes at link time, from libgcc. For comparison, see llvm/test/CodeGen/AVR/clear-bss.ll, which does the same thing. (Apologies if I'm misunderstanding your request here.) |
llvm/test/CodeGen/AVR/ctors.ll | ||
---|---|---|
11 ↗ | (On Diff #363373) | OK. This is resolved. Thanks for the reply. Please add ;; Emit .globl __do_global_ctors to leave an undefined symbol. This matches GNU as and is needed to pull the definition from libgcc.a or something similar. |
Removed incorrect use of setExternal().
Tweaked assembly comment.
The COMDAT handling is in AsmPrinter::emitXXStructorList() and is unchanged by this patch.
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
207 | Good catch, thanks. |
@MaskRay, do you have an opinion on whether I should move the other undefined-symbols definitions into AVRAsmPrinter.cpp to match?
Specifically, something like this:
diff --git a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp index f76034e7f115..7b7ede62a00f 100644 --- a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp +++ b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp @@ -59,6 +59,8 @@ public: void emitXXStructor(const DataLayout &DL, const Constant *CV) override; + bool doFinalization(Module &M) override; + private: const MCRegisterInfo &MRI; bool EmittedStructorSymbolAttrs = false; @@ -217,6 +219,23 @@ void AVRAsmPrinter::emitXXStructor(const DataLayout &DL, const Constant *CV) { AsmPrinter::emitXXStructor(DL, CV); } +bool AVRAsmPrinter::doFinalization(Module &M) { + MCSymbol *DoCopyData = OutContext.getOrCreateSymbol("__do_copy_data"); + MCSymbol *DoClearBss = OutContext.getOrCreateSymbol("__do_clear_bss"); + + // FIXME: We can disable __do_copy_data if there are no static RAM variables. + + OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should"); + OutStreamer->emitRawComment("copy all variables from program memory to RAM on startup"); + OutStreamer->emitSymbolAttribute(DoCopyData, MCSA_Global); + + OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should"); + OutStreamer->emitRawComment("clear the zeroed data section on startup"); + OutStreamer->emitSymbolAttribute(DoClearBss, MCSA_Global); + + return AsmPrinter::doFinalization(M); +} + } // end of namespace llvm extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAVRAsmPrinter() { diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp index eccd343d79ab..56e0e7810466 100644 --- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp +++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp @@ -21,23 +21,4 @@ AVRTargetStreamer::AVRTargetStreamer(MCStreamer &S) : MCTargetStreamer(S) {} AVRTargetAsmStreamer::AVRTargetAsmStreamer(MCStreamer &S) : AVRTargetStreamer(S) {} -void AVRTargetStreamer::finish() { - MCStreamer &OS = getStreamer(); - MCContext &Context = OS.getContext(); - - MCSymbol *DoCopyData = Context.getOrCreateSymbol("__do_copy_data"); - MCSymbol *DoClearBss = Context.getOrCreateSymbol("__do_clear_bss"); - - // FIXME: We can disable __do_copy_data if there are no static RAM variables. - - OS.emitRawComment(" Declaring this symbol tells the CRT that it should"); - OS.emitRawComment("copy all variables from program memory to RAM on startup"); - OS.emitSymbolAttribute(DoCopyData, MCSA_Global); - - OS.emitRawComment(" Declaring this symbol tells the CRT that it should"); - OS.emitRawComment("clear the zeroed data section on startup"); - OS.emitSymbolAttribute(DoClearBss, MCSA_Global); -} - } // end namespace llvm - diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h index 5c4d1a22f6c6..b8b1454a2b8d 100644 --- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h +++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h @@ -18,8 +18,6 @@ class MCStreamer; class AVRTargetStreamer : public MCTargetStreamer { public: explicit AVRTargetStreamer(MCStreamer &S); - - void finish() override; }; /// A target streamer for textual AVR assembly code.
doFinalization is indeed a better place for .globl. You may need to condition the attributes on EmittedStructorSymbolAttrs
Move emission of __do_copy_data and __do_clear_bss into AVRAsmPrinter. (Leave AVRTargetStreamer as base class of AVRELFStreamer.)
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
225 | Leading space is not needed |
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
225 | The leading space seems to be pretty common for calls to emitRawComment, not just in the AVR target but others too. It turns ;Declaring this symbol into ; Declaring this symbol which looks prettier for anyone having to look at the generated assembly. (I'll fix the linter complaint.) |
llvm/lib/Target/AVR/AVRAsmPrinter.cpp | ||
---|---|---|
225 | ok, i thought emitRawComment ensured a space when # is used. Perhaps that's not the case, or not the case for ; |
clang-format: please reformat the code