This is an archive of the discontinued LLVM Phabricator instance.

[AVR] emit `MCSA_Global` references to `__do_global_ctors` and `__do_global_dtors`
ClosedPublic

Authored by mhjacobson on Jul 29 2021, 11:50 PM.

Details

Summary

Emit references to __do_global_ctors and __do_global_dtors to allow constructor/destructor routines to run.

Diff Detail

Event Timeline

mhjacobson created this revision.Jul 29 2021, 11:50 PM
mhjacobson requested review of this revision.Jul 29 2021, 11:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2021, 11:50 PM

I roughly followed the example of how __do_copy_data and __do_clear_bss are linked, in AVRTargetStreamer::finish().

We need a test case to show this change, please refer to tests in llvm/test/CodeGen or llvm/test/MC .

Added a test.

benshi001 added inline comments.Jul 30 2021, 6:31 PM
llvm/test/CodeGen/AVR/ctors.ll
2

Please use llvm's builtin script to make your test case better formatted.

python3 llvm-project/llvm/utils/update_llc_test_checks.py llvm-project/llvm/test/CodeGen/AVR/ctors.ll

Updated test with update_llc_test_checks.py. Note that update_llc_test_checks.py doesn't actually support AVR at the moment! I had to apply this fix first:

diff --git a/llvm/utils/UpdateTestChecks/asm.py b/llvm/utils/UpdateTestChecks/asm.py
index 2a18899d55f0..6c878dd742bc 100644
--- a/llvm/utils/UpdateTestChecks/asm.py
+++ b/llvm/utils/UpdateTestChecks/asm.py
@@ -360,6 +360,7 @@ def get_triple_from_march(march):
       'mips': 'mips',
       'sparc': 'sparc',
       'hexagon': 'hexagon',
+      'avr': 'avr',
   }
   for prefix, triple in triples.items():
     if march.startswith(prefix):
mhjacobson marked an inline comment as done.Jul 30 2021, 9:24 PM

Fixed alphabetization of includes.

Argh, the test fails now, because the ordering is wrong. Fixing it now.

benshi001 added inline comments.Jul 30 2021, 10:36 PM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
64

It would not be good to use such a flag. But would be better to put your code into MCTargetDesc/AVRTargetStreamer.cpp where __do_copy_data is emitted, and make a check like

if (current module has constructor)
  emit do_global_ctors
if (current module has destructor)
  emit do_global_dtors
206

Do we need to emit some comments for __do_global_dtors as __do_copy_data has ?

Fix the ordering of the directives in the test, so that it actually passes! :)

mhjacobson added inline comments.Jul 30 2021, 10:51 PM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
64

I'm not sure how to know, from AVRTargetStreamer, whether there are constructors/destructors.

If we don't want to keep EmittedStructorSymbolAttrs, then I suppose an alternative would be for AVRAsmPrinter::emitXXStructor() *always* to emit the global symbols. It won't hurt anything for them to be emitted more than once -- it's just theoretically inefficient.

Can you explain more about your concern with EmittedStructorSymbolAttrs? Is it bad to keep state inside AVRAsmPrinter?

206

Good idea -- I'll do that now.

Add assembly comment.

mhjacobson marked an inline comment as done.Jul 30 2021, 11:05 PM

Doesn't AVR use a section? Using magic symbols looks weird.

You're right, AVR does have a section (.ctors) for the *table* of constructors. But there has to be some code that actually calls the functions in that table. That's what __do_global_ctors does. When present, it gets linked into the boot path (after things like __do_copy_data and __do_clear_bss) to do its work. But if nothing requests it, it won't be linked, and the constructor table will just sit around, never to be consulted.

I see this code as similar to the code that links in __do_copy_data, which I think would also fall into the category of "magic symbols"—symbols that are necessary to make the program run correctly on AVR.

benshi001 added inline comments.Jul 31 2021, 1:01 AM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
64

Yes. I do concern keep any state inside AVRAsmPrinter ? How about Dylan's opinion?

Conform to source formatting convention.

benshi001 added inline comments.Aug 1 2021, 6:52 PM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
64

I really do not like using a flag to keep the emitted state. How about only directly emit them in MCTargetDesc/AVRTargetStreamer.cpp, and add a FIXME, as for __do_copy_data ?

void AVRTargetStreamer::finish() {
  MCStreamer &OS = getStreamer();
  MCContext &Context = OS.getContext();

  MCSymbol *DoCopyData = Context.getOrCreateSymbol("__do_copy_data");
  MCSymbol *DoClearBss = Context.getOrCreateSymbol("__do_clear_bss");

  // FIXME: We can disable __do_copy_data if there are no static RAM variables.

  OS.emitRawComment(" Declaring this symbol tells the CRT that it should");
  OS.emitRawComment("copy all variables from program memory to RAM on startup");
  OS.emitSymbolAttribute(DoCopyData, MCSA_Global);

  OS.emitRawComment(" Declaring this symbol tells the CRT that it should");
  OS.emitRawComment("clear the zeroed data section on startup");
  OS.emitSymbolAttribute(DoClearBss, MCSA_Global);
}

Sure, I'm open to that.

It's rare for a program not to use any global variables, whereas programs that don't use any constructor/destructor functions are pretty common. Luckily, though, __do_global_ctors is pretty cheap (just a few instructions) if the constructor table is empty.

So I think your suggestion would be fine.

But can you explain more the reason why we don't want this state in AVRAsmPrinter? If I'm reading correctly, the superclass, AsmPrinter, keeps mutable state, and (AFAICT) it's not unusual for other targets' AsmPrinter subclasses to keep state, either. I'm admittedly not an expert here though, so I'd like to learn why it's a problem.

Thanks!

Sure, I'm open to that.

It's rare for a program not to use any global variables, whereas programs that don't use any constructor/destructor functions are pretty common. Luckily, though, __do_global_ctors is pretty cheap (just a few instructions) if the constructor table is empty.

So I think your suggestion would be fine.

But can you explain more the reason why we don't want this state in AVRAsmPrinter? If I'm reading correctly, the superclass, AsmPrinter, keeps mutable state, and (AFAICT) it's not unusual for other targets' AsmPrinter subclasses to keep state, either. I'm admittedly not an expert here though, so I'd like to learn why it's a problem.

Thanks!

I see your idea. It is OK to keep states in AsmPrinter. But I still would like to see if there is a unique way for all such '.globl' symbol declaration.

Maybe either temporarily put them all in AVRTargetStreamer::finish and add FIXMEs, or put all in AsmPrinter with flags ( a bit flag variable or something else)?

What about Dylan's idea?

Just brainstorming. Another option might be to move the code in AVRTargetStreamer::finish() into a new override AVRAsmPrinter::doFinalization(). That would make it so that both bits of .globl code are in AVRAsmPrinter.

Just brainstorming. Another option might be to move the code in AVRTargetStreamer::finish() into a new override AVRAsmPrinter::doFinalization(). That would make it so that both bits of .globl code are in AVRAsmPrinter.

Here's what that would look like (atop the change currently here):

diff --git a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp
index f76034e7f115..7b7ede62a00f 100644
--- a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp
+++ b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp
@@ -59,6 +59,8 @@ public:
 
   void emitXXStructor(const DataLayout &DL, const Constant *CV) override;
 
+  bool doFinalization(Module &M) override;
+
 private:
   const MCRegisterInfo &MRI;
   bool EmittedStructorSymbolAttrs = false;
@@ -217,6 +219,23 @@ void AVRAsmPrinter::emitXXStructor(const DataLayout &DL, const Constant *CV) {
   AsmPrinter::emitXXStructor(DL, CV);
 }
 
+bool AVRAsmPrinter::doFinalization(Module &M) {
+  MCSymbol *DoCopyData = OutContext.getOrCreateSymbol("__do_copy_data");
+  MCSymbol *DoClearBss = OutContext.getOrCreateSymbol("__do_clear_bss");
+
+  // FIXME: We can disable __do_copy_data if there are no static RAM variables.
+
+  OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should");
+  OutStreamer->emitRawComment("copy all variables from program memory to RAM on startup");
+  OutStreamer->emitSymbolAttribute(DoCopyData, MCSA_Global);
+
+  OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should");
+  OutStreamer->emitRawComment("clear the zeroed data section on startup");
+  OutStreamer->emitSymbolAttribute(DoClearBss, MCSA_Global);
+
+  return AsmPrinter::doFinalization(M);
+}
+
 } // end of namespace llvm
 
 extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAVRAsmPrinter() {
diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp
index eccd343d79ab..56e0e7810466 100644
--- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp
+++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp
@@ -21,23 +21,4 @@ AVRTargetStreamer::AVRTargetStreamer(MCStreamer &S) : MCTargetStreamer(S) {}
 AVRTargetAsmStreamer::AVRTargetAsmStreamer(MCStreamer &S)
     : AVRTargetStreamer(S) {}
 
-void AVRTargetStreamer::finish() {
-  MCStreamer &OS = getStreamer();
-  MCContext &Context = OS.getContext();
-
-  MCSymbol *DoCopyData = Context.getOrCreateSymbol("__do_copy_data");
-  MCSymbol *DoClearBss = Context.getOrCreateSymbol("__do_clear_bss");
-
-  // FIXME: We can disable __do_copy_data if there are no static RAM variables.
-
-  OS.emitRawComment(" Declaring this symbol tells the CRT that it should");
-  OS.emitRawComment("copy all variables from program memory to RAM on startup");
-  OS.emitSymbolAttribute(DoCopyData, MCSA_Global);
-
-  OS.emitRawComment(" Declaring this symbol tells the CRT that it should");
-  OS.emitRawComment("clear the zeroed data section on startup");
-  OS.emitSymbolAttribute(DoClearBss, MCSA_Global);
-}
-
 } // end namespace llvm
-
diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h
index 5c4d1a22f6c6..b8b1454a2b8d 100644
--- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h
+++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h
@@ -18,8 +18,6 @@ class MCStreamer;
 class AVRTargetStreamer : public MCTargetStreamer {
 public:
   explicit AVRTargetStreamer(MCStreamer &S);
-
-  void finish() override;
 };
 
 /// A target streamer for textual AVR assembly code.

Appeasing linter.

MaskRay requested changes to this revision.Aug 3 2021, 8:40 PM

Defining __do_global_ctors looks like ld's work (in GNU ld), not the assembler's.

This revision now requires changes to proceed.Aug 3 2021, 8:40 PM

Defining __do_global_ctors looks like ld's work (in GNU ld), not the assembler's.

Maybe—though note that GCC does handle this in the compiler, and LLVM itself handles the similar symbols __do_copy_data and __do_clear_bss in the compiler (as we've been discussing here).

Adding such functionality to the GNU linker probably isn't straightforward. The linker has no concept of constructor functions, AFAIK. The only thing that directs the .ctors table to be linked in is the AVR linker script.

Assuming no such fix in the linker is imminent, could we agree that supporting this in LLVM would be a good improvement for now?

If GCC does emit __do_global_ctors, having parity is fine.

If AVR supports COMDAT, make sure you test the case when the third field references a function in a comdat.

llvm/lib/Target/AVR/AVRAsmPrinter.cpp
203

delete period

207

delete

ELF doesn't need setExternal

llvm/test/CodeGen/AVR/ctors.ll
12

This is insufficient. It should additionally check whether the label is defined.

mhjacobson added inline comments.Aug 3 2021, 9:11 PM
llvm/test/CodeGen/AVR/ctors.ll
12

The label isn't expected to be defined. The symbol comes from libgcc.a.

This directive simply tells the assembler to emit an undefined-symbol reference to __do_global_ctors. The actual implementation comes at link time, from libgcc.

For comparison, see llvm/test/CodeGen/AVR/clear-bss.ll, which does the same thing.

(Apologies if I'm misunderstanding your request here.)

MaskRay added inline comments.Aug 3 2021, 9:18 PM
llvm/test/CodeGen/AVR/ctors.ll
12

OK. This is resolved. Thanks for the reply.

Please add ;; Emit .globl __do_global_ctors to leave an undefined symbol. This matches GNU as and is needed to pull the definition from libgcc.a or something similar.

Removed incorrect use of setExternal().

Tweaked assembly comment.

The COMDAT handling is in AsmPrinter::emitXXStructorList() and is unchanged by this patch.

mhjacobson marked 4 inline comments as done.Aug 3 2021, 10:04 PM
mhjacobson added inline comments.
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
207

Good catch, thanks.

MaskRay accepted this revision.Aug 3 2021, 10:22 PM

LGTM.

This revision is now accepted and ready to land.Aug 3 2021, 10:22 PM
mhjacobson marked an inline comment as done.Aug 3 2021, 10:24 PM

@MaskRay, do you have an opinion on whether I should move the other undefined-symbols definitions into AVRAsmPrinter.cpp to match?

Specifically, something like this:

diff --git a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp
index f76034e7f115..7b7ede62a00f 100644
--- a/llvm/lib/Target/AVR/AVRAsmPrinter.cpp
+++ b/llvm/lib/Target/AVR/AVRAsmPrinter.cpp
@@ -59,6 +59,8 @@ public:
 
   void emitXXStructor(const DataLayout &DL, const Constant *CV) override;
 
+  bool doFinalization(Module &M) override;
+
 private:
   const MCRegisterInfo &MRI;
   bool EmittedStructorSymbolAttrs = false;
@@ -217,6 +219,23 @@ void AVRAsmPrinter::emitXXStructor(const DataLayout &DL, const Constant *CV) {
   AsmPrinter::emitXXStructor(DL, CV);
 }
 
+bool AVRAsmPrinter::doFinalization(Module &M) {
+  MCSymbol *DoCopyData = OutContext.getOrCreateSymbol("__do_copy_data");
+  MCSymbol *DoClearBss = OutContext.getOrCreateSymbol("__do_clear_bss");
+
+  // FIXME: We can disable __do_copy_data if there are no static RAM variables.
+
+  OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should");
+  OutStreamer->emitRawComment("copy all variables from program memory to RAM on startup");
+  OutStreamer->emitSymbolAttribute(DoCopyData, MCSA_Global);
+
+  OutStreamer->emitRawComment(" Declaring this symbol tells the CRT that it should");
+  OutStreamer->emitRawComment("clear the zeroed data section on startup");
+  OutStreamer->emitSymbolAttribute(DoClearBss, MCSA_Global);
+
+  return AsmPrinter::doFinalization(M);
+}
+
 } // end of namespace llvm
 
 extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAVRAsmPrinter() {
diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp
index eccd343d79ab..56e0e7810466 100644
--- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp
+++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.cpp
@@ -21,23 +21,4 @@ AVRTargetStreamer::AVRTargetStreamer(MCStreamer &S) : MCTargetStreamer(S) {}
 AVRTargetAsmStreamer::AVRTargetAsmStreamer(MCStreamer &S)
     : AVRTargetStreamer(S) {}
 
-void AVRTargetStreamer::finish() {
-  MCStreamer &OS = getStreamer();
-  MCContext &Context = OS.getContext();
-
-  MCSymbol *DoCopyData = Context.getOrCreateSymbol("__do_copy_data");
-  MCSymbol *DoClearBss = Context.getOrCreateSymbol("__do_clear_bss");
-
-  // FIXME: We can disable __do_copy_data if there are no static RAM variables.
-
-  OS.emitRawComment(" Declaring this symbol tells the CRT that it should");
-  OS.emitRawComment("copy all variables from program memory to RAM on startup");
-  OS.emitSymbolAttribute(DoCopyData, MCSA_Global);
-
-  OS.emitRawComment(" Declaring this symbol tells the CRT that it should");
-  OS.emitRawComment("clear the zeroed data section on startup");
-  OS.emitSymbolAttribute(DoClearBss, MCSA_Global);
-}
-
 } // end namespace llvm
-
diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h
index 5c4d1a22f6c6..b8b1454a2b8d 100644
--- a/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h
+++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRTargetStreamer.h
@@ -18,8 +18,6 @@ class MCStreamer;
 class AVRTargetStreamer : public MCTargetStreamer {
 public:
   explicit AVRTargetStreamer(MCStreamer &S);
-
-  void finish() override;
 };
 
 /// A target streamer for textual AVR assembly code.

doFinalization is indeed a better place for .globl. You may need to condition the attributes on EmittedStructorSymbolAttrs

Move emission of __do_copy_data and __do_clear_bss into AVRAsmPrinter. (Leave AVRTargetStreamer as base class of AVRELFStreamer.)

MaskRay added inline comments.Aug 4 2021, 12:20 AM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
225

Leading space is not needed

mhjacobson added inline comments.Aug 4 2021, 12:23 AM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
225

The leading space seems to be pretty common for calls to emitRawComment, not just in the AVR target but others too. It turns

;Declaring this symbol

into

; Declaring this symbol

which looks prettier for anyone having to look at the generated assembly.

(I'll fix the linter complaint.)

Appeased the linter.

MaskRay added inline comments.Aug 4 2021, 11:26 AM
llvm/lib/Target/AVR/AVRAsmPrinter.cpp
225

ok, i thought emitRawComment ensured a space when # is used. Perhaps that's not the case, or not the case for ;

mhjacobson marked 2 inline comments as done.Aug 4 2021, 3:25 PM

Thanks for your help, @benshi001 and @MaskRay!