This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Show data and assignment commands in the map file.
ClosedPublic

Authored by grimar on Mar 2 2018, 5:23 AM.

Details

Summary

With given script:

SECTIONS {
  . = 0x1000;
  .foo : {
    BYTE(0x11)
    SHORT(0x1122)
    LONG(0x11223344)
    QUAD(0x1122334455667788)
    *(.foo.1)
    . += 0x1000;
  }
}

BFD shows data commands and location counter moving in the map file:

.foo          0x0000000000001000      0x117
                0x0000000000001000        0x1 BYTE 0x11
                0x0000000000001001        0x2 SHORT 0x1122
                0x0000000000001003        0x4 LONG 0x11223344
                0x0000000000001007        0x8 QUAD 0x1122334455667788
 *(.foo.1)
 .foo.1         0x000000000000100f        0x8 map-file.test.tmp.o
                0x0000000000001117                . = (. + 0x100)
 *fill*         0x0000000000001017      0x100

Patch teaches LLD to do the same.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

grimar created this revision.Mar 2 2018, 5:23 AM
grimar retitled this revision from [ELF] - Show data commands in a map file. to [ELF] - Show data commands in the map file..Mar 2 2018, 5:24 AM
grimar edited reviewers, added: espindola; removed: rafael.Mar 2 2018, 6:45 AM
ruiu added a comment.Mar 2 2018, 11:25 AM

GNU linker's output (for example . = (. + 0x100)) looks useful but I don't think showing only data command is not that useful given that the usage is low. Also, that's not a difficult or interesting part of debugging linker scripts. So I don't think we don't want this unless you have a plan to extend it to show all linker scripts commands.

grimar planned changes to this revision.Mar 5 2018, 12:53 AM

GNU linker's output (for example . = (. + 0x100)) looks useful but I don't think showing only data command is not that useful given that the usage is low. Also, that's not a difficult or interesting part of debugging linker scripts. So I don't think we don't want this unless you have a plan to extend it to show all linker scripts commands.

Yes, my intention was to support . = (. + 0x100) and this one was a first step. I'll extend the patch and update then.

grimar updated this revision to Diff 136990.Mar 5 2018, 6:47 AM
  • Added support for assignment commands.
grimar retitled this revision from [ELF] - Show data commands in the map file. to [ELF] - Show data and assignment commands in the map file..Mar 5 2018, 8:02 AM
grimar edited the summary of this revision. (Show Details)
ruiu added inline comments.Mar 5 2018, 10:31 AM
ELF/LinkerScript.h
113

StringView isn't a good name because we have std::string_view in modern C++.

115–116

They need a comment.

ELF/ScriptParser.cpp
794

Tok means a token, and position is not a token, so this name is confusing.

795–798

You can use llvm::join, can't you?

grimar updated this revision to Diff 137382.Mar 7 2018, 6:42 AM
grimar marked 4 inline comments as done.
  • Addressed review comments.
ruiu added inline comments.Mar 7 2018, 2:31 PM
ELF/LinkerScript.h
112–113

It is obvious that this comment describe the following variable, so you don't need to mention that.

// A string representation of this command. We use this for -Map.
115

Please insert a blank line before a comment.

115

What do you mean by "before execution of assignment"? Before you assign a command to an output section, it doesn't have any address.

ELF/MapFile.cpp
210

You are passing Cmd->Size as an alignment which seems wrong to me.

ELF/ScriptParser.cpp
118–119

FirstTok and LastTok are not tokens but positions, so you should avoid "Tok".

796

This looks weird -- at least I believe you could do Prefix.str() + " " + ....

804–805

You should change the ctor so that it takes CommandString as the third parameter. Half-initializing an object using the constructor and then set a new value to a member looks weird.

George Rimar via Phabricator via llvm-commits
<llvm-commits@lists.llvm.org> writes:

+# RUN: ld.lld -o %t %t.o -Map=%t.map --script %s
+# RUN: FileCheck -strict-whitespace %s < %t.map

Why do you need -strict-whitespace? The fact that you have a newline in
0x123 * (1+1) should be sufficient, no?

+ // Keeps string representing the expression. Used when printing map file.
+ StringRef StringView;
+
+ unsigned Offset;
+ unsigned Size;

Add a comment about Offset and Size. Looking at a SymbolAssignment one
could assume Offset is the offset in a section and Size the size of the
elf symbol.

Cheers,
Rafael

LGTM

George Rimar via Phabricator via llvm-commits
<llvm-commits@lists.llvm.org> writes:

grimar updated this revision to Diff 137382.
grimar marked 4 inline comments as done.
grimar added a comment.

  • Addressed review comments.

https://reviews.llvm.org/D44004

Files:

ELF/LinkerScript.cpp
ELF/LinkerScript.h
ELF/MapFile.cpp
ELF/ScriptParser.cpp
test/ELF/linkerscript/map-file.test

Index: test/ELF/linkerscript/map-file.test

  • test/ELF/linkerscript/map-file.test

+++ test/ELF/linkerscript/map-file.test
@@ -0,0 +1,38 @@
+# REQUIRES: x86
+
+# RUN: echo ".section .foo.1,\"a\"" > %t.s
+# RUN: echo ".quad 1" >> %t.s
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %t.s -o %t.o
+
+# RUN: ld.lld -o %t %t.o -Map=%t.map --script %s
+# RUN: FileCheck -strict-whitespace %s < %t.map
+
+SECTIONS {
+ . = 0x1000;
+ .foo : {
+ BYTE(0x11)
+ SHORT(0x1122)
+ LONG(0x11223344)
+ QUAD(0x1122334455667788)
+ . += 0x1000;
+ *(.foo.1)
+ . += 0x123 *
+ (1 + 1);
+ foo = .;
+ bar = 0x42 - 0x26;
+ }
+}
+
+# CHECK: Address Size Align Out In Symbol
+# CHECK-NEXT: 0000000000001000 000000000000125d 1 .foo
+# CHECK-NEXT: 0000000000001000 0000000000000001 1 BYTE ( 0x11 )
+# CHECK-NEXT: 0000000000001001 0000000000000002 2 SHORT ( 0x1122 )
+# CHECK-NEXT: 0000000000001003 0000000000000004 4 LONG ( 0x11223344 )
+# CHECK-NEXT: 0000000000001007 0000000000000008 8 QUAD ( 0x1122334455667788 )
+# CHECK-NEXT: 000000000000100f 0000000000001000 1 . += 0x1000
+# CHECK-NEXT: 000000000000200f 0000000000000008 1 {{.*}}{{/|\\}}map-file.test.tmp.o:(.foo.1)
+# CHECK-NEXT: 0000000000002017 0000000000000246 1 . += 0x123 * ( 1 + 1 )
+# CHECK-NEXT: 000000000000225d 0000000000000000 1 foo = .
+# CHECK-NEXT: 000000000000225d 0000000000000000 1 bar = 0x42 - 0x26
+# CHECK-NEXT: 0000000000002260 0000000000000000 4 .text
+# CHECK-NEXT: 0000000000002260 0000000000000000 4 {{.*}}{{/|\\}}map-file.test.tmp.o:(.text)

Index: ELF/ScriptParser.cpp

  • ELF/ScriptParser.cpp

+++ ELF/ScriptParser.cpp
@@ -114,6 +114,10 @@

std::pair<std::vector<SymbolVersion>, std::vector<SymbolVersion>>
readSymbols();

+ // Concatenates tokens in a [FirstTok, LastTok] into string with Prefix.
+ std::string buildCommandString(StringRef Prefix, size_t FirstTok,
+ size_t LastTok);
+

// True if a script being read is in a subdirectory specified by -sysroot.
bool IsUnderSysroot;

@@ -773,15 +777,25 @@

return Cmd;

}

+std::string ScriptParser::buildCommandString(StringRef Prefix, size_t FirstTok,
+ size_t LastTok) {
+ return (Prefix + " ").str() +
+ llvm::join(Tokens.begin() + FirstTok, Tokens.begin() + LastTok, " ");
+}
+
SymbolAssignment *ScriptParser::readAssignment(StringRef Name) {
+ size_t OldPos = Pos;

StringRef Op = next();
assert(Op == "=" || Op == "+=");
Expr E = readExpr();
if (Op == "+=") {
  std::string Loc = getCurrentLocation();
  E = [=] { return add(Script->getSymbolValue(Name, Loc), E()); };
}
  • return make<SymbolAssignment>(Name, E, getCurrentLocation());

+
+ SymbolAssignment *Ret = make<SymbolAssignment>(Name, E, getCurrentLocation());
+ Ret->CommandString = buildCommandString(Name, OldPos, Pos);
+ return Ret;
}

// This is an operator-precedence parser to parse a linker
@@ -935,7 +949,11 @@

               .Default(-1);
if (Size == -1)
  return nullptr;
  • return make<ByteCommand>(readParenExpr(), Size);

+
+ size_t OldPos = Pos;
+ ByteCommand *Ret = make<ByteCommand>(readParenExpr(), Size);
+ Ret->CommandString = buildCommandString(Tok, OldPos, Pos);
+ return Ret;
}

StringRef ScriptParser::readParenLiteral() {

Index: ELF/MapFile.cpp

  • ELF/MapFile.cpp

+++ ELF/MapFile.cpp
@@ -138,11 +138,29 @@

OS << OSec->Name << '\n';
 
// Dump symbols for each input section.
  • for (InputSection *IS : getInputSections(OSec)) {
  • writeHeader(OS, OSec->Addr + IS->OutSecOff, IS->getSize(), IS->Alignment);
  • OS << Indent1 << toString(IS) << '\n';
  • for (Symbol *Sym : SectionSyms[IS])
  • OS << SymStr[Sym] << '\n';

+ for (BaseCommand *Base : OSec->SectionCommands) {
+ if (auto *ISD = dyn_cast<InputSectionDescription>(Base)) {
+ for (InputSection *IS : ISD->Sections) {
+ writeHeader(OS, OSec->Addr + IS->OutSecOff, IS->getSize(),
+ IS->Alignment);
+ OS << Indent1 << toString(IS) << '\n';
+ for (Symbol *Sym : SectionSyms[IS])
+ OS << SymStr[Sym] << '\n';
+ }
+ continue;
+ }
+
+ if (auto *Cmd = dyn_cast<ByteCommand>(Base)) {
+ writeHeader(OS, OSec->Addr + Cmd->Offset, Cmd->Size, Cmd->Size);
+ OS << Indent1 << Cmd->CommandString << '\n';
+ continue;
+ }
+
+ if (auto *Cmd = dyn_cast<SymbolAssignment>(Base)) {
+ writeHeader(OS, OSec->Addr + Cmd->Offset, Cmd->Size, 1);
+ OS << Indent1 << Cmd->CommandString << '\n';
+ continue;
+ }

  }
}

}

Index: ELF/LinkerScript.h

  • ELF/LinkerScript.h

+++ ELF/LinkerScript.h
@@ -106,6 +106,14 @@

// Holds file name and line number for error reporting.
std::string Location;

+
+ Following attributes are used for printing the map file.
+
Keeps string representing the command.
+ std::string CommandString;
+ The offset value within the section before execution of assignment.
+ unsigned Offset;
+
The value by which the assignment changed the output section size.
+ unsigned Size;
};

// Linker scripts allow additional constraints to be put on ouput sections.
@@ -183,6 +191,9 @@

static bool classof(const BaseCommand *C) { return C->Kind == ByteKind; }

+ // Keeps string representing the command. Used when printing map file.
+ std::string CommandString;
+

Expr Expression;
unsigned Offset;
unsigned Size;

Index: ELF/LinkerScript.cpp

  • ELF/LinkerScript.cpp

+++ ELF/LinkerScript.cpp
@@ -739,7 +739,9 @@

for (BaseCommand *Base : Sec->SectionCommands) {
  // This handles the assignments to symbol or to the dot.
  if (auto *Cmd = dyn_cast<SymbolAssignment>(Base)) {

+ Cmd->Offset = Dot - Ctx->OutSec->Addr;

assignSymbol(Cmd, true);

+ Cmd->Size = Dot - Ctx->OutSec->Addr - Cmd->Offset;

  continue;
}

llvm-commits mailing list
llvm-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

grimar updated this revision to Diff 137576.Mar 8 2018, 8:03 AM
grimar marked 4 inline comments as done.
  • Addressed review comments.
ELF/LinkerScript.h
115

I mean that if we have a section description:

0  .foo : {
1    *(.bar)
2    . += 0x123;
3  }

Then at line 2, we have an assignment to Dot.
We handle commands one by one and offset before we process
("before execution of assignment (command)" in my wording) the assignment will
be equal to the size of .bar. That offset is saved to Offset member.
After execution of assignment command, offset will be different and Size below
stores the difference.

I rewrote the comment slightly.

ELF/MapFile.cpp
210

Right, that was incorrect, changed to 1.

ELF/ScriptParser.cpp
118–119

Not sure what is wrong here honestly.
I think first and last can be both adjectives and nouns. I meant that they are adjectives describing the
tokens and hence they are not positions here I think.
I renamed to First/Last though as such naming is also OK for me.

George Rimar via Phabricator <reviews@reviews.llvm.org> writes:

grimar added inline comments.

Comment at: ELF/LinkerScript.h:113
+ std::string CommandString;
+ // The offset value within the section before execution of assignment.

+ unsigned Offset;

ruiu wrote:

ruiu wrote:

Please insert a blank line before a comment.

What do you mean by "before execution of assignment"? Before you assign a command to an output section, it doesn't have any address.

I mean that if we have a section description:

0  .foo : {
1    *(.bar)
2    . += 0x123;
3  }

Then at line 2, we have an assignment to Dot.
We handle commands one by one and offset before we process
("before execution of assignment (command)" in my wording) the assignment will
be equal to the size of .bar. That offset is saved to Offset member.
After execution of assignment command, offset will be different and Size below
stores the difference.

I rewrote the comment slightly.

Comment at: ELF/MapFile.cpp:154
+ if (auto *Cmd = dyn_cast<ByteCommand>(Base)) {
+ writeHeader(OS, OSec->Addr + Cmd->Offset, Cmd->Size, Cmd->Size);

+ OS << Indent1 << Cmd->CommandString << '\n';

ruiu wrote:

You are passing Cmd->Size as an alignment which seems wrong to me.

Right, that was incorrect, changed to 1.

Comment at: ELF/ScriptParser.cpp:118-119
+ // Concatenates tokens in a [FirstTok, LastTok] into string with Prefix.
+ std::string buildCommandString(StringRef Prefix, size_t FirstTok,
+ size_t LastTok);

+

ruiu wrote:

FirstTok and LastTok are not tokens but positions, so you should avoid "Tok".

Not sure what is wrong here honestly.
I think first and last can be both adjectives and nouns. I meant that they are adjectives describing the
tokens and hence they are not positions here I think.
I renamed to First/Last though as such naming is also OK for me.

I think Rui's objection is that one would expect a *Tok variable to be
a StringRef. What you have are token number/indexes. I am ok with
First/Last, but maybe FirstTokI/LastTokI would be best.

Cheers,
Rafael

ruiu wrote:

FirstTok and LastTok are not tokens but positions, so you should avoid "Tok".

Not sure what is wrong here honestly.
I think first and last can be both adjectives and nouns. I meant that they are adjectives describing the
tokens and hence they are not positions here I think.
I renamed to First/Last though as such naming is also OK for me.

I think Rui's objection is that one would expect a *Tok variable to be
a StringRef. What you have are token number/indexes. I am ok with
First/Last, but maybe FirstTokI/LastTokI would be best.

Cheers,
Rafael

Ah, thanks, now I see my mistake, I misunderstood initial objection.
Both ways of new namings look OK to me I think.

George Rimar via Phabricator <reviews@reviews.llvm.org> writes:

+ // Concatenates tokens in a [First, Last] into string with Prefix.
+ std::string buildCommandString(StringRef Prefix, size_t First, size_t Last);

It is [First, Last), no?

LGTM with that.

Cheers,
Rafael

ruiu accepted this revision.Mar 14 2018, 10:47 AM

LGTM with these changes.

ELF/LinkerScript.h
115

Just "This is just an offset of this assignment command in the output section" should be more clear. "before processing the assignment" doesn't make much sense to me.

118

"Size of this assignment command. This is usually 0, but if you move '.' or use a BYTE()-family command, this may be greater than 0."

199

"Used for -Map" is perhaps better.

ELF/ScriptParser.cpp
794

I think it doesn't have to be a function; actually I believe it is easier to read if you inline it.

This revision is now accepted and ready to land.Mar 14 2018, 10:47 AM
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.
grimar marked 3 inline comments as done.