This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Fix non-standard escape warnings for braces in InstAlias
ClosedPublic

Authored by c-rhodes on May 15 2020, 2:56 AM.

Details

Summary

TableGen interprets braces ('{}') in the asm string of instruction aliases as
variants but when defining aliases with literal braces they have to be escaped
to prevent them being removed.

Braces are escaped with '\\', for example:

def FooBraces : InstAlias<"foo \\{$imm\\}", (foo IntOperand:$imm)>;

Although when TableGen is emitting the assembly writer (-gen-asm-writer)
the AsmString that gets emitted is:

AsmString = "foo \{$\x01\}";

In c/c++ braces don't need to be escaped which causes compilation
warnings:

warning: use of non-standard escape character '\{' [-Wpedantic]

This patch fixes the issue by unescaping the flattened alias asm string
in the asm writer, by replacing '\{\}' with '{}'.

Diff Detail

Event Timeline

c-rhodes created this revision.May 15 2020, 2:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2020, 2:56 AM
hfinkel added inline comments.
llvm/utils/TableGen/AsmWriterEmitter.cpp
827

Is this unescaping the right thing to do in general, or do we really just want to do it for the braces? What if there were an escaped quote in the string?

c-rhodes added inline comments.May 19 2020, 4:48 AM
llvm/utils/TableGen/AsmWriterEmitter.cpp
827

We probably only want to do this for braces. I wasn't too sure about adding this UnescapeString incase we get any undesired behaviour but the tests didn't raise any issues. Maybe it would be better if I create a new function UnescapeAliasString to handle this particular case?

hfinkel added inline comments.May 19 2020, 5:26 AM
llvm/utils/TableGen/AsmWriterEmitter.cpp
827

I recommend adding some test cases with some other escapes (quotes, new lines) and, if they don't work correctly, add a UnescapeAliasString. If everything does work correctly, then leave it as-is + the extra tests.

c-rhodes added inline comments.May 20 2020, 5:28 AM
llvm/utils/TableGen/AsmWriterEmitter.cpp
827

I tried some other escapes but don't really know what to make of it as the current behaviour doesn't look right.

I defined an alias containing an escaped quote as you suggested:

def FooQuote : InstAlias<"foo \"$imm\"", (foo IntOperand:$imm)>;

and generated the assembly writer with:

$ ./bin/llvm-tblgen -gen-asm-writer -I ../llvm/include ../llvm/test/TableGen/AliasAsmString.td -o AsmStringWriter.inc

the asm string that gets emitted is invalid c++:

static const char AsmStrings[] =
  /* 0 */ "foo "$\x01"\0"
;

but the behaviour is the same without this patch. I'm tempted to add UnescapeAliasString even though this patch isn't changing escapes in the alias string because I don't know what the correct behaviour is and I don't want to introduce tests that validate potentially incorrect behaviour. Any thoughts?

c-rhodes updated this revision to Diff 266259.May 26 2020, 10:08 AM

Changes:

  • Move unescaping of braces in InstAlias asm string from UnescapeString to UnecapeAliasString .
hfinkel accepted this revision.May 26 2020, 10:39 AM

LGTM

llvm/utils/TableGen/AsmWriterEmitter.cpp
827

Fair enough; the non-compilable code certainly isn't useful, but maybe we've never really supported things in the strings that would need escaping when output anyway. If so, this is a special case distinct from those.

This revision is now accepted and ready to land.May 26 2020, 10:39 AM
This revision was automatically updated to reflect the committed changes.