[cxx-abi-dev] Mangling of string literals versus variadic templates

David Vandevoorde daveed at edg.com
Tue Dec 17 19:10:50 UTC 2013


On Dec 16, 2013, at 8:10 PM, Richard Smith <richardsmith at google.com> wrote:

> Hi,
> 
> Consider:
> 
>   void g(...);
>   template<int...N> inline const char *f() { g("foo" + N ...); return "bar"; }
>   const char *p = f();
>   const char *q = f<0>();
>   const char *r = f<0, 1>();
> 
> In f<>, what is the mangling of the "bar" string literal? Is this the first string literal or the second?

I think it has to be the second.


> The ABI document says "In all cases the numbering order is strictly lexical order based on the original token sequence", but it's not obvious what that would mean for a template instantiation that discards tokens, as this one does.

Why would it mean something different if you discard tokens?  The internal representation probably still knows there was a string literal involved and could associate a discriminating sequence number with it, no?

> Now consider:
> 
>   template<int...N> inline const char *h() { g([]{return "foo";}() + N ...); return "bar"; }
>   const char *p = h();
>   const char *q = h<0>();
>   const char *r = h<0, 1>();
> 
> What happens here? Does the string literal inside the lambda get a number in the context of the outer function as well as a number within the lambda? It appears within the original token sequence...


I agree the spec misses this, but I think numbering should not include literals (or other entities) that are in a nested function body.  I.e., "bar" would be the first string literal in all instances of h.


> EDG is the only vendor I can find that provides manglings for string literals at all, and its results here are surprising. It provides these manglings for "bar":
> 
>   f<>: _ZZ1fIJEEPKcvEs_0
>   f<0>: _ZZ1fIJLi0EEEPKcvEs_0
>   f<0, 1>: _ZZ1fIJLi0ELi1EEEPKcvEs_0
> 
> These suggest that EDG includes "foo" in the numbering, even though it is not actually part of f<>. But then:
> 
>   h<>: _ZZ1hIJEEPKcvEs_0
>   h<0>: _ZZ1hIJLi0EEEPKcvEs
>   h<0, 1>: _ZZ1hIJLi0ELi1EEEPKcvEs
> 
> These seem very surprising. "foo" is included in the numbering *only* in the case where it doesn't actually appear in the instantiated function body.


I suspect that's just a bug.

> 
> 
> Suggestion: change in 5.1.6:
> 
> "In all cases the numbering order is strictly lexical order based on the original token sequence<ins>, excluding any tokens that are part of the body of a nested entity</ins>. All entities occurring in that sequence are to be numbered, even if subsequent optimization <ins>or (in the case of a string literal) expansion of an empty parameter pack</ins> makes some of them unnecessary."


That sounds good.
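
To make the numbering concrete, here is a minimal sketch of how the manglings quoted in this thread are assembled from a function encoding and the 1-based lexical position of the string literal. The helper name mangle_string_literal is hypothetical, and the two-underscore form for discriminators of 10 or more is taken from the ABI's general <discriminator> production rather than from anything in this thread.

```cpp
#include <string>

// Hypothetical helper: builds the local name  Z <encoding> E s [<discriminator>]
// for the ordinal-th string literal (1-based, lexical order) in a function.
// Precondition: ordinal >= 1.
inline std::string mangle_string_literal(const std::string& encoding,
                                         unsigned ordinal) {
    std::string out = "_ZZ" + encoding + "Es";
    if (ordinal >= 2) {
        // The first literal has no discriminator; the second gets _0, and so on.
        unsigned n = ordinal - 2;
        if (n < 10) {
            out += "_" + std::to_string(n);
        } else {
            // Assumption: the ABI's form for discriminators >= 10.
            out += "__" + std::to_string(n) + "_";
        }
    }
    return out;
}
```

Under the proposed wording, "bar" is the second literal in every instance of f (ordinal 2, so f<> gives _ZZ1fIJEEPKcvEs_0) and the first in every instance of h (ordinal 1, so h<> gives _ZZ1hIJEEPKcvEs).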

> 
> This would make EDG correct, except for the h<> case, where the mangling would be _ZZ1hIJEEPKcvEs
> 
> Another related issue is with user-defined literals. If 123_x appears in an inline function, and implicitly calls operator""_x("123"), the implicit string literal should (presumably) be assigned a mangling number. If it calls operator""_x(123ULL), a mangling number should presumably not be assigned.


Ah yes, nice catch.
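
For reference, a small sketch of the two cases being distinguished. The suffixes _raw and _cooked and the wrapper functions are made up for illustration (the thread uses a single suffix _x); two suffixes are used here only so that both forms can appear side by side.

```cpp
#include <cstddef>
#include <cstring>

// Raw literal operator: 123_raw is treated as operator""_raw("123"),
// so an implicit string literal is passed to the operator.
inline std::size_t operator""_raw(const char* digits) {
    return std::strlen(digits);
}

// Cooked literal operator: 123_cooked is treated as operator""_cooked(123ULL),
// so no string literal is involved.
inline unsigned long long operator""_cooked(unsigned long long value) {
    return value;
}

// Inline functions containing each kind of user-defined literal.
inline std::size_t raw_length() { return 123_raw; }
inline unsigned long long cooked_value() { return 123_cooked; }
```

Only raw_length contains an implicit string literal ("123"), which is the case the suggested wording would number.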

> 
> Suggestion: insert after previously-quoted text from 5.1.6:
> 
> If a user-defined-literal implicitly passes a string literal to a literal operator, the user-defined-literal token is numbered as if it were a string literal token.

Hmmm, would it be better to number all user-defined literals without having to worry about how they'll be transformed?

	Daveed


