Local name discriminators
David Vandevoorde
daveed at edg.com
Mon Jun 22 18:56:56 UTC 2009
5.1.6 "Scope Encoding" has this to say (among other things):
> Occasionally entities in local scopes must be mangled too (e.g.
> because inlining or template compilation causes multiple translation
> units to require access to that entity). The encoding for such
> entities is as follows:
>
>   <local-name> := Z <function encoding> E <entity name> [<discriminator>]
>                := Z <function encoding> E s [<discriminator>]
> <discriminator> := _ <non-negative number>
> The first production is used for named local static objects and
> classes, which are identified by their declared names. The <entity
> name> may itself be a compound name, but it is relative to the
> closest enclosing function, i.e. none of the components of the
> function encoding appear in the entity name.
This seems to suggest that the first production doesn't apply to
member functions of local classes, nor to local enumeration types.
I assume that's unintentional since the next sentence says:
> It is possible to have nested function scopes, e.g. when dealing
> with a member function in a local class. In such cases, the function
> encoding will itself have <local-name> structure.
and the other production of <local-name> doesn't apply at all.
Now consider the following example:
    void x() {
      { struct X {}; }
      struct X {
        void foo() { foo(); }  // #1
      } x1;
      x1.foo();
      { struct X {
          void foo() { foo(); }  // #2
        } x2;
        x2.foo();
      }
    }
g++ produces the following mangled names for the X::foo members:
_ZZ1xvEN1X3fooE_0v for #1
_ZZ1xvEN1X3fooE_1v for #2
Note that both have discriminators, for which the spec says:
> The discriminator is used only for the second and later occurrences
> of the same name within a single function. In this case <number> is
> n - 2, if this is the nth occurrence, in lexical order, of the given
> name.
The "same name" here is X::foo and #1 is the first occurrence (no
discriminator needed) while #2 is the second occurrence (discriminator
value 0). So I would've expected instead:
_ZZ1xvEN1X3fooEv for #1
_ZZ1xvEN1X3fooE_0v for #2
Is this correct?
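To make that expectation concrete, here is a minimal sketch of the rule as
quoted from the spec (no suffix for the first occurrence, "_<n-2>" for the
nth). The helper names discriminator_suffix and local_name are made up for
illustration; the function encoding "1xv" and entity name "N1X3fooE" are
taken from the mangled names above. It is only a reading of the quoted text,
not what any compiler implements.

```cpp
#include <cassert>
#include <string>

// Suffix for the n-th occurrence (1-based, in lexical order) of a local
// name within one function, per the quoted rule: nothing for the first
// occurrence, "_<n-2>" for the second and later ones.
std::string discriminator_suffix(unsigned occurrence) {
    assert(occurrence >= 1);
    if (occurrence == 1) return "";
    return "_" + std::to_string(occurrence - 2);
}

// Hypothetical helper that assembles
//   Z <function encoding> E <entity name> [<discriminator>]
std::string local_name(const std::string& function_encoding,
                       const std::string& entity_name,
                       unsigned occurrence) {
    return "Z" + function_encoding + "E" + entity_name +
           discriminator_suffix(occurrence);
}
```

With these helpers, "_Z" + local_name("1xv", "N1X3fooE", 1) + "v" gives the
name expected for #1, and passing occurrence 2 gives the name expected for #2.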
(EDG has a different interpretation:
_ZZ1xvEN1X_03fooEv for #1
_ZZ1xvEN1X_13fooEv for #2
but that's a bug too.)
We could change the spec to ensure that g++'s approach is
"standard" (assuming that I understand what g++ really does here).
I.e., we could specify that the "discriminator" discriminates the "top-
level component" of colliding local names. I think that would be
sufficient.
Now consider a different example:
    class C {} c;
    inline int g() {
      { struct X {}; }
      { struct X {}; }
      struct X {} x;
      struct Y { int f(X x, C c) { return f(x, c); }; } y;
      return y.f(x, c) + g();
    };
    int main() {
      return g();
    }
(The pointless recursion is a quick hack to force compilers to emit the
inline functions out of line.)
The mangling for g()::Y::f is
    ZZ1gvEN1Y1fEZ1gvE1X_11C
                       ^^^
The problem here is that there is no delimiter after the discriminator
"_1" to separate it from the "1" that indicates the length of the
class name "C". So this cannot in general be demangled. (Such
situations become more common in C++0x where local classes can be
template arguments.)
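The problem can be illustrated with a small sketch that lists every way a
run of digits after "_" could be divided between a discriminator and the
length prefix of a following <source-name>. The type Split and the function
candidate_splits are hypothetical names used only here; the sketch ignores
every other production that could follow a discriminator, so it
under-approximates what a real demangler has to consider.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One possible reading of "_<digits><identifier>...".
struct Split {
    unsigned long discriminator;  // digits taken as the discriminator
    std::string identifier;       // <source-name> that follows it
    std::string rest;             // whatever is left after that name
};

// Enumerate every division of the digit run after '_' into
// <discriminator digits><length digits> for which a <source-name>
// of the stated length actually fits in the input.
std::vector<Split> candidate_splits(const std::string& s) {
    std::vector<Split> out;
    if (s.empty() || s[0] != '_') return out;
    std::size_t end = 1;  // one past the last digit of the run
    while (end < s.size() && std::isdigit(static_cast<unsigned char>(s[end])))
        ++end;
    // Discriminator digits are [1, cut), length digits are [cut, end).
    for (std::size_t cut = 2; cut < end; ++cut) {
        if (s[cut] == '0') continue;  // a length has no leading zero
        unsigned long disc = std::stoul(s.substr(1, cut - 1));
        unsigned long len = std::stoul(s.substr(cut, end - cut));
        if (len > s.size() - end) continue;  // name would run off the end
        out.push_back({disc, s.substr(end, len), s.substr(end + len)});
    }
    return out;
}
```

For "_11C" the only split that fits is discriminator 1 followed by "1C", but
a demangler that reads the number greedily takes 11 and is left with a bare
"C". For an input such as "_111iiiiiiiiiii" (eleven "i"s) two splits fit:
discriminator 1 with an eleven-character class name, or discriminator 11
with a class "i" followed by ten more characters, which would themselves
parse as ten int parameters.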
Addressing this requires a change that is technically ABI breakage,
but I think we can do it so that real-world programs are highly
unlikely to break by saying that a <discriminator> is "_<n>" for <n>
<= 9 (that's unchanged), but "__<n>_" when <n> >= 10 (I assume here
that <n> >= 10 doesn't happen in real programs).
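A minimal sketch of the proposed encoding, assuming exactly the rule stated
above ("_<n>" for <n> <= 9, "__<n>_" otherwise). The function names
proposed_discriminator and parse_proposed_discriminator are hypothetical.
The parser is lenient in that it also accepts the long form for values below
10, which the proposal does not call for.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Encode a discriminator value under the proposed rule.
std::string proposed_discriminator(unsigned long n) {
    if (n <= 9) return "_" + std::to_string(n);
    return "__" + std::to_string(n) + "_";
}

// Parse a discriminator starting at s[pos]. On success, stores the value
// in n, advances pos past the discriminator, and returns true. On failure,
// leaves pos and n untouched and returns false.
bool parse_proposed_discriminator(const std::string& s, std::size_t& pos,
                                  unsigned long& n) {
    std::size_t p = pos;
    if (p >= s.size() || s[p] != '_') return false;
    ++p;
    if (p < s.size() && s[p] == '_') {
        // Long form: "__" <digits> "_"
        ++p;
        std::size_t start = p;
        while (p < s.size() && std::isdigit(static_cast<unsigned char>(s[p])))
            ++p;
        if (p == start || p >= s.size() || s[p] != '_') return false;
        n = std::stoul(s.substr(start, p - start));
        ++p;  // consume the closing '_'
    } else {
        // Short form: "_" followed by exactly one digit
        if (p >= s.size() || !std::isdigit(static_cast<unsigned char>(s[p])))
            return false;
        n = static_cast<unsigned long>(s[p] - '0');
        ++p;
    }
    pos = p;
    return true;
}
```

Under this scheme the tail "_11C" from the example reads unambiguously as
discriminator 1 followed by "1C", and a discriminator of 11 in front of the
same class would be written "__11_1C".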
Any thoughts?
Daveed
P.S.: There are related issues with unnamed local classes in C++0x,
but I plan to address those along with closure types in a separate
proposal.