[cxx-abi-dev] Mangling string constants

John McCall rjmccall at apple.com
Sat Feb 21 01:58:23 UTC 2015


> On Feb 20, 2015, at 4:28 PM, Richard Smith <richardsmith at googlers.com> wrote:
> On 20 February 2015 at 15:51, John McCall <rjmccall at apple.com> wrote:
> > On Feb 19, 2015, at 11:44 PM, Dennis Handly <dhandly at cup.hp.com> wrote:
> >> From: David Majnemer <david.majnemer at gmail.com>
> >> It seems that the ABI has no means to mangle the contents of string constants.
> >
> > Why is that needed?
> > The current scheme is to just number the constants in order.
> > And that handles both strings and wide strings.
> > And by the ODR, the inline definitions must be the same.
> 
> I think this is what David means by numbering like a reference temporary.
> 
> To the extent that this is needed, I agree with you that that’s the right solution: string literals should be mangled in the same sequence as reference temporaries.  (Which already applies to more than just reference temporaries anyway, since the same concept of lifetime extension applies to std::initializer_list temporaries.)
> 
> I have some of the same concerns here as I do with guaranteeing the uniqueness of string literals within inline functions: I want to make sure the language isn’t accidentally promising something that grotesquely affects performance far out of proportion to its utility to the programmer.  It would be very unfortunate if we, say, introduced thousands of new global weak symbols just to unique the strings used by assertions.  We can take things like this back to the committee if necessary.
> 
> But if we can restrict this guarantee to string literals that appear in reference-temporary-like positions in constexpr initializers, I think it’s reasonable enough.
> 
> We can't. Consider:
> 
> constexpr const char *f(const char *p) { return p; }
> constexpr const char *g() { return "foo"; }
> struct X {
>   constexpr static const char *p = "foo", // ok
>   *q = f("foo"), // not in a "reference-temporary-like" position
>   *r = g(); // string literal is not even lexically within the initializer
> };

Yeah, I thought about this a bit too late.  There are two ways to salvage the idea: mark string literals by position as they appear in the actual constexpr result, or just don’t promise anything in this case.
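
To make the first option concrete, here is a rough sketch built on the example above.  The helper names same_contents and check_contents are made up for illustration, and nothing here is proposed mangling.  Each of p, q and r ends up with a constant value that points at exactly one string literal, regardless of where that literal was written, so numbering literals by their position in the evaluated result would cover all three.  Whether the three pointers compare equal is unspecified, so the sketch only compares contents.

```cpp
#include <cassert>
#include <cstring>

// The two functions and the struct are taken from the example quoted above.
constexpr const char *f(const char *p) { return p; }
constexpr const char *g() { return "foo"; }

struct X {
  constexpr static const char *p = "foo",     // literal written in the initializer
                              *q = f("foo"),  // literal passed through a call
                              *r = g();       // literal written inside g()
};

// Hypothetical helper: the evaluated result of each initializer points at
// one string literal with the same contents. Address equality is
// deliberately not tested because it is unspecified.
inline bool same_contents() {
  return std::strcmp(X::p, X::q) == 0 && std::strcmp(X::q, X::r) == 0;
}

// Hypothetical helper that turns the comparison into a runtime check.
inline void check_contents() { assert(same_contents()); }
```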

Another concern with widespread string-literal mangling that occurs to me is whether it will completely defeat ordinary string-literal sharing.  To implement this feature optimally, we would need… in ELF terms, what, a COMDAT alias (?) into the string literal section?  That might push the boundaries of supported linker behavior quite far.  If we have to emit separate, unmergeable string literal objects just because they were used in a constexpr initializer, that would be a disaster.
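
As a rough illustration of the two kinds of storage being contrasted here (all function names are hypothetical): an ordinary literal is anonymous, and on ELF targets compilers typically place it in a mergeable string section that the linker can deduplicate.  A static local array in an inline function is instead a single named object with a mangled, weak symbol, which is closer to what a mangled string literal would look like if no aliasing into the string section were available.

```cpp
#include <cassert>
#include <cstring>

// Ordinary literal: anonymous storage that the implementation is free to
// share with other identical literals.
const char *plain_foo() { return "foo"; }

// Named object: the static local is one object for the whole program and
// gets its own mangled symbol, so every call returns the same address.
inline const char *uniqued_foo() {
  static const char s[] = "foo";
  return s;
}

// Hypothetical check: the named object has a stable address, and both
// forms hold the same characters. Whether plain_foo() and uniqued_foo()
// share storage is not tested; nothing guarantees it either way.
inline void check_storage() {
  assert(uniqued_foo() == uniqued_foo());
  assert(std::strcmp(plain_foo(), uniqued_foo()) == 0);
}
```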

John.