[cxx-abi-dev] mangling of UCNs

Thu Sep 19 16:14:49 UTC 2002

scott douglass <sdouglass at arm.com> writes:

> >Identifiers are encoded in UTF-8.
> 
> Mangling is just a synonym for "encoding" isn't it?

Not really. Every identifier is encoded, usually in US-ASCII. To
mangle an identifier means to use escape characters for
meta-information, i.e. the characters don't literally represent their
character value anymore, but some other information.
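
To illustrate the distinction, here is a minimal sketch in Python. It
assumes the length-prefixed <source-name> form of the Itanium C++ ABI
and covers only a non-member function taking no arguments; the helper
name mangle_nullary_function is made up for this example, and it is
restricted to ASCII identifiers on purpose.

```python
def mangle_nullary_function(identifier: str) -> str:
    """Hypothetical helper: mangle `void identifier()` in the Itanium style."""
    if not identifier.isascii():
        raise ValueError("this sketch only handles ASCII identifiers")
    # "_Z" marks a mangled name, the decimal length prefix delimits the
    # identifier, and the trailing "v" stands for an empty parameter list.
    # None of these characters represent themselves; they carry
    # meta-information about the name.
    return f"_Z{len(identifier)}{identifier}v"


# Encoding: every character simply stands for itself.
plain = "foo".encode("ascii")
# Mangling: extra characters describe the name instead of spelling it.
mangled = mangle_nullary_function("foo")

print(plain)    # b'foo'
print(mangled)  # _Z3foov
```

The encoded form contains nothing but the identifier; the mangled form
wraps it in characters that have to be interpreted, not read.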

For UTF-8, this is different. Characters mean characters (even though
multiple bytes might be needed for a character).
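
A short sketch of that point, using an identifier chosen purely for
illustration: encoding to UTF-8 changes the byte count but not the
characters, and decoding returns exactly what went in.

```python
# Illustrative identifier with two non-ASCII letters (o-umlaut, sharp s),
# written with escapes so the example does not depend on file encoding.
identifier = "gr\u00f6\u00dfe"

encoded = identifier.encode("utf-8")

print(len(identifier))  # 5 characters
print(len(encoded))     # 7 bytes: each non-ASCII letter takes two
print(encoded)          # b'gr\xc3\xb6\xc3\x9fe'

# No escape characters are involved: decoding recovers the identifier.
assert encoded.decode("utf-8") == identifier
```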

> That seems reasonable if we think all linkers are happy to eat UTF-8.

I think the gABI more or less defines that arbitrary byte sequences
can appear in symbols; at least, it does not restrict the byte
sequences. If some linker fails to process this correctly, I'd
claim that the linker has a bug.
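
As a rough sketch of why this should work, assuming symbol names are
stored as null-terminated strings in a string table with an empty name
at index 0 (my reading of the gABI): UTF-8 never produces a zero byte
for any character other than U+0000, so the terminator stays
unambiguous. The helpers build_strtab and read_name are hypothetical
and only model the table, not a real object file.

```python
def build_strtab(names):
    """Hypothetical helper: pack names into a null-terminated string table."""
    table = bytearray(b"\x00")  # index 0 holds the empty name
    offsets = {}
    for name in names:
        offsets[name] = len(table)
        table += name.encode("utf-8") + b"\x00"
    return bytes(table), offsets


def read_name(table, offset):
    """Hypothetical helper: read the name that starts at the given offset."""
    end = table.index(b"\x00", offset)  # the next zero byte ends the name
    return table[offset:end].decode("utf-8")


names = ["_Z3foov", "gr\u00f6\u00dfe"]
table, offsets = build_strtab(names)

# Both the ASCII and the non-ASCII name survive the round trip.
for name in names:
    assert read_name(table, offsets[name]) == name
```

A tool that only ever compares and copies these byte sequences has no
reason to care whether they are ASCII.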

For some compilers, the input that the assembler accepts may also be
relevant. Again, I'd claim that this is a vendor issue, not one that
the ABI must be concerned with. If non-ASCII letters in identifiers
are not supported, the vendor must fix the tool-chain anyway to
achieve interoperability.

Regards,
Martin