mangling of UCNs
scott douglass
sdouglass at arm.com
Thu Sep 19 12:36:35 UTC 2002
Hello,
The mangling of UCNs is not specified by the ABI. I think it should be. Perhaps the idea is that UCNs should mangle the same way the implementation's C99 compiler encodes them? The only implementation I know of that supports UCNs encodes them as "_unnnn", which makes "x\u00c0" mangle the same as the legal user identifier "x_u00c0".
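To make the collision concrete, here is a minimal sketch of the "_unnnn" encoding. The function name mangle_underscore is hypothetical, and the sketch assumes identifiers arrive as source text with UCNs still spelled out; it is not the code of any real compiler.

```python
import re

def mangle_underscore(identifier: str) -> str:
    # Hypothetical "_unnnn" scheme: each four-digit UCN becomes "_u" + digits.
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: "_u" + m.group(1).lower(),
        identifier,
    )

with_ucn = mangle_underscore(r"x\u00c0")   # identifier containing a UCN
plain = mangle_underscore("x_u00c0")       # ordinary, legal identifier
print(with_ucn, plain, with_ucn == plain)
```

Both identifiers end up with the same encoded name, so two distinct entities could share one symbol.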
Anyway, here's a fairly simple proposal. I'm imagining that allowing "$" in identifiers is a common extension.
"n" is a hex digit [0-9a-f].
Treat \U0000nnnn as \unnnn
Encode hex digits as lower case.
\u0024 => "$"
\u0040 => "@"
\unnnn => "\unnnn"
\Unnnnnnnn => "\Unnnnnnnn"
This assumes the linker is happy with symbols containing "$", "@" and "\".
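The rules above can be sketched as a small text transformation. This is only an illustration under stated assumptions: the name mangle_backslash is hypothetical, identifiers are taken as source text with UCNs spelled out, and the output is assumed to keep exactly four digits after "\u" and eight after "\U".

```python
import re

# Matches \unnnn or \Unnnnnnnn as spelled in the source text.
UCN = re.compile(r"\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})")

def mangle_backslash(identifier: str) -> str:
    def encode(m):
        # Encode hex digits as lower case.
        digits = (m.group(1) or m.group(2)).lower()
        # Treat \U0000nnnn as \unnnn.
        if len(digits) == 8 and digits.startswith("0000"):
            digits = digits[4:]
        if digits == "0024":
            return "$"
        if digits == "0040":
            return "@"
        prefix = "\\u" if len(digits) == 4 else "\\U"
        return prefix + digits
    return UCN.sub(encode, identifier)

print(mangle_backslash(r"x\u00C0"))      # digits lowered, escape kept
print(mangle_backslash(r"a\U00000024"))  # long form of "$" collapses to "$"
```

Since "\" cannot appear in an ordinary identifier, the result cannot collide with a user-written name.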
As far as I know, "@" is rarely allowed in identifiers, so an alternative that avoids "\" is:
Treat \U0000nnnn as \unnnn
\u0024 => "$"
\unnnn => "@unnnn"
\Unnnnnnnn => "@Unnnnnnnn"
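A sketch of the "@" variant, with a decoder to show that the encoding can be reversed when "@" never occurs in a plain identifier. The names mangle_at and demangle_at are hypothetical, and the output is assumed to carry exactly four or eight hex digits.

```python
import re

UCN = re.compile(r"\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})")
ESCAPE = re.compile(r"@u([0-9a-f]{4})|@U([0-9a-f]{8})")

def mangle_at(identifier: str) -> str:
    def encode(m):
        digits = (m.group(1) or m.group(2)).lower()
        # Treat \U0000nnnn as \unnnn.
        if len(digits) == 8 and digits.startswith("0000"):
            digits = digits[4:]
        if digits == "0024":
            return "$"
        return ("@u" if len(digits) == 4 else "@U") + digits
    return UCN.sub(encode, identifier)

def demangle_at(symbol: str) -> str:
    # Turn "@u"/"@U" escapes back into UCN spelling; "$" is left as "$".
    def decode(m):
        if m.group(1) is not None:
            return "\\u" + m.group(1)
        return "\\U" + m.group(2)
    return ESCAPE.sub(decode, symbol)

print(mangle_at(r"x\u00c0"))               # differs from plain "x_u00c0"
print(demangle_at(mangle_at(r"x\u00c0")))  # back to the UCN spelling
```

Swapping "@" for another non-identifier character such as "#" or "!" would only change the prefix strings and the ESCAPE pattern.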
Other alternatives:
You could pick a different escape character from the other non-identifier characters, e.g. "#", "!", ...
You could use two escape characters instead of "@u" and "@U", e.g. "@00c0" and "!00010000".
You could adopt the variable width encoding similar to what other manglings use, e.g. "@c0_" and "@10000_".
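The variable-width alternative might look like the following sketch: leading zeros are dropped and "_" terminates the digits. The name mangle_variable_width is hypothetical, and this version deliberately leaves out the "$" special case.

```python
import re

UCN = re.compile(r"\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})")

def mangle_variable_width(identifier: str) -> str:
    def encode(m):
        # Parse the hex digits, then re-emit them in lower case
        # without leading zeros, terminated by "_".
        value = int(m.group(1) or m.group(2), 16)
        return "@" + format(value, "x") + "_"
    return UCN.sub(encode, identifier)

print(mangle_variable_width(r"\u00c0"))      # @c0_
print(mangle_variable_width(r"\U00010000"))  # @10000_
```

With this form, \u00c0 and \U000000c0 encode identically without a separate normalization rule.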
Interestingly Annex E gives no legal UCNs that require the \Unnnnnnnn form.
Fire away.