Special characters in mangled names
Christophe de Dinechin
ddd at cup.hp.com
Mon Nov 15 18:51:51 UTC 1999
Thanks for the clarification. This sounds like an interesting idea.
I sure would appreciate decent link times with C++ :-)
Christophe
> >>>>> "Christophe" == Christophe de Dinechin <ddd at cup.hp.com> writes:
>
> Christophe> Can you please explain what is a "collision-resistant"
> Christophe> hash function? I doubt you would get the same
> Christophe> collisions from two different translation units, so I
> Christophe> don't see how you could ensure a name that is
> Christophe> identical between TUs and also avoids collisions.
>
> A collision-resistant hash function is a notion from cryptography.
> (That's the world I spend a lot of my time in when I'm not doing
> compiler stuff.)
>
> Suppose you have an n-bit hash, so you have 2^n hash values. A
> collision-resistant hash is one where the probability of two randomly
> chosen strings hashing to the same value is (very close to) 1/(2^n).
> A stronger notion of this is that finding strings that collide is
> computationally infeasible.
>
> Certainly, hashing introduces a probabilistic nature to things: it
> becomes possible that two different functions could hash to the same
> hash-mangled name. However, by choosing a good hash function (and
> provably good ones exist) and enough bits, you can make it
> considerably less likely that in the next hundred years and two
> distinct functions will hash to the same name than that cosmic rays
> will cause unpredicatable linker errors.
>
> Christophe> Also: There is a wide assumption that name mangling is
> Christophe> reversible. In other words, there is a c++filt tool,
> Christophe> and hundreds of scripts use it somehow (combined with
> Christophe> nm, to process the error output of ld, etc.) I'm not
> Christophe> sure it is wise to give up this assumption.
>
> Yup, this is the biggest objection I can think of.
>
> We originally came up with this idea for our C++-to-C translator. We
> ship this to people with embedded systems whose linkers only support
> 16-characters; by using a collision-resistant hash they can use C++.
> Nobody has ever run into a collision. We solved the c++-filt problem
> by keeping a database mapping hashes back to mangled names. (The
> probabilistic guarantee says that this database can actually be
> global; in our lifetime will never see two things with the same hash.)
> So, it's still possible to make a c++-filt that works, but it is
> admittedly more difficult.
>
> The biggest advantage to this scheme is that you can put an upper
> bound on symbol lengths, even if the presence of truly huge template
> usage. (I've seen programs where mangled names approached a megabyte
> in length.) I would only suggest hashing long names; names under 100
> characters, or even a thousand characters, say, could be left
> unhashed.
>
> --
> Mark Mitchell mark at codesourcery.com
> CodeSourcery, LLC http://www.codesourcery.com
>
>
More information about the cxx-abi-dev
mailing list