RTTI data layout flaw
Nathan Sidwell
nathan at acm.org
Mon Jan 17 09:34:00 UTC 2000
Hi,
There is a flaw in the specified rtti data structures. The data
structures work only when a complete linked program has a definition of
all the required classes. However, C++ does not guarantee this. Here is
an example:
-- fn1.cc
struct Foo;
void fn () {
  throw (Foo **)0;
}
-- main1.cc
#include <stdio.h>
struct Foo;
struct Bar;
void fn ();
int main () {
  try { fn ();}
  catch (Bar const *const *) { printf ("wrong\n");}
  catch (Foo const *const *) { printf ("ok\n");}
  return 0;
}
Note that I'm dealing with Foo **, which is a pointer to a complete type
(Foo * is complete even though Foo is not), so [except.throw]/3 and
[except.handle]/1 do not prevent this.
A complete program consisting of fn1.o and main1.o is well defined and
should print "ok\n", not "wrong\n". The difficulty is that the compiler
never saw any definition of Foo or Bar, so what does it output as the
type_info object for them? The type_info object for Foo ** consists of
  __pointer_type_info
    ->__pointer_type_info
      ->What?
We cannot simply use a weak reference to Foo's type_info object because,
in this case, it would resolve to zero and be indistinguishable from the
reference to Bar in Bar **.
Another choice would be to emit an empty __class_type_info object, but
then what ensures that references resolve to the real __class_type_info
object when linked with an object file which does contain a definition
of Foo?
One choice which might work is some kind of __class_proxy_type_info,
which contains a single member pointing to the real __class_type_info
object with weak linkage. The name mangling for a
__class_proxy_type_info will be different to that of a __class_type_info
object. In the above case, Foo ** would be represented as
  __pointer_type_info
    ->__pointer_type_info
      ->__class_proxy_type_info
        ->(weakly)__class_type_info.
The final __class_type_info object is not emitted in the compilation
unit, as Foo's definition is never seen. Now, we can distinguish
`Bar const *const *' from `Foo const *const *', as the inner pointers
will point to different __class_proxy_type_info objects.
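The proposed layout can be modelled with a minimal sketch. It assumes a
simplified stand-in for the ABI classes: TypeNode, Kind, innermost and the
object names below are hypothetical and do not reflect the real type_info
layout or mangling. It shows only that Foo ** and Bar ** remain
distinguishable even though neither weak reference to a real class object
has been resolved.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the ABI's type_info classes.
enum class Kind { Class, ClassProxy, Pointer };

struct TypeNode {
    Kind kind;
    const char *name;        // stand-in for the mangled name
    const TypeNode *target;  // Pointer: pointee; ClassProxy: real class or
                             // null; Class: unused
};

// What main1.o could emit: neither Foo nor Bar is defined, so the weak
// references to the real class objects are modelled as null.
const TypeNode proxy_Foo = { Kind::ClassProxy, "proxy Foo", nullptr };
const TypeNode proxy_Bar = { Kind::ClassProxy, "proxy Bar", nullptr };
const TypeNode pFoo  = { Kind::Pointer, "Foo *",  &proxy_Foo };
const TypeNode ppFoo = { Kind::Pointer, "Foo **", &pFoo };
const TypeNode pBar  = { Kind::Pointer, "Bar *",  &proxy_Bar };
const TypeNode ppBar = { Kind::Pointer, "Bar **", &pBar };

// Walk down a pointer chain to the node that names the class.
const TypeNode *innermost(const TypeNode *t) {
    assert(t != nullptr);
    while (t->kind == Kind::Pointer) {
        t = t->target;
        assert(t != nullptr);
    }
    return t;
}
```

Here innermost(&ppFoo) and innermost(&ppBar) yield two different proxy
objects, although both proxies have a null target.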
A __pointer_type_info does not need to go via a
__class_proxy_type_info if the compilation unit has seen the
definition of the pointed-to class. The catch matching algorithm will
need to be aware that it might be given two paths, one of which goes
via a __class_proxy_type_info and one of which does not. In this case,
the __class_proxy_type_info should point to the real class.
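A sketch of how catch matching might compare two such paths follows. It
assumes a simplified model: TypeNode, Kind, resolve and
same_pointee_chain are hypothetical names, and cv-qualifiers and base
class conversions, which real catch matching must also handle, are
ignored.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the ABI's type_info classes.
enum class Kind { Class, ClassProxy, Pointer };

struct TypeNode {
    Kind kind;
    const TypeNode *target;  // Pointer: pointee; ClassProxy: real class or
                             // null; Class: unused
};

// Replace a proxy by the real class object when its weak reference
// has been resolved; otherwise keep the proxy itself.
const TypeNode *resolve(const TypeNode *t) {
    assert(t != nullptr);
    if (t->kind == Kind::ClassProxy && t->target != nullptr)
        return t->target;
    return t;
}

// True when both chains have the same pointer depth and name the same
// class, whether or not either path goes via a proxy.
bool same_pointee_chain(const TypeNode *a, const TypeNode *b) {
    assert(a != nullptr && b != nullptr);
    while (a->kind == Kind::Pointer && b->kind == Kind::Pointer) {
        a = a->target;
        b = b->target;
    }
    if (a->kind == Kind::Pointer || b->kind == Kind::Pointer)
        return false;  // different pointer depth
    return resolve(a) == resolve(b);
}
```

In this model a path ending in the real class object and a path ending in
a proxy whose target is that object compare equal. A proxy whose target
is still null only compares equal to itself.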
type_info::operator == is not affected by proxies. Proxies are only
important when traversing the pointer hierarchy in catch matching.
There is a difficulty with DSO's containing the definition of the
class, and loaded into programs without the definition. Here is an
example, with some pseudo code describing the loading.
-- fn2.cc
#include <stdlib.h>
struct Foo {};
struct Bar : Foo {};
void fn1 () {
  throw (Foo **)0;
}
void fn2 () {
  try
    { throw (Bar *)0;}
  catch (Foo *)
    {}
  catch (...)
    {abort ();}
}
-- main2.cc
#include <stdio.h>
struct Foo;
int main ()
{
  void *handle = dlopen ("fn2"); // load the library (pseudo code)
  void (*f1) () = dlsymbol (handle, "fn1"); // get fn1 entry
  void (*f2) () = dlsymbol (handle, "fn2"); // get fn2 entry
  try
    { (*f1) ();}
  catch (Foo **)
    { printf ("ok\n");}
  (*f2) ();
}
This should print "ok\n", and not call abort in fn2. But consider what
the type_info objects look like. Here I've assumed a name mangling:
fn2.o (before loading)
  __ti_Foo:__class_type_info
    Foo's descriptor
  __ti_pFoo:__pointer_type_info
    target = &__ti_Foo
  __ti_ppFoo:__pointer_type_info
    target = &__ti_pFoo
  __ti_Bar:__si_class_type_info
    Bar's descriptor
    base = &__ti_Foo
  __ti_pBar:__pointer_type_info
    target = &__ti_Bar
main2.o (before loading)
  __ti_proxy_Foo:__class_proxy_type_info
    target = &__ti_Foo (weak, zero)
  __ti_pFoo:__pointer_type_info
    target = &__ti_proxy_Foo
  __ti_ppFoo:__pointer_type_info
    target = &__ti_pFoo (this will be NULL)
When fn2 is loaded, COMDAT linkage will resolve some of the symbols in
fn2 to those in main2, namely __ti_pFoo and __ti_ppFoo.
Consider what happens in the catch clause of fn2. The thrown type_info
will be __ti_pBar, which fully describes a pointer to Bar. The catch
type_info will be __ti_pFoo, which is the instance defined in main2.
This does not fully describe a pointer to Foo, as the proxy's target is
NULL. To get this to work, the proxy's target needs adjusting.
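The failure and the needed adjustment can be modelled in a few lines.
This is a sketch only: ClassNode, ProxyNode, catches and adjust_proxy are
hypothetical names, the model covers single inheritance only, and who
would perform the adjustment at load time is left open.

```cpp
#include <cassert>

// Hypothetical model of a class descriptor with at most one base.
struct ClassNode {
    const ClassNode *base;
};

// Hypothetical model of a proxy; target is null while the weak
// reference to the real class object is unresolved.
struct ProxyNode {
    const ClassNode *target;
};

// Would a handler for "pointer to the proxied class" catch a thrown
// pointer to `thrown`? With a null target nothing is known about the
// handler's class, so no match can be found.
bool catches(const ProxyNode &handler, const ClassNode *thrown) {
    assert(thrown != nullptr);
    if (handler.target == nullptr)
        return false;
    for (const ClassNode *c = thrown; c != nullptr; c = c->base)
        if (c == handler.target)
            return true;
    return false;
}

// Hypothetical load-time fix-up: point the proxy at the class object
// that the newly loaded library provides.
void adjust_proxy(ProxyNode &proxy, const ClassNode *real) {
    proxy.target = real;
}
```

With class objects for Foo and for Bar derived from Foo, and main2's
proxy initially null, catches fails for a thrown pointer to Bar before
adjust_proxy is applied and succeeds afterwards.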
I'm not familiar enough with the nitty-gritty details of DSO loading, or
of SO loading, to know whether the above behaves as desired. As the ABI
should support shared libraries, we need to ensure it works for those. I
do not know whether another goal is to support DSO's.
nathan
--
Dr Nathan Sidwell :: sidwell at codesourcery.com
nathan at acm.org http://www.cs.bris.ac.uk/~nathan/ nathan at cs.bris.ac.uk