vtable layout
Christophe de Dinechin
ddd at cup.hp.com
Mon Aug 30 21:47:24 UTC 1999
> >>>>> thomson <thomson at ca.ibm.com> writes:
>
> > I don't see how this solves my diamond case
>
> > struct V1 { virtual void f(); virtual void g(); };
> > struct Other1 { virtual void ignore1(); };
> > struct X : Other1, virtual V1 { virtual void f(); };
>
> > struct Y : Other1, virtual V1 { virtual void g(); };
>
> > struct ZZ : X, Y {};
>
> You're right, I didn't think it through far enough. On the sides of the
> diamonds, we decide where the adjustments go. They end up in the first
> available slot, which is slot -1 in both classes. But only one adjustment
> can be at that offset from the V1 vptr, so the adjustments from V1 to X and
> Y must be identical. Which they're not, so this doesn't work. It gets
> worse if the two classes have different numbers of virtual functions.
>
> Christophe?
I get it now, sorry for my previous post. I believe that this
example has been brought up earlier (two or three weeks ago). You are
right, that's one of the two cases where we still need to emit a
thunk. We also need a thunk in some cases of covariant return type
(to perform a "post" adjustment).
In terms of performance, the impact is limited, because it will
occur only if you use an A* to call f() or g(). With a B*, a C* or a
D*, the pair (vtable, offset) is unique. The same offset can be
reused for f() and g() and mean, in one case, "convert_to_X", in the
other case, "convert_to_Y". Same thing for non-virtual inheritance.
Finally, the thunk generated in that case is no worse than the thunk
that would be generated otherwise: we win in other cases, and don't
lose in this one.
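To make the diamond case concrete, here is a minimal sketch built on the hierarchy quoted above. The function bodies, the last_this variable and the helper names (v1_to_x, v1_to_y, this_seen_by_f, this_seen_by_g, demo_diamond) are assumptions added for illustration; they are not part of the original example. It only shows that, inside a ZZ object, the conversion from the shared V1 subobject to X differs from the conversion to Y, which is why a single adjustment slot relative to the V1 vptr cannot serve both f() and g().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical recorder: remembers the `this` pointer the last callee saw.
static const void* last_this = nullptr;

struct V1     { virtual void f() { last_this = this; }
                virtual void g() { last_this = this; } };
struct Other1 { virtual void ignore1() {} };
struct X : Other1, virtual V1 { void f() override { last_this = this; } };
struct Y : Other1, virtual V1 { void g() override { last_this = this; } };
struct ZZ : X, Y {};

// Byte distance from the shared V1 subobject to the X subobject of z.
std::ptrdiff_t v1_to_x(ZZ& z) {
    V1* v = &z;
    X*  x = &z;
    return reinterpret_cast<char*>(x) - reinterpret_cast<char*>(v);
}

// Byte distance from the shared V1 subobject to the Y subobject of z.
std::ptrdiff_t v1_to_y(ZZ& z) {
    V1* v = &z;
    Y*  y = &z;
    return reinterpret_cast<char*>(y) - reinterpret_cast<char*>(v);
}

// Call through a V1* and report which `this` the final overrider received.
const void* this_seen_by_f(V1* v) { v->f(); return last_this; }
const void* this_seen_by_g(V1* v) { v->g(); return last_this; }

void demo_diamond() {
    ZZ z;
    // X and Y are distinct non-empty subobjects, so the two adjustments differ.
    assert(v1_to_x(z) != v1_to_y(z));
    // f() lands in X::f with an X*, g() lands in Y::g with a Y*.
    assert(this_seen_by_f(&z) == static_cast<const void*>(static_cast<X*>(&z)));
    assert(this_seen_by_g(&z) == static_cast<const void*>(static_cast<Y*>(&z)));
}
```

Calling demo_diamond() from main exercises both virtual calls through a V1*. How the compiler performs the two adjustments (thunk or otherwise) is not visible from this code; it only shows that two different adjustments are required.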
> >> This isn't an outrageous idea, it only works for nonvirtual inheritance
> >> but we are already on a path where the solutions for the virtual and
> >> nonvirtual cases have to be different. We end up with more entry
> >> points, but they are simpler than the reach-back-into-the-vtable ones.
>
When we discussed the problem for covariant returns, someone
(Jason?) pointed out that the ABI simply mandated the presence of the
offsets in the vtable, but that you can be ABI-compatible and
generate thunks that never use the offsets.
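For readers who have not followed the covariant-return discussion, a small sketch of why a "post" adjustment can be needed. The class names Pad, Base and Derived and the helpers self_via_base and demo_covariant are hypothetical and chosen only for this illustration. The overrider returns a Derived*, but a caller going through Base* expects a Base*, so the returned pointer has to be converted after the call.

```cpp
#include <cassert>

// Hypothetical hierarchy: Pad is listed first so that, on common ABIs,
// the Base subobject does not sit at the start of a Derived object.
struct Pad  { virtual void pad() {} };
struct Base { virtual Base* self() { return this; } };
struct Derived : Pad, Base {
    // Covariant return type: Derived* instead of Base*.
    Derived* self() override { return this; }
};

// Virtual call through the base class; the result is typed Base*.
Base* self_via_base(Base* b) { return b->self(); }

void demo_covariant() {
    Derived d;
    Base* b = &d;
    // The overrider produced a Derived*, yet the caller receives the
    // address of the Base subobject: the result was adjusted on the way out.
    assert(self_via_base(b) == b);
    // A direct call returns Derived*; converting it gives the same Base*.
    assert(static_cast<Base*>(d.self()) == b);
}
```

Whether that conversion is emitted as a thunk or by some other mechanism is an implementation choice; the sketch only shows the behavior the ABI has to support.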
> > ... And quite slower too ...
>
> Why?
A thunk approach means that your virtual calls will look like:
- Indirect branch, almost always mispredicted (probably well over 99%)
- I-Cache miss on thunk, since the thunk is quite "unique"
- Direct branch, almost always mispredicted, since prefetching did
not have time to recover
- Possible I-Cache miss on target function
On the other hand, the method I proposed has the following benefits:
- The indirect branch mispredicts as before
- Once its target is known, the I-cache and pipeline are filled
with useful information (the target function)
- D-cache misses on the vtable offsets are unlikely if any virtual
function of the same class was called recently
- Call-site adjustment costs zero, in the sense that it is needed to
get the vptr anyway.
- If call site adjustment is all that is needed, then the necessary
adjustment is done at a place where scheduling is easier (the
caller), rather than at a place where scheduling is impossible (the
thunk)
For more details, see the complete code trail I sent with my initial
proposal.
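As a rough illustration of the two call shapes compared above (not the code trail itself), here is a hand-rolled model in which the vtable is an ordinary array. Everything in it is an assumption made for the sketch: the Slot, Base2Part and Derived types, the two tables, and the choice to let the caller load the offset. It does not reproduce the actual vtable layout under discussion. In the thunk shape the caller branches indirectly to a thunk, which adjusts the pointer and then branches again; in the offset shape the adjustment is read as data from the table and the indirect branch goes straight to the target function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vtable slot: a function pointer plus a `this` adjustment.
struct Slot { int (*fn)(void*); std::ptrdiff_t adjust; };

// Hypothetical secondary base subobject holding a vptr-like pointer.
struct Base2Part { const Slot* vptr; };
struct Derived   { int pad; Base2Part base2; int value; };

// The real function expects a pointer to the whole Derived object.
int derived_f(void* self) { return static_cast<Derived*>(self)->value; }

// Thunk shape: adjust from the Base2Part address back to Derived,
// then make a second (direct) call to the real function.
int derived_f_thunk(void* base2) {
    return derived_f(static_cast<char*>(base2) - offsetof(Derived, base2));
}

const Slot thunk_table[]  = { { derived_f_thunk, 0 } };
const Slot offset_table[] = {
    { derived_f, -static_cast<std::ptrdiff_t>(offsetof(Derived, base2)) } };

// Caller when thunks are used: one indirect call, no adjustment here.
int call_via_thunk(Base2Part* p) { return p->vptr[0].fn(p); }

// Caller when the offset lives in the table: load it, adjust, and make
// a single indirect call directly to the target function.
int call_via_offset(Base2Part* p) {
    const Slot& s = p->vptr[0];
    return s.fn(reinterpret_cast<char*>(p) + s.adjust);
}

void demo_dispatch() {
    Derived d{0, {thunk_table}, 42};
    assert(call_via_thunk(&d.base2) == 42);
    d.base2.vptr = offset_table;
    assert(call_via_offset(&d.base2) == 42);
}
```

Both callers reach derived_f with the same adjusted pointer. The difference is that the thunk version executes two branches and touches an extra piece of code, while the offset version adds one data load from a table that is likely already in cache.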
Best regards
Christophe