[cxx-abi-dev] Proposing an ABI restriction on loads from an object's vtable pointer

John McCall rjmccall at apple.com
Thu Jul 28 16:52:37 UTC 2016


> On Jul 27, 2016, at 7:21 PM, John McCall <rjmccall at apple.com> wrote:
>> On Jul 21, 2016, at 6:42 PM, Peter Collingbourne <pcc at google.com> wrote:
>> 
>> Hi all,
>> 
>> The ABI currently requires that virtual tables for a class appear consecutively in a virtual table group. I would like to propose a restriction under which compilers may access only the virtual table associated with the address point stored in an object's virtual table pointer, and may not rely on any knowledge they have about the relative layout of other virtual tables in the virtual table group.
>> 
>> The purpose of this restriction is to allow an implementation to split a virtual table group along virtual table boundaries.
>> 
>> Motivation
>> 
>> There are at least two scenarios which would benefit from vtable splitting: clients which want to place data either before or after the ABI-required part of a virtual table, and clients which want to control the layout of virtual tables for performance or security reasons.
>> 
>> As an example of the first scenario, when performing whole-program virtual call optimization, Clang will apply an optimization known as virtual constant propagation [0], which causes data to be laid out at a specific offset from the address point of each virtual table in a hierarchy. If that virtual table appears in a virtual table group, padding is required to place the data at an appropriate offset for each class. Because of the current restriction that vtables must appear consecutively, the optimizer may need to add more padding than necessary, or inhibit the optimization entirely if it would require too much padding.
>> 
>> As an example of the second scenario, an implementation may wish to lay out virtual tables hierarchically either in order to increase the likelihood of a cache hit when repeatedly making the same virtual call over a set of heterogeneous objects, or to efficiently implement a security mitigation (specifically control flow integrity [1]) based on checking virtual table addresses for set membership. Placing only virtual tables (rather than virtual table groups) consecutively would likely increase the cache hit likelihood further and reduce the amount of metadata required to implement set membership checks.
>> 
>> In an experiment involving the Chromium web browser, I have measured a binary size decrease of 1.5%, and a median performance improvement of about 1% on Chromium's layout benchmarks when comparing a binary compiled with control flow integrity and whole-program virtual call optimization against a binary compiled with control flow integrity, whole-program virtual call optimization and a prototype implementation of vtable splitting.
>> 
>> Commentary
>> 
>> Although the ABI specifies [2] the calling convention for virtual calls, which requires the call to be made using the this-adjustment appropriate for the object from which the virtual table pointer was loaded, the as-if rule could in principle allow a program to make a call using a different virtual table if the virtual table group contains multiple secondary virtual tables, as the distance between these virtual tables would be fixed (the same would be possible for all virtual tables if the dynamic type were known, but in that case the program could just call the appropriate virtual function directly).
> 
> In what situation would the distance between secondary virtual tables in a VTT be fixed where you don't know the dynamic type?  Derived classes can always introduce or re-introduce virtual bases in ways that re-order the secondary virtual tables.

Okay, thinking about it more, the idea is that, because the enumeration order is depth-first, there will always be a local range of the compound v-table that contains the v-tables of the non-virtual bases for any given portion of the class hierarchy.  Because the secondary tables never have new function pointers added to them, they do not grow to the right; and because v-call offsets are always added to the primary v-table for a virtual base, they do not grow to the left.  Therefore, a secondary v-table of a non-virtual base is fixed in size, and so you could theoretically reach from one secondary v-table to another with a constant offset.  For this to be profitable, of course, you would have to have one secondary table already loaded when you tried to use the other; but that could happen.  So I agree that this would be a possible optimization today.
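
To make the scenario concrete, here is a minimal sketch of the kind of hierarchy I have in mind.  The class and function names are made up for illustration, and the comments describe what I would expect from the ABI's layout rules rather than anything this code checks.

```cpp
#include <cassert>

// Three unrelated polymorphic bases, each introducing one virtual function.
struct A { virtual int a() { return 1; } };
struct B { virtual int b() { return 2; } };
struct C { virtual int c() { return 3; } };

// X overrides nothing.  Its v-table group would be expected to hold the
// primary table (shared with A), then the secondary table for B-in-X,
// then the secondary table for C-in-X.
struct X : A, B, C {};

// A further-derived class can add virtual functions, but those extend the
// primary table only; the B-in-X and C-in-X secondary tables keep their size.
struct Y : X { virtual int y() { return 4; } };

// The dynamic type of x is unknown here.  Today a compiler loads the B
// v-table pointer for the first call and the C v-table pointer for the
// second.  The hypothetical optimization would instead derive the C table's
// address from the already-loaded B table address plus a constant.
int call_b_then_c(X& x) {
    return x.b() + x.c();
}
```

Calling call_b_then_c on an X and on a Y gives the same answer either way; the only question is which loads the compiler is allowed to emit to get there.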

>> The purported benefit would be to avoid an additional virtual pointer load from the object in cases where consecutive calls are made to virtual functions introduced in different bases. However, it seems to me that cases where this is beneficial would be rare: not only would you need at least three bases and a derived class which does not override any of the called virtual functions, but when performing two consecutive calls it seems likely that the vtable would need to be reloaded anyway, either from the object or from the stack, especially with majority caller-save ABIs such as x86-64, or in any event because the first virtual call may have changed the object's dynamic type.

This part of your argument is weak.  Putting the v-table in a callee-save register would be quite reasonable if you're doing many repeat calls.  I don't see why it would matter whether the majority of registers are callee-save as long as the absolute number is at least 2; even i386 gives us 3 general-purpose callee-save registers, and x86-64 has 5.  And it's undefined behavior to change a pointer's dynamic type like that, although that can be tricky to take advantage of.

That said, I would say that the trade-offs still break in your favor here.  The optimization potential of this sort of contrived situation (calls to virtual methods of two different secondary v-tables) doesn't outweigh the optimization potential of permitting non-standard organization of secondary v-tables.

>> It seems (according to experiments [3] carried out at godbolt.org) that all major compilers (gcc, clang, icc) do already use the appropriate vtable group and therefore are compliant with the proposed restriction.
>> 
>> (There would also seem to be nothing preventing an implementation from choosing to load the RTTI pointer or offset-to-top from another virtual table group. However I would consider this even less likely to be beneficial than a virtual call via another virtual table.)

I agree; I cannot imagine why an optimizer would deliberately do this when it could get the same information from a simpler source.

>> The ABI specifies that the vtables in a group shall be laid out consecutively when referenced via a vtable group symbol, and I'm not proposing to change this. The effect of this proposal would be to allow a vtable group to be split if the vtable group symbol is not referenced directly by name outside of the translation unit(s) participating in the optimization. This may be the case when a class has internal linkage, or if the program is linked with LTO, which allows the compiler to know which symbols are referenced outside of the LTO'd part of the program.
>> 
>> Wording
>> 
>> I propose to add two paragraphs to the section of the ABI describing virtual table groups, as follows:
>> 
>> diff --git a/abi.html b/abi.html
>> index 79cda2c..fce0c60 100644
>> --- a/abi.html
>> +++ b/abi.html
>> @@ -1193,6 +1193,18 @@ and again excluding primary bases
>>  (which share virtual tables with the classes for which they are primary).
>>  </ul>
>>  
>> +<p>
>> +When performing a virtual call or loading any other data from an address
>> +derived from the address point stored in an object's virtual table pointer,
>> +a program may only load from the virtual table associated with that address
>> +point, and not from any other virtual table in the same virtual table group
>> +which might be presumed to be located at a fixed offset from the address
>> +point as a result of the above layout algorithm.
>> +
>> +<p>
>> +The purpose of this restriction is to allow an implementation to split a
>> +virtual table group along virtual table boundaries if its symbol is not
>> +visible to other translation units.

I would say this more generally: the ABI does not make guarantees about the relative layout of v-tables in an object or a VTT.  It guarantees only the layout of the global symbol.  It does not guarantee that the v-table pointers actually installed in an object or a VTT will point into that global symbol.
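
As a toy illustration of what that does and does not promise, consider the model below.  It is only a sketch: the "tables" are plain arrays of function pointers with no offset-to-top or RTTI slots, and every name in it is hypothetical.

```cpp
#include <array>
#include <cassert>

// Toy model: a v-table is an array of slots, and a v-table pointer is a
// pointer to the first slot (standing in for the address point).
using Slot = int (*)();

int b_impl() { return 2; }
int c_impl() { return 3; }

// Contiguous layout, as in the global symbol: B's table, then C's table.
const std::array<Slot, 2> contiguous_group = {b_impl, c_impl};

// Split layout: each table is its own object at an unrelated address.
const std::array<Slot, 1> split_b = {b_impl};
const std::array<Slot, 1> split_c = {c_impl};

struct ToyObject {
    const Slot* vptr_b;  // installed pointer for the B subobject
    const Slot* vptr_c;  // installed pointer for the C subobject
};

ToyObject make_contiguous() {
    return ToyObject{contiguous_group.data(), contiguous_group.data() + 1};
}

ToyObject make_split() {
    return ToyObject{split_b.data(), split_c.data()};
}

// Each call loads only from the table named by its own pointer, so it works
// for both layouts.  Reaching C's slot as o.vptr_b[1] would work only for
// the contiguous layout, which is exactly what code may not assume.
int call_conforming(const ToyObject& o) {
    return o.vptr_b[0]() + o.vptr_c[0]();
}
```

Code written like call_conforming cannot tell the two layouts apart, which is the property the wording needs to preserve.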

John.

>>  
>>  <p>
>>  <a name="vtable-construction">
>> 
>> 
>> Thanks,
>> Peter
>> 
>> [0] http://lists.llvm.org/pipermail/llvm-dev/2016-January/094600.html
>> [1] http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
>> [2] https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller
>> [3] https://godbolt.org/g/wX7Ay6 is a three-bases test case by Richard Smith; https://godbolt.org/g/7eG8A1 is a dynamic-type-known test case by me
>> _______________________________________________
>> cxx-abi-dev mailing list
>> cxx-abi-dev at codesourcery.com
>> http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev