[cxx-abi-dev] Proposing an ABI restriction on loads from an object's vtable pointer

Peter Collingbourne pcc at google.com
Fri Jul 22 01:42:06 UTC 2016


Hi all,

The ABI currently requires that virtual tables for a class appear
consecutively in a virtual table group. I would like to propose a
restriction under which compilers may only access the virtual table
associated with the address point stored in an object's virtual table
pointer, and may not rely on any knowledge they may have about the
relative layout of other virtual tables in the virtual table group.

The purpose of this restriction is to allow an implementation to split a
virtual table group along virtual table boundaries.

Motivation

There are at least two scenarios which would benefit from vtable splitting:
clients which want to place data either before or after the ABI-required
part of a virtual table, and clients which want to control the layout of
virtual tables for performance or security reasons.

As an example of the first scenario, when performing whole-program virtual
call optimization, Clang will apply an optimization known as virtual
constant propagation [0], which causes data to be laid out at a specific
offset from the address point of each virtual table in a hierarchy. If that
virtual table appears in a virtual table group, padding is required to
place the data at an appropriate offset for each class. Because of the
current restriction that vtables must appear consecutively, the optimizer
may need to add more padding than necessary, or inhibit the optimization
entirely if it would require too much padding.

As an example of the second scenario, an implementation may wish to lay out
virtual tables hierarchically either in order to increase the likelihood of
a cache hit when repeatedly making the same virtual call over a set of
heterogeneous objects, or to efficiently implement a security mitigation
(specifically control flow integrity [1]) based on checking virtual table
addresses for set membership. Placing only virtual tables (rather than
virtual table groups) consecutively would likely increase the cache hit
likelihood further and reduce the amount of metadata required to implement
set membership checks.
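
To make the metadata point concrete, here is a minimal sketch of a
set-membership check over address points. It is a simplified model, not
Clang's actual implementation: the VTableSet type, the is_member function,
the stride and all addresses are hypothetical and chosen only for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: the valid address points for one class hierarchy,
// described as a base address, a slot size, and one bit per slot.
struct VTableSet {
  std::uintptr_t base;      // lowest candidate address point
  std::size_t stride;       // slot size in bytes (assumed uniform here)
  std::vector<bool> valid;  // valid[i] is true if slot i is a member
};

// Returns true if addr is one of the address points in the set.
bool is_member(const VTableSet &s, std::uintptr_t addr) {
  assert(s.stride != 0);
  if (addr < s.base) return false;            // below the region
  std::uintptr_t off = addr - s.base;
  if (off % s.stride != 0) return false;      // not on a slot boundary
  std::size_t slot = off / s.stride;
  return slot < s.valid.size() && s.valid[slot];
}

// Whole groups laid out consecutively: unrelated secondary tables sit
// between the members, so the bit vector has holes.
const VTableSet grouped{0x1000, 32, {true, false, false, true, false, false}};

// Only the relevant tables laid out consecutively: every slot is a member,
// so in this model the bit vector is shorter and could be replaced by a
// plain range check.
const VTableSet split{0x1000, 32, {true, true}};
```

In this model the grouped layout needs six bits to describe two members,
while the split layout needs two.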

In an experiment involving the Chromium web browser, I compared a binary
compiled with control flow integrity and whole-program virtual call
optimization against a binary compiled with both of those plus a prototype
implementation of vtable splitting. With vtable splitting, I measured a
binary size decrease of 1.5% and a median performance improvement of about
1% on Chromium's layout benchmarks.

Commentary

Although the ABI specifies [2] the calling convention for virtual calls,
which requires the call to be made using the this-adjustment appropriate
for the object from which the virtual table pointer was loaded, the as-if
rule could in principle allow a program to make a call using a different
virtual table if the virtual table group contains multiple secondary
virtual tables, as the distance between these virtual tables would be fixed
(the same would be possible for all virtual tables if the dynamic type were
known, but in that case the program could just call the appropriate virtual
function directly).

The purported benefit would be to avoid an additional virtual pointer load
from the object in cases where consecutive calls are made to virtual
functions introduced in different bases. However, it seems to me that cases
where this is beneficial would be rare. Not only would you need at least
three bases and a derived class which does not override any of the called
virtual functions, but when performing two consecutive calls it seems
likely that the vtable pointer would need to be reloaded anyway, either
from the object or from the stack. This is especially so with majority
caller-save ABIs such as x86-64, and in any event the first virtual call
may have changed the object's dynamic type. It seems (according to
experiments [3] carried out at godbolt.org) that all major compilers (gcc,
clang, icc) already use the appropriate vtable and are therefore compliant
with the proposed restriction.
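
For readers who want to picture the three-bases case, here is a small
sketch. It is my own illustration and not a copy of the test cases linked
in [3]; the names A, B, C, D, E and call_both are hypothetical.

```cpp
#include <cassert>

// Three unrelated polymorphic bases, each introducing one virtual function.
struct A { virtual int f() { return 1; } int a = 0; };
struct B { virtual int g() { return 2; } int b = 0; };
struct C { virtual int h() { return 3; } int c = 0; };

// D overrides nothing. Its vtable group holds the primary table (shared
// with A) followed by the secondary tables for B-in-D and C-in-D.
struct D : A, B, C {};

// Two consecutive calls to functions introduced in different bases.
int call_both(D *d) {
  // Load the vptr stored in the B subobject and call through that table.
  int x = d->g();
  // Load the vptr stored in the C subobject separately. Under the proposed
  // restriction the compiler may not instead derive this table's address
  // from the B-in-D address point plus a presumed fixed distance.
  int y = d->h();
  return x * 10 + y;
}

// The dynamic type may be more derived than D, so the calls stay virtual.
struct E : D { int g() override { return 7; } };
```

Calling call_both on a D object yields 23, and on an E object 73. The
source-level behaviour is the same either way; the restriction only
constrains which table the generated code may load from.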

(There would also seem to be nothing preventing an implementation from
choosing to load the RTTI pointer or offset-to-top from another virtual
table in the group. However I would consider this even less likely to be
beneficial than a virtual call via another virtual table.)

The ABI specifies that the vtables in a group shall be laid out
consecutively when referenced via a vtable group symbol, and I'm not
proposing to change this. The effect of this proposal would be to allow a
vtable group to be split if the vtable group symbol is not referenced
directly by name outside of the translation unit(s) participating in the
optimization.
This may be the case when a class has internal linkage, or if the program
is linked with LTO, which allows the compiler to know which symbols are
referenced outside of the LTO'd part of the program.
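
As a sketch of the internal-linkage case, the classes below live in an
unnamed namespace, so their vtable group symbols cannot be referenced by
name from another translation unit. The names Shape, Named, Triangle and
describe are hypothetical, and nothing here claims that any particular
compiler actually splits such a group.

```cpp
#include <cassert>

namespace {  // everything declared here has internal linkage

struct Shape {
  virtual int sides() const { return 0; }
  virtual ~Shape() = default;
};

struct Named {
  virtual char tag() const { return '?'; }
  virtual ~Named() = default;
};

// Triangle's vtable group has a primary table (shared with Shape) and a
// secondary table for Named-in-Triangle. Because no other translation
// unit can name the group symbol, an implementation would be free to
// place the two tables apart under the proposed restriction.
struct Triangle final : Shape, Named {
  int sides() const override { return 3; }
  char tag() const override { return 'T'; }
};

}  // namespace

// Each call goes through the vptr of the subobject it is made on.
int describe() {
  Triangle t;
  const Shape &s = t;
  const Named &n = t;
  return s.sides() * 256 + n.tag();
}
```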

Wording

I propose to add two paragraphs to the section of the ABI describing
virtual table groups, as follows:

diff --git a/abi.html b/abi.html
index 79cda2c..fce0c60 100644
--- a/abi.html
+++ b/abi.html
@@ -1193,6 +1193,18 @@ and again excluding primary bases
 (which share virtual tables with the classes for which they are primary).
 </ul>

+<p>
+When performing a virtual call or loading any other data from an address
+derived from the address point stored in an object's virtual table pointer,
+a program may only load from the virtual table associated with that address
+point, and not from any other virtual table in the same virtual table group
+which might be presumed to be located at a fixed offset from the address
+point as a result of the above layout algorithm.
+
+<p>
+The purpose of this restriction is to allow an implementation to split a
+virtual table group along virtual table boundaries if its symbol is not
+visible to other translation units.

 <p>
 <a name="vtable-construction">


Thanks,
Peter

[0] http://lists.llvm.org/pipermail/llvm-dev/2016-January/094600.html
[1] http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
[2] https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller
[3] https://godbolt.org/g/wX7Ay6 is a three-bases test case by Richard
Smith, https://godbolt.org/g/7eG8A1 is a dynamic-type-known test case by me