From jwakely at redhat.com Fri Jul 1 14:13:52 2016
From: jwakely at redhat.com (Jonathan Wakely)
Date: Fri, 1 Jul 2016 15:13:52 +0100
Subject: [cxx-abi-dev] Non-trivial move constructor
In-Reply-To:
References: <56CDB628.8000302@redhat.com> <225867D4-033A-4AA5-8BD7-0741E22743DF@apple.com>
Message-ID: <20160701141352.GW7722@redhat.com>

On 22/06/16 12:59 -0700, Reid Kleckner wrote:
>This bug still isn't fixed in Clang. It's
>https://llvm.org/bugs/show_bug.cgi?id=19668. You should probably go ahead
>and update the document.

It's probably also the cause of
https://llvm.org/bugs/show_bug.cgi?id=23034
which I've been asked about (because it involves the libstdc++
std::tuple).

Is the current status that Clang is still believed to require a
change, and that G++ is doing the right thing already?

>On Wed, Feb 24, 2016 at 10:41 PM, John McCall wrote:
>
>> > On Feb 24, 2016, at 1:14 PM, Richard Smith wrote:
>> > On 24 February 2016 at 12:56, John McCall wrote:
>> >>> On Feb 24, 2016, at 11:43 AM, Richard Smith wrote:
>> >>> On 24 February 2016 at 05:54, Jason Merrill wrote:
>> >>>> On 02/24/2016 05:51 AM, Marc Glisse wrote:
>> >>>>>
>> >>>>> in 3.1.1, we use "In the special case where the parameter type has a
>> >>>>> non-trivial copy constructor or destructor" to force passing by
>> >>>>> reference. It seems that for C++11, this should also include move
>> >>>>> constructors, for the same reasons.
>> >>>>
>> >>>> We talked about adding move constructors to that sentence years ago.
>> >>>> Did it never make it into the spec?
>> >>>
>> >>> Looks like it didn't. The rule we ended up with was:
>> >>>
>> >>> "[Pass an object of class type by value if] every copy constructor and
>> >>> move constructor is deleted or trivial and at least one of them is not
>> >>> deleted, and the destructor is trivial."
>> >>>
>> >>> However, this seems overly-cautious to me; it would seem sufficient
>> >>> for there to be at least one copy or move constructor that is trivial
>> >>> and not deleted, and a trivial destructor. It's not really
>> >>> particularly plausible for there to be a trivial copy and a
>> >>> non-trivial move or vice versa, but it *is* plausible for there to be
>> >>> two non-deleted copy constructors -- a trivial one, and one that takes
>> >>> a const volatile reference -- and in that case, passing through
>> >>> registers seems completely reasonable. How about changing the rule in
>> >>> 3.1.1 bullet 1 to:
>> >>>
>> >>> "In the special case where the parameter type does not have both a
>> >>> trivial destructor and at least one trivial copy or move constructor
>> >>> that is not deleted, the caller must allocate space for a temporary
>> >>> copy, and pass the resulting copy by reference (below). Specifically
>> >>> [...]"
>> >>
>> >> I agree with your proposal in theory, but I'm concerned about changing
>> >> the ABI at this point. We *are* talking about the language standard
>> >> that was released six years ago, and an area of that standard that was
>> >> theoretically fully implemented by compilers several years before that.
>> >>
>> >> Do we understand the scope of the ABI disagreement between GCC and
>> >> Clang here? What do other compilers do?
>> >
>> > Clang's rule is the one in the ABI: a class is passed indirectly if it
>> > has a non-trivial destructor or a non-trivial copy constructor. This
>> > rule definitely needs some adjustment, because it's not meaningful to
>> > ask whether an implicitly-deleted function is trivial.
>>
>> That sounds like it's on us to fix. Do GCC and other compilers correctly
>> implement the rule that we agreed on? If so, I'll go ahead and apply
>> the change to the ABI document, and we should fix this in clang.
>>
>> John.
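
For illustration only, here is a rough sketch of the rule quoted above
("every copy constructor and move constructor is deleted or trivial and at
least one of them is not deleted, and the destructor is trivial"). It is
not text from the ABI document. The helper name passed_by_value_approx and
the four example types are hypothetical, and the standard type traits only
approximate the rule: they test whether a copy or move *expression* is
usable and trivial, not each individual constructor.

```cpp
#include <type_traits>

// Hypothetical helper: approximates the quoted rule with standard traits.
// A copy or move that is not usable (for example, deleted) is treated as
// "deleted"; a usable one must be trivial.
template <class T>
constexpr bool passed_by_value_approx() {
    return std::is_trivially_destructible<T>::value &&
           (!std::is_copy_constructible<T>::value ||
            std::is_trivially_copy_constructible<T>::value) &&
           (!std::is_move_constructible<T>::value ||
            std::is_trivially_move_constructible<T>::value) &&
           (std::is_copy_constructible<T>::value ||
            std::is_move_constructible<T>::value);
}

// Everything trivial: eligible to be passed by value.
struct Plain { int x; };

// Trivial copy constructor, user-provided move constructor. A rule that
// looks only at the copy constructor and destructor accepts this type;
// the quoted rule does not.
struct NonTrivialMove {
    NonTrivialMove() = default;
    NonTrivialMove(const NonTrivialMove&) = default;
    NonTrivialMove(NonTrivialMove&&) noexcept {}
    int x = 0;
};

// Deleted copy constructor, trivial move constructor: still eligible,
// because at least one of the two is trivial and not deleted.
struct MoveOnly {
    MoveOnly() = default;
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&&) = default;
    int x = 0;
};

// Non-trivial destructor: always passed by reference.
struct WithDtor {
    ~WithDtor() {}
    int x;
};

static_assert(passed_by_value_approx<Plain>(), "all trivial");
static_assert(!passed_by_value_approx<NonTrivialMove>(), "non-trivial move");
static_assert(passed_by_value_approx<MoveOnly>(), "deleted copy, trivial move");
static_assert(!passed_by_value_approx<WithDtor>(), "non-trivial destructor");
```

NonTrivialMove is the kind of type on which a copy-only rule and the
move-aware rule give different answers, which is why two compilers that
implement different rules can disagree on how to pass it.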
>> _______________________________________________
>> cxx-abi-dev mailing list
>> cxx-abi-dev at codesourcery.com
>> http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev

From rjmccall at apple.com Fri Jul 1 15:43:02 2016
From: rjmccall at apple.com (John McCall)
Date: Fri, 1 Jul 2016 08:43:02 -0700
Subject: [cxx-abi-dev] Non-trivial move constructor
In-Reply-To: <20160701141352.GW7722@redhat.com>
References: <56CDB628.8000302@redhat.com> <225867D4-033A-4AA5-8BD7-0741E22743DF@apple.com> <20160701141352.GW7722@redhat.com>
Message-ID: <185E047E-9BF5-4DFE-ABA7-7584A0FF5CF0@apple.com>

> On Jul 1, 2016, at 7:13 AM, Jonathan Wakely wrote:
> On 22/06/16 12:59 -0700, Reid Kleckner wrote:
>> This bug still isn't fixed in Clang. It's
>> https://llvm.org/bugs/show_bug.cgi?id=19668. You should probably go ahead
>> and update the document.
>
> It's probably also the cause of
> https://llvm.org/bugs/show_bug.cgi?id=23034
> which I've been asked about (because it involves the libstdc++
> std::tuple).
>
> Is the current status that Clang is still believed to require a
> change, and that G++ is doing the right thing already?

Yes.

John.

From richardsmith at google.com Wed Jul 20 01:04:07 2016
From: richardsmith at google.com (Richard Smith)
Date: Tue, 19 Jul 2016 18:04:07 -0700
Subject: [cxx-abi-dev] C++ ABI version 2
In-Reply-To:
References:
Message-ID:

Another item for the list:

Variadic virtual functions with covariant return types are currently
problematic: it's not possible in general to generate an adjustor thunk for
them, because it's not possible in general to forward a (non-tail) varargs
call. Similar problems exist for the conversion to function pointer in a
non-capturing varargs lambda.

We can fix this by changing the calling convention for varargs non-static
member functions so that they are passed a va_list object directly (that
is, effectively put the va_start / va_end into the caller, and convert a
va_start in the callee into a va_copy from the va_list argument). Then
forwarding the varargs becomes trivial.
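
A hand-written sketch of the va_list convention described above, using
ordinary functions in place of compiler-generated code. All names here
(make_impl, make_thunk, call_make and the A/B/C hierarchy) are
hypothetical; the sketch only shows why a thunk that must adjust the
result after the call can forward a va_list, whereas it could not forward
a raw "..." argument list.

```cpp
#include <cassert>
#include <cstdarg>

struct A { int a = 1; };
struct B { int b = 2; };
struct C : A, B { int total = 0; };

// Callee under the proposed convention: it receives a va_list from its
// caller and uses va_copy where it would otherwise have used va_start.
C* make_impl(C* self, int count, va_list caller_args) {
    assert(count >= 0);
    va_list args;
    va_copy(args, caller_args);
    self->total = 0;
    for (int i = 0; i < count; ++i) {
        self->total += va_arg(args, int);
    }
    va_end(args);
    return self;
}

// Stand-in for a covariant-return thunk. The call is not a tail call,
// because the returned pointer has to be adjusted from C* to B*.
// Forwarding works because the variadic arguments arrive as a va_list.
B* make_thunk(C* self, int count, va_list caller_args) {
    C* result = make_impl(self, count, caller_args);
    return static_cast<B*>(result);
}

// Caller side: va_start and va_end now live here.
B* call_make(C* self, int count, ...) {
    va_list args;
    va_start(args, count);
    B* result = make_thunk(self, count, args);
    va_end(args);
    return result;
}
```

For example, with a C object c, call_make(&c, 3, 10, 20, 30) sums the three
variadic arguments into c.total and returns a pointer to the B subobject
of c.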
(It seems preferable to apply this change to all non-static member
functions, not just virtual functions, so that we don't need to emit two
quite different codepaths for a call through a pointer to member.)

On 12 May 2015 at 17:29, Richard Smith wrote:

> Another item for the Itanium C++ ABI version 2 list:
>
> The ABI currently specifies that the initial guard variable load is an
> acquire load (3.3.2, "An implementation supporting thread-safety on
> multiprocessor systems must also guarantee that references to the
> initialized object do not occur before the load of the initialization flag.
> On Itanium, this can be done by using a ld1.acq operation to load the
> flag.").
>
> This is inefficient on systems where an acquire load requires a fence.
> Using an algorithm due to Mike Burrows (described in the appendix of
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2660.htm) the
> same interface can be implemented starting with a relaxed load, where the
> acquire operation is performed only the first time each thread hits the
> initialization.
>
> On 19 November 2013 at 17:57, Richard Smith wrote:
>
>> Hi,
>>
>> There are a few things in the current ABI which are known to be
>> suboptimal, but we cannot change because doing so would introduce an ABI
>> break. However, vendors sometimes get an opportunity to break their ABI (or
>> are defining a new ABI), and for some vendors, this is a very common
>> occurrence. To this end, I think it would be valuable for the ABI document
>> to describe what we might want to put in a 'Version 2' of the ABI; that is,
>> a set of changes that we recommend be made whenever a vendor has a chance
>> to introduce an ABI break.
>>
>> (Or perhaps this should be viewed from the opposite perspective: we could
>> make improvements to the ABI, with an annex listing changes that old
>> platforms must make for compatibility.)
>>
>> Would there be support for this idea?
>>
>> In off-line discussion with John McCall, we came up with the following
>> list of potential changes that might be made (sorry if I forgot any):
>>
>> * Make constructors and destructors return 'this' instead of returning
>>   'void', in order to allow callers to avoid a reload in common cases and
>>   to allow more tail calls.
>> * Simplify case 2b in non-POD class layout.
>> * Make virtual functions that are defined as 'inline' not be key
>>   functions
>> * Fix the bug that -1 is both the null pointer-to-data-member value and
>>   also a valid value of a pointer-to-data-member (could use SIZE_MIN
>>   instead)
>> * Relax the definition of POD used in the ABI, in order to allow more
>>   class types to be passed in registers
>>
>> Are there any other things that it would make sense to change in a
>> version 2 of the ABI?
>>
>> Also, would there be any support for documenting common deviations from
>> the ABI that platform vendors might want to consider when specifying their
>> own ABIs? In addition to some of the above, this would also include:
>>
>> * Representation of pointers-to-member-functions (in particular, the
>>   current representation assumes that the lowest bit of a function
>>   pointer is unused, which isn't true in general)
>> * Representation of guard variables (some platforms use the native word
>>   size rather than forcing this to be 64 bits wide)
>>
>> Are there any others?
>>
>> Thanks!

From hstong at ca.ibm.com Thu Jul 21 00:24:42 2016
From: hstong at ca.ibm.com (Hubert Tong)
Date: Wed, 20 Jul 2016 20:24:42 -0400
Subject: [cxx-abi-dev] C++ ABI version 2
In-Reply-To:
References:
Message-ID:

I believe at least the covariant return case can be solved with alternative
function entry points which record the adjustments necessary on return.

Of course, the va_list option can still be presented.
-- HT From: Richard Smith To: "cxx-abi-dev at codesourcery.com" Date: 19-07-2016 09:04 p.m. Subject: Re: [cxx-abi-dev] C++ ABI version 2 Sent by: cxx-abi-dev-bounces at codesourcery.com Another item for the list: Variadic virtual functions with covariant return types are currently problematic: it's not possible in general to generate an adjustor thunk for them, because it's not possible in general to forward a (non-tail) varargs call. Similar problems exist for the conversion to function pointer in a non-capturing varargs lambda. We can fix this by changing the calling convention for varargs non-static member functions so that they are passed a va_list object directly (that is, effectively put the va_start / va_end into the caller, and convert a va_start in the callee into a va_copy from the va_list argument). Then forwarding the varargs become trivial. (It seems preferable to apply this change to all non-static member functions, not just virtual functions, so that we don't need to emit two quite different codepaths for a call through a pointer to member.) On 12 May 2015 at 17:29, Richard Smith wrote: Another item for the Itanium C++ ABI version 2 list: The ABI currently specifies that the initial guard variable load is an acquire load (3.3.2, "An implementation supporting thread-safety on multiprocessor systems must also guarantee that references to the initialized object do not occur before the load of the initialization flag. On Itanium, this can be done by using a ld1.acq operation to load the flag."). This is inefficient on systems where an acquire load requires a fence. Using an algorithm due to Mike Burrows (described in the appendix of http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2660.htm) the same interface can be implemented starting with a relaxed load, where the acquire operation is performed only the first time each thread hits the initialization. 
On 19 November 2013 at 17:57, Richard Smith wrote: Hi, There are a few things in the current ABI which are known to be suboptimal, but we cannot change because doing so would introduce an ABI break. However, vendors sometimes get an opportunity to break their ABI (or are defining a new ABI), and for some vendors, this is a very common occurrence. To this end, I think it would be valuable for the ABI document to describe what we might want to put in a 'Version 2' of the ABI; that is, a set of changes that we recommend be made whenever a vendor has a chance to introduce an ABI break. (Or perhaps this should be viewed from the opposite perspective: we could make improvements to the ABI, with an annex listing changes that old platforms must make for compatibility.) Would there be support for this idea? In off-line discussion with John McCall, we came up with the following list of potential changes that might be made (sorry if I forgot any): ?* Make constructors and destructors return 'this' instead of returning 'void', in order to allow callers to avoid a reload in common cases and to allow more tail calls. ?* Simplify case 2b in non-POD class layout. ?* Make virtual functions that are defined as 'inline' not be key functions ?* Fix the bug that -1 is both the null pointer-to-data-member value and also a valid value of a pointer-to-data-member (could use SIZE_MIN instead) ?* Relax the definition of POD used in the ABI, in order to allow more class types to be passed in registers Are there any other things that it would make sense to change in a version 2 of the ABI? Also, would there be any support for documenting common deviations from the ABI that platform vendors might want to consider when specifying their own ABIs? 
In addition to some of the above, this would also include: ?* Representation of pointers-to-member-functions (in particular, the current representation assumes that the lowest bit of a function pointer is unused, which isn't true in general) ?* Representation of guard variables (some platforms use the native word size rather than forcing this to be 64 bits wide) Are there any others? Thanks! _______________________________________________ cxx-abi-dev mailing list cxx-abi-dev at codesourcery.com http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From richardsmith at google.com Thu Jul 21 00:30:42 2016 From: richardsmith at google.com (Richard Smith) Date: Wed, 20 Jul 2016 17:30:42 -0700 Subject: [cxx-abi-dev] C++ ABI version 2 In-Reply-To: References: Message-ID: On 20 July 2016 at 17:24, Hubert Tong wrote: > I believe at least the covariant return case can be solved with > alternative function entry points which record the adjustments necessary on > return. > A constant adjustment is not sufficient if you're converting to a virtual base. > Of course, the va_list option can still be presented. > > -- HT > > [image: Inactive hide details for Richard Smith ---19-07-2016 09:04:25 > p.m.---Another item for the list: Variadic virtual functions wit]Richard > Smith ---19-07-2016 09:04:25 p.m.---Another item for the list: Variadic > virtual functions with covariant return types are currently > > From: Richard Smith > To: "cxx-abi-dev at codesourcery.com" > Date: 19-07-2016 09:04 p.m. 
> Subject: Re: [cxx-abi-dev] C++ ABI version 2 > Sent by: cxx-abi-dev-bounces at codesourcery.com > ------------------------------ > > > > Another item for the list: > > Variadic virtual functions with covariant return types are currently > problematic: it's not possible in general to generate an adjustor thunk for > them, because it's not possible in general to forward a (non-tail) varargs > call. Similar problems exist for the conversion to function pointer in a > non-capturing varargs lambda. > > We can fix this by changing the calling convention for varargs non-static > member functions so that they are passed a va_list object directly (that > is, effectively put the va_start / va_end into the caller, and convert a > va_start in the callee into a va_copy from the va_list argument). Then > forwarding the varargs become trivial. > > (It seems preferable to apply this change to all non-static member > functions, not just virtual functions, so that we don't need to emit two > quite different codepaths for a call through a pointer to member.) > > On 12 May 2015 at 17:29, Richard Smith <*richardsmith at google.com* > > wrote: > > Another item for the Itanium C++ ABI version 2 list: > > The ABI currently specifies that the initial guard variable load is an > acquire load (3.3.2, "An implementation supporting thread-safety on > multiprocessor systems must also guarantee that references to the > initialized object do not occur before the load of the initialization flag. > On Itanium, this can be done by using a ld1.acq operation to load the > flag."). > > This is inefficient on systems where an acquire load requires a fence. > Using an algorithm due to Mike Burrows (described in the appendix of > *http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2660.htm* > ) > the same interface can be implemented starting with a relaxed load, where > the acquire operation is performed only the first time each thread hits the > initialization. 
> > On 19 November 2013 at 17:57, Richard Smith <*richardsmith at google.com* > > wrote: > Hi, > > There are a few things in the current ABI which are known to be > suboptimal, but we cannot change because doing so would introduce an ABI > break. However, vendors sometimes get an opportunity to break their ABI (or > are defining a new ABI), and for some vendors, this is a very common > occurrence. To this end, I think it would be valuable for the ABI document > to describe what we might want to put in a 'Version 2' of the ABI; that is, > a set of changes that we recommend be made whenever a vendor has a chance > to introduce an ABI break. > > (Or perhaps this should be viewed from the opposite perspective: we > could make improvements to the ABI, with an annex listing changes that old > platforms must make for compatibility.) > > Would there be support for this idea? > > > In off-line discussion with John McCall, we came up with the following > list of potential changes that might be made (sorry if I forgot any): > > * Make constructors and destructors return 'this' instead of > returning 'void', in order to allow callers to avoid a reload in common > cases and to allow more tail calls. > * Simplify case 2b in non-POD class layout. > * Make virtual functions that are defined as 'inline' not be key > functions > * Fix the bug that -1 is both the null pointer-to-data-member value > and also a valid value of a pointer-to-data-member (could use SIZE_MIN > instead) > * Relax the definition of POD used in the ABI, in order to allow more > class types to be passed in registers > > Are there any other things that it would make sense to change in a > version 2 of the ABI? > > > Also, would there be any support for documenting common deviations > from the ABI that platform vendors might want to consider when specifying > their own ABIs? 
In addition to some of the above, this would also include: > > * Representation of pointers-to-member-functions (in particular, the > current representation assumes that the lowest bit of a function pointer is > unused, which isn't true in general) > * Representation of guard variables (some platforms use the native > word size rather than forcing this to be 64 bits wide) > > Are there any others? > > > Thanks! > > _______________________________________________ > cxx-abi-dev mailing list > cxx-abi-dev at codesourcery.com > http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From hstong at ca.ibm.com Thu Jul 21 00:33:29 2016 From: hstong at ca.ibm.com (Hubert Tong) Date: Wed, 20 Jul 2016 20:33:29 -0400 Subject: [cxx-abi-dev] C++ ABI version 2 In-Reply-To: References: Message-ID: ... and I can record the adjustments necessary as a pointer to a function. -- HT From: Richard Smith To: Hubert Tong/Toronto/IBM at IBMCA Cc: "cxx-abi-dev at codesourcery.com" Date: 20-07-2016 08:30 p.m. Subject: Re: [cxx-abi-dev] C++ ABI version 2 On 20 July 2016 at 17:24, Hubert Tong wrote: I believe at least the covariant return case can be solved with alternative function entry points which record the adjustments necessary on return. A constant adjustment is not sufficient if you're converting to a virtual base. Of course, the va_list option can still be presented. -- HT Inactive hide details for Richard Smith ---19-07-2016 09:04:25 p.m.---Another item for the list: Variadic virtual functions witRichard Smith ---19-07-2016 09:04:25 p.m.---Another item for the list: Variadic virtual functions with covariant return types are currently From: Richard Smith To: "cxx-abi-dev at codesourcery.com" Date: 19-07-2016 09:04 p.m. 
Subject: Re: [cxx-abi-dev] C++ ABI version 2 Sent by: cxx-abi-dev-bounces at codesourcery.com Another item for the list: Variadic virtual functions with covariant return types are currently problematic: it's not possible in general to generate an adjustor thunk for them, because it's not possible in general to forward a (non-tail) varargs call. Similar problems exist for the conversion to function pointer in a non-capturing varargs lambda. We can fix this by changing the calling convention for varargs non-static member functions so that they are passed a va_list object directly (that is, effectively put the va_start / va_end into the caller, and convert a va_start in the callee into a va_copy from the va_list argument). Then forwarding the varargs become trivial. (It seems preferable to apply this change to all non-static member functions, not just virtual functions, so that we don't need to emit two quite different codepaths for a call through a pointer to member.) On 12 May 2015 at 17:29, Richard Smith wrote: Another item for the Itanium C++ ABI version 2 list: The ABI currently specifies that the initial guard variable load is an acquire load (3.3.2, "An implementation supporting thread-safety on multiprocessor systems must also guarantee that references to the initialized object do not occur before the load of the initialization flag. On Itanium, this can be done by using a ld1.acq operation to load the flag."). This is inefficient on systems where an acquire load requires a fence. Using an algorithm due to Mike Burrows (described in the appendix of http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2660.htm) the same interface can be implemented starting with a relaxed load, where the acquire operation is performed only the first time each thread hits the initialization. 
On 19 November 2013 at 17:57, Richard Smith < richardsmith at google.com> wrote: Hi, There are a few things in the current ABI which are known to be suboptimal, but we cannot change because doing so would introduce an ABI break. However, vendors sometimes get an opportunity to break their ABI (or are defining a new ABI), and for some vendors, this is a very common occurrence. To this end, I think it would be valuable for the ABI document to describe what we might want to put in a 'Version 2' of the ABI; that is, a set of changes that we recommend be made whenever a vendor has a chance to introduce an ABI break. (Or perhaps this should be viewed from the opposite perspective: we could make improvements to the ABI, with an annex listing changes that old platforms must make for compatibility.) Would there be support for this idea? In off-line discussion with John McCall, we came up with the following list of potential changes that might be made (sorry if I forgot any): ?* Make constructors and destructors return 'this' instead of returning 'void', in order to allow callers to avoid a reload in common cases and to allow more tail calls. ?* Simplify case 2b in non-POD class layout. ?* Make virtual functions that are defined as 'inline' not be key functions ?* Fix the bug that -1 is both the null pointer-to-data-member value and also a valid value of a pointer-to-data-member (could use SIZE_MIN instead) ?* Relax the definition of POD used in the ABI, in order to allow more class types to be passed in registers Are there any other things that it would make sense to change in a version 2 of the ABI? Also, would there be any support for documenting common deviations from the ABI that platform vendors might want to consider when specifying their own ABIs? 
In addition to some of the above, this would also include: ?* Representation of pointers-to-member-functions (in particular, the current representation assumes that the lowest bit of a function pointer is unused, which isn't true in general) ?* Representation of guard variables (some platforms use the native word size rather than forcing this to be 64 bits wide) Are there any others? Thanks! _______________________________________________ cxx-abi-dev mailing list cxx-abi-dev at codesourcery.com http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From rnk at google.com Thu Jul 21 13:59:29 2016 From: rnk at google.com (Reid Kleckner) Date: Thu, 21 Jul 2016 09:59:29 -0400 Subject: [cxx-abi-dev] C++ ABI version 2 In-Reply-To: References: Message-ID: Alternative function entry points seem like too much of a burden on the compiler implementing the ABI. I'm not sure how I would transliterate that to C either. Translating va_start to va_copy in variadic virtual functions with a covariant return type seems much simpler to me. On Wed, Jul 20, 2016 at 8:33 PM, Hubert Tong wrote: > ... and I can record the adjustments necessary as a pointer to a function. > > -- HT > > [image: Inactive hide details for Richard Smith ---20-07-2016 08:30:52 > p.m.---On 20 July 2016 at 17:24, Hubert Tong ]Richard > Smith ---20-07-2016 08:30:52 p.m.---On 20 July 2016 at 17:24, Hubert Tong < > hstong at ca.ibm.com> wrote: > I believe at least the covariant > > From: Richard Smith > To: Hubert Tong/Toronto/IBM at IBMCA > Cc: "cxx-abi-dev at codesourcery.com" > Date: 20-07-2016 08:30 p.m. 
> Subject: Re: [cxx-abi-dev] C++ ABI version 2 > ------------------------------ > > > > On 20 July 2016 at 17:24, Hubert Tong <*hstong at ca.ibm.com* > > wrote: > > I believe at least the covariant return case can be solved with > alternative function entry points which record the adjustments necessary on > return. > > > A constant adjustment is not sufficient if you're converting to a virtual > base. > > Of course, the va_list option can still be presented. > > -- HT > > [image: Inactive hide details for Richard Smith ---19-07-2016 09:04:25 > p.m.---Another item for the list: Variadic virtual functions wit]Richard > Smith ---19-07-2016 09:04:25 p.m.---Another item for the list: Variadic > virtual functions with covariant return types are currently > > From: Richard Smith <*richardsmith at google.com* > > > To: "*cxx-abi-dev at codesourcery.com* " < > *cxx-abi-dev at codesourcery.com* > > Date: 19-07-2016 09:04 p.m. > Subject: Re: [cxx-abi-dev] C++ ABI version 2 > Sent by: *cxx-abi-dev-bounces at codesourcery.com* > > ------------------------------ > > > > > Another item for the list: > > Variadic virtual functions with covariant return types are currently > problematic: it's not possible in general to generate an adjustor thunk for > them, because it's not possible in general to forward a (non-tail) varargs > call. Similar problems exist for the conversion to function pointer in a > non-capturing varargs lambda. > > We can fix this by changing the calling convention for varargs > non-static member functions so that they are passed a va_list object > directly (that is, effectively put the va_start / va_end into the caller, > and convert a va_start in the callee into a va_copy from the va_list > argument). Then forwarding the varargs become trivial. > > (It seems preferable to apply this change to all non-static member > functions, not just virtual functions, so that we don't need to emit two > quite different codepaths for a call through a pointer to member.) 
> > On 12 May 2015 at 17:29, Richard Smith <*richardsmith at google.com* > > wrote: > Another item for the Itanium C++ ABI version 2 list: > > The ABI currently specifies that the initial guard variable load > is an acquire load (3.3.2, "An implementation supporting thread-safety on > multiprocessor systems must also guarantee that references to the > initialized object do not occur before the load of the initialization flag. > On Itanium, this can be done by using a ld1.acq operation to load the > flag."). > > This is inefficient on systems where an acquire load requires a > fence. Using an algorithm due to Mike Burrows (described in the appendix of > *http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2660.htm* > ) > the same interface can be implemented starting with a relaxed load, where > the acquire operation is performed only the first time each thread hits the > initialization. > > On 19 November 2013 at 17:57, Richard Smith < > *richardsmith at google.com* > wrote: > Hi, > > There are a few things in the current ABI which are known to be > suboptimal, but we cannot change because doing so would introduce an ABI > break. However, vendors sometimes get an opportunity to break their ABI (or > are defining a new ABI), and for some vendors, this is a very common > occurrence. To this end, I think it would be valuable for the ABI document > to describe what we might want to put in a 'Version 2' of the ABI; that is, > a set of changes that we recommend be made whenever a vendor has a chance > to introduce an ABI break. > > (Or perhaps this should be viewed from the opposite perspective: > we could make improvements to the ABI, with an annex listing changes that > old platforms must make for compatibility.) > > Would there be support for this idea? 
> > > In off-line discussion with John McCall, we came up with the > following list of potential changes that might be made (sorry if I forgot > any): > > * Make constructors and destructors return 'this' instead of > returning 'void', in order to allow callers to avoid a reload in common > cases and to allow more tail calls. > * Simplify case 2b in non-POD class layout. > * Make virtual functions that are defined as 'inline' not be > key functions > * Fix the bug that -1 is both the null pointer-to-data-member > value and also a valid value of a pointer-to-data-member (could use > SIZE_MIN instead) > * Relax the definition of POD used in the ABI, in order to > allow more class types to be passed in registers > > Are there any other things that it would make sense to change in > a version 2 of the ABI? > > > Also, would there be any support for documenting common > deviations from the ABI that platform vendors might want to consider when > specifying their own ABIs? In addition to some of the above, this would > also include: > > * Representation of pointers-to-member-functions (in > particular, the current representation assumes that the lowest bit of a > function pointer is unused, which isn't true in general) > * Representation of guard variables (some platforms use the > native word size rather than forcing this to be 64 bits wide) > > Are there any others? > > > Thanks! > _______________________________________________ > cxx-abi-dev mailing list > *cxx-abi-dev at codesourcery.com* > *http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev* > > > > > > > > _______________________________________________ > cxx-abi-dev mailing list > cxx-abi-dev at codesourcery.com > http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
From jason at redhat.com Thu Jul 21 17:53:31 2016
From: jason at redhat.com (Jason Merrill)
Date: Thu, 21 Jul 2016 13:53:31 -0400
Subject: [cxx-abi-dev] Passing an empty class by value
In-Reply-To: <56D11FA2.3010408@redhat.com>
References: <38C37E44FD352B44ABFC58410B0790D0901271A2@ORSMSX103.amr.corp.intel.com> <42A290F1-70B3-4BC1-A4F5-F42051DB7629@apple.com> <566B1803.8070201@redhat.com> <56D11FA2.3010408@redhat.com>
Message-ID:

On Fri, Feb 26, 2016 at 11:01 PM, Jason Merrill wrote:
> I also notice that the ABI says "If the base ABI does not specify rules for
> empty classes, then an empty class has size and alignment 1."

It also says, "Empty classes will be passed no differently from ordinary
classes.... The contents of the single byte parameter slot are unspecified,
and the callee may not depend on any particular value." and "A result of an
empty class type will be returned as though it were a struct containing a
single char, i.e. struct S { char c; };. The actual content of the return
register is unspecified."

If we want the (new) psABI wording to override this, we need to update these
rules by referring to the base ABI in these passages as well.

Jason

From jason at redhat.com Thu Jul 21 18:02:48 2016
From: jason at redhat.com (Jason Merrill)
Date: Thu, 21 Jul 2016 14:02:48 -0400
Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding
Message-ID:

P0135 seems to require that we elide the copy when using the result of
a function returning by value to initialize a base class subobject,
but the ABI doesn't currently require that such a function avoid
clobbering tail padding when initializing its return object.
Thoughts?
Jason From rjmccall at apple.com Thu Jul 21 18:45:02 2016 From: rjmccall at apple.com (John McCall) Date: Thu, 21 Jul 2016 11:45:02 -0700 Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding In-Reply-To: References: Message-ID: <0BB4FB95-1332-4FCA-8F63-6F6E18C549F9@apple.com> > On Jul 21, 2016, at 11:02 AM, Jason Merrill wrote: > P0135 seems to require that we elide the copy when using the result of > a function returning by value to initialize a base class subobject, > but the ABI doesn't currently require that such a function avoid > clobbering tail padding when initializing its return object. > Thoughts? This is not possible in general. A function returning by value returns a complete object, i.e. one with its own virtual base subobjects. We have no choice but to emit that to a temporary and move out of the non-virtual subobject. The next semantic question is whether it's compatible with NRVO, i.e. whether there are guarantees about the existence of padding on named local variables. Assuming that it's possible in some definable cases (and I think you could easily revise the standard to make it only apply to classes without v-bases), it seems abstractly reasonable. Certainly it's generally preferable to avoid a high-level copy/move + destroy pair than to use a larger store at the end of very specific initializers. As an implementor, I think I'm most worried about how this + NRVO would mess up our existing peepholes that assume the existence of tail padding on certain complete objects. John. 
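[Editor's illustration, not part of the original message.] For readers unfamiliar with tail-padding reuse, the sketch below shows how to observe it. The types Base and Derived are hypothetical. Whether Derived::e actually lands in the tail padding of Base depends on the platform ABI; under the Itanium C++ ABI it is expected to, because Base has a user-provided destructor and is therefore not a POD for layout purposes.

```cpp
#include <cstddef>

// Hypothetical types for illustration; they are not taken from the thread.
struct Base {
    ~Base() {}  // user-provided destructor
    int n;
    char d;     // typically followed by padding up to the alignment of int
};

struct Derived : Base {
    char e;     // candidate for placement in Base's tail padding
};

// Byte offset of Derived::e from the start of a Derived object.
std::ptrdiff_t offset_of_e() {
    Derived obj;
    return reinterpret_cast<char*>(&obj.e) - reinterpret_cast<char*>(&obj);
}

// True when e lies within the bytes that sizeof(Base) covers, i.e. when the
// ABI has reused the tail padding of the Base subobject.
bool e_is_in_base_tail_padding() {
    return offset_of_e() < static_cast<std::ptrdiff_t>(sizeof(Base));
}
```

When e_is_in_base_tail_padding() returns true, any store that writes all sizeof(Base) bytes of the Base subobject also overwrites e. That is the hazard under discussion when a function returning Base by value constructs its result directly into a base subobject.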
From richardsmith at googlers.com Thu Jul 21 18:48:13 2016 From: richardsmith at googlers.com (Richard Smith) Date: Thu, 21 Jul 2016 11:48:13 -0700 Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding In-Reply-To: <0BB4FB95-1332-4FCA-8F63-6F6E18C549F9@apple.com> References: <0BB4FB95-1332-4FCA-8F63-6F6E18C549F9@apple.com> Message-ID: On 21 July 2016 at 11:45, John McCall wrote: > > On Jul 21, 2016, at 11:02 AM, Jason Merrill wrote: > > P0135 seems to require that we elide the copy when using the result of > > a function returning by value to initialize a base class subobject, > > but the ABI doesn't currently require that such a function avoid > > clobbering tail padding when initializing its return object. > > Thoughts? > > This is not possible in general. A function returning by value returns a > complete > object, i.e. one with its own virtual base subobjects. We have no choice > but to > emit that to a temporary and move out of the non-virtual subobject. > That's a great point. At least for classes with virtual bases, we need to go via a temporary object when initializing a base class with a prvalue. I'll file a core issue for this. > The next semantic question is whether it's compatible with NRVO, i.e. > whether > there are guarantees about the existence of padding on named local > variables. > > Assuming that it's possible in some definable cases (and I think you could > easily revise the standard to make it only apply to classes without > v-bases), > it seems abstractly reasonable. Certainly it's generally preferable to > avoid > a high-level copy/move + destroy pair than to use a larger store at the end > of very specific initializers. > > As an implementor, I think I'm most worried about how this + NRVO would > mess up our existing peepholes that assume the existence of tail padding > on certain complete objects. > > John. 
From richardsmith at googlers.com Thu Jul 21 18:45:16 2016
From: richardsmith at googlers.com (Richard Smith)
Date: Thu, 21 Jul 2016 11:45:16 -0700
Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding
In-Reply-To:
References:
Message-ID:

On 21 July 2016 at 11:02, Jason Merrill wrote:

> P0135 seems to require that we elide the copy when using the result of
> a function returning by value to initialize a base class subobject,
> but the ABI doesn't currently require that such a function avoid
> clobbering tail padding when initializing its return object.
> Thoughts?

If the function clobbers the tail padding of its return object, at least GCC
and Clang will miscompile the program today, without P0135:

  #include <string.h>
  struct X { ~X() {} int n; char d; };
  struct Y { Y(); char c[3]; };
  struct Z : X, virtual Y { Z(); };

  X f() { X nrvo; memset(&nrvo, 0, sizeof(X)); return nrvo; }
  Z::Z() : Y(), X(f()) {}
  Y::Y() : c{1, 2, 3} {}

  int main() {
    Z z;
    return z.c[0];
  }

GCC -O0 returns 1 from main, as it should. GCC -O2 and Clang (any
optimization level, even with -fno-elide-constructors) return 0.

(It looks like Clang gets this "wrong" in two ways: first, NRVO is apparently
never correct on a type whose tail padding could be reused, and second, we
assume that we can memcpy a trivially-copyable base class at its full size
-- effectively, we seem to assume that we won't initialize the tail padding
of a base class before we initialize the base class itself.)

At this point I'm questioning the wisdom of allowing a virtual base to be
allocated into tail padding.
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From jason at redhat.com Thu Jul 21 20:20:07 2016 From: jason at redhat.com (Jason Merrill) Date: Thu, 21 Jul 2016 16:20:07 -0400 Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding In-Reply-To: References: Message-ID: On Thu, Jul 21, 2016 at 2:45 PM, Richard Smith wrote: > On 21 July 2016 at 11:02, Jason Merrill wrote: >> >> P0135 seems to require that we elide the copy when using the result of >> a function returning by value to initialize a base class subobject, >> but the ABI doesn't currently require that such a function avoid >> clobbering tail padding when initializing its return object. >> Thoughts? > > If the function clobbers the tail padding of its return object, at least GCC > and Clang will miscompile the program today, without P0135: > > #include > struct X { ~X() {} int n; char d; }; > struct Y { Y(); char c[3]; }; > struct Z : X, virtual Y { Z(); }; > > X f() { X nrvo; memset(&nrvo, 0, sizeof(X)); return nrvo; } > Z::Z() : Y(), X(f()) {} > Y::Y() : c{1, 2, 3} {} > > int main() { > Z z; > return z.c[0]; > } > > GCC -O0 returns 1 from main, as it should. GCC -O2 and Clang (any > optimization level, even with -fno-elide-constructors) returns 0. Thanks for the testcase. > (It looks like Clang gets this "wrong" in two ways: first, NRVO is apprently > never correct on a type whose tail padding could be reused Hmm, I was thinking that the NRVO was fine, but the caller shouldn't elide the copy because the function might clobber tail padding. But that gets back to my initial question, since P0135 requires that elision. Avoiding NRVO here doesn't conflict with P0135, but it does create a new ABI requirement that existing code might violate. > and second, we > assume that we can memcpy a trivially-copyable base class at its full size > -- effectively, we seem to assume that we won't initialize the tail padding > of a base class before we initialize the base class itself.) 
And I'd fixed that in one place already, but still needed to fix it in another. > At this point I'm questioning the wisdom of allowing a virtual base to be > allocated into tail padding. Yep. Jason From richardsmith at googlers.com Thu Jul 21 20:31:59 2016 From: richardsmith at googlers.com (Richard Smith) Date: Thu, 21 Jul 2016 13:31:59 -0700 Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding In-Reply-To: References: Message-ID: On 21 July 2016 at 13:20, Jason Merrill wrote: > On Thu, Jul 21, 2016 at 2:45 PM, Richard Smith > wrote: > > On 21 July 2016 at 11:02, Jason Merrill wrote: > >> > >> P0135 seems to require that we elide the copy when using the result of > >> a function returning by value to initialize a base class subobject, > >> but the ABI doesn't currently require that such a function avoid > >> clobbering tail padding when initializing its return object. > >> Thoughts? > > > > If the function clobbers the tail padding of its return object, at least > GCC > > and Clang will miscompile the program today, without P0135: > > > > #include > > struct X { ~X() {} int n; char d; }; > > struct Y { Y(); char c[3]; }; > > struct Z : X, virtual Y { Z(); }; > > > > X f() { X nrvo; memset(&nrvo, 0, sizeof(X)); return nrvo; } > > Z::Z() : Y(), X(f()) {} > > Y::Y() : c{1, 2, 3} {} > > > > int main() { > > Z z; > > return z.c[0]; > > } > > > > GCC -O0 returns 1 from main, as it should. GCC -O2 and Clang (any > > optimization level, even with -fno-elide-constructors) returns 0. > > Thanks for the testcase. > > > (It looks like Clang gets this "wrong" in two ways: first, NRVO is > apprently > > never correct on a type whose tail padding could be reused > > Hmm, I was thinking that the NRVO was fine, but the caller shouldn't > elide the copy because the function might clobber tail padding. But > that gets back to my initial question, since P0135 requires that > elision. 
Avoiding NRVO here doesn't conflict with P0135, but it does > create a new ABI requirement that existing code might violate. Given John's observation that P0135 can't even work in theory for the case of a base class with virtual bases, it seems like disabling P0135 for the case of initializing a base class of a class with vbases may be the simplest way forward. > and second, we > > assume that we can memcpy a trivially-copyable base class at its full > size > > -- effectively, we seem to assume that we won't initialize the tail > padding > > of a base class before we initialize the base class itself.) > > And I'd fixed that in one place already, but still needed to fix it in > another. > > > At this point I'm questioning the wisdom of allowing a virtual base to be > > allocated into tail padding. > > Yep. > > Jason > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rjmccall at apple.com Thu Jul 21 20:34:06 2016 From: rjmccall at apple.com (John McCall) Date: Thu, 21 Jul 2016 13:34:06 -0700 Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding In-Reply-To: References: Message-ID: <63F2FD6B-6768-4FDD-AC49-7ED1C229BBA0@apple.com> > On Jul 21, 2016, at 1:20 PM, Jason Merrill wrote: > On Thu, Jul 21, 2016 at 2:45 PM, Richard Smith > wrote: >> On 21 July 2016 at 11:02, Jason Merrill wrote: >>> >>> P0135 seems to require that we elide the copy when using the result of >>> a function returning by value to initialize a base class subobject, >>> but the ABI doesn't currently require that such a function avoid >>> clobbering tail padding when initializing its return object. >>> Thoughts? 
>> >> If the function clobbers the tail padding of its return object, at least GCC >> and Clang will miscompile the program today, without P0135: >> >> #include >> struct X { ~X() {} int n; char d; }; >> struct Y { Y(); char c[3]; }; >> struct Z : X, virtual Y { Z(); }; >> >> X f() { X nrvo; memset(&nrvo, 0, sizeof(X)); return nrvo; } >> Z::Z() : Y(), X(f()) {} >> Y::Y() : c{1, 2, 3} {} >> >> int main() { >> Z z; >> return z.c[0]; >> } >> >> GCC -O0 returns 1 from main, as it should. GCC -O2 and Clang (any >> optimization level, even with -fno-elide-constructors) returns 0. > > Thanks for the testcase. > >> (It looks like Clang gets this "wrong" in two ways: first, NRVO is apprently >> never correct on a type whose tail padding could be reused > > Hmm, I was thinking that the NRVO was fine, but the caller shouldn't > elide the copy because the function might clobber tail padding. But > that gets back to my initial question, since P0135 requires that > elision. Avoiding NRVO here doesn't conflict with P0135, but it does > create a new ABI requirement that existing code might violate. P0135 is broken and cannot be implemented as written. Given that we're telling the committee to fix it, we should ask for something that we think can be reasonably implemented with acceptable backwards compatibility. We should not be encouraging implementations to avoid NRVO. It seems to me that the bug is that implementations should not be evaluating call results in-place into base sub-objects that have tail padding that might be reused. John. 
From rjmccall at apple.com Thu Jul 21 20:57:05 2016 From: rjmccall at apple.com (John McCall) Date: Thu, 21 Jul 2016 13:57:05 -0700 Subject: [cxx-abi-dev] Guaranteed copy elision and tail padding In-Reply-To: References: Message-ID: <88E38761-9B5D-49D5-B57A-DF9FFD72DDE9@apple.com> > On Jul 21, 2016, at 1:31 PM, Richard Smith wrote: > On 21 July 2016 at 13:20, Jason Merrill > wrote: > On Thu, Jul 21, 2016 at 2:45 PM, Richard Smith > > wrote: > > On 21 July 2016 at 11:02, Jason Merrill > wrote: > >> > >> P0135 seems to require that we elide the copy when using the result of > >> a function returning by value to initialize a base class subobject, > >> but the ABI doesn't currently require that such a function avoid > >> clobbering tail padding when initializing its return object. > >> Thoughts? > > > > If the function clobbers the tail padding of its return object, at least GCC > > and Clang will miscompile the program today, without P0135: > > > > #include > > struct X { ~X() {} int n; char d; }; > > struct Y { Y(); char c[3]; }; > > struct Z : X, virtual Y { Z(); }; > > > > X f() { X nrvo; memset(&nrvo, 0, sizeof(X)); return nrvo; } > > Z::Z() : Y(), X(f()) {} > > Y::Y() : c{1, 2, 3} {} > > > > int main() { > > Z z; > > return z.c[0]; > > } > > > > GCC -O0 returns 1 from main, as it should. GCC -O2 and Clang (any > > optimization level, even with -fno-elide-constructors) returns 0. > > Thanks for the testcase. > > > (It looks like Clang gets this "wrong" in two ways: first, NRVO is apprently > > never correct on a type whose tail padding could be reused > > Hmm, I was thinking that the NRVO was fine, but the caller shouldn't > elide the copy because the function might clobber tail padding. But > that gets back to my initial question, since P0135 requires that > elision. Avoiding NRVO here doesn't conflict with P0135, but it does > create a new ABI requirement that existing code might violate. 
>
> Given John's observation that P0135 can't even work in theory for the case
> of a base class with virtual bases, it seems like disabling P0135 for the
> case of initializing a base class of a class with vbases may be the
> simplest way forward.

We re-use tail padding of all bases, not just virtual bases. It's true that
the Itanium ABI generally initializes things in ascending address order, but
there are *two* exceptions. The first, as you've noted, is virtual bases.
The second is when the primary base class is not the first base class in
inheritance order:

  struct A { char c; A() : c(15) {} };
  struct B { virtual void foo() {} char d; };
  struct C : A, B {};
  int main() { C c; }

Here the 'A' base is allocated in the tail padding of the 'B' base. Now, 'B'
is not technically trivially-copyable, but...

Also, it's a big world, and other/alternative/future ABIs might want to do
all sorts of things. It's also not that hard to imagine future language
features that would rely on knowing whether a constructor is initializing a
base sub-object or a complete object (for example, the language could
provide a way to declare constructors that are only allowed to initialize
one or the other).

It seems to me that the maximally correct thing is to disable the P0135
mandate for the case of initializing a base sub-object, full stop. If we can
define conditions in which it's acceptable to elide the copy, great, but
that should be up to the implementation / ABI.

(Semantic features like base-only constructors wouldn't prevent us from
doing this best-effort today because adding one to an existing type is an
ODR violation anyway. We could easily adjust the ABI rule to disable
in-place copy elision into base subobjects when the chosen copy constructor
is base-only or something.)

John.
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From pcc at google.com Fri Jul 22 01:42:06 2016 From: pcc at google.com (Peter Collingbourne) Date: Thu, 21 Jul 2016 18:42:06 -0700 Subject: [cxx-abi-dev] Proposing an ABI restriction on loads from an object's vtable pointer Message-ID: Hi all, The ABI currently requires that virtual tables for a class appear consecutively in a virtual table group. I would like to propose a restriction that would require that compilers may only access the virtual table associated with the address point stored in an object's virtual table pointer, and may not rely on any knowledge that the compiler may have about the relative layout of other virtual tables in the virtual table group. The purpose of this restriction is to allow an implementation to split a virtual table group along virtual table boundaries. Motivation There are at least two scenarios which would benefit from vtable splitting: clients which want to place data either before or after the ABI-required part of a virtual table, and clients which want to control the layout of virtual tables for performance or security reasons. As an example of the first scenario, when performing whole-program virtual call optimization, Clang will apply an optimization known as virtual constant propagation [0], which causes data to be laid out at a specific offset from the address point of each virtual table in a hierarchy. If that virtual table appears in a virtual table group, padding is required to place the data at an appropriate offset for each class. Because of the current restriction that vtables must appear consecutively, the optimizer may need to add more padding than necessary, or inhibit the optimization entirely if it would require too much padding. 
As an example of the second scenario, an implementation may wish to lay out virtual tables hierarchically either in order to increase the likelihood of a cache hit when repeatedly making the same virtual call over a set of heterogeneous objects, or to efficiently implement a security mitigation (specifically control flow integrity [1]) based on checking virtual table addresses for set membership. Placing only virtual tables (rather than virtual table groups) consecutively would likely increase the cache hit likelihood further and reduces the amount of metadata required to implement set membership checks. In an experiment involving the Chromium web browser, I have measured a binary size decrease of 1.5%, and a median performance improvement of about 1% on Chromium's layout benchmarks when comparing a binary compiled with control flow integrity and whole-program virtual call optimization against a binary compiled with control flow integrity, whole-program virtual call optimization and a prototype implementation of vtable splitting. Commentary Although the ABI specifies [2] the calling convention for virtual calls, which requires the call to be made using the this-adjustment appropriate for the object from which the virtual table pointer was loaded, the as-if rule could in principle allow a program to make a call using a different virtual table if the virtual table group contains multiple secondary virtual tables, as the distance between these virtual tables would be fixed (the same would be possible for all virtual tables if the dynamic type were known, but in that case the program could just call the appropriate virtual function directly). The purported benefit would be to avoid an additional virtual pointer load from the object in cases where consecutive calls are made to virtual functions introduced in different bases. 
However, it seems to me that cases where this is beneficial would be rare: not only would you need at least three bases and a derived class which does not override any of the called virtual functions, but when performing two consecutive calls it seems likely that the vtable would need to be reloaded anyway, either from the object or from the stack, especially with majority caller-save ABIs such as x86-64, or in any event because the first virtual call may have changed the object's dynamic type. It seems (according to experiments [3] carried out at godbolt.org) that all major compilers (gcc, clang, icc) do already use the appropriate vtable group and therefore are compliant with the proposed restriction. (There would also seem to be nothing preventing an implementation from choosing to load the RTTI pointer or offset-to-top from another virtual table group. However I would consider this even less likely to be beneficial than a virtual call via another virtual table.) The ABI specifies that the vtables in a group shall be laid out consecutively when referenced via a vtable group symbol, and I'm not proposing to change this. The effect of this proposal would be to allow a vtable to be split if the vtable group symbol is not referenced directly by name outside of the translation unit(s) participating in the optimization. This may be the case when a class has internal linkage, or if the program is linked with LTO, which allows the compiler to know which symbols are referenced outside of the LTO'd part of the program. Wording I propose to add two paragraphs to the section of the ABI describing virtual table groups, as follows: diff --git a/abi.html b/abi.html index 79cda2c..fce0c60 100644 --- a/abi.html +++ b/abi.html @@ -1193,6 +1193,18 @@ and again excluding primary bases (which share virtual tables with the classes for which they are primary). +

+When performing a virtual call or loading any other data from an address +derived from the address point stored in an object's virtual table pointer, +a program may only load from the virtual table associated with that address +point, and not from any other virtual table in the same virtual table group +which might be presumed to be located at a fixed offset from the address +point as a result of the above layout algorithm. + +

+The purpose of this restriction is to allow an implementation to split a +virtual table group along virtual table boundaries if its symbol is not +visible to other translation units.
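[Editor's illustration, not part of the original message.] The restriction proposed in the wording above concerns code of roughly the following shape. The hierarchy and the function call_b_then_c are hypothetical names chosen for this sketch; it is not the godbolt test case cited in footnote [3].

```cpp
// Three bases, each introducing one virtual function.
struct A { virtual ~A() = default; virtual int a() { return 1; } };
struct B { virtual ~B() = default; virtual int b() { return 2; } };
struct C { virtual ~C() = default; virtual int c() { return 3; } };

// D overrides nothing. Under the ABI its virtual table group holds a primary
// virtual table (shared with A) and secondary tables for B-in-D and C-in-D.
struct D : A, B, C {};

int call_b_then_c(D* d) {
    // Conceptually, this call goes through the vptr of the B subobject.
    int first = d->b();
    // Conceptually, this call goes through the vptr of the C subobject.
    // Under the proposed wording a compiler may not instead compute the
    // C-in-D table address as a fixed offset from the B-in-D address point.
    int second = d->c();
    return first + second;
}
```

With a D object d, call_b_then_c(&d) makes two virtual calls through two different subobjects. The proposal only constrains how the second virtual table address may be obtained; the observable result is unchanged.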

Thanks, Peter [0] http://lists.llvm.org/pipermail/llvm-dev/2016-January/094600.html [1] http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html [2] https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller [3] https://godbolt.org/g/wX7Ay6 is a three-bases test case by Richard Smith, https://godbolt.org/g/7eG8A1 is a dynamic-type-known test case by me -------------- next part -------------- An HTML attachment was scrubbed... URL: From rjmccall at apple.com Thu Jul 28 02:21:01 2016 From: rjmccall at apple.com (John McCall) Date: Wed, 27 Jul 2016 19:21:01 -0700 Subject: [cxx-abi-dev] Proposing an ABI restriction on loads from an object's vtable pointer In-Reply-To: References: Message-ID: > On Jul 21, 2016, at 6:42 PM, Peter Collingbourne wrote: > > Hi all, > > The ABI currently requires that virtual tables for a class appear consecutively in a virtual table group. I would like to propose a restriction that would require that compilers may only access the virtual table associated with the address point stored in an object's virtual table pointer, and may not rely on any knowledge that the compiler may have about the relative layout of other virtual tables in the virtual table group. > > The purpose of this restriction is to allow an implementation to split a virtual table group along virtual table boundaries. > > Motivation > > There are at least two scenarios which would benefit from vtable splitting: clients which want to place data either before or after the ABI-required part of a virtual table, and clients which want to control the layout of virtual tables for performance or security reasons. > > As an example of the first scenario, when performing whole-program virtual call optimization, Clang will apply an optimization known as virtual constant propagation [0], which causes data to be laid out at a specific offset from the address point of each virtual table in a hierarchy. 
If that virtual table appears in a virtual table group, padding is required to place the data at an appropriate offset for each class. Because of the current restriction that vtables must appear consecutively, the optimizer may need to add more padding than necessary, or inhibit the optimization entirely if it would require too much padding. > > As an example of the second scenario, an implementation may wish to lay out virtual tables hierarchically either in order to increase the likelihood of a cache hit when repeatedly making the same virtual call over a set of heterogeneous objects, or to efficiently implement a security mitigation (specifically control flow integrity [1]) based on checking virtual table addresses for set membership. Placing only virtual tables (rather than virtual table groups) consecutively would likely increase the cache hit likelihood further and reduces the amount of metadata required to implement set membership checks. > > In an experiment involving the Chromium web browser, I have measured a binary size decrease of 1.5%, and a median performance improvement of about 1% on Chromium's layout benchmarks when comparing a binary compiled with control flow integrity and whole-program virtual call optimization against a binary compiled with control flow integrity, whole-program virtual call optimization and a prototype implementation of vtable splitting. 
> > Commentary > > Although the ABI specifies [2] the calling convention for virtual calls, which requires the call to be made using the this-adjustment appropriate for the object from which the virtual table pointer was loaded, the as-if rule could in principle allow a program to make a call using a different virtual table if the virtual table group contains multiple secondary virtual tables, as the distance between these virtual tables would be fixed (the same would be possible for all virtual tables if the dynamic type were known, but in that case the program could just call the appropriate virtual function directly). In what situation would the distance between secondary virtual tables in a VTT be fixed where you don't know the dynamic type? Derived classes can always introduce or re-introduce virtual bases in ways that re-order the secondary virtual tables. John. > > The purported benefit would be to avoid an additional virtual pointer load from the object in cases where consecutive calls are made to virtual functions introduced in different bases. However, it seems to me that cases where this is beneficial would be rare: not only would you need at least three bases and a derived class which does not override any of the called virtual functions, but when performing two consecutive calls it seems likely that the vtable would need to be reloaded anyway, either from the object or from the stack, especially with majority caller-save ABIs such as x86-64, or in any event because the first virtual call may have changed the object's dynamic type. It seems (according to experiments [3] carried out at godbolt.org ) that all major compilers (gcc, clang, icc) do already use the appropriate vtable group and therefore are compliant with the proposed restriction. > > (There would also seem to be nothing preventing an implementation from choosing to load the RTTI pointer or offset-to-top from another virtual table group. 
However I would consider this even less likely to be beneficial than a virtual call via another virtual table.) > > The ABI specifies that the vtables in a group shall be laid out consecutively when referenced via a vtable group symbol, and I'm not proposing to change this. The effect of this proposal would be to allow a vtable to be split if the vtable group symbol is not referenced directly by name outside of the translation unit(s) participating in the optimization. This may be the case when a class has internal linkage, or if the program is linked with LTO, which allows the compiler to know which symbols are referenced outside of the LTO'd part of the program. > > Wording > > I propose to add two paragraphs to the section of the ABI describing virtual table groups, as follows: > > diff --git a/abi.html b/abi.html > index 79cda2c..fce0c60 100644 > --- a/abi.html > +++ b/abi.html > @@ -1193,6 +1193,18 @@ and again excluding primary bases > (which share virtual tables with the classes for which they are primary). > > > +

> +When performing a virtual call or loading any other data from an address
> +derived from the address point stored in an object's virtual table pointer,
> +a program may only load from the virtual table associated with that address
> +point, and not from any other virtual table in the same virtual table group
> +which might be presumed to be located at a fixed offset from the address
> +point as a result of the above layout algorithm.
> +
> +

> +The purpose of this restriction is to allow an implementation to split a
> +virtual table group along virtual table boundaries if its symbol is not
> +visible to other translation units.
>
>

>
>
>
> Thanks,
> Peter
>
> [0] http://lists.llvm.org/pipermail/llvm-dev/2016-January/094600.html
> [1] http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
> [2] https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller
> [3] https://godbolt.org/g/wX7Ay6 is a three-bases test case by Richard Smith, https://godbolt.org/g/7eG8A1 is a dynamic-type-known test case by me
> _______________________________________________
> cxx-abi-dev mailing list
> cxx-abi-dev at codesourcery.com
> http://sourcerytools.com/cgi-bin/mailman/listinfo/cxx-abi-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From rjmccall at apple.com Thu Jul 28 16:52:37 2016
From: rjmccall at apple.com (John McCall)
Date: Thu, 28 Jul 2016 09:52:37 -0700
Subject: [cxx-abi-dev] Proposing an ABI restriction on loads from an object's vtable pointer
In-Reply-To:
References:
Message-ID: <549195CC-4BC8-451B-AA61-0BC14F451930@apple.com>

> On Jul 27, 2016, at 7:21 PM, John McCall wrote:
>> On Jul 21, 2016, at 6:42 PM, Peter Collingbourne wrote:
>>
>> Hi all,
>>
>> The ABI currently requires that virtual tables for a class appear consecutively in a virtual table group. I would like to propose a restriction that would require that compilers may only access the virtual table associated with the address point stored in an object's virtual table pointer, and may not rely on any knowledge that the compiler may have about the relative layout of other virtual tables in the virtual table group.
>>
>> The purpose of this restriction is to allow an implementation to split a virtual table group along virtual table boundaries.
>>
>> Motivation
>>
>> There are at least two scenarios which would benefit from vtable splitting: clients which want to place data either before or after the ABI-required part of a virtual table, and clients which want to control the layout of virtual tables for performance or security reasons.
>>
>> As an example of the first scenario, when performing whole-program virtual call optimization, Clang will apply an optimization known as virtual constant propagation [0], which causes data to be laid out at a specific offset from the address point of each virtual table in a hierarchy. If that virtual table appears in a virtual table group, padding is required to place the data at an appropriate offset for each class. Because of the current restriction that vtables must appear consecutively, the optimizer may need to add more padding than necessary, or inhibit the optimization entirely if it would require too much padding.
>>
>> As an example of the second scenario, an implementation may wish to lay out virtual tables hierarchically either in order to increase the likelihood of a cache hit when repeatedly making the same virtual call over a set of heterogeneous objects, or to efficiently implement a security mitigation (specifically control flow integrity [1]) based on checking virtual table addresses for set membership. Placing only virtual tables (rather than virtual table groups) consecutively would likely increase the cache hit likelihood further and reduce the amount of metadata required to implement set membership checks.
>>
>> In an experiment involving the Chromium web browser, I have measured a binary size decrease of 1.5%, and a median performance improvement of about 1% on Chromium's layout benchmarks when comparing a binary compiled with control flow integrity and whole-program virtual call optimization against a binary compiled with control flow integrity, whole-program virtual call optimization and a prototype implementation of vtable splitting.
>>
>> Commentary
>>
>> Although the ABI specifies [2] the calling convention for virtual calls, which requires the call to be made using the this-adjustment appropriate for the object from which the virtual table pointer was loaded, the as-if rule could in principle allow a program to make a call using a different virtual table if the virtual table group contains multiple secondary virtual tables, as the distance between these virtual tables would be fixed (the same would be possible for all virtual tables if the dynamic type were known, but in that case the program could just call the appropriate virtual function directly).
>
> In what situation would the distance between secondary virtual tables in a VTT be fixed where you don't know the dynamic type? Derived classes can always introduce or re-introduce virtual bases in ways that re-order the secondary virtual tables.

Okay, thinking about it more, the idea is that, because the enumeration order is depth-first, there will always be a local range of the compound v-table that contains the v-tables of the non-virtual bases for any given portion of the class hierarchy. Because the secondary tables never have new function pointers added to them, they do not grow to the right; and because v-call offsets are always added to the primary v-table for a virtual base, they do not grow to the left. Therefore, a secondary v-table of a non-virtual base is fixed in size, and so you could theoretically reach from one secondary v-table to another with a constant offset. For this to be profitable, of course, you would have to have one secondary table already loaded when you tried to use the other; but that could happen. So I agree that this would be a possible optimization today.

>> The purported benefit would be to avoid an additional virtual pointer load from the object in cases where consecutive calls are made to virtual functions introduced in different bases. However, it seems to me that cases where this is beneficial would be rare: not only would you need at least three bases and a derived class which does not override any of the called virtual functions, but when performing two consecutive calls it seems likely that the vtable would need to be reloaded anyway, either from the object or from the stack, especially with majority caller-save ABIs such as x86-64, or in any event because the first virtual call may have changed the object's dynamic type.

This part of your argument is weak. Putting the v-table in a callee-save register would be quite reasonable if you're doing many repeat calls. I don't see why it would matter whether the majority of registers are callee-save as long as the absolute number is at least 2; even i386 gives us 3 general-purpose callee-save registers, and x86-64 has 5. And it's undefined behavior to change a pointer's dynamic type like that, although that can be tricky to take advantage of.

That said, I would say that the trade-offs still break in your favor here. The optimization potential of this sort of contrived situation (calls to virtual methods of two different secondary v-tables) doesn't outweigh the optimization potential of permitting non-standard organization of secondary v-tables.

>> It seems (according to experiments [3] carried out at godbolt.org) that all major compilers (gcc, clang, icc) do already use the appropriate vtable group and therefore are compliant with the proposed restriction.
>>
>> (There would also seem to be nothing preventing an implementation from choosing to load the RTTI pointer or offset-to-top from another virtual table group. However I would consider this even less likely to be beneficial than a virtual call via another virtual table.)

I agree, I cannot imagine why an optimizer would deliberately do this when it could get the same information from a simpler source.
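
To make the scenario discussed above concrete, here is a minimal C++ sketch. It is not taken from the thread, and every class and function name in it is hypothetical. It shows a class with three polymorphic non-virtual bases, a derived class that overrides none of the called functions, and two consecutive calls to virtual functions introduced in different bases. Under the proposed restriction, a compiler would be expected to load the virtual table pointer of the B subobject for the first call and that of the C subobject for the second, rather than reaching C's secondary virtual table at a constant offset from B's.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration only: three non-virtual bases,
// each introducing one virtual function.
struct A { virtual ~A() = default; virtual int fa() { return 1; } };
struct B { virtual ~B() = default; virtual int fb() { return 2; } };
struct C { virtual ~C() = default; virtual int fc() { return 3; } };

// Derived overrides none of fa/fb/fc. In the Itanium C++ ABI its virtual
// table group holds a primary virtual table (shared with A) followed by
// secondary virtual tables for the B and C subobjects.
struct Derived : A, B, C {};

// A further-derived class may still override fb, so the calls below have to
// be dispatched dynamically.
struct MoreDerived : Derived { int fb() override { return 7; } };

// Consecutive calls to virtual functions introduced in different bases.
int call_b_then_c(Derived &d) {
    int first = d.fb();   // in practice dispatched via the B subobject's vptr
    int second = d.fc();  // in practice dispatched via the C subobject's vptr
    return first * 10 + second;
}

// The B and C subobjects live at different addresses inside a Derived, so
// each one carries its own virtual table pointer.
bool b_and_c_are_distinct_subobjects(Derived &d) {
    B *pb = &d;
    C *pc = &d;
    return static_cast<void *>(pb) != static_cast<void *>(pc);
}

// Small self-check; call it from main() to exercise the sketch.
void demo() {
    Derived d;
    MoreDerived md;
    assert(call_b_then_c(d) == 23);
    assert(call_b_then_c(md) == 73);
    assert(b_and_c_are_distinct_subobjects(d));
}
```

Inspecting the code a compiler generates for call_b_then_c is one way to see which virtual table each call loads from; the sketch itself only checks the observable results of the two calls.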
>> The ABI specifies that the vtables in a group shall be laid out consecutively when referenced via a vtable group symbol, and I'm not proposing to change this. The effect of this proposal would be to allow a vtable to be split if the vtable group symbol is not referenced directly by name outside of the translation unit(s) participating in the optimization. This may be the case when a class has internal linkage, or if the program is linked with LTO, which allows the compiler to know which symbols are referenced outside of the LTO'd part of the program.
>>
>> Wording
>>
>> I propose to add two paragraphs to the section of the ABI describing virtual table groups, as follows:
>>
>> diff --git a/abi.html b/abi.html
>> index 79cda2c..fce0c60 100644
>> --- a/abi.html
>> +++ b/abi.html
>> @@ -1193,6 +1193,18 @@ and again excluding primary bases
>> (which share virtual tables with the classes for which they are primary).
>>
>>
>> +

>> +When performing a virtual call or loading any other data from an address
>> +derived from the address point stored in an object's virtual table pointer,
>> +a program may only load from the virtual table associated with that address
>> +point, and not from any other virtual table in the same virtual table group
>> +which might be presumed to be located at a fixed offset from the address
>> +point as a result of the above layout algorithm.
>> +
>> +

>> +The purpose of this restriction is to allow an implementation to split a
>> +virtual table group along virtual table boundaries if its symbol is not
>> +visible to other translation units.

I would say this more generally: the ABI does not make guarantees about the relative layout of v-tables in an object or a VTT. It guarantees only the layout of the global symbol. It does not guarantee that the v-table pointers actually installed in an object or a VTT will point into that global symbol.

John.

>>
>>

>>
>>
>>
>>
>> Thanks,
>> Peter
>>
>> [0] http://lists.llvm.org/pipermail/llvm-dev/2016-January/094600.html
>> [1] http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
>> [2] https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller
>> [3] https://godbolt.org/g/wX7Ay6 is a three-bases test case by Richard Smith, https://godbolt.org/g/7eG8A1 is a dynamic-type-known test case by me