[cxx-abi-dev] pointer-to-data-member representation for null pointer is not conforming

Richard Smith richardsmith at google.com
Fri Dec 21 05:37:32 UTC 2012


On Thu, Dec 20, 2012 at 8:53 PM, John McCall <rjmccall at apple.com> wrote:

> On Dec 20, 2012, at 7:09 PM, John McCall <rjmccall at apple.com> wrote:
>
> On Dec 20, 2012, at 4:19 PM, Richard Smith <richardsmith at google.com>
> wrote:
>
> Consider the following:
>
> struct E {};
> struct X : E {};
> struct C : E, X { char x; };
>
> char C::*c1 = &C::x;
> char X::*x = (char(X::*))c1;
> char C::*c2 = x;
>
> int main() { return c2 != 0; }
>
> I believe this program is valid and has defined behavior; per
> [expr.static.cast]p12, we can convert a pointer to a member of a derived
> class to a pointer to a member of a base class, so long as the base class
> is a base class of the class containing the original member.
>
> Per the ABI, C::x is at offset 0, C::E is at offset 0, and C::X and
> C::X::E are at offset 1 (they can't go at 0 due to the collision of the
> empty E base class). So the value of c1 is 0. And the value of x is... -1.
> Whoops.
>
> Finally, the conversion from x to c2 preserves the -1 value (conversion of
> a null member pointer produces a null member pointer), giving the wrong
> value for c2, and resulting in main returning 0, where the standard
> requires it to return 1 (likewise, returning x != 0 would produce the wrong
> value).
>
>
> Yep.
>
> Personally, I've been aware of this for a while and consider it an
> unfixable defect.  I don't know if it's generally known, though, and I
> can't find any prior discussion on the list.
>
> I'm not aware of any non-artificial code that the defect has ever broken;
>  there are some decent just-so stories for why that might be true:
>   (1) Data member pointers provide a really awkward abstraction that just
> isn't used that much:
>     (1a) They let you abstract over any member you want!
>     (1b) As long as that member has exactly the right type, not something
> implicitly convertible to it!
>     (1c) And as long as that member is actually stored in a field, not
> computed from it!
>     (1d) And as long as that field is a field of the class or one of its
> bases, not a field of a field of the class!
>   (2) Everything about the syntax of member pointers (making them, using
> them, writing their types) is kind of weird-looking, and many people don't
> like using them.
>   (3) The sorts of low-level programmers who would use this strange
> abstraction are often more comfortable using offsetof and explicit char*
> manipulation anyway.
>   (4) People usually use data member pointers on hierarchically boring
> types anyway — generally leaf classes.
>   (5) People usually don't mix data member pointers from different levels
> of the class hierarchy, and therefore generally don't do hierarchy
> conversions on them.
>   (6) People usually don't work with null member pointers — they use
> member pointers as a way of abstracting an access for some algorithm, and
> generally that doesn't admit a null value.
>   (7) Vanishingly few non-empty subclasses are ever going to be laid out
> at an offset of 1:
>     (7a) The base class must have an alignment of 1, meaning (for pretty
> much every platform out there) no virtual functions, no interesting data
> structures, no pointers, no ints; nothing but bools and chars and arrays
> thereof.
>     (7b) The derived class cannot have any virtual functions or virtual
> bases.
>     (7c) The derived class must have multiple base classes, the first of
> which has to be either empty (totally empty, lacking even virtual methods)
> or size 1.
>
>
> I went to dinner and realized that this point isn't as useful as I thought
> — you don't need a base class to be laid out at an offset of 1, you need a
> base class to be laid out immediately after a base A that has a field of
> size 1 at offset datasize(A)-1.
>

You need the field to be in the derived class in order for this to be a
problem; otherwise, the cast would have undefined behavior. Hence, the base
class must be empty, and indeed must be a repeated empty base class (to not
be at offset 0).
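
For reference, here is a minimal sketch that measures where things land for the hierarchy from my first message. The helper names (offset_of_base_X, offset_of_member_x, x_relative_to_X) are made up for illustration, and the numbers are only what I'd expect under the Itanium ABI layout rules; another ABI may place the repeated empty base differently.

```cpp
#include <cassert>   // only needed if you assert on the results, as in the note below
#include <cstddef>

struct E {};
struct X : E {};
struct C : E, X { char x; };  // same hierarchy as the original example

// Byte offset of the X base subobject inside a C object.
std::ptrdiff_t offset_of_base_X() {
  C c;
  return reinterpret_cast<char*>(static_cast<X*>(&c)) -
         reinterpret_cast<char*>(&c);
}

// Byte offset of the member x inside a C object.
std::ptrdiff_t offset_of_member_x() {
  C c;
  return &c.x - reinterpret_cast<char*>(&c);
}

// Offset of C::x measured from the start of the X subobject. This is the
// value an offset-based 'char X::*' has to hold to designate C::x.
std::ptrdiff_t x_relative_to_X() {
  return offset_of_member_x() - offset_of_base_X();
}
```

On an Itanium-ABI compiler I'd expect offset_of_base_X() to be 1 and offset_of_member_x() to be 0, so x_relative_to_X() comes out as -1, which is exactly the bit pattern reserved for the null data member pointer.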


> I *can* imagine a number of use cases that cause situations like this, so
> while most of my other points stand, it isn't quite as cut-and-dried as I
> made it out to be.
>

#include <iostream>

struct noncopyable {
  noncopyable() = default;
  noncopyable(const noncopyable&) = delete;
};
struct serializable : noncopyable {
  template<typename T> void serialize(T serializable::**members) {
    while (*members) std::cout << this->**members++ << std::endl;
  }
};
struct MyWonderfulType : noncopyable, serializable {
  char c = 'x';
  void serialize() {
    char serializable::*(CharMembers[]) = {
        (char(serializable::*))&MyWonderfulType::c, nullptr };
    serializable::serialize(CharMembers);
  }
};

int main() {
  MyWonderfulType mwt;
  mwt.serialize();
}
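
To spell out why I'd expect the loop above to stop before printing anything on an Itanium-ABI implementation, here is a rough model of the representation. It assumes what the ABI document describes: a data member pointer is a ptrdiff_t offset and null is -1. MemPtrRep, kNullRep, to_base, to_derived and round_trip are hypothetical names for this sketch, not anything a compiler exposes.

```cpp
#include <cassert>   // only needed if you assert on the results, as in the note below
#include <cstddef>

// Hypothetical model of an Itanium-style data member pointer:
// an offset into the class, with -1 reserved for null.
using MemPtrRep = std::ptrdiff_t;
constexpr MemPtrRep kNullRep = -1;

constexpr bool is_null(MemPtrRep p) { return p == kNullRep; }

// Derived-to-base conversion: subtract the base subobject's offset,
// but leave a null member pointer null.
constexpr MemPtrRep to_base(MemPtrRep p, std::ptrdiff_t base_offset) {
  return is_null(p) ? kNullRep : p - base_offset;
}

// Base-to-derived conversion: add the offset back, again preserving null.
constexpr MemPtrRep to_derived(MemPtrRep p, std::ptrdiff_t base_offset) {
  return is_null(p) ? kNullRep : p + base_offset;
}

// Convert down to the base and back up to the derived class.
constexpr MemPtrRep round_trip(MemPtrRep p, std::ptrdiff_t base_offset) {
  return to_derived(to_base(p, base_offset), base_offset);
}
```

With the member at offset 0 and the base at offset 1, to_base(0, 1) is -1, so the first array entry is indistinguishable from the nullptr terminator, and round_trip(0, 1) stays at -1 instead of returning to 0. Any other combination, such as round_trip(4, 1), survives the round trip unchanged.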