[cxx-abi-dev] Run-time array checking

Mike Herrick mjh at edg.com
Thu Sep 13 02:15:47 UTC 2012


On Sep 11, 2012, at 8:44 PM, John McCall wrote:

> On Sep 11, 2012, at 12:28 PM, Mike Herrick wrote:
>> On Sep 10, 2012, at 12:35 PM, John McCall wrote:
>>> I wouldn't say option 2 is *out*; it's just not *sufficient*, in the same way that __cxa_vec_new was never sufficient.
>>> 
>>> Would you mind writing up a formal proposal (or even a patch)?
>> 
>> Not at all (assuming we can figure out what the best course of action is).
> 
> Thanks!
> 
>>> At a high level I think the required changes are:
>>> 
>>> 1) Adding the new __cxa_throw_bad_array_new_length routine.  There's still an open question here, I think:  it's a better user experience if std::bad_array_new_length carries the length argument.  Unfortunately (a) that's a bit complicated to encode as an operand to the routine, because we'd also need to track whether that's signed or unsigned, and (b) it looks like libc++ doesn't have space for carrying this information, and libstdc++ apparently hasn't been revised for this rule change yet.
>> 
>> We agree that having the length argument is desirable from a user's point of view, but it seems rather difficult for the compiler to convey this value to a library routine given that its type may be signed or unsigned and it may or may not be larger than size_t/ptrdiff_t.
> 
> I hadn't thought of the wider-than-size_t problem, although amusingly I did
> remember that case when writing the bounds checks in clang.
> 
> Hmm.  At the risk of prescribing an overly complicated API, I would suggest:
>  void __cxa_throw_bad_array_new_length(uintptr_t sizeData, int flags);
> where 'flags' is:
>  (sizeof(size) << 1) | std::is_signed<decltype(size)>::value
> and where sizeData is either:
>  - size, converted to a uintptr_t, if sizeof(size) <= sizeof(uintptr_t), or
>  - &size otherwise (throwing it in some temporary memory).
> Converting to a uintptr_t means zero-extending or sign-extending as appropriate.

I'm a little leery of passing size (sizeData) in this fashion.  [Also, std::uintptr_t appears to be optional in the standard.]

If we went this route, I'd argue for splitting the flags above into two separate arguments.

Any other opinions on whether we should try to save this value (and if so, in which manner)?
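
For concreteness, here's a rough caller-side sketch of what John's proposed encoding might look like (the helper name is hypothetical, and the exact signature is of course still the open question):

    // Hypothetical caller-side sketch of the proposed encoding; the real
    // routine name and signature are still under discussion.
    #include <cstdint>
    #include <type_traits>

    // Throws std::bad_array_new_length; does not return.
    extern "C" void __cxa_throw_bad_array_new_length(std::uintptr_t sizeData,
                                                     int flags);

    template <typename Size>
    void report_bad_array_new_length(Size size) {
      // flags encodes the width and signedness of the original size type.
      const int flags =
          static_cast<int>((sizeof(Size) << 1) | std::is_signed<Size>::value);
      if (sizeof(Size) <= sizeof(std::uintptr_t)) {
        // Zero- or sign-extend the value into a uintptr_t as appropriate.
        __cxa_throw_bad_array_new_length(static_cast<std::uintptr_t>(size),
                                         flags);
      } else {
        // Wider than uintptr_t: pass the address of a temporary instead.
        __cxa_throw_bad_array_new_length(
            reinterpret_cast<std::uintptr_t>(&size), flags);
      }
    }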

> 
> In the common case, this should be a pretty small addition to the call
> sequence — on x86-64, for example, it would be (at worst) a register-register
> move and a small immediate-to-register move.  I think that's a reasonable
> sacrifice for the benefit of letting the ABI library report useful information in
> the exception.
> 
> (ABI libraries will probably just saturate the bound value when storing it
> in the exception, but this lets them decide how and when to do so.)
> 
>>> 2) Including this behavior in the specification for __cxa_vec_new{,2,3}.
>>> 
>>> 3) If desired, adding __cxa_vec_new_signed{,2,3}.
>> 
>> We're thinking that (because of the difficulty mentioned above) it's best to make one change: namely to add __cxa_throw_bad_array_new_length(void).  This pushes the responsibility to the compiler (where the type is known), and hopefully results in generated code that can be more easily optimized.  The existing routines would be unchanged.
> 
> Are you suggesting that the existing routines would not do this overflow
> checking?  That makes them much less valuable for their intended
> purposes of code-size optimization, because we'd still need several
> checks and a whole second call sequence in the generated code.
> 
> I think the existing routines should do the check, assuming (as they must)
> that they were given a valid unsigned value that fits within a size_t.  In
> optimal code, this will just be some easily-predicted branch-on-overflow
> instructions after the existing arithmetic;  it's peanuts compared to the rest
> of the work.

I agree that the existing routines should be updated to do whatever checking they can (i.e., for overflow in the typical case where sizeof(size) <= sizeof(size_t) and decltype(size) is unsigned).
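
To illustrate, here's roughly the kind of check I'd expect __cxa_vec_new to be able to do (a minimal sketch only; the real routine also handles the array cookie, constructors, and cleanup on exception):

    // Sketch of the overflow check __cxa_vec_new could perform, assuming
    // element_count is already a valid unsigned value that fits in size_t.
    #include <cstddef>
    #include <new>

    // Throws std::bad_array_new_length; does not return.
    extern "C" void __cxa_throw_bad_array_new_length();  // no-argument form

    void* vec_new_checked(std::size_t element_count,
                          std::size_t element_size,
                          std::size_t padding_size) {
      // Reject any count for which count * size + padding would overflow.
      if (element_size != 0 &&
          element_count > (static_cast<std::size_t>(-1) - padding_size)
                              / element_size)
        __cxa_throw_bad_array_new_length();
      return ::operator new(element_count * element_size + padding_size);
    }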

> 
> If we're not going to add signed/oversized variants — both reasonable
> choices, in my view — then the compiler can still use __cxa_vec_new*
> as long as it puts an appropriate check in front if either:
>  - sizeof(size) > sizeof(size_t)
>  - decltype(size) is signed

  - size < number_of_initialized_elements

> This check is required if __cxa_throw_bad_array_new_length takes
> any information about the size value and type.  I want it to take that
> information.  However, if the consensus goes the other way and
> __cxa_throw_bad_array_new_length does *not* take any information
> about the size value, we can avoid this extra call in the extremely
> common case that sizeof(element) > 1, because the overflow check
> in __cxa_vec_new* will automatically trigger for negative values.
> Thus we can skip all checking relating to "normal" signed size values,
> and for "oversized" size values we can simply saturate at -1 or some
> other value which is guaranteed to fail to allocate.

Assuming there are no architectures where this doesn't hold true, it sounds good to me.
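
For concreteness, a sketch of the saturation approach the compiler could use for a signed count under that scheme, when sizeof(element) > 1 (the wrapper function and element type are just for illustration):

    // Sketch of the saturation approach for a signed count when
    // __cxa_throw_bad_array_new_length takes no arguments and the element
    // size is greater than 1, so __cxa_vec_new's own overflow check fires.
    #include <cstddef>

    extern "C" void* __cxa_vec_new(std::size_t element_count,
                                   std::size_t element_size,
                                   std::size_t padding_size,
                                   void (*constructor)(void*),
                                   void (*destructor)(void*));

    void* new_array_of_doubles(long long count) {
      // A negative count saturates to SIZE_MAX, a value that is guaranteed
      // to fail the multiplication check inside __cxa_vec_new.
      std::size_t n = count < 0 ? static_cast<std::size_t>(-1)
                                : static_cast<std::size_t>(count);
      return __cxa_vec_new(n, sizeof(double), /*padding_size=*/0,
                           /*constructor=*/nullptr, /*destructor=*/nullptr);
    }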

Mike.


