[cxx-abi-dev] Run-time array checking
John McCall
rjmccall at apple.com
Thu Sep 13 03:07:39 UTC 2012
On Sep 12, 2012, at 7:15 PM, Mike Herrick wrote:
> On Sep 11, 2012, at 8:44 PM, John McCall wrote:
>> On Sep 11, 2012, at 12:28 PM, Mike Herrick wrote:
>>> On Sep 10, 2012, at 12:35 PM, John McCall wrote:
>>>> I wouldn't say option 2 is *out*, it's just not *sufficient*, in the same way that __cxa_vec_new was never sufficient.
>>>>
>>>> Would you mind writing up a formal proposal (or even a patch)?
>>>
>>> Not at all (assuming we can figure out what the best course of action is).
>>
>> Thanks!
>>
>>>> At a high level I think the required changes are:
>>>>
>>>> 1) Adding the new __cxa_throw_bad_array_new_length routine. There's still an open question here, I think: it's a better user experience if std::bad_array_new_length carries the length argument. Unfortunately (a) that's a bit complicated to encode as an operand to the routine, because we'd also need to track whether that's signed or unsigned, and (b) it looks like libc++ doesn't have space for carrying this information, and libstdc++ apparently hasn't been revised for this rule change yet.
>>>
>>> We agree that having the length argument is desirable from a user's point of view, but it seems rather difficult for the compiler to convey this value to a library routine given that its type may be signed or unsigned and it may or may not be larger than size_t/ptrdiff_t.
>>
>> I hadn't thought of the wider-than-size_t problem, although amusingly I did
>> remember that case when writing the bounds checks in clang.
>>
>> Hmm. At the risk of prescribing an overly complicated API, I would suggest:
>> void __cxa_throw_bad_array_new_length(uintptr_t sizeData, int flags);
>> where 'flags' is:
>> (sizeof(size) << 1) | std::is_signed<decltype(size)>::value
>> and where sizeData is either:
>> size, converted to a uintptr_t, if sizeof(size) <= sizeof(uintptr_t), or
>> &size otherwise (throwing it in some temporary memory).
>> Converting to a uintptr_t means zero-extending or sign-extending as appropriate.
>
> I'm a little leery of passing size (sizeData) in this fashion. [Also, std::uintptr_t appears to be optional in the standard.]
Well, the simpler alternative is to saturate to a size_t/ssize_t (saturation being unnecessary unless the size is actually of a larger type than size_t) and just pass a flag indicating whether it's signed. That inherently loses information, of course.
I know there are platforms which don't provide std::uintptr_t, but are there platforms which *can't* support std::uintptr_t? That is, is this a real limitation or a "some system headers are dumber than others" limitation?
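To make the encoding concrete, here is a rough sketch of what a front end might compute for 'flags' and 'sizeData' under this proposal (the helper names are purely illustrative, not part of any shipping ABI):

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// 'flags' packs the width and signedness of the new[] size operand,
// exactly as described above: (sizeof(size) << 1) | is_signed.
template <typename SizeT>
int encode_flags() {
    return (static_cast<int>(sizeof(SizeT)) << 1) |
           static_cast<int>(std::is_signed<SizeT>::value);
}

// For a size operand no wider than uintptr_t, 'sizeData' is the value
// itself, zero- or sign-extended as its type dictates; a wider operand
// would instead pass its address in temporary memory.
template <typename SizeT>
std::uintptr_t encode_size(SizeT size) {
    static_assert(sizeof(SizeT) <= sizeof(std::uintptr_t),
                  "wider sizes would be passed by address instead");
    return static_cast<std::uintptr_t>(size);  // sign-extends if SizeT is signed
}
```

The runtime can then decode the original value losslessly from the two arguments.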
> If we went this route, I'd argue to separate flags above into two separate arguments.
Is there a good reason to, other than to get a slightly prettier-looking API? I know that minimizing function arguments seems like a micro-optimization, but I don't think it's inappropriate in this context; we certainly already have users that begrudge us the size of array-new, and this entire discussion is about making it larger. Every instruction helps.
>> If we're not going to add signed/oversized variants — both reasonable
>> choices, in my view — then the compiler can still use __cxa_vec_new*
>> as long as it puts an appropriate check in front if either:
>> - sizeof(size) > sizeof(size_t)
>> - decltype(size) is signed
>
> - size < number_of_initialized_elements
Oh, yes, of course. If it's a nested array allocation, we need to do that
overflow check outside as well.
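Collecting those conditions, the up-front check the compiler would emit before calling __cxa_vec_new* might look roughly like this (an illustrative sketch, not proposed ABI code; SizeT stands for the possibly-signed, possibly wider-than-size_t type of the new[] size expression, and 'num_init' for the number of elements with initializers):

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

template <typename SizeT>
bool array_new_length_is_bad(SizeT size, std::size_t num_init) {
    // decltype(size) is signed and the value is negative.
    if (std::is_signed<SizeT>::value && size < SizeT(0))
        return true;
    // sizeof(size) > sizeof(size_t) and the value doesn't fit in size_t.
    if (sizeof(SizeT) > sizeof(std::size_t) &&
        static_cast<std::uintmax_t>(size) >
            std::numeric_limits<std::size_t>::max())
        return true;
    // Fewer elements than initialized elements.
    return static_cast<std::size_t>(size) < num_init;
}
```

If any branch fires, the compiler calls __cxa_throw_bad_array_new_length instead of __cxa_vec_new*.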
>> This check is required if __cxa_throw_bad_array_new_length takes
>> any information about the size value and type. I want it to take that
>> information. However, if the consensus goes the other way and
>> __cxa_throw_bad_array_new_length does *not* take any information
>> about the size value, we can avoid this extra call in the extremely
>> common case that sizeof(element) > 1, because the overflow check
>> in __cxa_vec_new* will automatically trigger for negative values.
>> Thus we can skip all checking relating to "normal" signed size values,
>> and for "oversized" size values we can simply saturate at -1 or some
>> other value which is guaranteed to fail to allocate.
>
> Assuming there are no architectures where this doesn't hold true, it sounds good to me.
We'd certainly have a lot more optimization flexibility if we don't try
to preserve the bad size value. My worry is that we'd be *forcing* a poor
debugging experience on programmers — they'd have to reproduce the
problem in a debugger in order to have any idea what the bad value was.
I'll readily grant that this is already true for a large class of other bugs in C++.
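For what it's worth, the reason the saturate-at-(size_t)-1 trick works is the multiplication-overflow check that any __cxa_vec_new* implementation must already perform before allocating; a minimal model of that check (illustrative names, not the actual library code):

```cpp
#include <cstddef>

// Models the count * element_size overflow test inside __cxa_vec_new*.
// A count saturated to (size_t)-1 is guaranteed to trip this test
// whenever element_size > 1.
bool vec_new_size_overflows(std::size_t element_count,
                            std::size_t element_size) {
    return element_size != 0 &&
           element_count > static_cast<std::size_t>(-1) / element_size;
}
```

Note that with element_size == 1 the saturated count does not overflow, which is why this shortcut only covers the sizeof(element) > 1 case.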
Anyway, I've asked Howard Hinnant, Apple's C++ library maintainer, to
catch up on the discussion and weigh in.
John.