Thunks, vol. XXII

thomson at ca.ibm.com
Wed Sep 1 16:13:42 UTC 1999


Christophe:

>Also note Jim's idea of predicating the adjustment, using the low
>bit of the function pointer. This would mean that the adjustment
>would probably cost much less than 3 cycles, with an extra cost at
>call site that we did not analyze yet.

No, I hadn't seen this.  With all the discussion about branches,
I had been thinking about predication and thunks, but wondered
how to control it.  Using bits from the function pointer is an
interesting idea, and it set me going:

Caller:
     addi      Rthis=#preadjustment,Rxxx  ;;
     ld8       Rvptr=[Rthis]          ;;
     addi      Rfndesc=#slot_offset,Rvptr ;;
     ld8       Rfnep=[Rfndesc],8            ;;
     ld8       GP=[Rfndesc]
     mov       BRn=Rfnep
     shl       Rmask=Rfnep,6          ;;   // low 3 bits of Rfnep -> bits 6..8
     mov       pr=Rmask,0x1c0              // write only p6, p7, p8
     br.call   BR0=BRn                ;;


Callee:
__foo_2:  (p6) addi Rthis=adj_value_1,Rthis
          (p7) addi Rthis=adj_value_2,Rthis
          (p8) addi Rthis=adj_value_3,Rthis ;;
__foo:    alloc ...
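
To make the bit manipulation concrete, here is a minimal C++ sketch that simulates the caller/callee split in software. It is not ABI text: the names tag_entry, pred_bits, callee_adjust and the values in kAdj are hypothetical, and it assumes that entry points are 16-byte aligned so the low bits are free, and that p6 through p8 correspond to bits 6 through 8 of the predicate register.

```cpp
#include <cstdint>

// Hypothetical stand-ins for adj_value_1..adj_value_3 in the callee prologue.
constexpr std::int64_t kAdj[3] = {16, 32, -8};

// Tag an (assumed 16-byte-aligned) entry address with a 3-bit selector.
std::uint64_t tag_entry(std::uint64_t entry, unsigned sel) {
    return (entry & ~std::uint64_t{0xF}) | (sel & 0x7u);
}

// Caller side: models the shift by 6 followed by a masked move into p6..p8.
std::uint64_t pred_bits(std::uint64_t fnep) {
    const std::uint64_t kP6toP8 = std::uint64_t{0x7} << 6;  // bits 6, 7, 8
    return (fnep << 6) & kP6toP8;
}

// Callee side: each predicated add takes effect only if its predicate is set.
std::int64_t callee_adjust(std::int64_t self, std::uint64_t pr) {
    for (int i = 0; i < 3; ++i) {
        if (pr & (std::uint64_t{1} << (6 + i))) {
            self += kAdj[i];
        }
    }
    return self;
}
```

For example, a selector of binary 101 sets p6 and p8, so the first and third adjustments are applied and the second is skipped.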

It gets complicated, because you need different variants
if there are more than 3 adjustments, or if any of them
doesn't fit into 14 bits, but a moderately parallel
implementation could handle a lot of common nonvirtual
cases with only one extra cycle, couldn't it?
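
The two conditions that force a different variant can be written down as a small check. This is only a sketch: fits_imm14 and fits_simple_prologue are hypothetical helper names, and the range used assumes the adjustment is encoded as a signed 14-bit immediate, i.e. -8192 through 8191.

```cpp
#include <cstddef>
#include <cstdint>

// True if v is representable as a signed 14-bit immediate (-8192 .. 8191).
bool fits_imm14(std::int64_t v) {
    return v >= -8192 && v <= 8191;
}

// Hypothetical check: can an entry point use the simple prologue with at most
// three predicated adds, each taking a 14-bit immediate?
bool fits_simple_prologue(const std::int64_t* adj, std::size_t n) {
    if (n > 3) {
        return false;  // only three predicates (p6..p8) are used in the sketch
    }
    for (std::size_t i = 0; i < n; ++i) {
        if (!fits_imm14(adj[i])) {
            return false;  // this adjustment would need a longer sequence
        }
    }
    return true;
}
```

A compiler would presumably fall back to some other variant, or to an ordinary thunk, whenever this check fails.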




Brian Thomson
VisualAge C/C++ Chief Architect





