substitutions

Tue Apr 18 22:11:38 UTC 2000

Jim Dehnert wrote:
> 
> I'll take a shot at this, but please correct me if I get it wrong.
> 
> > From: Alain Miniussi <alainm at cup.hp.com>
> >
> > Suppose that we need to encode the following
> >
> > C1::C2
> > C1::C3::C4
> > C1::C3::C5
> >
> > in a name, we'll get :
> >
> > N2C12C2E .... NS<n1>_2C32C4E  NS<n2>_2C5E ......
> 
> So, the substitution dictionary that gets built up is:
> 
> C1
> C1::C2
> ...
> C1::C3
> C1::C3::C4
> ...
> C1::C3::C5

Yes, except that the corresponding strings in the encoding are:

C1       ->   2C1
C1::C2   ->   2C12C2
...
C1::C3   ->   S<c1n>_2C3
C1::C3::C4 -> S<c1n>_2C32C4
...
C1::C3::C5 -> ????2C5

> Assuming nothing in the ellipses, these are, at the point of the last
> one:
>   S2_
>   S1_
>   S0_
>   S_
> 
> > The problem:
> >
> > We accept the substitution only if the size of the encoded
> > substitution is strictly smaller than the size of the
> > substituted entity.
> >
> > Now, let's say that S<n2>_ is longer than S<n1>_2C3. What should
> > we do?
> 
> Ah, I think I see your problem.  The entity that we're considering for
> substitution is 2C12C3, _not_ S<n1>_2C3.  Earlier substitutions don't
> come into play.

But we don't have the string 2C12C3 in the encoding. If C1::C2 already
appeared before, we need to reuse the C1 prefix (let's say that C1 is a
long enough name), so only S<nc1>_2C3 appears in the encoding, probably
along with some map indicating that C1::C3 is encoded with that string.
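
To make the idea of such a map concrete, here is a minimal C++ sketch, not
the ABI's actual algorithm. It assumes a hypothetical numbering in which
S<n>_ refers to the n-th dictionary entry in insertion order, written in
decimal, and it ignores the size rule entirely. Dictionary, sourceName and
encodeNested are names invented for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical dictionary: entry i pairs a qualified name with the string
// that actually appeared in the encoding for it.
struct Dictionary {
    std::vector<std::pair<std::string, std::string>> entries;

    // Index of the entry for `name`, or -1 if it has not been seen yet.
    int find(const std::string& name) const {
        for (std::size_t i = 0; i < entries.size(); ++i)
            if (entries[i].first == name) return static_cast<int>(i);
        return -1;
    }
};

// Length-prefixed source name, e.g. "C3" -> "2C3".
std::string sourceName(const std::string& id) {
    return std::to_string(id.size()) + id;
}

// Encode a nested name, reusing the longest prefix already in the
// dictionary (assumed numbering: insertion order, no size rule).
std::string encodeNested(const std::vector<std::string>& parts,
                         Dictionary& dict) {
    std::string qualified, encoded, prefixName;
    std::size_t start = 0;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        prefixName += (i ? "::" : "") + parts[i];
        int idx = dict.find(prefixName);
        if (idx < 0) break;               // longer prefixes are unknown too
        qualified = prefixName;
        encoded = "S" + std::to_string(idx) + "_";
        start = i + 1;
    }
    if (start == parts.size()) return encoded;  // whole name already known

    // Emit the remaining components and record each new prefix together
    // with the string that was really written for it.
    for (std::size_t i = start; i < parts.size(); ++i) {
        qualified += (qualified.empty() ? "" : "::") + parts[i];
        encoded += sourceName(parts[i]);
        dict.entries.emplace_back(qualified, encoded);
    }
    return "N" + encoded + "E";
}
```

Starting from an empty dictionary and encoding C1::C2, C1::C3::C4 and
C1::C3::C5 in that order, this sketch yields N2C12C2E, NS0_2C32C4E and
NS2_2C5E, and the string recorded for C1::C3 is S0_2C3, not 2C12C3.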

> > Clearly, we can't write NS<n1>_2C3 instead of S<n2>_ because
> > n1 does not refer to the same entity at that point.
> 
> Right.  Our choices at this point are (with the above numbering)
> 2C12C3 (no substitution), S2_2C3 (substitute for C1), or S0_
> (substitute for C1::C3).  We wouldn't choose the second because it's
> not shorter, and would choose the third unless <n> were > 3 digits.

Maybe my problem will be clearer with the following example:

struct C1xxxx {
    struct C2 {};
    struct C3 {
        struct C4 {};
    };
};

template <class T12, class T13, class Tune, class T134>
struct Temp : virtual something{};

And we need to encode the name:

Temp<C1xxxx::C2,C1xxxx::C3,int,C1xxxx::C3::C4>

(to encode its vptr, for example)

We have:

4Temp 
  I
    N 6C1xxxx 2C2 E 
    N S1_ 2C3 E
    i
    N S2_ 2C4 E // S2_ -> S1_ 2C3, let's call S2_ S<C13n>_ instead.
  E

Now imagine that, instead of "int", we have something very
big that generates a 5-digit (or larger) C13n (sure, it's
a big number, so it won't happen very often), so that
strlen("S<C13n>_") > strlen("S1_2C3").

At that point, I don't see what to write instead of "S<C13n>_2C4".
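
The length comparison behind the rule can be sketched as follows. This is
only an illustration, assuming that substitution numbers are written in
decimal as in the example above; substitutionRef and acceptSubstitution are
hypothetical helper names, not part of any proposal.

```cpp
#include <string>

// Build a reference to dictionary entry n, e.g. 2 -> "S2_".
// Assumption: the number is written in decimal.
std::string substitutionRef(unsigned n) {
    return "S" + std::to_string(n) + "_";
}

// The "smaller size" rule: accept the substitution only if the reference
// is strictly shorter than the string it would replace.
bool acceptSubstitution(const std::string& ref, const std::string& replaced) {
    return ref.size() < replaced.size();
}
```

Here substitutionRef(2) gives S2_ (3 chars), which is accepted against
S1_2C3 (6 chars), whereas substitutionRef(12345) gives S12345_ (7 chars),
which the rule rejects.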

> > If we are
> > ready to replace n1 with its updated value n1+delta, the rule
> > and the implementation become more complicated (imagine that
> > we have something more complex than S<n1>_2C3, with some
> > substituted template args and so on...).
> 
> I don't think this is true if you recognize that you're not
> substituting for already-substituted strings.  Do you?
> 
> > Now, the typical size of a substitution will be 3, and encoded source
> > names are at least 2 chars long (and I don't think that's the typical
> > size).
> > So the only "real world" (but everyone has their own, so...)
> > waste of space of more than 1 char per substitution I can think of
> > involves builtin types. What about suppressing the "smaller size"
> > rule and saying that builtin types can't be the source of further
> > substitutions?
> 
> I don't feel strongly about this.  Does anyone else?
> 
> Jim
> 
> -           Jim Dehnert         dehnert at sgi.com
>                                 (650)933-4272