Values, copies and string building


Jens Lideström

Oct 17, 2022, 8:39:51 AM

I'm wondering the following about M programs in GTM:

When are values copied, and can separate variables sometimes point to the same value?

Are the values of the variables copied in all of the following examples?

```
set localVariable=otherLocalVariable
set localVariable(1)=otherLocalVariable(1)
set global=otherLocalVariable
set global(1)=otherLocalVariable
set $extract(localVariable,1,2)=otherLocalVariable
do routine(localVariable)
set localVariable=$$extrinsicFunction()
```

These are the operations I want to illustrate with the examples:

* SET
- Are the values of local variables copied when one local is set to another? On globals?
- On the top-level values and values of subscripts?

* On SET `$extract(x,1,2)="X"`? Will the old contents of x be discarded, and a new string allocated with an updated copy?

* On routine and extrinsic function invocations, are the arguments copied? (Reference arguments are obviously not copied.)

I realise that when the values are small this is not important, because it will be fast to make copies. But if the values are large it seems like it might become a performance problem. It would be good to know a little about how this works.

One example where copies can be slow is when you build a string in a loop:

```
for i=1:1:10000 set text=$get(text)_i
```

It seems like in this example the content of the variable `text` will be copied in each iteration, making the loop runtime O(n^2) in the length of `text`.

Is there any way to avoid all the copies of `text` and build a string with linear execution time?

Are these kinds of things described in the GTM documentation? Or in other places?

Best regards,
Jens Lideström

K.S. Bhaskar

Oct 18, 2022, 1:07:13 PM

On Monday, October 17, 2022 at 8:39:51 AM UTC-4, jens.li...@vgregion.se wrote:
> I'm wondering the following about M programs in GTM:
>
> When are values copied, and can separate variables sometimes point to the same value?

[KSB] Yes, sometimes separate variables can point to the same value. In some cases, it is explicit, and in some cases, it is implicit.

For explicitly connecting variables, look at alias variables (https://docs.yottadb.com/ProgrammersGuide/langext.html#alias-variable-extensions).
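
As a minimal sketch of the explicit case: `SET *` makes one name an alias of another, so both names refer to the same data. The routine name `aliasdemo` and the variable names are made up for illustration.

```
aliasdemo ; hypothetical routine name, for illustration only
 new a,b
 set a="first"
 set *b=a ; b becomes an alias of a, not a copy of its value
 set b="second" ; assigning through b changes what a sees too
 write a,! ; writes: second
 quit
```

A plain `set b=a` in place of `set *b=a` would leave `a` at "first".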

For the implicit cases, an explanation might help. Actual strings are stored in a garbage-collected heap called the string pool. An "mval" is a data structure that includes (among other fields) a pointer to a location in the string pool and an associated length. So, a statement such as SET X=$EXTRACT(Y,…) creates a new mval for X that points to a part of the string in the string pool that the mval for Y points to.

Strings in the string pool are never modified, but they are garbage collected from time to time.

> Are the values of the variables copied in all of the following examples?
>
> ```
> set localVariable=otherLocalVariable
> set localVariable(1)=otherLocalVariable(1)

[KSB] In the above two cases, mvals are copied, not strings in the string pool.

> set global=otherLocalVariable
> set global(1)=otherLocalVariable

[KSB] Outside a transaction these two update an in-memory copy of the database block. When they occur inside transactions, the database blocks are updated when transactions are committed.

> set $extract(localVariable,1,2)=otherLocalVariable

[KSB] Yes, this does copy data in the string pool.
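
Seen from the M side, none of this sharing is visible, since strings in the pool are never modified in place. A small sketch (routine and variable names are illustrative only):

```
copydemo ; hypothetical routine name, for illustration only
 new a,b
 set a="abcdef"
 set b=a ; per the answer above, only the mval is copied here
 set $extract(a,1,2)="XY" ; a now refers to a newly built string
 write a,! ; writes: XYcdef
 write b,! ; writes: abcdef
 quit
```

`b` keeps the old value because the string it points to was left untouched when `a` was updated.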

> do routine(localVariable)

[KSB] This copies an mval.
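
For comparison with the reference arguments mentioned in the question, here is a short sketch of both calling styles. The labels `byval` and `byref` and the routine name are hypothetical; the leading dot in the actual list is what requests pass-by-reference.

```
passdemo ; hypothetical routine name, for illustration only
 new v
 set v="original"
 do byval(v) ; the value is passed; the callee gets its own p
 write v,! ; writes: original
 do byref(.v) ; leading dot: p in the callee refers to v itself
 write v,! ; writes: changed
 quit
byval(p) ; assignment to p is not visible to the caller
 set p="changed"
 quit
byref(p) ; assignment to p is visible to the caller
 set p="changed"
 quit
```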

> set localVariable=$$extrinsicFunction()

[KSB] The mval of the value returned by the extrinsic function is assigned to the local variable.

> ```
>
> These are the operations I want to illustrate with the examples:
>
> * SET
> - Are the values of local variables copied when one local is set to another? On globals?
> - On the top-level values and values of subscripts?
>
> * On SET `$extract(x,1,2)="X"`? Will the old contents of x be discarded, and a new string allocated with an updated copy?
>
> * On routine and extrinsic function invocations, are the arguments copied? (Reference arguments are obviously not copied.)
>
> I realise that when the values are small this is not important, because it will be fast to make copies. But if the values are large it seems like it might become a performance problem. It would be good to know a little about how this works.

[KSB] If the answers above did not answer these questions, please ask again.

> One example where copies can be slow is when you build a string in a loop:
>
> ```
> for i=1:1:10000 set text=$get(text)_i
> ```
>
> It seems like in this example the content of the variable `text` will be copied in each iteration, making the loop runtime O(n^2) in the length of `text`.
>
> Is there any way to avoid all the copies of `text` and build a string with linear execution time?

[KSB] The concatenate operator has an optimization: if the string being concatenated to is the last string in the string pool, so that it can simply be 'extended', the operator extends it in place without copying the entire string. So this should not be a performance issue, other than, of course, the overhead of doing it one character at a time. The loop is far more expensive than the concatenation. In a loop such as this, the first SET may copy the string, but subsequent extensions shouldn't, because the string is then the last one in the string pool and the shortcut is available.
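
One way to check this empirically is to time the loop for a few sizes and see whether CPU time grows roughly in proportion to the number of appends. This is only a sketch: the routine name is made up, and it assumes `$ZGETJPI` with the "CPUTIM" keyword reports CPU time in hundredths of a second for the current process when the first argument is empty, so please verify that against the documentation for your version.

```
concattime ; hypothetical routine name, for illustration only
 new n,i,text,t0
 ; sizes chosen to stay below the maximum string length (1 MiB, as far as I know)
 for n=250000,500000,1000000 do
 . kill text
 . set t0=$zgetjpi("","CPUTIM") ; assumed: CPU time so far, in 1/100 s
 . for i=1:1:n set text=$get(text)_"x"
 . write n," appends: ",$zgetjpi("","CPUTIM")-t0," (1/100 s CPU)",!
 quit
```

If the time roughly doubles when `n` doubles, the appends are behaving linearly; if it roughly quadruples, the whole string is being copied on each iteration.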

> Are these kinds of things described in the GTM documentation? Or in other places?

[KSB] Only in the source code.

Regards
– Bhaskar

>
> Best regards,
> Jens Lideström

Jens Lideström

Oct 18, 2022, 2:23:21 PM

Thanks for the explanation, Bhaskar! This is super interesting information.

/Jens