Outerjoin: MergeKeys true, key variables should be in output

paul.d...@gmail.com

Aug 28, 2015, 5:16:06 PM
My apologies for the re-post, but my original had a typo in the subject line, making it likely that no one would recognize the outerjoin command. Also, I had errors in my closing 3 points. Here is the corrected post.

According to http://www.mathworks.com/help/matlab/ref/outerjoin.html, if you specify 'MergeKeys',true, then outerjoin includes all key variables in the output table, C, and overrides the inclusion or exclusion of key variables specified via the 'LeftVariables' and 'RightVariables' name-value pair arguments.

I'm finding this to be untrue:

x = floor(10*rand(4,2))
y = floor(10*rand(4,2))
outerjoin( ...
    array2table(x), array2table(y), ...
    'LeftKeys',2, 'RightKeys',1, ...
    'MergeKeys',true, ...
    'LeftVariables',1, 'RightVariables',2 ...
)

The key variable is x2 (merged with y1). It doesn't show up in the output:

x =
     3     6
     9     7
     6     6
     6     3
y =
     1     0
     0     9
     1     3
     3     9
ans =
    x1     y2
    ___    ___
    NaN      9
    NaN      0
    NaN      3
      6      9
      3    NaN
      6    NaN
      9    NaN

I'm wondering if:
(1) I'm mistaken,
(2) the documentation is wrong and hence will change, or
(3) the function behaviour is wrong and hence will change.

dpb

Aug 28, 2015, 6:09:01 PM
On 08/28/2015 4:16 PM, paul.d...@gmail.com wrote:
...

> According to
> http://www.mathworks.com/help/matlab/ref/outerjoin.html,
...[a specific behavior]...
> I'm finding this to be untrue:
...

> I'm wondering if:
> (1) I'm mistaken,
> (2) the documentation is wrong and hence will change, or
> (3) the function behaviour is wrong and hence will change.

This is the stuff of a support request at <www.mathworks.com>.

--

paul.d...@gmail.com

Aug 28, 2015, 6:42:34 PM
On Friday, August 28, 2015 at 6:09:01 PM UTC-4, dpb wrote:
Noted. I just submitted it to TMW. Thanks.

Peter Perkins

Aug 30, 2015, 10:10:55 AM
Paul, you've specified 'LeftVariables' and 'RightVariables', and those
override what normally happens. The key sentence in the doc is this:

"You can use 'LeftVariables' to include or exclude key variables as well
as nonkey variables from the output, C."

With such a small example, it's hard to say what you need to do here,
but maybe you want to exclude variables from the input tables by
"subscripting them out" when you pass the tables into outerjoin.

Hope this helps.
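
To make the "subscripting them out" suggestion concrete, here is a minimal sketch. It assumes the tables carry extra non-key variables that are not wanted in the result; the third columns (x3, y3) and their values are hypothetical and were added only so there is something to exclude. The unwanted variables are dropped from the inputs, the keys are kept, and 'LeftVariables' and 'RightVariables' are left at their defaults so they cannot filter out the merged key.

```matlab
% Hypothetical inputs: the x and y values from the original post, each with
% one invented extra column (x3, y3) that we do not want in the output.
Tx = array2table([3 6 10; 9 7 20; 6 6 30; 6 3 40], ...
    'VariableNames', {'x1','x2','x3'});
Ty = array2table([1 0 5; 0 9 6; 1 3 7; 3 9 8], ...
    'VariableNames', {'y1','y2','y3'});

% Subscript the unwanted variables out of the inputs, keeping the keys
% (x2 and y1), and do not pass 'LeftVariables' or 'RightVariables'.
C = outerjoin(Tx(:, {'x1','x2'}), Ty(:, {'y1','y2'}), ...
    'LeftKeys', 'x2', 'RightKeys', 'y1', ...
    'MergeKeys', true);

% Inspect which variables came through, including the merged key.
disp(C.Properties.VariableNames)
```

The name given to the merged key variable is not spelled out here; checking C.Properties.VariableNames on your own release is the safest way to see it.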

dpb

Aug 30, 2015, 10:32:30 AM
On 08/30/2015 9:10 AM, Peter Perkins wrote:
> Paul, you've specified 'LeftVariables' and 'RightVariables', and those
> override what normally happens. The key sentence in the doc is this:
>
> "You can use 'LeftVariables' to include or exclude key variables as well
> as nonkey variables from the output, C."
>
> With such a small example, it's hard to say what you need to do here,
> but maybe you want to exclude variables from the input tables by
> "subscripting them out" when you pass the tables into outerjoin.
>
> Hope this helps.

Peter, I'm with Paul...it _may_ be the behavior is WAD, but I, like he,
can't really decipher what the expected behavior really is from the
documentation. The use of 'or' there is confusing as it appears to mean
either of two disparate actions but I don't see how to know which.

Admittedly, I don't have the facility so haven't explored it, but I
can't make heads nor tails of the functionality just reading...

--

paul.d...@gmail.com

Aug 30, 2015, 1:18:55 PM
On Sunday, August 30, 2015 at 10:32:30 AM UTC-4, dpb wrote:
>On 08/30/2015 9:10 AM, Peter Perkins wrote:
>> Paul, you've specified 'LeftVariables' and 'RightVariables', and
>> those override what normally happens. The key sentence in the doc
>> is this:
>>
>> "You can use 'LeftVariables' to include or exclude key variables as
>> well as nonkey variables from the output, C."
>>
>> With such a small example, it's hard to say what you need to do
>> here, but maybe you want to exclude variables from the input tables
>> by "subscripting them out" when you pass the tables into outerjoin.
>
> Peter, I'm with Paul...it _may_ be the behavior is WAD, but I, like
> he, can't really decipher what the expected behavior really is from
> the documentation. The use of 'or' there is confusing as it appears
> to mean either of two disparate actions but I don't see how to know
> which.
>
> Admittedly, I don't have the facility so haven't explored it, but I
> can't make heads nor tails of the functionality just reading...

Also, as per my original post, the documentation clearly says that the
MergeKeys behaviour trumps the LeftVariables and RightVariables behaviour.

Peter Perkins

Aug 31, 2015, 9:58:29 AM
Paul, my apologies. I had not seen that line in the doc; it does appear
to be incorrect. I will make a note to have it fixed. The lines that
say, "... include or exclude key variables ..." are intended to mean
"use this to say what should or should not be in the output". I'll make
a note to have that clarified.

Do you have an argument that the opposite behavior, i.e. that outerjoin
ought to behave the way that line in the doc says, is what you need? If
you leave out 'MergeKeys', then you'll get the key in the output. If
your goal in using 'LeftVariables' and 'RightVariables' is to exclude
variables that are not keys, you can do that with subscripting (either
before or after calling outerjoin).
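
For the "after" variant of the subscripting approach, a rough sketch under the same kind of assumptions: the tables and the extra non-key variables x3 and y3 are hypothetical, invented only so there is something to drop. The join runs with the default 'LeftVariables' and 'RightVariables', so nothing is filtered, and the unwanted non-key variables are deleted from the result afterwards.

```matlab
% Hypothetical tables: x3 and y3 are invented non-key variables to drop.
Tx = array2table([3 6 10; 9 7 20; 6 6 30; 6 3 40], ...
    'VariableNames', {'x1','x2','x3'});
Ty = array2table([1 0 5; 0 9 6; 1 3 7; 3 9 8], ...
    'VariableNames', {'y1','y2','y3'});

% Join with the default 'LeftVariables'/'RightVariables' so that every
% variable, including the merged key, reaches the output.
C = outerjoin(Tx, Ty, 'LeftKeys', 'x2', 'RightKeys', 'y1', ...
    'MergeKeys', true);

% Delete the unwanted non-key variables from the result by name.
C(:, {'x3','y3'}) = [];
```

Deleting by name avoids having to know the position or the exact name of the merged key variable in the output.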

Peter Perkins

Aug 31, 2015, 10:42:23 AM
Sorry, I meant to say, "If you leave out 'LeftKeys' and 'RightKeys',
then you'll get the (merged) key in the output."

paul.d...@gmail.com

Sep 2, 2015, 1:17:52 PM
No need to apologize, I'm glad that the documentation will be revised.

Frankly, though, I'm not sure why it's necessary to take control away from the user. The user can benefit from LeftVariables and RightVariables, and can also include key variables if he/she wishes.

I was confused by your 1st paragraph. My interpretation of the doc is that Left/RightVariables select which columns appear in the output, and if MergeKeys is true, then the merged key will be included regardless of whether it is specified in Left/RightVariables. If that is the *intended* meaning, it seems reasonable (though I still prefer the greater control in the preceding paragraph). However, it isn't what I observe. The example in my original post indicates that MergeKeys being true has no bearing on the output, and that the output is determined only by Left/RightVariables.

You also said that I can have the merge key included by leaving out Left/RightVariables, but that removes any selective control from the user at all. In answer to the question of whether I want to use Left/RightVariables for inclusion or exclusion, my first gut reaction would be to include all by default, and the exception would be if the user specifies Left/RightVariables. In the latter case, exclude by default, and include only if fields are specified in Left/RightVariables. However, I understand your suggestion about indexing to select fields for the outerjoin. I'll play around with that going forward. Thanks.
