Lesson 10, Ch4 - Level 1

30 views
Skip to first unread message

Sudha

unread,
Mar 13, 2015, 7:47:58 AM3/13/15
to anova...@googlegroups.com
proc contents data=orion.orders;
run;
proc contents data=orion.order_item;
run;
data work.allorders;
merge orion.orders orion.order_item;
by order_id;
run;

I am getting an error : ERROR: BY variables are not properly sorted on data set ORION.ORDER_ITEM.

So, i put in a sort step and used the newly created output dataset for the merge.
Although it does merge using order_id, the total no. of observations for me is 733 and not 732 as per the solution.

 

Laxmi Chaudhary

unread,
Mar 13, 2015, 11:27:45 PM3/13/15
to anova...@googlegroups.com
Hi Sudha,

You need to sort your data set first by the BY variable. You have used proc contents instead of proc sort.
The code should be:

proc sort data=orion.orders out=sort_orders;
    by order_id;
run;

proc sort data=orion.order_item out=sort_Item;
    by order_id;
run;

data work.allorders;
merge sort_orders sort_item;
by order_id;
run;
Message has been deleted

Sudha

unread,
Mar 14, 2015, 8:32:33 AM3/14/15
to anova...@googlegroups.com
Hi Laxmi

Thanks for the response, but:
The proc contents mentioned is the starter code as part of the question. I have also used a proc step.  (the code provided was the solution given on the site and not my code).
The question states that "A sort is not required because the datasets are already sorted by the common variable". 
 
Here are the issues:
1. The log says that orion.order__item is not sorted. (Hence, a proc sort was used - although supposedly not required, as per the question).
2. Just like my code, running your code also gives the total no. of observations as 733.

Here is my code:

proc contents data=orion.orders;
run;

proc contents data=orion.order_item;
run;

proc sort data=orion.order_item
out=work.order_item;
by order_id;
run;

data work.allorders;
merge orion.orders work.order_item;
by order_id;
run;

proc print data=work.allorders;
run;

Laxmi Chaudhary

unread,
Mar 15, 2015, 9:17:19 PM3/15/15
to anova...@googlegroups.com
Hey Sudha,

You need to sort both the data sets. If you look at the output of proc contents, non of the data set is sorted by Order_id. 
Nos of observation in each data set: Orders (1418) and Order_item (732)
----------------------------------------------------  Merged_Orders (1628)... since no condition stated it outputs both matches and non matches

Try this 
=========
proc contents data=orion3.orders;
run;

proc contents data=orion3.order_item;
run;

proc sort data=orion3.orders out=orders;
   by order_id;
run;

proc sort data=orion3.order_item out=order_item;
   by order_id;
run;

data Merged_orders;
   merge orders order_item;
   by order_id;
run;

proc print data=Merged_orders;
run;

Laxmi Chaudhary

unread,
Mar 15, 2015, 9:17:37 PM3/15/15
to anova...@googlegroups.com

Try this 
=========
   by order_id;
run;

proc print data=Merged_orders;
run;

Sudha

unread,
Mar 20, 2015, 7:20:20 AM3/20/15
to anova...@googlegroups.com
Hey laxmi
Yes, although the proc contents does say "NO" against the sorted column , upon going through both the datasets orion.order_item and orion.orders - we can see that the order_id observations are listed in ascending order.

However, in spite of sorting both data sets prior to merging them - the final output still shows 733 observations and 12 variables.
i do see one observation where the order_id is missing - maybe the solution does not take this into account??? - but it ideally should.

Reply all
Reply to author
Forward
0 new messages