ORDER BY clause displayed sorted output without executing any reducer

santlal gupta

May 4, 2016, 4:11:26 AM

to Lingual User

Hi,

I am new to Lingual. I ran the query below on a 1 MB input file:

select * from "logTest"."logTest" where "f1">0 order by "f1";

This query displayed the result in sorted order, but when I checked the log file, the number of reduce tasks was 0 and only mappers were launched. Below are the relevant log lines:

2016-05-04 12:20:28,707 INFO [main] org.apache.hadoop.mapred.MapTask: numReduceTasks: 0

2016-05-04 12:20:13,702 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1451399010344_2296 = 1287639. Number of splits = 2

2016-05-04 12:20:30,192 INFO [main] cascading.flow.hadoop.FlowMapper: sinking to: Hfs["SQLTypedTextDelimited[['f1', 'f2', 'f3' | int, String, double]]"]["hdfs://UbuntuD1:8020/user/hduser/results/20160504-121959-D60ED4737F.tcsv"]

I also found that the result of the query was saved at the location below:

2016-05-04 12:21:20,225 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1451399010344_2296_m_000001_0' to hdfs://UbuntuD1:8020/user/hduser/results/20160504-121959-D60ED4737F.tcsv/_temporary/1/task_1451399010344_2296_m_000001

When I went to hdfs://UbuntuD1:8020/user/hduser/results/20160504-121959-D60ED4737F.tcsv, I found the 2 part files below (I assume these are the outputs of the two map tasks; please correct me if I am wrong), but the data in them was not in sorted order.

hduser@UbuntuD2:~$ hadoop fs -ls /user/hduser/results/20160504-121959-D60ED4737F.tcsv

Found 3 items

-rw-r--r-- 3 hduser hadoop 0 2016-05-04 12:21 /user/hduser/results/20160504-121959-D60ED4737F.tcsv/_SUCCESS

-rw-r--r-- 3 hduser hadoop 640276 2016-05-04 12:20 /user/hduser/results/20160504-121959-D60ED4737F.tcsv/part-00000

-rw-r--r-- 3 hduser hadoop 640227 2016-05-04 12:21 /user/hduser/results/20160504-121959-D60ED4737F.tcsv/part-00001
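For anyone who wants to reproduce the "not in sorted order" check on a part file, a small script along these lines could be used. This is only a sketch: the function name `first_unsorted_row` is hypothetical, and it assumes the part files are comma-delimited, start with one header line, and carry the integer "f1" value in the first column. Adjust `delimiter` and `has_header` if the files differ.

```python
import csv


def first_unsorted_row(lines, delimiter=",", has_header=True):
    """Return the 1-based data-row index where column 0 decreases, else None.

    `lines` is any iterable of text lines, e.g. an open file or sys.stdin.
    Assumption: column 0 holds integer values (the "f1" field).
    """
    reader = csv.reader(lines, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the header line (assumed to be present)
    previous = None
    for index, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore blank lines
        value = int(row[0])
        if previous is not None and value < previous:
            return index  # ascending order is broken at this row
        previous = value
    return None  # every row was in non-decreasing order
```

The output of `hadoop fs -cat` for a part file can be piped into a script that calls `first_unsorted_row(sys.stdin)`; a return value of `None` means that file is sorted on the first column.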

So my question is: how did Lingual sort the data without executing any reducer?

Why was the query result displayed on the shell sorted when the mapper output is unsorted?

I have attached the log file for more information.

Thanks,

Santlal

application_1451399010344_2296.txt

Andre Kelpe

May 4, 2016, 6:51:52 AM

to lingua...@googlegroups.com

Which version of Lingual are you using?

- André

--
André Kelpe
an...@concurrentinc.com
http://concurrentinc.com

santlal gupta

May 5, 2016, 2:59:15 AM

to Lingual User

Hi Andre,

Thanks for your quick response.

I am using the following versions:

Lingual: 1.2.1
Cascading: 3.0.1
Hadoop: 2.6.0

Thanks

Santlal

Andre Kelpe

May 6, 2016, 6:15:56 AM

to lingua...@googlegroups.com

That is odd. Would you mind sharing an example log file and the way
you configured your catalog with me?

Thanks!

- André

santlal gupta

May 9, 2016, 2:54:09 AM

to Lingual User

Hi Andre,

Thanks for your response.

I have attached log files for your reference.

To configure the catalog, I followed the steps in the Lingual user guide (http://docs.cascading.org/lingual/1.2/).

Thanks

Santlal

UbuntuD3.myCluster_52529

UbuntuD4.myCluster_49849

santlal gupta

May 11, 2016, 2:58:23 AM

to Lingual User

Hi Andre,

Did you get a chance to look into the log files? I am unable to proceed further with my PoC and am waiting for your response.