@Senthil - I am calculating the average of R, G, B per image in the mapper task and outputting the averaged values (the triple (R, G, B)). These per-image averages are then averaged in the reducer to get the average RGB over all images in the HIB.
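For reference, the averaging logic described above can be sketched in plain Java (no Hadoop/HIPI types; the names `avgColor` and `combine` are illustrative, not from my actual mapper/reducer):

```java
// Sketch of the two averaging stages, assuming each pixel is an {r, g, b} triple.
public class RgbAverage {
    // Mapper side: mean R, G, B over all pixels of one image.
    static double[] avgColor(int[][] pixels) {
        double[] sum = new double[3];
        for (int[] p : pixels)
            for (int c = 0; c < 3; c++) sum[c] += p[c];
        for (int c = 0; c < 3; c++) sum[c] /= pixels.length;
        return sum;
    }

    // Reducer side: average of the per-image averages.
    // Note: this weights every image equally; if image sizes differ, a
    // pixel-count-weighted mean would be needed for a true global average.
    static double[] combine(double[][] perImage) {
        double[] out = new double[3];
        for (double[] a : perImage)
            for (int c = 0; c < 3; c++) out[c] += a[c];
        for (int c = 0; c < 3; c++) out[c] /= perImage.length;
        return out;
    }

    public static void main(String[] args) {
        double[] a = avgColor(new int[][]{{0, 0, 0}, {255, 255, 255}});
        double[] g = combine(new double[][]{a, {0, 0, 0}});
        System.out.println(a[0] + " " + g[0]); // per-image 127.5, global 63.75
    }
}
```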
I tried setting mapreduce.task.io.sort.mb to 200 and running the program. Unfortunately, it now aborts much earlier than before. Here is the output:
15/06/24 11:30:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/24 11:30:05 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted avg1k-hipi-op
15/06/24 11:30:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/24 11:30:08 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/06/24 11:30:08 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/06/24 11:30:08 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/06/24 11:30:08 INFO input.FileInputFormat: Total input paths to process : 1
Spawned 2map tasks
15/06/24 11:30:08 INFO mapreduce.JobSubmitter: number of splits:2
15/06/24 11:30:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local37413710_0001
15/06/24 11:30:10 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/06/24 11:30:10 INFO mapreduce.Job: Running job: job_local37413710_0001
15/06/24 11:30:10 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/06/24 11:30:10 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
15/06/24 11:30:10 INFO mapred.LocalJobRunner: Waiting for map tasks
15/06/24 11:30:10 INFO mapred.LocalJobRunner: Starting task: attempt_local37413710_0001_m_000000_0
15/06/24 11:30:10 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/06/24 11:30:10 INFO mapred.MapTask: Processing split: hdfs://localhost:54310/user/ubuntu/images1k.hib.dat:0+134272033
15/06/24 11:30:10 INFO mapred.MapTask: (EQUATOR) 0 kvi 52428796(209715184)
15/06/24 11:30:10 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 200
15/06/24 11:30:10 INFO mapred.MapTask: soft limit at 167772160
15/06/24 11:30:10 INFO mapred.MapTask: bufstart = 0; bufvoid = 209715200
15/06/24 11:30:10 INFO mapred.MapTask: kvstart = 52428796; length = 13107200
15/06/24 11:30:10 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
Record starts at byte 0 and ends at byte 134272032
15/06/24 11:30:11 INFO mapreduce.Job: Job job_local37413710_0001 running in uber mode : false
15/06/24 11:30:11 INFO mapreduce.Job: map 0% reduce 0%
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000ed78b000, 136794112, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 136794112 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/ubuntu/hs_err_pid5158.log