Hi all,
Thanks for setting this framework up and the detailed writeup at
(My two questions are marked Question1 and Question2 within the description of what I did.)
I followed the steps there and was able to get the spot instances on EC2 running. Before making any changes for my own extraction, I simply tried to run the code as-is using the WAT processor. However, the queue doesn't seem to start getting processed.
./bin/master queue --bucket-prefix CC-MAIN-2013-48/segments/1386163041297/wat/
added 100 objects to the queue
Then I started 10 workers (with instance type c3.xlarge). One of the changes I made in dpef.properties is the following, since the AMI already listed there gave an error.
## AMI which will be launched (Make sure the AMI you select has e.g. the write system language, which can influence your reading and writing of files.)
ec2ami = ami-01940631
Question1:
I am not sure how to select the AMI as the comment above requires. It says 'make sure the AMI you select has e.g. the write system language'. How do I check that? Is the AMI I have chosen okay in that respect?
./bin/master monitor
Monitoring job queue, extraction rate and running instances.
Q: 100 (0), N: 10/10
Even after 10-15 minutes, the monitor continued to show this same state.
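To make the stalled state concrete, here is a minimal sketch of how I am reading the monitor line. It assumes the format is "Q: <queued> (<in flight>), N: <running>/<requested>"; those field meanings are my guess, not something I found in the documentation, and parse_monitor_line and looks_stalled are hypothetical helper names of my own, not part of the framework.

```python
import re

# Assumed layout: "Q: <queued> (<in flight>), N: <running>/<requested>".
# The field meanings are a guess, not taken from the framework's documentation.
MONITOR_LINE = re.compile(r"Q:\s*(\d+)\s*\((\d+)\),\s*N:\s*(\d+)/(\d+)")


def parse_monitor_line(line):
    """Split one monitor status line into its four integer fields."""
    match = MONITOR_LINE.search(line)
    if match is None:
        raise ValueError(f"unrecognized monitor line: {line!r}")
    queued, in_flight, running, requested = (int(g) for g in match.groups())
    return {
        "queued": queued,
        "in_flight": in_flight,
        "running": running,
        "requested": requested,
    }


def looks_stalled(samples):
    """Return True if workers are up but the queue never moved across samples."""
    parsed = [parse_monitor_line(s) for s in samples]
    if len(parsed) < 2:
        return False  # a single sample cannot show a lack of progress
    first = parsed[0]
    return all(
        p["running"] > 0
        and p["queued"] == first["queued"]
        and p["in_flight"] == 0
        for p in parsed
    )
```

Under that reading, every sample I collected has 10 running workers, 100 queued objects, and nothing in flight, which is what I mean by the queue not being processed.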
Question2:
I would appreciate any hints or pointers on how to check whether the WAT files are being processed. For example, where on the EC2 instances should I look for the log files of the jobs?
Thanks in advance for your help,
Cheenu