Hi
I am new to Cloudera Search and have the following questions.
a) When I use post.jar to test a Solr collection, I only need to create a schema.xml in which I define a simple schema; no morphline file is required. But when I use MapReduceIndexerTool, a morphline file is required. So my question is: if I have a very simple CSV file that I need to index, why does MapReduceIndexerTool require a morphline file? The reason I ask is in the next question.
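For reference, this is roughly the kind of morphline file I mean for the simple CSV case. It is only a minimal sketch: the collection name, ZooKeeper address, and column names are placeholders I made up, and the columns are assumed to match fields already defined in schema.xml.

```
# Placeholder Solr location; collection name and zkHost are assumptions.
SOLR_LOCATOR : {
  collection : collection1
  zkHost : "127.0.0.1:2181/solr"
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      {
        # Parse each CSV line into one record; column names are hypothetical
        # and should match field names in schema.xml.
        readCSV {
          separator : ","
          columns : [id, name, price]
          ignoreFirstLine : true
          trim : true
          charset : UTF-8
        }
      }
      # Drop any record fields that schema.xml does not define.
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }
      # Hand the record to Solr for indexing.
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]
```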
b) According to the Solr documentation, Solr can accept an XML file directly as input. But my cluster uses Kerberos, so I have to use MapReduceIndexerTool to index the XML input file, and morphlines have no direct XML read command comparable to readCSV, so I have to use xquery/xslt. Is there any way to skip the morphline file and have MapReduceIndexerTool build the index directly according to the schema.xml I defined? If not, how does the morphline file help me? When I use xslt, am I supposed to flatten the XML file and pass the output as flattened records to loadSolr?
c) Also, my input XML has records with nested fields. What is the best practice for writing the morphline file and the Solr schema.xml so that they work together in this case?
Thanks
Aniruddh