I really appreciate Ruflin's great work on Elastica. I am Vijay, and I work on data management projects. I started working with Elastica recently and want to implement Elasticsearch in our web application. I have installed Elastica, run the tests, and written a few examples myself, and everything works well. I would like to know the best way to index XML files, since I have a huge number of them. A sample XML document looks like this.
....
....
1) What is the best way to index XML files: reading them from the database or from the file system? Can anyone provide some examples? I have read "Lucene in Action", which shows how to index an XML document: it parses the document, reads each tag, and creates a field for it. Should I use the same approach, or is there a way to convert the XML to JSON and index that?
2) I am trying to define my mapping like this.
3) My web application displays these documents so that users can create or edit them, so the XML files will change. Once a document is modified, do I need to remove it from the index, create a new document, and add that to the index?
<document>
<document_id>mnwp000002</document_id>
<record_create_date>04-30-2004</record_create_date>
<indexing_data_id>mnwp</indexing_data_id>
<item_title>Mrs. George Elder Adams, of New York, who took part in the picketing of the White House by members of the Woman's Party.</item_title>
<author_creator label="pht">Edmonston, Washington, D.C.</author_creator>
<source_collection>Records of the National Woman's Party</source_collection>
<collection_id>ammem/mnwp</collection_id>
<physical_locator_id label="Location">National Woman's Party Records, Group I, Container I:147, Folder: Adams, Mrs. George E.</physical_locator_id>
<document_type>still image</document_type>
<genre authority="bgtchm">Photographs</genre>
<medium>1 photograph: print; 4 x 6 in.</medium>
<text_date>[ca. 1917-1920]</text_date>
<language_of_cataloging>eng</language_of_cataloging>
<digital_origin>reformatted digital</digital_origin>
<subject label="lcsh">National Woman's Party</subject>
<subject label="lcsh">Suffragists--United States--1910-1920</subject>
<subject label="lcsh">The Suffragist (serial)</subject>
<subject label="lcsh">Women--Suffrage--New York (State)</subject>
<subject label="local">Adams, Mrs. George Elder</subject>
<geog_subject>
<country>United States</country>
<state>New York</state>
</geog_subject>
<note label="Summary">Studio portrait, Mrs. George Elder Adams of New York, in hat and fur stole, standing with a copy of the newsletter The Suffragist in her hands.</note>
<note>Title transcribed from item.</note>
<digital_object fileGrp_ptr="GRP001">
<do_reference>
<do_digital_id>147001</do_digital_id>
<do_aggregate>mnwp</do_aggregate>
</do_reference>
<do_handle_information>hdl:loc.mss/mnwp.147001</do_handle_information>
<do_display_type_id>p</do_display_type_id>
</digital_object>
<restriction_description>No known restrictions on use or reproduction.</restriction_description>
<division_id>mss</division_id>
<date_sorter>19170000</date_sorter>
<fileSec>
<fileGrp ID="GRP001">
<file ID="FILE001" MIMETYPE="image/tiff" CREATED="2004-12-22" USE="master" SEQ="1" SIZE="5198214">/master/mss/mnwp/147/147001u.tif</file>
<file ID="FILE002" MIMETYPE="image/gif" CREATED="2005-06-29" USE="thumbnail" SEQ="1" SIZE="7980">/service/mss/mnwp/147/147001t.gif</file>
<file ID="FILE003" MIMETYPE="image/jpeg" CREATED="2005-06-29" USE="service-high" SEQ="1" SIZE="657830">/service/mss/mnwp/147/147001v.jpg</file>
<file ID="FILE004" MIMETYPE="image/jpeg" CREATED="2005-06-29" USE="service-low" SEQ="1" SIZE="27897">/service/mss/mnwp/147/147001r.jpg</file>
</fileGrp>
</fileSec>
</document>
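To make the XML-to-JSON idea from question 1 concrete, here is a rough sketch of one possible conversion. It is written in Python with only the standard library, purely as an illustration (the real application would go through Elastica). The helper names `element_to_value` and `xml_to_json`, the `@` prefix for attributes, and the `#text` key are all assumptions made for this example, not anything Elastica or Elasticsearch prescribes. `SAMPLE` is a shortened copy of the document above.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert an element to a string (plain leaf) or a dict (anything else)."""
    children = list(elem)
    text = (elem.text or "").strip()
    # A leaf without attributes becomes a plain string field.
    if not children and not elem.attrib:
        return text
    # Attributes are kept under "@name" keys (a convention chosen for this sketch).
    node = {"@" + name: value for name, value in elem.attrib.items()}
    if text:
        node["#text"] = text
    for child in children:
        value = element_to_value(child)
        if child.tag in node:
            # Repeated tag (e.g. <subject>): collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


def xml_to_json(xml_text):
    """Parse one XML document and return it as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps(element_to_value(root))


# Shortened copy of the sample document, for illustration only.
SAMPLE = """<document>
<document_id>mnwp000002</document_id>
<subject label="lcsh">National Woman's Party</subject>
<subject label="local">Adams, Mrs. George Elder</subject>
<geog_subject>
<country>United States</country>
<state>New York</state>
</geog_subject>
<note>Title transcribed from item.</note>
</document>"""

doc = json.loads(xml_to_json(SAMPLE))
print(json.dumps(doc, indent=2))
```

One caveat with this naive conversion: in the full sample, `<note>` appears both with and without a `label` attribute, so the same field would come out as an object in one place and a plain string in another. A real mapping would probably need every occurrence of a field normalised to a single shape before indexing.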
Please point me to any samples or documentation that might help.
I really appreciate any kind of help!
- Vijay.