How to? bulk record creation clojurewerkz.elastisch.native.bulk/bulk-with-index-and-type


Dave Tenny

Sep 2, 2016, 12:53:38 PM
to clojure-elasticsearch
So I have a bunch of records I wish to load in bulk.

Here's a sample record, as printed by Clojure's 'pr':

{:_index ".kibana", :_type "search", :_id "Firm-Provisions-Used-(associated-with-docs)", :_score 1.0, :_version -1, :highlight {}, :_source {:hits 0, :columns ["APC-DEPLOYMENT-ID" "firm-name" "firm-id" "provision-name" "provision-id"], :description "", :sort ["_score" "desc"], :title "Firm Provisions Used (associated with docs)", :version 1, :kibanaSavedObjectMeta {:searchSourceJSON "{\"index\":\"kira-provision-summaries-*\",\"query\":{\"query_string\":{\"analyze_wildcard\":true,\"query\":\"*\"}},\"filter\":[],\"highlight\":{\"pre_tags\":[\"@kibana-highlighted-field@\"],\"post_tags\":[\"@/kibana-highlighted-field@\"],\"fields\":{\"*\":{}},\"require_field_match\":false,\"fragment_size\":2147483647}}"}}}

You'll notice it has useful values for :_id (in particular), as well as :_index and :_type.

I cannot seem to find a way to call the native/bulk apis in Elastisch (3.0.0-beta-1) for these records.

If I try bulk-with-index-and-type with any of the doc stream prep utilities (bulk-create, bulk-index, bulk-update),
nothing works.  Furthermore, they're mostly inclined to put in the :_index and :_type values that my records already have, so I'm not sure why I'd want them.

If I just try to call bulk-with-index-and-type on a vector of documents for the type, still no luck.

Btw, I notice the unit tests for native/bulk have no bulk-create tests.

There could of course be something else that's off on my end.

Meanwhile, what is *supposed* to be the correct native/bulk way to create records like the above?

Thanks for any tips.

(And yes, if you're familiar with Kibana, I'm playing with its indices directly instead of via the UI).


Dave Tenny

Sep 6, 2016, 11:37:14 AM
to clojure-elasticsearch
Well, to answer my own questions.

I was basically trying to take a search hit and use it to populate an index. There are two problems.
(1) Elastisch doesn't support bulk create. If you use the native 'bulk-create' API you're going to be disappointed. Use bulk-index instead.
(2) Documents to be inserted must be flat maps: the document properties have to sit at the same level as :_index, :_type and :_id, not nested under :_source.

You'll see this exception if you try to use bulk-create:
org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: no requests added;


The following function (or docstring) may be of use to you if you are trying to massage search hits for subsequent index population:

(defn search-hit->elastisch-document
  "When you do a search in Elastisch, a single 'hit' in the search result might look like:
  {:_index \".kibana\" :_type \"search\" :_id \"Foo\" ... :_source {:columns ... :description ...}}
  where the first '...' is the rest of the ES document metadata, and the '...'s in the :_source map are
  the various properties of the document.

  If you take the above hit and pass it as-is into the (esb/bulk-create ...) interface to
  create a document, it will correctly pick off the ES metaproperties, but not the :_source
  stuff.  Elastisch wants ALL the properties at the same level in one map. E.g.
  {:_index \".kibana\" :_type \"search\" :_id \"Foo\" ... :columns ... :description ...}

  So this function does that transformation.  If there's such a function in Elastisch I haven't found it.
  Elastisch doesn't even have a test coded for bulk-create much less the sort of search-hit->doc-action
  I'm talking about.

  FURTHERMORE: It looks like Elastisch doesn't support 'create' operations at all.
  Looking at cnv/->action-requests and the AddOperation protocols, there's no 'create' support.
  Use 'index.  E.g. (esb/bulk conn (esb/bulk-index ...docs...)), not bulk-create.

  Input: search hit or similar map.
  Output: suitably flattened map."
  [search-hit]
  ;; *TBD*: DO I need to remove everything but _id _type _index w.r.t. ES meta properties?
  ;; e.g. remove :_score?  For now I am.  That includes _score, _version, :highlight (Elastisch meta?),
  ;; and maybe other things like aggregates or what have you.
  (assoc (:_source search-hit)
         :_id (:_id search-hit)
         :_type (:_type search-hit)
         :_index (:_index search-hit)))
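
For what it's worth, here is a minimal sketch of how the function above might be applied to a batch of hits. The sample hits are made up and trimmed down for illustration, and the final bulk call is only the one suggested in the docstring, left as a comment because it needs a live connection (conn and the esb alias are assumed, not shown here).

```clojure
;; Copy of the function above (docstring omitted) so this sketch runs on its own.
(defn search-hit->elastisch-document
  [search-hit]
  (assoc (:_source search-hit)
         :_id (:_id search-hit)
         :_type (:_type search-hit)
         :_index (:_index search-hit)))

;; Hypothetical, trimmed-down search hits; the values are invented for illustration.
(def sample-hits
  [{:_index ".kibana" :_type "search" :_id "Foo" :_score 1.0
    :_source {:title "Foo search" :hits 0}}
   {:_index ".kibana" :_type "search" :_id "Bar" :_score 0.5
    :_source {:title "Bar search" :hits 0}}])

;; Flatten every hit into the single-level map shape described above.
(def docs (mapv search-hit->elastisch-document sample-hits))

;; Each doc now has the :_source properties sitting beside :_id, :_type and :_index.
;; :_score is gone because only those three metadata keys are copied over.
(first docs)

;; With a live connection, the docstring's suggestion would then be (not run here):
;; (esb/bulk conn (esb/bulk-index docs))
```

Since the function builds the result from :_source and copies over only three metadata keys, anything else on the hit (:_score, :_version, :highlight) is dropped, which is the behaviour the TBD comment in the function is asking about.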

