Is it possible to retrieve the original data structure, or JSON encoded string of the document, from the results of a bleve index.Search?
After looking at https://github.com/blevesearch/bleve/blob/master/http/doc_get.go, I tried something like this:
package main

import (
	"encoding/json"
	"fmt"
	"time"

	"github.com/blevesearch/bleve"
	"github.com/blevesearch/bleve/document"
)

func main() {
	index, _ := bleve.Open("MyIndex.bleve")
	query := bleve.NewMatchQuery("<my search query>")
	searchRequest := bleve.NewSearchRequest(query)
	searchResult, _ := index.Search(searchRequest)
	// collect the original documents
	for _, val := range searchResult.Hits {
		id := val.ID
		doc, _ := index.Document(id)
		rv := struct {
			ID     string                 `json:"id"`
			Fields map[string]interface{} `json:"fields"`
		}{
			ID:     id,
			Fields: map[string]interface{}{},
		}
		for _, field := range doc.Fields {
			// convert each stored field back into a plain Go value
			var newval interface{}
			switch field := field.(type) {
			case *document.TextField:
				newval = string(field.Value())
			case *document.NumericField:
				n, err := field.Number()
				if err == nil {
					newval = n
				}
			case *document.DateTimeField:
				d, err := field.DateTime()
				if err == nil {
					newval = d.Format(time.RFC3339Nano)
				}
			}
			// a field name that repeats is collected into a slice
			existing, existed := rv.Fields[field.Name()]
			if existed {
				switch existing := existing.(type) {
				case []interface{}:
					rv.Fields[field.Name()] = append(existing, newval)
				case interface{}:
					arr := make([]interface{}, 2)
					arr[0] = existing
					arr[1] = newval
					rv.Fields[field.Name()] = arr
				}
			} else {
				rv.Fields[field.Name()] = newval
			}
		}
		js, _ := json.MarshalIndent(rv, "", " ")
		fmt.Printf("%s\n", js)
	}
}
but the result isn’t exactly like the original document that was indexed. Is there a better way to do this?
My intent is to get the IDs from the results of a search, somehow fetch the original documents corresponding to the IDs from the key-value store, and then render those documents into customized text and/or web views.
Is trying to regenerate the original document from a search result an inappropriate use of bleve?
-Indraniel
but the result isn’t exactly like the original document that was indexed. Is there a better way to do this?
You didn't mention how it differed, but I'm assuming it was date and number fields that were wrong? If not, it might be useful to share the problems, some of them might be fixable.
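To illustrate why date and number fields in particular might not round-trip, here is a small standalone sketch. It does not use bleve at all; it only assumes that a number comes back as a float64 and that a date is re-rendered in UTC with time.RFC3339Nano, as in the loop from the first post. The helper names asStoredNumber and asStoredDate are made up for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// asStoredNumber mimics a numeric value that is handed back as a float64.
// (Hypothetical helper; the float64 representation is an assumption.)
func asStoredNumber(n int64) float64 {
	return float64(n)
}

// asStoredDate mimics a date that is parsed and then re-rendered in UTC
// with RFC3339Nano, the layout used in the loop from the first post.
// (Hypothetical helper; UTC normalization is an assumption.)
func asStoredDate(s string) (string, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return "", err
	}
	return t.UTC().Format(time.RFC3339Nano), nil
}

func main() {
	// Integers above 2^53 cannot all be represented exactly as float64.
	var big int64 = 9007199254740993 // 2^53 + 1
	fmt.Printf("%d -> %.0f\n", big, asStoredNumber(big))

	// The same instant, but no longer the same text as the original.
	orig := "2015-03-11T07:46:33-05:00"
	got, err := asStoredDate(orig)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s -> %s\n", orig, got)
}
```

Both values still describe the same number or instant approximately or exactly, but the regenerated JSON text would no longer match what was originally indexed.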
Sorry, I forgot to apply the formatting in my earlier post. Hopefully this edition is a bit easier to read in a browser.
On Wednesday, March 11, 2015 at 7:46:33 AM UTC-5, Marty Schoch wrote:
but the result isn’t exactly like the original document that was indexed. Is there a better way to do this?
You didn't mention how it differed, but I'm assuming it was date and number fields that were wrong? If not, it might be useful to share the problems, some of them might be fixable.
My apologies for not showing a concrete example in my earlier post. Here’s a toy example I made with a 2-level nested document data structure:
package main

import (
	"encoding/json"
	"fmt"
	"github.com/blevesearch/bleve"
	"github.com/blevesearch/bleve/document"
)
...
for _, val := range results.Hits {
	id := val.ID
	doc, _ := index.Document(id)
	rv := struct {
		ID string `json:"id"`
		... {
			rv.Fields[field.Name()] = newval
		}
	}
	j2, _ := json.MarshalIndent(rv, "", " ")
	docs = append(docs, j2)
}
There is one other possibility. Bleve offers the ability for an application to store any side-channel information it wants inside the underlying KV store. Normally this is used by apps which store sequence numbers or progress tracking information so they can safely resume indexing streams of data. But there is no restriction on how it is used; your application has an entire key space to use as it sees fit. In your case, you could directly store the document source in it by performing:
    index.SetInternal(docID, docSource)
Then, simply retrieve it using:
    index.GetInternal(docID)
If you think you also might need to store other things here, then it is up to you to further partition the key space with some prefix.
Bleve won't know about any of this data, and won't use it in any way, but it might be a quick workaround you could use for now.
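A minimal sketch of that idea, under a few assumptions. bleve's Index has SetInternal and GetInternal methods that take and return []byte, so the small internalStore interface below should be satisfied by an opened index. To keep the example runnable without an index on disk, it uses a map-backed stand-in (memStore) instead. The "src:" prefix, the note type, and the helper names storeSource and loadSource are all invented for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalStore lists the two methods this sketch needs. An opened bleve
// index is assumed to provide both with these []byte-based signatures.
type internalStore interface {
	SetInternal(key, val []byte) error
	GetInternal(key []byte) ([]byte, error)
}

// memStore is a map-backed stand-in (hypothetical, illustration only).
type memStore map[string][]byte

func (m memStore) SetInternal(key, val []byte) error {
	m[string(key)] = val
	return nil
}

func (m memStore) GetInternal(key []byte) ([]byte, error) {
	return m[string(key)], nil
}

// sourcePrefix partitions the internal key space; the value is an
// arbitrary choice made for this sketch.
const sourcePrefix = "src:"

func sourceKey(docID string) []byte {
	return []byte(sourcePrefix + docID)
}

// storeSource saves the JSON encoding of v under the prefixed document ID.
func storeSource(s internalStore, docID string, v interface{}) error {
	src, err := json.Marshal(v)
	if err != nil {
		return err
	}
	return s.SetInternal(sourceKey(docID), src)
}

// loadSource fetches the stored JSON for docID and decodes it into out.
func loadSource(s internalStore, docID string, out interface{}) error {
	src, err := s.GetInternal(sourceKey(docID))
	if err != nil {
		return err
	}
	if src == nil {
		return fmt.Errorf("no stored source for %q", docID)
	}
	return json.Unmarshal(src, out)
}

// note is a toy document with one nested level.
type note struct {
	Title string `json:"title"`
	Meta  struct {
		Author string `json:"author"`
	} `json:"meta"`
}

func main() {
	store := memStore{}

	var in note
	in.Title = "hello"
	in.Meta.Author = "someone"
	if err := storeSource(store, "doc-1", in); err != nil {
		panic(err)
	}

	// In a real program the IDs would come from the search result hits.
	for _, id := range []string{"doc-1"} {
		var out note
		if err := loadSource(store, id, &out); err != nil {
			panic(err)
		}
		fmt.Printf("%s: %+v\n", id, out)
	}
}
```

Because the source is stored and returned as opaque bytes, nested structures, numbers, and dates come back exactly as they were encoded, which avoids rebuilding the document field by field.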
Are documents with nested levels of data structures considered bad search design?
Using index.GetInternal seems to work well for me too. I’ve updated my toy example with the technique and placed it in this GitHub gist:
https://gist.github.com/indraniel/8108bd7def9b5e222417
Thanks again for the explanations!