ElasticSearch index not updated on SQL UPDATE/DELETE

89 views
Skip to first unread message

Xavier Tello

unread,
May 17, 2017, 11:23:50 AM5/17/17
to zom...@googlegroups.com
Hello,

Using ZomboDB 3.1.12, PostgreSQL 9.3, ElasticSearch 1.7.3, I can create my ES index as expected, but when I perform a DELETE in my DB, the corresponding document in ES index is not removed (using Wireshark I can see that no HTTP call is made to ES even when the DB transaction is committed and closed). When I UPDATE a db row, I can see a HTTP POST sent to ES to create a new document, but the old document is still in the index (same document duplicated: old one and new one).

Here is how I create my index:
CREATE TYPE item_desc AS (id INTEGER, title FULLTEXT, description FULLTEXT, date TIMESTAMP);
CREATE INDEX idx_zdb_items ON items USING zombodb(zdb('items', items.ctid), zdb( ROW(id, title, description, modified_at)::item_desc  )) WITH (url='http://localhost:9200/');

Thanks
--
Xavier Tello

xavier...@cbc.ca

unread,
May 17, 2017, 12:58:13 PM5/17/17
to ZomboDB, xavier...@cbc.ca
Updates: I also reproduced this issue with the procedure+data described in the "Getting Started" official ZomboDB Github page.  Same issue with ZomboDB 4.0.0 / ES 2.4.4.
Of course, when using "REINDEX INDEX my_index", the ES index is updated, but I expect ZomboDB to synchronize ES index without having to rebuild this index every time (could lead to huge load in case of frequent modifications and big index)

Eric Ridge

unread,
May 18, 2017, 10:49:16 PM5/18/17
to zom...@googlegroups.com, xavier...@cbc.ca

DELETEs aren’t supported by ZDB.  The Postgres Acess Method API doesn’t route DELETEs through the API, so ZDB doesn’t get notification of them.  You can run a plain VACUUM on the table to force the deleted rows to be removed from the backing ES index, or wait for autovacuum to do it.

 

Note that querying ZDB through Postgres with a SELECT * FROM table WHERE zdb(‘table’, ctid) ==> ‘full text query’-style query will never return deleted rows, so generally this isn’t a problem.

 

If you’re querying ES directly through its REST API, that’s not a supported use-case for ZDB, so that is kinda a buyer-beware situation.

 

That said, I created an issue the other day (https://github.com/zombodb/zombodb/issues/208) that talks about a crashing problem ZDB has with the most-recent PG point releases.  In order to fix that (it’s related to VACUUM), I’m going to need to add support for tracking DELETEd tuples, but the implementation detail there will be adding additional docs to a new type in the ES index, not actually deleting the docs.

 

All the more reason why you should be querying ZDB via PG, not directly querying its ES indexes via ES’ REST API.

 

eric

 

Sent from Mail for Windows 10

--
You received this message because you are subscribed to the Google Groups "ZomboDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zombodb+u...@googlegroups.com.
To post to this group, send email to zom...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/zombodb/f322f87e-7ef9-4e09-ab8d-cfae56191393%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

 

Xavier Tello

unread,
May 19, 2017, 8:39:28 AM5/19/17
to Eric Ridge, zom...@googlegroups.com
Hello Eric,

Thank you for your answers.

For now we are using PG v9.3.13 so I guess VACUUM will do the trick: I just tried it and it seems that VACUUM is forcing ZDB to rebuild the whole ES index, so UPDATEd/DELETEd rows are synchronized - I will now check if it's an acceptable solution for production to run it more frequently than it is right now.

As for calling ES directly, I had some trouble to write the ZDB requests I needed, so I decided to use ES directly (more flexible, more support/documentation due to ES community). So I guess I will try again soon, I may ask you new questions to write my ZDB requests ;-)

Thanks again!
Xavier

To unsubscribe from this group and stop receiving emails from it, send an email to zombodb+unsubscribe@googlegroups.com.


To post to this group, send email to zom...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/zombodb/f322f87e-7ef9-4e09-ab8d-cfae56191393%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

 




--
Xavier Tello

Senior Developer
CBC / Digital Operations / Content Distribution
Desk 9B110-H

Eric Ridge

unread,
May 19, 2017, 3:16:11 PM5/19/17
to Xavier Tello, zom...@googlegroups.com

Hi Xavier!

 

Running a plain VACUUM (as in, simply “VACUUM tablename;”) does *NOT* rebuild indexes.  A “VACUUM FULL tablename;” would, however.

 

Also, you might should spend some time with ZDB’s “SYNTAX.md” documentation, to help you with formulating queries.  Note that it even lets you use ES QueryDSL directly:  https://github.com/zombodb/zombodb/blob/master/SYNTAX.md#elasticsearch-json-queries

 

Bypassing PG and talking to ES directly really isn’t a supported situation.

 

eric

 

Sent from Mail for Windows 10

 

To unsubscribe from this group and stop receiving emails from it, send an email to zombodb+u...@googlegroups.com.


To post to this group, send email to zom...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/zombodb/f322f87e-7ef9-4e09-ab8d-cfae56191393%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



 

--

Xavier Tello

 

Senior Developer

CBC / Digital Operations / Content Distribution

Desk 9B110-H

 

Xavier Tello

unread,
May 19, 2017, 3:26:58 PM5/19/17
to Eric Ridge, zom...@googlegroups.com
Yes, I talked about VACUUM FULL, sorry it was not clear.
Thanks for the link, I will look at it.

Xavier

To unsubscribe from this group and stop receiving emails from it, send an email to zombodb+unsubscribe@googlegroups.com.


To post to this group, send email to zom...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/zombodb/f322f87e-7ef9-4e09-ab8d-cfae56191393%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



 

--

Xavier Tello

 

Senior Developer

CBC / Digital Operations / Content Distribution

Desk 9B110-H

 

Reply all
Reply to author
Forward
0 new messages