Excluding rows

30 views
Skip to first unread message

SanF Fer

unread,
Apr 18, 2017, 9:03:51 PM4/18/17
to zom...@googlegroups.com
This may be a silly question, but is there a way to exclude rows from ending up in ES?

Eric Ridge

unread,
Apr 19, 2017, 2:09:00 AM4/19/17
to zom...@googlegroups.com
Maybe not silly. Can you be more specific about your use case? I can think of a few ways...

eric
On Tue, Apr 18, 2017 at 7:03 PM SanF Fer <sa.fer...@gmail.com> wrote:
This may be a silly question, but is there a way to exclude rows from ending up in ES?

--
You received this message because you are subscribed to the Google Groups "ZomboDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zombodb+u...@googlegroups.com.
To post to this group, send email to zom...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/zombodb/CAMSoeuRbv%2BtSt%2B4GZWVpHc-p_SZ6hN-RYjB_8%2BQJWbCdu_-o2g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

sa.fer...@gmail.com

unread,
Apr 19, 2017, 9:59:20 AM4/19/17
to ZomboDB
In my case I'm storing a file listing in Postgres. It'll contain both folders and files. Say I have these columns:

file_path, file_ext, file_name, file_isdir

In the hypothetical situation that my web application:
  • can search for files, not folders
  • can not search in filepaths
I'd first create a type that excludes the columns that are fast enough to search for in Postgres - (or that I don't want in ES for other reasons):

CREATE TYPE type_files AS (
id bigint,
file_name VARCHAR,
file_ext VARCHAR(8));

CREATE INDEX zdb_idx_files ON files
USING zombodb (zdb('files', ctid), zdb(ROW(
id,
file_name,
file_ext)::type_files)) WITH (url='http://localhost:9200/');

Which I reckon limits the stuff going into ES, except that folders (rows that have `file_isdir` set to true) will still end up in ES, while my web application never uses this data.

The overall idea would be to decrease the performance impact of zombodb's bulk behaviour by having full control over what goes into ES, save some IO and diskspace in the process.

Eric Ridge

unread,
Apr 19, 2017, 1:34:25 PM4/19/17
to ZomboDB
To do what you want, you can simply create a "partial index", like so:

CREATE INDEX idxfoo ON table USING zombodb (....) WITH (...) WHERE file_isdir = false;

Postgres will then only insert rows into that index when its predicate evaluates to true.

When you query, you *must* also include that predicate or else Postgres won't know that it can use the partial index:

SELECT * FROM table WHERE zdb('table', ctid) ==> 'user query' AND file_isdir = false;

You'll only pay a small performance cost when you INSERT or UPDATE records, and save all sorts of time at SELECT.

Note that partial indexes are built-in to Postgres and not anything ZomboDB specific.  Postgres is pretty darn great!

eric


sa.fer...@gmail.com

unread,
Apr 19, 2017, 2:11:02 PM4/19/17
to ZomboDB
Cool! thanks!
Reply all
Reply to author
Forward
0 new messages