Couchbase Lite Full Text Search


Julio Albuquerque

Dec 9, 2014, 10:33:18 AM
to mobile-c...@googlegroups.com
Hi.
Has anyone already implemented full-text search in Couchbase Lite for Android?
I need to search Couchbase Lite for Android the way it is done in SQL: LIKE '%text%'
For example: select * from table where name like '%filter%'. This should return all documents whose name contains the filter.
Can anyone help?

Jens Alfke

Dec 9, 2014, 11:39:03 AM
to mobile-c...@googlegroups.com
On Dec 9, 2014, at 7:33 AM, Julio Albuquerque <jcezar.al...@gmail.com> wrote:

Has anyone already implemented full-text search in Couchbase Lite for Android?

No, not yet.

I need to search Couchbase Lite for Android the way it is done in SQL: LIKE '%text%'
For example: select * from table where name like '%filter%'. This should return all documents whose name contains the filter.

That's not quite the same thing. A SQL LIKE with a leading wildcard is a pattern match; it can't use an index, so it has to brute-force scan every row in O(n) time. You can do the same thing yourself by creating a view that emits the 'name' property as the key or value, then iterating over all the query rows looking for matches.

Full-text search is much faster because it uses an index, but it's word-based. For example, searching for "hat" wouldn't find the word "what" because they're different words.
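
A minimal JavaScript sketch of that difference, using made-up data and helper names (none of this is Couchbase Lite API). The substring scan behaves like LIKE '%hat%' and visits every string; the word index only answers whole-word lookups:

```javascript
// Hypothetical sample data; in a real app these would come from documents.
const names = ["What a day", "Red hat shop", "Chatty cafe"];

// Brute-force substring scan: visits every string, O(n) per query.
function substringScan(items, needle) {
  const n = needle.toLowerCase();
  return items.filter((s) => s.toLowerCase().includes(n));
}

// Word index: maps each lowercased word to the positions of the strings that contain it.
function buildWordIndex(items) {
  const index = new Map();
  items.forEach((s, i) => {
    for (const word of s.toLowerCase().split(/\s+/)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(i);
    }
  });
  return index;
}

// Whole-word lookup: a single map access instead of a scan.
function wordSearch(index, items, word) {
  const hits = index.get(word.toLowerCase()) || new Set();
  return [...hits].map((i) => items[i]);
}

const index = buildWordIndex(names);
console.log(substringScan(names, "hat")); // also matches inside "What" and "Chatty"
console.log(wordSearch(index, names, "hat")); // only the string containing the word "hat"
```

The index is built once and reused, which is where the speed comes from; the price is that matches inside a word are no longer found.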

—Jens

Julio Albuquerque

Dec 10, 2014, 9:37:26 AM
to mobile-c...@googlegroups.com
Any idea how to search for partial text in a document?

I wrote a routine that fetches the documents I need and then, inside a loop, takes each string and checks whether it contains the word being sought, but it was VERY slow. And there are only 12,000 documents.
Any ideas?

Julio.



Julio Albuquerque

Dec 10, 2014, 9:39:27 AM
to mobile-c...@googlegroups.com
In case it matters, I am targeting Android with Xamarin Studio and the Couchbase Lite .NET component for Xamarin.Android.

Jens Alfke

Dec 10, 2014, 11:38:43 AM
to mobile-c...@googlegroups.com

On Dec 10, 2014, at 6:37 AM, Julio Albuquerque <jcezar.al...@gmail.com> wrote:

I wrote a routine that fetches the documents I need and then, inside a loop, takes each string and checks whether it contains the word being sought, but it was VERY slow. And there are only 12,000 documents.

Could you show the code?

Also, are you sure you need to find matches inside of words? It's pretty common for text search in apps to only show matches at the start of a word. If you are willing to restrict matching that way, there are faster ways to do it: you can write a map function that emits each word in the string (lowercased, and skipping common words like "the"). Then you can restrict the search to keys that start with the target word.
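
A rough JavaScript sketch of that idea, simulating the view index with a sorted array. The documents, the stop-word list and all function names are invented for illustration, and this is not the Couchbase Lite query API; in a real view the index is kept sorted by key for you, and the query would be limited to a key range:

```javascript
// Hypothetical stop-word list and documents, for illustration only.
const STOP_WORDS = new Set(["the", "a", "of"]);
const docs = [
  { id: "d1", name: "The Churrasco House" },
  { id: "d2", name: "Church of Music" },
  { id: "d3", name: "House of Ice" },
];

// "Map" step: emit one (word, docId) row per non-stop word, lowercased.
function mapWords(doc) {
  return doc.name
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w && !STOP_WORDS.has(w))
    .map((w) => ({ key: w, id: doc.id }));
}

// Build the index once: all rows sorted by key, as a view index would be.
const rows = docs
  .flatMap(mapWords)
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Binary search for the first row whose key is >= the prefix.
function lowerBound(sorted, prefix) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid].key < prefix) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Prefix query: jump to the start of the range, then read until keys stop matching.
function prefixSearch(sorted, text) {
  const prefix = text.toLowerCase();
  const ids = new Set();
  for (let i = lowerBound(sorted, prefix); i < sorted.length && sorted[i].key.startsWith(prefix); i++) {
    ids.add(sorted[i].id);
  }
  return [...ids];
}

console.log(prefixSearch(rows, "chur")); // ids of docs with a word starting with "chur"
console.log(prefixSearch(rows, "urrasco")); // empty: matches inside a word are not found
```

Because all keys sharing a prefix sit next to each other in sorted order, the query touches only the matching rows instead of all documents.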

—Jens

Julio Albuquerque

Dec 10, 2014, 1:20:55 PM
to mobile-c...@googlegroups.com
Basically I need to search for documents containing the word given as a filter; for example, search the documents for the word "CHURRASCO" in the fields NOME, PRODUTO and TITULOS.
Look at the structure of the document:
{
  "id": "E2915",
  "nome": "Botoni Bebidas",
  "fone": [
    {
      "numero": "3421 3400",
      "ddd": "55"
    },
    {
      "numero": "3421 1999",
      "ddd": "55"
    },
    {
      "numero": "3426 4421",
      "ddd": "55"
    },
    {
      "numero": "9912 1271",
      "ddd": "55"
    },
    {
      "numero": "9174 9767",
      "ddd": "55"
    },
    {
      "numero": "8415 3277",
      "ddd": "55"
    },
    {
      "numero": "8112 3251",
      "ddd": "55"
    }
  ],
  "razao": "Paulo Evandro Fantinel Botoni",
  "email": "botoni...@hotmail.com",
  "homepage": "www.botoni.com.br",
  "complemento": "",
  "nr": "603",
  "cep": "97542450",
  "peso": "17",
  "cidade": "Alegrete",
  "bairro": "CENTRO",
  "logradouro": "Mariz e Barros",
  "produto": "- BEBIDAS EM GERAL, DOMINGOS CHURRASCO POR KILO\r\n - NO INVERNO MOCOTÓ;\r\n - GELO, ÁGUA, \r\n - REPRESENTAÇÃO ÁGUA ELLAN (EXCLUSIVIDADE)\r\n - CORTES ESPECIAIS PARA CHURRASCO, DISTRIBUIDOR FONTE DA ILHA;\r\n - PEIXES E FRUTOS DO MAR, PIZZAS, MASSAS E LASANHAS,\r\n - DISTRIBUIDORES ERVA MATE CHARME",
  "latitude": "-29,7860660552979",
  "longitude": "-55,7838821411133",
  "facebook": "",
  "titulos": [
    {
      "descricao": "CHURRASCO POR KILO E FRANGO ASSADO"
    },
    {
      "descricao": "BARES"
    },
    {
      "descricao": "GÁS"
    },
    {
      "descricao": "ÁGUA MINERAL"
    },
    {
      "descricao": "DISTRIBUIDORES DE BEBIDAS"
    },
    {
      "descricao": "GELO"
    }
  ],
  "titular": [
    {
      "descricao": "Fernanda Sasciloto Botoni"
    },
    {
      "descricao": "PAULO EVANDRO FANTINEL BOTONI"
    }
  ],
  "tipo": "E",
  "blativo": "0"
}

Julio Albuquerque

Dec 10, 2014, 1:32:13 PM
to mobile-c...@googlegroups.com

Just as a test, I wrote this code in C# to get all documents from Couchbase Lite and then find all documents that contain the filter word.
This code is only to see the behavior.
I'm using Xamarin Studio, Xamarin Android, C# and Couchbase Lite.


// Test code only: load every document, then filter by name in memory.
var qry = database.CreateAllDocumentsQuery ();
qry.AllDocsMode = AllDocsMode.AllDocs;
var lines = qry.Run ();

for (int line = 0; line < lines.Count (); line++) {
    var row = lines.GetRow (line);
    var nome = row.Document.GetProperty ("nome").ToString ();
    // Append the name only when it contains the search text.
    if (nome.Contains (textPesquisar.Text))
        txtResultado.Text += nome;
}
In total there are 12,000 documents.
And this loop was very slow.
Is there any quick way to go through these documents and search for the word entered by the user?

I'm trying to develop an information-guide app. Users can enter any word and the app has to return the documents containing that word, wherever it appears.

I have set up the entire stack: Couchbase Server, Sync Gateway, etc.
Everything is working perfectly. However, I have not been able to do this type of search, and that would make the project unfeasible with Couchbase Lite. But the project relies heavily on Sync Gateway, so I'm persisting.


Jens Alfke

Dec 10, 2014, 3:05:31 PM
to mobile-c...@googlegroups.com
The basic way to optimize this is to index the words, as I said in the previous message:

write a map function that emits each word in the string (lowercased, and skipping common words like "the"). Then you can restrict the search to keys that start with the target word.

Some JS-like pseudocode (I don't know C# well enough to try that):

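function(doc) {
    // Skip documents that have no "nome" property.
    if (!doc["nome"]) return;
    // Lowercase the name and emit each space-separated word as a key.
    var words = doc["nome"].toLowerCase().split(" ");
    for (var i = 0; i < words.length; i++)
        emit(words[i], null);
}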

The splitting into words actually has some more details: you want to ignore punctuation and multiple spaces, and treat other whitespace the same as spaces. I'm guessing C#/.NET has a utility method to do this.
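
A minimal JavaScript sketch of such a splitter, under the assumption that a "word" is any run of letters or digits. The function name `tokenize` and the stop-word list are hypothetical; the sample string is taken from the "produto" field posted earlier in this thread:

```javascript
// Split text into lowercase words, treating any run of characters that is
// not a letter or digit (punctuation, spaces, tabs, \r\n) as one separator.
function tokenize(text, stopWords = new Set()) {
  return text
    .normalize("NFC") // keep accented letters such as "Ó" as single code points
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u)
    .filter((w) => w.length > 0 && !stopWords.has(w));
}

// Sample taken from the "produto" field of the example document.
const produto = "- BEBIDAS EM GERAL, DOMINGOS CHURRASCO POR KILO\r\n - NO INVERNO MOCOTÓ;";
// Hypothetical stop-word list for Portuguese; a real one would be longer.
const stop = new Set(["em", "por", "no"]);

console.log(tokenize(produto, stop));
```

The same function could be applied to each field that should be searchable (for example "nome", "produto" and every "descricao" under "titulos"), emitting one key per resulting word.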

—Jens

Julio Albuquerque

Dec 17, 2014, 11:29:29 AM
to mobile-c...@googlegroups.com
Thanks Jens!
Could you answer another question for me?
Is it possible to develop a Windows application in C#/.NET (Visual Studio) using Couchbase Lite as the NoSQL database? It would run on the desktop; it is not a web application.

Julio

