Full Text Search and specific keys needed

Frederic Yesid Peña Sánchez

Feb 14, 2014, 3:29:35 PM
to mobile-c...@googlegroups.com
Hi.

We need to enable full-text search (FTS), but we don't know how to do "partial" word search: if I have the phrase "The quick brown fox", searching for "uic" should match.

(Maybe this question is really about SQLite, but Stack Overflow did not help...)

My View:

[vistaContactos setMapBlock:^(NSDictionary *doc, CBLMapEmitBlock emit) {
    if ([doc[@"type"] isEqualToString:@"RuteroMedicos"] && [doc[@"estado_contacto"] integerValue] != 4 && [doc[@"estado_contacto"] integerValue] != 2) {
        NSString *datoTexto = [NSString stringWithFormat:@"%@ %@ %@ %@ %@", doc[@"codigo_contacto"], doc[@"nombres"], doc[@"apellidos"], doc[@"nombre_especialidad"], doc[@"direccion_visita"]];

        emit(CBLTextKey(datoTexto), doc);
    }
} version:@"2.0"];

Frederic Yesid Peña Sánchez

Feb 14, 2014, 5:05:46 PM
to mobile-c...@googlegroups.com
I've come across this "view", which emits keys that let me query by partial words:

[vistaRuteroMedicosFTS setMapBlock:^(NSDictionary *doc, CBLMapEmitBlock emit) {
    if ([doc[@"type"] isEqualToString:@"RuteroMedicos"] && [doc[@"estado_contacto"] integerValue] != 4 && [doc[@"estado_contacto"] integerValue] != 2) {
        NSString *datoTexto = [NSString stringWithFormat:@"%@ %@ %@ %@ %@", doc[@"codigo_contacto"], doc[@"nombres"], doc[@"apellidos"], doc[@"nombre_especialidad"], doc[@"direccion_visita"]];

        // Tokenize and generate keys
        NSArray *tokens = [[datoTexto lowercaseString] componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        tokens = [tokens filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"SELF != ''"]];

        // For each token, generate the possible sequences and emit the keys
        for (NSString *tok in tokens) {
            NSInteger inicioTok = 0;
            NSInteger longTok = 0;
            NSString *emitTok;

            // "Open" the matches from the start: every proper prefix of the token
            for (longTok = 1; longTok < [tok length]; longTok++) {
                emitTok = [tok substringWithRange:NSMakeRange(inicioTok, longTok)];
                CLSNSLog(@"Emitted token: '%@'", emitTok);
                emit(@[emitTok, doc[@"codigo_contacto"]], nil);
            }

            // "Close" the matches up to the end: every suffix, including the whole token
            for (inicioTok = 0; inicioTok < [tok length]; inicioTok++) {
                longTok = [tok length] - inicioTok;
                emitTok = [tok substringWithRange:NSMakeRange(inicioTok, longTok)];
                CLSNSLog(@"Emitted token: '%@'", emitTok);
                emit(@[emitTok, doc[@"codigo_contacto"]], nil);
            }
        }
    }
} reduceBlock:^id(NSArray *keys, NSArray *values, BOOL rereduce) {
    CLSNSLog(@"keys: %@", keys[0][1]);
    return keys[0][1];
} version:@"2.9"];


But I can't figure out how to emit unique keys; it emits repeated keys for about 17K rows.

Thanks.
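
On the repeated keys: one possible approach is to collect the substrings for a document in a set before emitting, so each key is emitted at most once per document. The sketch below is plain Foundation and does not depend on Couchbase Lite; UniqueSuffixKeys is a hypothetical helper name, not a library function. It keeps only suffixes, on the assumption that the view is queried with a key range starting at the search term. Under that assumption a suffix such as "uick" already covers a search for "uic", so the separate prefix loop would not be needed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Couchbase Lite): returns every distinct
// suffix of every whitespace-separated token in `text`, lowercased.
static NSSet *UniqueSuffixKeys(NSString *text) {
    NSMutableSet *keys = [NSMutableSet set];
    NSArray *tokens = [[text lowercaseString]
        componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    for (NSString *tok in tokens) {
        // Empty tokens (produced by consecutive spaces) have length 0
        // and therefore contribute nothing.
        for (NSUInteger start = 0; start < tok.length; start++) {
            // The set silently ignores a suffix that was already added.
            [keys addObject:[tok substringFromIndex:start]];
        }
    }
    return keys;
}
```

Inside the map block, the two loops could then be replaced by iterating over UniqueSuffixKeys(datoTexto) and calling emit(@[key, doc[@"codigo_contacto"]], nil) for each key. Keys will still repeat across different documents, which is expected in a view index.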

Jens Alfke

Feb 14, 2014, 7:23:16 PM
to mobile-c...@googlegroups.com

On Feb 14, 2014, at 12:29 PM, Frederic Yesid Peña Sánchez <freder...@gmail.com> wrote:

> We need to enable full-text search (FTS), but we don't know how to do "partial" word search: if I have the phrase "The quick brown fox", searching for "uic" should match.

The SQLite FTS docs say you can do prefix matching using a "*" character, so "qui*" will match "quick".

I don't think FTS has an option to do general substring matching, probably because in SQL you don't need a fancy indexer to do that, it's just a "LIKE" operation. Couchbase Lite doesn't provide a way to match using LIKE. (And it wouldn't be very efficient even if it did, because LIKE requires a linear scan of the index, looking at the string in every row.)

What is it you're trying to accomplish? Matching against arbitrary substrings seems like an unlikely use for a database.

—Jens
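
To illustrate the distinction Jens draws between prefix matching and general substring matching, here is a minimal sketch using a sorted array as a stand-in for an index. It is a plain Foundation model, not how SQLite or Couchbase Lite is implemented, and the function names KeysWithPrefix and KeysContaining are hypothetical. A prefix lookup can binary-search to the first candidate and stop as soon as the prefix no longer matches, while a substring lookup has to inspect every key.

```objective-c
#import <Foundation/Foundation.h>

// Prefix match on a sorted key list: binary-search for the position of the
// first key >= prefix, then walk forward while keys still start with it.
// `sortedKeys` is assumed to be sorted with the same literal comparison.
static NSArray *KeysWithPrefix(NSArray *sortedKeys, NSString *prefix) {
    NSUInteger i = [sortedKeys indexOfObject:prefix
                               inSortedRange:NSMakeRange(0, sortedKeys.count)
                                     options:NSBinarySearchingInsertionIndex | NSBinarySearchingFirstEqual
                             usingComparator:^NSComparisonResult(id a, id b) {
        return [(NSString *)a compare:(NSString *)b options:NSLiteralSearch];
    }];
    NSMutableArray *hits = [NSMutableArray array];
    for (; i < sortedKeys.count && [sortedKeys[i] hasPrefix:prefix]; i++) {
        [hits addObject:sortedKeys[i]];
    }
    return hits;
}

// Substring match: no ordering helps, so every key has to be examined.
static NSArray *KeysContaining(NSArray *keys, NSString *needle) {
    NSMutableArray *hits = [NSMutableArray array];
    for (NSString *key in keys) {
        if ([key rangeOfString:needle options:NSLiteralSearch].location != NSNotFound) {
            [hits addObject:key];
        }
    }
    return hits;
}
```

With the keys "brown", "fox", "quick", "quiet" and "the", the prefix "qui" is found by the binary search, whereas "uic" is only found by the full scan.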

Frederic Yesid Peña Sánchez

Feb 17, 2014, 9:14:32 AM
to mobile-c...@googlegroups.com
Thanks.

I could use "*", but FTS is a no-go because it can't be used with the "reduce" option.

I ended up using two queries, one after the other, to perform a "search" and organize the results by another field on the documents.