How to specify the `weights` option for a `text` index in PyMongo / PyMODM?


Dick Schrauwen

Feb 12, 2017, 5:36:28 PM
to mongodb-user
I'm using PyMODM to access MongoDB. I can't figure out how to specify the `weights` property for a text index.

The docs leave me in the dark.

The code snippet:
# fpg_mongo.py

# .....

# ------------------------------------------------------- # -------------------------------------------------------
#   ITEM :: it's all about the feed-items
#
#       Hunter    => Shoots items to kill them twice
#       Storer    => Stores the items (the really precious, new ones only)
#       Alchemist => Turns item-data into wisdom
#       Callgirl  => Her wisdom turns you on                 
# ------------------------------------------------------- # -------------------------------------------------------

class FpgItem(FPGBase):
    '''

'''
    
    # REQUIRED
    # _id
    item_feed            =    fields.ReferenceField(FpgFeed)
    #
    item_title           =    fields.CharField(required=True)
    item_content         =    fields.CharField(required=True)
    # summary, content,
    item_link            =    fields.URLField(required=True)
    # the real thing
    item_date            =    fields.CharField(required=True)
    # published, updated, feed_date_string
    item_date_parsed     =    fields.DateTimeField(required=True)
    # published_parsed, updated_parsed, feed_date_parsed, utc.now()

    # OPTIONAL
    item_media           =    fields.EmbeddedDocumentListField(FpgMedia, required=False)
    # media_content,  media_thumbnail, ALSO: content => src=


    class Meta:

        indexes = [
            # pymongo.IndexModel([('url', pymongo.ASCENDING)], unique=True)

            # ItemTxTindex
            pymongo.IndexModel([('item_title', pymongo.TEXT), ('item_content', pymongo.TEXT)],
                               # **kwargs
                               name="ItemTxTindex",
                               #weights=[('item_title',50),('item_content',30)],
                               #pymongo.weights=[('item_title',50),('item_content',30)],
                               #pymongo.WEIGHTS=[('item_title',50),('item_content',30)],
                               #
                               background=True,
                               unique=False
                               ) # 
            ] # indexes
            

# .....
            


Luke Lovett

Feb 15, 2017, 12:22:58 PM
to mongodb-user
I think this is the IndexModel you're looking for:

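import pymongo
from bson.son import SON
from pymodm import MongoModel
from pymongo import IndexModel

class FpgItem(MongoModel):
    ...
    class Meta:
        indexes = [
            IndexModel([('item_title', pymongo.TEXT), ('item_content', pymongo.TEXT)],
                       name='ItemTxTindex',
                       # weights is a document mapping field name -> weight
                       weights=SON([('item_title', 50), ('item_content', 30)]))]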

The "weights" property of a text index is a document, like the examples on this page: https://docs.mongodb.com/v3.2/tutorial/control-results-of-text-search/. It's documented on the reference page for the "createIndexes" command (https://docs.mongodb.com/manual/reference/command/createIndexes/), but it's not obvious that this is the command that runs as a result of defining the "indexes" property in the Meta class. The PyMODM docs should point to that page, or at least mention that the "createIndexes" command is run on these IndexModels. Filed https://jira.mongodb.org/browse/PYMODM-55
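To make the "weights is a document" point concrete, here is a minimal sketch of the shape of the "createIndexes" command for this index, built from plain Python mappings only. The helper name `text_index_spec` and the collection name `fpg_item` are made up for illustration; this is not PyMODM's or PyMongo's internal code, just the documented layout of the command.

```python
from collections import OrderedDict


def text_index_spec(name, weighted_fields):
    """Build one entry of the `indexes` array of a createIndexes command.

    weighted_fields is a list of (field_name, weight) pairs.
    (Hypothetical helper, for illustration only.)
    """
    # Every indexed field gets the index type "text" (the value of pymongo.TEXT).
    key = OrderedDict((field, "text") for field, _ in weighted_fields)
    # `weights` is itself a document: field name -> integer weight.
    weights = OrderedDict(weighted_fields)
    return OrderedDict([("key", key), ("name", name), ("weights", weights)])


spec = text_index_spec("ItemTxTindex", [("item_title", 50), ("item_content", 30)])

# "fpg_item" is an assumed collection name, not taken from the thread.
command = OrderedDict([("createIndexes", "fpg_item"), ("indexes", [spec])])
```

This is why passing a list of tuples directly as `weights=` does not work: the option has to arrive as a mapping, which `SON([...])` (or a plain dict) provides.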

Hope this helps!
Luke

Dick Schrauwen

Mar 4, 2017, 7:55:40 AM
to mongodb-user
Just a remark:
When you use inheritance, unique indexes should have the `sparse` property to prevent duplicate-key (`key exists`) errors. Example:
class FpgItem(FPGBase):
    '''

'''
   
    # REQUIRED
    # _id
    item_feed            =    fields.ReferenceField(FpgFeed)
    #
    item_fngrprnt        =    fields.CharField(required=True, blank=True)    #, unique=True)

    item_title           =    fields.CharField(required=True)
    item_content         =    fields.CharField(required=True)
    # summary, content,
    item_link            =    fields.URLField(required=True)
    # the real thing
    item_date            =    fields.DateTimeField(required=True)
    # published_parsed, updated_parsed
    # item_date            =    fields.CharField(required=True)
    # published, updated, feed_date_string
    # item_date_parsed     =    fields.DateTimeField(required=True)

    # published_parsed, updated_parsed, feed_date_parsed, utc.now()

    # OPTIONAL
    item_author          =    fields.CharField(required=False)
    item_media           =    fields.EmbeddedDocumentListField(FpgMedia, required=False, blank=True)
    # - https://docs.mongodb.com/manual/tutorial/control-results-of-text-search/
    class Meta:
        indexes= [
            #text search

            IndexModel([('item_title', pymongo.TEXT), ('item_content', pymongo.TEXT)],
                   name='ItemTEXTindex',

                   weights=SON([('item_title', 50), ('item_content', 30)])
                   ),
            IndexModel([('item_fngrprnt', pymongo.ASCENDING)], sparse=True, unique=True, name='ItemFNGRPRNTindex')

            #IndexModel([('item_fngrprnt', pymongo.ASCENDING)], unique=True, name='ItemFNGRPRNTindex')
            #IndexModel([('item_fngrprnt', pymongo.ASCENDING)], unique=False, name='ItemFNGRPRNTindex')
                  ]  #indexes
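
Why `sparse` matters here can be shown with a toy model in plain Python. It assumes, as the log later in this thread suggests, that subclasses share the base class's collection, so documents without `item_fngrprnt` live next to documents that have it. A non-sparse unique index treats a missing field as null, so the second document lacking the field collides with the first; a sparse index skips such documents. The function `check_unique` and the sample documents are hypothetical, and this does not talk to MongoDB at all.

```python
class DuplicateKeyError(Exception):
    """Stand-in for the server's E11000 error (toy model only)."""


def check_unique(docs, field, sparse=False):
    """Simulate inserting docs into a collection with a unique index on `field`.

    Returns the number of index entries, or raises DuplicateKeyError.
    """
    seen = set()
    for doc in docs:
        if field not in doc:
            if sparse:
                continue  # sparse index: documents lacking the field are not indexed
            value = None  # non-sparse index: a missing field is indexed as null
        else:
            value = doc[field]
        if value in seen:
            raise DuplicateKeyError("dup key: %r" % (value,))
        seen.add(value)
    return len(seen)


# Made-up documents: two feeds (no fingerprint) and two items in one collection.
docs = [
    {"feed_url": "http://example.com/a.xml"},
    {"feed_url": "http://example.com/b.xml"},
    {"item_fngrprnt": "abc"},
    {"item_fngrprnt": "def"},
]

try:
    check_unique(docs, "item_fngrprnt", sparse=False)
    non_sparse_ok = True
except DuplicateKeyError:
    non_sparse_ok = False  # the second feed collides on null

sparse_entries = check_unique(docs, "item_fngrprnt", sparse=True)
```

Note that a sparse index only helps with missing fields; two documents that carry the same real value still collide.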
       


Dick Schrauwen

Mar 4, 2017, 8:42:15 AM
to mongodb-user
Use `sparse=True` to prevent:
E11000 duplicate key error collection: fpgold.fpg_base index: FeedURLindex dup key

- 2017-03-01 21:02:43,389|ERROR|fpg_model|Exception! Exc=[E11000 duplicate key error collection: fpgold.fpg_base index: FeedURLindex dup key: { : "http://www.economist.com/sections/business-finance/rss.xml" }], Action=Create, Class=[FpgFeed], Kwargs=[{'feed_subtitle': '', 'feed_url': 'http://www.economist.com/sections/business-finance/rss.xml', 'feed_encoding': 'utf-8', 'feed_lang': 'en', 'feed_date': datetime.datetime(2017, 3, 1, 13, 40, 11, tzinfo=tzutc()), 'feed_title': 'Business and finance'}]]