Modifying multiple documents with multiple fields

Hans im

Mar 26, 2022, 11:07:10 AM
to MongoEngine Users
I have a collection named crawl_data with contents like the following.

[
  {
    "file_name": "xyaz",
    "url": "",
    "html_data": "i am good",
    "organization": "org 1"
  },
  {
    "file_name": "abcd",
    "url": "",
    "html_data": "hello",
    "organization": "org 2"
  },
  {
    "file_name": "news",
    "url": "",
    "html_data": "i have data",
    "organization": "org3",
    "score": 0.85,
    "prediction": "valid"
  }
]

For the first two documents, score and prediction are not present. I am trying to fetch all documents in the collection that do not have score and prediction, and then update them with those values. The model for this is:

from flask_mongoengine import MongoEngine class CrawlData(db.Document): file_name = db.StringField(max_length=50, required=True) url = db.URLField(max_length=200,required=True,unique=True) html_data = db.StringField() scrape_date = db.DateTimeField(default=datetime.datetime.utcnow, required=True) organization = db.StringField(max_length=250,required=True) score = db.FloatField(min_value=0, max_value=1) prediction = db.StringField(max_length=10) validation_date = db.DateTimeField()
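The fetch step described above might look like the following minimal sketch. It assumes `model` is the CrawlData class, relies on MongoEngine's `__raw__` query argument, and treats "not present" as the field being absent from the stored document. `MISSING_FILTER` and `fetch_unscored` are hypothetical names used only for illustration.

```python
# Raw MongoDB filter: matches documents where neither field is stored.
# (Assumption: "missing" means both fields are absent, not merely null.)
MISSING_FILTER = {
    "score": {"$exists": False},
    "prediction": {"$exists": False},
}


def fetch_unscored(model):
    """Return a queryset of documents that have no score and no prediction.

    `model` is assumed to be a MongoEngine Document class such as CrawlData.
    """
    return model.objects(__raw__=MISSING_FILTER)
```

With the model above, the same query could presumably also be written with MongoEngine's keyword operators as `CrawlData.objects(score__exists=False, prediction__exists=False)`.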

How can I update both documents at the same time with different score and prediction (valid or invalid) values? The score and prediction values are calculated in another function (machine learning). I have many thousands of documents like this, and updating them one by one is not what I am looking for.
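One possible approach, sketched under assumptions rather than offered as a definitive answer, is to send all the per-document updates in a single PyMongo `bulk_write` call on the underlying collection. The sketch assumes the ML function's results are available as a dict mapping document id to a `(score, prediction)` pair; `build_update_specs` and `apply_bulk` are hypothetical helper names. Because `bulk_write` bypasses MongoEngine's field validation, the helper repeats the model's range check by hand.

```python
def build_update_specs(results):
    """Turn {doc_id: (score, prediction)} into (filter, update) pairs."""
    specs = []
    for doc_id, (score, prediction) in results.items():
        # Mirror the model's constraints, since bulk_write skips them.
        if not 0 <= score <= 1:
            raise ValueError(f"score out of range for {doc_id!r}: {score}")
        if prediction not in ("valid", "invalid"):
            raise ValueError(f"unexpected prediction for {doc_id!r}: {prediction}")
        specs.append((
            {"_id": doc_id},
            {"$set": {"score": score, "prediction": prediction}},
        ))
    return specs


def apply_bulk(model, results):
    """Apply every update in one round trip; returns the modified count.

    `model` is assumed to be a MongoEngine Document class such as CrawlData.
    """
    from pymongo import UpdateOne  # imported lazily; needs pymongo installed

    ops = [UpdateOne(flt, upd) for flt, upd in build_update_specs(results)]
    if not ops:
        return 0  # bulk_write rejects an empty list of operations
    collection = model._get_collection()  # underlying PyMongo collection
    return collection.bulk_write(ops, ordered=False).modified_count
```

Usage would be along the lines of `apply_bulk(CrawlData, results)`. For many thousands of documents it may be worth splitting `results` into batches before calling it.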

Hans im

Mar 26, 2022, 11:08:48 AM
to MongoEngine Users
Formatting went awkward. Sorry

import datetime

from flask_mongoengine import MongoEngine

# Assumed setup: db is bound to the Flask app elsewhere (e.g. db.init_app(app)).
db = MongoEngine()


class CrawlData(db.Document):
    file_name = db.StringField(max_length=50, required=True)
    url = db.URLField(max_length=200, required=True, unique=True)
    html_data = db.StringField()
    scrape_date = db.DateTimeField(default=datetime.datetime.utcnow, required=True)
    organization = db.StringField(max_length=250, required=True)
    # Filled in later by the ML step; absent on documents not yet scored.
    score = db.FloatField(min_value=0, max_value=1)
    prediction = db.StringField(max_length=10)
    validation_date = db.DateTimeField()
