How to update old data with new data every day from a scraper


Adib Neymar Jr.

Sep 18, 2021, 10:27:48 AM
to django...@googlegroups.com
Hello,

What is a good way to compare new data with old data that is updated every day, using the Django ORM? I have a scraper (just a Celery task) that fetches hackathons every day, and I want to merge the newest results into my master database, which holds the hackathons fetched yesterday. I don't want to wipe the master database and re-upload everything I just fetched, since that seems wasteful. That is my current approach to the problem, but I am open to other options as well.
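For illustration, the merge described above can be sketched in plain Python, without touching the database. This assumes each hackathon carries a unique `url` field, which the post does not state; `merge_scraped` and the sample rows are hypothetical names and data. Only rows that are new or changed get written, so the master data is never rebuilt from scratch.

```python
def merge_scraped(master, scraped, key="url"):
    """Merge scraped rows into master in place.

    master  : dict mapping the unique key to the stored row (a dict)
    scraped : list of freshly scraped rows (dicts)
    Returns (added, updated): lists of keys that were inserted or changed.
    """
    added, updated = [], []
    for row in scraped:
        k = row[key]
        if k not in master:
            # Never seen before: insert it.
            master[k] = dict(row)
            added.append(k)
        elif master[k] != row:
            # Known key, but some field changed: overwrite it.
            master[k] = dict(row)
            updated.append(k)
        # Otherwise the row is unchanged and nothing is written.
    return added, updated


# Hypothetical data: one hackathon from yesterday, two from today's scrape.
master = {"https://a.example": {"url": "https://a.example", "title": "Hack A"}}
scraped = [
    {"url": "https://a.example", "title": "Hack A (extended)"},
    {"url": "https://b.example", "title": "Hack B"},
]
added, updated = merge_scraped(master, scraped)
```

With the Django ORM, roughly the same per-row decision is made by `update_or_create()`, given a lookup on the unique field and the remaining fields passed as `defaults`.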


Thanks,



Adib

Jason Turner

Sep 18, 2021, 10:31:46 AM
to django...@googlegroups.com


Adib Neymar Jr.

Sep 19, 2021, 10:29:33 AM
to Django users

Interesting. Is there a way to peek for an existing record instead of calling get() to check whether it exists in the database? I'd like to make this snippet a bit more efficient:

try:
   obj = Hackathon.objects.get(**each_dict)
except Hackathon.DoesNotExist:
   obj = Hackathon(**each_dict)
   obj.save()
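On the "peek" question: the Django ORM offers `Hackathon.objects.filter(**each_dict).exists()`, which checks for a match without loading the object, and `Hackathon.objects.get_or_create(**each_dict)`, which wraps the try/except above and returns an `(obj, created)` tuple. The sketch below shows the same check-then-insert idea against the standard-library `sqlite3` module so that it runs without a Django project. The table layout and the `url` and `title` columns are assumptions, not taken from the thread.

```python
import sqlite3

# In-memory database with a hypothetical hackathon table keyed by URL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hackathon (url TEXT PRIMARY KEY, title TEXT)")


def exists(conn, url):
    # Existence check that fetches no row data,
    # similar in spirit to QuerySet.exists().
    cur = conn.execute("SELECT 1 FROM hackathon WHERE url = ? LIMIT 1", (url,))
    return cur.fetchone() is not None


def get_or_create(conn, url, title):
    """Insert the row if it is missing. Return True if a row was created."""
    if exists(conn, url):
        return False
    conn.execute("INSERT INTO hackathon (url, title) VALUES (?, ?)", (url, title))
    return True


created_first = get_or_create(conn, "https://a.example", "Hack A")
created_again = get_or_create(conn, "https://a.example", "Hack A")
row_count = conn.execute("SELECT COUNT(*) FROM hackathon").fetchone()[0]
```

Note that checking and then inserting is two steps, so two workers running at once could both see "missing" and both try to insert. A unique constraint on the lookup column, as with the primary key here, is what actually prevents duplicate rows in that case.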