Can I create an RSS Reader like Techi in web2py?


Gideon George

unread,
Jul 4, 2014, 5:37:17 AM7/4/14
to web...@googlegroups.com
I want to create an awesome RSS feed aggregator like that of techi.com.
Is there any example of a site that did this in web2py? I would also appreciate suggestions on how to go about it.
Thank you.

Massimo Di Pierro

unread,
Jul 4, 2014, 10:32:45 AM7/4/14
to web...@googlegroups.com
Look at example 14: http://web2py.com/examples/default/examples

You only need to store the news in a database and provide a nice layout. Not that difficult.
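For instance, a minimal sketch of that idea (illustrative names; a real app would use feedparser, as example 14 does, and the web2py DAL instead of raw sqlite3):

```python
import sqlite3
import xml.etree.ElementTree as ET

# A tiny hard-coded RSS document standing in for a fetched feed.
RSS = """<rss version="2.0"><channel>
  <item><title>First post</title><link>http://example.com/1</link></item>
  <item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE news (title TEXT, link TEXT UNIQUE)')

def store_items(rss_text):
    # INSERT OR IGNORE keeps a re-fetched feed from duplicating items.
    for item in ET.fromstring(rss_text).iter('item'):
        db.execute('INSERT OR IGNORE INTO news (title, link) VALUES (?, ?)',
                   (item.findtext('title'), item.findtext('link')))

store_items(RSS)
store_items(RSS)  # fetching the same feed twice adds nothing new
titles = [row[0] for row in db.execute('SELECT title FROM news ORDER BY link')]
```

The "nice layout" is then just a view that renders rows from that table.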

Ricardo Pedroso

unread,
Jul 6, 2014, 3:01:23 AM7/6/14
to web...@googlegroups.com
I have my own personal feed aggregator done in web2py. It's not an "awesome" one, but my
requirements were to have no images and no JavaScript.
I want it to be fast and functional.
Currently the load and DOMContentLoaded events in Chromium fire in less than 400 ms.
You can see it here: http://feeds.uni.me

From memory, my setup is:

- nginx speaking the uwsgi protocol, but not to the uWSGI server. I implemented a subset of
the uwsgi protocol in pure Python to be able to use eventlet (I think the uWSGI server doesn't support eventlet).

- sessions and cache are stored in Redis.

- an SQLite database. It currently has nearly 300,000 articles (~665 MB) in it.
One thing I learned is that SQLite is very slow at "select count(*) from table".
I was using that query for numbered-page pagination,
then I changed to newer/older buttons only, to avoid the count(*).
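The newer/older scheme avoids count(*) by keying on the last article id shown instead of computing page numbers. A sketch with an illustrative schema (not my actual code):

```python
import sqlite3

# Toy table standing in for the real articles table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)')
conn.executemany('INSERT INTO article (title) VALUES (?)',
                 [('article %d' % i,) for i in range(1, 101)])

PAGE = 10

def older(before_id):
    # "Older" button: the PAGE articles just below the last id shown.
    # No count(*), no OFFSET - just an index scan on the primary key.
    return conn.execute(
        'SELECT id, title FROM article WHERE id < ? '
        'ORDER BY id DESC LIMIT ?', (before_id, PAGE)).fetchall()

def newer(after_id):
    # "Newer" button: the PAGE articles just above the first id shown,
    # re-sorted so the page still reads newest-first.
    rows = conn.execute(
        'SELECT id, title FROM article WHERE id > ? '
        'ORDER BY id ASC LIMIT ?', (after_id, PAGE)).fetchall()
    return sorted(rows, reverse=True)
```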

- search is powered by Whoosh and bottle; web2py queries it through a simple RESTful API.
I have two separate Whoosh indexes, one for Portuguese articles and the other for English.

- a background job collects and processes the feeds; the feeds are then stored in SQLite and
added to the Whoosh index.
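Conceptually, the per-language indexing looks like this stdlib stand-in (a toy inverted index in place of Whoosh; the article text and language tags are made up):

```python
from collections import defaultdict

# One inverted index per language, mirroring the two Whoosh indexes.
indexes = {'en': defaultdict(set), 'pt': defaultdict(set)}

def add_article(article_id, text, lang):
    # The background job would call this after storing the article in SQLite.
    for word in text.lower().split():
        indexes[lang][word].add(article_id)

def search(terms, lang):
    # AND-match all query words within a single language's index.
    hits = [indexes[lang].get(w, set()) for w in terms.lower().split()]
    return set.intersection(*hits) if hits else set()

add_article(1, 'Nigeria election news', 'en')
add_article(2, 'Python release notes', 'en')
add_article(3, 'noticias de futebol', 'pt')
```

Whoosh replaces all of this with real analyzers, scoring, and on-disk storage; the point is only that each language is searched in isolation.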

- perp (http://b0llix.net/perp/ - similar to daemontools) to control all the pieces

- a small VPS with 256 MB of RAM.


Ricardo

Massimo Di Pierro

unread,
Jul 6, 2014, 6:43:20 PM7/6/14
to web...@googlegroups.com
Nice!

Vinicius Assef

unread,
Jul 6, 2014, 10:16:51 PM7/6/14
to web2py
Ricardo, your solution is very nice.

Below are some questions about it.

>
> On Sunday, 6 July 2014 02:01:23 UTC-5, Ricardo Pedroso wrote:
>>
>> - search is powered by whoosh and bottle - Web2py is querying through a
>> simple restful api.

I was thinking about this piece of your architecture.
If your request goes through web2py, why delegate search to bottle
(and also wait for another request)?

Why not just use Whoosh inside web2py?

>>
>> - a small VPS with 256Mb of RAM.

Do all these processes run inside this VPS?

Ricardo Pedroso

unread,
Jul 9, 2014, 6:28:59 AM7/9/14
to web...@googlegroups.com
On Mon, Jul 7, 2014 at 3:16 AM, Vinicius Assef <vinic...@gmail.com> wrote:
> Ricardo, very nice your solution.
>
> Below, I wrote some questions about it.
>
> On Sunday, 6 July 2014 02:01:23 UTC-5, Ricardo Pedroso wrote:
>>
>> - search is powered by whoosh and bottle - Web2py is querying through a
>> simple restful api.
>
> I was thinking about this piece of your architecture.
> If your request goes through Web2py, why delegate search to bottle?
> (and, also, wait for another request)
>
> Why not just using Whoosh inside Web2py?


Mainly for two reasons, and because my VPS has a small amount of memory available
but has 8 cores.

1. I was worried about the performance of Whoosh.
Whoosh is used in two different parts of the site: the search on the main page
and the "maybe related" list when viewing an article (e.g. http://feeds.uni.me/feeds/default/article/281126?l=en&s=),
and these are the two slowest (but still fast) things on the site.
This way I can put another core to work, and I think I can release another greenlet while waiting
for the Whoosh results and handle more concurrency, since I'm using eventlet.monkey_patch(). From
web2py I do this for the search:

u = urllib2.urlopen('http://localhost:8080/search/' + str(int(p)) + '/' + arg + '?l=' + lang, timeout=15)
results = json.loads(u.read())
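A stdlib-only sketch of that split (a stand-in for the real bottle+whoosh service; a fake in-memory dict replaces the Whoosh index, and urllib is updated to Python 3):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A toy "search process" that web2py queries over localhost and that can
# be shut down independently to free memory. The dict stands in for Whoosh.
FAKE_INDEX = {'python': ['article 1', 'article 7']}

class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expected path shape, mirroring the call above: /search/<page>/<terms>
        parts = self.path.lstrip('/').split('/')
        terms = parts[2] if len(parts) > 2 else ''
        body = json.dumps(FAKE_INDEX.get(terms, [])).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet

# Bind to an ephemeral port and serve in a background thread.
server = HTTPServer(('127.0.0.1', 0), SearchHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# The web2py side of the conversation:
u = urllib.request.urlopen('http://127.0.0.1:%d/search/1/python' % port,
                           timeout=15)
results = json.loads(u.read())
server.shutdown()
```

Killing the search process only costs the search-related features; the main site keeps serving.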

2. Memory consumption of Whoosh. Currently I have this memory usage:
web2py - 25 MB
search (bottle+whoosh) - 39 MB
the background job that grabs and processes the feeds - usually 40 MB (when running)
nginx master - 5 MB
nginx worker - 5.5 MB (the site runs with only one process/thread, since I'm using
greenlets through eventlet)

Sometimes (not too often) the background job crashes with a MemoryError, so the
first thing I shut down to free some memory is the bottle+whoosh process; the rest
of the site still works - without the "search" and "maybe related" features, of course.


Another, less important, reason was that I like to complicate things - it's more fun :)


>> - a small VPS with 256Mb of RAM.
>
> All these described processes run inside this vps?

Yes.

Ricardo

Vinicius Assef

unread,
Jul 9, 2014, 7:21:22 AM7/9/14
to web2py
Thank you, Ricardo.

It's clear, now.

Gideon George

unread,
Aug 8, 2014, 2:03:37 AM8/8/14
to web...@googlegroups.com
Thank you everyone for your contributions.
 
Ricardo, I am still trying to figure out how to do this. Looking at the URL you gave, www.feeds.uni.me redirects to http://vioos.com/.

I really like the layout of http://vioos.com/. Was that done using RSS in web2py? If yes, how can I make something similar that extracts all political news about Nigeria?

I really want to get this done perfectly; I will be very happy! Thank you.

Ricardo Pedroso

unread,
Aug 8, 2014, 11:42:08 AM8/8/14
to web...@googlegroups.com
On 8/8/14, Gideon George <george...@gmail.com> wrote:
>
>>
>> Thank you everyone for your contributions.
>>
>
> Ricardo please I am still trying to figure out how to to this, and looking
>
> at the url you gave on www.feeds.uni.me redirrects to http://vioos.com/.

It's feeds.uni.me, without the www.
I don't know anything about vioos.

Ricardo

Gideon George

unread,
Aug 8, 2014, 10:09:11 PM8/8/14
to

That's really great! I just saw it. Initially it was redirecting me to vioos when I put the www prefix.
I really like your layout and everything about it.

I need your help to do something similar that extracts news and displays it. Your work is exactly what has been in my imagination, with no images at all. This is absolutely fantastic.
Is there any tutorial or something? It's my first web2py app, and I need your help on this, please.
Thank you very much.

Ricardo Pedroso

unread,
Aug 10, 2014, 6:29:30 PM8/10/14
to web...@googlegroups.com
On 8/8/14, Gideon George <george...@gmail.com> wrote:
> That's really great! I just saw it. initially it was redirecting me to
> vioos
> I really like your layout and everything about it.
>
> Please I need your help to be able to do something similar to extract
> political news concerning Nigeria
> Any tutorial or something? It's my first web2py app I need you to help me
> on this.

I can share the code, if you want.

Ricardo

Gideon George

unread,
Aug 10, 2014, 8:02:37 PM8/10/14
to web...@googlegroups.com
WOW! That's really great to hear. I don't even know how to start saying thanks right now.
I am really grateful for that - absolutely grateful! I will love to learn and implement it; it's going to help me a lot.
Thank you very much, and I pray that God grants you all your heart's desires. Amen.

It's my very first web2py app and I am really learning and excited. I want to be one of the best in Africa, and I am on my way... I just can't wait to start!

Ricardo Pedroso

unread,
Aug 13, 2014, 6:28:29 PM8/13/14
to web...@googlegroups.com
I can share the code next weekend. I have to clean it up a little bit.

Ricardo

Gideon George

unread,
Aug 14, 2014, 9:49:41 PM8/14/14
to web...@googlegroups.com
Oh Ok, that will be great! I can't wait... Wish you the best of the best. Thank you



Ricardo Pedroso

unread,
Aug 18, 2014, 6:22:27 PM8/18/14
to web...@googlegroups.com
On 8/14/14, Gideon George <george...@gmail.com> wrote:
> Oh Ok, that will be great! I can't wait... Wish you the best of the best.

The code is attached...


This is not a "plug-and-play" web2py application; there is some work to be done.
Here are some install instructions - I hope they are complete.

Create a virtualenv and install:

pip install feedparser
pip install bs4
pip install beautifulsoup4
pip install FilterHTML
pip install times
pip install guess-language
pip install whoosh
pip install lxml
pip install bottle


Run the initial setup:

$ python web2py.py -MS feeds -R applications/feeds/scripts/setup.py


It will create a user a...@a.aa with password 1234, and fetch one Reddit feed.


To periodically fetch feeds, run:

$ python web2py.py -MS feeds -R applications/feeds/modules/feeds_update.py
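A minimal way to drive that command on a schedule, as a sketch (I supervise things with perp instead; the interval and loop structure here are assumptions):

```python
import subprocess
import time

# The feeds_update command from the instructions above.
CMD = ['python', 'web2py.py', '-MS', 'feeds',
       '-R', 'applications/feeds/modules/feeds_update.py']

def run_periodically(interval_seconds=1800, max_runs=None,
                     runner=subprocess.call):
    # `runner` is injectable so the loop can be exercised without actually
    # spawning web2py; `max_runs=None` means run forever (the normal case).
    runs = 0
    while max_runs is None or runs < max_runs:
        runner(CMD)  # fetch and process all feeds once
        runs += 1
        if runs != max_runs:
            time.sleep(interval_seconds)
    return runs
```

A cron entry or a perp service invoking the same command achieves the same effect without keeping a Python loop alive.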


Note:
I recommend having Redis installed and running, so that cache
and sessions are stored in it; otherwise file sessions and file
cache will be used.


I think that's all.
feeds.zip

Gideon George

unread,
Aug 20, 2014, 2:10:46 PM8/20/14
to web...@googlegroups.com
OMG! You need to see my reaction as soon as I saw your mail! I am so excited and full of happiness.
I really appreciate it a great deal, and I pray God will continue to grant you all your heart's desires and make all your dreams turn to reality.

As a newbie web2py user who has never used a framework before, all the terms here are very new to me, but I will go through them one after the other to make everything work perfectly. I am so excited about what I will learn.
The site I am developing is www.gidigba.ng. I want a page that extracts all political news from Nigeria via RSS or something.
Thank you once more; I am very grateful.


