Dear Mentors,
I am Mahmoud, a computer science grad student from the US. I am planning to contribute to GSoC 2017.
I love Python for its powerful libraries and syntax, which I have used, for example, in machine learning courses and Jupyter notebooks. While exploring the ideas listed at gsoc2017.scrapinghub.com, I found "Increase Crawling Performance through page clustering", an interesting topic that I would like to contribute to.
I have good knowledge of web programming and browser simulation libraries such as Selenium and JWebUnit from projects I have done. I also contributed to an open source project (http://appsensor.org/), providing some libraries for connecting to its REST API web services.
Moreover, I've done programming projects related to information retrieval on Wikipedia and Twitter data.
I see that Ruairi Fahy is listed as the mentor for this idea, and I hope I can get more information, warm-up tasks, or a starting guide for it.
Thank you,
Mahmoud