Regarding Benchmarking and Performance Optimization

anir...@gmail.com

Apr 14, 2022, 3:04:57 PM
to MDnalysis-devel
Hello Oliver and Jon!

I have been going through the asv documentation and trying it out on different repositories. I now have the hang of how it works and how to interpret the results. Since the project requires identifying performance bottlenecks and writing benchmark cases, I am curious which points I should consider adding to my proposal draft. I am not certain which parts of the codebase contain bottlenecks, so I can't really comment on the type of benchmarking I would do without actually doing it. The task looks ad hoc to me, so I would love to know what kind of specifics I should include in my draft.

 Regards,
Anirvinya G

Jonathan Barnoud

Apr 14, 2022, 4:21:14 PM
to mdnalys...@googlegroups.com
Hi,

As you said, it is difficult to tell what you will find from the benchmarks before writing them. The first thing that comes to my mind is to describe which benchmarks you plan to write; try to focus on parts of the code that are not already covered by benchmarks. Maybe look at the existing benchmarks and try to identify one or two things that could be optimized from there: are there benchmarks that became slower? Why did that happen? Is there something we can do about it? Finally, keep some time in your schedule to optimise what you identified from your new benchmarks, even if you don't know yet what you will optimise.
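
To make "describe which benchmarks you plan to write" concrete, here is a minimal sketch of a benchmark in the asv style. It relies only on asv's conventions as I understand them (a class with `params`, `param_names`, a `setup` method, and methods whose names start with `time_`). The function `pairwise_distances` and the class name are hypothetical stand-ins, not MDAnalysis code; a real benchmark would call the library function under test instead.

```python
import math
import random


def pairwise_distances(points):
    """Hypothetical pure-Python stand-in for the library code under test."""
    n = len(points)
    return [
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


class TimePairwiseDistances:
    """Benchmark class in the layout that asv discovers (name is illustrative)."""

    # asv runs each time_* method once per value listed in params.
    params = [10, 100, 500]
    param_names = ["n_points"]

    def setup(self, n_points):
        # setup() runs before each measurement and is not part of the timing.
        rng = random.Random(42)  # fixed seed so every run sees the same input
        self.points = [
            (rng.random(), rng.random(), rng.random()) for _ in range(n_points)
        ]

    def time_pairwise_distances(self, n_points):
        # Only the body of a time_* method is timed.
        pairwise_distances(self.points)
```

In a proposal, listing a handful of planned classes of this shape (name, parameters, what is timed) is already much more specific than "write more benchmarks".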

There may be other ways to write your proposal. This is just one way that came to mind.

Cheers,
Jonathan

Oliver Beckstein

Apr 15, 2022, 1:31:39 PM
to mdnalysis-devel
Hi Anirvinya G,

Adding to what Jonathan said. 

Our ideas page lists goals that we think will be important for MDAnalysis and, necessarily, these goals are expressed at a high level of abstraction. In your proposal you have an opportunity to make these goals as concrete as you can. In other words, you have considerable freedom in how you want to tackle the problem. And we are interested in reading YOUR ideas. We will be impressed by a proposal that shows research and describes a feasible and specific plan to accomplish the goal. If you find that major points are unclear, then making these points clear should be an objective in your proposal. You have to explain what is unclear, why you need whatever it is, and how you will achieve it.

Even if you haven’t done any benchmarking yet, you can do a survey of the MDAnalysis code (e.g., by module) and note which modules have some benchmarking coverage. From there you can develop a plan for covering the library. We would expect you to prioritize benchmarks in your plan so that highly used and performance-sensitive code is covered first.
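
A survey like this can be partly automated. The following is a rough sketch only: it walks a package directory for `.py` and `.pyx` source files and marks a module as covered when some benchmark file has the same file name stem. That matching rule, the function name `survey_coverage`, and the directory arguments are all assumptions made for illustration; the real benchmark layout may need a different rule.

```python
from pathlib import Path


def survey_coverage(package_dir, benchmark_dir):
    """Return {dotted module name: True if a same-named benchmark file exists}.

    The name-matching rule is a hypothetical convention, not a project rule.
    """
    root = Path(package_dir)
    bench_stems = {p.stem for p in Path(benchmark_dir).rglob("*.py")}
    report = {}
    for path in sorted(root.rglob("*")):
        # Consider Python and Cython sources; skip package markers.
        if path.suffix not in {".py", ".pyx"} or path.stem == "__init__":
            continue
        relative = path.relative_to(root).with_suffix("")
        module = ".".join(relative.parts)
        report[module] = path.stem in bench_stems
    return report
```

Filtering the result for `False` values gives a first list of modules without any benchmark, which can then be ordered by how heavily used or performance sensitive they are.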

As a starting point, code that has been written in Cython is almost certainly performance relevant (see a lot of the code under MDAnalysis.lib, for instance, or the trajectory readers). Features like on-the-fly transformations would also benefit from benchmarking, especially as we have open PRs that are supposed to increase performance.

The issue tracker has open and closed issues with the “performance” label, which could give you an initial idea of some of the code areas relevant for benchmarking. There are also issues labeled “benchmarking”. Finally, searching the mailing list archives might also help.

If you have specific questions, we can answer those as well.

Best,
Oliver


--
Oliver Beckstein (he/his/him)

GitHub: @orbeckst

MDAnalysis – a NumFOCUS fiscally sponsored project





anir...@gmail.com

Apr 17, 2022, 5:42:03 PM
to MDnalysis-devel
Hi!

Thanks a lot for your feedback. I have submitted a proposal that keeps the suggested points in mind. I would appreciate your feedback and suggestions so that I can make any necessary changes.

Regards,
Anirvinya G

Oliver Beckstein

Apr 18, 2022, 12:57:38 PM
to mdnalysis-devel
Hi Anir,

I had a look at your proposal. I didn’t see a link in the PDF to a Google Doc or similar where I could have commented, so I am sending comments via the mailing list. I hope they make sense.

1) State clearly in the beginning if you are considering a 175h or 350h project. Your timeline should reflect the project size.

2) I like your idea of assessing the importance of code pieces in order to prioritize benchmarking effort. But the execution is still vague, and allocating only a week to come up with a solution in the absence of a concrete plan seems too tight. More importantly, it could be a waste of time if you cannot find a solution. Make sure that you state a credible alternative plan (you mention Kern profiler). Include sufficient detail so that we can assess feasibility.

3) You allocate a month to write new benchmarks but there’s no detail. Some of it may depend on prior work, but you don’t want everything to depend on a prior objective that might or might not work (namely (2)). I’d also like to see more specifics and some concrete examples. Show us one or two new benchmarks that you already tested. Show us the names of benchmarks that you will write.

4) Documentation is not an afterthought. Integrate documenting throughout. How will your work enable future developers to maintain the high benchmarking standards that you set? Tell us how your successful project will benefit MDAnalysis in the long run.
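
For point (2), one possible fallback for ranking code by importance is to profile a representative workload and sort functions by cumulative time. The sketch below swaps in the standard-library `cProfile` and `pstats` modules rather than the Kern line profiler mentioned in the proposal, and the functions `slow_sum`, `fast_sum`, `workload`, and `rank_hotspots` are hypothetical placeholders; a real run would profile an actual analysis script.

```python
import cProfile
import pstats


def slow_sum(n):
    # Deliberately expensive placeholder.
    return sum(i * i for i in range(n))


def fast_sum(n):
    # Deliberately cheap placeholder.
    return n


def workload():
    # Stand-in for a representative script (hypothetical).
    slow_sum(200_000)
    fast_sum(200_000)


def rank_hotspots(func, top=5):
    """Profile func() and return (function name, cumulative seconds) pairs."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) to (cc, nc, tottime, cumtime, callers).
    ranked = sorted(
        stats.stats.items(), key=lambda item: item[1][3], reverse=True
    )
    return [(key[2], value[3]) for key, value in ranked[:top]]
```

Calling `rank_hotspots(workload)` yields a list ordered from most to least cumulative time, which is one way to decide where benchmarks should be written first.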

Sorry to be short. I look forward to reading your final proposal.

Best,
Oliver


anir...@gmail.com

Apr 20, 2022, 5:58:22 AM
to MDnalysis-devel
Hi Oliver!

Thanks a lot for your feedback. I tried to incorporate what you said to the best of my ability. In the meantime, I will actively look at issues on GitHub and try to solve them to get a better feel for the codebase.

Regards,
Anirvinya G 