Welcome to our online, globe-spanning hackathon!
I hope you have a good start. I talked to each of you in virtual face-to-face calls, and from what I heard, I think it makes the most sense to focus our week on one line of work: "compound / reaction mapping".
We have compounds as free text in the "openTECR recuration" spreadsheet and as KEGG identifiers in the Noor dataset. Reactions are built from those compound names, and some of the reactions are already linked to Rhea.
The main challenge for interoperability is that many aspects of a compound, such as ionization state or stereoisomerism, may or may not be important, depending on one's view of the world. To create interoperability for openTECR, we can develop this week a way to go from
1) compound free text (as it is right now) to
2) compound identifiers, then
3) pull the chemical structure representations,
4) normalize them according to our needs, and
5) cross-map the compounds to other databases.
Finally, this allows us to
6) visualize compounds,
7) query compounds based on structural similarity, and
8) draw them as interactive input in the browser.
And if we're still running after all of this, we can
9) map the reactions to Rhea, the reaction database.
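To make steps 1) and 2) a bit more concrete, here is a minimal Python sketch of how free-text compound names could be mapped to KEGG identifiers. Everything in it is an assumption for illustration: the tiny synonym table is hand-written, the function names are hypothetical, and the reaction format ("A(aq) + B(l) = C(aq) + D(aq)") is only my guess at what the spreadsheet entries look like. A real version would pull synonyms from a database instead.

```python
import re

# Hypothetical, hand-written synonym table (illustration only).
# Keys are normalized free-text names, values are KEGG compound identifiers.
SYNONYMS = {
    "atp": "C00002",
    "adp": "C00008",
    "h2o": "C00001",
    "water": "C00001",
    "orthophosphate": "C00009",
}


def normalize_name(text):
    """Lowercase, drop phase labels such as (aq) or (l), collapse whitespace."""
    text = text.strip().lower()
    text = re.sub(r"\((aq|l|g|s)\)", "", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def map_compound(text):
    """Return the KEGG identifier for a free-text name, or None if unknown."""
    return SYNONYMS.get(normalize_name(text))


def map_reaction(equation):
    """Split 'A + B = C + D' into two lists of (free text, identifier) pairs.

    Splitting on ' + ' (with spaces) keeps names like 'NAD+' intact.
    """
    left, right = equation.split(" = ", 1)

    def side(part):
        return [(name.strip(), map_compound(name)) for name in part.split(" + ")]

    return side(left), side(right)


if __name__ == "__main__":
    substrates, products = map_reaction(
        "ATP(aq) + H2O(l) = ADP(aq) + orthophosphate(aq)"
    )
    print(substrates)
    print(products)
```

Names that are not in the table come back as None, which gives us a simple list of compounds that still need manual curation.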
Of course you are free to do whatever you want -- as a rule for this hackathon, I would rather you be happy than have work done that only I, but not you, think is useful.
Last time we had some principles and some slides about them -- in short, they were a) Relationships, b) Generosity, c) Visible work, and d) Having goals. You can find the slides here:
https://docs.google.com/presentation/d/184Pgq6ixjRrxfHrjDNaQ7H4OpMiSWe3HEK0n1emZ0mg/edit?usp=sharing
To enable us to build all of this, I believe it would be nice if you could write, here or elsewhere, a short intro about yourself -- who you are, personally and professionally, and what you're interested in this week. If you want to, you could record this as a video or audio, too!
Because it helps me, I would also like to invite you to track your investments in a spreadsheet format similar to the one we used for the actual curation. I find it satisfying to look back at what I actually did and find it in a concise form. But if you don't like to track, feel free NOT to do it. Please consider it an offer, not an obligation! Here is the link:
https://docs.google.com/spreadsheets/d/1YTiEAf4EVZaGsISSjGfCcEc-UOl1sQ8Z5WeXrIQa6nE/edit?gid=2050965979#gid=2050965979
I would be happy if you let me and everyone else know right away about any challenges and/or questions you have! Then we can (at least try to) help you immediately. Also, this will be great for coordination between us.
Looking forward to hacking with you! :)