Existing (partially) computational workflow for SANS structural analysis

Cameron Neylon

Apr 18, 2011, 7:40:51 AM
to Lark Song
Hi Folks

I thought I'd kick off with a use case description that I think has
some nice characteristics and could potentially test Lark in an
interesting set of ways. We regularly collect SANS data on proteins in
solution. The raw data is essentially a binary image file. This gets
"reduced" to an x-y dataset, which is then used for analysis. The
reduction uses a scriptable software framework that has been built in
house (http://mantidproject.org), and I have written some of the
scripts for doing this.

One route for analysis is to build model structures, generally based
on existing PDB files, generate predicted scattering patterns from the
models and then compare to experiment. The building process can
involve incorporating loops (using e.g. Modeller), some form of MD, or
just a parameter sweep through a set of bond rotations and
translations. Some of this workflow I have under control and some of
it I haven't implemented yet, but most of it is based on a fairly
disparate set of existing tools and involves generation of standard
file formats (PDB and SasXML). Currently I do everything manually.
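
A minimal sketch of what the parameter-sweep option could look like,
assuming each candidate model is described by one rotation angle and
one translation distance. The function name, parameter names and value
ranges are made up for illustration and are not part of any existing
tool:

```python
import itertools

def sweep(angles_deg, shifts_angstrom):
    """Yield one parameter dict per (rotation, translation) combination."""
    for angle, shift in itertools.product(angles_deg, shifts_angstrom):
        # Hypothetical parameter names; a real sweep would name the bond
        # being rotated and the axis of translation.
        yield {"rotation_deg": angle, "translation_A": shift}

# Four rotation angles times three translations gives 12 candidates.
params = list(sweep(range(0, 360, 90), [0.0, 5.0, 10.0]))
print(len(params))
```

Each dict would then be handed to whatever tool actually moves the
atoms and writes out a PDB file.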

So two useful workflows for me would be:

Reduction:
1. Take input image data for relevant experimental runs (GUI
implemented in Mantid/Python)
2. Generate reduced data (existing Python code running within the
Mantid framework)
3. Write out reduced data and record of the processing (as above, but
currently provenance info is poor)
4. Attempt to connect with other information on samples etc...
(currently not done)
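
As a rough illustration of what a better processing record for step 3
might contain, here is a sketch that builds a small JSON-serialisable
record from the input files, the script name and the parameters used.
The field names, the file name and the script name are all assumptions
for the example, not an existing Mantid feature:

```python
import hashlib
import json
from datetime import datetime, timezone

def provenance_record(raw_files, script, parameters):
    """Build a dict describing how a reduced dataset was produced.

    raw_files maps a file name to its raw bytes; in practice the bytes
    would be read from disk.
    """
    return {
        "created": datetime.now(timezone.utc).isoformat(),
        "script": script,
        "parameters": parameters,
        # A content hash pins down exactly which raw data went in.
        "inputs": [
            {"name": name, "sha256": hashlib.sha256(data).hexdigest()}
            for name, data in sorted(raw_files.items())
        ],
    }

record = provenance_record(
    {"run_0001.raw": b"example bytes"},        # hypothetical run file
    script="reduce_sans.py",                   # hypothetical script name
    parameters={"q_min": 0.01, "q_max": 0.3},  # hypothetical settings
)
print(json.dumps(record, indent=2))
```

Writing this next to the reduced data file would at least make it
possible to trace a reduced dataset back to its inputs later.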

Modelling:
1. Pull down relevant PDB files
2. Generate model structures (a number of different approaches here,
including manual model building, automatic loop building [Modeller],
and sets of bond rotations and translations [PyMOL? MMTK?])
3. Clean up structures to remove non-physical members (clashes etc.;
I haven't actually done this yet)
4. Generate predicted scattering patterns (ATSAS suite; I've written
some Python wrappers/parsers for this)
5. Compare predicted patterns to an experimental pattern and rank
6. Visualise results, rinse and repeat.
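
Step 5 could be sketched roughly as below. This assumes the predicted
and experimental curves have already been put on the same q grid, and
it uses a reduced chi-squared with a single fitted scale factor as the
ranking score. That scoring choice and all the names and numbers here
are assumptions for illustration, not what the ATSAS tools or my
wrappers necessarily do:

```python
def chi_squared(i_exp, sigma, i_model):
    """Reduced chi-squared after least-squares scaling of the model curve."""
    weights = [1.0 / s ** 2 for s in sigma]
    # Best-fit scale factor c minimising sum(w * (I_exp - c * I_model)^2).
    scale = (sum(w * e * m for w, e, m in zip(weights, i_exp, i_model))
             / sum(w * m * m for w, m in zip(weights, i_model)))
    resid = sum(w * (e - scale * m) ** 2
                for w, e, m in zip(weights, i_exp, i_model))
    return resid / (len(i_exp) - 1)

def rank_models(i_exp, sigma, models):
    """Return (name, chi2) pairs sorted from best to worst fit."""
    scored = [(name, chi_squared(i_exp, sigma, curve))
              for name, curve in models.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Toy intensities on a shared q grid (made-up numbers).
i_exp = [10.0, 5.0, 2.0, 1.0]
sigma = [0.5, 0.3, 0.2, 0.1]
models = {
    "model_a": [20.0, 10.0, 4.0, 2.0],  # same shape, different scale
    "model_b": [10.0, 8.0, 6.0, 4.0],   # different shape
}
ranking = rank_models(i_exp, sigma, models)
print(ranking[0][0])  # model_a
```

Because the scale factor is fitted, a model with the right shape but
the wrong absolute intensity still ranks first.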

So making sense of any part of this would be helpful to me, and there
is an interesting set of further wrinkles here, viz. reaching back
into the experimental lab workflow, working with other groups in a
collaborative manner, and the benefits of modular workflows that would
enable different means of structural modelling, for instance. We also
have control over most of the software pieces here, so integrating
with Lark shouldn't be a problem.

Finally, there is the interesting question of how this plays with our
existing lab notebook and with some ongoing work where we're thinking
about generic workflow descriptions for some of these processes. So
there is already a background understanding of the data models,
provenance, and description frameworks, which we can compare and
contrast with how Lark is configured.

The other thing is that sharing of best practice in this space is
extremely poor, so giving people tools to share workflows has some
real potential user benefits. The user community is also reasonably
command-line oriented, so there is an opportunity to drive uptake
quite rapidly.

Cheers

Cameron