Topspin Script for Exporting HSQC Data in a SMART-Compatible Format


Joseph Egan

Jan 22, 2020, 5:15:01 PM
to SMARTNMR
Hello Everyone,

I realize that a large number of users are using MNova, but there are some who prefer to work in Topspin. To make it as easy as possible for these users to be able to import their data, I've created a small script that utilizes Topspin's native python environment to do the heavy lifting.

Note: you can try changing the directories, but I found that sometimes the permissions don't line up too well. To circumvent this, the script puts the peak lists in a folder named "SMART Peak Lists" in the Bruker directory on your hard drive. If you are not on Windows, you must change the directory to the relevant Mac directory.

Step by Step on how to get these to work:
  1. Navigate to your Topspin directory (in Windows, this is "C:\Bruker")
  2. Create a new folder called "SMART Peak Lists"
  3. Open Topspin and type 'edpy' - hit enter
  4. In this dialog, go to the top right corner where it says "Source = [ combo bar filled with a few things] "
  5. Select the option in the combo box that ends with /py/user
  6. Go to File->New and name the script what you want (Hint: you can directly call the script from the input bar from now on, so call it something easy - I called mine SMART.py)
  7. Copy the code into the box and hit save.
  8. Open a data set that is peak picked to your satisfaction, type the name of your script into the input bar, and hit enter (e.g. 'SMART').
  9. If you do NOT get an error message, it should say 'SMART.py: finished' in the lower left corner.
  10. Navigate to the directory where your lists are stored.
  11. You'll see the sample name of your open data set followed by "_Peaks.csv"
  12. Submit them to the website and rejoice.
After the first success, you'll be able to simply open a data set and type 'SMART' and your lists will be generated and given the right sample name, ready for the SMART server.

The Script:

import csv, os

# CURDATA() and GETPEAKSARRAY() are provided by Topspin's built-in Python environment.
curdat = CURDATA()
Peaklist = GETPEAKSARRAY()
# This is what you change if you are on Mac or wish to give the directory a different name.
SMART_PATH = os.path.abspath(r"C:\Bruker\SMART Peak Lists")
# The output file is named after the open data set.
Dest_Path = os.path.join(SMART_PATH, str(curdat[0]) + '_Peaks.csv')
with open(Dest_Path, 'w') as csvfile:
    fieldnames = ['1H', '13C']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    # One row per picked peak: 1H rounded to 2 decimals, 13C to 1 decimal.
    for peak in Peaklist:
        proton = round(peak.getPositions()[0], 2)
        carbon = round(peak.getPositions()[1], 1)
        writer.writerow({'1H': proton, '13C': carbon})
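
One way to handle the Windows/Mac directory difference mentioned in the note is to work out the folder at run time and create it if it is missing. The sketch below is only an illustration and has not been tested inside Topspin: `smart_output_dir` and `ensure_dir` are hypothetical helper names, and the non-Windows default (the user's home directory) is an assumption, not a Topspin convention.

```python
import os


def smart_output_dir(base=None, folder="SMART Peak Lists"):
    """Return the folder where exported peak lists should be written.

    Both default locations are assumptions for illustration only:
    C:\\Bruker on Windows, the user's home directory elsewhere.
    """
    if base is None:
        # os.sep is a backslash only on Windows-style path handling.
        if os.sep == "\\":
            base = r"C:\Bruker"
        else:
            base = os.path.expanduser("~")
    return os.path.join(base, folder)


def ensure_dir(path):
    """Create `path` if it does not exist yet and return it.

    Written without exist_ok so it also runs on older Python 2 interpreters.
    """
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```

In the export script, `SMART_PATH = ensure_dir(smart_output_dir())` could then stand in for the hard-coded path, which would also make step 2 of the instructions unnecessary.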


I have a TON of these small scripts for Topspin, and have tried to automate peak picking (with some limited success), so if you are interested in automating part of your workflow, let's discuss! And if you can't get this to work, let me know and I'll try to help.

Cheers,
Joe



beowu...@gmail.com

Jan 22, 2020, 5:26:38 PM
to SMARTNMR

Hi Joe,

Thank you very much for your script! This is most helpful. I was hoping somebody could help us achieve this feature. Could we put your method in our documentation and credit you? In the future, could we add you as a coauthor when we publish the next SMART paper?

Best,

Chen

Hyunwoo Kim

Jan 22, 2020, 5:30:58 PM
to Joseph Egan, smar...@googlegroups.com
Hi MADBYTE guy!
How are you?

I'm Hyunwoo and we've met at ASP2019 before.

That's a good suggestion!
We have always thought about importing the raw data directly into SMART!

By the way, my question is:
Is there any way to use your script in plain Python itself? (I tried to find a way, but failed.)
If so, we can integrate it directly into the upload process.

Best,
Hyunwoo


--
You received this message because you are subscribed to the Google Groups "SMARTNMR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to smartnmr+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/smartnmr/21351c68-6393-4140-8831-db4a4092f1cb%40googlegroups.com.

Joseph Egan

Jan 22, 2020, 6:04:57 PM
to SMARTNMR
Hey Chen and Hyunwoo,

Great to see this project is getting the attention it deserves (everyone is buzzing about it!), and I hope you both are well. Absolutely take it for the documentation, I'm all about helping out.

Regarding converting things around through raw python - absolutely you can do it, but you have to approach it a different way. This script works in Topspin's native python because it calls on some functions they've got installed into the environment - namely CURDATA and GETPEAKSARRAY - which do not have any place in a normal python environment.

As you both know, I've been trying to automate as much as I can to allow for people to use MADByTE when the time comes. Because of this, I have some scripts that convert the peak.txt file (or peak.xml, depending on which way the user exports) into usable formats.

To me, there are two really easy ways to think about implementing it:
  1. Users are able to drag their peak.txt file directly into your online portal
    • This would require a back end script that simply parses through the peak.txt file for relevant data and constructs the csv or dataframe - depending on your back end framework.
  2. Create a script that users can call on to convert an entire data directory's peak picked spectra into a new directory with the correctly formatted files.
    • I have a similar approach to this in MADByTE because I wanted to give users control over what experiments get carried through and which are one off NMRs and should just be separated. The user would just have to run the convert script and all the relevant peak.txt files in a given data directory would be converted into the correct format.
It really depends on how much you want the user to know about python. The first is easy if you are already running a python script on the SMART import - and I'd be happy to help. The second is easiest for you to simply give people instructions on, but might field questions with users who don't use python.
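
The second option could look roughly like the sketch below, which walks a data directory, finds every `peak.txt`, and hands each one to a converter. All function names are hypothetical, `convert_one` stands in for whatever parser ends up being used, and dropping the `pdata` folder from the output name is an assumption about how Topspin lays out its data.

```python
import os


def find_peak_files(data_root, filename="peak.txt"):
    """Yield the path of every file called `filename` below data_root."""
    for dirpath, _dirnames, filenames in os.walk(data_root):
        if filename in filenames:
            yield os.path.join(dirpath, filename)


def output_name(data_root, peak_path, suffix="_Peaks.csv"):
    """Build an output file name from the location of a peak list.

    The folders between data_root and the file are joined with underscores
    (skipping "pdata"), so every converted list keeps a traceable name.
    """
    rel_dir = os.path.relpath(os.path.dirname(peak_path), data_root)
    parts = [p for p in rel_dir.split(os.sep) if p not in (".", "pdata")]
    return "_".join(parts) + suffix


def convert_directory(data_root, out_dir, convert_one):
    """Call convert_one(src, dst) for every peak list under data_root.

    convert_one is a hypothetical callback that reads one peak.txt and
    writes one SMART-ready CSV. Returns the list of files written.
    """
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    written = []
    # sorted() collects all inputs first, so new output never gets re-scanned.
    for src in sorted(find_peak_files(data_root)):
        dst = os.path.join(out_dir, output_name(data_root, src))
        convert_one(src, dst)
        written.append(dst)
    return written
```

Keeping the converter as a callback means the same walker would work for either the peak.txt or the peak.xml export, and users only ever run one command.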

Cheers,
Joe

SMART NMR

Jan 23, 2020, 4:48:49 AM
to SMARTNMR
Hi Joe,

Thank you very much. We will try out your program and let you know!

Best,

Chen

Joseph Egan

Jan 23, 2020, 1:07:10 PM
to SMARTNMR
Hello Everyone,

Regarding Hyunwoo's question about a stand alone python script to do the conversion - I have simply refactored one that we used for MADByTE data parsing that now outputs compatible files for use in SMART.

It uses Pandas, but can easily be re-worked to not use it if it causes any trouble. To use this script, you must use the topspin command 'convertpeaklist txt' beforehand. It will prompt the user for an ID, which it then attaches to the final file. It wouldn't take long to scale it up to do whole lists of samples.

import pandas as pd
import os, csv


# For use after running the Topspin command 'convertpeaklist txt', which places
# a file called peak.txt in the proc folder of the experiment.
# All peak outputs are called peak.txt by default from Topspin.
File = "peak.txt"  # Change this to the file location (or simply click and drag the peak.txt into the directory with the script)
print('Please enter the name of the sample')
ID = input()  # This names the final file so the user can keep track of which sample they have converted.


def row_dict(filename):
    # row_dict was written by Jeff van Santen
    Rows = dict()
    with open(filename) as p:
        reader = csv.reader(p, delimiter=" ")
        for row in reader:
            # Drop the empty fields produced by runs of spaces.
            row = [x for x in row if x]
            # Skip blank lines and the '#' metadata lines between peak-picking sections.
            if "#" in row or not row:
                continue
            else:
                try:
                    # Key on the peak ID; columns 3 and 4 hold the 1H and 13C shifts.
                    Rows[row[0]] = [row[3], row[4]]
                except IndexError:
                    # Row is too short to hold peak data.
                    pass
    return Rows


Rows = row_dict(os.path.join(File))
HSQC_Data = pd.DataFrame.from_dict(Rows, orient='index', columns=['1H', '13C']).astype('float')
HSQC_Data = HSQC_Data.sort_values(by=['1H'], ascending=True).round({'1H': 2, '13C': 1})
HSQC_Data.to_csv(ID + '_SMART_Peak_List.csv', index=False)
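
For anyone who would rather avoid the Pandas dependency, the same parsing can be sketched with the standard library alone. This is an illustrative rewrite, not the MADByTE code: `parse_peak_lines` and `write_smart_csv` are hypothetical names, and the column positions (fields 3 and 4 for 1H and 13C) are simply copied from `row_dict` above. Unlike the Pandas version, rows whose shift columns are not numeric are skipped rather than raising an error, and ties in 1H are ordered by 13C.

```python
import csv


def parse_peak_lines(lines):
    """Return sorted [(1H, 13C), ...] from the lines of a peak.txt export.

    Mirrors row_dict: split on whitespace, skip blank lines and lines that
    contain a bare '#', key on the peak ID, read fields 3 and 4 as shifts.
    """
    peaks = {}
    for line in lines:
        fields = line.split()
        if not fields or "#" in fields:
            continue
        try:
            peaks[fields[0]] = (float(fields[3]), float(fields[4]))
        except (IndexError, ValueError):
            # Too few columns, or the shift columns are not numbers.
            continue
    # Same rounding as the Pandas version: 2 decimals for 1H, 1 for 13C.
    return sorted((round(h, 2), round(c, 1)) for h, c in peaks.values())


def write_smart_csv(peaks, handle):
    """Write the peaks to an open text handle with a 1H,13C header row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["1H", "13C"])
    writer.writerows(peaks)
```

Because `parse_peak_lines` takes any iterable of lines, it can be fed an open file on disk or the contents of an uploaded file without change.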


SMART NMR

Jan 23, 2020, 8:31:12 PM
to SMARTNMR
Thank you very much Joe! The original Topspin 4.0.6 peak list export function was not easy to use, compared with the script you wrote.

Best,

Chen

mwa...@gmail.com

Jan 28, 2020, 2:13:15 PM
to SMARTNMR
Thanks Joe,

It's definitely in our best interests to support the widest range of inputs (within reason). It seems very straightforward to implement such an input for TopSpin. Do you happen to have any example files you could share with us so we can test our implementation before releasing it live?

Thanks!

Ming

Joseph Egan

Jan 28, 2020, 2:50:51 PM
to SMARTNMR
Hey Ming,

Absolutely I do! Just as a heads up, we run Avance III instruments here, so we also use topspin 3.6.1 - topspin 4 is for the newer magnets (as far as I understand) and I have not messed around with the file structure from that yet.

Regarding Topspin's data structure, when the command 'convertpeaklist txt' is used, it formats the peak picked list as a .txt file and places it in the 'pdata' folder within the sample directory.

So, just as a quick breakdown of how it's stored:
  • NMR_Data_Directory
    • Sample_Name 
      • experiment number
        • pdata
          • Proc_number
            • peak.txt
Further, when a user manually picks peaks in topspin, it annotates each peak picking event as a different section. This was a slight problem, since you have to read line by line and get rid of the meta-data between each line of peak information. You can see this when you open the peak.txt file I've provided. The file is for Erythromycin (and when tested, returns a very nice cosine score!) - which is also listed in the metadata as NAME=HND_Erythromycin - it would be very simple to use this as a default naming scheme if the users are going to drag in a whole pile of 'peak.txt' files.
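
Using the NAME entry as a default file name could be sketched as follows. Only the `NAME=HND_Erythromycin` form quoted above is known here, so the regular expression, the helper names, and the fallback value are all assumptions to be checked against real peak.txt files.

```python
import re

# Assumes the metadata contains NAME=<value> with no spaces inside the value.
NAME_PATTERN = re.compile(r"NAME=(\S+)")


def sample_name_from_lines(lines, default="sample"):
    """Return the first NAME=<value> found in the lines, else `default`."""
    for line in lines:
        match = NAME_PATTERN.search(line)
        if match:
            return match.group(1)
    return default


def safe_filename(name, suffix="_SMART_Peak_List.csv"):
    """Turn a sample name into a file name that is safe on most systems."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", name).strip("_")
    return (cleaned or "sample") + suffix
```

With this, a pile of identically named 'peak.txt' uploads could each be given a distinct output name without asking the user for an ID.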

If you need more info/file/scripts/cookies/coffee, let me know!

Cheers,
Joe
peak.txt

mwa...@gmail.com

Feb 6, 2020, 5:36:00 PM
to SMARTNMR
Joe,

Thanks!

We'll try to include it in the next version!

Ming

mwa...@gmail.com

Feb 20, 2020, 6:39:32 PM
to SMARTNMR
Hi Joe,

We just pushed TopSpin support live. Try it out; your test seems to work, but I don't have more test cases. Let me know if it works for you and if there are any failures.

Thanks so much for the help!

Ming

jmeg...@gmail.com

Feb 20, 2020, 7:20:56 PM
to mwa...@gmail.com, SMARTNMR

Hey Ming,

Looks great! I tried the auto peak picked text files and manually annotated text files and all the cases passed. The thing to keep in mind is that the manual peak picking in topspin outputs each separate peak picking event as a different section with different headers – which is why it isn’t as simple as cut and paste. This way, users can use the ‘notes/annotations’ areas of the peak picking options in topspin and it looks like it’s completely unaffected.

I ran Erythromycin, Betulinic Acid, and Staurosporine through and in each case, it got pretty close! (All my spectra are in DMSO, so I don’t expect there to be a perfect match)

Cheers,
Joe


beowu...@gmail.com

Feb 20, 2020, 7:33:31 PM
to SMARTNMR
Thank you very much Joe! The feature is awesome!

Chen


Raphael Reher

Feb 20, 2020, 7:41:53 PM
to SMARTNMR
Hi Joe and Ming,

Thanks to both of you.
Our NMR facility manager just sent me a peak list from a more recent version of TopSpin in the .xml format.
Can we make these files work, too?
I attached the file.

Best,
Raphael



peaklist.xml
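
A possible starting point for the .xml lists is sketched below with Python's standard `xml.etree.ElementTree`. The attached file is not reproduced in this thread, so the element name `Peak2D` and the attributes `F1` (13C) and `F2` (1H) are assumptions that must be checked against a real export; they are exposed as parameters for that reason, and `peaks_from_xml` itself is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def peaks_from_xml(xml_text, peak_tag="Peak2D",
                   proton_attr="F2", carbon_attr="F1"):
    """Return sorted [(1H, 13C), ...] read from peak element attributes.

    peak_tag, proton_attr and carbon_attr are assumed names; adjust them
    to match the actual peaklist.xml. Files that declare an XML namespace
    would need the namespace included in peak_tag.
    """
    root = ET.fromstring(xml_text)
    peaks = []
    for elem in root.iter(peak_tag):
        try:
            proton = float(elem.attrib[proton_attr])
            carbon = float(elem.attrib[carbon_attr])
        except (KeyError, ValueError):
            # Attribute missing or not numeric: skip this element.
            continue
        # Same rounding as the text export: 2 decimals for 1H, 1 for 13C.
        peaks.append((round(proton, 2), round(carbon, 1)))
    return sorted(peaks)
```

The returned list has the same shape as the output of the peak.txt parsers in this thread, so the CSV-writing step would not need to change.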

Raphael Reher

Feb 20, 2020, 10:28:08 PM
to SMARTNMR
I'll make a video tutorial for raw data processing in TopSpin and MNova.
With the latest TopSpin 4.08, it is rather straightforward.

Best,
Raphael