Python Script for Easy Conversion of NMR Data for SMART

Joseph Egan

Mar 25, 2020, 8:49:46 PM

to SMARTNMR

Hello Everyone,

I've written a small script that takes the peak list output from any of the three major processing packages (TopSpin, Delta, and MestReNova) and converts it into the format SMART expects, so you have less data handling to do.

See the code below, or simply download it from the attached file. Note: it requires Python and pandas (if you have Anaconda installed on your machine, you'll be fine). For TopSpin data, this conversion is no longer necessary now that the code has been integrated into SMART.

Using TopSpin: Peak pick your data and type the command "convertpeaklist txt"

Navigate to your processed data folder (usually in SampleName/experiment_number/proc_number)
The file you're looking to convert is the peak.txt file (on Windows, simply SHIFT+right-click it and choose "Copy as path")
Use this as the path when prompted by the script
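
As a rough sketch of what the script does with peak.txt: it skips comment lines, splits each remaining row on whitespace, and keeps the fourth and fifth fields as the 1H and 13C shifts. The sample text and the name `parse_peak_rows` are made up for illustration, and the column positions are taken from the script in this post rather than from Bruker documentation, so check them against your own file.

```python
import io

# Hypothetical peak.txt content: the layout and values are assumptions for illustration only.
SAMPLE = """# made-up header line
1  10.0  20.0  7.262  128.41  1000
2  11.0  21.0  1.254  29.73  2000
"""

def parse_peak_rows(handle):
    """Return {peak_id: (1H shift, 13C shift)} from whitespace-separated rows."""
    peaks = {}
    for line in handle:
        fields = line.split()
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        # Skip rows too short to hold both shifts.
        if len(fields) < 5:
            continue
        try:
            h_shift = round(float(fields[3]), 2)
            c_shift = round(float(fields[4]), 1)
        except ValueError:
            # Skip rows whose shift fields are not numbers (e.g. a text header).
            continue
        peaks[fields[0]] = (h_shift, c_shift)
    return peaks

peaks = parse_peak_rows(io.StringIO(SAMPLE))
print(peaks)
```

Unlike the posted script, this sketch treats any line starting with "#" as a comment and silently drops non-numeric rows.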

Using Delta: Peak pick your data and save your peak list (from the peak table option under Analyze - click the floppy disk)

Remember where you save this file and what it is called
Use this as the path when prompted by the script
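
For reference, the Delta conversion amounts to a few pandas calls. The sketch below runs on an in-memory table instead of a saved file; the `X` and `Y` column names are the ones the script reads, while the `Intensity` column and all the values are invented for illustration.

```python
import io
import pandas as pd

# Made-up stand-in for a saved Delta peak list.
SAMPLE = """X,Y,Intensity
7.2614,128.412,1000
1.2549,29.731,2000
"""

# Keep only the two shift columns and give them the names SMART expects.
peaks = pd.read_csv(io.StringIO(SAMPLE))[["X", "Y"]]
peaks.columns = ["1H", "13C"]

# Round 1H to two decimals and 13C to one, then sort by 1H shift.
peaks = peaks.round({"1H": 2, "13C": 1}).sort_values(by="1H")

print(peaks.to_csv(index=False))
```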

Using MestReNova: Peak pick your data and 'save as'

Select 'Script: NMR Peak Table' as your output type
Save file and remember the path to the file (or use the right click trick outlined above)
Use this as the path when prompted by the script
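
A note on the right-click trick: paths copied this way on Windows arrive wrapped in double quotes. A small helper along these lines strips them before the file is opened. The name `clean_path` and the example path are hypothetical.

```python
def clean_path(raw):
    """Strip surrounding whitespace and quote characters from a pasted path."""
    return raw.strip().strip('"').strip("'").strip()

# Example with a made-up path, as it would look after "Copy as path".
print(clean_path('"C:\\data\\peak.txt"'))
```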

If anybody wants, I can construct a GUI and a batch processing option, but unless there is demand I'll keep it on the back burner.

Cheers, and happy Wednesday!

-Joe

### Written by Joseph Egan for the SMART NMR team to use as they see fit. Jeffery van Santen wrote the row_dict function to allow for ease of use ###
### Version 1.0 ##
### Suggestions? Pass them along by emailing Joe at MADBy...@gmail.com###
import pandas as pd
import csv
import os

### Functions ###
#### For JEOL Data: ######
def Jeol_Data_Converter(ID,File_In):
    Working_File = pd.read_csv(File_In)
    Working_File = Working_File[['X','Y']]
    Working_File.columns = ['1H','13C']
    Working_File = Working_File.round({'1H':2,'13C':1}).sort_values(by=["1H"])
    Working_File.to_csv(ID+'_SMART_Peak_List.csv',index=False)


### For Topspin Data: #####
### For use after running the TopSpin command 'convertpeaklist txt', which places a file called peak.txt in the proc folder of the experiment.
# All peak outputs are called peak.txt by default from TopSpin
def Topspin_Data_Converter(ID,File_In):
    def row_dict(filename):
        ### row_dict was written by Jeff van Santen for use in MADByTE ###
        Rows = dict()
        with open(filename) as p:
            reader = csv.reader(p, delimiter=" ")
            for row in reader:
                row = [x for x in row if x]  # drop empty fields left by repeated spaces
                if "#" in row or not row:  # skip comment and blank lines
                    continue
                try:
                    Rows[row[0]] = [row[3],row[4]]  # peak ID -> [1H, 13C] shifts
                except IndexError:  # row too short to hold both shifts
                    pass
        return Rows
    Rows = row_dict(File_In)
    HSQC_Data = pd.DataFrame.from_dict(Rows, orient='index',columns = ['1H','13C']).astype('float')
    HSQC_Data = HSQC_Data.round({'1H':2,'13C':1}).sort_values(by=["1H"], ascending=True)
    HSQC_Data.to_csv(ID+'_SMART_Peak_List.csv',index=False)

### For MestRenova Data: ###
def Mestrenova_Data_Converter(ID,File_In):
    data = pd.read_csv(os.path.join(File_In),delimiter='\t',skiprows=1)
    data.columns=['13C','1H','Intensity','MT1','MT2','MT3','MT4','MT5','MT6','MT7'] #Mestrenova likes to make things difficult with column parsing...
    data = data[['1H','13C']]
    HSQC_Data = data.copy().astype("float").round({'13C':1,"1H":2}).sort_values(by=["1H"], ascending=True)
    HSQC_Data.to_csv(ID+'_SMART_Peak_List.csv',index = False)

### User interface ###
print('Please enter the name of the sample:')
ID = input()
print('Please type the PATH to the file (including the name of the file)\n Alternatively, if this script is in the same directory, simply type the name of the file:')
File_In = input().strip().strip('"') # "Copy as path" on Windows wraps the path in double quotes
print('What type of data is this? Type "T" for Topspin, "M" for MestreNova, or "D" for Delta')
Datatype = input().strip().upper() # accept lower-case answers too
Converters = {'D': Jeol_Data_Converter, 'T': Topspin_Data_Converter, 'M': Mestrenova_Data_Converter}
if Datatype in Converters:
    Converters[Datatype](ID,File_In)
    print('Formatting Done.') # only reported when a conversion actually ran
else:
    print('File Type not recognized.')
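
The batch option mentioned in the post could look roughly like the sketch below for Delta peak lists. This is not part of the posted script: the function name `batch_convert_delta`, the folder layout, and the use of each file's stem as the sample name are all assumptions.

```python
from pathlib import Path
import pandas as pd

def batch_convert_delta(in_dir, out_dir):
    """Convert every Delta peak list (*.csv) in in_dir; return the files written."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(in_dir.glob("*.csv")):
        # Same steps as the single-file Delta conversion: select, rename, round, sort.
        peaks = pd.read_csv(source)[["X", "Y"]]
        peaks.columns = ["1H", "13C"]
        peaks = peaks.round({"1H": 2, "13C": 1}).sort_values(by="1H")
        # The file stem stands in for the sample name typed at the prompt.
        target = out_dir / f"{source.stem}_SMART_Peak_List.csv"
        peaks.to_csv(target, index=False)
        written.append(target)
    return written

# Hypothetical usage (folder names are placeholders):
# batch_convert_delta("delta_peaks", "smart_ready")
```

Keeping the output folder separate from the input folder avoids picking up already-converted files on a second run.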