Hey, it looks like you're off to a great start with Python. Congratulations!
1. Using csv.reader and csv.writer will simplify your code and make it more reliable:
import csv
# The with block closes the file once the loop is done
with open('TestInput.csv') as infile:
    reader = csv.reader(infile)
    for row in reader:
        print(row)
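Since csv.writer is mentioned above but not shown, here is a minimal sketch of reading and writing with the csv module, assuming Python 3. The sample data and the in-memory io.StringIO buffers are my own stand-ins for your real files, so the example runs without TestInput.csv.

```python
import csv
import io

# Hypothetical sample standing in for TestInput.csv (made-up values).
sample = "Gene,Column A,Column B\nAGPAT9,1.5,1\nAGPAT9,0.5,3\n"

# csv.reader yields each line as a list of strings.
rows = list(csv.reader(io.StringIO(sample)))

# csv.writer takes care of delimiters and quoting when writing rows out.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(rows)

print(rows[1])
```

With real files on Python 3, open them with newline='' (for example open('TestOutput.csv', 'w', newline='')), as the csv documentation recommends. The output file name here is only a placeholder.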
2. I think you are finding your problem hard because the data is not held in the right "structure" in your program. Right now you are storing a dictionary keyed by gene name with sums of the other columns. That needs to change, because you now want the program to do more than sum. Instead, store a dictionary keyed by gene name that holds the individual values of the other columns:
old_dict = {'AGPAT9': ['2', '4']}
new_dict = {'AGPAT9': { 'Column A': ['1.5', '0.5'], 'Column B': ['1', '3'] }}
Notice how the values are not summed in new_dict? The extra inner dictionary keeps every individual value, so you retain enough information to calculate the median. Unfortunately, this complicates how you store and extract information in your dict:
To store the data:
dict = {}
# Example values: headers is the first row of your CSV, row is one data row
headers = ['Gene', 'Column A', 'Column B']
row = ['AGPAT9', '1.5', '1']
for i, val in enumerate(row):
    # The first column gets special treatment because it is our gene name
    if i == 0:
        GeneName = val
        continue
    # This part is just data structure maintenance...
    if GeneName not in dict:
        dict[GeneName] = {}
    column_name = headers[i]
    if column_name not in dict[GeneName]:
        dict[GeneName][column_name] = []
    # Now we can actually store data...
    dict[GeneName][column_name].append(val)
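The bookkeeping above can be shortened with dict.setdefault. This is only a sketch of the same idea applied to every row of a file, assuming Python 3; the function name build_gene_table and the sample rows are made up for illustration.

```python
import csv
import io

def build_gene_table(lines):
    """Group column values by gene name: {gene: {column: [values]}}."""
    reader = csv.reader(lines)
    headers = next(reader)  # the first row holds the column names
    table = {}
    for row in reader:
        gene_name = row[0]
        # setdefault returns the existing inner dict, or stores a new empty one
        columns = table.setdefault(gene_name, {})
        for column_name, val in zip(headers[1:], row[1:]):
            columns.setdefault(column_name, []).append(val)
    return table

# Hypothetical sample data standing in for the real input file.
sample = io.StringIO("Gene,Column A,Column B\nAGPAT9,1.5,1\nAGPAT9,0.5,3\n")
table = build_gene_table(sample)
print(table)
```

The values are still strings at this point, exactly as the csv module delivers them, so convert them with float() before doing any arithmetic.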
To read the data:
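for gene_name, gene_data in sorted(dict.items()):
    for column_name, column_values in sorted(gene_data.items()):
        # CSV values are strings, so convert them before doing arithmetic
        numbers = [float(v) for v in column_values]
        total = sum(numbers)
        average = total / len(numbers)
        # median() is not built in: write your own, or use statistics.median on Python 3.4+
        column_median = median(numbers)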
See how much easier it is to do your calculations when you have your data structures right? This is the hard part of writing computer programs and is usually a process of trial and error.
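The median function left for you to write could look something like this. It is one possible sketch, assuming Python 3; the sample values are the ones from new_dict, and the summary dictionary is just my way of collecting the results.

```python
def median(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median() of an empty list")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2.0

# Sample data in the nested layout described above.
gene_data = {'Column A': ['1.5', '0.5'], 'Column B': ['1', '3']}
summary = {}
for column_name, column_values in sorted(gene_data.items()):
    numbers = [float(v) for v in column_values]
    average = sum(numbers) / len(numbers)
    summary[column_name] = (average, median(numbers))
print(summary)
```

Sorting a copy with sorted() leaves the caller's list untouched, which matters if you still need the values in file order afterwards.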
3. This is just a style tip: don't name your dictionary "dict". Python has a built-in called dict(), and by reusing its name you inadvertently make it impossible to call that built-in from your program. If you really want to use "dict", use "dict_". Anybody with experience in Python will know why you did that. (The same thing happens with the built-in list().)
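To see the problem for yourself, here is a small sketch, assuming Python 3. The function name shadowing_demo is made up; it keeps the bad name inside a function so the rest of the program is unaffected.

```python
def shadowing_demo():
    dict = {'AGPAT9': 1}  # this local name hides the built-in dict in here
    try:
        dict(a=1)  # tries to call our dictionary object, not the built-in
    except TypeError as error:
        return str(error)
    return None

message = shadowing_demo()
print(message)

# Outside the function the built-in is still available.
gene_counts = dict(AGPAT9=1)
```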
If you have more questions, feel free to ask!
Jerry