Vincent Davis
Vincent-
Actually I do have one question, which is in regards to the best way
to do ANCOVA using Python. Right now I'm using R for it, but being
able to do it with Python would enable me to integrate it into my
program much better. I'm currently doing data analysis on microarrays
as well. I will check out bioconductor, but so far I have been able to
do everything I needed with just python. It is just so much easier to
pull pieces of data from multiple files and piece them together in
Python.
I will incorporate the suggestions you made. I was racking my brains
over how I can apply this function to ANOVAs with more than three sets
of values. I'll come back if I have any problems with this.
The qcritical values table was from a website and I verified it with
the table found in "Biostatistical Analysis" by Zar (this is also the
same place that I got the equations for the Tukey test). I still don't
know how I can incorporate a table within Python. I can do
dictionaries, but how do you make a table that you can access using a
column and row number?
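One possible answer to the row/column question, as a minimal sketch: keep the table as a list of lists for access by position, and build a dictionary keyed by (df, k) tuples for access by label. The names DF_ROWS, K_COLS, GRID and q_critical are made up for illustration, and only a handful of alpha = 0.05 entries are shown; they should be checked against the published table before being relied on.

```python
# Row labels: error degrees of freedom. Column labels: number of groups (k).
DF_ROWS = [10, 20]
K_COLS = [2, 3, 4]

# Studentized range critical values, alpha = 0.05 (a small excerpt only;
# verify against a published table such as the one in Zar before use).
GRID = [
    [3.151, 3.877, 4.327],   # df = 10
    [2.950, 3.578, 3.958],   # df = 20
]

# Access by row and column number: GRID[row][col]
print(GRID[0][1])            # row 0 (df = 10), column 1 (k = 3)

# Access by label: a dictionary keyed by (df, k) tuples built from the grid.
Q_TABLE = {
    (df, k): GRID[r][c]
    for r, df in enumerate(DF_ROWS)
    for c, k in enumerate(K_COLS)
}

def q_critical(df, k, table=Q_TABLE):
    """Return the tabulated critical q for the given df and k."""
    if (df, k) not in table:
        raise KeyError("no table entry for df=%d, k=%d" % (df, k))
    return table[(df, k)]

print(q_critical(20, 4))
```

A tuple works as a dictionary key because it is immutable, so the lookup reads much like "row df, column k" without having to convert labels to positions first.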
I know there are a few improvements, but not sure how to do them. I've
only been programming Python for two months (I'm also just as new to
statistics). I will make the changes Vincent pointed out, and I will
add an explanatory portion like you mentioned to this as well so we
can have some clear cut documentation.
Thank you for the idea on the csv file. I'll try that out, as well as
try and understand larry.
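For the csv idea, a rough sketch of how the critical values could be read from a file into a lookup keyed by (df, k). The layout (first column df, remaining header cells the group counts), the file name in the comment, and the function name load_q_table are all assumptions for illustration; the text is read from an in-memory string so the example runs on its own.

```python
import csv
import io

# Stand-in for a file on disk. With a real file, something like
# open("qtable.csv", newline="") would be passed instead (hypothetical name).
CSV_TEXT = """df,2,3,4
10,3.151,3.877,4.327
20,2.950,3.578,3.958
"""

def load_q_table(handle):
    """Read a df-by-k table and return a dict keyed by (df, k)."""
    reader = csv.reader(handle)
    header = next(reader)
    ks = [int(cell) for cell in header[1:]]   # column labels: number of groups
    table = {}
    for row in reader:
        if not row:                            # skip blank lines
            continue
        df = int(row[0])                       # row label: degrees of freedom
        for k, value in zip(ks, row[1:]):
            table[(df, k)] = float(value)
    return table

q_table = load_q_table(io.StringIO(CSV_TEXT))
print(q_table[(10, 3)])
```

Keeping the values in a csv file means the table can be extended or corrected without touching the code.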
"My impression from the quick reading of the Wikipedia page, was that
the test starts with the largest difference and works to the smaller
ones, which I thought you were doing. But, as I said, I didn't try to
understand all the details. "
I will double-check the wiki page; however, I will use the standard
equations that I find in "Biometry" by Sokal and "Biostatistical
Analysis" by Zar. They are fairly standard biostat publications.
"Welcome aboard and sorry about the delay in approving your initial
message. "
Thank you very much. I do hope to contribute as much as I can with my tight
schedule.
Vincent -
"Most of what I have done has been comparing individual probes which
is very difficult and not so fruitful. I agree that Python is a good
choice, as the data is difficult to deal with and get into the format
you need.
I do have a quantile normalization written in Python if you have need.
"
That sounds great, I'm sure I will need that in the near future. I have
been doing comparative genomics using microarray data. Basically data
dredging :). The only significant algorithm I wrote is the Tukey test.
The rest are just simple loops. However, if you or anyone needs help
with how I piece files from different sets of data, I'd be glad to
help.
"Is this a one time project use you have for this or are you trying to
build something to share?
If you are reading in large files, I found pickling a fast way to save
and open the data. "
A brief bit about myself. I just recently got my bachelor's in genomics,
and am now going to start my quantitative biology. Most of the
research work I was doing was wet lab, but I got pulled off to do some
informatics projects. I am not sure myself how long I will be doing
this, but I will contribute to the best of my ability.
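On the pickling suggestion quoted above, a minimal sketch of the round trip with the standard pickle module. The probe names, values and file name are made up for illustration; the idea is only that data parsed once from large text files can be saved and reopened without parsing again.

```python
import os
import pickle
import tempfile

# Made-up example of data assembled from several input files.
data = {
    "probe_001": [7.2, 7.5, 6.9],
    "probe_002": [3.1, 3.4, 2.8],
}

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "parsed_arrays.pkl")

# Save: pickle files must be opened in binary mode.
with open(path, "wb") as fh:
    pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)

# Reload later without re-reading the original files.
with open(path, "rb") as fh:
    restored = pickle.load(fh)

print(restored == data)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.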
I should have the Tukey test ready this weekend. I'll try and fix
it up so that it can take multiple samples, with different sample
lengths as well.
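For any number of samples with different lengths, one common form is the Tukey-Kramer version of the test. The sketch below is an assumption about how that could look, not the code discussed in this thread: the function name tukey_kramer_q and the sample data are invented, and it only computes the q statistic for each pair. Each q would then be compared with the critical value for (error df, number of groups) from the table.

```python
from itertools import combinations
from math import sqrt

def tukey_kramer_q(groups):
    """Return (df_error, k, {(i, j): q}) for every pair of groups.

    groups is a list of lists of numbers; the lists may differ in length.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [sum(g) / float(len(g)) for g in groups]
    df_error = sum(sizes) - k
    # Within-group (error) mean square from the one-way ANOVA.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = ss_within / df_error
    q = {}
    for i, j in combinations(range(k), 2):
        # Tukey-Kramer standard error for unequal sample sizes.
        se = sqrt(mse / 2.0 * (1.0 / sizes[i] + 1.0 / sizes[j]))
        q[(i, j)] = abs(means[i] - means[j]) / se
    return df_error, k, q

# Made-up data: three samples of different lengths.
samples = [[1, 2, 3], [2, 3, 4, 5], [6, 7, 8]]
df_error, k, q_values = tukey_kramer_q(samples)
for pair in sorted(q_values):
    print(pair, round(q_values[pair], 3))
```

With equal sample sizes n, the standard error reduces to sqrt(MSE / n), which is the usual equal-n Tukey form.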
Regards,
Vikas
> larry, http://larry.sourceforge.net/
An additional application and reference from the IRC channel:
[17:08] <bdesk> Hi, I want to do a kruskal wallis test in python so
that i can compare the ranks of groups.
[17:08] <bdesk> I found this
http://www.scipy.org/doc/api_docs/SciPy.stats.stats.html#kruskal
[17:08] <bdesk> but it looks like it just returns a statistic and
pvalue that says whether all the groups are the same as each other.
[17:09] == sophacles
[~soph...@99-20-142-25.lightspeed.dctril.sbcglobal.net] has quit
[Remote host closed the connection]
[17:09] <bdesk> whereas I want more detail, as shown at the bottom of
the output here http://www.texasoft.com/winkkrus.html
[17:10] <bdesk> Can I use scipy to do this, or do I have to use something else?
[17:14] <josefpktd> bdesk: in your reference do you mean the different
ranksums for each group or the "Tukey Multiple Comp. Difference" ?
[17:15] <bdesk> I mean whichever part I would need to infer the little
ascii bars at the bottom
[17:18] <bdesk> josefpktd: so yes, i want the whole table that has the
"Tukey Multiple Comp. Difference" column as well as the Q and critical
q (.05).
[17:19] <josefpktd> bdesk: "graphical representation of the Tukey
multiple comparisons test", you could do the pairwise tests in a loop
of all pairs, but I think Tukey does a correction of the p-values for
multiple comparison
Josef
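A rough sketch of the "loop over all pairs" idea for the Kruskal-Wallis case. To be clear about what is swapped in: this is not the Tukey procedure from the linked output, but a Dunn-style comparison of mean ranks with a Bonferroni correction, without a tie correction in the variance. The function names and the data are made up for illustration, and only the standard library is used.

```python
from itertools import combinations
from math import erfc, sqrt

def average_ranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0          # average of ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = avg
        pos = end + 1
    return ranks

def pairwise_rank_comparisons(groups):
    """Return (mean_ranks, [(i, j, z, adjusted_p), ...]) over all pairs."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)              # ranks over the pooled data
    n_total = len(pooled)
    mean_ranks, start = [], 0
    for g in groups:
        mean_ranks.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)
    pairs = list(combinations(range(len(groups)), 2))
    results = []
    for i, j in pairs:
        se = sqrt(n_total * (n_total + 1) / 12.0
                  * (1.0 / len(groups[i]) + 1.0 / len(groups[j])))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = erfc(abs(z) / sqrt(2.0))           # two-sided normal p-value
        # Bonferroni: multiply by the number of comparisons, cap at 1.
        results.append((i, j, z, min(1.0, p * len(pairs))))
    return mean_ranks, results

# Made-up data for three groups.
mean_ranks, results = pairwise_rank_comparisons([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(mean_ranks)
for i, j, z, p_adj in results:
    print(i, j, round(z, 3), round(p_adj, 4))
```

The mean ranks correspond to the per-group rank summary asked about in the log, and the adjusted p-values address the multiple-comparison concern, although Bonferroni is generally more conservative than a Tukey-type correction.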