Running PF2

chris blair

Jan 15, 2017, 6:36:36 PM

to PartitionFinder

Hi Rob,

I am trying to run PartitionFinder2 on OSX and am getting some errors regarding python dependencies.

Apparently I do have anaconda and python 2.7. Any thoughts? Thanks

Chris

Christophers-MacBook-Pro-2:partitionfinder-2.1.1 christopherblair$ python --version

Python 2.7.12 :: Continuum Analytics, Inc.

Christophers-MacBook-Pro-2:partitionfinder-2.1.1 christopherblair$ anaconda --version

anaconda Command line client (version 1.4.0)

Rob Lanfear

Jan 15, 2017, 6:40:40 PM

to PartitionFinder

Hi Chris,

This might be caused by a few things. First things first, can you follow these steps:

1. Type 'python' at the command line, and copy and paste what you get.

2. Copy and paste the errors about dependencies that you are getting.

Cheers,

Rob

chris blair

Jan 15, 2017, 8:37:43 PM

to PartitionFinder

Hi Rob,

I figured it out. I think it was because I had Miniconda2 installed and that is where python was coming from (thus lacking some of the dependencies). I did have a version of Anaconda as well. I just uninstalled both and reinstalled Anaconda.

On a related note, do you have any thoughts or recommendations for analyzing a data set containing >3000 data blocks? I'm running PartitionFinder with the rcluster defaults now, but am receiving a warning message stating that I may want to try something different (10,769,949 subsets).

Chris

Rob Lanfear

Jan 15, 2017, 9:40:12 PM

to PartitionFinder

Hi Chris,

Thanks for updating us. It's useful to have this here, because I'm sure this will be a common problem.

In case there are others who are stuck on this issue, here are three potential solutions if you are having dependency problems:

1. Do as Chris did and uninstall any anaconda/miniconda installations, then re-install the correct version of anaconda

2. Use a Conda python environment as described here: https://groups.google.com/forum/#!topic/partitionfinder/zQDYzrFf0Bw

3. Install your dependencies by hand, like this: conda install numpy pandas pytables pyparsing scipy scikit-learn
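
Whichever option you pick, it can help to confirm which python is actually running and whether it can see the dependencies. The following is a minimal sketch, not part of PartitionFinder: the function name find_missing and the module list are assumptions based on the conda command above (the conda package pytables is imported as "tables", and scikit-learn as "sklearn").

```python
from __future__ import print_function

import importlib
import sys

# Import names assumed from the conda command above. These are illustrative;
# check the PartitionFinder manual for the authoritative dependency list.
REQUIRED_MODULES = ["numpy", "pandas", "tables", "pyparsing", "scipy", "sklearn"]


def find_missing(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # If this path points at a Miniconda install rather than Anaconda,
    # that may explain why dependencies appear to be missing.
    print("Python executable:", sys.executable)
    print("Python version:", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules can be imported.")
```

Run it with the same 'python' command you use to launch PartitionFinder, so that it reports on the same interpreter.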

Cheers,

Rob

P.S. To answer your question, Chris, my advice for a really large dataset is:

1. Do as you are, trying rcluster with default settings first.

2. Look at the improvements in the AICc score: as the algorithm progresses, these will tend closer and closer to zero. Most datasets have a long tail of small improvements, and you can use the changes to guesstimate how long it might take to finish.

If it's looking too slow, either of these will help. I'd try them one at a time first:

1. Try reducing --rcluster-max (see the manual)

2. Try using the rclusterf algorithm (search = rclusterf; see the manual for details).

Cheers,

Rob

chris blair

Jan 17, 2017, 9:37:45 PM

to PartitionFinder

Hi Rob,

Thanks for the suggestions. The original analysis with defaults is still running. It's been going for a few days now. I'm trying to wait it out....

I am a bit confused regarding the --rcluster-max option. The manual states that the value for this will be the larger of either N (e.g. the default of 1000) or 10 times the number of data blocks. Thus, if my data set has 3000 blocks, wouldn't changing N to 100 make no difference?

Chris

Rob Lanfear

Jan 18, 2017, 5:15:08 AM

to PartitionFinder

Hi Chris,

The default is not 1000. It's the larger of 1000 or 10x the number of datablocks. Probably I should just have written this as '10x the number of datablocks with a minimum of 1000'.

If you change it at the command line, it gets set to exactly that value. You can make it whatever number you like. E.g. if you set it to 1, then the rcluster algorithm will consider just a single scheme at each step (not advisable, but you get the point).
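
The rule described above can be sketched in a few lines. This is only an illustration of the behaviour as stated in this thread, not PartitionFinder's actual code; the function name effective_rcluster_max and its arguments are hypothetical.

```python
def effective_rcluster_max(n_data_blocks, user_value=None):
    """Illustrate how the --rcluster-max value is chosen, per this thread.

    n_data_blocks: number of data blocks in the dataset.
    user_value: the number passed to --rcluster-max, or None if not given.
    """
    if user_value is not None:
        # A value given at the command line is used as-is.
        return user_value
    # Default: 10x the number of data blocks, with a minimum of 1000.
    return max(1000, 10 * n_data_blocks)


# With 3000 data blocks and no option given, the default is 30000.
print(effective_rcluster_max(3000))
# Passing 100 at the command line overrides the default entirely.
print(effective_rcluster_max(3000, user_value=100))
```

So for a dataset with 3000 data blocks, setting the option to 100 does make a difference: it replaces the default of 30000 rather than being compared against it.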

Cheers,

Rob

--

Rob Lanfear
Ecology, Evolution, and Genetics,

The Australian National University,

Canberra

www.robertlanfear.com