Hello everyone,
I keep running out of memory while bootstrapping with bootsnaq in PhyloNetworks. Could anyone please look at my code and tell me how to allocate memory better, or how to set up a loop that breaks my data into more manageable pieces? I’ve tried checkpointing and smaller batch sizes for the bootstrap runs, but without success. I’ve also attached screenshots of the memory available on the computer I will be running this on.
Thanks for any help,
Nick
#bootstraps
#place all bootstrap files from RAxML in a folder called bootstraps and work in that with command line, not julia, to make boots.txt
ls /mnt/Symsym/nick/phylonetworks/spruceup.phylonetworks.1hybrid.no.outgroup/bootstraps/RAxML_bootstrap.Locus_* > boots.txt
cp boots.txt ../ #copy it into the working directory that julia is in
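Since readBootstrapTrees reads every file named in boots.txt, it may be worth confirming that each listed path is readable before starting Julia. This is a minimal sketch only; `check_tree_list` is a hypothetical helper name, not part of RAxML or PhyloNetworks.

```shell
# Hypothetical helper: report entries in a list file that are not readable files.
# Prints the number of missing entries and returns non-zero if any are missing.
check_tree_list() {
    list_file="$1"
    missing=0
    # The second test keeps a final line that lacks a trailing newline.
    while IFS= read -r tree_file || [ -n "$tree_file" ]; do
        # Skip blank lines
        [ -z "$tree_file" ] && continue
        if [ ! -r "$tree_file" ]; then
            echo "missing: $tree_file" >&2
            missing=$((missing + 1))
        fi
    done < "$list_file"
    echo "$missing"
    [ "$missing" -eq 0 ]
}
```

Running `check_tree_list boots.txt` from the directory that holds boots.txt prints the number of unreadable entries and lists each of them on standard error.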
# now in julia do the following
using PhyloNetworks
# read the bootstrap trees (boots.txt lists one RAxML bootstrap file per locus):
bootTrees = readBootstrapTrees("boots.txt")
### Julia crashed mid-bootstrapping, so I need to reload net1
# Load the network from the output file
net1 = readTopology("net1.out")
# Verify the loaded network
println(net1)
# Continue with further steps such as bootstrapping or plotting
## # Add 4 processors to speed things up # I think on the google group I saw that bootstrapping only ever uses 1 processor anyways, so skip this.
## using Distributed
## addprocs(4) # Adjust the number of processors as needed
## @everywhere using PhyloNetworks
## bootnet = bootsnaq(net1, bootTrees, hmax=5, nrep=100, runs=10, filename="bootsnaq1_raxmlboot")
## bootnet = bootsnaq(net1, bootTrees, hmax=5, nrep=100, runs=10, filename="bootsnaq1_raxmlboot") # this should have worked but I ran out of memory.
########## The first runs, using 4 processors and then 1, were killed, so I tried the following code to checkpoint progress and free up memory, but I still ran out of memory.
# using Distributed # this and the next 2 lines add processors, but I ran out of memory, so I am skipping them and using only 1 processor
# addprocs(4) # Adjust the number of processors to 4
# @everywhere using PhyloNetworks
using JLD2 # For saving and loading checkpoints
using Serialization # Alternative if you prefer binary serialization
# Define the number of bootstrap replicates per batch (checkpoint)
nrep_total = 100
batch_size = 1 # Save every 1 replicate
hmax = 5
runs = 10
filename = "bootsnaq1_raxmlboot"
# Define checkpoint file
checkpoint_file = "bootsnaq_checkpoint.jld2"
# Load previous checkpoint if it exists
if isfile(checkpoint_file)
    @info "Loading previous checkpoint from $checkpoint_file"
    bootnet, completed_reps = JLD2.load(checkpoint_file, "bootnet", "completed_reps")
else
    @info "Starting fresh bootstrap estimation"
    bootnet = HybridNetwork[] # bootsnaq returns a vector of networks, so start with an empty vector
    completed_reps = 0
end
# Run the remaining batches
while completed_reps < nrep_total
    nrep_current = min(batch_size, nrep_total - completed_reps)
    # Run bootsnaq for the current batch; a per-batch file name keeps each batch's output files separate
    batch_nets = bootsnaq(net1, bootTrees, hmax=hmax, nrep=nrep_current, runs=runs,
                          filename=string(filename, "_", completed_reps + 1))
    # Keep the networks from every batch, not just the last one
    append!(bootnet, batch_nets)
    # Update the number of completed replicates
    global completed_reps += nrep_current # `global` is needed when this runs as a script rather than in the REPL
    # Save checkpoint, passing the variables as a Dict of name => value
    @info "Saving checkpoint after $completed_reps replicates"
    JLD2.save(checkpoint_file, Dict("bootnet" => bootnet, "completed_reps" => completed_reps))
    # Trigger garbage collection
    GC.gc() # Run garbage collection to free up memory
end
@info "Bootstrap estimation complete."
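Memory held by a process is returned to the operating system when that process exits, so one option is to run each batch in a fresh process instead of looping inside a single Julia session. The sketch below only illustrates that idea and is not a tested fix for this problem. `run_batches`, `BATCH_CMD`, and the `.done` marker files are hypothetical names, and `BATCH_CMD` stands for whatever command runs a single batch.

```shell
# Hypothetical driver: run each batch in its own process, skipping finished ones.
# BATCH_CMD is an assumption: it holds the command that runs one batch and
# receives the batch number as its last argument.
run_batches() {
    n_batches="$1"
    done_dir="$2"
    mkdir -p "$done_dir"
    i=1
    while [ "$i" -le "$n_batches" ]; do
        marker="$done_dir/batch_$i.done"
        if [ ! -e "$marker" ]; then
            # Left unquoted on purpose so that a command with arguments is split into words.
            $BATCH_CMD "$i" || return 1
            # Record success so that a restart skips this batch.
            touch "$marker"
        fi
        i=$((i + 1))
    done
}
```

For example, `BATCH_CMD="julia run_one_batch.jl" run_batches 100 done_markers` would start 100 Julia sessions one after another, where `run_one_batch.jl` is a hypothetical script (not shown) that runs one batch and saves its result. If a batch fails, the loop stops, and rerunning the same command resumes at the first batch without a marker file.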