A small bash script to concatenate phyloxml trees

Nicolas Rochette

Aug 18, 2011, 9:41:40 AM
to phyloXML
Hello all,

I have written a little script to concatenate a series of phyloXML
trees, and I thought, why not share it?

Regards,

Nicolas Rochette

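Individual trees are expected to be written in the following form,
which I think is the only (or at least the main) one. The script is
line-based: it takes the first two lines as the file header and the
third line as the opening <phylogeny> tag.
<?xml ... >
<phyloxml ... >
<phylogeny ... >
...
</phylogeny>
</phyloxml>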
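
Because the script depends on this line layout, it can help to check the input files before joining them. The following is only a minimal sketch, not part of the original script: `check_layout` is a hypothetical helper name, and it assumes that matching the start of each of the first three lines is a good enough test.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script above): succeeds if the
# first three lines of a file start with the expected opening tags.
check_layout() {
    awk '
        NR == 1 && $0 !~ /^[[:space:]]*<[?]xml/     { bad = 1 }
        NR == 2 && $0 !~ /^[[:space:]]*<phyloxml/   { bad = 1 }
        NR == 3 && $0 !~ /^[[:space:]]*<phylogeny/  { bad = 1 }
        NR == 3 { exit }                  # nothing to check past line 3
        END { exit (bad || NR < 3) }      # non-zero if a tag is missing
    ' "$1"
}

# Report every file given on the command line that deviates.
for f in "$@"; do
    check_layout "$f" || printf "Unexpected layout in '%s'\n" "$f" >&2
done
```

Files that are reported would need to be reformatted (or joined with an XML-aware tool) before the line-based approach can be trusted.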

Usage:

Joins several phyloxml trees into a single file.
Usage:
exe* -o PHYLOXML_OUT [PHYLOXML_TREE ...]
In the output, trees are named after the basename of their original
file.

And the code:

#!/bin/bash

usage=\
'Joins several phyloxml trees into a single file.
Usage:
exe* -o PHYLOXML_OUT [PHYLOXML_TREE ...]
In the output, trees are named after the basename of their original
file.
'

# Process arguments
out=''
trees=()
if [[ "$1" == "-o" ]] && [[ -n "$2" ]]; then
    # The subshell tests whether the output file can be created
    if [[ ! -e "$2" ]] && (>"$2"); then
        out="$2"
        rm "$out"
        # Keep everything after '-o PHYLOXML_OUT'; quoting preserves
        # file names that contain spaces
        trees=("$@")
        trees=("${trees[@]:2}")
        if [[ "${#trees[@]}" -eq 0 ]]; then
            printf '%s' "$usage"
            exit 1
        fi
    elif [[ -e "$2" ]]; then
        printf "Output '%s' already exists. Abort.\n" "$2"
        exit 1
    else
        printf "Could not write to output '%s'. Abort.\n" "$2"
        exit 1
    fi
else
    printf '%s' "$usage"
    exit 1
fi

# Join trees to '$out'
for tree in "${trees[@]}"; do
    if [[ ! -e "$tree" ]]; then
        printf "Warning! Tree '%s' does not exist.\n" "$tree"
        continue
    fi
    # '$out' is created at the first existing tree, with that tree's
    # two header lines (<?xml ...> and <phyloxml ...>)
    [[ ! -e "$out" ]] && head -n2 "$tree" > "$out"
    # Line 3 is the opening <phylogeny ...> tag
    awk 'NR==3' "$tree" >> "$out"
    name=$(basename "$tree")
    printf ' <name>%s</name>\n' "$name" >> "$out"
    # Rest of the tree, minus the closing </phyloxml>
    awk 'NR>3' "$tree" | grep -v '</phyloxml>' >> "$out"
done
# Conditional, in case none of the trees exist
[[ -e "$out" ]] && printf '</phyloxml>\n' >> "$out"
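
One way to sanity-check the joined file is to count the opening <phylogeny> tags and compare that with the number of input files that exist. This is only a sketch under the same assumption as the script, namely that each opening tag sits on its own line. `count_trees` is a hypothetical helper name, and the argument order (joined file first, then the inputs) is chosen just for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints the number of lines that start with an
# opening <phylogeny ...> tag.
count_trees() {
    grep -c '^[[:space:]]*<phylogeny' "$1"
}

# Usage sketch: first argument is the joined file, the rest are the
# input trees that were passed to the join script.
if [ "$#" -ge 2 ]; then
    joined="$1"
    shift
    expected=0
    for tree in "$@"; do
        # Missing inputs are skipped by the join script, so skip them here
        [ -e "$tree" ] && expected=$((expected + 1))
    done
    found=$(count_trees "$joined")
    if [ "$found" -eq "$expected" ]; then
        printf 'OK: %s trees in %s\n' "$found" "$joined"
    else
        printf 'Mismatch: %s trees found, %s expected\n' "$found" "$expected" >&2
    fi
fi
```

A matching count does not prove that the XML is well-formed; it only shows that no tree was dropped or duplicated by the line-based join.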

Christian Zmasek

Aug 18, 2011, 10:56:44 PM
to phyl...@googlegroups.com
Thank you for posting this script!

-Christian-