Hi Stephane,
I recognize the general utility of a validation tool. However, the tools themselves already identify the exact cause of many common problems. We may consider something like this for a future release, but it is fairly low priority given our current development goals.
Thanks for the suggestion,
- Aaron
quinlanlab.org
On Nov 16, 2012, at 4:51 AM, Stéphane Plaisance <stephane.plaisa...@vib.be> wrote:
> Hi Aaron,
> Would it be possible to add such a command to bedtools, to diagnose all possible erroneous records and write them to a file or to the screen?
> A curative mode would be even nicer, cleaning the file while also saving the lines that trigger errors.
> From time to time I get errors while running bedtools commands that relate to wrong delimiters (not anymore ;-) ) or 'end less than start'.
> I made a primitive and quite inefficient bash script to isolate the latter after liftover generated lines with start=0.
> Thanks
> Stephane
> ###
> checkbed ()
> {
> # prints valid records to stdout; records with start==0 or end<start go to <name>-errors.bed
> # does not check for other IFS than 'tab'
> if [[ $1 == *.bed.gz ]]; then
> name=$(basename "$1" ".bed.gz");
> zcat "$1" | gawk -v name="${name}" 'BEGIN{FS="\t"; OFS="\t"}
> {if ($2==0 || $3<$2) print $0 > (name "-errors.bed"); else print $0; }'
> else
> name=$(basename "$1" ".bed");
> gawk -v name="${name}" 'BEGIN{FS="\t"; OFS="\t"}
> {if ($2==0 || $3<$2) print $0 > (name "-errors.bed"); else print $0; }' "$1"
> fi
> }
> fi
> }
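
For readers who want the "diagnose plus clean" behaviour described above, here is a minimal sketch of one way it could look as a stream filter. It is not a bedtools command: the function name `bed_split` and the reason strings are hypothetical, and it assumes tab-delimited input and an awk that accepts "/dev/stderr" as an output target (gawk and mawk do). Unlike the script above, it does not reject start=0, since BED start coordinates are 0-based; add that test back if your pipeline needs it.

```shell
# Hypothetical helper (not part of bedtools): read BED records on stdin,
# print clean records to stdout, and print rejected records to stderr
# with the reason appended as an extra tab-separated column.
bed_split() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
    # pass comment and header lines through untouched
    /^(#|track|browser)/ { print; next }
    # a BED record needs at least chrom, start, end
    NF < 3 { print $0, "fewer than 3 columns" > "/dev/stderr"; next }
    # start and end must be non-negative integers
    $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print $0, "non-integer coordinate" > "/dev/stderr"; next }
    # force numeric comparison of end against start
    ($3 + 0) < ($2 + 0) { print $0, "end less than start" > "/dev/stderr"; next }
    { print }'
}
```

Possible usage, with illustrative file names: `zcat input.bed.gz | bed_split > clean.bed 2> errors.tsv`. Keeping the two streams separate means the clean output can be piped directly into a bedtools command while the rejected lines are saved for inspection.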