I have been estimating IBD estimates on every growing exome datasets (now up to >40K exomes) and I have seen very similar occurrence in the data. I have sorted things out and developed my own custom pipeline to clean both individuals and SNPs from the dataset. Depending on how your dataset was collected, it is not entirely unreasonable to find several individuals with five 3rd degree relatives. If those IBD estimates look clean, and the samples are not flagged/removed for other reasons, I would be included to believe those relationships, especially if they are closer than 3rd degree and the Z2 values is close to 0 (except for full-sibs). You may try using PRIMUS (
https://primus.gs.washington.edu/primusweb) to see if you can reconstruct those relationships into pedigrees. I would be very interested to see the Z0 vs Z1 plot as it should look something like the IBD0 vs IBD1 plot (Figure S5) in the PRIMUS paper (
http://www.cell.com/ajhg/abstract/S0002-9297(14)00427-3).
In my dataset, I too found several individuals who are 3rd degree or closer relatives to 10s of thousands of others in the dataset, which is not possible. I often find that these samples are problematics and should be removed. One common problem is that they often have high heterozygosity rates compare to the other samples in the dataset. PLINK's --het option will give you the het rates and you can make a histogram of the output in R. Inflated het rates are often due to some form of contamination. Another reason I have found overly related samples is high SNP missingness in those samples. I would recommend removing samples with >10% missing calls (--mind 0.1), but make sure to run this option after you have ran the --geno. Other filters I have found useful are --het 0.1, --geno 0.05, --hwe 0.000001. Many people recommend removing SNPs in LD, but I often don't and I haven't seen any problems yet. I find that with exomes, the more good SNPs you can keep in the dataset, the better of you will be.
I also calculate reference allele frequencies for the population by using an unrelated subset of the data. I have found that the GCTA genetic relationship matrix calculations from PLINK are much less accurate at predicting relationships than my cleaned IBD estimates. I calculate naive IBD estimates, remove individuals with >100 relatives with pi_hat > 0.1875, prune the dataset to an unrelated set using the IBD estimates, then use the unrelated set to obtain reference allele frequencies (using the --freq option in PLINK). I then calculate IBD estimates by reading in the reference allele frequencies with the --read-freq option in PLINK. If the population appears to be admixed, then I also remove ancestry informative markers.
Much of this pipeline has been built out and automated in the public version of PRIMUS (call prePRIMUS or PRIMUS IBD pipeline), but you may find it easier and/or better to do the filtering on your own.