Hi Ronaldo,
Yes, there are a large number of possibilities there!
I'd suggest that:
1. For this application, you would be best off going directly to IDTxl, which wraps higher-level algorithms around JIDT for this purpose. For a given target, it incrementally learns the multivariate set of sources (the parent set), optimising the lag for each along the way. The final conditional TE it returns for each source is the TE from that source to the target, conditioned on the rest of the parent set (i.e. excluding that particular source).
2. Otherwise, without first learning the parent set, it is not well defined what you want to (or should) do. You could compute the conditional TE from every source to the target, conditioned on every other source. But, as you say, the question is then how to set the lag for each conditional. You could set them in a pairwise fashion: find the lag that maximises the TE from the given conditional source to the target, and then use that lag for that conditional in the calculation for every other source to that target. That probably makes the most sense. However, it still ignores the possibility that the best lag for a source could change once you consider it in conjunction with the other sources.
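To make option 2 a bit more concrete, here is a rough sketch of the pairwise lag selection followed by the conditional TE calculation. Please treat it as an illustration only: it is plain Python rather than JIDT or IDTxl code, it swaps in a simple linear-Gaussian estimator (conditional MI from partial correlations), it assumes a target history length of 1, and all function names (lagged, best_pairwise_lag, conditional_te_all, make_toy_data) are made up for this example.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, conds):
    """Partial correlation of a and b given the series in conds (recursive formula)."""
    if not conds:
        return pearson(a, b)
    z, rest = conds[0], conds[1:]
    r_ab = partial_corr(a, b, rest)
    r_az = partial_corr(a, z, rest)
    r_bz = partial_corr(b, z, rest)
    return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))

def gaussian_cmi(a, b, conds):
    """I(a; b | conds) in nats, assuming jointly Gaussian scalar variables."""
    r = partial_corr(a, b, conds)
    return -0.5 * math.log(1 - r * r)

def lagged(series, lag, max_lag):
    """Values series[t - lag] for t = max_lag .. len(series) - 1."""
    return series[max_lag - lag: len(series) - lag]

def best_pairwise_lag(source, target, max_lag):
    """Lag in 1..max_lag maximising pairwise TE(source -> target), history length 1."""
    y_now = lagged(target, 0, max_lag)
    y_past = lagged(target, 1, max_lag)
    te = {u: gaussian_cmi(lagged(source, u, max_lag), y_now, [y_past])
          for u in range(1, max_lag + 1)}
    return max(te, key=te.get)

def conditional_te_all(sources, target, max_lag):
    """Fix each source's lag pairwise, then condition each TE on all other sources."""
    lags = [best_pairwise_lag(s, target, max_lag) for s in sources]
    y_now = lagged(target, 0, max_lag)
    y_past = lagged(target, 1, max_lag)
    aligned = [lagged(s, u, max_lag) for s, u in zip(sources, lags)]
    cond_te = []
    for i, x in enumerate(aligned):
        others = aligned[:i] + aligned[i + 1:]
        cond_te.append(gaussian_cmi(x, y_now, [y_past] + others))
    return lags, cond_te

def make_toy_data(n, seed=0):
    """Toy system: x0 drives y at lag 2, x1 at lag 3, x2 is unrelated noise."""
    rng = random.Random(seed)
    x0 = [rng.gauss(0, 1) for _ in range(n)]
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0] * n
    for t in range(3, n):
        y[t] = 0.5 * y[t - 1] + 0.8 * x0[t - 2] + 0.6 * x1[t - 3] + rng.gauss(0, 1)
    return [x0, x1, x2], y

if __name__ == "__main__":
    sources, target = make_toy_data(3000)
    lags, cond_te = conditional_te_all(sources, target, max_lag=5)
    for i, (u, te) in enumerate(zip(lags, cond_te)):
        print(f"source {i}: pairwise lag {u}, conditional TE {te:.3f} nats")
```

Note that each lag here is chosen in isolation, so the sketch has exactly the limitation mentioned above: it never revisits a lag once the other sources are taken into account. With real data you would replace gaussian_cmi with a proper conditional TE estimator and also check statistical significance.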
Hope that helps,