--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/6650a324-4d10-40fc-809e-f8c5232f618a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Nishant,
The first pass is not related to targetPartitionSize; it comes from getting the valid intervals. The targetPartitionSize-related pass is an extra pass that I am already avoiding by using targetPartitionSize=-1. I have highlighted the 2 calls below. getDataIntervals does a full pass on the data.

final GranularitySpec granularitySpec = ingestionSchema.getDataSchema().getGranularitySpec();
final int targetPartitionSize = ingestionSchema.getTuningConfig().getTargetPartitionSize();
final TaskLock myLock = Iterables.getOnlyElement(getTaskLocks(toolbox));
final Set<DataSegment> segments = Sets.newHashSet();
final Set<Interval> validIntervals = Sets.intersection(granularitySpec.bucketIntervals().get(), getDataIntervals());
if (validIntervals.isEmpty()) {
  throw new ISE("No valid data intervals found. Check your configs!");
}
for (final Interval bucket : validIntervals) {
  final List<ShardSpec> shardSpecs;
  if (targetPartitionSize > 0) {
    shardSpecs = determinePartitions(bucket, targetPartitionSize, granularitySpec.getQueryGranularity());
  } else {
    int numShards = ingestionSchema.getTuningConfig().getNumShards();
    if (numShards > 0) {
      shardSpecs = Lists.newArrayList();
      for (int i = 0; i < numShards; i++) {
        shardSpecs.add(new HashBasedNumberedShardSpec(i, numShards, jsonMapper));
      }
    } else {
      shardSpecs = ImmutableList.<ShardSpec>of(new NoneShardSpec());
    }
  }

Thanks,
Sandeep
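
Read on its own, the branch in the excerpt picks one of three sharding strategies, and only the first one depends on targetPartitionSize being positive. The following is a minimal sketch of that decision with no Druid dependency. The class name ShardingChoice, the method shardSpecsFor, and the string labels are hypothetical and exist only for illustration; they stand in for determinePartitions, HashBasedNumberedShardSpec, and NoneShardSpec in the excerpt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the branch in the excerpt; this is not Druid code.
class ShardingChoice {

    // Returns string labels instead of ShardSpec objects so the sketch is self-contained.
    static List<String> shardSpecsFor(int targetPartitionSize, int numShards) {
        List<String> specs = new ArrayList<>();
        if (targetPartitionSize > 0) {
            // The excerpt calls determinePartitions(...) here; per the thread,
            // this is the extra pass that targetPartitionSize=-1 avoids.
            specs.add("determinePartitions");
        } else if (numShards > 0) {
            // One hash-based shard per partition number, as in the excerpt's inner loop.
            for (int i = 0; i < numShards; i++) {
                specs.add("hashed " + i + "/" + numShards);
            }
        } else {
            // Neither setting is positive: a single unsharded segment.
            specs.add("none");
        }
        return specs;
    }

    public static void main(String[] args) {
        System.out.println(shardSpecsFor(-1, 0));      // prints [none]
        System.out.println(shardSpecsFor(-1, 2));      // prints [hashed 0/2, hashed 1/2]
        System.out.println(shardSpecsFor(5000000, 0)); // prints [determinePartitions]
    }
}
```

Note that nothing in this branch affects the earlier getDataIntervals() call, which is consistent with the observation above that the first pass remains even with targetPartitionSize=-1.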
On Mar 3, 2016, at 10:22 AM, Nishant Bangarwa <nishant.bangarwa@metamarkets.com> wrote:
IndexTask does the first pass to determine the number of shards. If you already know that the data is small enough and will never be sharded, you can set targetPartitionSize in tuningConfig to -1. This will avoid the first pass.
On Wed, Mar 2, 2016 at 1:52 AM <sbho...@netskope.com> wrote:
Hello,
I have noticed that the indexing task reads the complete dataset 2 times when creating a non-partitioned segment:
- 1st time to get the data intervals
- 2nd time to actually create the segment
Is there a way to reduce this to a single pass?
Thanks,
Sandeep
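
To illustrate why getting the data intervals costs a full read, here is a minimal sketch of that kind of scan. It assumes day-sized buckets in UTC and rows reduced to epoch-millisecond timestamps. The names DataIntervalScan, dataDays, and validDays are hypothetical and are not Druid APIs; the sketch only mirrors the idea of intersecting the configured buckets with the buckets that actually contain data.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; none of these names are Druid classes.
class DataIntervalScan {

    // Full pass over the rows: record the UTC day each timestamp falls in.
    static Set<LocalDate> dataDays(List<Long> timestampsMillis) {
        Set<LocalDate> days = new TreeSet<>();
        for (long ts : timestampsMillis) {
            days.add(Instant.ofEpochMilli(ts).atZone(ZoneOffset.UTC).toLocalDate());
        }
        return days;
    }

    // Keep only the configured day buckets that actually contain data.
    static Set<LocalDate> validDays(Set<LocalDate> configuredDays, List<Long> timestampsMillis) {
        Set<LocalDate> valid = new TreeSet<>(configuredDays);
        valid.retainAll(dataDays(timestampsMillis));
        return valid;
    }

    public static void main(String[] args) {
        Set<LocalDate> configured = Set.of(
                LocalDate.of(2016, 3, 1), LocalDate.of(2016, 3, 2), LocalDate.of(2016, 3, 3));
        List<Long> rows = List.of(
                Instant.parse("2016-03-01T10:00:00Z").toEpochMilli(),
                Instant.parse("2016-03-01T23:59:59Z").toEpochMilli(),
                Instant.parse("2016-03-03T00:00:00Z").toEpochMilli());
        System.out.println(validDays(configured, rows)); // prints [2016-03-01, 2016-03-03]
    }
}
```

Because every row has to be inspected before the set of non-empty buckets is known, this step is a pass over the data in its own right, separate from the pass that builds the segment.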
HadoopDruidIndexer runs Hadoop jobs in order to separate and index data segments. It takes advantage of Hadoop as a job scheduling and distributed job execution platform. It is a simple method if you already have Hadoop running and don’t want to spend the time configuring and deploying the Indexing service just yet.