How to control jobs on a NiFi cluster / How is it supposed to work?

Harry Swijnenburg

Oct 23, 2017, 6:30:50 AM
to Kylo Community
Good morning,

Currently I have Kylo 0.8.3 running on one node (kylo1) and NiFi 1.3.0 running on two separate nodes (nifi1 and nifi2) in a cluster.
Both NiFi nodes report to ActiveMQ.

The feed is created and scheduled on both NiFi nodes, which causes two jobs to start running.

On nifi1 the job runs correctly, but on nifi2 it fails unexpectedly.

2017-10-23 11:10:21 INFO  http-nio-8420-exec-10:NiFiTemplateCache:121 - Returning Cached NiFi template a29f0df2-6003-4d4a-afe9-15f63c6bead7, GeneriekePipeline_v02
2017-10-23 11:10:21 INFO  http-nio-8420-exec-10:RegisteredTemplateService:457 - Merging properties for template GeneriekePipeline_v02 (4351ebd7-1f54-499c-834f-5ea7d6913a06)
2017-10-23 11:10:43 INFO  http-nio-8420-exec-8:NiFiTemplateCache:121 - Returning Cached NiFi template a29f0df2-6003-4d4a-afe9-15f63c6bead7, GeneriekePipeline_v02
2017-10-23 11:10:44 INFO  http-nio-8420-exec-8:TemplateCreationHelper:622 - Versioning Process Group dev_nds_v2 
2017-10-23 11:10:44 INFO  http-nio-8420-exec-8:TemplateCreationHelper:625 - Disabled Inputs for dev_nds_v2 
2017-10-23 11:10:44 INFO  http-nio-8420-exec-8:TemplateCreationHelper:629 - Stopped Input Ports for dev_nds_v2, 
2017-10-23 11:10:44 INFO  http-nio-8420-exec-8:TemplateCreationHelper:648 - Renamed ProcessGroup to  dev_nds_v2 - 1508749844614, 
2017-10-23 11:10:49 INFO  http-nio-8420-exec-8:AbstractNiFiProcessorsRestClient:59 - About to update the schedule for processor: 08d039ec-e899-32bb-a217-0494aabb4116 to be CRON_DRIVEN with a value of: 0 15 11 1/1 * ? * 


2017-10-23 11:15:00 INFO  DefaultMessageListenerContainer-1:ProvenanceEventReceiver:144 - About to process batch: b8a479c6-9065-4801-9360-1209b5b6c8b3,  2 events from the thinkbig.feed-manager queue 
2017-10-23 11:15:01 INFO  DefaultMessageListenerContainer-1:JpaBatchJobExecutionProvider:510 - Created new Job Execution with id of 24 and starting event ProvenanceEventRecordDTO{eventId=124, processorName=setEnvironment, componentId=4ac91a88-ef6a-33fc-bbc0-27b0c9202265, flowFile=400dcafa-ecd9-488d-a011-c9a45afab8ee, eventType=ATTRIBUTES_MODIFIED, eventDetails=null, isFinalJobEvent=false, feed=dev_pipelines.dev_nds_v2} 
2017-10-23 11:15:01 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution setEnvironment on Job: 24 using event 124 
2017-10-23 11:15:01 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution InitPipeline on Job: 24 using event 123 
2017-10-23 11:15:01 INFO  DefaultMessageListenerContainer-1:ProvenanceEventReceiver:144 - About to process batch: cce2016d-b239-4dc2-a329-103fbe39e83d,  2 events from the thinkbig.feed-manager queue 
2017-10-23 11:15:03 INFO  DefaultMessageListenerContainer-1:ProvenanceEventReceiver:144 - About to process batch: 349d77a7-4b6d-4950-9c9c-42aead39a8ad,  6 events from the thinkbig.feed-manager queue 
2017-10-23 11:15:04 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution nds_check_metadata on Job: 24 using event 125 
2017-10-23 11:15:04 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution checkSucces_2 on Job: 24 using event 130 
2017-10-23 11:15:04 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution kinit on Job: 24 using event 129 
2017-10-23 11:15:04 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution checkSucces_1 on Job: 24 using event 127 
2017-10-23 11:15:06 INFO  DefaultMessageListenerContainer-1:ProvenanceEventReceiver:144 - About to process batch: dc41feb0-c622-4356-b78d-8c9d0eb19310,  2 events from the thinkbig.feed-manager queue 
2017-10-23 11:15:06 INFO  DefaultMessageListenerContainer-1:JpaBatchJobExecutionProvider:357 - Finishing Job: 24 with a status of: COMPLETED for event: 132 
2017-10-23 11:15:07 INFO  DefaultMessageListenerContainer-1:JpaBatchStepExecutionProvider:141 - New Step Execution Job_Failed on Job: 24 using event 132 
2017-10-23 11:15:07 INFO  DefaultMessageListenerContainer-1:ProvenanceEventReceiver:144 - About to process batch: f2e72d35-fdbb-40ce-aa0f-a4b2e5eab669,  6 events from the thinkbig.feed-manager queue 
2017-10-23 11:15:25 INFO  DefaultMessageListenerContainer-1:ProvenanceEventReceiver:144 - About to process batch: 8a9beb7a-1ba9-439f-be9f-147ab0bd5f2e,  3 events from the thinkbig.feed-manager queue 
 

Question 1:

In the Kylo dashboard I see the feedback of only one flow, in this case the flow from nifi2 (the failed one). The feedback from nifi1 is not shown. Why?

Question 2:

a. Is it possible to schedule a feed on a specific NiFi node?
b. When creating a feed I found the option "Execution Node" on the Schedule tab, which offers me two options: "All nodes" and "Primary node". Are these NiFi nodes? And if so, where do I set the primary node?

Question 3:

Do I have to cluster Kylo as well? Because NiFi handles the scheduling and does all the work, I decided to cluster only NiFi and leave Kylo standalone.

Kind regards,

Harry
 
JobDetail_failed from nifi2.PNG

Harry Swijnenburg

Oct 23, 2017, 8:03:51 AM
to Kylo Community
When running again I see feedback from both servers (see attached picture). I didn't change anything.

dev_pipelines.dev_nds_v2 started twice, at 11:15 and 13:15. Only the run at 13:15 shows the correct feedback, while the 11:15 run shows feedback from only one server.


Question 1 seems to be answered, but the feedback is not always complete.



JobDetail_from 2 nifi servers returns.PNG

Scott Reisdorf

Oct 23, 2017, 8:32:02 AM
to Kylo Community
Question 1:
  When you cluster NiFi, ZooKeeper is used to manage/elect the Primary Node. Each node in the cluster will get an exact replica of the templates. Do you have it scheduled to run on All Nodes or just the Primary Node?
  Kylo should have received/finished the job that started at 13:20, as it just gets provenance data from NiFi (regardless of the node) via JMS. Can you share the logs from around that time?
  When you drill into the "Running" job at that time, does it have any step execution data?

Question 2:
  NiFi uses ZooKeeper to do this. See the Clustering Configuration section of the NiFi Administration Guide for more information:
  https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#clustering
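For reference, clustering is enabled per node in nifi.properties. The excerpt below is a minimal sketch for a NiFi 1.x node; the hostnames and ports are placeholders, not values from this thread. Note that you do not set the primary node yourself: ZooKeeper elects it automatically, and the "Primary node" execution option simply runs the processor on whichever node currently holds that role.

```properties
# nifi.properties (excerpt) - minimal clustering sketch; hostnames/ports are placeholders
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi1.example.com
nifi.cluster.node.protocol.port=11443

# ZooKeeper ensemble used for cluster coordination and primary-node election
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181

# Set to true only if running the embedded ZooKeeper instead of an external ensemble
nifi.state.management.embedded.zookeeper.start=false
```

With this in place, the "Execution Node" choice on the Kylo Schedule tab maps to NiFi's processor-level execution setting: "All nodes" runs the scheduled processor on every node in the cluster, while "Primary node" runs it only on the elected primary.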

Question 3:
 No, Kylo doesn't need to be clustered. Clustering Kylo is used for HA in case a Kylo node goes down; it is unrelated to NiFi clustering.


I also noticed your execution routed to a step called "Job_Failed". If this should have failed your job, you can make it show up as a "failure" in the Operations Manager by ensuring the NiFi connection leading to that processor has the word "failure" in its name.

