Simple pipeline with Conditional plugin


Mario Giuffrida

May 24, 2021, 8:56:31 AM
to cdap...@googlegroups.com, eng-squa...@liveramp.com
Hello CDAP team,

I am writing a simple pipeline using the Conditional plugin, but I am running into an exception.

The pipeline in question simply reads a CSV file from a GCP bucket, passes through the Conditional plugin (whose condition always evaluates to 'true'), and ends in the Trash step.

image.png

When running in Preview mode I see the following Exception:

2021-05-24 12:43:36,263 - ERROR [spark-submitter-phase-3-a26e99a3-bc8d-11eb-9543-5246c680631b:o.a.s.i.i.SparkHadoopWriter@94] - Aborting job job_202105241243361486595782720515159_0003.
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /data/preview/namespaces/default/data/conn-0/9738e131-bc8d-11eb-9728-5246c680631b/data
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:329) ~[hadoop-mapreduce-client-core-2.9.2.jar:na]

Please find attached the pipeline and the full error trace. Could you please help us troubleshoot this issue?

Thanks and regards
Mario
oc-test-conditional-cdap-data-pipeline.json
default-917e5b8b-bc8d-11eb-9bd2-02cab3f8a7db.log

Andrew Lisi

Sep 7, 2021, 1:22:55 AM
to CDAP User
Hi Mario,

Did you ever figure this out? I am running into the same issue.

Thanks,
Andy

Mario Giuffrida

Sep 7, 2021, 9:24:59 AM
to cdap...@googlegroups.com
Hi Andrew,

We changed our pipeline shortly after I asked this question, so the Conditional plugin was no longer needed.

Regards,
Mario

--
Mario Giuffrida
LiveRamp | Engineering

Andrew Lisi

Sep 7, 2021, 2:56:59 PM
to CDAP User
Ah ok, thank you for your response, Mario. I will continue testing, but so far it seems the Conditional plugin is broken.

Sagar Batchu

Sep 7, 2021, 3:55:54 PM
to cdap...@googlegroups.com
Hi Andrew,

What version of CDAP are you running?



--
Sagar Batchu 
LiveRamp | Engineering

Andrew Lisi

Sep 7, 2021, 5:53:47 PM
to CDAP User
I upgraded to 6.5 last week.

Baptiste Benet

Dec 10, 2021, 8:44:38 AM
to cdap...@googlegroups.com
Hello CDAP team,

I'm running into the same issue as Mario and Andrew. I am running CDAP 6.5.1 and using a very simple conditional:
image.png

The error is the same as described by Mario, and the logs are attached to the email.

Could you please share a pipeline example that works with a conditional plugin? Or even better, suggest a few solutions to this issue? I'm not sure if it's a known bug, or if few people use the Conditional plugin.

Best,
Baptiste


downloadLogs.txt

Vitalii Tymchyshyn

Dec 10, 2021, 7:52:44 PM
to CDAP User
Hi.

What are you trying to achieve with this plugin?
The problem I see is that you are connecting it to a Sink, but the plugin by itself does not produce any data, thus the error.

Baptiste Benet

Dec 12, 2021, 3:06:42 AM
to cdap...@googlegroups.com
Interesting, I thought the conditional plugin passes data along. What I'm trying to do is the following: based on the conditional's output, either fail the pipeline (with the FailPipeline plugin) or go on with another branch of the pipeline (Output).

Since you said that the Conditional plugin does not pass any data, I'm guessing I should use it like this:

image.png

I think I'm not fully getting it, because trying to run the above pipeline throws the following error:

"Stage in the pipeline 'Fail conditional yes' is on the branch of condition 'Conditional'. However it also has following incoming paths: 'Input->Fail conditional yes'. Different branches of a condition cannot be inputs to the same stage."

If you have an example pipeline that uses the Conditional plugin, I think it would greatly help me understand the usage.

Thank you,
Baptiste

Baptiste Benet

Dec 13, 2021, 11:24:14 AM
to cdap...@googlegroups.com
Hey there,

After looking at the code, I understand that a stage that is on the output branch of a condition cannot have another branch as input. 
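That rule can be illustrated with a small Python sketch. This is only a simplified model of the check, not CDAP's actual validation code: the function name find_branch_conflicts and the tuple layout of the connections are hypothetical, the stage names are taken from the error message quoted earlier in the thread, and which branch is "true" or "false" is assumed.

```python
from collections import defaultdict


def find_branch_conflicts(connections, condition):
    """Return (stage, parent) pairs where a stage on a condition branch
    also receives input from a stage outside that branch.

    connections: list of (src, dst, branch) tuples; branch is True/False
    for edges leaving the condition node and None otherwise.
    (Hypothetical layout, chosen for this illustration only.)
    """
    children = defaultdict(list)
    parents = defaultdict(list)
    for src, dst, _ in connections:
        children[src].append(dst)
        parents[dst].append(src)

    def reachable(start):
        # Every stage reachable from `start`, including `start` itself.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(children[node])
        return seen

    conflicts = []
    for src, dst, _ in connections:
        if src != condition:
            continue
        on_branch = reachable(dst)
        for stage in sorted(on_branch):
            for parent in parents[stage]:
                # Inputs must come from the condition or from the same branch.
                if parent != condition and parent not in on_branch:
                    conflicts.append((stage, parent))
    return conflicts


# The pipeline that triggered the error (branch assignment is assumed).
rejected = [
    ("Input", "Conditional", None),
    ("Input", "Fail conditional yes", None),
    ("Conditional", "Fail conditional yes", True),
    ("Conditional", "Output", False),
]
print(find_branch_conflicts(rejected, "Conditional"))
```

Dropping the direct "Input -> Fail conditional yes" connection leaves each stage after the condition with inputs from its own branch only, so the sketch reports no conflicts.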

If I understand correctly, since a conditional does not pass data along, there cannot be any GCS File output following the Conditional plugin? In that case, I should modify the above pipeline as follows:
1. remove connection "filter unique cats -> fail conditional yes"
2. remove connection "conditional -> Output"

Finally, even after doing this, I still run into the error we described earlier:

User class threw exception: java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/HAUtil at org.apache.twill.filesystem.FileContextLocationUtil.lookupInHAUtil(FileContextLocationUtil.java:42)

Best,
Baptiste

Sagar Kapare

Dec 14, 2021, 3:21:06 PM
to cdap...@googlegroups.com
Hi,

The Conditional node determines which branch to execute depending on the result of the conditional expression.
It passes the output of the previous stage as input to the first stage after the condition node.
For example, in the screenshot below for the pipeline I had, it passed the output of the Database stage to the GCS stage when the condition evaluated to true.

image.png
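The behavior described above can be sketched in Python roughly as follows. This is an illustrative model only, not how CDAP implements the Conditional node: run_conditional, gcs_sink, and the record layout are all hypothetical names made up for the example.

```python
def run_conditional(upstream_records, condition, branches):
    """Evaluate the condition once and hand the previous stage's output
    to the first stage of the selected branch.

    branches: dict mapping True/False to a callable that stands in for
    the first stage on that branch. A missing key means that branch has
    no stages, so nothing runs. (Hypothetical model.)
    """
    result = bool(condition())
    first_stage = branches.get(result)
    if first_stage is None:
        return result, None
    return result, first_stage(upstream_records)


# Stand-in for a GCS sink: it just collects what it receives.
written = []


def gcs_sink(records):
    written.extend(records)
    return len(records)


# Stand-in for the output of the Database stage.
database_output = [{"id": 1}, {"id": 2}]

result, count = run_conditional(database_output, lambda: True, {True: gcs_sink})
```

Here the condition evaluates to true, so the sink on the true branch receives both records from the upstream stage. With a false condition and no false branch, nothing is written.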

I tried running it with Spark as the engine, and it failed for me too with the same exception that you mentioned. I will file a bug for this. As a workaround, however, can you try running it with MapReduce as the engine?
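If you prefer to change the engine in the exported pipeline JSON rather than in the UI, a minimal Python sketch could look like the one below. It assumes the export keeps the engine under config.engine and that "mapreduce" is the accepted value; please verify both against your own export before relying on it. The helper name switch_engine and the sample dictionary are hypothetical.

```python
import json


def switch_engine(pipeline, engine="mapreduce"):
    """Return a copy of an exported pipeline dict with the engine changed.

    Assumes the engine lives at pipeline["config"]["engine"]; this key
    layout is an assumption and should be checked against a real export.
    """
    updated = json.loads(json.dumps(pipeline))  # deep copy via JSON round trip
    updated.setdefault("config", {})["engine"] = engine
    return updated


# Tiny stand-in for an exported pipeline; a real export has many more fields.
exported = {"name": "oc-test-conditional", "config": {"engine": "spark", "stages": []}}
patched = switch_engine(exported)
print(json.dumps(patched, indent=2))
```

The original dictionary is left untouched, so the Spark version can still be re-imported later for comparison.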

Thanks and Regards,
Sagar 

Baptiste Benet

Dec 17, 2021, 8:43:32 AM
to cdap...@googlegroups.com
Hi Sagar,

Thank you so much for the response and the details. I just logged this ticket; I hope that was the right way to create it: https://cdap.atlassian.net/browse/PLUGIN-1015.

I'll give the workaround a shot over the weekend and see if I can make any progress. If not, I'd be happy to pair with you to try and solve it.

Best,
Baptiste
