Building CDAP Artifacts and Plugins from GitHub Source Code

Bhupesh Goel

Jul 3, 2018, 1:11:19 PM
to CDAP User
Hello, 

I am trying to install the CDAP Sandbox from its source code, available at https://github.com/caskdata/cdap

I was able to build the source code and start the CDAP server from the Eclipse IDE by running the StandaloneMain class. I could also start the CDAP CLI from Eclipse via the CLIMain class. However, I could run both the CLI and the server only after disabling the UI in the StandaloneMain class.

But when I ran the query to list all system artifacts in the CDAP CLI, it returned an empty artifact list and did not show cdap-data-pipeline as a system artifact. I am attaching a screenshot.

Is there anything I am missing? I need to make some changes to the CDAP codebase and install CDAP from the modified codebase, so installing CDAP from its bundled ZIP file is not an option.

If anyone can help me, or point me to documentation on how to build and set up artifacts and plugins in CDAP from its source code, that would be really helpful.

Thanks,

Screen Shot 2018-07-03 at 10.00.35 PM.png

edwi...@google.com

Jul 3, 2018, 1:41:49 PM
to CDAP User
Hi Bhupesh,

Thank you for trying out CDAP.

Regarding your issue, are you making the CLI call immediately after CDAP starts up? When CDAP first starts, it takes some time for the system to load all system artifacts. Can you give it a couple of minutes before you execute the query? Also, can you confirm that the cdap-data-pipeline JAR file is present in the artifacts folder under your standalone directory?
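
One way to run that folder check is a small shell helper that lists matching JARs. This is only a sketch: the function name `find_artifact_jars` is made up for illustration, and the directory to pass in is an assumption that depends on where your standalone instance keeps its artifacts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CDAP): list JAR files under a directory
# whose file name starts with the given artifact name.
find_artifact_jars() {
  local dir="$1" name="$2"
  if [ ! -d "$dir" ]; then
    echo "directory not found: $dir" >&2
    return 1
  fi
  # Prints one path per matching JAR; prints nothing if none are found.
  find "$dir" -type f -name "${name}*.jar"
}

# Example (the path is an assumption; adjust it to your setup):
#   find_artifact_jars ./artifacts cdap-data-pipeline
```

If the command prints nothing, the JAR is not in that folder, which would point to a build or packaging gap rather than a startup delay.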

Thanks,
Edwin Elia

edwi...@google.com

Jul 3, 2018, 2:02:24 PM
to CDAP User
Bhupesh,

On an additional note, just building the SDK will not make data pipelines work completely, since you will also need to add the plugins. You can check out the cdap-build repository for reference on how to build and combine plugins from other repositories.

Also, just out of curiosity, what kind of changes are you planning to make to the CDAP codebase?

Best,
Edwin Elia

Bhupesh Goel

Jul 3, 2018, 2:07:55 PM
to CDAP User
Hi Edwin,

Thanks for the quick reply.

I waited around 3-4 minutes but still did not see cdap-data-pipeline listed as an artifact in the CDAP CLI. I also don't see a cdap-data-pipeline JAR anywhere under the CDAP code directory, as shown in the attached file.

I guess I need to follow specific steps to build the artifacts and plugins from the codebase. Just following the steps at https://docs.cask.co/cdap/4.3.1/en/developer-manual/getting-started/sandbox/zip.html#running-cdap-from-within-an-ide will not be sufficient.
jar_file_list

edwi...@google.com

Jul 3, 2018, 2:19:24 PM
to CDAP User
Bhupesh,

Yes, the steps in the docs are only for the base CDAP SDK. To build a complete CDAP, see the cdap-build repository.

Best,
Edwin Elia

Bhupesh Goel

Jul 3, 2018, 2:23:16 PM
to CDAP User
OK. Let me try out the steps provided at https://github.com/caskdata/cdap-build

Thanks, Edwin, for the quick responses.

Mohit Gupta

Jul 30, 2018, 1:14:14 AM
to cdap...@googlegroups.com
Hi Edwin, All,

I am not able to build the latest release/5.0 source code. I tried the following two ways:
  1. Using the source code: https://github.com/caskdata/cdap
  2.  

Mohit Gupta

Jul 30, 2018, 1:37:33 AM
to cdap...@googlegroups.com
Hi Edwin, All,

I am not able to build or run the latest release/5.0 source code. I tried the following two ways:

1. Using the source code: https://github.com/caskdata/cdap (branch release/5.0)

I built it as specified in the section "Build CDAP Sandbox distribution ZIP with additional system artifacts", and the build was successful. However, when I run the sandbox created in `cdap-standalone/target/cdap-sandbox-5.0.0-SNAPSHOT.zip`, it launches the UI but reports an error about `missing artefacts` when I click on the Preparation and Analytics tabs. Also, the pipeline canvas does not have any artifacts.

Am I missing anything here? Isn't this .zip the same distribution as the one shared for other released versions? Is there a list of steps for running the sandbox from this zip? I just executed `bin/cdap sandbox start`. It looks like some other executable(s) also need to be run. Could you please provide the complete steps for running the full sandbox?


2. Using the specified cdap-build repository (branches develop and release/5.0)

Here, I followed the process as described in this repo:
  Note: Readme.md says `init` and `update --remote` are not required if the `--recursive` option is used, which I did above.
$ mvn clean install -DskipTests -f apache-sentry
   <== This fails with a missing definition for the org.eclipse.jetty.jetty-server.DispatchType class in the specified jetty-server dependency version. I tried switching the `apache-sentry` submodule to the `master` branch, where I did see a version change for this dependency as well as a change in the calling code itself, but that also fails.
$ export MAVEN_OPTS="-Xmx3056m -XX:MaxPermSize=128m"
$ mvn install -DskipTests -B -am -pl cdap/cdap-api -P templates
$ mvn install -DskipTests -B -am -f cdap/cdap-app-templates -P templates
$ mvn package -P examples,templates,dist,release,rpm-prepare,rpm,deb-prepare,deb,tgz,unit-tests \
   -Dgpg.passphrase=${GPG_PASSPHRASE} -Dgpg.useagent=false \
   -Dadditional.artifacts.dir=$(pwd)/app-artifacts
   <== This also fails, again somewhere in the security module.
   (Note that I intentionally removed the `-Dsecurity.extensions.dir=$(pwd)/security-extensions` option specified in Readme.md, assuming it would not get built because of the failed `apache-sentry` submodule. However, it fails even when this option is used.)
    

Please help me build a compilable version of the current 5.0 code.


Regards
Mohit

PS: Please ignore my earlier message; I accidentally hit a keyboard shortcut for send while still drafting it.

Ali Anwar

Jul 30, 2018, 4:35:06 PM
to cdap...@googlegroups.com
Hi Mohit.

Your approach #1 is on the right track. A lot of the system artifacts are built from a separate repository and packaged alongside the sandbox distribution zip.
Use the README.rst in this repo to build the plugin artifacts: https://github.com/caskdata/hydrator-plugins/tree/release/2.0.
Then, use the command in the BUILD.rst of the cdap repo to "Build CDAP Sandbox distribution ZIP with additional system artifacts", with -Dadditional.artifacts.dir=</path/to/hydrator> pointing to the hydrator-plugins repository.
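
Before running that build, it can help to confirm that the plugins checkout actually contains built JARs and to pass an absolute path. The helper below is a rough sketch, not part of either repository: `additional_artifacts_opt` is a hypothetical name, and it assumes the built plugin JARs sit in `<module>/target/` inside the checkout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the -Dadditional.artifacts.dir option for a
# plugins checkout, after checking that it contains at least one built JAR.
# Assumption: built plugin JARs sit in <module>/target/ inside the checkout.
additional_artifacts_opt() {
  local dir="$1" abs
  # Resolve to an absolute path; fail if the directory does not exist.
  abs="$(cd "$dir" 2>/dev/null && pwd)" || {
    echo "not a directory: $dir" >&2
    return 1
  }
  if [ -z "$(find "$abs" -type f -path '*/target/*.jar' | head -n 1)" ]; then
    echo "no built JARs under $abs; build the plugins first" >&2
    return 1
  fi
  printf '%s\n' "-Dadditional.artifacts.dir=$abs"
}

# Example (the path is an assumption):
#   additional_artifacts_opt ../hydrator-plugins
```

The printed option can then be appended to the Maven command from BUILD.rst.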

Regards,
Ali Anwar

shanka...@google.com

Jul 30, 2018, 5:26:50 PM
to CDAP User
Hi Mohit/Bhupesh,

Please follow these steps:

1) Clone the hydrator-plugins repository: https://github.com/caskdata/hydrator-plugins
3) Build the CDAP Sandbox using the following command, setting the hydrator-plugins path appropriately:

MAVEN_OPTS="-Xmx1024m" mvn clean package -pl cdap-standalone,cdap-app-templates/cdap-etl,cdap-app-templates/cdap-data-quality,cdap-app-templates/cdap-program-report,cdap-examples -am -amd -DskipTests -P examples,templates,dist,release,unit-tests -Dadditional.artifacts.dir=<path_to_hydrator_plugins_dir>

4) This will build the sandbox zip at `cdap-standalone/target/cdap-sandbox-5.1.0-SNAPSHOT.zip`
5) You can unzip this file and start using the CDAP sandbox
6) If you want to use IntelliJ to run CDAP and use the pipelines, you need to copy the artifacts directory from ../cdap-sandbox-5.1.0/artifacts to the <cdap_sources_home> path before you run StandaloneMain from the IDE.
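
The copy in step 6 could be scripted roughly as follows. This is a sketch under assumptions: `copy_sandbox_artifacts` is a hypothetical helper name, and both paths are placeholders for your unzipped sandbox and your CDAP source checkout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the artifacts directory of an unzipped sandbox
# into the CDAP source checkout, as described in step 6.
copy_sandbox_artifacts() {
  local sandbox_dir="$1" sources_home="$2"
  if [ ! -d "$sandbox_dir/artifacts" ]; then
    echo "no artifacts directory in $sandbox_dir" >&2
    return 1
  fi
  if [ ! -d "$sources_home" ]; then
    echo "source checkout not found: $sources_home" >&2
    return 1
  fi
  # Results in $sources_home/artifacts containing the sandbox artifacts.
  cp -R "$sandbox_dir/artifacts" "$sources_home/"
}

# Example (both paths are assumptions):
#   copy_sandbox_artifacts ../cdap-sandbox-5.1.0 /path/to/cdap
```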

Thanks
Shankar

Mohit Gupta

Jul 31, 2018, 1:22:53 AM
to cdap...@googlegroups.com
Thanks Ali and Shankar for the responses.

It looks like Ali suggested the release/2.0 branch of hydrator-plugins (as per the link in his mail) for compiling CDAP 5.0. I hope it is the compatible one; that is also what I inferred from the last merge commit.

Also, does it include the artifacts (mmds-apps and wrangler-service) needed for the newly added Analytics and the existing Preparation functionality? I could not locate their code. I specifically want to analyse and try out the Analytics functionality. If a documentation link for it is available, please point me to it.

I will try out the specified steps and report back.

Thanks very much!

Ali Anwar

Jul 31, 2018, 2:09:39 PM
to cdap...@googlegroups.com
The hydrator-plugins repo does include wrangler-service.
However, it doesn't include mmds. You can package that from the release/1.0 branch of the mmds repo: https://github.com/cask-solutions/mmds.git
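
As a rough sketch of that packaging step: the branch and URL below come from this message, but the Maven invocation is an assumption (a standard build from the repository's root pom.xml), and `run` and `package_mmds` are hypothetical helper names. Setting DRY_RUN=1 only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print each command, and run it unless DRY_RUN=1.
run() {
  printf '+ %s\n' "$*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Clone the release/1.0 branch of mmds into <workdir>/mmds and package it.
package_mmds() {
  local workdir="$1"
  run git clone --branch release/1.0 \
    https://github.com/cask-solutions/mmds.git "$workdir/mmds"
  # Assumption: the repo builds with a standard Maven invocation.
  run mvn -f "$workdir/mmds/pom.xml" clean package -DskipTests
}

# Example (the work directory is an assumption):
#   DRY_RUN=1 package_mmds "$HOME/src"
```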

Mohit Gupta

Aug 1, 2018, 1:00:13 AM
to cdap...@googlegroups.com
Thanks Ali. I tried it yesterday and got Analytics up and running.

Thanks for all the help, and congrats to the CDAP team on the official release of 5.0 :-)
