Wrangler Service Issues / Data prep


Kushil Dodhia

Jun 24, 2021, 9:58:17 AM
to cdap...@googlegroups.com, eng-squa...@liveramp.com, Sagar Batchu
Hi,

We are running the 6.5 release of CDAP in distributed mode on k8s, and we are currently unable to start the wrangler service.

We are seeing the following error in both the data prep pod logs and the wrangler service logs (accessed via the CDAP UI):

java.util.concurrent.ExecutionException: io.cdap.cdap.spi.data.TableNotFoundException: System table 'StructuredTableId{name='app_app_upgrade'}' not found.
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[na:1.8.0_292]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) ~[na:1.8.0_292]
at io.cdap.cdap.internal.app.runtime.distributed.AbstractProgramTwillRunnable.run(AbstractProgramTwillRunnable.java:272) ~[na:na]
at io.cdap.cdap.k8s.runtime.KubeTwillLauncher.run(KubeTwillLauncher.java:117) [io.cdap.cdap.cdap-kubernetes-6.5.0-SNAPSHOT.jar:na]
at io.cdap.cdap.master.environment.k8s.MasterEnvironmentMain.doMain(MasterEnvironmentMain.java:127) [na:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_292]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_292]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_292]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_292]
at io.cdap.cdap.master.environment.k8s.MasterEnvironmentMain.main(MasterEnvironmentMain.java:62) [na:na]
io.cdap.cdap.spi.data.TableNotFoundException: System table 'StructuredTableId{name='app_app_upgrade'}' not found.
at io.cdap.cdap.spi.data.sql.SqlStructuredTableContext.getTable(SqlStructuredTableContext.java:54) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.BasicSystemServiceContext.lambda$null$0(BasicSystemServiceContext.java:89) ~[na:na]
at io.cdap.wrangler.store.upgrade.UpgradeStore.getEntityUpgradeState(UpgradeStore.java:163) ~[na:na]
at io.cdap.wrangler.store.upgrade.UpgradeStore.lambda$getEntityUpgradeState$3(UpgradeStore.java:131) ~[na:na]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.lambda$run$0(TransactionRunners.java:141) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.BasicSystemServiceContext.lambda$run$1(BasicSystemServiceContext.java:88) ~[na:na]
at io.cdap.cdap.spi.data.sql.SqlTransactionRunner.run(SqlTransactionRunner.java:74) ~[na:na]
at io.cdap.cdap.spi.data.sql.RetryingSqlTransactionRunner.run(RetryingSqlTransactionRunner.java:64) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.BasicSystemServiceContext.run(BasicSystemServiceContext.java:88) ~[na:na]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.run(TransactionRunners.java:141) ~[na:na]
at io.cdap.wrangler.store.upgrade.UpgradeStore.getEntityUpgradeState(UpgradeStore.java:130) ~[na:na]
at io.cdap.wrangler.service.DataPrepService.initialize(DataPrepService.java:99) ~[na:na]
at io.cdap.wrangler.service.DataPrepService.initialize(DataPrepService.java:53) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$6(AbstractContext.java:602) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:562) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:599) ~[na:na]
at io.cdap.cdap.internal.app.services.ServiceHttpServer.initializeService(ServiceHttpServer.java:162) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.http.AbstractServiceHttpServer.startUp(AbstractServiceHttpServer.java:174) ~[na:na]
at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) ~[com.google.guava.guava-13.0.1.jar:na]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_292]

We noticed that the dataprep pod spins up, terminates after approximately 1 minute, and is then replaced by a new pod. This results in a never-ending loading screen when trying to enable or access wrangler.

Any assistance here will be greatly appreciated.

Thanks,
Kushil Dodhia

Albert Shau

Jun 24, 2021, 12:16:01 PM
to cdap...@googlegroups.com, eng-squa...@liveramp.com, Sagar Batchu
Hi Kushil,

This is odd. The table is supposed to be created automatically when the wrangler app is updated, and I'm not sure how it can get into this state. Can you try deleting the dataprep app (via REST: DELETE /v3/namespaces/system/apps/dataprep)? CDAP should automatically recreate it and re-trigger the table creation.

Regards,
Albert


Kushil Dodhia

Jun 24, 2021, 12:58:17 PM
to cdap...@googlegroups.com, eng-squa...@liveramp.com, Sagar Batchu
Hi Albert, 

Thank you for your message. When we call the endpoint you suggested, the delete is rejected because the service is still running:

'application:system.dataprep.-SNAPSHOT' could not be deleted. Reason: The following programs are still running: service

Additionally, we have tried deleting the pods/deployment and the CDAPMaster CRD and redeploying, in the hope that this would re-trigger the table creation. Alas, no luck.

Regards,
Kush

Albert Shau

Jun 24, 2021, 1:35:40 PM
to cdap...@googlegroups.com, eng-squa...@liveramp.com, Sagar Batchu
Table creation happens when the app is deployed, so deleting the pods/deployment won't trigger it. You can stop the service through REST with POST /v3/namespaces/system/apps/dataprep/services/service/stop, then try deleting the app as described earlier.
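
For anyone following along, here is a minimal sketch of the stop-then-delete sequence described above, using only the Python standard library. The two paths are the ones quoted in this thread; the base URL (router host and port) is an assumption and will differ per cluster, and authentication is not handled here.

```python
import urllib.request

# Assumption: the CDAP router is reachable at this address; adjust for your cluster.
BASE_URL = "http://localhost:11015"

# Paths quoted in this thread.
STOP_PATH = "/v3/namespaces/system/apps/dataprep/services/service/stop"
DELETE_PATH = "/v3/namespaces/system/apps/dataprep"


def build_requests(base_url=BASE_URL):
    """Return the two requests in the order they should be sent: stop, then delete."""
    base = base_url.rstrip("/")
    return [
        urllib.request.Request(base + STOP_PATH, data=b"", method="POST"),
        urllib.request.Request(base + DELETE_PATH, method="DELETE"),
    ]


def stop_and_delete(base_url=BASE_URL, timeout=30):
    """Send both requests; urlopen raises HTTPError on an error response."""
    statuses = []
    for req in build_requests(base_url):
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            statuses.append((req.get_method(), req.full_url, resp.status))
    return statuses
```

Calling `stop_and_delete()` sends the POST first and the DELETE second. If the delete still reports that the service is running, waiting a few seconds and retrying the delete is a reasonable next step.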

Kushil Dodhia

Jun 24, 2021, 2:18:46 PM
to cdap...@googlegroups.com, eng-squa...@liveramp.com, Sagar Batchu
Hi Albert, 

Sorry to keep troubling you. We stopped the service and deleted the app as described, but we are still seeing the same issue.

After the stop and delete REST calls succeeded, the service/pod started up again and continues to terminate, now with a different but similar error.

(Strangely, this time the error is a failed read because the relation does not exist, rather than the previous "not found" error.)



2021-06-24 17:46:53,993 - INFO  [main:i.c.c.i.a.r.d.AbstractProgramTwillRunnable@242] - Starting program run program_run:system.dataprep.-SNAPSHOT.service.service.0e78287e-d514-11eb-b810-9e4ae15378a5
2021-06-24 17:46:55,234 - ERROR [ServiceHttpServer STARTING:i.c.c.i.a.r.ProgramControllerServiceAdapter$1@92] - Service Program 'service' failed.
java.lang.RuntimeException: java.io.IOException: Failed to read from table app_app_upgrade with keys [Field{name='namespace', type='STRING', value='system'}, Field{name='generation', type='LONG', value='0'}, Field{name='entity_type', type='STRING', value='CONNECTION'}]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.propagateThrowable(TransactionRunners.java:223) ~[na:na]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.propagate(TransactionRunners.java:210) ~[na:na]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.run(TransactionRunners.java:144) ~[na:na]
at io.cdap.wrangler.store.upgrade.UpgradeStore.getEntityUpgradeState(UpgradeStore.java:130) ~[na:na]
at io.cdap.wrangler.service.DataPrepService.initialize(DataPrepService.java:99) ~[na:na]
at io.cdap.wrangler.service.DataPrepService.initialize(DataPrepService.java:53) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$6(AbstractContext.java:602) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:562) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:599) ~[na:na]
at io.cdap.cdap.internal.app.services.ServiceHttpServer.initializeService(ServiceHttpServer.java:162) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.http.AbstractServiceHttpServer.startUp(AbstractServiceHttpServer.java:174) ~[na:na]
at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) ~[com.google.guava.guava-13.0.1.jar:na]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_292]
Caused by: java.io.IOException: Failed to read from table app_app_upgrade with keys [Field{name='namespace', type='STRING', value='system'}, Field{name='generation', type='LONG', value='0'}, Field{name='entity_type', type='STRING', value='CONNECTION'}]
at io.cdap.cdap.spi.data.sql.PostgresSqlStructuredTable.readRow(PostgresSqlStructuredTable.java:534) ~[na:na]
at io.cdap.cdap.spi.data.sql.PostgresSqlStructuredTable.read(PostgresSqlStructuredTable.java:89) ~[na:na]
at io.cdap.cdap.spi.data.common.MetricStructuredTable.read(MetricStructuredTable.java:74) ~[na:na]
at io.cdap.wrangler.store.upgrade.UpgradeStore.getEntityUpgradeState(UpgradeStore.java:165) ~[na:na]
at io.cdap.wrangler.store.upgrade.UpgradeStore.lambda$getEntityUpgradeState$3(UpgradeStore.java:131) ~[na:na]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.lambda$run$0(TransactionRunners.java:141) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.BasicSystemServiceContext.lambda$run$1(BasicSystemServiceContext.java:88) ~[na:na]
at io.cdap.cdap.spi.data.sql.SqlTransactionRunner.run(SqlTransactionRunner.java:74) ~[na:na]
at io.cdap.cdap.spi.data.sql.RetryingSqlTransactionRunner.run(RetryingSqlTransactionRunner.java:64) ~[na:na]
at io.cdap.cdap.internal.app.runtime.service.BasicSystemServiceContext.run(BasicSystemServiceContext.java:88) ~[na:na]
at io.cdap.cdap.spi.data.transaction.TransactionRunners.run(TransactionRunners.java:141) ~[na:na]
... 10 common frames omitted
Caused by: org.postgresql.util.PSQLException: ERROR: relation "app_app_upgrade" does not exist
  Position: 15
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440) ~[na:na]
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183) ~[na:na]
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308) ~[na:na]
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[na:na]
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[na:na]
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143) ~[na:na]
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106) ~[na:na]
at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) ~[org.apache.commons.commons-dbcp2-2.6.0.jar:2.6.0]
at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122) ~[org.apache.commons.commons-dbcp2-2.6.0.jar:2.6.0]
at io.cdap.cdap.spi.data.sql.PostgresSqlStructuredTable.readRow(PostgresSqlStructuredTable.java:527) ~[na:na]
... 20 common frames omitted

Below is the list of tables in our DB for the above instance:

cdap=> \dt
                  List of relations
 Schema |            Name             | Type  | Owner
--------+-----------------------------+-------+-------
 public | app_connections             | table | cdap
 public | app_connections_store       | table | cdap
 public | app_data                    | table | cdap
 public | app_dataprep_config         | table | cdap
 public | app_delta_drafts            | table | cdap
 public | app_drafts                  | table | cdap
 public | app_oauth                   | table | cdap
 public | app_schema_registry_entries | table | cdap
 public | app_schema_registry_meta    | table | cdap
 public | app_workspaces              | table | cdap
 public | app_workspaces_store        | table | cdap
 public | application_specs           | table | cdap

Many thanks 
Kush 


Albert Shau

Jun 25, 2021, 1:37:41 PM
to cdap...@googlegroups.com, eng-squa...@liveramp.com, Sagar Batchu
There is a meta table that holds information about which tables actually exist. I suspect it now has a row indicating that this table exists when in reality it does not. So the code gets past that check (which was failing earlier) but now fails in the Postgres code when it finds there is no underlying table. I'm not sure how it can get into this state. Do you see anything in the app-fabric logs about the table?
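
To illustrate the suspected mismatch, here is a small sketch that compares what a registry claims exists against what the database actually contains. The list of actual tables is copied from the listing earlier in this thread; the registry contents are hypothetical, since the thread does not show the meta table itself.

```python
def find_phantom_tables(registered, actual):
    """Return tables the registry claims exist but the database does not have."""
    return sorted(set(registered) - set(actual))


# Table names copied from the \dt listing earlier in this thread.
ACTUAL_TABLES = [
    "app_connections",
    "app_connections_store",
    "app_data",
    "app_dataprep_config",
    "app_delta_drafts",
    "app_drafts",
    "app_oauth",
    "app_schema_registry_entries",
    "app_schema_registry_meta",
    "app_workspaces",
    "app_workspaces_store",
    "application_specs",
]

# Hypothetical registry state: everything that exists, plus the missing table.
registered = ACTUAL_TABLES + ["app_app_upgrade"]

print(find_phantom_tables(registered, ACTUAL_TABLES))  # ['app_app_upgrade']
```

A table that shows up in this difference would pass an "is it registered?" check and then fail on the first real query, which matches the change from the "not found" error to the "relation does not exist" error.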

At this point it may be easiest to just create the table manually. The schema is at https://github.com/data-integrations/wrangler/blob/develop/wrangler-storage/src/main/java/io/cdap/wrangler/store/upgrade/UpgradeStore.java#L61; it has 5 columns, 3 of which make up the primary key.
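
As a rough sketch of what the manual creation could look like, the helper below assembles a CREATE TABLE statement from a column list. The three key columns and their field types are taken from the "Failed to read from table" error earlier in this thread. The mapping from CDAP field types to PostgreSQL column types is an assumption, and the two non-key columns are deliberately left for the caller to supply from the linked UpgradeStore.java.

```python
# Assumed mapping from CDAP field types to PostgreSQL column types.
PG_TYPES = {"STRING": "text", "LONG": "bigint"}

# Key fields as they appear in the error message in this thread.
KEY_COLUMNS = [
    ("namespace", "STRING"),
    ("generation", "LONG"),
    ("entity_type", "STRING"),
]


def build_create_table(table, key_columns, other_columns):
    """Build a CREATE TABLE statement with a composite primary key."""
    columns = list(key_columns) + list(other_columns)
    lines = ["{} {}".format(name, PG_TYPES[ftype]) for name, ftype in columns]
    key_names = ", ".join(name for name, _ in key_columns)
    lines.append("PRIMARY KEY ({})".format(key_names))
    return "CREATE TABLE {} (\n  {}\n);".format(table, ",\n  ".join(lines))
```

Passing `"app_app_upgrade"`, `KEY_COLUMNS`, and the two remaining (name, type) pairs from the linked schema yields a statement that can be reviewed before running it in psql. It is worth checking the generated column types against an existing CDAP-created table first.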
