Hue 3.9 - Empty Oozie workflows


Tom Stewart

Apr 27, 2016, 9:58:32 AM
to Hue-Users
Has anyone else experienced issues with empty workflows in Oozie? One of our development groups reports that /oozie/editor/workflow/list/ won't load for them. If they wait 5-10 minutes, once in a while it returns about 25,000 empty workflows.

mysql> select count(*) from oozie_job where name = "" and description="" ;
+----------+
| count(*) |
+----------+
|    87532 |
+----------+
1 row in set (0.06 sec)

mysql>

I'm curious whether I can delete these for cleanup, but I might need help identifying any related rows I need to remove beyond the oozie_job table. The interesting part is that, judging by the last_modified range, it all seems to have happened within a one- or two-month window and has not happened for the past few months. They are old entries I need to clean up.
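
The last_modified window mentioned above can be checked with a query along these lines. This is only a sketch: it assumes last_modified is a date/time column on oozie_job and reuses the same empty name/description filter as the count above.

```sql
-- Count the empty workflows per month of last_modified
-- (assumes last_modified is a DATETIME/TIMESTAMP column)
SELECT DATE_FORMAT(last_modified, '%Y-%m') AS modified_month,
       COUNT(*) AS empty_workflows
FROM oozie_job
WHERE name = '' AND description = ''
GROUP BY DATE_FORMAT(last_modified, '%Y-%m')
ORDER BY modified_month;
```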

I'd also note that I have six clusters and saw this on three of them, but on none of my personal ones. So it is definitely behavior caused by some user or application interaction that is isolated in some fashion. On the two clusters with the largest issues, the counts of these empty workflows are quite large (tens of thousands).

Romain Rigaux

May 9, 2016, 12:57:35 PM
to Tom Stewart, Hue-Users
This comes from Hue versions before 3.8 (more than a year ago). Since then the Oozie app was revamped: only one table is used now, and it should be much faster.

http://gethue.com/new-apache-oozie-workflow-coordinator-bundle-editors/


Tom Stewart

May 12, 2016, 10:25:36 AM
to Hue-Users, stewart...@yahoo.com
Are there whole tables, then, that are no longer used and that we can drop to clean this up?

Romain Rigaux

May 12, 2016, 8:13:07 PM
to Tom Stewart, Hue-Users
You can truncate them:

oozie_*

The new Oozie documents are in desktop_document2:

select * from desktop_document2 where type = 'oozie-workflow2' limit 5;
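
Before truncating anything, it may help to see exactly which tables match oozie_* and roughly how large they are. The following is a sketch for MySQL only; it assumes the current connection is using the Hue database, and table_rows is an estimate for InnoDB tables, not an exact count.

```sql
-- List the old-format tables in the current database.
-- The underscore is escaped because _ is a single-character wildcard in LIKE.
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name LIKE 'oozie\_%'
ORDER BY table_name;
```

Taking a backup of these tables before any truncate would leave a way back if some old-format workflows turn out not to have been re-saved.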

Tom Stewart

May 20, 2016, 10:08:40 AM
to Hue-Users, stewart...@yahoo.com
Curious when we get this message for workflows:

"This workflow was imported from an old Hue version, save it to create a copy in the new format or open it in the old editor"

Does that imply it is still pulling from the oozie_ tables, and that if I re-save, it will go into desktop_document2? I just want to know whether the migration from the old tables to the new one requires us to go into each workflow and re-save it before we run any kind of truncate on the oozie_ tables to discard the old-format data (so we don't lose jobs in the truncate process).

I might have the answer: I searched desktop_document2 for my userid and saw 4 Oozie workflows there, but in the web UI I see 7. I picked one of the old ones and re-saved it, and then it showed up in desktop_document2. Is there anything we can run from a SQL perspective to migrate these for users so they don't have to do it manually? I also have the additional issue of all the blank ones I'm trying to get rid of: since the users can't get into the Oozie Workflows web UI (due to all the bad ones), they can't migrate by re-saving in the UI.
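
For reference, the per-user check described above might look like the following sketch. The username 'jdoe' is a placeholder, and the join on owner_id assumes the same relationship between desktop_document2 and auth_user used elsewhere in this thread.

```sql
-- New-format workflows owned by one user ('jdoe' is a placeholder username)
SELECT desktop_document2.id, desktop_document2.name
FROM desktop_document2
JOIN auth_user ON desktop_document2.owner_id = auth_user.id
WHERE desktop_document2.type = 'oozie-workflow2'
  AND auth_user.username = 'jdoe'
ORDER BY desktop_document2.name;
```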

Tom Stewart

May 20, 2016, 10:17:49 AM
to Hue-Users, stewart...@yahoo.com
Is there a query against the oozie_ tables I can run that would give me a near equivalent of the query below? It lists all workflows by userid. If I could get that for the old format, I could compare and at least see what has and has not been migrated, then use that to contact users to re-save their workflows before the truncate.

select desktop_document2.id,username,name from desktop_document2 left join auth_user on desktop_document2.owner_id = auth_user.id where type = 'oozie-workflow2' order by username;
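
A possible starting point for the old-format side is sketched below. It is untested and rests on two assumptions that are not confirmed anywhere in this thread: that oozie_job has an owner_id column referencing auth_user.id, and that oozie_workflow has a job_ptr_id column pointing at oozie_job.id. Running DESCRIBE on both tables first would confirm or refute these.

```sql
-- Old-format workflows by user, skipping the empty-name rows.
-- Assumption: oozie_job.owner_id references auth_user.id.
-- Assumption: oozie_workflow.job_ptr_id references oozie_job.id, so the
-- inner join keeps only rows that are workflows.
SELECT oozie_job.id, auth_user.username, oozie_job.name
FROM oozie_job
JOIN oozie_workflow ON oozie_workflow.job_ptr_id = oozie_job.id
LEFT JOIN auth_user ON oozie_job.owner_id = auth_user.id
WHERE oozie_job.name <> ''
ORDER BY auth_user.username, oozie_job.name;
```

Comparing this output with the desktop_document2 list by username and name would give a rough picture of which workflows still exist only in the old format.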