xml parse - dynamic xml format handling


Leona Chen

Dec 2, 2021, 6:17:38 AM
to CDAP User
Hello,

One of my clients pushes data to our system as XML, and we built an XML parsing pipeline to process it further. I am currently using Wrangler's XML to JSON function to handle this.

However, the client's XML files do not have a fixed structure: there might be 20 tags or components in one file and 30 in another, and this is expected. My current pipeline configuration in Wrangler can only handle a fixed format, so I am looking for a way for a pipeline to handle different formats dynamically.

I have provided the following data files for your reference:
1. a Word document with some screenshots that explain what I mean above
2. two sample XML files with different formats

Any help would be greatly appreciated, as always.

thanks
Leona

Leona Chen

Dec 2, 2021, 6:19:31 AM
to CDAP User
Sorry, I forgot to attach the files:
cdap_query_0212.docx
cdap_query_file2.xml
cdap_query_file1.xml

Vitalii Tymchyshyn

Dec 10, 2021, 8:29:26 PM
to CDAP User
What kind of problem do you experience?
Conversion would produce a JSON array if there are multiple values.
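
To illustrate the point about arrays, here is a minimal Python sketch of the general idea, not Wrangler's actual implementation. The function names `element_to_value` and `xml_to_dict` and the sample `<order>`/`<item>` tags are hypothetical. It shows a tag that appears once becoming a single value, and the same tag appearing several times becoming a list, so the shape of the output follows the input rather than a fixed layout.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Return the element's text if it has no children, else a dict of children."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag not in result:
            # First occurrence of this tag: store the value directly.
            result[child.tag] = value
        elif isinstance(result[child.tag], list):
            # Third or later occurrence: append to the existing list.
            result[child.tag].append(value)
        else:
            # Second occurrence: promote the single value to a list.
            result[child.tag] = [result[child.tag], value]
    return result


def xml_to_dict(xml_text):
    """Parse an XML string and wrap the converted tree under the root tag."""
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_value(root)}


single = xml_to_dict("<order><item>a</item></order>")
multiple = xml_to_dict("<order><item>a</item><item>b</item></order>")
```

With these two inputs, `single["order"]["item"]` is a plain string while `multiple["order"]["item"]` is a list, which is the kind of difference a fixed downstream schema has to account for.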

Leona Chen

Dec 13, 2021, 5:17:33 AM
to CDAP User

Hello,

As you can see in the two sample XML files I provided, file1 has 49 rows of data in total, whereas file2 has 40. If I create a pipeline based on the structure of file1, it fails on file2 because the nesting levels in the resulting JSON differ between the two files. My XML files are expected to vary like this, so I was wondering whether there is any way to handle this through dynamic configuration.

thanks
Leona

Vitalii Tymchyshyn

Dec 13, 2021, 8:34:04 PM
to cdap...@googlegroups.com
Please share the failure details. Often you can use JEXL expressions to handle this, but it's hard to advise without knowing the specific failure.
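
Separately from JEXL, one generic way to tolerate a varying structure is to flatten the converted JSON into path/value pairs, so that a missing or extra tag changes the number of rows rather than the schema. The Python sketch below is only an illustration under that assumption; it is not a Wrangler directive or a JEXL expression, and the `flatten` helper, the path syntax, and the sample document are hypothetical.

```python
def flatten(value, prefix=""):
    """Yield (path, leaf_value) pairs for arbitrarily nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Extend the path with the key, using "/" as a separator.
            yield from flatten(child, f"{prefix}/{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # Repeated elements get a positional suffix such as [0], [1].
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


# Hypothetical converted document with one repeated and one optional field.
doc = {"order": {"id": "17", "item": ["a", "b"]}}
rows = dict(flatten(doc))

# Optional fields are read with a default instead of failing when absent.
note = rows.get("order/note", "")
```

Each file then yields however many rows it actually contains, and a step that needs a specific field looks it up by path with a default.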
