Anyone have a data/basic xml parser?

75 views
Skip to first unread message

John R. Stokka

unread,
Dec 2, 2025, 3:56:00 PM (2 days ago) Dec 2
to Pick and MultiValue Databases
I'm looking for an xml parser to pick apart an imported data item.

Thought I would see if anyone had such a beast so I don't have to reinvent the wheel.

Thanks in advance. :)

Kevin King

unread,
Dec 3, 2025, 8:26:16 AM (yesterday) Dec 3
to mvd...@googlegroups.com
I have some extraction subroutines that use an XPATH-type retrieval.  

--
You received this message because you are subscribed to
the "Pick and MultiValue Databases" group.
To post, email to: mvd...@googlegroups.com
To unsubscribe, email to: mvdbms+un...@googlegroups.com
For more options, visit http://groups.google.com/group/mvdbms
---
You received this message because you are subscribed to the Google Groups "Pick and MultiValue Databases" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mvdbms+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/mvdbms/3204b971-3011-4a58-8bb0-d717c403cab7n%40googlegroups.com.

John Stokka

unread,
Dec 3, 2025, 9:06:57 AM (yesterday) Dec 3
to mvd...@googlegroups.com
post them up here if you don't mind sharing. :)

In Christ,

John R. Stokka


Dave Pomroy

unread,
Dec 3, 2025, 10:23:01 AM (yesterday) Dec 3
to mvd...@googlegroups.com

Kevin King

unread,
Dec 3, 2025, 1:36:38 PM (yesterday) Dec 3
to mvd...@googlegroups.com
The functions are specific to U2, but the subroutines are pretty generic, as I recall.

Let me write up some documentation for you and then I'll post.

Dick Thiot

unread,
Dec 3, 2025, 2:14:00 PM (yesterday) Dec 3
to mvd...@googlegroups.com
It is a lot easier to use a modern language like Python or Javascript to do the work and the coding is easier too.  You can use AI to generate the code.

Dick Thiot
RDM Infinity

John Stokka

unread,
Dec 3, 2025, 2:41:34 PM (yesterday) Dec 3
to mvd...@googlegroups.com
Yes, I could do it in c# in my sleep in no time, and find 100 routines on github no doubt. .NET probably has an xml parser built in actually at this point.
But this is a legacy system and we will be doing it old school. ;)

In Christ,

John R. Stokka

Tom Marracci

unread,
Dec 3, 2025, 2:47:50 PM (yesterday) Dec 3
to mvd...@googlegroups.com
For what it's worth, here's what I use:

DECODE.XML
001 SUBROUTINE (JSON,XML)
002 *
003 * DECODE AN XML DOCUMENT INTO A JSON STRUCTURE
004 *
005 * USE STATE MACHINE TO TRAVERSE THE XML TEXT
006 * USE CHUNKS AND ARRAYS TO COMBAT PICK STRING HANDLING
007 *
008 *
009 *
010 * DECODE XML STATE MACHINE AND CONSTANTS
011 *
012 * USE STATE MACHINE TO TRAVERSE THE XML TEXT
013 * USE CHUNKS AND ARRAYS TO COMBAT PICK STRING HANDLING
014 *
015 * CONSIDER CHANGE FOR STRINGS:
016 * MAKE LIST OF SYMBOLS TO REPLACE
017 * MARK BEGINNING OF STRING
018 * EXTRACT STRING AT THE END
019 * AND CHANGE ENCODED SYMBOLS
020   TFLG=INDEX(SYSTEM(15),'T',1)
021 * CHARACTER CLASSES
022   EQU C_ALPHA TO 1
023   EQU C_DIGIT TO 2
024   EQU C_UNDER TO 3
025   EQU C_SLASH TO 4
026   EQU C_OPEN TO 5
027   EQU C_CLOSE TO 6
028   EQU C_BANG TO 7
029   EQU C_WHITE TO 8
030   EQU C_DQUOTE TO 9
031   EQU C_HASH TO 10
032   EQU C_AMP TO 11
033   EQU C_SQUOTE TO 12
034   EQU C_COLON TO 13
035   EQU C_SEMICOLON TO 14
036   EQU C_EQUAL TO 15
037   EQU C_QUES TO 16
038   EQU C_LBR TO 17
039   EQU C_RBR TO 18
040   EQU C_ESC TO 19
041   EQU C_DASH TO 20
042   EQU C_ETC TO 21
043   EQU C_LENGTH TO 21
044   DIM CHARS(128)
045   MATPARSE CHARS FROM '0,0,0,0,0,0,0,0,0,8,8,0,0,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,7,9,10,21,21,11,12,21,21,21,21,21,20,21,4,2,2,2,2,2,2,2,2,2,2,13,14,5,15,6,16,0,1,1,1,1,1,1,1,1,1
    ,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,17,19,18,21,21,21,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,21,21,21,21' USING ','
046   CHARS(SEQ('_')+1) = C_ALPHA
047 *
048 * STATES
049   EQU S_START TO 1
050   EQU S_ENTER TO 2
051   EQU S_CMT TO 3
052   EQU S_CMT2 TO 4
053   EQU S_CMT3 TO 5
054   EQU S_CMT4 TO 6
055   EQU S_CMT5 TO 7
056   EQU S_PREMB TO 8
057   EQU S_PREMB1 TO 9
058   EQU S_KEY TO 10
059   EQU S_BOOL1 TO 11
060   EQU S_BOOL2 TO 12
061   EQU S_VALUE TO 13
062   EQU S_END1 TO 14
063   EQU S_ENDKEY TO 15
064   EQU S_SYM1 TO 16
065   EQU S_SYM2 TO 17
066   EQU S_SYM3 TO 18
067   EQU S_ATTR TO 19
068   EQU S_PROCBOOL TO -2
069   EQU S_PROCVALU TO -3
070   EQU S_PROCSYM TO -4
071   EQU S_LENGTH TO 19
072   EQU S_DSTRING TO 20; * DOUBLE QUOTED STRING
073   EQU S_SSTRING TO 21; * SINGLE QUOTED STRING
074   STATE.TXT = 'START,ENTER,CMT,CMT2,CMT3,CMT4,CMT5,PREM,PREMB1,KEY,BOOL1,BOOL2,VALUE,END1,ENDKEY,SYM1,SYM2,SYM3,ATTR'
075 *
076 * TRANSITIONS
077 *       A-Z    0-9    _    /    <    >    !    <SP>   "    #    &    '    :    ;    =    ?    [    ]    \    -     ETC
078 *START                         ENT             START
079 *ENTER  KEY                              CMT   ENTER                                    PREMB
080 *CMT                                                                                                        CMT2
081 *CMT2                                                                                                       CMT3
082 *CMT3                                                                                                       CMT4
083 *CMT4                                                                                                       CMT5
084 *CMT5                              START
085 *PREMB                                                                                 PREMB1
086 *PREMB1                            START
087 *KEY     KEY    KEY   KEY          VALUE       BOOL1                     KEY
088 *BOOL1                    BOOL2    VALUE
089 *BOOL2                             PROCBOOL
090 *VALUE                        END1                             SYM1
091 *END1                    ENDKEY
092 *ENDKEY                            PROCVALU
093 *SYM1  SYM2                                               SYM3
094 *SYM2  SYM2                                                                 PROCSYM
095 *SYM3  SYM3   SYM3                                                          PROCSYM
096 *ATTR                     BOOL2     VALU
097 *
098   DIM STATES(S_LENGTH,C_LENGTH)
099   IF TFLG AND 0 THEN
100 PRINT 'BUILDING STATE ARRAY'
101     MAT STATES = 0
102 * SETUP STATES THAT LOOP UNTIL EXIT CHARACTER IS FOUND
103     FOR I = 1 TO C_ETC
104       STATES(S_CMT3,I) = S_CMT3
105       STATES(S_CMT4,I) = S_CMT3
106       STATES(S_CMT5,I) = S_CMT3
107       STATES(S_PREMB,I) = S_PREMB
108       STATES(S_VALUE,I) = S_VALUE
109       STATES(S_ENDKEY,I) = S_ENDKEY
110       STATES(S_ATTR,I) = S_ATTR
111       STATES(S_BOOL1,I) = S_ATTR
112     NEXT I
113     STATES(S_START,C_OPEN) = S_ENTER
114     STATES(S_START,C_WHITE) = S_START
115     STATES(S_ENTER,C_ALPHA) = S_KEY
116     STATES(S_ENTER,C_UNDER) = S_KEY
117     STATES(S_ENTER,C_BANG) = S_CMT
118     STATES(S_ENTER,C_WHITE) = S_ENTER
119     STATES(S_ENTER,C_QUES) = S_PREMB
120     STATES(S_ENTER,C_SLASH) = S_ENDKEY
121     STATES(S_CMT,C_DASH) = S_CMT2
122     STATES(S_CMT2,C_DASH) = S_CMT3
123     STATES(S_CMT3,C_DASH) = S_CMT4; * SPECIAL HANDLING HERE - ANYTHING ELSE IS SKIPPED
124     STATES(S_CMT4,C_DASH) = S_CMT5
125     STATES(S_CMT5,C_CLOSE) = S_START
126     STATES(S_PREMB,C_QUES) = S_PREMB1; * ANYTHING ELSE IS SKIPPED
127     STATES(S_PREMB1,C_CLOSE) = S_START
128     STATES(S_KEY,C_ALPHA) = S_KEY
129     STATES(S_KEY,C_DIGIT) = S_KEY
130     STATES(S_KEY,C_UNDER) = S_KEY
131     STATES(S_KEY,C_COLON) = S_KEY
132     STATES(S_KEY,C_CLOSE) = S_VALUE
133     STATES(S_KEY,C_WHITE) = S_BOOL1
134     STATES(S_BOOL1,C_WHITE) = S_BOOL1
135     STATES(S_BOOL1,C_SLASH) = S_BOOL2
136     STATES(S_BOOL1,C_CLOSE) = S_VALUE
137     STATES(S_BOOL2,C_CLOSE) = S_PROCBOOL
138     STATES(S_VALUE,C_OPEN) = S_END1
139     STATES(S_END1,C_SLASH) = S_ENDKEY
140     STATES(S_END1,C_ALPHA) = S_KEY
141     STATES(S_END1,C_UNDER) = S_KEY
142     STATES(S_ENDKEY,C_CLOSE) = S_PROCVALU
143     STATES(S_VALUE,C_AMP) = S_SYM1
144     STATES(S_SYM1,C_ALPHA) = S_SYM2
145     STATES(S_SYM1,C_HASH) = S_SYM3
146     STATES(S_SYM2,C_ALPHA) = S_SYM2
147     STATES(S_SYM2,C_SEMICOLON) = S_PROCSYM
148     STATES(S_SYM3,C_ALPHA) = S_SYM3
149     STATES(S_SYM3,C_DIGIT) = S_SYM3
150     STATES(S_SYM3,C_SEMICOLON) = S_PROCSYM
151     STATES(S_ATTR,C_SLASH) = S_BOOL2
152     STATES(S_ATTR,C_CLOSE) = S_VALUE
153     IF TFLG THEN
154     MATBUILD TMP FROM STATES USING ','
155     CALL FORMS.STREAM(1)
156     PRINT TMP
157     CALL FORMS.STREAM(0)
158     END
159   END ELSE
160     MATPARSE STATES FROM '0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,10,15,0,0,3,2,0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    ,0,0,0,5,0,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,7,5,5,5,5,5,5,1,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,9,8,8,8,8,8,0,0,0,0
    ,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,10,10,0,0,13,0,11,0,0,0,0,10,0,0,0,0,0,0,0,0,19,19,19,12,19,13,19,11,19,19,19,19,19,19,19,19,19,19,19,19,19,0,0,0,0,0,-2,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    ,0,13,13,13,13,14,13,13,13,13,13,16,13,13,13,13,13,13,13,13,13,13,10,0,10,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,15,15,15,15,15,-3,15,15,15,15,15,15,15,15,15,15,15,15,15,15,15,17,0,0,0,0,0
    ,0,0,0,18,0,0,0,0,0,0,0,0,0,0,0,17,0,0,0,0,0,0,0,0,0,0,0,0,-4,0,0,0,0,0,0,0,18,18,0,0,0,0,0,0,0,0,0,0,0,-4,0,0,0,0,0,0,0,19,19,19,12,19,13,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19' U
    SING ','
161   END
162 * MODES
163 *
164 * TYPES (FROM PHP)
165 * AND USED IN RESULT TO INDICATE TYPE OF EACH ELEMENT
166   EQU IS_NULL TO         0
167   EQU IS_LONG TO         1
168   EQU IS_DOUBLE TO       2
169   EQU IS_BOOL TO         3
170   EQU IS_ARRAY TO        4
171   EQU IS_OBJECT TO       5
172   EQU IS_STRING TO       6
173 *
174 * SPECIAL SYMBOLS
175 *
176   DIM SYMBOLS(4)
177   SYMBOLS(1) = 'amp]lt]gt]quot'
178   SYMBOLS(2) = '&]<]>]"'
179   SYMBOLS(3) = 'a]am]amp]l]lt]g]gt]q]qu]quo]quot'
180   SYMBOLS(4) = '0123456789ABCDEFabcdef'
181 *
182 *
183   EQU CHUNK.SIZE TO 512
184 *
185 * INITIALIZE RESULTING OBJECT IN RBUF
186   RSIZE = 1000
187   DIM RBUF(RSIZE)
188   RBUF = ''
189   RP = 1
190   RBUF(1) = IS_OBJECT
191 *
192 * INITIALIZE STRING BUFFER TO 16 SEGMENTS
193   SSIZE = 16
194   DIM SBUF(SSIZE)
195 *
196 * INITIALIZE RUNTIME PARAMETERS
197   STACK = ''
198   XLEN = LEN(XML)
199   STATE = S_START
200   KEY = ''
201   GOSUB RESET.VAL
202   SYM = ''
203   ATTR = ''
204   JSON = ''
205   STR.STATE = ''
206 * PROCESS IN CHUNKS TO COMBAT PICK STRING HANDLING
207   FOR I = 1 TO XLEN STEP CHUNK.SIZE
208     IXML = XML[I,CHUNK.SIZE]
209     XXLEN = LEN(IXML)
210     FOR J = 1 TO XXLEN
211       CH = IXML[J,1]
212       IF CH = CHAR(254) THEN CH = CHAR(10)
213       CHVAL = SEQ(CH)
214       IF CHVAL > 128 THEN CHCLS = C_ETC ELSE CHCLS = CHARS(CHVAL+1)
215       IF CHCLS = 0 THEN CHCLS = C_ETC
216       IF STATE = S_ATTR AND CHCLS = C_DQUOTE THEN
217         STR.STATE = STATE
218         STATE = S_DSTRING
219         CONTINUE
220       END ELSE IF STATE = S_ATTR AND CHCLS = C_SQUOTE THEN
221         STR.STATE = STATE
222         STATE = S_SSTRING
223         CONTINUE
224       END ELSE IF STATE = S_DSTRING THEN
225         IF CHCLS = C_DQUOTE THEN STATE = STR.STATE
226         CONTINUE
227       END ELSE IF STATE = S_SSTRING THEN
228         IF CHCLS = C_SQUOTE THEN STATE = STR.STATE
229         CONTINUE
230       END ELSE
231         NEXT.STATE = STATES(STATE,CHCLS)
232       END
233       IF NEXT.STATE > 0 THEN
234         ON NEXT.STATE GOSUB DO.START,DO.ENTER,DO.CMT,DO.CMT2,DO.CMT3,DO.CMT4,DO.CMT5,DO.PREMB,DO.PREMB1,DO.KEY,DO.BOOL1,DO.BOOL2,DO.VALUE,DO.END1,DO.ENDKEY,DO.SYM1,DO.SYM2,DO.SYM3,DO.ATT
    R
235         STATE = NEXT.STATE
236       END ELSE IF NEXT.STATE < 0 THEN
237         ON ABS(NEXT.STATE)-1 GOSUB DO.PROCBOOL,DO.PROCVALU,DO.PROCSYM
238       END ELSE
239 DO.ERROR:
240         PRINT 'ERROR'
241         PRINT '@ ':I:':':J
242         PRINT CH,CHVAL,CHCLS
243         PRINT STATE:' ':FIELD(STATE.TXT,',',STATE)
244         PRINT IXML[J,32]
245         JSON=''
246         RETURN
247       END
248     NEXT J
249   NEXT I
250   IF RP THEN MATBUILD JSON FROM RBUF,1,RP
251 *
252   RETURN
253 DO.START:
254   RETURN
255 * START A NEW KEY
256 DO.ENTER:
257   KEY = ''
258   ATTR = ''
259   GOSUB RESET.VAL
260   RETURN
261 *
262 * SKIP OVER A COMMENT
263 DO.CMT:
264   RETURN
265 DO.CMT2:
266   RETURN
267 DO.CMT3:
268   RETURN
269 DO.CMT4:
270   RETURN
271 DO.CMT5:
272   RETURN
273 *
274 * SKIP OVER THE XML PREAMBLE
275 DO.PREMB:
276   RETURN
277 DO.PREMB1:
278   RETURN
279 *
280 * ACCUMULATE A KEY
281 DO.KEY:
282   IF CH = ':' THEN
283     KEY = ''; * DISCARD NAMESPACE
284   END ELSE
285     KEY := CH
286   END
287   RETURN
288 *
289 * PROCESS A POSSIBLE BOOLEAN VALUE
290 DO.BOOL1:
291   RETURN
292 DO.BOOL2:
293   RETURN
294 *
295 * KEY OBTAINED
296 * UPDATE THE PATH
297 * AND ACCUMULATE A VALUE
298 DO.VALUE:
299   IF CHCLS = C_CLOSE THEN
300 * CHECK TOP OF THE STACK
301 * IF NOT AN OBJECT, MAKE IT AN OBJECT NOW
302     IF STACK='' THEN M = 1 ELSE M = STACK<1>
303     IF RBUF(M)<1,1> = IS_ARRAY THEN
304       GOSUB ADD.NODE
305       RBUF(M)<1,2,-1> = N
306       M = N
307       GOSUB ADD.NODE
308       RBUF(M) = IS_OBJECT
309       RBUF(M)<1,2> = KEY
310       RBUF(M)<1,3> = N
311     END ELSE
312       IF RBUF(M)<1,1> # IS_OBJECT THEN RBUF(M) = IS_OBJECT
313       GOSUB ADD.NODE
314       GOSUB ADD.KEY
315     END
316     INS N BEFORE STACK<1>
317     GOSUB RESET.VAL
318     KEY = ''
319     ATTR = ''
320   END ELSE
321     GOSUB APPEND.CHAR
322   END
323   RETURN
324 *
325 *
326 * PROCESS END OF A KEY
327 DO.END1:
328   RETURN
329 DO.ENDKEY:
330   IF CHCLS = C_SLASH THEN
331     KEY = ''
332     ATTR = ''
333   END ELSE IF CH = ':' THEN
334     KEY = ''; * DISCARD NAMESPACE
335   END ELSE
336     KEY := CH
337   END
338   RETURN
339 *
340 * PROCESS A KEY WITH BOOLEAN VALUE
341 * OF THE FORM <NAME />
342 * SET NAME TO BE A BOOLEAN WITH TRUE VALUE
343 DO.PROCBOOL:
344   M = STACK<1>
345   IF RBUF(M)<1,1> # IS_OBJECT THEN RBUF(M) = IS_OBJECT
346   GOSUB ADD.NODE
347   GOSUB ADD.KEY
348   RBUF(N) = IS_BOOL
349   RBUF(N)<1,2> = 1
350   KEY = ''
351   ATTR = ''
352   STATE = S_START
353   RETURN
354 *
355 *
356 * PROCESS A SCALAR VALUE
357 * DESTINATION WAS PLACED ONTO STACK BY VALUE
358 * IF IT'S STILL NULL, THEN IT IS A SCALAR THAT WE NEED TO UPDATE
359 * DETERMINE THE TYPE AND UPDATE THE VALUE
360 DO.PROCVALU:
361 * KEY COMES FROM ENDKEY
362 *   IT SHOULD MATCH THE KEY BEING UPDATED BUT WE'RE NOT CHECKING
363 * VAL WAS BUILT UP
364 * TOP OF STACK IS WHERE THIS VALUE GOES
365 * DETERMININE IT'S TYPE
366 * AND UPDATE THE RESULT
367   N = STACK<1>
368   DEL STACK<1>
369   GOSUB GET.VAL
370   IF RBUF(N) = IS_NULL THEN
371     RBUF(N) = IS_STRING
372     IF VAL MATCHES '1N0N' OR VAL MATCHES '-1N0N' THEN
373       RBUF(N) = IS_LONG
374     END ELSE IF VAL MATCHES '1N0N.1N0N' OR VAL MATCHES '-1N0N.1N0N' THEN
375       RBUF(N) = IS_DOUBLE
376     END
377     RBUF(N)<1,2> = VAL
378   END
379   KEY = ''
380   ATTR = ''
381   GOSUB RESET.VAL
382   STATE = S_START
383   RETURN
384 *
385 * HTML ENTITY ENCOUNTERED
386 * PARSE AND CHANGE IT TO THE DESIRED CHARACTER
387 DO.SYM1:
388   SYM = ''
389   RETURN
390 DO.SYM2:
391   SYM := CH
392   LOCATE SYM IN SYMBOLS(3)<1> SETTING P ELSE RETURN TO DO.ERROR
393   RETURN
394 DO.SYM3:
395   IF CH = '#' THEN
396     SYM = CH
397   END ELSE IF INDEX(SYMBOLS(4),CH,1) THEN
398     SYM := CH
399   END ELSE
400     RETURN TO DO.ERROR
401   END
402   RETURN
403 *
404 * COMPLETE TRANSLATION OF HTML ENTITIES
405 * ONLY CERTAIN ENTITIES ARE SUPPORTED
406 * SEE THE SYMBOL ARRAY FOR DETAILS
407 DO.PROCSYM:
408 * DEDUCE SYMBOL
409   IF SYM[1,1]='#' THEN
410     SYM = CHAR(XTD(SYM[2,LEN(SYM)]))
411   END ELSE
412     LOCATE SYM IN SYMBOLS(1)<1> SETTING P ELSE RETURN TO DO.ERROR
413     SYM = SYMBOLS(2)<1,P>
414   END
415   CH = SYM
416   GOSUB APPEND.CHAR
417   SYM = ''
418   STATE = S_VALUE
419   RETURN
420 *
421 * ACCUMULATE A TAGS ATTRIBUTES
422 * JSON HAS NO NEED FOR THESE SO THEY WILL BE DISCARDED
423 DO.ATTR:
424   ATTR := CH
425   RETURN
426 *
427 *
428 * HANDLE QUOTED STRINGS
429 DO.DSTRING:
430 DO.SSTRING:
431 *
432 * APPEND CHAR TO VAL
433 *
434 APPEND.CHAR:
435   IF LEN(SBUF(SP)) GE CHUNK.SIZE THEN
436     SP += 1
437     IF SP GE SSIZE THEN
438       SSIZE *= 2
439       TMP = SBUF
440       DIM SBUF(SSIZE)
441       SBUF = TMP
442     END
443   END
444   SBUF(SP) := CH
445   RETURN
446 *
447 RESET.VAL:
448   VAL = ''
449   SP = 1
450   SBUF = ''
451   RETURN
452 *
453 GET.VAL:
454   MATBUILD VAL FROM SBUF
455   CONVERT CHAR(254) TO '' IN VAL
456   RETURN
457 *
458 * ADD A NEW NODE TO THE RESULT
459 * RESIZE AS NEEDED
460 * RETURNS N - POSITION OF THE NEW NODE
461 ADD.NODE:
462   IF RP+1 > RSIZE THEN
463     RSIZE *= 2
464     DIM RBUF(RSIZE)
465   END
466   RP += 1
467   N = RP
468   RBUF(N) = IS_NULL
469   RETURN
470 *
471 * ADD KEY AS A NODE OR ARRAY
472 * M IS PARENT, N IS NODE
473 * IF ARRAY, UPDATE ARRAY AND IF NECESSARY GET ANOTHER NODE
474 ADD.KEY:
475   LOCATE KEY IN RBUF(M)<1,2> SETTING PV THEN
476     RBUF(N) = RBUF(M)
477     RBUF(M) = IS_ARRAY
478     RBUF(M)<1,2> = N
479     GOSUB ADD.NODE
480     RBUF(N) = IS_OBJECT
481     RBUF(N)<1,2> = KEY
482     RBUF(M)<1,2,2> = N
483     M = N
484     GOSUB ADD.NODE
485     RBUF(M)<1,3> = N
486   END ELSE
487     RBUF(M)<1,2,-1> = KEY
488     RBUF(M)<1,3,-1> = N
489   END
490   RETURN
491 DUMP:
492   FOR DI = 1 TO RP
493     PRINT DI'R%3 ':RBUF(DI)
494   NEXT DI
495   PRINT STR('-',10)
496   RETURN
497 END


This creates a dynamic array with named keys.  Then I have a subroutine that I use to access individual elements of the structure this decode creates:

GET.JSON
L9999
001 SUBROUTINE (JSON,KEY,VAL)
002 *
003 * GET AN ELEMENT, ARRAY, OR OBJECT FROM A JSON OBJECT
004 *
005 * IF JSON IS {} THEN KEY IS STRING (WITH MULTIPLE KEYS SEPARATED WITH DOT (.))
006 *  IF AN ELEMENT IS AN ARRAY, USE THE INDEX INSTEAD OF A NAME
007 *  FOR EXAMPLE
008 *    {carriers:[{id:"x"},{id:"y"},{id:"z"}]}
009 *    COULD USE carriers.0.id TO GET THE ID OF FIRST ELEMENT
010 *    OR ARRAY NOTATION carriers[0].id
011 * IF JSON IS [] THEN KEY IS AN INDEX STARTING AT 0
012 *
013 * DEPENDING ON THE SELECTION, VAL CAN BE A SCALAR, ARRAY, OR OBJECT
014 * IF ARRAY OR OBJECT, THE RESULT IS MODIFIED TO INDEX CORRECTLY TO ITSELF
015 * THUS IT IS ANOTHER JSON OBJECT
016 *
017 * TYPES (FROM PHP)
018 * AND USED IN RESULT TO INDICATE TYPE OF EACH ELEMENT
019   EQU IS_NULL TO         0
020   EQU IS_LONG TO         1
021   EQU IS_DOUBLE TO       2
022   EQU IS_BOOL TO         3
023   EQU IS_ARRAY TO        4
024   EQU IS_OBJECT TO       5
025   EQU IS_STRING TO       6
026 *
027 * FIND THE POSITION OF THE KEY PATH
028 * IF ARRAY, SHOULD BE A NUMERIC ZERO BASED INDEX
029 * SEPERATE COMPONENTS WITH DOTS (.)
030 * IF NOT FOUND, RETURN NULL
031   VAL = ''
032 *
033 * TO SPEED IT UP, MAKE A COPY AS A DIMENSIONED ARRAY
034   M = DCOUNT(JSON,CHAR(254))
035   IF M < 1 THEN RETURN
036   DIM JSONP(M); JSONP = JSON
037   TYPE = JSONP(1)<1,1>
038   N = 1
039   IF TYPE=IS_ARRAY THEN
040     IF KEY MATCHES '1N0N' THEN N = JSONP(N)<1,2,KEY+1> ELSE RETURN
041   END ELSE IF TYPE=IS_OBJECT THEN
042     CONVERT '[]' TO '.' IN KEY; * CONVERT SHORT HAND ARRAY INDEXING TO DOTS
043                                 * AND REMOVE CLOSING BRACKET (])
044     FOR I = 1 TO DCOUNT(KEY,'.')
045       K = FIELD(KEY,'.',I)
046       IF JSONP(N)<1,1> = IS_ARRAY THEN
047         IF K MATCHES '1N0N' THEN
048           N = JSONP(N)<1,2,K+1>; * ZERO INDEXED
049           IF N='' THEN RETURN; * OUT OF RANGE
050         END ELSE
051           RETURN
052         END
053       END ELSE IF JSONP(N)<1,1> = IS_OBJECT THEN
054         LOCATE K IN JSONP(N)<1,2> SETTING Q THEN N = JSONP(N)<1,3,Q> ELSE RETURN
055       END ELSE
056         RETURN
057       END
058     NEXT I
059   END
060 *
061 *
062 * WE FOUND IT. DETERMINE WHAT IT IS AND THE VALUE TO RETURN
063 * FOR ARRAY AND OBJECT, DEREFERENCE THE COMPONENTS
064 * AND REINDEX RELEVANT TO THE RESULT
065   IF N THEN
066     TYPE = JSONP(N)<1,1>
067     BEGIN CASE
068       CASE TYPE=IS_NULL; VAL = ''
069       CASE TYPE=IS_LONG; VAL = JSONP(N)<1,2>
070       CASE TYPE=IS_DOUBLE; VAL = JSONP(N)<1,2>
071       CASE TYPE=IS_BOOL; VAL = JSONP(N)<1,2>
072       CASE TYPE=IS_STRING; VAL = JSONP(N)<1,2>
073       CASE TYPE=IS_ARRAY OR TYPE=IS_OBJECT
074 * ARRAY AND OBJECT CONTAIN ATTRIBUTES INTO THE RECORD
075 * ADJUST THESE TO BE RELEVANT TO THE RESULT
076 * SOME REORDERING MAY TAKE PLACE BUT THAT'S OK
077 * AS LONG AS WE USE THE API TO ACCESS IT
078         VAL = JSONP(N)
079         P = 1; * LAST ELEMENT IN RESULT
080         N = 1; * CURRENT WORKING ITEM
081         LOOP
082           TYPE = VAL<N,1>
083           IF TYPE=IS_ARRAY THEN
084             FOR I = 1 TO DCOUNT(VAL<N,2>,CHAR(252))
085               M = VAL<N,2,I>; * THE OLD INDEX
086               P += 1
087               VAL<P> = JSONP(M)
088               VAL<N,2,I> = P
089             NEXT I
090           END ELSE IF TYPE=IS_OBJECT THEN
091             FOR I = 1 TO DCOUNT(VAL<N,3>,CHAR(252))
092               M = VAL<N,3,I>; * OLD INDEX
093               P += 1
094               VAL<P> = JSONP(M)
095               VAL<N,3,I> = P
096             NEXT I
097           END
098         WHILE VAL<N+1>#'' DO
099           N += 1
100         REPEAT
101     END CASE
102   END
103   RETURN
104 END

Tom



From: mvd...@googlegroups.com <mvd...@googlegroups.com> on behalf of John Stokka <jrst...@gmail.com>
Sent: Wednesday, December 3, 2025 11:40 AM
To: mvd...@googlegroups.com <mvd...@googlegroups.com>
Subject: Re: [mvdbms] Anyone have a data/basic xml parser?

Kevin King

unread,
Dec 3, 2025, 3:04:50 PM (yesterday) Dec 3
to mvd...@googlegroups.com
Hopefully attachments are allowable.  While this code uses a bit of U2 syntax, it requires no platform-specific functionality.


Precision Solutions XML Functions.pdf
API.XML.XPATH
API.XML.INVERT
API.XML.XPATH.SUB
API.XML.SINGLE

Kevin King

unread,
Dec 3, 2025, 3:06:50 PM (yesterday) Dec 3
to mvd...@googlegroups.com
I should also note that my routines leave things in XML until they can be converted to a name/value format.  This works around U2's limitation on the depth limitation of attributes, values, and subvalues.

John Stokka

unread,
Dec 3, 2025, 3:09:58 PM (yesterday) Dec 3
to mvd...@googlegroups.com
Tom,

That's kinda handy, converting the XML to JSON first then you have a json parser... Cool.  Thanks!

In Christ,

John R. Stokka

John Stokka

unread,
Dec 3, 2025, 3:10:29 PM (yesterday) Dec 3
to mvd...@googlegroups.com
Kevin,

That's exactly what I'm looking for!  Thanks so much!

In Christ,

John R. Stokka

On Wed, Dec 3, 2025 at 2:04 PM Kevin King <ke...@precisonline.com> wrote:
Hopefully attachments are allowable.  While this code uses a bit of U2 syntax, it requires no platform-specific functionality.


--
You received this message because you are subscribed to
the "Pick and MultiValue Databases" group.
To post, email to: mvd...@googlegroups.com
To unsubscribe, email to: mvdbms+un...@googlegroups.com
For more options, visit http://groups.google.com/group/mvdbms
---
You received this message because you are subscribed to the Google Groups "Pick and MultiValue Databases" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mvdbms+un...@googlegroups.com.

Bob Dubery

unread,
2:50 AM (14 hours ago) 2:50 AM
to Pick and MultiValue Databases
I tried that with ChatGPT. It was very good at giving me a program that would extract from a piece of XML with a known layout (which I'd already figured out). It's harder (for me and for ChatGPT) to write a generalised tool that just unpacks a piece of XML and presents it to another U2 application.

There are two many programs of the first type at my place of work, I wanted a more generalised and thus reusable solution (though really all you do is move the hard coded identifying of tags to another program). 

I don't feel I can share the code here because it's the IP of the company I currently work for, but it was not difficult to write a program that would present the XML as a "flat" dynamic array. EG 1 attribute per tag: 1st value is tag name, 2nd value is content (if any), 3rd value is sub-valued attribute names, 4th is values per attribute and so on. You lose the structure of the XML (this tag has that as a parent, has these as children) but that turned out to not be much of a problem for my application.

Some years ago I spent a couple of weeks writing a suite of subroutines that gave DOM like functions for reading or generating XML. They had names like GetNextChild, AddNextChild and so on. And maybe this is the trick. Think of the functions you need in plain language and give them good names, then figure out a data structure that will allow all those things to happen, AND THEN write a subroutine per function.

All of what I describe above was done in UniVerse BASIC, or on MVOn, with no resorting to other languages. I don't have the system privileges these days to install other languages or libraries. When I did the DOM-like functions I had root access, I had PERL available, I could install libraries if I wanted, but it was still hard to get back something that fitted well with UniVerse (I was keen that other UV programmers should be able to easily use these tools) and that's why in the end I did it all with what a not terribly recent version of UniVerse gave me. 

So it can be done. The trick is more mental than about programming chops.
Reply all
Reply to author
Forward
0 new messages