I’m not that surprised :).
My guess is that our json reader could be sped up quite a bit. This looks like the heart of the read-json implementation:
(define (read-json* who i jsnull)
  ;; Follows the specification (eg, at json.org) -- no extensions.
  ;;
  (define (err fmt . args)
    (define-values [l c p] (port-next-location i))
    (raise-read-error (format "~a: ~a" who (apply format fmt args))
                      (object-name i) l c p #f))
  (define (skip-whitespace) (regexp-match? #px#"^\\s*" i))
  ;;
  ;; Reading a string *could* have been nearly trivial using the racket
  ;; reader, except that it won't handle a "\/"...
  (define (read-string)
    (define result (open-output-bytes))
    (let loop ()
      (define esc
        (let loop ()
          (define c (read-byte i))
          (cond
            [(eof-object? c) (err "unterminated string")]
            [(= c 34) #f] ;; 34 = "
            [(= c 92) (read-bytes 1 i)] ;; 92 = \
            [else (write-byte c result) (loop)])))
      (cond
        [(not esc) (bytes->string/utf-8 (get-output-bytes result))]
        [(case esc
           [(#"b") #"\b"]
           [(#"n") #"\n"]
           [(#"r") #"\r"]
           [(#"f") #"\f"]
           [(#"t") #"\t"]
           [(#"\\") #"\\"]
           [(#"\"") #"\""]
           [(#"/") #"/"]
           [else #f])
         => (λ (m) (write-bytes m result) (loop))]
        [(equal? esc #"u")
         (let* ([e (or (regexp-try-match #px#"^[a-fA-F0-9]{4}" i)
                       (err "bad string \\u escape"))]
                [e (string->number (bytes->string/utf-8 (car e)) 16)])
           (define e*
             (if (<= #xD800 e #xDFFF)
                 ;; it's the first part of a UTF-16 surrogate pair
                 (let* ([e2 (or (regexp-try-match #px#"^\\\\u([a-fA-F0-9]{4})" i)
                                (err "bad string \\u escape, ~a"
                                     "missing second half of a UTF16 pair"))]
                        [e2 (string->number (bytes->string/utf-8 (cadr e2)) 16)])
                   (if (<= #xDC00 e2 #xDFFF)
                       (+ (arithmetic-shift (- e #xD800) 10) (- e2 #xDC00) #x10000)
                       (err "bad string \\u escape, ~a"
                            "bad second half of a UTF16 pair")))
                 e)) ; single \u escape
           (write-string (string (integer->char e*)) result)
           (loop))]
        [else (err "bad string escape: \"~a\"" esc)])))
  ;;
  (define (read-list what end-rx read-one)
    (skip-whitespace)
    (if (regexp-try-match end-rx i)
        '()
        (let loop ([l (list (read-one))])
          (skip-whitespace)
          (cond [(regexp-try-match end-rx i) (reverse l)]
                [(regexp-try-match #rx#"^," i) (loop (cons (read-one) l))]
                [else (err "error while parsing a json ~a" what)]))))
  ;;
  (define (read-hash)
    (define (read-pair)
      (define k (read-json))
      (unless (string? k) (err "non-string value used for json object key"))
      (skip-whitespace)
      (unless (regexp-try-match #rx#"^:" i)
        (err "error while parsing a json object pair"))
      (list (string->symbol k) (read-json)))
    (apply hasheq (apply append (read-list 'object #rx#"^}" read-pair))))
  ;;
  (define (read-json [top? #f])
    (skip-whitespace)
    (cond
      [(and top? (eof-object? (peek-char i))) eof]
      [(regexp-try-match #px#"^true\\b" i) #t]
      [(regexp-try-match #px#"^false\\b" i) #f]
      [(regexp-try-match #px#"^null\\b" i) jsnull]
      [(regexp-try-match
        #rx#"^-?(?:0|[1-9][0-9]*)(?:\\.[0-9]+)?(?:[eE][+-]?[0-9]+)?" i)
       => (λ (bs) (string->number (bytes->string/utf-8 (car bs))))]
      [(regexp-try-match #rx#"^[\"[{]" i)
       => (λ (m)
            (let ([m (car m)])
              (cond [(equal? m #"\"") (read-string)]
                    [(equal? m #"[") (read-list 'array #rx#"^\\]" read-json)]
                    [(equal? m #"{") (read-hash)])))]
      [else (err (format "bad input~n ~e" (peek-bytes (sub1 (error-print-width)) 0 i)))]))
  ;;
  (read-json #t))
… and my guess is that the JS performance would be similar if the JSON reader in JS were itself written in JS. I think there are probably a lot of provably unneeded checks in this code, and you could probably get rid of the byte-at-a-time reading in read-string.
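As a rough sketch of what dropping the byte-at-a-time reading might look like: the inner loop of read-string could copy the whole run of bytes up to the next quote or backslash with one regexp match, instead of one read-byte call per byte. Note that read-plain-run is a made-up helper name, not something in the json library, and I have not measured whether this is actually faster.

```racket
#lang racket/base

;; Hypothetical helper (not part of the json library): copy the run of
;; bytes before the next " or \ from port i into the output port result,
;; using a single regexp match rather than a read-byte loop.
(define (read-plain-run i result)
  ;; The pattern matches zero or more bytes that are neither " nor \,
  ;; so it always succeeds, possibly with an empty match.
  (define m (regexp-try-match #rx#"^[^\"\\\\]*" i))
  (write-bytes (car m) result)
  (bytes-length (car m)))

;; Small demonstration on an in-memory port.
(define in (open-input-bytes #"hello world\" rest"))
(define out (open-output-bytes))
(define copied (read-plain-run in out))
(define run (get-output-bytes out))   ; #"hello world"
(define next (read-byte in))          ; 34, the closing quote
```

The caller would still have to look at the byte that ended the run (quote, backslash, or eof) exactly as the existing loop does, so the escape handling stays the same.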
It would be interesting to see how much faster (if at all) the TR (Typed Racket) version of this code runs.
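Something like the following could serve as a starting point for that comparison. It only times the stock string->jsexpr; the sample data and the time-parse helper are my own inventions for illustration, and a TR variant of the reader (which I don't have) would be passed in place of string->jsexpr.

```racket
#lang racket/base
(require json)

;; Hypothetical test input: a JSON array of 10,000 small objects.
(define sample
  (jsexpr->string
   (for/list ([n (in-range 10000)])
     (hasheq 'id n 'name (format "item-~a" n)))))

;; Run one parse and return the parsed value plus elapsed milliseconds.
;; `parse` is any string -> jsexpr function, so a different reader
;; implementation can be swapped in for comparison.
(define (time-parse parse str)
  (define start (current-inexact-milliseconds))
  (define v (parse str))
  (values v (- (current-inexact-milliseconds) start)))

(define-values (parsed ms) (time-parse string->jsexpr sample))
(printf "parsed ~a items in ~a ms\n" (length parsed) ms)
```

A single run like this is noisy, so repeating the measurement a few times would be needed before drawing any conclusion.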
John