web-server: how to save database results in memory across all requests?


Wayne Harris

Aug 13, 2019, 11:47:12 AM
to racket...@googlegroups.com
I'd like to save database results in memory because my database only changes between long time intervals.  By building a minimum application, I see my cache-strategy seems to work per servlet: by opening a new browser, the first request yields a cache-miss.

Looking through the documentation, I got the idea that perhaps I should serve/servlet using

   #:servlet-namespace '("shared-model.rkt")

where shared-model.rkt is the module that talks to the database and implements the caching-strategy.  But adding this keyword to serve/servlet did not make any perceived difference.

What should I do to save results in memory across all requests?

--- server.rkt
--------------------------------------------------------------------------
#lang racket/base
(require
web-server/servlet
web-server/servlet-env
(prefix-in model: "shared-model.rkt"))

(define-values (dispatch url)
  (dispatch-rules
   (("main") db->response)
   (("update" (string-arg)) update->response)))

(define (update->response r s)
  (define str (model:in-memory-db-set-and-get s))
  (displayln str)
  (string->response str))

(define (db->response r)
  (string->response (model:get-in-memory-results)))
;; no more api endpoints ==/==

(define (string->response s)
  (response/full 200 #"Okay" (current-seconds)
                 TEXT/HTML-MIME-TYPE '() (list (string->bytes/utf-8 s))))

(define (file-not-found r)
  (response/xexpr "File not found."))

(module+ main
  (file-stream-buffer-mode (current-output-port) 'line)
  (define (main)
    (displayln "Now serving...")
    (serve/servlet dispatch
                   ;; #:servlet-namespace '("shared-model.rkt")
                   #:stateless? #t
                   #:log-file (build-path "logs/httpd.log")
                   #:port 1111
                   #:listen-ip "127.0.0.1"
                   #:servlet-path "/"
                   #:servlet-regexp #rx""
                   #:extra-files-paths (list (build-path "pub/"))
                   #:server-root-path (build-path "/")
                   #:file-not-found-responder file-not-found))
  (main))
---------------------------------------------------------------------------

--- shared-model.rkt
---------------------------------------------------------------------------
#lang racket/base
(define in-memory-database (make-parameter #f))
(define (in-memory-db-set-and-get x)
  (in-memory-database (format "~a: data to be shared across servlets" x))
  (in-memory-database))

(define (get-in-memory-results [refresh? #f])
  (if refresh?
      (begin (displayln "database: force refresh")
             (in-memory-db-set-and-get "forced"))
      (let ([in-memory-db (in-memory-database)])
        (if in-memory-db
            (begin (displayln "database: cache hit")
                   in-memory-db)
            (begin (displayln "database: cache miss")
                   (in-memory-db-set-and-get "miss"))))))
(provide (all-defined-out))
---------------------------------------------------------------------------

George Neuner

Aug 13, 2019, 1:17:23 PM
to Wayne Harris, racket users

On 8/13/2019 11:47 AM, 'Wayne Harris' via Racket Users wrote:
> I'd like to save database results in memory because my database only
> changes between long time intervals.  By building a minimum
> application, I see my cache-strategy seems to work per servlet: by
> opening a new browser, the first request yields a cache-miss.
> Looking through the documentation, I got the idea that perhaps I
> should serve/servlet using
>
>    #:servlet-namespace '("shared-model.rkt")
>
> where shared-model.rkt is the module that talks to the database and
> implements the caching-strategy.  But adding this keyword to
> serve/servlet did not make any perceived difference.
>
> What should I do to save results in memory across all requests?

AFAIK #:servlet-namespace isn't necessary -  you can share data (via
access functions) among different instances of request handlers just by
requiring the modules (files) that define the common objects wherever
you need them.

If I'm not mistaken about the serve/servlet call in your code below, you
are relying on the application to open the new browser window ...
terminating and restarting the application each time.  That clears your
"cache", guaranteeing that it misses the first time. You should set 
#:launch-browser? #f  , start the application and connect to it *from*
your browser with the URL http://localhost:1111.

That said:

What DBMS are you using?  Server based DBMS like Oracle, Postgresql,
MySQL, SQLServer, etc.  automatically cache query results in case the
same query is run again.  If the server is well provisioned memory-wise,
it can take a long time for a popular query to age out the cache.   If
your application is co-resident (on the same machine) with the server,
caching results yourself would be duplicating effort.

Hope this helps,
George

Wayne Harris

Aug 13, 2019, 2:25:01 PM
to George Neuner, racket...@googlegroups.com
On Tuesday, August 13, 2019 2:17 PM, George Neuner <gneu...@comcast.net> wrote:

> On 8/13/2019 11:47 AM, 'Wayne Harris' via Racket Users wrote:
>
> > I'd like to save database results in memory because my database only
> > changes between long time intervals.  By building a minimum
> > application, I see my cache-strategy seems to work per servlet: by
> > opening a new browser, the first request yields a cache-miss.
> > Looking through the documentation, I got the idea that perhaps I
> > should serve/servlet using
> > #:servlet-namespace '("shared-model.rkt")
> > where shared-model.rkt is the module that talks to the database and
> > implements the caching-strategy.  But adding this keyword to
> > serve/servlet did not make any perceived difference.
> > What should I do to save results in memory across all requests?
>
> AFAIK #:servlet-namespace isn't necessary -  you can share data (via
> access functions) among different instances of request handlers just by
> requiring the modules (files) that define the common objects wherever
> you need them.

I think it's what I'm doing below --- I require shared-model.rkt and call

(get-in-memory-results)

when I need it.

> If I'm not mistaken about the serve/servlet call in your code below, you
> are relying on the application to open the new browser window ...
> terminating and restarting the application each time.  That clears your
> "cache", guaranteeing that it misses the first time. You should set 
> #:launch-browser? #f  , start the application and connect to it from
> your browser with the URL http://localhost:1111.

The documentation doesn't say that #:launch-browser? would have this effect. Nevertheless, I tried it with #:launch-browser? #f, but the behavior was the same.

(define (main)
  (displayln "Now serving...")
  (serve/servlet dispatch
                 #:launch-browser? #f
                 #:stateless? #t
                 #:log-file (build-path "logs/httpd.log")
                 #:port 1111
                 #:listen-ip "127.0.0.1"
                 #:servlet-path "/"
                 #:servlet-regexp #rx""
                 #:extra-files-paths (list (build-path "pub/"))
                 #:server-root-path (build-path "/")
                 #:file-not-found-responder file-not-found))

A browser window didn't open. With Chrome, I visited localhost:1111/main a first time and refreshed it. (So I got a cache miss followed by a cache hit.) With Firefox, I did the same and got a cache miss on the first request. I expected a cache hit.

$ racket share.rkt
Now serving...
Your Web application is running at http://localhost:1111.
Stop this program at any time to terminate the Web Server.
database: cache miss
database: cache hit
database: cache miss
database: cache hit

> That said:
>
> What DBMS are you using? 

I'm using sqlite and in other applications I store lists on disk. These are very small applications and it's useful not to depend on other software and libraries.

> [...] Server based DBMS like Oracle, Postgresql,
> MySQL, SQLServer, etc.  automatically cache query results in case the
> same query is run again.  If the server is well provisioned memory-wise,
> it can take a long time for a popular query to age out the cache.   If
> your application is co-resident (on the same machine) with the server,
> caching results yourself would be duplicating effort.

Thanks for the info!

> > --- server.rkt
> > --- shared-model.rkt

Jay McCarthy

Aug 13, 2019, 4:08:04 PM
to Wayne Harris, racket...@googlegroups.com
Hi Wayne,

Your `in-memory-database` is a parameter. Parameters are
thread-specific storage [1]. Every request in the web-server is
handled by a different thread, so I think this will not work how you
think it should.

Jay

1. https://docs.racket-lang.org/reference/parameters.html#%28form._%28%28lib._racket%2Fprivate%2Fmore-scheme..rkt%29._parameterize%29%29
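
To make the point above concrete, here is a minimal sketch (not taken from the original code; the names `p`, `b`, and `worker` are made up for illustration). It assumes plain Racket threads stand in for the web server's request-handling threads: a value assigned to a parameter inside one thread is not seen by another thread, while a value stored in a box is.

```racket
#lang racket/base

;; Hypothetical names, for illustration only.
(define p (make-parameter #f))   ; parameter: assignment is per-thread
(define b (box #f))              ; box: one shared mutable cell

;; The worker plays the role of one request-handling thread.
(define worker
  (thread (λ ()
            (p "set in worker")            ; changes p only for this thread
            (set-box! b "set in worker"))))  ; changes the shared box
(thread-wait worker)

;; Back in the main thread (think: a later request on another thread).
(define seen-in-parameter (p))
(define seen-in-box (unbox b))
(printf "parameter: ~s\n" seen-in-parameter)  ; prints: parameter: #f
(printf "box: ~s\n" seen-in-box)              ; prints: box: "set in worker"
```

This matches the miss/hit/miss/hit pattern reported earlier in the thread: each new handler thread starts from the parameter's original value.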

--
Jay McCarthy
Associate Professor @ CS @ UMass Lowell
http://jeapostrophe.github.io
Vincit qui se vincit.

George Neuner

Aug 13, 2019, 4:46:29 PM
to Wayne Harris, racket users

On 8/13/2019 2:24 PM, 'Wayne Harris' via Racket Users wrote:
> With Chrome, I visited localhost:1111/main a first time and refreshed it. (So I got a cache miss followed by a cache hit.) With Firefox, I did the same and got a cache miss on the first request. I expected a cache hit.
>
> $ racket share.rkt
> Now serving...
> Your Web application is running at http://localhost:1111.
> Stop this program at any time to terminate the Web Server.
> database: cache miss
> database: cache hit
> database: cache miss
> database: cache hit

Sorry ... I don't use parameters much and had to look them up. The
problem is that each browser request is handled by a separate thread, so
the 2 browsers are using different threads and different instances of
the parameter (which defaults to #f = miss).

TCP connections are (relatively) heavy weight, so browsers try to keep
server connections open for subsequent requests.  When you refresh the
page relatively quickly, it's likely you get an already existing handler
thread back again.  If you wait a bit before refreshing the page - so
the connection closes - you may see multiple misses even with just one
browser.

The answer is: don't use parameters for this purpose.  Use some other
object, such as a hash table or another thread-safe object that is defined
in terms of get/set accessors.  Something I have used for this in
the past is:

;;
;;  mutual exclusion shared object
;;
(define-syntax getter/setter!
  (syntax-rules ()
    ((getter/setter!)
     ; -- start template
     (let* ([mtx (make-semaphore 1)]   ; binary semaphore used as a mutex
            [var null])                ; the shared value, initially null
       (lambda x
         (call-with-semaphore mtx
           (lambda ()
             (cond
               [(null? x) var]                     ; no argument: get
               [(pair? x) (set! var (car x)) var]  ; argument: set, then return it
               [else (error 'getter/setter! "bad arguments")])))))
     ; -- end template
     )))


You define shared objects with getter/setter! and then use them much like
Racket's parameters, except that there is only one instance of each
object, shared by all its users.  E.g., if you were to add the macro above
to "shared-model.rkt" and use it like:

    (define in-memory-database (getter/setter!))
    (in-memory-database #f)  ;; set initial value

then it all should act as you expect.  Since the macro actually creates
a closure (which is a function), you can directly export the shared
"object" from the module without needing other wrapper functions to use it.
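
For readers who want to try the idea without the macro, here is a rough sketch of the same technique written as an ordinary function. The name `make-shared-cell` and the optional initial value are assumptions made for this example, not part of the code above. It shows a value set in one thread being visible from another.

```racket
#lang racket/base

;; Hypothetical function version of the getter/setter! idea:
;; one mutable variable guarded by a semaphore used as a mutex.
(define (make-shared-cell [init #f])
  (define mtx (make-semaphore 1))
  (define var init)
  (case-lambda
    [()  (call-with-semaphore mtx (λ () var))]               ; get
    [(v) (call-with-semaphore mtx (λ () (set! var v) v))]))  ; set

(define in-memory-database (make-shared-cell))

;; A writer thread stores a value ...
(define writer (thread (λ () (in-memory-database "fresh data"))))
(thread-wait writer)

;; ... and the main thread reads the same value back.
(define seen-by-main (in-memory-database))
(printf "main thread sees: ~s\n" seen-by-main)  ; prints: main thread sees: "fresh data"
```

Unlike a parameter, the closure holds a single variable, so every thread that calls it reads and writes the same storage.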


Hope this helps,
George

Wayne Harris

Aug 13, 2019, 7:53:47 PM
to Jay McCarthy, racket...@googlegroups.com
Is there an example somewhere showing how this could be done? My wish is to have one thing (a thread or something) periodically updating data (say every 30 minutes) and all servlets handling http connections reading. It is perfectly fine for me if while something writes the data, everything else is blocked waiting.

Having said that, I think this might be getting out of my league. (I'm reading about events in the hope I can find a way.) I have a very basic understanding of Racket's primitives and have never done anything involving threads or any kind of concurrency. If there's no trivial way to do that, I'll leave it for some other future project.

On Tuesday, August 13, 2019 5:07 PM, Jay McCarthy <jay.mc...@gmail.com> wrote:

> Hi Wayne,
>
> Your `in-memory-database` is a parameter. Parameters are
> thread-specific storage [1]. Every request in the web-server is
> handled by a different thread, so I think this will not work how you
> think it should.
>
> Jay
>
> 1. https://docs.racket-lang.org/reference/parameters.html#(form._((lib._racket%2Fprivate%2Fmore-scheme..rkt)._parameterize))

Jay McCarthy

Aug 13, 2019, 8:16:59 PM
to Wayne Harris, racket...@googlegroups.com
I think it is pretty simple.

```
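;; Assumption: refresh period in seconds. `interval` was left undefined
;; in the original; 30 minutes matches the period mentioned in the question.
(define interval (* 30 60))

(define (make-periodically-updating-value compute1)
  ;; Compute once up front so readers never see an empty cache.
  (define the-data (box (compute1)))
  ;; Background thread that refreshes the box forever.
  (define the-updater-t
    (thread
     (λ ()
       (let loop ()
         (set-box! the-data (compute1))
         (sleep interval)
         (loop)))))
  ;; The returned reader just unboxes the current value.
  (λ ()
    (unbox the-data)))

(define get-the-data/cache
  (make-periodically-updating-value get-the-data/for-realsies))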
```

Box mutation is atomic, so you don't need locks or anything. It would
be more complicated if you want to not compute it initially.
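
One possible reading of "not compute it initially" is to defer the first computation until the first request. The sketch below is only an illustration of that reading, not something proposed in the thread; `make-lazy-cached-value` and `computed?` are hypothetical names. Because "check, then compute, then store" is a compound operation, it is guarded with a semaphore so that two handler threads cannot both run the computation.

```racket
#lang racket/base

;; Hypothetical helper: compute the value on first use, then cache it.
(define (make-lazy-cached-value compute)
  (define lock (make-semaphore 1))
  (define the-data (box #f))
  (define computed? #f)
  (λ ()
    (call-with-semaphore lock
      (λ ()
        (unless computed?            ; first caller pays for the computation
          (set-box! the-data (compute))
          (set! computed? #t))
        (unbox the-data)))))

;; Count how many times the "database" is really consulted.
(define calls 0)
(define get-data
  (make-lazy-cached-value
   (λ ()
     (set! calls (add1 calls))
     "rows from the database")))

(define first-result (get-data))   ; cache miss: runs compute
(define second-result (get-data))  ; cache hit: no recomputation
(printf "calls: ~a\n" calls)       ; prints: calls: 1
```

A separate `computed?` flag is used instead of testing the box for #f so that a legitimately false result is still cached.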


Wayne Harris

Aug 13, 2019, 8:40:00 PM
to Jay McCarthy, racket...@googlegroups.com
Thanks! Here's what I did in shared-model.rkt:

(define *db* (box #f))
(define (get-in-memory-results)
  (let ([in-mem (unbox *db*)])
    (if in-mem
        (begin (displayln "database: cache hit")
               in-mem)
        (begin (displayln "database: cache miss")
               (set-box! *db* "initial data")
               (unbox *db*)))))

(define (set-in-memory-results! s)
  (set-box! *db* s))

The only place in

https://docs.racket-lang.org/reference/boxes.html

that mentions the word "atomic" is in box-cas!

https://docs.racket-lang.org/reference/boxes.html#%28def._%28%28quote._~23~25kernel%29._box-cas%21%29%29

How would I know box mutation is atomic?

Is structure mutation atomic? After sending my previous message, I
remembered reading about structure mutation in How to Design Programs.
I looked it up again and wrote the following code, which also
worked as I expected --- but I don't know if the mutation is atomic.

(define-struct database (data) #:mutable #:transparent)
(define *db* (make-database #f))

(define (get-in-memory-results)
  (let ([in-mem (database-data *db*)])
    (if (string? in-mem)
        (begin (displayln "database: cache hit")
               in-mem)
        (begin (displayln "database: cache miss")
               (set-database-data! *db* "fresh data")
               (database-data *db*)))))

(define (set-in-memory-results! s)
  (set-database-data! *db* s))


Philip McGrath

Aug 13, 2019, 8:51:19 PM
to Wayne Harris, Jay McCarthy, racket...@googlegroups.com
The relevant chapter of the reference is "Concurrency and Parallelism" (https://docs.racket-lang.org/reference/threads.html):
> All constant-time procedures and operations provided by Racket are thread-safe because they are atomic. For example, set! assigns to a variable as an atomic action with respect to all threads, so that no thread can see a “half-assigned” variable. Similarly, vector-set! assigns to a vector atomically. The hash-set! procedure is not atomic, but the table is protected by a lock; see Hash Tables for more information.

The way that `box-cas!` is special is that—like a hardware compare-and-set operation—the compound operation is atomic, so no other thread/future can interfere between the "compare" step and the "set" step.
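
As a rough illustration of that difference, the sketch below builds a read-modify-write update out of `box-cas!` with a retry loop. The helper name `box-update!` is made up for this example. A plain `(set-box! b (add1 (unbox b)))` is two separate atomic steps, so another thread could interleave between them; the compare-and-set version retries whenever the box changed in between. Note that `box-cas!` compares with `eq?`, which is fine for the small integers used here.

```racket
#lang racket/base

;; Hypothetical helper: apply f to the box's contents atomically,
;; retrying if another thread changed the box in the meantime.
(define (box-update! b f)
  (let loop ()
    (define old (unbox b))
    (unless (box-cas! b old (f old))
      (loop))))

(define counter (box 0))

;; Four threads, each incrementing the shared counter 1000 times.
(define workers
  (for/list ([i (in-range 4)])
    (thread (λ ()
              (for ([j (in-range 1000)])
                (box-update! counter add1))))))
(for-each thread-wait workers)

(printf "final count: ~a\n" (unbox counter))  ; prints: final count: 4000
```

For the cache in this thread, where a single writer replaces the whole value and readers only call `unbox`, plain `set-box!` is enough; the retry loop matters only when the new value depends on the old one.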

-Philip

Wayne Harris

Aug 13, 2019, 9:45:26 PM
to Philip McGrath, Jay McCarthy, racket...@googlegroups.com
On Tuesday, August 13, 2019 9:51 PM, Philip McGrath <phi...@philipmcgrath.com> wrote:

> The relevant chapter of the reference is "Concurrency and Parallelism" (https://docs.racket-lang.org/reference/threads.html):
> All constant-time procedures and operations provided by Racket are thread-safe because they are atomic. For example, set! assigns to a variable as an atomic action with respect to all threads, so that no thread can see a “half-assigned” variable. Similarly, vector-set! assigns to a vector atomically. The hash-set! procedure is not atomic, but the table is protected by a lock; see Hash Tables for more information.

How can I tell a procedure is constant-time?

Wayne Harris

Aug 14, 2019, 1:46:05 PM
to George Neuner, racket users
Very cool solution. It works as expected. Thanks for sharing! That gives
us closures which are thread-global. Cool!