Hi,
I have to populate a triple store with a large amount of data (~38k records x 12), and there is a severe bottleneck: the speed of I/O operations. To address it I did the following:
1. To avoid creating too many threads, I put all computed results onto a channel.
2. I load the data in chunks, which is better than running a single transaction for every single record.
I tried to do this by creating a channel with a partition-all transducer, but it doesn't seem to work properly. The following mock example works when the chunk size equals the length of the data vector (i.e. 6), printing "value: [of made is fruit soup Berry]" (for now I do not care about the order).
(let [q  (a/chan 500 (partition-all 6))
      in ["Berry" "soup" "is" "made" "of" "fruit"]]
  (a/go-loop [j (a/<! q)]
    (when j
      (println "value: " j)
      (recur (a/<! q))))
  (doseq [itm in]
    (a/go (a/>! q itm))))
I cannot see what is wrong. How can I solve it? In the example above, does the chunk size have to be at most 6? I expected partition-all on a channel to work the same way as it does on a collection:
(partition-all 5 ["Berry" "soup" "is" "made" "of" "fruit"]) ==>
(("Berry" "soup" "is" "made" "of") ("fruit"))
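For comparison, here is a minimal sketch of the channel version that returns both chunks. It relies on one point: the partition-all transducer only emits its trailing partial chunk when the channel is closed, because that is when the transducer's completion step runs. The helper name `chunked` is hypothetical and only there for illustration; the puts are done from a single thread here, so unlike the example above the order is preserved.

```clojure
(require '[clojure.core.async :as a])

(defn chunked
  "Hypothetical helper: puts every item of coll onto a channel that has a
  partition-all transducer, closes the channel, and returns all chunks."
  [n coll]
  (let [q      (a/chan 500 (partition-all n))
        ;; start collecting first so the puts below are not left without a consumer
        result (a/into [] q)]
    (doseq [itm coll]
      (a/>!! q itm))
    ;; closing q runs the transducer's completion step,
    ;; which flushes the last, possibly shorter, chunk
    (a/close! q)
    (a/<!! result)))

(chunked 5 ["Berry" "soup" "is" "made" "of" "fruit"])
;; => [["Berry" "soup" "is" "made" "of"] ["fruit"]]
```

If this is right, the original example with chunk size 5 would print only the first chunk, since nothing ever closes q and "fruit" stays buffered inside the transducer.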