Implementing CNN, help needed


Ly Zhenyi

Dec 4, 2017, 10:12:50 PM12/4/17
to clojure-cortex
Hi, 

I am trying to implement a convolutional network with Cortex, following the Coursera convolutional-model-application example. However, the accuracy I got was much worse than the TensorFlow result in the Coursera tutorial.

The following is my implementation:

(ns ca.cnn.convolution-model-application
  (:require
   [think.image.patch :as patch]
   [think.hdf5.core :as hdf5]
   [cortex.nn.layers :as layers]
   [cortex.experiment.util :as eu]
   [cortex.experiment.train :as experiment-train :refer [load-network]]
   [cortex.experiment.classification :as classification]
   [cortex.optimize.adam :as adam]
   [cortex.nn.network :as network]
   [cortex.nn.execute :as execute]
   [cortex.metrics :as metrics]
   [cortex.util :as util]
   [mikera.image.core :as image]

   [clojure.string :as s]
   [clojure.java.io :as io]
   [clojure.core.matrix :as m])
  (:import
   [java.awt.image BufferedImage Raster DataBufferByte]
   [java.awt Point]
   [java.io ByteArrayInputStream]
   [javax.imageio ImageIO]))

(def network-file "convolution-model-application.nippy")
(def local-sign-ds-root "/home/garfield/tmp/sign")
(def dataset-root "/home/garfield/projects/python/deep.learning/4.convolutional-neural-networks/w1/datasets/")

(def log println)
(defn load-saved-network [network-file-path]
  (when (.exists (io/file network-file-path))
    (load-network network-file-path)))

(defn preprocess-image [img-file]
  (patch/image->patch
   (image/resize (image/load-image img-file) 64 64)
   :datatype :float
   :colorspace :rgb
   :normalize true))

(defn load-image-with-label [^java.io.File img-file]
  (let [fname (.getName img-file)]
    {:data (preprocess-image img-file)
     :label (as-> fname ?
              (s/split ? #"\.")
              (second ?)
              (read-string ?)
              (eu/label->one-hot [0 1 2 3 4 5] ?))}))

(defn load-all-images-as-samples [^java.io.File root-dir]
  (map
   load-image-with-label
   (filter #(.isFile %) (file-seq root-dir))))

(defn bytes-to-image
  "Convert raw bytes to a BufferedImage."
  ([width height bytes]
   (bytes-to-image width height 3 bytes))
  ([width height n-channels bytes]
   ;; the image type is fixed at TYPE_3BYTE_BGR, so n-channels must be 3
   (let [img (BufferedImage. width height BufferedImage/TYPE_3BYTE_BGR)
         data (.. img getRaster getDataBuffer getData)]
     (System/arraycopy bytes 0 data 0 (count bytes))
     img)))

(defn save-image [^BufferedImage img f]
  (io/make-parents f)
  (with-open [bos (io/output-stream f)]
    (ImageIO/write img "jpg" bos)))

(defn load-dataset-from-hdf5 []
  (let [test-root (hdf5/child-map (hdf5/open-file (str dataset-root "test_signs.h5")))
        train-root (hdf5/child-map (hdf5/open-file (str dataset-root "train_signs.h5")))]
    {:train-set-x-orig (-> train-root :train_set_x hdf5/->clj)
     :train-set-y-orig (-> train-root :train_set_y hdf5/->clj)

     :test-set-x-orig (-> test-root :test_set_x hdf5/->clj)
     :test-set-y-orig (-> test-root :test_set_y hdf5/->clj)}))

(defn show-label-in-hdf5 [label-data index]
  (nth label-data index))

(defn export-train-test-images
  "Export all images from hdf5 to local disk"
  [^String exported-root]
  (let [dt (load-dataset-from-hdf5)
        image-size (* 64 64 3)
        train-ds (get-in dt [:train-set-x-orig :data])
        train-labels (get-in dt [:train-set-y-orig :data])
        test-ds (get-in dt [:test-set-x-orig :data])
        test-labels (get-in dt [:test-set-y-orig :data])
        ds-to-images (fn [ds-type data-ds label-ds]
                       (dorun
                        (map-indexed
                         (fn [idx img-bytes]
                           (let [img (bytes-to-image 64 64 3 (byte-array img-bytes))]
                             (save-image
                              img
                              (io/file exported-root ds-type
                                        (str idx "." (show-label-in-hdf5 label-ds idx) ".jpg")))))
                         (partition image-size data-ds))))]
    (ds-to-images "train" train-ds train-labels)
    (ds-to-images "test" test-ds test-labels)))


;; (export-train-test-images local-sign-ds-root)
;; (defonce dt (load-dataset))
;; (show-image (get-in dt [:train-set-x-orig :data]) 2)
;; (show-label-in-hdf5 (get-in dt [:train-set-y-orig :data]) 0)

(def load-dataset
  (memoize
   (fn []
     {:train-dataset (load-all-images-as-samples (io/file local-sign-ds-root "train"))
      :test-dataset (load-all-images-as-samples (io/file local-sign-ds-root "test"))})))

(def network-layers
  [(layers/input 64 64 3 :id :data)
   (layers/convolutional 1 0 1 8)
   (layers/relu)
   (layers/max-pooling 8 4 8)
   (layers/convolutional 1 0 1 16)
   (layers/relu)
   (layers/max-pooling 4 2 4)
   (layers/linear 6)
   (layers/softmax :id :label)])


(defn train
  ([]
   (train {}))
  ([{:keys [epoch-count batch-size]
     :or {epoch-count 100
          batch-size 64}}]
   (let [network  (or (load-saved-network network-file) network-layers)
         dataset  (load-dataset)
         ds (:train-dataset dataset)
         train-ds  (take 950 ds)
         test-ds (drop 950 ds)]
     (log "training using batch size of" batch-size "epoch count of" epoch-count)
     (experiment-train/train-n
      network
      train-ds
      test-ds
      :network-filestem (first (s/split network-file #"\."))
      :optimizer (adam/adam :alpha 0.003)
      :batch-size batch-size
      :epoch-count epoch-count))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; testing
(defn guess [network entries]
  (map
   #(->> % :label util/max-index)
   (execute/run network entries)))

(defn accuracy-test
  [network ds]
  (let [test-actual (map (comp util/max-index :label) ds)
        test-pred (guess network ds)
        accuracy (metrics/accuracy test-actual test-pred)]
    {:accuracy accuracy}))

(defn predict [img-path]
  (let [network (load-saved-network network-file)
        img (image/resize (image/load-image img-path) 64 64)
        ;; preprocess-image loads and resizes the file itself, so give it the path
        in (preprocess-image img-path)]
    (image/show img)
    (let [r (execute/run network [{:data in}])]
      (log r)
      (-> r
          first
          :label
          util/max-index))))

(comment
  ;; compute accuracy using trained network
  (let [dataset (load-dataset)
        train-ds (:train-dataset dataset)
        test-ds (:test-dataset dataset)
        network (load-saved-network network-file)]
    (log "Train accuracy" (:accuracy (accuracy-test network train-ds)))
    (log "Test accuracy" (:accuracy (accuracy-test network test-ds)))))



I got the following results, compared to the train/test accuracy of 0.94/0.78 in the Coursera tutorial.

Loss for epoch   1: (current) 1.77159273 (best) null [new best]
....
Loss for epoch 100: (current) 1.57378446 (best) 1.57268571

Train accuracy 0.5731481481481482
Test accuracy 0.5916666666666667

Am I doing something fundamentally wrong?



Ly Zhenyi

Dec 6, 2017, 8:08:02 PM12/6/17
to clojure-cortex
I changed the model to the following and got much better results (0.95/0.87 train/test accuracy with only 10 epochs). I still don't know why the previous model did not work well.

(def network-layers
     [(layers/input 64 64 3 :id :data)
      (layers/convolutional 1 0 1 64)
      (layers/batch-normalization)
      (layers/relu)
      (layers/max-pooling 8 0 8)

Chris Nuernberger

Dec 18, 2017, 11:23:52 AM12/18/17
to clojure-cortex
Nice work sticking with it!

If batch-norm helped that much, then potentially the inputs were outside the ideal statistical range for networks, which is zero mean and a variance of 1.  Did the TensorFlow pathway do any whitening/normalization of the input data?

I doubt the max-pooling changes helped or hindered the issue, and I would also be surprised if the conv-layer changes helped all that much, but perhaps.

Here is what may have happened:

Your numbers start out generally positive.  This means the relu doesn't act as a piecewise function but as a passthrough; your first NN section effectively had no activation.  Adding batch normalization meant the relu actually did something, and the network then had the piecewise-linear function it needs to effectively approximate arbitrary functions.
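To make the passthrough point concrete, here is a small sketch in plain Clojure (no Cortex dependency): on all-positive inputs relu is the identity, so the layer adds no nonlinearity; only once the inputs are mean-centered does it actually clamp anything.

```clojure
;; relu: max(0, x)
(defn relu [x] (max 0.0 x))

;; All-positive inputs pass straight through; relu is the identity here:
(map relu [0.5 1.2 3.0])
;; => (0.5 1.2 3.0)

;; After mean-centering, some inputs become negative and relu
;; finally behaves as a piecewise function:
(let [xs   [0.5 1.2 3.0]
      mean (/ (reduce + xs) (count xs))]
  (mapv #(relu (- % mean)) xs))
;; the first two (below-mean) values clamp to 0.0
```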

You quickly found one of the major problems of NN-based machine learning: lots of black magic.

In order to break this down I would do this:

1.  Start with the old network and do a simple normalization of the inputs (mean-center and scale to a variance of one, but across the entire dataset, not per parameter; you should have exactly one mean and one variance scale factor).  Does this change the problem?

2.  Start with the old network and use the new selu activation that Karin implemented.  Does this change the problem?

Then slowly add in your other pieces.
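The two experiments above can be sketched in plain Clojure. Note that `global-normalize` and `selu` are hypothetical helper names for illustration, not Cortex API: the first implements step 1 (a single mean and a single scale factor computed over the whole dataset), and the second is the selu definition from Klambauer et al. (2017), which is presumably what the Cortex layer computes.

```clojure
;; Step 1: one global mean and one global scale for the whole dataset
;; (not per pixel or per channel).
(defn global-normalize
  "Normalize every value in a seq of flat samples using a single mean
   and standard deviation computed over the entire dataset."
  [samples]
  (let [xs   (mapcat seq samples)
        n    (count xs)
        mean (/ (reduce + xs) n)
        var  (/ (reduce + (map #(let [d (- % mean)] (* d d)) xs)) n)
        sd   (Math/sqrt var)]
    (map (fn [sample] (mapv #(/ (- % mean) sd) sample)) samples)))

;; Step 2: selu (Klambauer et al. 2017):
;; scale * x for x > 0, scale * alpha * (e^x - 1) otherwise.
(def selu-alpha 1.6732632423543772)
(def selu-scale 1.0507009873554805)

(defn selu [x]
  (if (pos? x)
    (* selu-scale x)
    (* selu-scale selu-alpha (- (Math/exp x) 1.0))))
```

After `global-normalize`, the flattened dataset has mean 0 and variance 1, which is exactly the range the advice above targets.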