Hi,
I'm building an OpenFace Docker image based on CUDA 8.0 and Ubuntu 16.04:
https://hub.docker.com/r/nightseas/openface/

One test case fails when I run the unit test script. It looks like the failing check is the assertion that the mean training loss is below 0.3.
Could anyone tell me the likely reason, and how to track down and fix the issue?
Thanks and best regards,
Xiaohai Li
Logs:
```
root:~# cd openface/
root:~/openface# ./run-tests.sh
tests.openface_api_tests.test_pipeline ... ok
tests.openface_batch_represent_tests.test_batch_represent ... ok
tests.openface_demo_tests.test_compare_demo ... ok
tests.openface_demo_tests.test_classification_demo_pretrained ... ok
tests.openface_demo_tests.test_classification_demo_pretrained_multi ... ok
tests.openface_demo_tests.test_classification_demo_training ... ok
tests.openface_neural_net_training_tests.test_dnn_training ... FAIL
======================================================================
FAIL: tests.openface_neural_net_training_tests.test_dnn_training
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
self.test(*self.arg)
File "/root/openface/tests/openface_neural_net_training_tests.py", line 82, in test_dnn_training
assert np.mean(trainLoss) < 0.3
AssertionError:
-------------------- >> begin captured stdout << ---------------------
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0006.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0005.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0005.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0011.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0011.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0009.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0003.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0003.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0010.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0001.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0003.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0005.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0010.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0002.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0007.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0006.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0012.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0010.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0007.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0002.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0011.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0012.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0004.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0007.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0006.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0004.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0008.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0008.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0001.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0004.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0009.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0001.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0002.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0009.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0008.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0009.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0002.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0012.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0010.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0007.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0010.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0004.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0005.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0001.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0011.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0004.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0001.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0006.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0009.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0008.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0003.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0003.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0011.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0002.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0004.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0006.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0009.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0006.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0011.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0002.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0007.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0008.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0005.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0003.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0007.jpg ===
=== /root/openface/data/lfw-subset/raw/Adrien_Brody/Adrien_Brody_0005.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0001.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0012.jpg ===
=== /root/openface/data/lfw-subset/raw/Anna_Kournikova/Anna_Kournikova_0008.jpg ===
=== /root/openface/data/lfw-subset/raw/Ann_Veneman/Ann_Veneman_0010.jpg ===
{
cudnn : false
testBatchSize : 800
embSize : 128
cache : "/tmp/OpenFaceTrainingTest-Net-RLtNQP"
cudnn_bench : false
cuda : false
modelDef : "../models/openface/nn4.def.lua"
data : "/tmp/OpenFaceTrainingTest-Img-uMvdxF/aligned"
epochSize : 1
nDonkeys : -1
save : "/tmp/OpenFaceTrainingTest-Net-RLtNQP/1"
nGPU : 1
device : 1
epochNumber : 1
manualSeed : 2
testing : false
alpha : 0.2
nEpochs : 10
peoplePerBatch : 3
imagesPerPerson : 10
lfwDir : "../data/lfw/aligned"
imgDim : 96
retrain : "none"
}
Saving everything to: /tmp/OpenFaceTrainingTest-Net-RLtNQP/1
Creating train metadata
{
sampleSize :
{
1 : 3
2 : 96
3 : 96
}
split : 100
verbose : true
paths :
{
1 : "/tmp/OpenFaceTrainingTest-Img-uMvdxF/aligned"
}
samplingMode : "balanced"
loadSize :
{
1 : 3
2 : 96
3 : 96
}
}
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
34 samples found...... 0/34 ....................] ETA: 0ms | Step: 0ms
Updating classList and imageClass appropriately
[==================== 3/3 ====================>] Tot: 7ms | Step: 2ms
Cleaning up temporary files
nClasses: 3
==> doing epoch on training data:
==> online epoch # 1
+ (nTrips, nTripsFound) = (135, 135)
Epoch: [1][1/1] Time 2.954 tripErr 2.06e-01
Epoch: [1][TRAINING SUMMARY] Total Time(s): 2.99 average triplet loss (per batch): 0.21
==> doing epoch on training data:
==> online epoch # 2
+ (nTrips, nTripsFound) = (135, 134)
Epoch: [2][1/1] Time 3.476 tripErr 3.18e-01
Epoch: [2][TRAINING SUMMARY] Total Time(s): 3.51 average triplet loss (per batch): 0.32
==> doing epoch on training data:
==> online epoch # 3
+ (nTrips, nTripsFound) = (135, 133)
Epoch: [3][1/1] Time 3.222 tripErr 1.75e-01
Epoch: [3][TRAINING SUMMARY] Total Time(s): 3.25 average triplet loss (per batch): 0.17
==> doing epoch on training data:
==> online epoch # 4
+ (nTrips, nTripsFound) = (135, 134)
Epoch: [4][1/1] Time 3.120 tripErr 6.38e-01
Epoch: [4][TRAINING SUMMARY] Total Time(s): 3.15 average triplet loss (per batch): 0.64
==> doing epoch on training data:
==> online epoch # 5
+ (nTrips, nTripsFound) = (135, 112)
Epoch: [5][1/1] Time 3.380 tripErr 1.72e-01
Epoch: [5][TRAINING SUMMARY] Total Time(s): 3.41 average triplet loss (per batch): 0.17
==> doing epoch on training data:
==> online epoch # 6
+ (nTrips, nTripsFound) = (135, 127)
Epoch: [6][1/1] Time 3.055 tripErr 4.27e-01
Epoch: [6][TRAINING SUMMARY] Total Time(s): 3.08 average triplet loss (per batch): 0.43
==> doing epoch on training data:
==> online epoch # 7
+ (nTrips, nTripsFound) = (135, 135)
Epoch: [7][1/1] Time 3.330 tripErr 2.51e-01
Epoch: [7][TRAINING SUMMARY] Total Time(s): 3.36 average triplet loss (per batch): 0.25
==> doing epoch on training data:
==> online epoch # 8
+ (nTrips, nTripsFound) = (135, 129)
Epoch: [8][1/1] Time 3.395 tripErr 2.71e-01
Epoch: [8][TRAINING SUMMARY] Total Time(s): 3.42 average triplet loss (per batch): 0.27
==> doing epoch on training data:
==> online epoch # 9
+ (nTrips, nTripsFound) = (135, 122)
Epoch: [9][1/1] Time 2.930 tripErr 2.35e-01
Epoch: [9][TRAINING SUMMARY] Total Time(s): 2.96 average triplet loss (per batch): 0.24
==> doing epoch on training data:
==> online epoch # 10
+ (nTrips, nTripsFound) = (135, 135)
Epoch: [10][1/1] Time 2.794 tripErr 3.50e-01
Epoch: [10][TRAINING SUMMARY] Total Time(s): 2.82 average triplet loss (per batch): 0.35
--------------------- >> end captured stdout << ----------------------
----------------------------------------------------------------------
Ran 7 tests in 83.837s
```
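
For reference, here is a quick check of the numbers in the log above. It is only a minimal sketch: it assumes that `trainLoss` in the test corresponds to the per-epoch `tripErr` values printed in the captured stdout (I have not confirmed how the test script actually builds that array), and the helper name `trip_errs` is my own.

```python
import re

# Per-epoch lines copied verbatim from the captured stdout above.
log_excerpt = """
Epoch: [1][1/1] Time 2.954 tripErr 2.06e-01
Epoch: [2][1/1] Time 3.476 tripErr 3.18e-01
Epoch: [3][1/1] Time 3.222 tripErr 1.75e-01
Epoch: [4][1/1] Time 3.120 tripErr 6.38e-01
Epoch: [5][1/1] Time 3.380 tripErr 1.72e-01
Epoch: [6][1/1] Time 3.055 tripErr 4.27e-01
Epoch: [7][1/1] Time 3.330 tripErr 2.51e-01
Epoch: [8][1/1] Time 3.395 tripErr 2.71e-01
Epoch: [9][1/1] Time 2.930 tripErr 2.35e-01
Epoch: [10][1/1] Time 2.794 tripErr 3.50e-01
"""


def trip_errs(log_text):
    """Extract the tripErr value from every line that reports one."""
    return [float(v) for v in re.findall(r"tripErr\s+([0-9.eE+-]+)", log_text)]


losses = trip_errs(log_excerpt)
mean_loss = sum(losses) / len(losses)

print("epochs: %d, mean tripErr: %.4f" % (len(losses), mean_loss))
print("below 0.3 threshold: %s" % (mean_loss < 0.3))
```

Under that assumption the mean over the 10 epochs is about 0.304, so the run misses the 0.3 threshold only narrowly, and the loss swings between roughly 0.17 and 0.64 from epoch to epoch rather than decreasing steadily.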