Same code different result?


Tong Sun

Sep 3, 2017, 11:34:51 PM
to golang-nuts
Hi, 

I've bumped into another "same code different result" problem -- my `go test` runs fine locally, but on Travis it is broken.

I've verified at least four or five times that all my local code has been pushed to GitHub. Now I've run out of ideas as to why the same source would behave differently after being compiled into executables on different machines.

I'm using Go 1.9 on Ubuntu 17.04.

Somebody help please. 

FYI, the tool I'm building quickly spots similar files within the file system.

$ fsimilar
find/file similar
Version 0.1.0 built on 2017-09-03

Find similar files

Options:

  -h, --help            display help information
  -S, --size-given      size of the files in input as first field
  -Q, --query-size      query the file sizes from os
  -i, --input          *input from stdin or the given file (mandatory)
  -p, --phonetic        use phonetic as words for further error tolerant
  -F, --final           produce final output, the recommendations
  -c, --cp[=$FSIM_CP]   config path, path that hold all template files
  -v, --verbose         verbose mode (multiple -v increase the verbosity)

Commands:

  sim   Filter the input using simhash similarity check
  vec   Use Vector Space for similarity check

$ cat test/sim.lstA
test/sim/Audio Book - The Grey Coloured Bunnie.mp3
test/sim/GNU - Python Standard Library (2001).rar
test/sim/PopupTest.java
test/sim/(eBook) GNU - Python Standard Library 2001.pdf
test/sim/Python Standard Library.zip
test/sim/GNU - 2001 - Python Standard Library.pdf
test/sim/LayoutTest.java
test/sim/ColoredGrayBunny.ogg

$ fsimilar sim
Filter the input using simhash similarity check

Usage:
  mlocate -i soccer | fsimilar sim -i

Options:

  -h, --help            display help information
  -S, --size-given      size of the files in input as first field
  -Q, --query-size      query the file sizes from os
  -i, --input          *input from stdin or the given file (mandatory)
  -p, --phonetic        use phonetic as words for further error tolerant
  -F, --final           produce final output, the recommendations
  -c, --cp[=$FSIM_CP]   config path, path that hold all template files
  -v, --verbose         verbose mode (multiple -v increase the verbosity)
  -d, --dist[=3]        the hamming distance of hashes within which to deem similar

$ fsimilar sim -i test/sim.lstA -d 12
       1 test/sim/(eBook) GNU - Python Standard Library 2001.pdf
       1 test/sim/GNU - Python Standard Library (2001).rar

       1 test/sim/GNU - 2001 - Python Standard Library.pdf
       1 test/sim/Python Standard Library.zip
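
For readers unfamiliar with the `-d` option: the help text describes it as a Hamming distance between hashes. The following is only a minimal sketch of that idea, assuming 64-bit hashes; the function names `hammingDistance` and `isSimilar` are made up for illustration and are not taken from fsimilar's source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance counts the bit positions in which two 64-bit hashes differ.
// (Hypothetical helper; the 64-bit hash width is an assumption.)
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// isSimilar reports whether two hashes are within maxDist bits of each other.
// maxDist stands in for the value given to -d.
func isSimilar(a, b uint64, maxDist int) bool {
	return hammingDistance(a, b) <= maxDist
}

func main() {
	a, b := uint64(0xF0F0), uint64(0xF00F)
	fmt.Println(hammingDistance(a, b)) // 8: the low byte differs in every bit
	fmt.Println(isSimilar(a, b, 12))   // true
	fmt.Println(isSimilar(a, b, 3))    // false
}
```

A larger `-d` value therefore groups more loosely related names together, which fits the use of `-d 12` in the run above.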

$ fsimilar vec
Use Vector Space for similarity check

Usage:
  { mlocate -i soccer; mlocate -i football; } | fsimilar sim -i | fsimilar vec -i -S -Q -F

Options:

  -h, --help            display help information
  -S, --size-given      size of the files in input as first field
  -Q, --query-size      query the file sizes from os
  -i, --input          *input from stdin or the given file (mandatory)
  -p, --phonetic        use phonetic as words for further error tolerant
  -F, --final           produce final output, the recommendations
  -c, --cp[=$FSIM_CP]   config path, path that hold all template files
  -v, --verbose         verbose mode (multiple -v increase the verbosity)
  -t, --thr[=0.86]      the threshold above which to deem similar (0.8 = 80%)

$ fsimilar vec -i test/sim.lstA -t 0.7
       1 test/sim/GNU - Python Standard Library (2001).rar
       1 test/sim/(eBook) GNU - Python Standard Library 2001.pdf
       1 test/sim/Python Standard Library.zip
       1 test/sim/GNU - 2001 - Python Standard Library.pdf
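
For context on the `-t` threshold: a common way to compare names in a vector space is cosine similarity over word counts. Whether fsimilar uses exactly this measure is an assumption; the sketch below, with the made-up helpers `wordCounts` and `cosine`, only shows how a score between 0 and 1 can be compared against a threshold such as 0.7.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// wordCounts builds a word-count vector from a lower-cased,
// whitespace-split name. (Hypothetical tokenizer, for illustration only.)
func wordCounts(s string) map[string]float64 {
	counts := make(map[string]float64)
	for _, w := range strings.Fields(strings.ToLower(s)) {
		counts[w]++
	}
	return counts
}

// cosine returns the cosine similarity of two sparse count vectors.
// It returns 0 when either vector is empty.
func cosine(a, b map[string]float64) float64 {
	var dot, na, nb float64
	for w, x := range a {
		dot += x * b[w] // a missing key in b contributes 0
		na += x * x
	}
	for _, y := range b {
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	a := wordCounts("Python Standard Library")
	b := wordCounts("GNU Python Standard Library")
	score := cosine(a, b)
	fmt.Printf("%.3f similar=%v\n", score, score >= 0.7) // about 0.866, similar=true
}
```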

$ fsimilar vec -i test/sim.lstA -t 0.7 -p
       1 test/sim/Audio Book - The Grey Coloured Bunnie.mp3
       1 test/sim/ColoredGrayBunny.ogg

       1 test/sim/GNU - Python Standard Library (2001).rar
       1 test/sim/(eBook) GNU - Python Standard Library 2001.pdf
       1 test/sim/Python Standard Library.zip
       1 test/sim/GNU - 2001 - Python Standard Library.pdf

What I meant is: I hope you can pull the code from the remote repository and run the tests on your own machine. It should eventually be a useful tool.

Thanks for helping!

Tong Sun

Sep 3, 2017, 11:43:17 PM
to golang-nuts


On Sunday, September 3, 2017 at 11:34:51 PM UTC-4, Tong Sun wrote:
> Hi,
>
> I've bumped into another "same code different result" problem -- my `go test` runs fine locally, but on Travis it is broken.

Just in case somebody questions that, here is the local test run:

.../go-dedup/fsimilar$ go test -v ./...
=== RUN   TestExec
--- PASS: TestExec (11.01s)
        fsimilar_test.go:63: 

                == Testing Simhash Basic Functions

        fsimilar_test.go:26: Testing sim.lstA.sim:
                        ../fsimilar sim -i sim.lstA -d 12 -vv

        fsimilar_test.go:26: Testing sim.lstB.sim:
                        ../fsimilar sim -i sim.lstB -d 12 -vv

        fsimilar_test.go:26: Testing sim.lstS.sim:
                        ../fsimilar sim -i sim.lstS -d 12 -vv

        fsimilar_test.go:26: Testing test1.sim:
                        ../fsimilar sim -i test1.lst -S -d 6 -vv

        fsimilar_test.go:26: Testing test2.sim:
                        ../fsimilar sim -i test2.lst -S -d 6 -vv

        fsimilar_test.go:73: 

                == Testing Vector Space Basic Functions

        fsimilar_test.go:26: Testing sim.lstA.vec:
                        ../fsimilar vec -i sim.lstA -t 0.7 -vv

        fsimilar_test.go:26: Testing sim.lstB.vec:
                        ../fsimilar vec -i sim.lstB -t 0.7 -vv

        fsimilar_test.go:26: Testing sim.lstS.vec:
                        ../fsimilar vec -i sim.lstS -t 0.7 -vv

        fsimilar_test.go:26: Testing test1.vec:
                        ../fsimilar vec -i test1.lst -S -v

        fsimilar_test.go:26: Testing test2.vec:
                        ../fsimilar vec -i test2.lst -S -v

        fsimilar_test.go:83: 

                == Testing Vector Space Phonetic Functions

        fsimilar_test.go:26: Testing sim.lstA.vec.phonetic:
                        ../fsimilar vec -i sim.lstA -p -t 0.7 -vv

        fsimilar_test.go:26: Testing sim.lstB.vec.phonetic:
                        ../fsimilar vec -i sim.lstB -p -t 0.7 -vv

        fsimilar_test.go:26: Testing sim.lstS.vec.phonetic:
                        ../fsimilar vec -i sim.lstS -p -t 0.7 -vv

        fsimilar_test.go:26: Testing test1.vec.phonetic:
                        ../fsimilar vec -i test1.lst -S -p -v

        fsimilar_test.go:26: Testing test2.vec.phonetic:
                        ../fsimilar vec -i test2.lst -S -p -v

        fsimilar_test.go:93: 

                == Testing Vector Space Finish Functions

        fsimilar_test.go:26: Testing sim.lstA.vec.Finish:
                        ../fsimilar vec -i sim.lstA -t 0.7 -p -F -v

        fsimilar_test.go:26: Testing sim.lstB.vec.Finish:
                        ../fsimilar vec -i sim.lstB -t 0.7 -p -F -v

        fsimilar_test.go:26: Testing sim.lstS.vec.Finish:
                        ../fsimilar vec -i sim.lstS -t 0.7 -p -F -v

PASS
ok      _/home/tong/l/gg/go-dedup/fsimilar      11.016s

Jakob Borg

Sep 4, 2017, 2:18:07 AM
to Tong Sun, golang-nuts
Hi,

It's not especially clear from your mail what your tool does, exactly. But assuming that it calculates hashes of content in some manner, my first guess would be that your test data character set and/or line endings get changed by the git checkin/checkout procedure.

//jb
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Tong Sun

Sep 4, 2017, 9:13:49 AM
to Jakob Borg, golang-nuts
Yes, we can say it calculates hashes in some manner. However, all the test content so far is pure ASCII, which does not change regardless of how you look at it (unlike Unicode), and the hashing is done on words only, i.e., spaces and line endings do not affect it.

Thanks a lot for your help though, Jakob. 


Jakob Borg

Sep 4, 2017, 1:15:10 PM
to Tong Sun, golang-nuts
Then I can't guess. I checked out your code, and the tests fail for me in the same way as on Travis.

//jb

Tong Sun

Sep 4, 2017, 10:28:02 PM
to Jakob Borg, golang-nuts
Thanks a lot Jakob for trying it out for me. 

Problem solved
All thanks to Florian Florensen's help!

I turned this bunch of code, 
into a single simple line,

and now the problem is gone:


That's the single change I've made to my entire code base. Everyone is welcome to double-check the claim. 

So apparently `unicode.IsUpper()` returns different results under different language environments, even for plain ASCII text.

Otherwise, I can't explain why there are such huge differences. 


