pinpointing the 'corrupt' byte / word / non-list pointer where list-pointer expected in a message


J

May 31, 2014, 4:41:35 AM
to capn...@googlegroups.com
I'm writing out a Cap'n Proto message using the go-capnproto implementation (https://github.com/glycerine/go-capnproto). When I check it with the C++ 'capnp decode' utility, I get a corrupt-message report, which makes me think I haven't serialized it correctly. Is there any way to get more precise information about the location of the byte/word that appears to be corrupt? That would make it much easier to track down where I'm going wrong during creation. If there were something I could hack into the error message at layout.c++:1911, that would be really helpful.

Thanks!

Jason

jaten@i7:~/pubgoq:master$ cat .goq/serverstate | capnp decode schema/zjob.capnp Z
( goqserver = (
    nextjobid = 2,
    waitingjobs = [
      ( id = 1,
        msg = initialsubmit,
        aboutjid = 0,
        cmd = "bin/forev.sh",
        args = [],
        out = [],
        env = [],
        host = "",
        stm = 0,
        etm = 0,
        elapsec = 0,
        status = "",
        subtime = 0,
        pid = 0,
        dir = "/home/jaten/dev/cnet/go/src/github.com/glycerine/goq",
        submitaddr = "tcp://10.0.0.6:57957",
        serveraddr = "",
        workeraddr = "",
        finishaddr = [],
        signature = "49d8f04133b00aae0864ca524980ac634c4a0138",
        islocal = false,
        arrayid = 0,
        groupid = 0,
        cancelled = false,
        delegatetm = 0,
        lastpingtm = 0,
        unansweredping = 0,
        sendernonce = 2077616783183494630,
        sendtime = 1401524574225712183 ) ],
    finishedjobscount = 0,
    badsgtcount = 0,
    cancelledjobcount = 0,
    badnoncecount = 0 ) )
*** ERROR DECODING PREVIOUS MESSAGE ***
The following error occurred while decoding the message above.
This probably means the input data is invalid/corrupted.
Exception description: expected ref->kind() == WirePointer::LIST; Message contains non-list pointer where list pointer was expected.
Code location: src/capnp/layout.c++:1911
*** END ERROR ***
jaten@i7:~/pubgoq:master$

For reference, the code that does creation through go-capnproto:
https://github.com/glycerine/goq/blob/master/goq.go#L1719

capnp version (from 17 Nov 2013: a88c2b8828f846c1189aa59470a54bb7dbe9a5b2 )
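
One way to narrow down a "non-list pointer where list pointer was expected" report is to classify each 64-bit word of the message by pointer kind and look for a struct pointer sitting in a slot the schema declares as Text or List. The following is a minimal Go sketch, assuming the pointer layout documented in the Cap'n Proto encoding spec (the low two bits select the kind; struct and list pointers carry a signed 30-bit word offset). `Pointer` and `decodePointer` are hypothetical names for illustration, not part of go-capnproto or capnp.

```go
package main

import "fmt"

// Pointer describes one decoded 64-bit pointer word (hypothetical helper type).
type Pointer struct {
	Kind   string // "null", "struct", "list", "far" or "other"
	Offset int32  // signed word offset (struct and list pointers only)
	C, D   uint32 // struct: data words, pointer words; list: element-size code, element count
}

// decodePointer classifies a little-endian pointer word by its low two bits.
func decodePointer(w uint64) Pointer {
	if w == 0 {
		return Pointer{Kind: "null"}
	}
	lo, hi := uint32(w), uint32(w>>32)
	switch lo & 3 {
	case 0:
		return Pointer{Kind: "struct", Offset: int32(lo) >> 2, C: hi & 0xffff, D: hi >> 16}
	case 1:
		return Pointer{Kind: "list", Offset: int32(lo) >> 2, C: hi & 7, D: hi >> 3}
	case 2:
		return Pointer{Kind: "far"}
	}
	return Pointer{Kind: "other"}
}

func main() {
	// Sample words: a struct pointer, a list-of-pointers pointer, and a
	// zero-sized struct pointer with a non-zero offset.
	for _, w := range []uint64{0x0002000100000000, 0x0000001600000001, 0x4} {
		fmt.Printf("%#018x -> %+v\n", w, decodePointer(w))
	}
}
```

Running such a classifier over every word of a suspect message, and printing the word index next to each result, gives the kind of location information the decoder's error message does not include.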

J

May 31, 2014, 7:32:23 AM
to capn...@googlegroups.com
I updated to the tip of capnp (to allow easy checking of line numbers 2053 and 1918), and isolated the problem a little bit.

I'm going on the assumption that this is a problem in the go-capnproto generated serialization code. Still open to any suggestions about the meaning/source of the problem. I continue to work on it...

# this has a problem when written by go-capnproto and checked by capnp decode:
#
# Exception description: expected ref->kind() == WirePointer::LIST; Message contains non-list pointer where text was expected.
# Code location: src/capnp/layout.c++:2053
#
struct Zfakejob {
    cmd        @0: Text;
    args       @1: List(Text);
}


# this has a slightly different error message when written by go-capnproto and checked by capnp decode:
#
# Exception description: expected ref->kind() == WirePointer::LIST; Message contains non-list pointer where list pointer was expected.
# Code location: src/capnp/layout.c++:1918
#
struct Zfakejob {
    args       @0: List(Text);
    cmd        @1: Text;
}

# okay alone, no problem:
struct Zfakejob {
   args       @0: List(Text);
}

# okay alone, no problem:
struct Zfakejob {
    cmd       @0: Text;
}

// full minimal code to reproduce the issue is here:
https://github.com/glycerine/temp-go-capnproto-ser-problem


J

May 31, 2014, 8:59:18 AM
to capn...@googlegroups.com
I can make the capnp complaints go away by creating and assigning an empty textlist, which makes me think that the default 'constructors' (such as they are in Go) aren't quite doing the right thing...

https://github.com/glycerine/temp-go-capnproto-ser-problem

has my notes.

Andreas Stenius

May 31, 2014, 5:57:04 PM
to capn...@googlegroups.com, J
I posted a reply as an issue on github there:
https://github.com/glycerine/temp-go-capnproto-ser-problem/issues/1

//Andreas

J

Jun 1, 2014, 7:42:30 AM
to capn...@googlegroups.com, j.e....@gmail.com
Thanks Andreas, that helped a lot!

There's an easy workaround that users of go-capnproto should be aware of if they encounter this issue; just use this as a rule of thumb:

   If you use List(Text), then be sure to initialize it, even with just an empty list.

This didn't affect reading back List(Text) using go, only reading back with capnp after writing List(Text) with go.
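
For reference, an explicitly initialized empty list is not the same bit pattern as an unset field. Below is a small Go sketch, assuming the list-pointer layout from the Cap'n Proto encoding spec (kind bits 01, a 30-bit offset, a 3-bit element-size code, a 29-bit element count). `listPointer` and `elemPointer` are hypothetical names, and whether go-capnproto emits exactly this word for an empty text list is not confirmed in this thread.

```go
package main

import "fmt"

// elemPointer is the element-size code for lists whose elements are
// pointers, such as List(Text).
const elemPointer = 6

// listPointer encodes a list pointer word (hypothetical helper).
func listPointer(offset int32, elemSize uint8, count uint32) uint64 {
	lo := uint32(offset)<<2 | 1         // kind bits 01 mark a list pointer
	hi := uint32(elemSize&7) | count<<3 // size code in the low 3 bits, count above
	return uint64(hi)<<32 | uint64(lo)
}

func main() {
	empty := listPointer(0, elemPointer, 0)
	fmt.Printf("unset field (null): %#018x\n", uint64(0))
	fmt.Printf("empty List(Text):   %#018x\n", empty)
}
```

Because the empty-list word is non-zero, a writer that mishandles all-zero pointers would plausibly never see one when the list has been initialized. That link to the workaround is an inference, not something verified against the go-capnproto source.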

I'd prefer to figure out a deeper fix, but the code for recursively copying structures between segments in go-capnproto is complex and I'm not that familiar with it. I worked with it for a couple of hours and then had to move on to other things. It appears that there is some slicing going on that is causing List(Text) members not to copy across, which may be related to this issue.

- Jason

Kenton Varda

Jun 1, 2014, 3:42:01 PM
to J, Albert Strasheim, capn...@googlegroups.com
It looks to me like the problem is simply that the copying code doesn't explicitly handle null pointers. It is treating an all-zero pointer as a pointer to a zero-sized struct at offset zero, and is "copying" the struct to a new offset which is no longer zero. The right solution is to check if the pointer is null (all-zero) first and in that case simply write a null pointer into the destination.
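
The effect described above can be reproduced in a few lines. This is a minimal Go sketch of the idea only, not the go-capnproto copying code: `structPointer`, `naiveRelocate`, and `relocate` are hypothetical names, and the struct-pointer layout (kind bits 00, 30-bit offset, 16-bit data size, 16-bit pointer count) is taken from the Cap'n Proto encoding spec.

```go
package main

import "fmt"

// structPointer encodes a struct pointer word: kind bits 00, a signed
// 30-bit word offset, then data and pointer section sizes in words.
func structPointer(offset int32, dataWords, ptrWords uint16) uint64 {
	return uint64(uint32(offset)<<2) | uint64(dataWords)<<32 | uint64(ptrWords)<<48
}

// naiveRelocate models the problematic behavior: it treats any word as a
// struct pointer and re-encodes it with the offset of the copied target.
func naiveRelocate(w uint64, newOffset int32) uint64 {
	hi := uint32(w >> 32)
	return structPointer(newOffset, uint16(hi), uint16(hi>>16))
}

// relocate checks for a null (all-zero) pointer first and writes null
// into the destination in that case.
func relocate(w uint64, newOffset int32) uint64 {
	if w == 0 {
		return 0
	}
	return naiveRelocate(w, newOffset)
}

func main() {
	fmt.Printf("naive copy of null:      %#x\n", naiveRelocate(0, 1))
	fmt.Printf("null-aware copy of null: %#x\n", relocate(0, 1))
}
```

With a new offset of 1, the naive path turns a null word into 0x4, a zero-sized struct pointer. That is the same bit pattern as the `04 00 00 00 00 00 00 00` word at line 04 of the "actual" dump posted later in this thread.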

I think it's really important to fix this. Users shouldn't have to follow obscure rules to avoid data corruption. Maybe Albert can help track down the issue?

-Kenton



Jason E. Aten

Jun 1, 2014, 4:25:59 PM
to Kenton Varda, Albert Strasheim, capnproto
I agree that I would really like to have a fix. But it wasn't test-driven code, so I'm having to build up tests to keep from breaking things with any changes; this is slower than I would like, but it seems absolutely critical to not break existing functionality at all, while trying to improve what has been broken for quite a while.

Also, the desired/correct behavior is not obvious to me. For example, consider the following test, in which both Text and List(Text) are sliced off. Shouldn't they be recursively copied to the new segment? (A real question.)

The test is in struct_test.go at the temp- repo mentioned above (https://github.com/glycerine/temp-go-capnproto-ser-problem.git).

// given the following schema:
struct Counter {
  size  @0: Int64;
  words @1: Text;
  wordlist @2: List(Text);
}

struct Bag {
  counter  @0: Counter;
}

// consider this behavior:

func TestSetBetweenSegments(t *testing.T) {

    exp := CapnpEncode(`(counter = (size = 9, wordlist = ["hello","bye"]))`, "Bag")

    cv.Convey("Given an Counter in one segment and a Bag with text in another", t, func() {
        cv.Convey("we should be able to copy from one segment to the other with SetCounter() on a Bag", func() {

            seg := capn.NewBuffer(nil)
            scratch := capn.NewBuffer(nil)

            // in seg                                                                                                                              
            segbag := NewRootBag(seg)

            // in scratch                                                                                                                          
            xc := NewRootCounter(scratch)
            xc.SetSize(9)
            tl := scratch.NewTextList(2)
            tl.Set(0, "hello")
            tl.Set(1, "bye")
            xc.SetWordlist(tl)

            // copy from scratch to seg                                                                                                            
            segbag.SetCounter(xc)

            buf := bytes.Buffer{}
            seg.WriteTo(&buf)

            act := buf.Bytes()
            fmt.Printf("          actual:\n")
            ShowBytes(act, 10)
            //fmt.Printf("act decoded by capnp: '%s'\n", string(CapnpDecode(act, "Bag")))                                                          
            save(act, "myact")

            fmt.Printf("expected:\n")
            ShowBytes(exp, 10)
            //fmt.Printf("exp decoded by capnp: '%s'\n", string(CapnpDecode(exp, "Bag")))                                                          
            save(exp, "myexp")

            cv.So(act, cv.ShouldResemble, exp)

        })
    })
}

results:

=== RUN TestSetBetweenSegments

  Given an struct with Text and List(Text) in one segment
    assigning it to a struct in a different segment should recursively import jea debug2: Same segment (0xc210082420) writePtr happening: p.value(off) = 0x1000000000000 to s.Data[off=0]
jea debug2: Same segment (0xc210082450) writePtr happening: p.value(off) = 0x2000100000000 to s.Data[off=0]
jea debug2: Same segment (0xc210082450) writePtr happening: p.value(off) = 0x3200000005 to s.Data[off=32]
jea debug2: Same segment (0xc210082450) writePtr happening: p.value(off) = 0x2200000005 to s.Data[off=40]
jea debug2: Same segment (0xc210082450) writePtr happening: p.value(off) = 0x1600000001 to s.Data[off=24]
 ..... jea need to clone target: key.newval: capn.Object{Segment:(*capn.Segment)(nil), off:0, length:0, datasz:8, ptrs:2, typ:0x1, flags:0x0}
 ..... jea need to clone target: key.newval: capn.Object{Segment:(*capn.Segment)(nil), off:0, length:0, datasz:0, ptrs:0, typ:0x1, flags:0x0}
jea debug2: Same segment (0xc210082420) writePtr happening: p.value(off) = 0x4 to s.Data[off=24]
 ..... jea need to clone target: key.newval: capn.Object{Segment:(*capn.Segment)(nil), off:0, length:2, datasz:0, ptrs:0, typ:0x3, flags:0x0}
jea debug2: Same segment (0xc210082420) writePtr happening: p.value(off) = 0x1600000001 to s.Data[off=32]
jea debug2: Same segment (0xc210082420) writePtr happening: p.value(off) = 0x2000100000000 to s.Data[off=8]
          actual:

          00 00 00 00 07 00 00 00    ==(line 00)> stream header: 1 segment(s), this segment has 7 words
          00 00 00 00 00 00 01 00    ==(line 01)> struct-pointer, data starts at +0 words (line 2). {prim: 0, pointers: 1 words}.
          00 00 00 00 01 00 02 00    ==(line 02)> struct-pointer, data starts at +0 words (line 3). {prim: 1, pointers: 2 words}.
          09 00 00 00 00 00 00 00    ==(line 03)> primitive data for struct on line 2
          04 00 00 00 00 00 00 00    ==(line 04)> struct-pointer, data starts at +1 words (line 6). {prim: 0, pointers: 0 words}.
          01 00 00 00 16 00 00 00    ==(line 05)> list, first element starts 0 words from here (at line 6). Size: pointer, num-elem: 2
          00 00 00 00 00 00 00 00    ==(line 06)> struct-pointer, data starts at +0 words (line 7). {prim: 0, pointers: 0 words}.
          00 00 00 00 00 00 00 00    ==(line 07)> struct-pointer, data starts at +0 words (line 8). {prim: 0, pointers: 0 words}.
          expected:

          00 00 00 00 09 00 00 00    ==(line 00)> stream header: 1 segment(s), this segment has 9 words
          00 00 00 00 00 00 01 00    ==(line 01)> struct-pointer, data starts at +0 words (line 2). {prim: 0, pointers: 1 words}.
          00 00 00 00 01 00 02 00    ==(line 02)> struct-pointer, data starts at +0 words (line 3). {prim: 1, pointers: 2 words}.
          09 00 00 00 00 00 00 00    ==(line 03)> primitive data for struct on line 2
          00 00 00 00 00 00 00 00    ==(line 04)> struct-pointer, data starts at +0 words (line 5). {prim: 0, pointers: 0 words}.
          01 00 00 00 16 00 00 00    ==(line 05)> list, first element starts 0 words from here (at line 6). Size: pointer, num-elem: 2
          05 00 00 00 32 00 00 00    ==(line 06)> list of bytes/Text (pointer to: 'hello' at line 8)
          05 00 00 00 22 00 00 00    ==(line 07)> list of bytes/Text (pointer to: 'bye' at line 9)
          68 65 6c 6c 6f 00 00 00    ==(line 08)> text contents: hello^@
          62 79 65 00 00 00 00 00    ==(line 09)> text contents: bye^@
          ✘


Failures:

  * /home/jaten/temp-go-capnproto-ser-problem/struct_test.go
  Line 359:
  Expected: '[0 0 0 0 9 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 2 0 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 22 0 0 0 5 0 0 0 50 0 0 0 5 0 0 0 34 0 0 0 104 101 108 108 111 0 0 0 98 121 101 0 0 0 0 0]'
  Actual:   '[0 0 0 0 7 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 2 0 9 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 1 0 0 0 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]'
  (Should resemble)!
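
To see at a glance which words differ between the two byte slices in the failure above, a word-by-word comparison helps. The following is a small Go sketch; `diffWords` is a hypothetical helper, not something from the test utilities in the repo. The two slices are copied from the Expected and Actual lines of the failure report.

```go
package main

import (
	"bytes"
	"fmt"
)

// diffWords returns the indexes of the 8-byte words that differ between
// a and b, comparing only the words present in both slices.
func diffWords(a, b []byte) []int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	var out []int
	for i := 0; i+8 <= n; i += 8 {
		if !bytes.Equal(a[i:i+8], b[i:i+8]) {
			out = append(out, i/8)
		}
	}
	return out
}

func main() {
	exp := []byte{
		0, 0, 0, 0, 9, 0, 0, 0,
		0, 0, 0, 0, 0, 0, 1, 0,
		0, 0, 0, 0, 1, 0, 2, 0,
		9, 0, 0, 0, 0, 0, 0, 0,
		0, 0, 0, 0, 0, 0, 0, 0,
		1, 0, 0, 0, 22, 0, 0, 0,
		5, 0, 0, 0, 50, 0, 0, 0,
		5, 0, 0, 0, 34, 0, 0, 0,
		104, 101, 108, 108, 111, 0, 0, 0,
		98, 121, 101, 0, 0, 0, 0, 0,
	}
	act := []byte{
		0, 0, 0, 0, 7, 0, 0, 0,
		0, 0, 0, 0, 0, 0, 1, 0,
		0, 0, 0, 0, 1, 0, 2, 0,
		9, 0, 0, 0, 0, 0, 0, 0,
		4, 0, 0, 0, 0, 0, 0, 0,
		1, 0, 0, 0, 22, 0, 0, 0,
		0, 0, 0, 0, 0, 0, 0, 0,
		0, 0, 0, 0, 0, 0, 0, 0,
	}
	fmt.Println("differing words:", diffWords(exp, act))
	fmt.Println("word counts (expected, actual):", len(exp)/8, len(act)/8)
}
```

On these inputs the differing words are 0, 4, 6 and 7, which correspond to the dump's line numbers: the segment size in the header, the unset `words` pointer, and the two list elements. The actual message is also two words shorter, since the text contents were never copied.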




Jason E. Aten

Jun 1, 2014, 5:05:36 PM
to Kenton Varda, Albert Strasheim, capnproto
Correction: Text *is* recursively copied, but List(Text) is sliced off:

func TestSetBetweenSegments(t *testing.T) {

    exp := CapnpEncode(`(counter = (size = 9, words = "abc", wordlist = ["hello","bye"]))`, "Bag")

    cv.Convey("Given an struct with Text and List(Text) in one segment", t, func() {
        cv.Convey("assigning it to a struct in a different segment should recursively import", func() {


            seg := capn.NewBuffer(nil)
            scratch := capn.NewBuffer(nil)

            // in seg                                                                                                                              
            segbag := NewRootBag(seg)

            // in scratch                                                                                                                          
            xc := NewRootCounter(scratch)
            xc.SetSize(9)
            tl := scratch.NewTextList(2)
            tl.Set(0, "hello")
            tl.Set(1, "bye")
            xc.SetWordlist(tl)
            xc.SetWords("abc")


            // copy from scratch to seg                                                                                                            
            segbag.SetCounter(xc)

            buf := bytes.Buffer{}
            seg.WriteTo(&buf)

            act := buf.Bytes()
            fmt.Printf("          actual:\n")
            ShowBytes(act, 10)
            //fmt.Printf("act decoded by capnp: '%s'\n", string(CapnpDecode(act, "Bag")))                                                          
            save(act, "myact")

            fmt.Printf("expected:\n")
            ShowBytes(exp, 10)
            //fmt.Printf("exp decoded by capnp: '%s'\n", string(CapnpDecode(exp, "Bag")))                                                          
            save(exp, "myexp")

            cv.So(act, cv.ShouldResemble, exp)
        })
    })
}


results:


          actual:

          00 00 00 00 08 00 00 00    ==(line 00)> stream header: 1 segment(s), this segment has 8 words

          00 00 00 00 00 00 01 00    ==(line 01)> struct-pointer, data starts at +0 words (line 2). {prim: 0, pointers: 1 words}.
          00 00 00 00 01 00 02 00    ==(line 02)> struct-pointer, data starts at +0 words (line 3). {prim: 1, pointers: 2 words}.
          09 00 00 00 00 00 00 00    ==(line 03)> primitive data for struct on line 2
          05 00 00 00 22 00 00 00    ==(line 04)> list of bytes/Text (pointer to: 'abc' at line 6)
          05 00 00 00 16 00 00 00    ==(line 05)> list, first element starts 1 words from here (at line 7). Size: pointer, num-elem: 2
          61 62 63 00 00 00 00 00    ==(line 06)> text contents: abc^@

          00 00 00 00 00 00 00 00    ==(line 07)> struct-pointer, data starts at +0 words (line 8). {prim: 0, pointers: 0 words}.
          00 00 00 00 00 00 00 00    ==(line 08)> struct-pointer, data starts at +0 words (line 9). {prim: 0, pointers: 0 words}.
          expected:

          00 00 00 00 0a 00 00 00    ==(line 00)> stream header: 1 segment(s), this segment has 10 words

          00 00 00 00 00 00 01 00    ==(line 01)> struct-pointer, data starts at +0 words (line 2). {prim: 0, pointers: 1 words}.
          00 00 00 00 01 00 02 00    ==(line 02)> struct-pointer, data starts at +0 words (line 3). {prim: 1, pointers: 2 words}.
          09 00 00 00 00 00 00 00    ==(line 03)> primitive data for struct on line 2
          05 00 00 00 22 00 00 00    ==(line 04)> list of bytes/Text (pointer to: 'abc' at line 6)
          05 00 00 00 16 00 00 00    ==(line 05)> list, first element starts 1 words from here (at line 7). Size: pointer, num-elem: 2
          61 62 63 00 00 00 00 00    ==(line 06)> text contents: abc^@
          05 00 00 00 32 00 00 00    ==(line 07)> list of bytes/Text (pointer to: 'hello' at line 9)
          05 00 00 00 22 00 00 00    ==(line 08)> list of bytes/Text (pointer to: 'bye' at line 10)
          68 65 6c 6c 6f 00 00 00    ==(line 09)> text contents: hello^@
          62 79 65 00 00 00 00 00    ==(line 10)> text contents: bye^@
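
The first line of each dump is the stream framing header, and its segment size is a quick sanity check on whether everything was copied (8 words actual versus 10 expected here). Below is a minimal Go sketch of parsing that header, assuming the standard Cap'n Proto stream framing (a little-endian uint32 holding the segment count minus one, followed by one uint32 size in words per segment). `segmentSizes` is a hypothetical name, not a go-capnproto function.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// segmentSizes parses the stream framing header and returns the size of
// each segment in 8-byte words.
func segmentSizes(msg []byte) ([]uint32, error) {
	if len(msg) < 4 {
		return nil, errors.New("short header")
	}
	// The header stores the segment count minus one.
	n := int(binary.LittleEndian.Uint32(msg)) + 1
	if len(msg) < 4+4*n {
		return nil, errors.New("truncated segment table")
	}
	sizes := make([]uint32, n)
	for i := range sizes {
		sizes[i] = binary.LittleEndian.Uint32(msg[4+4*i:])
	}
	return sizes, nil
}

func main() {
	// Header bytes from line 00 of the "actual" and "expected" dumps above.
	for _, hdr := range [][]byte{
		{0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00},
		{0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00},
	} {
		sizes, err := segmentSizes(hdr)
		fmt.Println(sizes, err)
	}
}
```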

Jason E. Aten

Jun 3, 2014, 9:57:00 AM
to Kenton Varda, Albert Strasheim, capnproto
Okay, I had some extra time today, so I tracked these bugs down.

I have a branch, 'writefix', with fixes in it here:

https://github.com/glycerine/go-capnproto/tree/writefix

Albert, and anyone else using go-capnproto, please, please, test the 'writefix' branch for possible regressions.

I didn't see issues in putting go-capnproto through its paces, but I'd like to be sure and not introduce any by mistake. I'll wait for feedback before pulling the fixes into trunk. 

There is a set of new tests in struct_test.go for the specific issues that I addressed.

Tangentially, there are some utilities in utils2_test.go for interpreting the binary format directly, which help when debugging these kinds of issues. Other language binding authors/implementers may find them useful. See the ShowBytes() routine.

Jason
