Extending an existing type by embedding


Tong Sun

Jan 1, 2018, 9:46:02 PM
to golang-nuts
Hi, 

I think I generally understand how embedding (https://golang.org/doc/effective_go.html#embedding) works in Go.
However, when it comes to the following problem, I'm at a loss again.

I'm trying to extend the `html.Tokenizer` with new methods of my own:

type MyTokenizer struct {
    html.Tokenizer
}

func NewMyTokenizer(i io.Reader) *MyTokenizer {
    z := html.NewTokenizer(i)
    return *MyTokenizer(z)
    return &MyTokenizer{z}
}



so code like

 z := html.NewTokenizer(body)
...
func parseBody(z *html.Tokenizer) {
  tt := z.Next()
...
  testt := z.Token()
...


can become:

 z := NewMyTokenizer(body)
...
func (z *MyTokenizer) parseBody() {
  tt := z.Next()
...
  testt := z.Token()
...




However, I'm really struggling to make it work as I expected.

Can somebody help, please? What's the proper way to extend a type with new methods of my own, while still being able to access all existing methods?

Furthermore, how can I extend the above even further? --

- I plan to define an interface with a new method `WalkBody()`, in which a "virtual" method of `VisitToken()` is used.
- Then I plan to define two different types of MyTokenizer, with their own `VisitToken()` methods, so the same `WalkBody()` method defined in MyTokenizer will behave differently for those two different types.

How should I architect the above in Go?

For your convenience, you can use this as an easy start, when demonstrating your architectural solution.


Thx a lot!


Ian Lance Taylor

Jan 1, 2018, 10:22:16 PM
to Tong Sun, golang-nuts
The best way to get help for this is to show us precisely what you
did, ideally in a small complete, stand-alone, example, and tell us
what you expected to happen, and tell us precisely what happened
instead. In this case I don't know what to suggest because you didn't
say what you expect and you didn't say what happened.



> Further more, how to extend the above even further? --
>
> - I plan to define an interface with a new method `WalkBody()`, in which a
> "virtual" method of `VisitToken()` is used.
> - Then I plan to define two different type of MyTokenizer, with their own
> `VisitToken()` methods, so the same `WalkBody()` method defined in
> MyTokenizer will behave differently for those two different types.
>
> How to architect above in Go?

First, think in Go terms, don't think in terms like "virtual method"
that do not exist in Go.

What you want is something like

type TokenVisitor interface {
    VisitToken()
}

Then your WalkBody function will take a TokenVisitor, and your
different types will implement different VisitToken methods.

(I see that you said WalkBody method, but you probably want a WalkBody
function instead.)

Ian

Tong Sun

Jan 1, 2018, 11:14:14 PM
to Ian Lance Taylor, golang-nuts
The small, complete, stand-alone example has been provided in the OP as

> and tell us
> what you expected to happen, and tell us precisely what happened
> instead.

What I expected to happen: adding the following would work:

type MyTokenizer struct {
    html.Tokenizer
}

func NewMyTokenizer(i io.Reader) *MyTokenizer {
    z := html.NewTokenizer(i)
    return *MyTokenizer(z)
    return &MyTokenizer{z}
}
  
> In this case I don't know what to suggest because you didn't
> say what you expect and you didn't say what happened.

I can't make it work. The

    return *MyTokenizer(z)
    return &MyTokenizer{z}

are just the last two attempts that I made, apart from many other failed attempts that I've lost track of, but neither compiles.

> Further more, how to extend the above even further? --
>
> - I plan to define an interface with a new method `WalkBody()`, in which a
> "virtual" method of `VisitToken()` is used.
> - Then I plan to define two different type of MyTokenizer, with their own
> `VisitToken()` methods, so the same `WalkBody()` method defined in
> MyTokenizer will behave differently for those two different types.
>
> How to architect above in Go?

> First, think in Go terms, don't think in terms like "virtual method"
> that do not exist in Go.
>
> What you want is something like
>
> type TokenVisitor interface {
>     VisitToken()
> }
>
> Then your WalkBody function will take a TokenVisitor, and your
> different types will implement different VisitToken methods.
>
> (I see that you said WalkBody method, but you probably want a WalkBody
> function instead.)

Jason Phillips

Jan 1, 2018, 11:39:25 PM
to golang-nuts
html.NewTokenizer returns a pointer to a Tokenizer. So, you probably want to embed a pointer:

type MyTokenizer struct {
    *html.Tokenizer
}

func NewMyTokenizer(i io.Reader) *MyTokenizer {
    z := html.NewTokenizer(i)
    return &MyTokenizer{z}
}

If for some reason you want or need the Tokenizer value, you'll need to dereference it before making it part of your MyTokenizer structure:

type MyTokenizer struct {
    html.Tokenizer
}

func NewMyTokenizer(i io.Reader) *MyTokenizer {
    z := html.NewTokenizer(i)
    return &MyTokenizer{*z}
}


Jason

Tong Sun

Jan 1, 2018, 11:56:59 PM
to Jason Phillips, golang-nuts
Bingo! Thanks a lot for your clear explanation, Jason! I went with your first choice and it works perfectly.

Now I wish somebody could answer the second part -- extending it even further for two different `VisitToken()` behaviors...


--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/FRE_A6cNzW8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ian Lance Taylor

Jan 2, 2018, 8:57:36 AM
to Tong Sun, golang-nuts
On Mon, Jan 1, 2018 at 8:13 PM, Tong Sun <sunto...@gmail.com> wrote:
>
> Are just the last two attempts that I make, apart from many other failed
> attempts that I've lost track of, but neither compiles.

Thanks. It helps to know that the problem is that the code does not compile.

I see that this issue was already answered.


>> First, think in Go terms, don't think in terms like "virtual method"
>> that do not exist in Go.
>>
>> What you want is something like
>>
>> type TokenVisitor interface {
>> VisitToken()
>> }
>>
>> Then your WalkBody function will take a TokenVisitor, and your
>> different types will implement different VisitToken methods.
>>
>> (I see that you said WalkBody method, but you probably want a WalkBody
>> function instead.)
>
>
> I did meant WalkBody method. See:
>
> https://github.com/suntong/lang/blob/master/lang/Go/src/xml/htmlParserTokens.go#L54

I understand that you asked for a WalkBody method. I'm trying to say:
Go is not C++ or Java. Go does not have virtual methods. It has
interfaces. The natural way to use an interface is to write a
function. You can use a method, but to make it work the way you want
you'll have to pass a value of the type into the method. So you may
as well use a function.
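
A minimal sketch of the point above, with hypothetical types Base and Derived that do not appear in the thread. VisitToken returns a string here only so the result can be printed; the interface suggested earlier has no return value.

```go
package main

import "fmt"

// Base plays the role of the "parent" type.
type Base struct{}

func (Base) VisitToken() string { return "base" }

// WalkBody is a method on Base. Its receiver is always a Base, so
// b.VisitToken() always resolves to Base.VisitToken.
func (b Base) WalkBody() string { return b.VisitToken() }

// Derived embeds Base and declares its own VisitToken.
type Derived struct{ Base }

// This shadows Base.VisitToken for callers holding a Derived,
// but the promoted Base.WalkBody never sees it.
func (Derived) VisitToken() string { return "derived" }

// TokenVisitor is the interface; WalkBody as a function dispatches
// through whatever concrete value the interface holds.
type TokenVisitor interface {
	VisitToken() string
}

func WalkBody(v TokenVisitor) string { return v.VisitToken() }

func main() {
	d := Derived{}
	fmt.Println(d.WalkBody()) // prints base: no virtual dispatch via embedding
	fmt.Println(WalkBody(d))  // prints derived: dispatch via the interface
}
```

The method version only behaves like the function version if the outer value is passed back in explicitly, which is why the function is the simpler choice.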

Ian

Tong Sun

Jan 2, 2018, 9:32:57 AM
to Ian Lance Taylor, golang-nuts
Oh, now I understand what you meant. So it's a hybrid solution.
 
Thanks Ian. 

Hmm... wait, I think what I meant to ask was not answered. 
I'll ask again in a different thread. 


Tong Sun

Jan 2, 2018, 9:45:56 AM
to golang-nuts
Thanks to Jason Phillips' help, the above part is solved:

type MyTokenizer struct {
    *html.Tokenizer
}

func NewMyTokenizer(i io.Reader) *MyTokenizer {
    z := html.NewTokenizer(i)
    return &MyTokenizer{z}
}

What remains is:

Furthermore, how can I extend the above even further? --

- I plan to define an interface with a new method `WalkBody()`, in which a "virtual" method of `VisitToken()` is used.
- Then I plan to define two different types of MyTokenizer, with their own `VisitToken()` methods, so the same `WalkBody()` method defined in MyTokenizer will behave differently for those two different types.

How should I architect the above in Go?

I've found this afterward, 


I'll digest and try it out, and see how that can solve the above problem, because it builds bottom up (not mid-way up). 

Meanwhile, if someone can explain how to think about and solve such a problem in Go, that'd be much appreciated. An explanation at the architectural level would be fine for me; I can try it out myself. The real problem for me is that I have a systematic OO way of thinking about this kind of inherit-and-enhance problem, and a practical mechanism for implementing it: virtual functions. But when it comes to Go, I still need help with how to think and what to do.

Thanks!
 



matthe...@gmail.com

Jan 2, 2018, 10:26:51 AM
to golang-nuts
You want different tokenizer types to be used in the same WalkBody implementation. Interfaces abstract the implementation from the calling computation.

I think Ian did answer your question.

type TokenVisitor interface {
    VisitToken()
}

func WalkBody(of TokenVisitor) {
    // here you call of.VisitToken()
}

type MyTokenizer1 struct {
    *html.Tokenizer
}

func (the MyTokenizer1) VisitToken() {

}

type MyTokenizer2 struct {
    *html.Tokenizer
}

func (the MyTokenizer2) VisitToken() {

}

// call WalkBody somewhere with either MyTokenizer1 or MyTokenizer2

When I wanted to write one list of tests for two different set types I added set.go and set_test.go here: https://github.com/pciet/pathsetbenchmark

Matt

Tong Sun

Jan 2, 2018, 2:05:22 PM
to matthe...@gmail.com, golang-nuts
Oh Thanks Matt, when Ian explained to me, I didn't think it'd work, but on seeing your code illustration, I now think it works perfectly for this case. 

Just that this function approach will break if there are many base class member fields that need to be accessed, as type methods can access them internally, under the encapsulation. But that's a different story...

Thanks again everyone!



matthe...@gmail.com

Jan 2, 2018, 3:56:09 PM
to golang-nuts
If the calling function needs more than the interface methods (such as access to type-specific struct fields) then an approach is to use an interface type switch for per-type behavior: https://tour.golang.org/methods/16
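
A small sketch of such a type switch, assuming two hypothetical visitor types (OutlineVisitor and MarkdownVisitor) with their own fields. The names, and the string parameter on VisitToken, are invented for this example.

```go
package main

import "fmt"

// TokenVisitor is the shared interface; the tok parameter is an assumption.
type TokenVisitor interface {
	VisitToken(tok string)
}

// OutlineVisitor tracks how many tokens it has seen.
type OutlineVisitor struct{ Depth int }

func (o *OutlineVisitor) VisitToken(tok string) { o.Depth++ }

// MarkdownVisitor collects the tokens it has seen.
type MarkdownVisitor struct{ Out []string }

func (m *MarkdownVisitor) VisitToken(tok string) { m.Out = append(m.Out, tok) }

// Describe needs type-specific fields, so it switches on the concrete
// type stored in the interface value.
func Describe(v TokenVisitor) string {
	switch t := v.(type) {
	case *OutlineVisitor:
		return fmt.Sprintf("outline depth=%d", t.Depth)
	case *MarkdownVisitor:
		return fmt.Sprintf("markdown tokens=%d", len(t.Out))
	default:
		return "unknown visitor"
	}
}

func main() {
	o := &OutlineVisitor{}
	o.VisitToken("html")
	o.VisitToken("body")
	fmt.Println(Describe(o)) // prints outline depth=2

	m := &MarkdownVisitor{}
	m.VisitToken("p")
	fmt.Println(Describe(m)) // prints markdown tokens=1
}
```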

Matt

Tong Sun

Jan 2, 2018, 10:52:41 PM
to golang-nuts
I gave it a try, but unfortunately, it doesn't work for me. 

To begin with, this works, 

type MyTokenizer struct {
    *html.Tokenizer
}

func NewMyTokenizer(i io.Reader) *MyTokenizer {
    z := html.NewTokenizer(i)
    return &MyTokenizer{z}
}

However,

On Tue, Jan 2, 2018 at 10:26 AM, <matthe...@gmail.com> wrote:
> You want different tokenizer types to be used in the same WalkBody implementation. Interfaces abstract the implementation from the calling computation.
>
> I think Ian did answer your question.
>
> type TokenVisitor interface {
>     VisitToken()
> }
>
> func WalkBody(of TokenVisitor) {
>     // here you call of.VisitToken()
> }
>
> type MyTokenizer1 struct {
>     *html.Tokenizer
> }
>
> func (the MyTokenizer1) VisitToken() {
>
> }
>
> type MyTokenizer2 struct {
>     *html.Tokenizer
> }
>
> func (the MyTokenizer2) VisitToken() {
>
> }
>
> // call WalkBody somewhere with either MyTokenizer1 or MyTokenizer2


As soon as I tried above and turned 

type MyTokenizer struct 

to

type TokenVisitor interface 

everything started to break down. Here is the code that I've put together so far:


I got:

./htmlParserTokens2.go:82:10: z.Next undefined (type TokenVisitor has no field or method Next)

So let me reiterate what I hope to achieve, 

    1. extend the `html.Tokenizer` with new methods of my own
    2. while still being able to access all existing html.Tokenizer methods in the meantime
    3. define a function `WalkBody()` (or an interface method)
    4. in which an interface method of `VisitToken()` is used, which will behave differently for different types

    The first two goals can be achieved by "type MyTokenizer struct", but as soon as I change that to the interface type to act as the base for the two different extended types, goal #2 breaks.

    So far this is only a small test example, in which only z.Next(), z.Err(), z.TagName() etc. are currently used. But in real life I'll be using all the html.Tokenizer methods and published variables. I.e.,

    I'm not seeing the light at the end of the tunnel where all 4 goals above can be achieved together.

    It seems to me some compromise has to be made. What is the least compromise to make?
    Can anybody help, please?

    Again, the code that I've put together so far is at:


    Thanks



    Dave Cheney

    Jan 2, 2018, 11:33:27 PM
    to golang-nuts
    Put your methods on *MyTokenizer.

    Tong Sun

    Jan 2, 2018, 11:52:37 PM
    to Dave Cheney, golang-nuts
    On Tue, Jan 2, 2018 at 11:33 PM, Dave Cheney <da...@cheney.net> wrote:
    Put your methods on *MyTokenizer.

    Yeah, that works for the struct but not for interface:

    *TokenVisitor is pointer to interface, not interface
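
One way to read that compiler message: the pointer belongs on the concrete type's receiver, not on the interface. The sketch below is a guess at the intended shape, not the code from the thread. MyTokenizer here is a simplified stand-in without the embedded html.Tokenizer, and the tok parameter is an assumption.

```go
package main

import "fmt"

// TokenVisitor stays a plain interface type.
type TokenVisitor interface {
	VisitToken(tok string)
}

// MyTokenizer is a simplified stand-in that just records tokens.
type MyTokenizer struct {
	seen []string
}

// The method is declared on *MyTokenizer so it can modify the struct.
func (z *MyTokenizer) VisitToken(tok string) {
	z.seen = append(z.seen, tok)
}

// WalkBody takes TokenVisitor, never *TokenVisitor. The interface value
// itself holds the *MyTokenizer pointer.
func WalkBody(v TokenVisitor, toks []string) {
	for _, t := range toks {
		v.VisitToken(t)
	}
}

func main() {
	z := &MyTokenizer{}
	WalkBody(z, []string{"html", "body"})
	fmt.Println(len(z.seen)) // prints 2
}
```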

     

    matthe...@gmail.com

    Jan 3, 2018, 9:08:26 AM
    to golang-nuts
    1. extend the `html.Tokenizer` with new methods of my own
    2. while still able to access all existing html.Tokenizer methods in the mean time

     You have this by struct embedding:

    z := MyTokenizer1{html.NewTokenizer(r)}
    // don’t take the address, a pointer to a struct of pointers doesn’t make sense unless you are changing the pointers
    // z.Next() works here, although maybe you need to do z.Tokenizer.Next()
    // z.printElmt(depth, text) works here

    3. define a function `WalkBody()` (or an interface method)

    “or an interface method” doesn’t make sense to me.

    By assigning z to a TokenVisitor interface var you lose all functionality besides the TokenVisitor methods unless you use a type assertion or type switch to convert the var back to a MyTokenizer1 in htmlWalk.

    Instead of going back to the concrete type with the type assertion I suggest you add the method “Next() html.TokenType” and the other needed methods (printElmt, Err, TagName, VisitToken) to TokenVisitor.

    4. in which an interface method of `VisitToken()` is used, which will behave differently for different types

    You have this by defining different implementations for the TokenVisitor interface methods.

    Why do you need varying types if you are just using html.Tokenizer methods? What is the difference between each type?
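
The suggestion above, to list Next and the other needed methods in the interface itself, could look roughly like this. It is only a sketch: bufio.Scanner stands in for html.Tokenizer so that it runs with the standard library alone, so Scan and Text take the place of Next, Err and TagName. Walker, UpperVisitor and newWordScanner are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Walker lists every method the walk needs: the ones promoted from the
// embedded scanner plus our own VisitToken.
type Walker interface {
	Scan() bool
	Text() string
	VisitToken(tok string)
}

// UpperVisitor embeds the scanner, so Scan and Text are promoted and
// *UpperVisitor satisfies Walker once VisitToken is defined.
type UpperVisitor struct {
	*bufio.Scanner
	Out []string
}

func (u *UpperVisitor) VisitToken(tok string) {
	u.Out = append(u.Out, strings.ToUpper(tok))
}

// WalkBody only sees the interface, yet can still drive the scanner.
func WalkBody(w Walker) {
	for w.Scan() {
		w.VisitToken(w.Text())
	}
}

// newWordScanner returns a scanner that yields whitespace-separated words.
func newWordScanner(s string) *bufio.Scanner {
	sc := bufio.NewScanner(strings.NewReader(s))
	sc.Split(bufio.ScanWords)
	return sc
}

func main() {
	u := &UpperVisitor{Scanner: newWordScanner("html body p")}
	WalkBody(u)
	fmt.Println(u.Out) // prints [HTML BODY P]
}
```

The cost of this design is that the interface grows with every tokenizer method the walk uses.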

    Matt

    Tong Sun

    Jan 3, 2018, 9:30:14 AM
    to matthe...@gmail.com, golang-nuts
    On Wed, Jan 3, 2018 at 9:07 AM, <matthe...@gmail.com> wrote:

    Why do you need varying types if you are just using html.Tokenizer methods? What is the difference between each type?


    The difference is VisitToken(): the same `WalkBody()` function is used, but it achieves different results.

    For example, the current output from https://github.com/suntong/lang/blob/master/lang/Go/src/xml/htmlParserTokens2.go is one way of abstracting the HTML structure, and I am also planning to produce text output that is close to the XML Outline View from Oxygen XML Editor, or to convert HTML to .md.

    All of the above involve walking the HTML the same way, but producing different results.



    matthe...@gmail.com

    Jan 3, 2018, 9:47:02 AM
    to golang-nuts
    Ah, this kind of function signature may be better:

    func WalkBody(t *html.Tokenizer, w TokenVisitor) {

    Then you would use the regular *html.Tokenizer methods to do the walk and pass each token to the TokenVisitor to be parsed for output depending on which TokenVisitor was picked.
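
A runnable sketch of this two-argument shape, again with bufio.Scanner standing in for *html.Tokenizer so that no external package is needed. Counter, Outline and words are made-up names, and passing the token as a string is an assumption about what the visitor receives.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// TokenVisitor only decides what to do with each token.
type TokenVisitor interface {
	VisitToken(tok string)
}

// WalkBody owns the walk: it drives the tokenizer (a bufio.Scanner here)
// and hands every token to whichever visitor was passed in.
func WalkBody(t *bufio.Scanner, v TokenVisitor) {
	for t.Scan() {
		v.VisitToken(t.Text())
	}
}

// Counter counts tokens.
type Counter struct{ N int }

func (c *Counter) VisitToken(string) { c.N++ }

// Outline renders each token as a list item.
type Outline struct{ Lines []string }

func (o *Outline) VisitToken(tok string) {
	o.Lines = append(o.Lines, "- "+tok)
}

// words builds a scanner that splits its input on whitespace.
func words(s string) *bufio.Scanner {
	sc := bufio.NewScanner(strings.NewReader(s))
	sc.Split(bufio.ScanWords)
	return sc
}

func main() {
	c := &Counter{}
	WalkBody(words("html head body"), c)
	fmt.Println(c.N) // prints 3

	o := &Outline{}
	WalkBody(words("html body"), o)
	fmt.Println(strings.Join(o.Lines, "\n"))
}
```

The same walk produces different results depending on the visitor, and the visitor types no longer need to embed the tokenizer at all.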

    Matt

    Tong Sun

    Jan 3, 2018, 8:25:15 PM
    to matthe...@gmail.com, golang-nuts
    Thanks Matt -- I thought it wouldn't work, but after thinking it over and over, I finally got it working.

    Thanks a lot


    matthe...@gmail.com

    Jan 4, 2018, 9:15:16 AM
    to golang-nuts
    Tong, I’m glad this works for you.

    Dave, reading back I see that we’re giving conflicting advice on pointers for MyTokenizer. While my view is that the dereferences are not necessary here, perhaps you have other reasons to have a pointer in this case?

    Thanks,
    Matt

    Dave Cheney

    Jan 4, 2018, 3:49:37 PM
    to golang-nuts


    On Friday, 5 January 2018 01:15:16 UTC+11, matthe...@gmail.com wrote:
    Tong, I’m glad this works for you.

    Dave, reading back I see that we’re giving conflicting advice on pointers for MyTokenizer. While my view is that the dereferences are not necessary here, perhaps you have other reasons to have a pointer in this case?

    I was just guessing, I didn't fully understand the problem the OP was asking.