Are "Generic Functions" Using Protocols Possible?

592 views
Skip to first unread message

Thomas Wetmore

unread,
Aug 5, 2014, 5:38:54 PM8/5/14
to swift-l...@googlegroups.com
More experimentation with Swift, and another demonstration of ignorance, I'm afraid. I write a lot of parsers and other code that uses nodes arranged in trees. I've written the following generic protocol for "nodeness" and have a couple concrete classes that implement it, and they work as expected.

protocol Node {
    typealias NodeAlias: Node
    var tag: String { get set }
    var value: String? { get set }
    var children: [NodeAlias]? { get set }
    var parent: NodeAlias? { get set }
}

Now, I would like to write some "generic" routines that process trees made of any kinds of nodes that implement this protocol. Here is an example of one such routine that would count all the nodes at and below any node. I guess I expected this function to "just work", but the compiler complains with the following:

"Protocol 'Node' can only be used as a generic constraint because it has Self or associated type requirements"

func count (node: Node) -> Int {   // <<---- compiler error (above) at first character of "Node".
    var count = 1
    if let children = node.children { for child in children { count += count(child) } }
    return count
}

When I did this in Objective-C, I wrote a "NodeProto" protocol and an "AbstractNode" class. I put these generic methods in the AbstractNode class and then had all the concrete node classes inherit both the prototype and the abstract class. I was really hoping to get rid of the abstract class (only meaning "abstract" in the sense that objects of the class are not instantiated by convention).

I "seems to me" that these generic functions should work, so I'm hoping I'm just missing something obvious. 

Can you tell I'm looking for some sage advice?

Kevin Greene

unread,
Aug 5, 2014, 5:49:33 PM8/5/14
to Thomas Wetmore, swift-l...@googlegroups.com
Does it work if you do this instead:

@class_protocol protocol Node { ... }

I'm not near my mac right now or I would verify that works, but I remember having to do that for a delegate protocol.


--
You received this message because you are subscribed to the Google Groups "Swift Language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to swift-languag...@googlegroups.com.
To post to this group, send email to swift-l...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/swift-language/91abca90-842b-4c0f-9b71-e382a5e67065%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Garth Snyder

unread,
Aug 5, 2014, 5:54:58 PM8/5/14
to Thomas Wetmore, swift-l...@googlegroups.com
Thomas, take a look here and see if that helps.

The problem is not the protocol per se; it’s the fact that the protocol includes a typealias. Therefore, just specifying the protocol is no longer sufficient to fully identify the object. (I wish I could be clearer about the exact reason for this, but I don’t understand it myself. It seems like you should be able to read the typealias right out of any conformant object, but perhaps that information is only available at compile time.)

You may be able to skirt the issue with a construction such as:

func count<T: Node>(node: T) -> Int { ...

Garth

 
On Aug 5, 2014, at 2:38 PM, Thomas Wetmore <ttwet...@gmail.com> wrote:

I've written the following generic protocol for "nodeness" and have a couple concrete classes that implement it, and they work as expected. Now, I would like to write some "generic” routines...

Marco S Hyman

unread,
Aug 5, 2014, 6:28:14 PM8/5/14
to Thomas Wetmore, swift-l...@googlegroups.com
Perhaps making count a generic func is what you want to do..

func count<T: Node>(node: T) -> Int {
// ...
}

Lets see.... OK a simplified version that the playground likes

protocol Node {
var tag: String { get set }
var value: String? { get set }
var children: [Node]? { get set }
var parent: Node? { get set }
}

func count<T: Node>(node: T) -> Int {
var c = 1
if let children = node.children {
for child in children {
c += count(child)
}
}
return c
}

* I removed the typealias just to simplify things.
* Count is generic over types that conform to Node
* You can NOT re-define count inside of the function count and expect
to get the outer definition when invoking count(child).

I haven't actually tried to use this to see if it works other than
noting that this

struct SomeStruct: Node {
var tag: String
var value: String?
var children: [Node]?
var parent: Node?
}


Is also happy in the playground.

Thomas Wetmore

unread,
Aug 5, 2014, 7:44:11 PM8/5/14
to swift-l...@googlegroups.com, ttwet...@gmail.com
Garth, Marco,

Yes, thanks to you both; your approaches work. The below solves my problem. My original "non-solution" looks better to me, and in my opinion "should work", but this generic approach is clean enough for government work. We're not building a church.

protocol Node {
    typealias NodeAlias: Node
    var tag: String { get set }
    var value: String? { get set }
    var children: [NodeAlias]? { get set }
    var parent: NodeAlias? { get set }

Marco S Hyman

unread,
Aug 5, 2014, 8:23:07 PM8/5/14
to Thomas Wetmore, swift-l...@googlegroups.com
On Aug 5, 2014, at 4:44 PM, Thomas Wetmore <ttwet...@gmail.com> wrote:

> protocol Node {
> typealias NodeAlias: Node

What is the purpose of that line? You have hard coded NodeAlias as
an alias for Node. Why not just use Node?

Or did you mean

typealias NodeAlias

which is a placeholder for the actual type used when the protocol is
adopted. Look up Associated Types in the Swift Programming Language.

Marc
Message has been deleted

Thomas Wetmore

unread,
Aug 5, 2014, 8:47:36 PM8/5/14
to swift-l...@googlegroups.com, ttwet...@gmail.com
The typealias seems to be needed. Say I implement the following:

class GedcomNode: Node {
    weak var Parent: GedcomNode? { ... }
    ....
}

without the typealias in the protocol the above would generate an error on the GedcomNode? type, requiring me to replace it with just Node?. With the typealias I can implement it as shown here using GedcomNode?, which is as it should be.

I don't truly understand this. And I don't like this. There was an earlier question of mine on this list that elicited this knowledge from someone on the list. The explanation somehow goes that by using the typealias, protocols can be made generic. Though I had thought that by their very nature protocols were already generic.

[In fact it's even worse. If I remove the typealias and change the NodeAlias's to Nodes, the compiler crashes.]

Marco S Hyman

unread,
Aug 5, 2014, 9:13:54 PM8/5/14
to Thomas Wetmore, swift-l...@googlegroups.com
On Aug 5, 2014, at 5:47 PM, Thomas Wetmore <ttwet...@gmail.com> wrote:

> The typealias seems to be needed. Say I implement the following:
>
> class GedcomNode: Node {
> weak var Parent: GedcomNode? { ... }
> ....
> }

Your typealias types NodeAlias as a Node, not a GedcomNode. The doc
suggests typelias NodeAlias (not typealias NodeAlias: Node) is the
proper construct. However when I try than with some test code I
trigger a different error in the playground.


protocol Node {
typealias NodeAlias
var tag: String { get set }
var value: String? { get set }
var children: [NodeAlias]? { get set }
var parent: NodeAlias? { get set }
}

func count<T: Node>(node: T) -> Int {
var c = 1
if let children = node.children {
for child in children {
c += count(child)
}
}
return c
}

struct SomeStruct: Node {
var tag: String
var value: String?
var children: [SomeStruct]?
var parent: SomeStruct?
}

The line "c += count(child) gives me the error
'Int' is not identical to 'UInt8'

I haven't a clue where that comes from

Marc

Thomas Wetmore

unread,
Aug 5, 2014, 9:36:36 PM8/5/14
to Marco S Hyman, swift-l...@googlegroups.com
Marco,

I get the same compiler error when building my full executable command line program, when I get rid of the ": Node". In my naive way I chalk it up to "too many things with the same name, e.g., 'count'".

The version as I sent last time is the only one that I have found yet that compiles and runs correctly.

Tom

Thomas Wetmore

unread,
Aug 5, 2014, 10:38:22 PM8/5/14
to swift-l...@googlegroups.com
Ah, but then I tried to write another generic node function to do a depth first traversal that runs a trivial visit function, which leads to yet another indecipherable (by me) compiler message:

// Perform a depth first recurse through a tree of nodes doing something.

func depthFirst<T: Node> (node: T, visit: (T) -> Int) {

    if let children = node.children {

        for child in children { depthFirst(child, visit) }  // 'T.NodeAlias' is not a subtype of 'T'

                                                  ^

    }

    visit(node)

}


So it's back to the drawing board.


Tom


On Tuesday, August 5, 2014 9:36:36 PM UTC-4, Thomas Wetmore wrote:
Marco,

I get the same compiler error when building my full executable command line program, when I get rid of the ": Node". In my naive way I chalk it up to "too many things with the same name, e.g., 'count'".

The version as I sent last time is the only one that I have found yet that compiles and runs correctly.

Tom

On Aug 5, 2014, at 9:13 PM, Marco S Hyman wrote:

Garth Snyder

unread,
Aug 5, 2014, 11:05:46 PM8/5/14
to Thomas Wetmore, swift-l...@googlegroups.com
If this “Parent” is meant to be “parent”, the property defined in the protocol, then it’s not clear to me that what you’re trying to do makes sense. 

Here’s my thinking: the protocol promises a “parent” property of type Node that is both gettable and settable. In other words, you can look at parent and get a Node back, and you can also set parent to any Node. When you define your class’s parent as a GedcomNode, you’re violating the setter part of that contract since only GedcomNodes are acceptable as parents. So it seems reasonable that the compiler would balk.

You could get around this by making your implementations of the Node properties procedural and referencing other variables that are of type GedcomNode, but then you'd still end up with your API disgorging generic Nodes for parents and children, which is probably not what you want.

I suspect that a typealias is in fact the right solution here. But it’s not just a magic incantation to plop into the protocol; you have to define its value for each specific implementation. So here, in schematic form,

protocol Node {
    typealias NodeType
    var parent: NodeType? { get set }
}

class RealNode: Node {
    typealias NodeType = RealNode
    var parent: RealNode?

    init(_ p: RealNode? = nil) {
        parent = p
    }
}

Now RealNode is compliant with Node, but the compiler knows that it always requires or returns a RealNode as its parent.

Garth

Thomas Wetmore

unread,
Aug 6, 2014, 5:39:24 AM8/6/14
to swift-l...@googlegroups.com, ttwet...@gmail.com
Garth,

Thanks, yes, I meant parent, not Parent. Sorry for that confusion.

What I am doing seems clear to me. To meet the Node protocol an implementing type must provide a get/set parent property that must get/set an object meeting the Node protocol. If a RealNode type implements its parent property to get/set a RealNode, that meets the protocol requirement.

You imply that a concrete type that implements the Node protocol must be able to set an instance's parent property to any object that meets the Node protocol. Wouldn't that bring the protocol concept tumbling down?

The reason for this thread was because I was too obtuse to figure out how to write generic functions using any concrete instances that meet a given protocol. I believe it is only in those types of functions that we can require parameters that implement the protocols to be of any implementing real type.

Tom

Garth Snyder

unread,
Aug 6, 2014, 6:30:32 PM8/6/14
to Thomas Wetmore, swift-l...@googlegroups.com
Thomas Wetmore <ttwet...@gmail.com> wrote: You imply that a concrete type that implements the Node protocol must be able to set an instance's parent property to any object that meets the Node protocol. Wouldn't that bring the protocol concept tumbling down?

Well, I don’t know about it ripping a hole in the fabric of the universe, but I agree: it’s not what you want for this application. These last few posts about the behavior of typealias-less protocols have been tangential to your real problem, though, so let me go back to something that may be helpful to you right now.

Your implementation of depthFirst, in combination with the Node protocol as you proposed it in the first message in this thread, is very close to being correct. To recap, here’s a simplified version:

protocol Node {
    typealias NodeType: Node
    var children: [NodeType] { get set }
}

func depthFirst<T: Node> (node: T, visit: (T) -> Int) {
    for child in node.children {
        depthFirst(child, visit)
    }
    visit(node)
}

In fact, the compiler complains about the recursive call to depthFirst() with the error “T.NodeType is not a subtype of T”. (As you note.)

The compiler is right: just because NodeType must be a type of Node doesn’t mean that it is the same type as the object whose children you are accessing. What you really want is a way to say that Node implementations of class X return lists of children whose types are also X. 

And in fact this exists, though it doesn’t seem to be covered in the Swift book:

protocol Node {
    var children: [Self] { get set }
}

Self behaves like a typealias in that it “poisons” the protocol for use as a parameter or return type, so you will still need to define depthFirst() as a generic function. (In fact, you can use exactly the same implementation shown above.)

This should work right now in beta 5.

OK, back to the tangential stuff, just for fun. Remember, here we’re back in the hypothetical world of Node protocols with no Self or typealias references, which I don’t think can be made to work in practice...

To meet the Node protocol an implementing type must provide a get/set parent property that must get/set an object meeting the Node protocol. If a RealNode type implements its parent property to get/set a RealNode, that meets the protocol requirement.

No, it doesn’t. Properties and self-referential protocols disguise the basic issue a bit, so consider this simpler analogy:

protocol Vehicle { … }

protocol Passenger {
    func boardVehicle(v: Vehicle)
}

class Train : Vehicle { … }

class Commuter : Passenger {
    func boardVehicle(t: Train) {
        …
    }
}

The compiler rejects this code because Commuter does not in fact conform to the Passenger protocol. Train is a more specific type than Vehicle, and a Passenger has to be able to board any kind of Vehicle, not just Trains.

Just as not all Vehicles are Trains, not all Nodes are RealNodes. Does this help?

Garth

Thomas Wetmore

unread,
Aug 6, 2014, 7:55:48 PM8/6/14
to swift-l...@googlegroups.com, ttwet...@gmail.com
Garth,

Thanks so much. I will have to read and reread your message a few times before I can meaningfully reply.

But I can share a solution to the depthFirst algorithm that works:

// Perform a depth first recurse through a tree of nodes doing something.
func depthFirst<T: Node> (var node: T, visit: T -> ()) {
    if !node.mark {

        if let children = node.children {
            for child in children {
                depthFirst(child as T, visit)
            }
        }
        visit(node)
        node.mark = true
    }
}

A couple minor changes. 1) I added the notion of a marking nodes to prevent infinite loops on networks of nodes with cycles; 2) I changed the arity of the visit function/closure to have no return type; and 3) I got rid of the error that was blocking me by adding "as T" to the child argument to the recursive call.

I will admit that this was more of a trial and error fix than cerebral cogitation. However, with this implementation I now have a generic depth first traverser that all my concrete implementations of Node can use.

Here's a question, however. Because this algorithm now mutates Nodes by changing the state of the mark predicate values, does this imply that Nodes must be implemented as classes? Could a Node be implemented as a struct and this algorithm still work?

I have moved further and now am working with a Tree protocol, where a Tree is made up of networks of Nodes. This was necessitated by some of the issues of how one can dependably mark and unmark networks of Nodes when the Nodes were left in mixed marked states (some Nodes marked, some not), and you want to 1) guarantee that all Nodes are either marked or unmarked at the end of the marking and unmarking operations; 2) no infinite loops regardless of starting state; and 3) Nodes don't have to implement Hashable or Equatable. In doing so I have come across a new series of gotchas, some now fixed, but regrettably, some not.

Maybe more later when I better grasp what you have written. I am a seat of the pants programmer, and this stuff is giving me a headache.

Tom

Marco S Hyman

unread,
Aug 6, 2014, 8:16:28 PM8/6/14
to swift-l...@googlegroups.com
I wrote..

>
>> protocol Node {
>> typealias NodeAlias: Node
>
> What is the purpose of that line? You have hard coded NodeAlias as
> an alias for Node. Why not just use Node?

because I read "NodeAlias: Node" but thought "NodeAlias = Node"

Oops. Just one of the (many) pitfalls of learning a new language.
Sorry for the noise.

Thomas Wetmore

unread,
Aug 7, 2014, 10:57:01 AM8/7/14
to Garth Snyder, swift-l...@googlegroups.com
Garth,

Have read your email a couple times now. I'm not sure I agree with the following assumption:

On Aug 6, 2014, at 6:30 PM, Garth Snyder wrote:

> The compiler is right: just because NodeType must be a type of Node doesn’t mean that it is the same type as the object whose children you are accessing. What you really want is a way to say that Node implementations of class X return lists of children whose types are also X.

I don't want to make that restriction. I don't see any reason why the depthFirst algorithm should care one way or the other whether all the nodes it visits in one particular execution be of the same concrete type. I can easily foresee a situation where one would want children Nodes of various concrete types. I am not doing that in my application, but I don't see any reason why it shouldn't be if the application calls for it; say a parse tree made up of nodes of many subtypes. The only requirement that depthFirst should "care about" is that every node meet the pure Node protocol. Would you disagree with that?

The solution I came up with that added the "as T" casting seems to have solved the problem fine. I will put together a test case where I build a tree of Nodes of more than one concrete type to see if things still work correctly

> And in fact this exists, though it doesn’t seem to be covered in the Swift book:
>
> protocol Node {
> var children: [Self] { get set }
> }

This is very interesting. Is it described anywhere what the real semantics of Self are in this case?

Tom

Garth Snyder

unread,
Aug 7, 2014, 3:31:46 PM8/7/14
to Thomas Wetmore, swift-l...@googlegroups.com
I don't want to make that restriction [that node trees be homogeneous]. I don't see any reason why the depthFirst algorithm should care one way or the other whether all the nodes it visits in one particular execution be of the same concrete type. 

If the Node protocol contains a typealias or Self, I think you have to do it this way to make it possible to write depthFirst at all. When depthFirst calls itself, Swift has to be able to trace back the argument type to a specific NodeType, and the only place that can come from is T.

“But my current code works just fine, and it doesn’t have this restriction.”  In fact, it does. When you write this:

func depthFirst<T: Node>(node: T) {
    for child in node.children {
        depthFirst(child as T)
    }
}

the “child as T” evaluates to nil if child is not in fact a T. So child nodes are implicitly constrained to be of similar type to their parents.

If you want complete genericity everywhere, it’s easily achieved:

protocol Node {
    var children: [Node] { get set }
}

func depthFirst(node: Node) {
    for child in node.children {
        depthFirst(child)
    }
}

class NodeImplementation : Node {
    var children: [Node] = []
}

The disadvantage is that type information is lost even within the node implementations. You have to cast even your own children to do anything more than Node operations on them. Or you can have a parallel private (non-Node) API that’s type-specific — or any number of other work-arounds — but there will certainly be some minor friction there.

The bottom line is that you just have to make some explicit decisions about the level of type-specificity you want to enforce. If you mix degrees of type specificity, you’ll need glue.

Garth

Garth Snyder

unread,
Aug 7, 2014, 3:41:25 PM8/7/14
to Thomas Wetmore, swift-l...@googlegroups.com
Unknown, captain.

I assume it’s a placeholder that is replaced with a specific type at compile time when an implementation is encountered.

I’d love to read more if anyone knows where…

Garth

Thomas Wetmore

unread,
Aug 8, 2014, 1:43:04 PM8/8/14
to swift-l...@googlegroups.com, ttwet...@gmail.com
Thanks for all the responses. At this point all I can say is that I have learned a lot, but have a lot more to learn. I am a bit frustrated about how difficult it is proving to be to define a simple protocol for Trees and Nodes and to write generic functions that work for any concrete implementations of those protocols.

I wanted to sum up this thread by showing what I believe would be the ideal way to define protocols for Nodes and Trees of Nodes, and include a couple generic algorithms that would work for all concrete implementation of the two protocols. However, the following code generates a great number of compiler errors. In addition any concrete classes that try to implement these protocols also generates many compiler errors.

Experts reading this code may be amused by my naiveté. All I know is that this seems to me to be the right way to write this kind of code.

// Protocol for nodes.

protocol Node {

    var tag: String { get set }

    var value: String { get set }

    var mark: Bool { get set }

    var children: [Node] { get set }

    var parent: Node? { get set }

}


// Protocol for trees of nodes (actually graphs).

protocol Tree {

    var nodes: [Node] { get set }

    var root: Node { get set }

}


// Unmark all nodes in a tree/graph.

func unmark (var tree: Tree) {

    for node in tree.nodes {

        node.mark = false

    }

}


// Depth first traverse a tree/graph.

func depthFirst (var tree: Tree, visit: Node -> ()) {

    unmark(tree)

    depthFirst(tree.root, visit)

}


// Depth first traverse from a node.

func depthFirst (var node: Node, visit: Node -> ()) {

    if !node.mark {

        for child in children {

            depthFirst(child, visit)

        }

        visit(node)

        node.mark = true

    }

}


Reply all
Reply to author
Forward
0 new messages