Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

An added line to Tarjan's SCC algorithm in Muchnik's Advanced Compiler Design

98 views
Skip to first unread message

woon...@gmail.com

unread,
Aug 16, 2018, 8:41:12 AM8/16/18
to
The official errata for the book, Advanced Compiler Design and Implementation
by Steven Muchnik added, among other fixes, this line to the Tarjan's algorithm
to find maximal strongly connected components (SCCs) from a directed graph.

All_SCC U= {{Stack[1]}}

where I replaced with brackets an arrow to mean retrival of an element from a
sequence named Stack.

The full context (p.195) reads as

... some declarations and init code omitted...
for each x from node sets do
if Dfn(x) = 0 then
Strong_Components(x,Succ)
fi
od
All_SCC U= {{Stack[1]}}

The original paper has no such line and I failed to figure out when it is
necessary.

It seems to imply the stack has one node that should be added to the set of
SCCs, but AFAIK the stack should be empty when returning an initial call to
Strong_Components() above. What am I missing here, or is this another added bug
as ones for Tarjan's dominator algorithm?

Thanks in advance.

David Lovemore

unread,
Aug 17, 2018, 10:40:02 AM8/17/18
to
On Thursday, August 16, 2018 at 1:41:12 PM UTC+1, woon...@gmail.com wrote:
> The official errata for the book, Advanced Compiler Design and Implementation
> by Steven Muchnik added, among other fixes, this line to the Tarjan's algorithm
> to find maximal strongly connected components (SCCs) from a directed graph.
>
> All_SCC U= {{Stack[1]}}
>
> where I replaced with brackets an arrow to mean retrival of an element from a
> sequence named Stack. ...


I don't have the book, so I can't see implementation of
Strong_Components. But I believe the description of the algorithm on
wikipedia is correct. I suggest comparing with that.

https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm

woon...@gmail.com

unread,
Aug 24, 2018, 10:30:24 AM8/24/18
to
David Lovemore wrote:
> On Thursday, August 16, 2018 at 1:41:12 PM UTC+1, woon...@gmail.com wrote:
> > The official errata for the book, Advanced Compiler Design and Implementation
> > by Steven Muchnik added, among other fixes, this line to the Tarjan's algorithm
> > to find maximal strongly connected components (SCCs) from a directed graph.
> >
> > All_SCC U= {{Stack[1]}}
> >
> > where I replaced with brackets an arrow to mean retrival of an element from a
> > sequence named Stack. ...
>
>
> I don't have the book, so I can't see implementation of
> Strong_Components.

Strong_Components implements an algorithm essentially same to what
Tarjan's original paper specifies.

Strong_Components looks like this:

x is a parameter to denote a flowgraph node

LowLink(x) = Dfn(x) = NextDfn += 1
push x to Stack
for each y from x's successors do
if Dfn(y) = 0 then
Strong_Components(y)
LowLink(x) = min(LowLink(x), LowLink(y))
elif Dfn(y) < Dfn(x) and y is on Stack then
LowLink(x) = min(LowLink(x), Dfn(y))
fi
od
if LowLink(x) = Dfn(x) then // x is root of SCC
SCC is an empty set
while Stack is not empty do
z = top from Stack without popping
if Dfn(z) < Dfn(x) then
add set SCC to another set All_SCC
return
fi
pop from Stack
add z to SCC
od
add set SCC to set All_SCC
fi

> But I believe the description of the algorithm on
> wikipedia is correct. I suggest comparing with that.
>
> https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm

Thanks for the reference. I compared and found no differences.

If the line in question had come from the text instead of the errata,
I would not have posted this question. I still wonder why the author
decided to add the line that looks unnecessary.

Thanks.


--
Woong Jun (woong.jun at gmail.com)
http://code.woong.org

David Lovemore

unread,
Aug 27, 2018, 9:21:35 AM8/27/18
to
On Friday, August 24, 2018 at 3:30:24 PM UTC+1, woon...@gmail.com wrote:
> David Lovemore wrote:
> > On Thursday, August 16, 2018 at 1:41:12 PM UTC+1, woon...@gmail.com
wrote:
> > > The official errata for the book, Advanced Compiler Design and
Implementation
> > > by Steven Muchnik added, among other fixes, this line to the Tarjan's
algorithm
> > > to find maximal strongly connected components (SCCs) from a directed
graph.
> > >
> > > All_SCC U= {{Stack[1]}}
> > >
> > > where I replaced with brackets an arrow to mean retrival of an element
from a
> > > sequence named Stack. ...
> [...]
> >
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algori
thm
>
> Thanks for the reference. I compared and found no differences.
>
> If the line in question had come from the text instead of the errata,
> I would not have posted this question. I still wonder why the author
> decided to add the line that looks unnecessary.

It is unnecessary. Thinking through why this is the case I found some
simplifications to the code.

Strong_Components only ever returns anything on the stack if it has found an
edge to an earlier node on the stack. When a node is reached that can't reach
an earlier node, that node and everything that is newer than it on the stack
is removed from the stack and recorded as an SCC.

Importantly in each call from the top level, lowLink[x] will never decrease as
there can't be a reachable y on the stack with a lower Dfn[y]. So, at the end
of the function all the nodes remaining on the stack will be popped as an SCC
and added to All_SCC.

My simplifications are:

1. It is unnecessary to label LowLink on all the nodes as it can be kept as a
local variable and returned from Strong_Components.

2. Instead of recording Dfn(x), we can record Index(x) = index of x when
placed on Stack.
As x is never placed on Stack more than once this is unique to x. When we need
to test for y being on Stack, we can check Stack(Index(y))==y. This means we
have no need for a separate "OnStack" flag/hashtable.

3. The need for an extra check that Index(y) is still an index on the stack
can be avoided. (So we omit "Index(y) < len(Stack)" check before checking
"Stack(Index(y))==y" as it is implied by "Index(y) < lowestFound".)

2 and 3 assume that you have an equality test on nodes.

Here is my Python code for reference:
def TarjanSCC(nodes, Succ):
Dfn = {x:0 for x in nodes}
LowLink = dict()
NextDfn = 0
Stack = []
All_SCC = []
def Strong_Components(x, Succ):
nonlocal NextDfn
nonlocal All_SCC
NextDfn+=1
Dfn[x] = NextDfn
LowLink[x] = Dfn[x]
Stack.append(x)
for y in Succ[x]:
if Dfn[y] == 0:
Strong_Components(y, Succ)
LowLink[x] = min(LowLink[x], LowLink[y])
elif Dfn[y] < Dfn[x] and y in Stack:
LowLink[x] = min(LowLink[x], Dfn[y])
if LowLink[x] == Dfn[x]:
# All nodes on Stack are ascending in Dfn.
# All nodes between when Dfn[x] was pushed and
# top of stack are SCC.
SCC = set()
while Stack:
z = Stack[-1]
if Dfn[z] < Dfn[x]:
All_SCC.append(SCC)
return
Stack.pop()
SCC.add(z)
All_SCC.append(SCC)

for x in nodes:
if Dfn[x] == 0:
Strong_Components(x,Succ)
return All_SCC

def SimpleSCC(nodes, Succ):
index = dict() # undefined if unseen else stack index when on stack
stack = []
all_SCC = []
def lowestReachable(x, succ):
"""Returns lowest index on stack of reachable nodes from x.
All found SCCs are added to all_SCC."""
nonlocal stack
nonlocal all_SCC
lowestFound = len(stack)
index[x] = len(stack)
stack.append(x)
for y in succ[x]:
i = index.get(y)
if i is None: # y not in Index
lowestFound = min(lowestFound, lowestReachable(y, succ))
elif i < lowestFound and stack[i]==y:
lowestFound = i # y is node lower on stack.
if lowestFound == index[x]: # No lower node found.
# Nodes between lowestFound and top of stack are single SCC.
all_SCC.append(set(stack[lowestFound:]))
stack = stack[:lowestFound] # Trim stack to low elements.
return lowestFound

for x in nodes:
if x not in index:
lowestReachable(x,Succ)
return all_SCC

nodes = [1,2,3,4,5,6,7,8]
Succ={1:[2,3], 2:[4,5], 3:[4], 4:[3,7,8], 5:[2], 6:[3], 7:[7], 8:[3,7]}
print(nodes)
print(Succ)
print(TarjanSCC(nodes, Succ))
print(SimpleSCC(nodes, Succ))
0 new messages