Recently I visited one of our most important clients for on-site
debugging and consulting. Their 1394 configuration was complicated and
heavily loaded with traffic, and they were experiencing some very
unusual behavior.
In due course I noticed that there were some hardware discrepancies on
the bus. I used the PHY BASE and PHY PAGED commands that were
introduced in ubCore 5.0 and noticed that several nodes and ports on
the bus were indicating a series of problems.
However, this was a big and complex 1394 bus setup, so it was totally
impractical to run the PHY BASE and PHY PAGED commands for each node
and each port respectively in order to locate all the problematic
devices and cables.
So I implemented two new commands for FireCommander.
These are PBQUERY (PHY BASE QUERY) and PPQUERY (PHY PAGED QUERY).
PBQUERY will "scan" all NODES on the bus and check if a specific base
register matches a particular value. The format of this command, as
displayed by FireCommander is:
PBQUERY <register> <HexBitmask> [HexCompareValue]
The <register> parameter can range from 0 to 7. You can see the binary
layout of the base registers by typing PHY BASE 0. A node with Physical
ID equal to zero always exists so this command will always produce
valid results.
The <HexBitmask> parameter contains a one byte bitmask that will be
ANDed (binary AND) with the register value.
The [HexCompareValue] is optional.
If specified then a node is considered to match the query if
(<RegisterValue> AND <HexBitMask>) = [HexCompareValue]
If it is not specified then a node is considered to match the query if
(<RegisterValue> AND <HexBitMask>) is non zero.
Here are some sample queries:
(*) Show all nodes that have the "to" bit (timeout) equal to one:
PBQUERY 5 0x08
(*) Show all nodes that have the "to" bit (timeout) equal to zero:
PBQUERY 5 0x08 0x00
(*) Show all nodes that have the pfd bit (Power Failure Detect) set:
PBQUERY 5 0x10
(*) Show all nodes that have the pfd bit (Power Failure Detect) clear:
PBQUERY 5 0x10 0x00
(*) Show all nodes that have the pfd bit set and the 'to' bit clear:
PBQUERY 5 0x18 0x10
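The matching rule and the sample queries above can be sketched in a few lines of Python. This only illustrates the arithmetic and is not FireCommander's actual code; the function name `matches` and the sample register value are made up for the example.

```python
from typing import Optional


def matches(register_value: int, bitmask: int,
            compare_value: Optional[int] = None) -> bool:
    """Hypothetical helper mirroring the PBQUERY matching rule."""
    masked = register_value & bitmask
    if compare_value is None:
        # No compare value given: match if any masked bit is set.
        return masked != 0
    # Compare value given: the masked bits must equal it exactly.
    return masked == compare_value


# Made-up base register 5 value: pfd (0x10) set, "to" (0x08) clear.
reg5 = 0x10

print(matches(reg5, 0x08))         # PBQUERY 5 0x08
print(matches(reg5, 0x08, 0x00))   # PBQUERY 5 0x08 0x00
print(matches(reg5, 0x10))         # PBQUERY 5 0x10
print(matches(reg5, 0x10, 0x00))   # PBQUERY 5 0x10 0x00
print(matches(reg5, 0x18, 0x10))   # PBQUERY 5 0x18 0x10
```

With this register value the node would be reported by the second, third and fifth queries only.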
The output of this command looks as follows:
*PhyID 0
*PhyID 2
PhyID 3
PhyID 4
The asterisk on the left indicates that the node has an Inactive Link
Layer. These may be repeater nodes or inactive devices that you want to
ignore.
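To make the scan and the output format concrete, here is a rough Python sketch of what a PBQUERY-style scan does. The `Node` class, the `pbquery` function and the register values are all invented for illustration; the real command reads the registers from the PHY chips on the bus.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical snapshot of one node (not a FireAPI structure)."""
    phy_id: int
    link_active: bool
    base_regs: List[int]  # eight one-byte base register values


def pbquery(nodes: List[Node], register: int, bitmask: int,
            compare_value: Optional[int] = None) -> List[str]:
    """Return one output line per matching node."""
    lines = []
    for node in nodes:
        masked = node.base_regs[register] & bitmask
        if compare_value is None:
            hit = masked != 0
        else:
            hit = masked == compare_value
        if hit:
            # An asterisk marks a node with an inactive Link Layer.
            marker = "" if node.link_active else "*"
            lines.append(f"{marker}PhyID {node.phy_id}")
    return lines


def regs(reg5: int) -> List[int]:
    """Build a made-up register set where only register 5 is non-zero."""
    values = [0] * 8
    values[5] = reg5
    return values


# Made-up bus: nodes 0 and 2 have an inactive Link Layer.
bus = [
    Node(0, False, regs(0x08)),
    Node(1, True, regs(0x00)),
    Node(2, False, regs(0x18)),
    Node(3, True, regs(0x08)),
    Node(4, True, regs(0x08)),
]

for line in pbquery(bus, 5, 0x08):
    print(line)
```

For this made-up bus the sketch prints the same four lines as the listing above.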
The 'to' and 'pfd' bits are so important to check when you are
troubleshooting a complex configuration that FireCommander includes two
new commands that actually internally route to PBQUERY. These are the
TIMEOUT and PFD commands.
However, if for example you want to check for nodes that have both of
these bits set, you have to resort to PBQUERY (PBQUERY 5 0x18 0x18).
The HexBitmask and HexCompareValue parameters are specified with a 0x
prefix, but they can also be specified in standard binary notation
using the 0b prefix. For example, 0x63 is the same as 0b01100011.
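As an illustration of the two notations, the following Python sketch accepts a one-byte value with either prefix. The function name `parse_byte` is made up, and rejecting values without a prefix is an assumption of the sketch, not a statement about how FireCommander parses its arguments.

```python
def parse_byte(text: str) -> int:
    """Parse a one-byte value written as 0x.. (hex) or 0b.. (binary)."""
    lowered = text.lower()
    if not lowered.startswith(("0x", "0b")):
        raise ValueError(f"expected a 0x or 0b prefix: {text!r}")
    # Base 0 lets int() pick the base from the prefix.
    value = int(lowered, 0)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"value does not fit in one byte: {text!r}")
    return value


# Both notations give the same number.
print(parse_byte("0x63") == parse_byte("0b01100011"))
```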
Because most of us are human and it's easy to get hex or binary
parameters wrong, the code performs a sanity check to make sure that
the HexCompareValue can actually result from ANDing HexBitMask with a
register value. The rule that must hold is:
HexBitmask == (HexBitmask | HexCompareValue)
If that does not hold, you have made a mistake in either the
HexBitmask or the HexCompareValue, and the query could only ever
return the empty set.
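The sanity check is a one-liner. Here is a small Python sketch of it (the function name is hypothetical), together with one query that passes and one that can never match:

```python
def query_is_satisfiable(bitmask: int, compare_value: int) -> bool:
    """True if compare_value only uses bits that are present in bitmask."""
    return (bitmask | compare_value) == bitmask


# pfd set and "to" clear: 0x10 only uses bits inside 0x18, so this is fine.
print(query_is_satisfiable(0x18, 0x10))

# Mask selects only "to" (0x08) but the compare value asks for 0x10:
# (reg AND 0x08) can never equal 0x10, so the query is a mistake.
print(query_is_satisfiable(0x08, 0x10))
```

The rule works because ANDing with the mask can never produce a bit that the mask does not contain.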
======================================================
PPQUERY will "scan" all PORTS on the bus and check if a specific paged
register in page zero matches a particular value. The format of this
command, as displayed by FireCommander is:
PPQUERY <register> <HexBitmask> [HexCompareValue] [PortChoice]
PortChoice can be:
Connected (default)
Disconnected
All
The <register> parameter can range from 0 to 7. You can see the binary
layout of the paged registers in page zero by typing PHY PAGED 0 0 0. A
node with Physical ID equal to zero always exists, and it always has
one port at least (port 0) and that in turn always has page zero of the
paged registers.
The <HexBitmask> parameter contains a one byte bitmask that will be
ANDed (binary AND) with the register value.
The [HexCompareValue] is optional.
If specified then a port is considered to match the query if
(<RegisterValue> AND <HexBitMask>) = [HexCompareValue]
If it is not specified then a port is considered to match the query if
(<RegisterValue> AND <HexBitMask>) is non zero.
Here are some sample queries:
(*) Show all connected ports that have the "bm" bit (Beta Mode) equal
to one:
PPQUERY 3 0x08
(*) Show all ports that have any port error:
PPQUERY 4 0xFF ALL
(*) Show all ports that have the "cu" bit (Connection Unreliable) equal
to one:
PPQUERY 3 0b10000000 ALL
(*) Show all ports where a beta loop was detected (ld bit):
PPQUERY 5 0x04 ALL
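The effect of the PortChoice parameter can be sketched as a filter applied before the usual mask test. Everything in this Python sketch (the `Port` class, the `ppquery` function, the register values, the tuple result) is invented for illustration and says nothing about how FireCommander decides that a port is connected or how it prints its results.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Port:
    """Hypothetical snapshot of one port (not a FireAPI structure)."""
    phy_id: int
    port: int
    connected: bool
    page0_regs: List[int]  # eight one-byte registers of page zero


def ppquery(ports: List[Port], register: int, bitmask: int,
            compare_value: Optional[int] = None,
            port_choice: str = "CONNECTED") -> List[Tuple[int, int]]:
    """Return (phy_id, port) pairs for the ports that match the query."""
    choice = port_choice.upper()
    if choice not in ("CONNECTED", "DISCONNECTED", "ALL"):
        raise ValueError(f"unknown PortChoice: {port_choice!r}")
    hits = []
    for p in ports:
        # Apply the PortChoice filter first.
        if choice == "CONNECTED" and not p.connected:
            continue
        if choice == "DISCONNECTED" and p.connected:
            continue
        masked = p.page0_regs[register] & bitmask
        hit = masked != 0 if compare_value is None else masked == compare_value
        if hit:
            hits.append((p.phy_id, p.port))
    return hits


def regs(reg3: int) -> List[int]:
    """Build a made-up page zero where only register 3 is non-zero."""
    values = [0] * 8
    values[3] = reg3
    return values


ports = [
    Port(0, 0, True, regs(0x08)),    # connected, bm set
    Port(0, 1, False, regs(0x80)),   # disconnected, cu set
    Port(1, 0, True, regs(0x00)),    # connected, nothing set
]

print(ppquery(ports, 3, 0x08))                      # PPQUERY 3 0x08
print(ppquery(ports, 3, 0x80, port_choice="ALL"))   # PPQUERY 3 0x80 ALL
```

Note that the cu query without ALL would miss the port in this example, because the default only looks at connected ports.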
These commands can be extremely useful when troubleshooting a 1394 bus
that exhibits abnormal behaviour.
Below is a first list of steps to try when troubleshooting a 1394 bus:
(0) Check for nodes that have the "to" and/or "pfd" bits set in their
base registers. The "to" bit is set when an arbitration state machine
timeout occurs and may be an indication of low level trouble on the
bus. Additionally whenever an arbitration state machine timeout occurs
the PHY chip initiates a bus reset. So if you are experiencing bus
resets out of nowhere then it might be due to an arbitration state
machine timeout. In this case there is usually more than one node
that initiated (or thinks it initiated) the bus reset. You can run the
IBR command to see all the nodes that have the ibr bit set in their
SelfID packets. If you have an older version of FireCommander you can
type the SELFID command and check the i column on the right.
"PFD" stands for Power Failure Detect, which does not sound good at
all to my ears.
Bear in mind that these bits are not monitored by the ubCore drivers
and their behaviour is sticky: once set, they remain set until somebody
clears them, and the drivers only clear them upon a restart. So if you
are seeing lots of nodes with the "to" and "pfd" bits set, don't
panic :-) Power cycle the bus and everything will return to normal.
(1) Check for connection unreliable bits (PPQUERY 3 0x80 ALL). The
presence of a port with the cu bit set indicates either a troublesome
cable or a troublesome port.
(2) Check for port errors (PPQUERY 4 0xFF ALL).
(3) Check for beta loops that should be there and are not (PPQUERY 5
0x04 ALL). You might think your beta bus has redundancy but it might
already be running on the redundant configuration because a cable or
port went wrong and the bus automatically healed itself. This means
that the beta loop(s) you implemented are not in place any more and the
next cable/port failure will cut the bus in two.
(4) Check the beta mode port status (PPQUERY 3 0x08). You may believe
that all your devices are running in beta mode, but some ports may have
a different opinion on the subject.
(5) Check for hard disabled ports (PPQUERY 5 0x01 ALL).
(6) Check for ports that have the connected (conn) bit set but the rok
(receive ok) bit clear (PPQUERY 0 0x06 0x04 ALL).
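Queries like the one in step (6) combine "must be set" and "must be clear" bits into one mask and one compare value. The Python sketch below shows the arithmetic. The helper name `build_query` is made up, and the bit values for conn (0x04) and rok (0x02) are inferred from the mask and compare value used in step (6).

```python
from typing import Tuple


def build_query(set_bits: int, clear_bits: int) -> Tuple[int, int]:
    """Return (bitmask, compare_value) for bits that must be 1 and bits that must be 0."""
    if set_bits & clear_bits:
        raise ValueError("a bit cannot be required to be both set and clear")
    # The mask covers every bit we care about; the compare value holds
    # only the bits that must be one.
    return set_bits | clear_bits, set_bits


CONN = 0x04  # connected bit, paged register 0 (inferred from step 6)
ROK = 0x02   # receive ok bit, paged register 0 (inferred from step 6)

mask, compare = build_query(set_bits=CONN, clear_bits=ROK)
print(f"PPQUERY 0 0x{mask:02X} 0x{compare:02X} ALL")

PFD = 0x10   # Power Failure Detect bit, base register 5
TO = 0x08    # timeout bit, base register 5

mask, compare = build_query(set_bits=PFD, clear_bits=TO)
print(f"PBQUERY 5 0x{mask:02X} 0x{compare:02X}")
```

The two printed commands are the ones used in step (6) and in the earlier "pfd set, to clear" sample.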
The operations performed by PBQUERY and PPQUERY are so incredibly
useful that we plan to make this functionality part of the next release
of the FireAPI SDK.
Moreover we will definitely add the same functionality in FireViewer so
that you can visually detect the problematic devices. Watch this group
for news on that.
If you have ubCore 5.x and don't want to wait until the next official
release of ubCore, you can contact sup...@unibrain.com to obtain the
latest release of FireCommander.
This functionality is ONLY available in ubCore 5.x.
Applications developed with older versions of FireAPI SDK can run with
no modification and no recompilation on the ubCore 5.x drivers. When I
say "ubCore 5.x drivers" I also include the UB1394.DLL. Older versions
of UB1394.DLL will not work with the kernel drivers of ubCore 5.x.
Unless you are using a licensed adapter you will probably get the
evaluation dialog box (once for each adapter) and your application will
be permitted to run for about 20 minutes.
One last note: It's pretty difficult to remember all the above examples
and all the registers you might need to test, so you might want to take
advantage of the ability to write FireCommander MACROS and assign
macros of your liking to the commands.
We will try to provide a ready-made macro file with the next official
release of ubCore 5. The article describing how to implement
FireCommander MACROS can be found at
http://groups.google.gr/group/FireAPI/browse_thread/thread/be8dee1f659cb3b0/28fedea698ea862a?lnk=st&q=FireCommander+macros&rnum=1#28fedea698ea862a
Dimitris Staikos
Unibrain