Trying to determine the best graph architecture

aDS

Jul 23, 2024, 3:30:35 PM

to networkx-discuss

Say I have 200 or so cars. Each car has error codes that I want to graph, so that I can ultimately visualize and compare similar automobiles based on their error codes. Do I create nodes named Car 1, Car 2, ..., Car N, and then create edges from the list of their respective error codes, with an edge connecting two nodes only if they have error codes in common? Then, how do I compare them? Say Car 1 has error codes 1, 2 and 3, and Car 2 has error codes 2, 3 and 4. How do I compare them on a graph?

Any help is greatly appreciated!

Sanchit Ram

Jul 23, 2024, 5:10:11 PM

to networkx-discuss

In my experience dealing with graph-like structures, comparing different node types is challenging. In your proposed model, Car A has Error Code 1 and Car B has Error Code 1. There are three nodes in this statement - Car A, Car B, and Error Code 1.

It sounds like your question defines "similar automobiles" by their error codes - I'm not entirely sure why, but suppose you're trying to find cars with a high chance of being faulty. In that case, you can define an edge weight as the number of error codes two cars share. So, suppose Car A has Error Codes 1, 2, 3, and Car B has Error Codes 2, 3, 4. Then there's an edge between Car A and Car B, and its weight would be 2.
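A minimal sketch of this construction, assuming each car's codes are held in a Python set. The `car_codes` dictionary and the extra "Car C" are made up for illustration and are not from the original data.

```python
from itertools import combinations

import networkx as nx

# Hypothetical data: Car A and Car B match the example above,
# Car C is added only to show a car that shares nothing.
car_codes = {
    "Car A": {1, 2, 3},
    "Car B": {2, 3, 4},
    "Car C": {7, 8},
}

G = nx.Graph()
G.add_nodes_from(car_codes)  # cars with no shared codes stay as isolated nodes

for car_u, car_v in combinations(car_codes, 2):
    shared = car_codes[car_u] & car_codes[car_v]
    if shared:  # only connect cars that have at least one code in common
        G.add_edge(car_u, car_v, weight=len(shared))

print(list(G.edges(data="weight")))
```

With 200 cars this loop checks about 20,000 pairs, which should be cheap; for much larger fleets a bipartite projection may scale better.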

In this type of construction, you have one node type, with edges weighted by the number of common error codes, and you can compute some of the more interesting centrality measures, like PageRank. Suppose your car nodes are every model / year combo of Chevrolet, and the error codes are commonly reported part failures (Actuator failed, Hub Erosion, whatever). I think the high-PageRank nodes would be the cars with the highest likelihood of having an error. FWIW, I'm not entirely sure this type of construction is useful. When constructing a graph, it's important to think of edges as relationships between two nodes of the same kind. You can certainly define multiple types of nodes and work with the resulting graph structure, but then you need to be careful about what your centrality measures are actually measuring.

Lucas Almeida

Jul 23, 2024, 5:10:11 PM

to networkx-discuss

What do you mean by compare? Do you want to find which cars have the same error codes?

Are you using directed or undirected graphs? What do you need a graph for?

Isn't using sets an option?
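As a rough illustration of the set-based option, here is a sketch using the two cars from the original question. The variable names are only for this example.

```python
# Error codes from the question: Car 1 has 1, 2, 3 and Car 2 has 2, 3, 4
car_1 = {"1", "2", "3"}
car_2 = {"2", "3", "4"}

shared = car_1 & car_2        # codes reported by both cars
only_car_1 = car_1 - car_2    # codes seen on Car 1 but not on Car 2
only_car_2 = car_2 - car_1    # codes seen on Car 2 but not on Car 1
all_codes = car_1 | car_2     # every code seen on either car

print("shared:", sorted(shared))
print("only Car 1:", sorted(only_car_1))
print("only Car 2:", sorted(only_car_2))
```

Plain sets answer pairwise questions directly; a graph may only start to pay off once you want structure across many cars at once.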

Mason MacPhail

Jul 24, 2024, 10:57:45 AM

to networkx...@googlegroups.com

Honestly, I'm somewhat new to graphing. So, I'm open to using any kind of graph, directed or undirected. My current graph does kind of what you are alluding to, I believe. Here is some example code:

import networkx as nx
import matplotlib.pyplot as plt

cars = ['Car     4','Car     5','Car   106','Car   110']

error_codes = [
['D11', 'D22', 'PowerCycle'],
['010c', '0153', '0551', '05d0', '0601', '0604', '060b', '0620', '0621', 
 '0622', '0623', '0625', '0626', '0628', '0629', '062a', '062b', '062c', 
 '062d', '0631', '0634', '0638', '0639', '063a', '0640', '0641', '0650', 
 '0651', '0652', '0653', '0657', '0658', '0659', '065b', '065c', '065d', 
 '065f', '0663', '0683', '0700', '0704', '0708', '0709', '0740', '0749', 
 '074b', '0802', '0803', '0810', '0811', '0812', '0815', '0851', '0852', 
 '0853', '0855', '0857', '0858', '085a', 'D 8', 'D11', 'D22', 'D24', 
 'PowerCycle'],
['010c', '0154', '0513', '051b', '0521', '05d0', '065d', '0702', '0708', 
 '0709', '0710', '074b', '0811', '0812', '0818', '0853', '085a', '1001', 
 'D11', 'PowerCycle'],
['010c', '0153', '0509', '0510', '0511', '0513', '051a', '0521', '0525', 
 '0527', '0529', '0558', '05d0', '0604', '0605', '0663', '0682', '0702', 
 '0708', '0709', '074b', '0782', '0785', '0810', '0811', '0818', '0851', 
 '0852', '085a', 'D11', 'PowerCycle']
 ]
# Map each car to its list of error codes
car_dict = dict(zip(cars, error_codes))

# Collect the car nodes and the distinct error code nodes
car_nodes = list(car_dict)
# Sort the codes so the node order (and the layout below) is the same on every run
error_nodes = sorted({code for codes in car_dict.values() for code in codes})

# Create an empty graph
G = nx.Graph()

G.add_nodes_from(car_nodes, bipartite=0)
G.add_nodes_from(error_nodes, bipartite=1)

# Add edges between cars and their error codes
for car, codes in car_dict.items():  # 'codes' avoids rebinding the error_codes list above
    for code in codes:
        G.add_edge(car, code)

# Drawing the bipartite graph
pos = {}
pos.update((node, (1, index)) for index, node in enumerate(car_nodes))  # Position car nodes
pos.update((node, (2, index)) for index, node in enumerate(error_nodes))  # Position error code nodes

nx.draw(G, pos, with_labels=True, node_size=5000, node_color='lightblue', font_size=10, font_weight='bold')
fig = plt.gcf()
fig.set_size_inches(16, 10)
# fig.savefig('test2png.png', dpi=100)
plt.show()

from networkx.algorithms import bipartite

# Project the bipartite graph to get the car-car projection
car_projection = bipartite.weighted_projected_graph(G, car_nodes, ratio=False)

# Draw the projected graph
pos = nx.spring_layout(car_projection, seed=100)
nx.draw(car_projection, pos, with_labels=True, node_size=70, node_color='blue',
        font_size=10, font_weight='bold', font_color='red')
edge_labels = nx.get_edge_attributes(car_projection, 'weight')
nx.draw_networkx_edge_labels(car_projection, pos, edge_labels=edge_labels)
fig = plt.gcf()
# fig.set_size_inches(16, 16)
plt.show()

What I don't like about showing only the number of codes two cars have in common is that different error codes reveal problems with different parts of the engine. This is just a few cars. When I have thousands of cars, I'm hoping that graphs will show different populations of cars with different problems, and from there hopefully predict future problems.
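One possible way to look for such populations is community detection on the car-car projection. The sketch below is only an assumption about how that might be set up: the six-car fleet is invented, and `greedy_modularity_communities` is just one of several NetworkX community algorithms that could be tried.

```python
import networkx as nx
from networkx.algorithms import bipartite
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical fleet: two groups of cars with different fault patterns,
# plus one code ("PowerCycle") that every car reports.
car_dict = {
    "Car 1": ["0601", "0604", "060b", "PowerCycle"],
    "Car 2": ["0601", "0604", "060b", "PowerCycle"],
    "Car 3": ["0601", "0604", "060b", "PowerCycle"],
    "Car 4": ["0810", "0811", "0812", "PowerCycle"],
    "Car 5": ["0810", "0811", "0812", "PowerCycle"],
    "Car 6": ["0810", "0811", "0812", "PowerCycle"],
}

# Bipartite car / error-code graph, as in the code above
B = nx.Graph()
B.add_nodes_from(car_dict, bipartite=0)
for car, codes in car_dict.items():
    # error-code nodes are created implicitly when the edges are added
    B.add_edges_from((car, code) for code in codes)

# Car-car projection; edge weight = number of shared codes
car_projection = bipartite.weighted_projected_graph(B, list(car_dict))

# Group cars so that heavily weighted edges fall inside groups
communities = greedy_modularity_communities(car_projection, weight="weight")
for index, members in enumerate(communities):
    print(index, sorted(members))
```

Codes that nearly every car reports (like "PowerCycle" or "D11" in the real data) connect everything to everything, so dropping or down-weighting them before projecting might make the groups clearer.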

--
You received this message because you are subscribed to the Google Groups "networkx-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to networkx-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/networkx-discuss/29080a67-9984-4b1c-87ff-6d473ed14c0en%40googlegroups.com.

Mason MacPhail

Jul 24, 2024, 11:09:29 AM

to networkx...@googlegroups.com

Sanchit Ram, my dilemma is twofold. One is trying to visualize what's going on with error codes and cars, to give me an idea of how to predict future breakdowns. The other is how to compare two different cars, that is, what the similarity metric should be. I'm still trying to decide how I will compare them. Say I have two cars: Car 1 with error codes 1, 2, 3, 4 and 5, and Car 2 with error codes 2, 3 and 4. Is Car 2 an exact match because Car 1 also has codes 2, 3 and 4? Or is the similarity reduced because Car 1 has some extra codes that Car 2 doesn't?

I don't know enough about graphing to know if this construction is useful, either. We only have one type of car, but different parts break, and we hope that different combinations of codes will predict parts breaking ahead of time.
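The two readings of the Car 1 / Car 2 example correspond to two standard set-similarity measures. The sketch below is one way to compute them; the helper names `jaccard` and `overlap_coefficient` are chosen for this example, and which measure fits the breakdown-prediction goal is an open question.

```python
def jaccard(a, b):
    """Shared codes divided by all codes seen on either car."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_coefficient(a, b):
    """Shared codes divided by the size of the smaller code set."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


car_1 = {1, 2, 3, 4, 5}
car_2 = {2, 3, 4}

# Extra codes on Car 1 reduce the score: 3 shared / 5 total
print(jaccard(car_1, car_2))              # 0.6
# Car 2's codes are fully contained in Car 1's: 3 shared / 3
print(overlap_coefficient(car_1, car_2))  # 1.0
```

Jaccard treats the extra codes on Car 1 as a difference, while the overlap coefficient calls Car 2 an exact match because all of its codes appear on Car 1.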


Sanchit Ram

Aug 9, 2024, 11:05:49 AM

to networkx...@googlegroups.com

It might be the case that graphs aren't best suited for this kind of analysis. I think your question about error code similarity is the key - my take would be that Car 1 and Car 2 share some elements (common error codes), but they aren't equivalent. Only two cars with exactly the same error codes would be equivalent. In that case, an edge would exist between two cars if they shared any error codes, and the weight of the edge would be the number of common error codes. Note that `the number of common error codes` is just one possible way to construct the edge weights - you could normalize it, make it 2^number_of_common_codes... really anything.
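For the raw-count and normalized variants, NetworkX's bipartite module already provides projections, so a sketch might look like the following. The code labels "E1" to "E5" are placeholders standing in for error codes 1 to 5.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Placeholder codes E1..E5 stand in for error codes 1..5
car_dict = {
    "Car 1": ["E1", "E2", "E3", "E4", "E5"],
    "Car 2": ["E2", "E3", "E4"],
}

# Bipartite graph: one edge per (car, error code) pair
B = nx.Graph()
for car, codes in car_dict.items():
    B.add_edges_from((car, code) for code in codes)

cars = list(car_dict)

# Weight = number of shared codes
raw = bipartite.weighted_projected_graph(B, cars)
# Weight = shared codes / all codes on either car (Jaccard index)
jac = bipartite.overlap_weighted_projected_graph(B, cars, jaccard=True)
# Weight = shared codes / size of the smaller code list
ovl = bipartite.overlap_weighted_projected_graph(B, cars, jaccard=False)

for name, graph in [("count", raw), ("jaccard", jac), ("overlap", ovl)]:
    print(name, graph["Car 1"]["Car 2"]["weight"])
```

Normalized weights keep a car with a very long code list from looking similar to everything just because it has more codes to share.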

Intuitively, suppose you calculate PageRank on that kind of graph. PageRank tells you which cars are most strongly connected to other cars through shared error codes. So, if you want to highlight cars whose error patterns are most interconnected with other cars' error patterns, it may be useful.
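A small sketch of that calculation, assuming a car-car graph whose `weight` attribute holds the number of shared codes. The four cars and their weights are invented for illustration.

```python
import networkx as nx

# Hypothetical car-car graph; weight = number of shared error codes
G = nx.Graph()
G.add_weighted_edges_from([
    ("Car A", "Car B", 2),
    ("Car A", "Car C", 2),
    ("Car A", "Car D", 2),
    ("Car B", "Car C", 1),
])

# Weighted PageRank: heavier edges pass along more of a node's score
scores = nx.pagerank(G, weight="weight")

for car, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{car}: {score:.3f}")
```

Here Car A should rank first because it shares codes with every other car. Whether a high score says anything about future breakdowns is a separate question that would need checking against real failure data.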

