Commit 9986424

author
Daniel Precioso, PhD
committed
Add introductory content and structure to NetworkX components and fundamentals
1 parent 9e6419e commit 9986424

3 files changed

Lines changed: 391 additions & 16 deletions

modules/networks/components.qmd

Lines changed: 377 additions & 0 deletions
@@ -0,0 +1,377 @@
---
title: "Networks: Basic Components"
subtitle: "Nodes and Edges"
format: html
---
6+
7+
A graph is a mathematical structure used to model pairwise relations between objects. It consists of **nodes** (also called vertices) and **edges** (also called links) that connect pairs of nodes.
8+
9+
On this page, we will explore the basic components of a graph using the `networkx` library in Python. We will cover:

- What nodes and edges are.
- How to create a graph using `networkx`.
13+
14+
First, let's import the necessary library:
15+
16+
```{python}
#| echo: true
#| message: false
import networkx as nx
```
21+
22+
::: {.callout-tip collapse="true"}
23+
## Module not found error?
24+
25+
If you get a `ModuleNotFoundError` for `networkx`, you may need to install it first.
26+
27+
If you are working on **Google Colab**, you can run:
```python
!pip install networkx
```

If you are working in a local Python environment, install it with conda (`conda install networkx`) or run:
```bash
pip install networkx
```
36+
:::
37+
38+
And now we can initialize an empty graph:
39+
40+
```{python}
#| echo: true
#| message: true
G = nx.Graph()

print(G)
```
47+
48+
Our variable `G` is now an empty graph object. We can add nodes and edges to it, which we will see in the next sections.
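
As a quick sanity check, a freshly created graph should report zero nodes and zero edges. The sketch below assumes the `number_of_nodes()` and `number_of_edges()` methods, which are not used elsewhere on this page, and the name `G_empty` is only illustrative:

```python
import networkx as nx

# A separate empty graph, so the main graph `G` is left untouched
G_empty = nx.Graph()

print(G_empty.number_of_nodes())  # how many nodes the graph holds
print(G_empty.number_of_edges())  # how many edges the graph holds
print(len(G_empty))               # len() also counts nodes
```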
49+
50+
## Nodes (Vertices)
51+
52+
Nodes represent the entities in a graph. They can be anything: people in a social network, airports in a flight network, or web pages on the internet. Each node can have attributes that provide additional information about it. For example, in a social network, a node might represent a person and have attributes like name, age, or location.
53+
54+
In order to add nodes to our graph, we can use the `add_node(<id>)` method. The `<id>` can be any hashable Python object. We can see the list of nodes in the graph using the `nodes()` method.
55+
56+
```{python}
#| echo: true
#| message: true
# Add three nodes to the graph
G.add_node("Spain")
G.add_node("Portugal")
G.add_node("France")

# Show the nodes in the graph
print(G.nodes())
```
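
When there are many nodes, calling `add_node()` once per node gets repetitive. As a minimal sketch, `add_nodes_from()` (a `networkx` method not covered above) accepts any iterable of hashable IDs. The graph name `G_demo` is hypothetical and is kept separate from `G`:

```python
import networkx as nx

G_demo = nx.Graph()

# Add several nodes in one call from an iterable of hashable IDs
G_demo.add_nodes_from(["Spain", "Portugal", "France"])

# Adding an existing node again does not create a duplicate
G_demo.add_node("Spain")

print(list(G_demo.nodes()))
print("Spain" in G_demo)  # membership test by node ID
```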
67+
68+
::: {.callout-tip collapse="true"}
69+
## What does "hashable" mean?
70+
71+
In Python, a hashable object is an object that has a hash value that remains constant during its lifetime. This means that the object can be used as a key in a dictionary or as an element in a set. Examples of hashable objects include integers, strings, and tuples (as long as they contain only hashable types). Lists and dictionaries are not hashable because they are mutable (their contents can change).
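
To make this concrete, the short sketch below uses only built-in Python (no `networkx`). The coordinate tuple and the variable name `list_is_hashable` are illustrative choices, not part of the tutorial's graph:

```python
# Immutable built-ins are hashable, so they can serve as node IDs
print(hash("Spain") == hash("Spain"))      # equal values give equal hashes within one run
print(isinstance(hash((40.4, -3.7)), int)) # a tuple of numbers is hashable

# A list is mutable, so Python refuses to hash it
try:
    hash(["Spain", "Portugal"])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print(list_is_hashable)
```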
72+
:::
73+
74+
### Node Attributes
75+
76+
We can also add attributes to nodes to store additional information. Think of `G.nodes` as a dictionary where the keys are the node IDs and the values are dictionaries of attributes. We can add attributes to a node by accessing it through `G.nodes[<id>]` and assigning values to the attributes.
77+
78+
For example, we can add a "population" attribute to our country nodes:
79+
80+
```{python}
#| echo: true
#| message: true
# Add population attribute to the nodes
G.nodes["Spain"]["population"] = 47_000_000
G.nodes["Portugal"]["population"] = 10_000_000
G.nodes["France"]["population"] = 67_000_000

# Show the nodes with their attributes
population = nx.get_node_attributes(G, 'population')
for node, pop in population.items():
    print(f"{node}: {pop} inhabitants")
```
93+
94+
Using `nx.get_node_attributes(G, 'population')`, we can retrieve the population attribute for all nodes in the graph as a dictionary.
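
There are other ways to read attributes back. The sketch below, on a small hypothetical graph `G_demo`, shows direct access through `G.nodes[<id>]` and iteration with `G.nodes(data=True)`. It also shows that `nx.get_node_attributes` only returns the nodes that actually have the attribute:

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_node("Spain", population=47_000_000)
G_demo.add_node("Portugal")  # no attributes yet

# Attribute dictionary of a single node
print(G_demo.nodes["Spain"])

# Iterate over (node, attribute_dict) pairs
for node, attrs in G_demo.nodes(data=True):
    print(node, attrs)

# Only nodes that have the attribute appear in the result
population_demo = nx.get_node_attributes(G_demo, "population")
print(population_demo)
```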
95+
96+
If you are adding a new node and want to set its attributes at the same time, you can pass them to the `add_node()` method as keyword arguments. For example:
97+
98+
```{python}
#| echo: true
#| message: true
# Add a new node with attributes
G.add_node("Italy", population=60_000_000)

# Show the nodes with their attributes
population = nx.get_node_attributes(G, 'population')
for node, pop in population.items():
    print(f"{node}: {pop} inhabitants")
```
109+
110+
## Edges (Links)
111+
112+
Edges represent the connections between nodes in a graph. They can also have attributes, such as weight, which might represent the strength of the connection. For example, in a social network, an edge might represent a friendship between two people, and the weight could represent how close they are.
113+
114+
To add edges to our graph, we can use the `add_edge(<node1>, <node2>)` method. This will create an undirected edge between `node1` and `node2` (we use their IDs here). We can see the list of edges in the graph using the `edges()` method.
115+
116+
```{python}
#| echo: true
#| message: true
# Add edges between the nodes (neighboring countries)
G.add_edge("Spain", "Portugal")
G.add_edge("Spain", "France")
# Show the edges in the graph
print(G.edges())
```
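
Because the edges of `nx.Graph` are undirected, the order in which the two endpoints are given does not matter. A small sketch on a hypothetical graph `G_demo`, using the `has_edge()`, `neighbors()`, and `degree()` calls (none of which appear elsewhere on this page):

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edge("Spain", "Portugal")
G_demo.add_edge("Spain", "France")

# In an undirected graph the order of the endpoints does not matter
print(G_demo.has_edge("Spain", "Portugal"))
print(G_demo.has_edge("Portugal", "Spain"))

# Neighbors of a node, and its degree (number of incident edges)
print(sorted(G_demo.neighbors("Spain")))
print(G_demo.degree("Spain"))
```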
125+
126+
### Edge Attributes
127+
128+
Just like nodes, edges can also have attributes. We can add attributes to an edge by accessing it through `G.edges[<node1>, <node2>]` and assigning values to the attributes. For example, we can add a "distance" attribute that stores the approximate distance between the capitals of the two countries:
129+
130+
```{python}
#| echo: true
#| message: true
# Add distance attribute to the edges
G.edges["Spain", "Portugal"]["distance"] = 600  # distance in kilometers
G.edges["Spain", "France"]["distance"] = 1000  # distance in kilometers

# Show the edges with their attributes
distance = nx.get_edge_attributes(G, 'distance')
for edge, dist in distance.items():
    print(f"{edge}: {dist} km")
```
142+
143+
Using `nx.get_edge_attributes(G, 'distance')`, we can retrieve the distance attribute for all edges in the graph as a dictionary.
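
Single edges can also be inspected directly. The sketch below, on a hypothetical graph `G_demo`, reads one edge's attribute dictionary, uses the `G[<node1>][<node2>]` adjacency shortcut, and iterates over one attribute with `G.edges(data="distance")`:

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edge("Spain", "Portugal", distance=600)
G_demo.add_edge("Spain", "France", distance=1000)

# Attribute dictionary of one edge; endpoint order does not matter
print(G_demo.edges["Portugal", "Spain"])

# Shortcut through the adjacency view
print(G_demo["Spain"]["France"]["distance"])

# Iterate over (node1, node2, value) triples for one attribute
for u, v, dist in G_demo.edges(data="distance"):
    print(u, v, dist)
```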
144+
145+
Again, you can also add attributes to an edge at the same time as you create it using the `add_edge()` method with keyword arguments. For example:
146+
147+
```{python}
#| echo: true
#| message: true
# Add a new edge with attributes
G.add_edge("France", "Italy", distance=800)

# Show the edges with their attributes
distance = nx.get_edge_attributes(G, 'distance')
for edge, dist in distance.items():
    print(f"{edge}: {dist} km")
```
158+
159+
### Adding Nodes and Edges Together
160+
161+
We can also add nodes and edges together using the `add_edge()` method. If we try to add an edge between two nodes that do not exist in the graph, `networkx` will automatically create those nodes for us. For example:
162+
163+
```{python}
#| echo: true
#| message: true
# Add an edge between two nodes that do not exist
G.add_edge("USA", "Canada", distance=3000)

# In this case, we will have to add the population attribute for the new nodes separately
G.nodes["USA"]["population"] = 331_000_000
G.nodes["Canada"]["population"] = 38_000_000

# Show the nodes and edges in the graph
print("Nodes:", G.nodes())
print("Edges:", G.edges())
```
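
Nodes created implicitly by `add_edge()` start with an empty attribute dictionary. As a sketch on a hypothetical graph `G_demo`, the example below checks this and then fills in the attribute for several nodes at once with `nx.set_node_attributes` (a helper not used elsewhere on this page):

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edge("USA", "Canada", distance=3000)

# Both endpoints were created automatically, with no attributes
print(list(G_demo.nodes()))
print(G_demo.nodes["USA"])

# Set one attribute for several nodes from a {node: value} dictionary
nx.set_node_attributes(
    G_demo,
    {"USA": 331_000_000, "Canada": 38_000_000},
    name="population",
)
print(G_demo.nodes["USA"])
```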
177+
178+
179+
## Visualization
180+
181+
Printing the graph object gives us a summary of its structure, but it doesn't show us the actual connections. To visualize the graph, we can use the `draw()` function from `networkx`, which uses Matplotlib to display the graph.
182+
183+
```{python}
#| echo: true
#| message: true
import matplotlib.pyplot as plt

# Draw the graph
nx.draw(
    G,
    with_labels=True,        # show node labels (IDs)
    node_color='lightblue',  # color of the nodes (vertices)
    edge_color='gray',       # color of the edges (links)
    node_size=2000,          # size of the nodes (vertices)
    font_size=12             # size of the labels (IDs)
)
plt.show()
```
199+
200+
### Layouts
201+
202+
The `draw()` function has a `pos` parameter that allows us to specify the layout of the graph. A layout assigns a position to each node for visualization. `networkx` provides several built-in layouts, such as `spring_layout`, `circular_layout`, and `shell_layout`. For example, we can use the spring layout, which positions the nodes with a force-directed algorithm:
203+
204+
```{python}
#| echo: true
#| message: true
# Use the spring layout for visualization
pos = nx.spring_layout(G)
nx.draw(
    G,
    pos=pos,  # specify the layout
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=2000,
    font_size=12
)
plt.show()
```
220+
221+
Playing with different layouts can help us better understand the structure of the graph and the relationships between nodes. Try it yourself!
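
When experimenting, it helps to know what a layout function returns: a dictionary mapping each node to its coordinates. The sketch below, on a hypothetical graph `G_demo`, computes two layouts without drawing them. It also passes the `seed` argument of `spring_layout`, which starts from random positions and therefore gives a different picture on each run unless the seed is fixed:

```python
import networkx as nx

G_demo = nx.Graph()
G_demo.add_edges_from([("Spain", "Portugal"), ("Spain", "France"), ("France", "Italy")])

# A layout is a dictionary mapping each node to its (x, y) coordinates
pos_circular = nx.circular_layout(G_demo)
print(pos_circular["Spain"])

# With the same seed, spring_layout returns the same positions
pos_a = nx.spring_layout(G_demo, seed=42)
pos_b = nx.spring_layout(G_demo, seed=42)
same = all((pos_a[n] == pos_b[n]).all() for n in G_demo.nodes())
print(same)
```

Either dictionary can be passed to `nx.draw()` through its `pos` parameter.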
222+
223+
### Visualizing Node Attributes
224+
225+
We can also visualize node attributes by using different colors or sizes. For example, we can scale the size of each node according to its population attribute:
226+
227+
```{python}
#| echo: true
#| message: true
# Get the population attribute for each node
population = nx.get_node_attributes(G, 'population')
# Node sizes proportional to population (scaled down for visualization)
node_sizes = [population[node] / 1_000_000 for node in G.nodes()]

pos = nx.spring_layout(G)

nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=node_sizes,  # size of the nodes (vertices) proportional to population
    font_size=12,
)
plt.show()
```
248+
249+
::: {.callout-tip collapse="true"}
250+
## What happens if a node is missing an attribute?
251+
252+
In this case, the `population` dictionary will not have an entry for that node, and trying to access it will raise a `KeyError`. To avoid this, we can use the `get()` method of the dictionary, which allows us to specify a default value if the key is not found. For example:
253+
254+
```{python}
#| echo: true
#| message: true
# Add an edge to a new node ("Germany"), which is created without a population attribute
G.add_edge("France", "Germany", distance=900)
# Get the population attribute of the nodes that have one
population = nx.get_node_attributes(G, 'population')
# Node sizes proportional to population, using 0 as default if not found
node_sizes = [population.get(node, 0) / 1_000_000 for node in G.nodes()]

pos = nx.spring_layout(G)

nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=node_sizes,  # size of the nodes (vertices) proportional to population
    font_size=12,
)
plt.show()
```
278+
:::
279+
280+
**Exercise:** Add a new attribute to the nodes, called "visited", which is a boolean that indicates whether you have visited that country or not. Then, visualize the graph by coloring the nodes differently based on whether you have visited them or not: use blue for visited countries and red for unvisited countries.
281+
282+
::: {.callout-tip collapse="true"}
283+
## Solution to the Exercise
284+
285+
```{python}
#| echo: true
#| message: true
# Add the "visited" attribute to the nodes
G.nodes["Spain"]["visited"] = True
G.nodes["Portugal"]["visited"] = True
G.nodes["France"]["visited"] = True
G.nodes["Italy"]["visited"] = True
G.nodes["USA"]["visited"] = False
G.nodes["Canada"]["visited"] = True

# Get the "visited" attribute for each node
visited = nx.get_node_attributes(G, 'visited')
# Define node colors based on the "visited" attribute.
# Nodes without the attribute (such as "Germany", if it was added earlier) count as unvisited.
node_colors = ['blue' if visited.get(node, False) else 'red' for node in G.nodes()]

pos = nx.spring_layout(G)

# Draw the graph with node colors based on the "visited" attribute
nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color=node_colors,  # color of the nodes based on "visited" attribute
    edge_color='gray',
    node_size=2000,
    font_size=12,
)
plt.show()
```
315+
:::
316+
317+
### Visualizing Edge Attributes
318+
319+
We can also visualize edge attributes by showing them as labels on the edges. For example, we can show the distance attribute on the edges:
320+
321+
```{python}
#| echo: true
#| message: true
# Get the distance attribute for each edge
distance = nx.get_edge_attributes(G, 'distance')
# Draw the graph
pos = nx.spring_layout(G)
nx.draw(
    G,
    pos=pos,
    with_labels=True,
    node_color='lightblue',
    edge_color='gray',
    node_size=2000,
    font_size=12,
)
# Draw edge labels for the distance attribute
nx.draw_networkx_edge_labels(G, pos, edge_labels=distance)
plt.show()
```
341+
342+
## Creating a Graph from an Edge List
343+
344+
In practice, we often have data in the form of an edge list, which is a list of pairs of nodes that are connected by edges. We can create a graph directly from an edge list using the `nx.from_edgelist()` function. For example:
345+
346+
```{python}
#| echo: true
#| message: true
# Define our edge list (actors that have worked together in movies)
edge_list = [
    ("Antonio Banderas", "Brad Pitt"),  # Interview with the Vampire (1994)
    ("Antonio Banderas", "Javier Bardem"),  # Automata (2014)
    ("Antonio Banderas", "Penelope Cruz"),  # Dolor y Gloria (2019)
    ("Antonio Banderas", "Tom Holland"),  # Uncharted (2022)
    ("Brad Pitt", "Javier Bardem"),  # F1 (2025)
    ("Javier Bardem", "Timothée Chalamet"),  # Dune (2021)
    ("Timothée Chalamet", "Zendaya"),  # Dune (2021)
    ("Tom Holland", "Zendaya"),  # Spider-Man: No Way Home (2021)
]

# Create a graph from the edge list
G_actors = nx.from_edgelist(edge_list)
# Draw the graph
pos = nx.shell_layout(G_actors)  # use shell layout for visualization
nx.draw(
    G_actors,
    pos=pos,
    with_labels=True,
    node_color='lightgreen',
    edge_color='gray',
    node_size=2000,
    font_size=12
)
plt.show()
```
376+
377+
**Exercise:** In the code above, I included the movies in the comments next to the edges. Can you create a graph where the edges are labeled with the movie titles?
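
One possible starting point, sketched under the assumption that each movie title is stored as an edge attribute named `movie` (a hypothetical name), is shown below. It uses only two of the edges and stops before drawing. The names `edges_with_movies`, `G_movies`, and `movie_labels` are illustrative:

```python
import networkx as nx

# (actor1, actor2, movie) triples; only two edges shown for brevity
edges_with_movies = [
    ("Antonio Banderas", "Brad Pitt", "Interview with the Vampire (1994)"),
    ("Tom Holland", "Zendaya", "Spider-Man: No Way Home (2021)"),
]

G_movies = nx.Graph()
for actor1, actor2, movie in edges_with_movies:
    # Store the movie title as an attribute of the edge
    G_movies.add_edge(actor1, actor2, movie=movie)

# Dictionary of the form {(actor1, actor2): movie}
movie_labels = nx.get_edge_attributes(G_movies, "movie")
print(movie_labels)
```

A dictionary of this shape can be passed as `edge_labels` to `nx.draw_networkx_edge_labels`, in the same way as the `distance` dictionary in the previous section.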
