Proximity Module¶
The proximity module provides functions for generating proximity-based graph networks from spatial data. These algorithms create edges between geometries based on various spatial relationships.
Geometric Graph Algorithms¶
Proximity-Based Graph Generation Module.
This module provides comprehensive functionality for generating graph networks based on spatial proximity relationships between geographic features. It implements several classical proximity models commonly used in spatial network analysis and geographic information systems. The module is particularly useful for constructing heterogeneous graphs from multiple domains of geospatial relations, enabling complex spatial analysis across different feature types and scales.
Functions:
| Name | Description |
|---|---|
knn_graph |
Generate a k-nearest neighbour graph from a GeoDataFrame of points. |
delaunay_graph |
Generate a Delaunay triangulation graph from a GeoDataFrame of points. |
gabriel_graph |
Generate a Gabriel graph from a GeoDataFrame of points. |
relative_neighborhood_graph |
Generate a Relative-Neighbourhood Graph (RNG) from a GeoDataFrame. |
euclidean_minimum_spanning_tree |
Generate a (generalised) Euclidean Minimum Spanning Tree from a GeoDataFrame of points. |
knn_graph ¶
knn_graph(
gdf,
k=5,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
target_gdf=None,
as_nx=False
)
Generate a k-nearest neighbour graph from a GeoDataFrame of points.
This function constructs a graph where each node is connected to its k nearest neighbors based on the specified distance metric. The resulting graph captures local spatial relationships and is commonly used in spatial analysis, clustering, and network topology studies.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
GeoDataFrame containing the points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs. |
required |
k
|
int
|
The number of nearest neighbors to connect to each node. Must be positive and less than the total number of nodes. |
5
|
distance_metric
|
str
|
The distance metric to use for calculating nearest neighbors. Options are: - "euclidean": Straight-line distance - "manhattan": City-block distance (L1 norm) - "network": Shortest path distance along a network |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
A GeoDataFrame representing a network (e.g., roads, paths) to use for "network"
distance calculations. Required if |
None
|
network_weight
|
str
|
Edge attribute name in |
None
|
target_gdf
|
GeoDataFrame
|
If provided, creates a directed graph where edges connect nodes from |
None
|
as_nx
|
bool
|
If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges). |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
Raises:
| Type | Description |
|---|---|
ValueError
|
If |
delaunay_graph ¶
delaunay_graph(
gdf,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
as_nx=False
)
Generate a Delaunay triangulation graph from a GeoDataFrame of points.
This function constructs a graph based on the Delaunay triangulation of the input points. Each edge in the graph corresponds to an edge in the Delaunay triangulation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
GeoDataFrame containing the points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs. |
required |
distance_metric
|
str
|
The distance metric to use for calculating edge weights. Options are "euclidean", "manhattan", or "network". |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
A GeoDataFrame representing a network (e.g., roads) to use for "network"
distance calculations. Required if |
None
|
network_weight
|
str
|
Edge attribute name in |
None
|
as_nx
|
bool
|
If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges). |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
Raises:
| Type | Description |
|---|---|
ValueError
|
If |
See Also
knn_graph : Generate a k-nearest neighbour graph. fixed_radius_graph : Generate a fixed-radius graph. waxman_graph : Generate a probabilistic Waxman graph.
Notes
- Node IDs are preserved from the input GeoDataFrame's index.
- Edge weights represent the distance between connected nodes.
- Edge geometries are LineStrings connecting the centroids of the nodes.
- If the input gdf has fewer than 3 points, an empty graph will be returned as Delaunay triangulation requires at least 3 non-collinear points.
References
.. [1] Lee, D. T., & Schachter, B. J. (1980). Two algorithms for constructing a Delaunay triangulation. International Journal of Computer & Information Sciences, 9(3), 219-242. https://doi.org/10.1007/BF00977785
gabriel_graph ¶
gabriel_graph(
gdf,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
as_nx=False
)
Generate a Gabriel graph from a GeoDataFrame of points.
In a Gabriel graph two nodes u and v are connected iff the closed disc that has \(uv\) as its diameter contains no other node of the set.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
Input point layer. The GeoDataFrame index is preserved as the node id. |
required |
distance_metric
|
('euclidean', 'manhattan', 'network')
|
Metric used for edge weights / geometries (see the other generators). |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
Required when |
None
|
network_weight
|
str
|
Edge attribute in |
None
|
as_nx
|
bool
|
If True return a NetworkX graph, otherwise return two GeoDataFrames (nodes, edges)
via |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
Notes
- The Gabriel graph is a sub-graph of the Delaunay triangulation; therefore the implementation first builds the Delaunay edges then filters them according to the disc-emptiness predicate, achieving an overall \(O(n \\log n + mk)\) complexity (m = Delaunay edges, k = average neighbours tested per edge).
- When the input layer has exactly two points, the unique edge is returned.
- If the layer has fewer than two points, an empty graph is produced.
References
.. [1] Gabriel, K. R., & Sokal, R. R. (1969). A new statistical approach to geographic variation analysis. Systematic zoology, 18(3), 259-278. https://doi.org/10.2307/2412323
relative_neighborhood_graph ¶
relative_neighborhood_graph(
gdf,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
as_nx=False
)
Generate a Relative-Neighbourhood Graph (RNG) from a GeoDataFrame.
In an RNG two nodes u and v are connected iff there is no third node w such that both \(d(u,w) < d(u,v)\) and \(d(v,w) < d(u,v)\). Equivalently, the intersection of the two open discs having radius \(d(u,v)\) and centres u and v (the lune) is empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
Input point layer whose index provides the node ids. |
required |
distance_metric
|
('euclidean', 'manhattan', 'network')
|
Metric used to attach edge weights / geometries (see the other generators). |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
Required when |
None
|
network_weight
|
str
|
Edge attribute in |
None
|
as_nx
|
bool
|
If True return a NetworkX graph, otherwise return two GeoDataFrames (nodes, edges)
via |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
Notes
- The RNG is a sub-graph of the Delaunay triangulation; therefore the implementation first collects Delaunay edges (\(O(n \\log n)\)) and then filters them according to the lune-emptiness predicate.
- When the input layer has exactly two points the unique edge is returned.
- If the layer has fewer than two points, an empty graph is produced.
References
.. [1] Toussaint, G. T. (1980). The relative neighbourhood graph of a finite planar set. Pattern recognition, 12(4), 261-268. https://doi.org/10.1016/0031-3203(80)90066-7
euclidean_minimum_spanning_tree ¶
euclidean_minimum_spanning_tree(
gdf,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
as_nx=False
)
Generate a (generalised) Euclidean Minimum Spanning Tree from a GeoDataFrame of points.
The classical Euclidean Minimum Spanning Tree (EMST) is the minimum-total-length tree that connects a set of points when edge weights are the straight-line (\(L_2\)) distances. For consistency with the other generators this implementation also supports manhattan and network metrics - it simply computes the minimum-weight spanning tree under the chosen metric. When the metric is euclidean the edge search is restricted to the Delaunay triangulation (EMST ⊆ Delaunay), guaranteeing an \(O(n \log n)\) overall complexity. With other metrics, or degenerate cases where the triangulation cannot be built, the algorithm gracefully falls back to the complete graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
Input point layer. The index is preserved as the node identifier. |
required |
distance_metric
|
('euclidean', 'manhattan', 'network')
|
Metric used for the edge weights / geometries. |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
Required when |
None
|
network_weight
|
str
|
Edge attribute name in |
None
|
as_nx
|
bool
|
If True return a NetworkX graph, otherwise return two GeoDataFrames (nodes, edges)
via |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
See Also
delaunay_graph : Generate a Delaunay triangulation graph. gabriel_graph : Generate a Gabriel graph. relative_neighborhood_graph : Generate a Relative Neighborhood Graph.
Notes
- The resulting graph always contains n - 1 edges (or 0 / 1 when the input has < 2 points).
- For planar Euclidean inputs the computation is \(O(n \\log n)\) thanks to the Delaunay pruning.
- All the usual spatial attributes (weight, geometry, CRS checks, etc.) are attached through the shared private helpers.
References
.. [1] March, W. B., Ram, P., & Gray, A. G. (2010, July). Fast euclidean minimum spanning tree: algorithm, analysis, and applications. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 603-612). https://doi.org/10.1145/1835804.1835882
Distance-Based Graphs¶
Proximity-Based Graph Generation Module.
This module provides comprehensive functionality for generating graph networks based on spatial proximity relationships between geographic features. It implements several classical proximity models commonly used in spatial network analysis and geographic information systems. The module is particularly useful for constructing heterogeneous graphs from multiple domains of geospatial relations, enabling complex spatial analysis across different feature types and scales.
Functions:
| Name | Description |
|---|---|
fixed_radius_graph |
Generate a fixed-radius graph from a GeoDataFrame of points. |
waxman_graph |
Generate a probabilistic Waxman graph from a GeoDataFrame of points. |
fixed_radius_graph ¶
fixed_radius_graph(
gdf,
radius,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
target_gdf=None,
as_nx=False
)
Generate a fixed-radius graph from a GeoDataFrame of points.
This function constructs a graph where nodes are connected if the distance between them is within a specified radius. This model is particularly useful for modeling communication networks, ecological connectivity, and spatial influence zones where interaction strength has a clear distance threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
GeoDataFrame containing the source points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs. |
required |
radius
|
float
|
The maximum distance for connecting nodes. Nodes within this radius will have an edge between them. Must be positive. |
required |
distance_metric
|
str
|
The distance metric to use for determining connections. Options are: - "euclidean": Straight-line distance - "manhattan": City-block distance (L1 norm) - "network": Shortest path distance along a network |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
A GeoDataFrame representing a network (e.g., roads) to use for "network"
distance calculations. Required if |
None
|
network_weight
|
str
|
Edge attribute in |
None
|
target_gdf
|
GeoDataFrame
|
If provided, creates a directed graph where edges connect nodes from |
None
|
as_nx
|
bool
|
If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges). |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
Raises:
| Type | Description |
|---|---|
ValueError
|
If |
See Also
knn_graph : Generate a k-nearest neighbour graph. delaunay_graph : Generate a Delaunay triangulation graph. waxman_graph : Generate a probabilistic Waxman graph.
Notes
- Node IDs are preserved from the input GeoDataFrame's index
- Edge weights represent the actual distance between connected nodes
- Edge geometries are LineStrings connecting node centroids
- The graph stores the radius parameter in G.graph["radius"]
- For Manhattan distance, edges follow L-shaped geometric paths
References
.. [1] Bentley, J. L., Stanat, D. F., & Williams Jr, E. H. (1977). The complexity of finding fixed-radius near neighbors. Information processing letters, 6(6), 209-212. https://doi.org/10.1016/0020-0190(77)90070-9
waxman_graph ¶
waxman_graph(
gdf,
beta,
r0,
seed=None,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
*,
as_nx=False
)
Generate a probabilistic Waxman graph from a GeoDataFrame of points.
This function constructs a random graph where the probability of an edge existing between two nodes decreases exponentially with their distance. The model is based on the Waxman random graph model, commonly used to simulate realistic network topologies in telecommunications, transportation, and social networks where connection probability diminishes with distance.
The connection probability follows the formula:
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
GeoDataFrame containing the points (nodes) for the graph. The index of this GeoDataFrame will be used as node IDs. |
required |
beta
|
float
|
Parameter controlling the overall probability of edge creation. Higher values (closer to 1.0) increase the likelihood of connections. Must be between 0 and 1. |
required |
r0
|
float
|
Parameter controlling the decay rate of probability with distance. Higher values result in longer-range connections being more likely. Must be positive. |
required |
seed
|
int
|
Seed for the random number generator to ensure reproducibility of results. If None, results will vary between runs. |
None
|
distance_metric
|
str
|
The distance metric to use for calculating distances between nodes. Options are: - "euclidean": Straight-line distance - "manhattan": City-block distance (L1 norm) - "network": Shortest path distance along a network |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
A GeoDataFrame representing a network (e.g., roads) to use for "network"
distance calculations. Required if |
None
|
network_weight
|
str
|
Edge attribute name in |
None
|
as_nx
|
bool
|
If True, returns a NetworkX graph object. Otherwise, returns a tuple of GeoDataFrames (nodes, edges). |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
If |
Raises:
| Type | Description |
|---|---|
ValueError
|
If |
See Also
knn_graph : Generate a k-nearest neighbour graph. delaunay_graph : Generate a Delaunay triangulation graph. fixed_radius_graph : Generate a fixed-radius graph.
Notes
- Node IDs are preserved from the input GeoDataFrame's index
- Edge weights represent the actual distance between connected nodes
- Edge geometries are LineStrings connecting node centroids
- The graph stores parameters in G.graph["beta"] and G.graph["r0"]
- Results are stochastic; use seed parameter for reproducible outputs
- The graph is undirected with symmetric edge probabilities
References
.. [1] Waxman, B. M. (2002). Routing of multipoint connections. IEEE journal on selected areas in communications, 6(9), 1617-1622. https://doi.org/10.1109/49.12889
Node Manipulation¶
Proximity-Based Graph Generation Module.
This module provides comprehensive functionality for generating graph networks based on spatial proximity relationships between geographic features. It implements several classical proximity models commonly used in spatial network analysis and geographic information systems. The module is particularly useful for constructing heterogeneous graphs from multiple domains of geospatial relations, enabling complex spatial analysis across different feature types and scales.
Functions:
| Name | Description |
|---|---|
bridge_nodes |
Build directed proximity edges between every ordered pair of node layers. |
group_nodes |
Create a heterogeneous graph linking polygon zones to contained points. |
bridge_nodes ¶
bridge_nodes(
nodes_dict,
proximity_method="knn",
*,
source_node_types=None,
target_node_types=None,
multigraph=False,
as_nx=False,
**kwargs
)
Build directed proximity edges between every ordered pair of node layers.
This function creates a multi-layer spatial network by generating directed proximity edges from nodes in one GeoDataFrame layer to nodes in another. It systematically processes all ordered pairs of layers and applies either k-nearest neighbors (KNN) or fixed-radius method to establish inter-layer connections. This function is specifically designed for constructing heterogeneous graphs by generating new edge types ("is_nearby") between different types of nodes, enabling the modeling of complex relationships across multiple domains. It is useful for modeling complex urban systems, ecological networks, or multi-modal transportation systems where different types of entities interact through spatial proximity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
nodes_dict
|
dict[str, GeoDataFrame]
|
A dictionary where keys are layer names (strings) and values are GeoDataFrames representing the nodes of each layer. Each GeoDataFrame should contain point geometries with consistent CRS across all layers. |
required |
proximity_method
|
str
|
The method to use for generating proximity edges between layers. Options are: - "knn": k-nearest neighbors method - "fixed_radius": fixed-radius method |
"knn"
|
source_node_types
|
Iterable[str]
|
Node types from |
None
|
target_node_types
|
Iterable[str]
|
Node types from |
None
|
multigraph
|
bool
|
If True, the resulting NetworkX graph will be a MultiGraph, allowing multiple edges between the same pair of nodes from different proximity relationships. |
False
|
as_nx
|
bool
|
If True, returns a NetworkX graph object containing all nodes and inter-layer edges. Otherwise, returns dictionaries of GeoDataFrames. |
False
|
**kwargs
|
Any
|
Additional keyword arguments passed to the underlying proximity method: For For |
{}
|
Returns:
| Type | Description |
|---|---|
tuple[dict[str, GeoDataFrame], dict[tuple[str, str, str], GeoDataFrame]] or Graph
|
If
If |
Raises:
| Type | Description |
|---|---|
ValueError
|
If |
See Also
knn_graph : Generate a k-nearest neighbour graph. fixed_radius_graph : Generate a fixed-radius graph.
Notes
- All generated edges are directed from source layer to target layer
- The relation type for all generated edges is fixed as "is_nearby", creating a new edge type that bridges different node types in heterogeneous graphs
- Each ordered pair of node layers generates a distinct edge type, enabling the construction of rich heterogeneous graph structures with multiple relationship types between different domain entities
- Edge weights and geometries are calculated based on the chosen
distance_metric - Each ordered pair of layers generates a separate edge GeoDataFrame
- Self-connections (layer to itself) are not created
- The resulting structure is ideal for heterogeneous graph neural networks, multi-layer network analysis, and cross-domain spatial relationship modeling
group_nodes ¶
group_nodes(
polygons_gdf,
points_gdf,
*,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
predicate="covered_by",
as_nx=False
)
Create a heterogeneous graph linking polygon zones to contained points.
This function builds a bipartite relation between polygon and point features by connecting each polygon to the points that satisfy a spatial containment predicate (default: "covered_by" so boundary points are included). It follows city2graph heterogeneous graph conventions and reuses the proximity helpers for computing edge weights and geometries according to the chosen distance metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
polygons_gdf
|
GeoDataFrame
|
GeoDataFrame of polygonal features representing zones. CRS must match
|
required |
points_gdf
|
GeoDataFrame
|
GeoDataFrame of point features to be associated with the polygons. CRS must
match |
required |
distance_metric
|
('euclidean', 'manhattan', 'network')
|
Metric used for edge weights and geometries. Euclidean produces straight
line segments, Manhattan produces L-shaped polylines, and Network traces polylines
along the provided |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
Required when |
None
|
network_weight
|
str
|
Edge attribute in |
None
|
predicate
|
str
|
Spatial predicate used to determine containment in a vectorized spatial join (e.g., "covered_by", "within", "contains", "intersects"). The default includes points on polygon boundaries. |
"covered_by"
|
as_nx
|
bool
|
If False, return heterogeneous GeoDataFrame dictionaries. If True, return a
typed heterogeneous NetworkX graph built with |
False
|
Returns:
| Type | Description |
|---|---|
tuple[dict[str, GeoDataFrame], dict[tuple[str, str, str], GeoDataFrame]] or Graph
|
(nodes_dict, edges_dict) (tuple of dicts)
Returned when |
Notes
- CRS must be present and identical for both inputs. For network metric, the network's CRS must also match.
- Boundary points are included by default via
predicate="covered_by". - Distance calculations and edge geometries reuse internal helpers (_prepare_nodes, _distance_matrix, _add_edges) to ensure consistency with other proximity functions.
Contiguity Graphs¶
Proximity-Based Graph Generation Module.
This module provides comprehensive functionality for generating graph networks based on spatial proximity relationships between geographic features. It implements several classical proximity models commonly used in spatial network analysis and geographic information systems. The module is particularly useful for constructing heterogeneous graphs from multiple domains of geospatial relations, enabling complex spatial analysis across different feature types and scales.
Functions:
| Name | Description |
|---|---|
contiguity_graph |
Generate a contiguity-based spatial graph from polygon geometries. |
contiguity_graph ¶
contiguity_graph(
gdf,
contiguity="queen",
*,
distance_metric="euclidean",
network_gdf=None,
network_weight=None,
as_nx=False
)
Generate a contiguity-based spatial graph from polygon geometries.
This function creates a spatial graph where nodes represent polygons and edges connect spatially contiguous (adjacent) polygons based on Queen or Rook contiguity rules. It leverages libpysal's robust spatial weights functionality to accurately determine adjacency relationships, making it ideal for spatial analysis of administrative boundaries, urban morphology studies, land use patterns, and geographic network analysis.
The function supports both Queen contiguity (polygons sharing edges or vertices)
and Rook contiguity (polygons sharing only edges), providing flexibility for
different spatial analysis requirements. Edge weights are calculated as
distances between polygon centroids using the selected distance_metric. Supported metrics:
euclidean(default): straight-line distance; edge geometry is a direct centroid-to-centroid LineString.manhattan: L1 distance; edge geometry is an L-shaped polyline (two segments) following an axis-aligned path between centroids.network: shortest-path distance overnetwork_gdf(a line network in the same CRS); edge geometry is the polyline path traced along the network.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
gdf
|
GeoDataFrame
|
Input GeoDataFrame containing polygon geometries. Must contain valid polygon geometries in the 'geometry' column. The index of this GeoDataFrame will be preserved as node identifiers in the output graph. All original attributes are maintained in the nodes output. |
required |
contiguity
|
('queen', 'rook')
|
Type of spatial contiguity rule to apply for determining adjacency:
|
"queen"
|
distance_metric
|
('euclidean', 'manhattan', 'network')
|
Metric used to compute edge weights and geometries. |
"euclidean"
|
network_gdf
|
GeoDataFrame
|
Required when |
None
|
network_weight
|
str
|
Edge attribute in |
None
|
as_nx
|
bool
|
Output format control. If True, returns a NetworkX Graph object with spatial attributes. If False, returns a tuple of GeoDataFrames for nodes and edges, compatible with other city2graph functions. |
False
|
Returns:
| Type | Description |
|---|---|
tuple[GeoDataFrame, GeoDataFrame] or Graph
|
When |
Raises:
| Type | Description |
|---|---|
TypeError
|
If |
ValueError
|
If |
See Also
libpysal.weights.Queen : Spatial weights based on Queen contiguity. libpysal.weights.Rook : Spatial weights based on Rook contiguity. knn_graph : Generate k-nearest neighbor graphs from point data. delaunay_graph : Generate Delaunay triangulation graphs from point data. fixed_radius_graph : Generate fixed-radius proximity graphs. gabriel_graph : Generate Gabriel graphs from point data. relative_neighborhood_graph : Generate relative neighborhood graphs. waxman_graph : Generate probabilistic Waxman graphs.