Graph Module¶
The graph module provides functions for converting between different graph representations, including GeoDataFrames, NetworkX graphs, and PyTorch Geometric data objects.
Conversion Functions¶
Module for creating heterogeneous graph representations of urban environments.
This module provides comprehensive functionality for converting spatial data (GeoDataFrames and NetworkX objects) into PyTorch Geometric Data and HeteroData objects, supporting both homogeneous and heterogeneous graphs. It handles the complex mapping between geographical coordinates, node/edge features, and the tensor representations required by graph neural networks.
The module serves as a bridge between geospatial data analysis tools and deep learning frameworks, so that spatial urban data can be used with Graph Neural Networks (GNNs) for GeoAI tasks such as urban modeling, traffic prediction, and spatial analysis.
Functions:
| Name | Description |
|---|---|
| gdf_to_pyg | Convert GeoDataFrames (nodes/edges) to a PyTorch Geometric object. |
| nx_to_pyg | Convert NetworkX graph to PyTorch Geometric Data object. |
| pyg_to_gdf | Convert PyTorch Geometric data to GeoDataFrames. |
| pyg_to_nx | Convert a PyTorch Geometric object to a NetworkX graph. |
gdf_to_pyg ¶
gdf_to_pyg(
    nodes,
    edges=None,
    node_feature_cols=None,
    node_label_cols=None,
    edge_feature_cols=None,
    device=None,
    dtype=None,
    keep_geom=True,
)
Convert GeoDataFrames (nodes/edges) to a PyTorch Geometric object.
This function serves as the main entry point for converting spatial data into PyTorch Geometric graph objects. It automatically detects whether to create homogeneous or heterogeneous graphs based on input structure. Node identifiers are taken from the GeoDataFrame index. Edge relationships are defined by a MultiIndex on the edge GeoDataFrame (source ID, target ID).
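The index conventions described above can be illustrated without any geospatial dependencies. The following is a minimal sketch, not the library's implementation: it assumes node IDs are unique hashable labels and that edges arrive as (source ID, target ID) pairs, and it maps them to the positional indices that a PyTorch Geometric edge_index of shape [2, num_edges] expects. The helper name build_edge_index is hypothetical.

```python
def build_edge_index(node_ids, edge_pairs):
    """Map (source ID, target ID) pairs to positional [2, E] index lists."""
    # Position of each node ID in the node table (its row number).
    position = {node_id: i for i, node_id in enumerate(node_ids)}
    if len(position) != len(node_ids):
        raise ValueError("node IDs must be unique")

    sources, targets = [], []
    for src, dst in edge_pairs:
        if src not in position or dst not in position:
            raise ValueError(f"edge ({src!r}, {dst!r}) references an unknown node")
        sources.append(position[src])
        targets.append(position[dst])
    return [sources, targets]


node_ids = ["n10", "n20", "n30"]               # stands in for the node index
edge_pairs = [("n10", "n20"), ("n20", "n30")]  # stands in for the edge MultiIndex
edge_index = build_edge_index(node_ids, edge_pairs)
print(edge_index)  # [[0, 1], [1, 2]]
```

The sketch rejects edges that point at unknown node IDs, which is one plausible reading of "invalid or incompatible" inputs; the checks the library actually performs may differ.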
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| nodes | dict[str, GeoDataFrame] or GeoDataFrame | Node data. For homogeneous graphs, provide a single GeoDataFrame. For heterogeneous graphs, provide a dictionary mapping node type names to their respective GeoDataFrames. The index of these GeoDataFrames will be used as node identifiers. | required |
| edges | dict[tuple[str, str, str], GeoDataFrame] or GeoDataFrame | Edge data. For homogeneous graphs, provide a single GeoDataFrame. For heterogeneous graphs, provide a dictionary mapping edge type tuples (source_type, relation_type, target_type) to their GeoDataFrames. The GeoDataFrame must have a MultiIndex where the first level represents source node IDs and the second level represents target node IDs. | None |
| node_feature_cols | dict[str, list[str]] or list[str] | Column names to use as node features. For heterogeneous graphs, provide a dictionary mapping node types to their feature columns. | None |
| node_label_cols | dict[str, list[str]] or list[str] | Column names to use as node labels for supervised learning tasks. For heterogeneous graphs, provide a dictionary mapping node types to their label columns. | None |
| edge_feature_cols | dict[str, list[str]] or list[str] | Column names to use as edge features. For heterogeneous graphs, provide a dictionary mapping relation types to their feature columns. | None |
| device | str or device | Target device for tensor placement ('cpu', 'cuda', or torch.device). If None, automatically selects CUDA if available, otherwise CPU. | None |
| dtype | dtype | Data type for float tensors (e.g., torch.float32, torch.float16). If None, uses torch.float32 (default PyTorch float type). | None |
| keep_geom | bool | Whether to preserve geometry information during conversion. If True, original geometries are serialized and stored in metadata for exact reconstruction. If False, geometries are reconstructed from node positions during conversion back to GeoDataFrames (creating straight-line edges between nodes). | True |
Returns:
| Type | Description |
|---|---|
| Data or HeteroData | PyTorch Geometric Data object for homogeneous graphs or HeteroData object for heterogeneous graphs. The returned object contains: |
Raises:
| Type | Description |
|---|---|
| ImportError | If PyTorch Geometric is not installed. |
| ValueError | If input GeoDataFrames are invalid or incompatible. |
See Also
- pyg_to_gdf : Convert PyTorch Geometric data back to GeoDataFrames.
- nx_to_pyg : Convert NetworkX graph to PyTorch Geometric object.
- city2graph.utils.validate_gdf : Validate GeoDataFrame structure.
Notes
This function automatically detects the graph type based on input structure. For heterogeneous graphs, provide dictionaries mapping types to GeoDataFrames. Node positions are automatically extracted from geometry centroids when available.
- Preserves original coordinate reference systems (CRS)
- Maintains index structure for bidirectional conversion
- Handles both Point and non-Point geometries (using centroids)
- Creates empty tensors for missing features/edges
- For heterogeneous graphs, ensures consistent node/edge type mapping
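To make the centroid note concrete, the sketch below computes the area-weighted centroid of a simple polygon ring in plain Python. It is only an illustration of how a non-Point geometry can be reduced to a single node position; in practice GeoPandas and Shapely compute centroids, and the helper name polygon_centroid is hypothetical.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Standard formula: C = sum(...) / (6 * area), and 6 * area = 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# A 2 x 4 building footprint; its centroid becomes the node position.
footprint = [(0.0, 0.0), (2.0, 0.0), (2.0, 4.0), (0.0, 4.0)]
pos = polygon_centroid(footprint)
print(pos)  # (1.0, 2.0)
```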
Examples:
Create a homogeneous graph from single GeoDataFrames:
>>> import geopandas as gpd
>>> from city2graph.graph import gdf_to_pyg
>>>
>>> # Load and prepare node data
>>> nodes_gdf = gpd.read_file("nodes.geojson").set_index("node_id")
>>> edges_gdf = gpd.read_file("edges.geojson").set_index(["source_id", "target_id"])
>>>
>>> # Convert to PyTorch Geometric
>>> data = gdf_to_pyg(nodes_gdf, edges_gdf,
... node_feature_cols=['population', 'area'])
Create a heterogeneous graph from dictionaries:
>>> # Prepare heterogeneous data
>>> buildings_gdf = buildings_gdf.set_index("building_id")
>>> roads_gdf = roads_gdf.set_index("road_id")
>>> connections_gdf = connections_gdf.set_index(["building_id", "road_id"])
>>>
>>> # Define node and edge types
>>> nodes_dict = {'building': buildings_gdf, 'road': roads_gdf}
>>> edges_dict = {('building', 'connects', 'road'): connections_gdf}
>>>
>>> # Convert to heterogeneous graph with labels
>>> data = gdf_to_pyg(nodes_dict, edges_dict,
... node_label_cols={'building': ['type'], 'road': ['category']})
nx_to_pyg ¶
nx_to_pyg(
    graph,
    node_feature_cols=None,
    node_label_cols=None,
    edge_feature_cols=None,
    device=None,
    dtype=None,
    keep_geom=True,
)
Convert NetworkX graph to PyTorch Geometric Data object.
Converts a NetworkX Graph to a PyTorch Geometric Data object by first converting to GeoDataFrames then using the main conversion pipeline. This provides a bridge between NetworkX's rich graph analysis tools and PyTorch Geometric's deep learning capabilities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | Graph | NetworkX graph to convert. | required |
| node_feature_cols | list[str] | List of node attribute names to use as features. | None |
| node_label_cols | list[str] | List of node attribute names to use as labels. | None |
| edge_feature_cols | list[str] | List of edge attribute names to use as features. | None |
| device | device or str | Target device for tensor placement ('cpu', 'cuda', or torch.device). If None, automatically selects CUDA if available, otherwise CPU. | None |
| dtype | dtype | Data type for float tensors (e.g., torch.float32, torch.float16). If None, uses torch.float32 (default PyTorch float type). | None |
| keep_geom | bool | Whether to preserve geometry information during conversion. If True, original geometries are serialized and stored in metadata for exact reconstruction. If False, geometries are reconstructed from node positions during conversion back to GeoDataFrames. | True |
Returns:
| Type | Description |
|---|---|
| Data or HeteroData | PyTorch Geometric Data object for homogeneous graphs or HeteroData object for heterogeneous graphs. The returned object contains: |
Raises:
| Type | Description |
|---|---|
| ImportError | If PyTorch Geometric is not installed. |
| ValueError | If the NetworkX graph is invalid or empty. |
See Also
- pyg_to_nx : Convert PyTorch Geometric data to NetworkX graph.
- gdf_to_pyg : Convert GeoDataFrames to PyTorch Geometric object.
- city2graph.utils.nx_to_gdf : Convert NetworkX graph to GeoDataFrames.
Notes
- Uses intermediate GeoDataFrame conversion for consistency
- Preserves all graph attributes and metadata
- Handles spatial coordinates if present in node attributes
- Maintains compatibility with existing city2graph workflows
- Automatically creates geometry from 'x', 'y' coordinates if available
Examples:
Convert a NetworkX graph with spatial data:
>>> import networkx as nx
>>> from city2graph.graph import nx_to_pyg
>>>
>>> # Create NetworkX graph with spatial attributes
>>> G = nx.Graph()
>>> G.add_node(0, x=0.0, y=0.0, population=1000)
>>> G.add_node(1, x=1.0, y=1.0, population=1500)
>>> G.add_edge(0, 1, weight=0.5, road_type='primary')
>>>
>>> # Convert to PyTorch Geometric
>>> data = nx_to_pyg(G,
... node_feature_cols=['population'],
... edge_feature_cols=['weight'])
Convert from graph analysis results:
>>> # Use NetworkX for analysis, then convert for ML
>>> communities = nx.community.greedy_modularity_communities(G)
>>> # Add community labels to nodes
>>> for i, community in enumerate(communities):
...     for node in community:
...         G.nodes[node]['community'] = i
>>>
>>> # Convert with community labels
>>> data = nx_to_pyg(G, node_label_cols=['community'])
pyg_to_gdf ¶
Convert PyTorch Geometric data to GeoDataFrames.
Reconstructs the original GeoDataFrame structure from PyTorch Geometric Data or HeteroData objects. This function provides bidirectional conversion capability, preserving spatial information, feature data, and metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Data or HeteroData | PyTorch Geometric data object to convert back to GeoDataFrames. | required |
| node_types | str or list[str] | For heterogeneous graphs, specify which node types to reconstruct. If None, reconstructs all available node types. | None |
| edge_types | str or list[tuple[str, str, str]] | For heterogeneous graphs, specify which edge types to reconstruct. Edge types are specified as (source_type, relation_type, target_type) tuples. If None, reconstructs all available edge types. | None |
| keep_geom | bool | Whether to use stored geometries for reconstruction. If True and geometries are stored in metadata, uses the original geometries. If False or no stored geometries exist, reconstructs geometries from node positions (creating straight-line edges between nodes). | True |
Returns:
| Type | Description |
|---|---|
| tuple[GeoDataFrame, GeoDataFrame] \| tuple[dict[str, GeoDataFrame], dict[tuple[str, str, str], GeoDataFrame]] | For Data input: a tuple whose first element is a GeoDataFrame containing nodes and whose second element is a GeoDataFrame containing edges (or None if no edges). For HeteroData input: a tuple whose first element is a dict mapping node type names to GeoDataFrames and whose second element is a dict mapping edge types to GeoDataFrames. |
See Also
- gdf_to_pyg : Convert GeoDataFrames to PyTorch Geometric object.
- pyg_to_nx : Convert PyTorch Geometric data to NetworkX graph.
Notes
- Preserves original index structure and names when available
- Reconstructs geometry from stored position tensors
- Maintains coordinate reference system (CRS) information
- Converts feature tensors back to named DataFrame columns
- Handles both homogeneous and heterogeneous graph structures
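When no stored geometries are available, edges are rebuilt as straight lines between node positions. The sketch below illustrates that idea with plain tuples and WKT strings so it runs without PyTorch or Shapely. It is an assumption-laden illustration rather than the library's code, and the helper name straight_line_wkt is hypothetical.

```python
def straight_line_wkt(positions, edge_index):
    """Build one straight LINESTRING (as WKT) per edge from node positions."""
    sources, targets = edge_index
    lines = []
    for s, t in zip(sources, targets):
        (x0, y0), (x1, y1) = positions[s], positions[t]
        lines.append(f"LINESTRING ({x0:g} {y0:g}, {x1:g} {y1:g})")
    return lines


positions = [(0.0, 0.0), (1.0, 1.0), (2.5, 1.0)]  # one (x, y) per node
edge_index = [[0, 1], [1, 2]]                     # edges 0 -> 1 and 1 -> 2
lines = straight_line_wkt(positions, edge_index)
print(lines)  # ['LINESTRING (0 0, 1 1)', 'LINESTRING (1 1, 2.5 1)']
```

Any curvature in the original edge geometries is lost in this mode, which is why keep_geom=True is needed for exact round trips.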
Examples:
Convert homogeneous PyTorch Geometric data back to GeoDataFrames:
>>> from city2graph.graph import pyg_to_gdf
>>>
>>> # Convert back to GeoDataFrames
>>> nodes_gdf, edges_gdf = pyg_to_gdf(data)
Convert heterogeneous data with specific node types:
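A possible shape for such a call is sketched below, based only on the parameter and return descriptions in this section. The 'building' node type is a hypothetical name carried over from the gdf_to_pyg example, and the wrapper function buildings_from_hetero is illustrative. The import is done lazily so the snippet can be defined without the optional dependencies installed.

```python
def buildings_from_hetero(data):
    """Rebuild only the 'building' nodes from a HeteroData object."""
    # Lazy import: city2graph and its torch dependencies are only needed
    # when the function is actually called.
    from city2graph.graph import pyg_to_gdf

    # For HeteroData input the documented return value is a pair of dicts:
    # node type -> GeoDataFrame and edge type -> GeoDataFrame.
    nodes_dict, edges_dict = pyg_to_gdf(data, node_types=["building"])
    return nodes_dict["building"]
```

Calling buildings_from_hetero(data) with the heterogeneous object produced in the gdf_to_pyg example would then be expected to return the buildings GeoDataFrame.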
pyg_to_nx ¶
Convert a PyTorch Geometric object to a NetworkX graph.
Converts PyTorch Geometric Data or HeteroData objects to NetworkX graphs, preserving node and edge features as graph attributes. This enables compatibility with the extensive NetworkX ecosystem for graph analysis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Data or HeteroData | PyTorch Geometric data object to convert. | required |
| keep_geom | bool | Whether to use stored geometries for reconstruction. If True and geometries are stored in metadata, uses the original geometries. If False or no stored geometries exist, reconstructs geometries from node positions. | True |
Returns:
| Type | Description |
|---|---|
| Graph | The converted NetworkX graph with node and edge attributes. For heterogeneous graphs, node and edge types are stored as attributes. |
Raises:
| Type | Description |
|---|---|
| ImportError | If PyTorch Geometric is not installed. |
See Also
- nx_to_pyg : Convert NetworkX graph to PyTorch Geometric object.
- pyg_to_gdf : Convert PyTorch Geometric data to GeoDataFrames.
Notes
- Node features, positions, and labels are stored as node attributes
- Edge features are stored as edge attributes
- For heterogeneous graphs, type information is preserved
- Geometry information is converted from tensor positions
- Maintains compatibility with NetworkX analysis algorithms
Examples:
Convert PyTorch Geometric data to NetworkX:
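A minimal usage sketch follows, assuming pyg_to_nx is importable from city2graph.graph like the other conversion functions on this page. The wrapper name summarize_as_networkx is hypothetical, and the import is lazy so the snippet can be defined without the optional dependencies.

```python
def summarize_as_networkx(data):
    """Convert a PyG object to NetworkX and report its basic size."""
    # Lazy import: only required when the function is called.
    from city2graph.graph import pyg_to_nx

    graph = pyg_to_nx(data)
    # number_of_nodes / number_of_edges are standard NetworkX Graph methods.
    return {"nodes": graph.number_of_nodes(), "edges": graph.number_of_edges()}
```

From here the returned graph can be passed to any NetworkX algorithm, for example a centrality or community-detection routine.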
Validation Functions¶
Functions:
| Name | Description |
|---|---|
| is_torch_available | Check if PyTorch Geometric is available. |
| validate_pyg | Validate PyTorch Geometric Data or HeteroData objects and return metadata. |
is_torch_available ¶
Check if PyTorch Geometric is available.
This utility function checks whether the required PyTorch and PyTorch Geometric packages are installed and can be imported. It's useful for conditional functionality and providing helpful error messages.
Returns:
| Type | Description |
|---|---|
| bool | True if PyTorch Geometric can be imported, False otherwise. |
See Also
- gdf_to_pyg : Convert GeoDataFrames to PyTorch Geometric (requires torch).
- pyg_to_gdf : Convert PyTorch Geometric to GeoDataFrames (requires torch).
Notes
- Returns False if either PyTorch or PyTorch Geometric is missing
- Used internally by torch-dependent functions to provide helpful error messages
Examples:
Check availability before using torch-dependent functions:
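The guard pattern can be sketched with the standard library alone. The stand-in below uses importlib.util.find_spec to test whether both packages can be located. It is not the library's implementation of is_torch_available, and the helper names packages_available and torch_stack_available are hypothetical.

```python
from importlib.util import find_spec


def packages_available(*names):
    """Return True only if every named top-level package can be located."""
    return all(find_spec(name) is not None for name in names)


def torch_stack_available():
    """Stand-in check: both PyTorch and PyTorch Geometric must be present."""
    return packages_available("torch", "torch_geometric")


# Branch on availability instead of letting an ImportError surface later.
if torch_stack_available():
    print("PyTorch Geometric found: graph conversion can proceed.")
else:
    print("PyTorch Geometric missing: skipping graph conversion.")
```

In application code the same branch would call is_torch_available() before gdf_to_pyg or pyg_to_gdf.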
validate_pyg ¶
Validate PyTorch Geometric Data or HeteroData objects and return metadata.
This centralized validation function performs comprehensive validation of PyG objects, including type checking, metadata validation, and structural consistency checks. It serves as the single point of validation for all PyG objects in city2graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Data or HeteroData | PyTorch Geometric data object to validate. | required |
Returns:
| Type | Description |
|---|---|
| GraphMetadata | Metadata object containing graph information for reconstruction. |
Raises:
| Type | Description |
|---|---|
| ImportError | If PyTorch Geometric is not installed. |
| TypeError | If data is not a valid PyTorch Geometric object. |
| ValueError | If the data object is missing required metadata or is inconsistent. |
See Also
- pyg_to_gdf : Convert PyG objects to GeoDataFrames.
- pyg_to_nx : Convert PyG objects to NetworkX graphs.
Examples:
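One way to use the validator defensively is sketched below. It relies only on the exception types listed under Raises and assumes validate_pyg can be imported from city2graph.graph, which this page does not state explicitly. The wrapper name safe_validate is hypothetical.

```python
def safe_validate(data):
    """Return (metadata, None) on success or (None, error message) on failure."""
    try:
        # Lazy import (assumed path): also covers a missing installation.
        from city2graph.graph import validate_pyg

        return validate_pyg(data), None
    except ImportError as exc:
        return None, f"missing dependency: {exc}"
    except (TypeError, ValueError) as exc:
        return None, f"invalid PyG object: {exc}"


# A plain object is not a Data or HeteroData instance, so validation fails.
metadata, error = safe_validate(object())
print(error)
```

Returning the error as a value keeps batch pipelines running when a single graph object is malformed.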
Metapath Functions¶
Functions:
| Name | Description |
|---|---|
| add_metapaths | Add metapath-derived edges to a heterogeneous graph. |
| add_metapaths_by_weight | Connect nodes of a specific type if they are reachable within a cost threshold band. |
add_metapaths ¶
add_metapaths(
    graph=None,
    nodes=None,
    edges=None,
    sequence=None,
    new_relation_name=None,
    edge_attr=None,
    edge_attr_agg="sum",
    directed=False,
    trace_path=False,
    multigraph=False,
    as_nx=False,
    **_
)
Add metapath-derived edges to a heterogeneous graph.
The operation multiplies typed adjacency tables to connect terminal node pairs and can aggregate additional numeric edge attributes along the way.
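The idea of multiplying typed adjacency tables can be illustrated with two small edge tables joined on their shared middle node. The sketch below is a plain-Python stand-in, not the library's implementation. The node names are made up, and summing the attribute over all parallel paths is only one possible aggregation choice.

```python
from collections import defaultdict


def compose(step_a, step_b):
    """Join two typed edge tables on their shared middle node.

    Each table maps (source, target) -> numeric attribute. The result maps
    (source, final target) -> (number of paths, attribute summed over paths).
    """
    # Index the second step by its source node for fast lookup.
    by_source = defaultdict(list)
    for (mid, dst), attr in step_b.items():
        by_source[mid].append((dst, attr))

    paths = defaultdict(lambda: [0, 0.0])
    for (src, mid), attr_a in step_a.items():
        for dst, attr_b in by_source.get(mid, []):
            entry = paths[(src, dst)]
            entry[0] += 1                # entry of the adjacency product
            entry[1] += attr_a + attr_b  # attribute accumulated along the path
    return {pair: tuple(value) for pair, value in paths.items()}


# Hypothetical metapath building -> street -> building with edge lengths.
building_to_street = {("b1", "s1"): 10.0, ("b2", "s1"): 5.0, ("b2", "s2"): 7.0}
street_to_building = {("s1", "b1"): 10.0, ("s1", "b2"): 5.0, ("s2", "b2"): 7.0}

result = compose(building_to_street, street_to_building)
print(result[("b1", "b2")])  # (1, 15.0)
print(result[("b2", "b2")])  # (2, 24.0)
```

The path count corresponds to the entry of the product of the two adjacency matrices, while the second value shows how a numeric edge attribute can be carried along each path.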
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | tuple or Graph or MultiGraph | Heterogeneous graph input expressed as typed GeoDataFrame dictionaries or a city2graph-compatible NetworkX graph. | None |
| nodes | dict[str, GeoDataFrame] | Dictionary of node GeoDataFrames. | None |
| edges | dict[tuple[str, str, str], GeoDataFrame] | Dictionary of edge GeoDataFrames. | None |
| sequence | list[tuple[str, str, str]] | Sequence of metapath specifications; every edge type is a | None |
| new_relation_name | str | Target edge relation name for the new metapath edges. If None (default), edges are named | None |
| edge_attr | str \| list[str] \| None | Numeric edge attributes to aggregate along metapaths. When | None |
| edge_attr_agg | str \| object \| None | Aggregation strategy for | 'sum' |
| directed | bool | Treat metapaths as directed when | False |
| trace_path | bool | When | False |
| multigraph | bool | When returning NetworkX data, build a | False |
| as_nx | bool | Return the result as a NetworkX graph when | False |
| **_ | object | Ignored placeholder for future keyword extensions. | {} |
Returns:
| Type | Description |
|---|---|
| tuple[dict[str, GeoDataFrame], dict[tuple[str, str, str], GeoDataFrame]] \| Graph \| MultiGraph | The graph with metapath-derived edges. If as_nx is False (default), returns a tuple of node and edge GeoDataFrames. If as_nx is True, returns a NetworkX graph (Graph or MultiGraph). |
Notes
Legacy scaffolding for path-tracing geometries has been removed because it was never executed. The trace_path argument is preserved for API compatibility but remains a no-op while straight-line geometries are generated for all metapath edges.
add_metapaths_by_weight ¶
add_metapaths_by_weight(
    graph=None,
    nodes=None,
    edges=None,
    weight=None,
    threshold=None,
    new_relation_name=None,
    min_threshold=0.0,
    edge_types=None,
    endpoint_type=None,
    directed=False,
    multigraph=False,
    as_nx=False,
)
Connect nodes of a specific type if they are reachable within a cost threshold band.
This function dynamically adds metapaths (edges) between nodes of a specified endpoint_type if they are reachable within a given cost band [min_threshold, threshold] based on edge weights (e.g., travel time). It uses Dijkstra's algorithm for path finding via scipy.sparse.csgraph for efficiency.
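The cost-band rule can be sketched with a small pure-Python Dijkstra. Note the substitution: this example uses heapq instead of scipy.sparse.csgraph so that it has no dependencies. It also assumes the band is inclusive at both ends and that a node is never connected to itself. All helper and node names are hypothetical.

```python
import heapq


def shortest_costs(adjacency, source):
    """Dijkstra: cheapest cost from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in adjacency.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


def connect_within_band(adjacency, endpoints, threshold, min_threshold=0.0):
    """Pair up endpoints whose cheapest path cost lies in the band."""
    pairs = {}
    for source in endpoints:
        costs = shortest_costs(adjacency, source)
        for target in endpoints:
            if target == source:
                continue
            cost = costs.get(target)
            if cost is not None and min_threshold <= cost <= threshold:
                pairs[(source, target)] = cost
    return pairs


# Undirected toy network: buildings b1..b3 attached to street nodes s1, s2.
edges = [("b1", "s1", 2.0), ("s1", "s2", 3.0), ("s2", "b2", 1.0), ("s2", "b3", 6.0)]
adjacency = {}
for u, v, w in edges:
    adjacency.setdefault(u, []).append((v, w))
    adjacency.setdefault(v, []).append((u, w))

pairs = connect_within_band(adjacency, ["b1", "b2", "b3"], threshold=8.0)
print(sorted(pairs.items()))
```

With a threshold of 8.0 the pairs b1/b2 (cost 6.0) and b2/b3 (cost 7.0) are connected in both directions, while b1/b3 (cost 11.0) is left out. Raising min_threshold above 6.0 would drop the b1/b2 pair as well.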
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| graph | tuple or Graph or MultiGraph | Input graph. Can be a tuple of (nodes_dict, edges_dict) or a NetworkX graph. | None |
| nodes | dict[str, GeoDataFrame] | Dictionary of node GeoDataFrames. | None |
| edges | dict[tuple[str, str, str], GeoDataFrame] | Dictionary of edge GeoDataFrames. | None |
| weight | str | The edge attribute to use as weight (e.g., 'travel_time'). | None |
| threshold | float | The maximum cost threshold for connection. | None |
| new_relation_name | str | Name of the new edge relation. | None |
| min_threshold | float | The minimum cost threshold for connection. | 0.0 |
| edge_types | list[tuple[str, str, str]] | List of edge types to consider for traversal. If None, all edges are used. | None |
| endpoint_type | str | The node type to connect (e.g., 'building'). | None |
| directed | bool | If True, creates a directed graph for traversal. | False |
| multigraph | bool | If True, returns a MultiGraph (only relevant if as_nx=True). | False |
| as_nx | bool | If True, returns a NetworkX graph. | False |
Returns:
| Type | Description |
|---|---|
| Graph or MultiGraph or tuple | The graph with added metapaths. Format depends on |