Transportation Module

The transportation module provides functions for processing GTFS (General Transit Feed Specification) data and creating transportation network graphs.

GTFS Processing

Transportation Network Analysis Module.

This module provides comprehensive functionality for processing General Transit Feed Specification (GTFS) data and creating transportation network representations. It specializes in converting public transit data into graph structures suitable for network analysis and accessibility studies.

All functions return ready-to-use pandas/GeoPandas objects or NetworkX graphs that can be seamlessly integrated into analysis pipelines, notebooks, or model training workflows.

Functions:

| Name | Description |
| --- | --- |
| load_gtfs | Parse a GTFS zip file and enrich stops/shapes with geometry. |
| get_od_pairs | Materialise origin-destination pairs for every trip and service day. |

load_gtfs

load_gtfs(path)

Parse a GTFS zip file and enrich stops/shapes with geometry.

This function loads a GTFS (General Transit Feed Specification) zip file and converts it into a dictionary of pandas/GeoPandas DataFrames. Stop locations and route shapes are automatically converted to geometric objects for spatial analysis.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| path | str or Path | Location of the zipped GTFS feed (e.g. "./rome_gtfs.zip"). | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, DataFrame or GeoDataFrame] | Keys are the original GTFS file names (without extension) and values are pandas or GeoPandas DataFrames ready for analysis. |

See Also

get_od_pairs : Create origin-destination pairs from GTFS data.
travel_summary_graph : Create network representation from GTFS data.

Notes
  • The function never mutates the original file - everything is kept in memory.
  • Geometry columns are added only when the relevant coordinate columns are present and valid.
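
The behaviour described above (one table per GTFS file, keyed by file name without extension, with geometry only where coordinates are valid) can be illustrated with the standard library alone. This is a minimal sketch and not the module's implementation: `read_gtfs_tables` and `stop_point` are hypothetical helpers, rows are plain dicts rather than DataFrames, and a `(lon, lat)` tuple stands in for a point geometry.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def read_gtfs_tables(source):
    """Read every .txt table of a GTFS zip into a dict keyed by file stem."""
    tables = {}
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if not name.endswith(".txt"):
                continue
            with archive.open(name) as raw:
                # utf-8-sig drops the byte-order mark that some feeds include.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[PurePosixPath(name).stem] = list(csv.DictReader(text))
    return tables


def stop_point(row):
    """Return (lon, lat) when both coordinates are present and valid, else None."""
    try:
        lon, lat = float(row["stop_lon"]), float(row["stop_lat"])
    except (KeyError, TypeError, ValueError):
        return None
    if -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0:
        return (lon, lat)
    return None


# Build a tiny made-up feed in memory so the sketch runs without a file on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "stops.txt",
        "stop_id,stop_name,stop_lat,stop_lon\n"
        "A,Alpha,41.90,12.50\n"
        "B,Beta,,\n",
    )
    archive.writestr("routes.txt", "route_id,route_short_name\nR1,1\n")
buffer.seek(0)

tables = read_gtfs_tables(buffer)
points = {row["stop_id"]: stop_point(row) for row in tables["stops"]}
print(sorted(tables))
print(points)
```

Nothing is written back to the archive; the stop with empty coordinates simply gets no point.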

Examples:

>>> from pathlib import Path
>>> gtfs = load_gtfs(Path("data/rome_gtfs.zip"))
>>> print(list(gtfs))
['agency', 'routes', 'trips', 'stops', ...]
>>> gtfs['stops'].head(2)[['stop_name', 'geometry']]
      stop_name                   geometry
0  Termini (MA)  POINT (12.50118 41.90088)
1  Colosseo(MB)  POINT (12.49224 41.89021)

get_od_pairs

get_od_pairs(gtfs, start_date=None, end_date=None, include_geometry=True)

Materialise origin-destination pairs for every trip and service day.

This function creates a comprehensive dataset of all origin-destination pairs for transit trips within the specified date range, optionally including geometric information for spatial analysis.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gtfs | dict | Dictionary returned by load_gtfs. | required |
| start_date | str or None | Restrict the calendar expansion to the closed interval [start_date, end_date] (format YYYYMMDD). When None the period is inferred from calendar.txt. | None |
| end_date | str or None | Restrict the calendar expansion to the closed interval [start_date, end_date] (format YYYYMMDD). When None the period is inferred from calendar.txt. | None |
| include_geometry | bool | If True the result is a GeoDataFrame whose geometry is a straight LineString connecting the two stops. | True |

Returns:

| Type | Description |
| --- | --- |
| DataFrame or GeoDataFrame | One row per trip-day-leg with departure / arrival timestamps, travel time in seconds and, optionally, geometry. |
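
To make the "leg" concept concrete, the sketch below derives consecutive stop pairs and their travel time from `stop_times` rows. It is an illustration under stated assumptions, not the module's code: `gtfs_seconds` and `trip_legs` are hypothetical helpers, the data is made up, and travel time is assumed to run from departure at the origin to arrival at the destination. GTFS allows times past 24:00:00 for trips that continue after midnight, so times are converted to seconds by hand instead of being parsed as clock times.

```python
from collections import defaultdict


def gtfs_seconds(hhmmss):
    """Convert a GTFS HH:MM:SS string to seconds; hours may exceed 23."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def trip_legs(stop_times):
    """Yield one dict per pair of consecutive stops within each trip."""
    by_trip = defaultdict(list)
    for row in stop_times:
        by_trip[row["trip_id"]].append(row)
    for trip_id, rows in by_trip.items():
        # stop_sequence defines the order of stops along the trip.
        rows.sort(key=lambda r: int(r["stop_sequence"]))
        for origin, dest in zip(rows, rows[1:]):
            departure = gtfs_seconds(origin["departure_time"])
            arrival = gtfs_seconds(dest["arrival_time"])
            yield {
                "trip_id": trip_id,
                "orig_stop_id": origin["stop_id"],
                "dest_stop_id": dest["stop_id"],
                "travel_time_sec": float(arrival - departure),
            }


# Made-up rows for a single trip that crosses midnight.
stop_times = [
    {"trip_id": "T1", "stop_sequence": "1", "stop_id": "7045490",
     "arrival_time": "23:58:00", "departure_time": "23:58:00"},
    {"trip_id": "T1", "stop_sequence": "2", "stop_id": "7045491",
     "arrival_time": "24:00:00", "departure_time": "24:00:30"},
    {"trip_id": "T1", "stop_sequence": "3", "stop_id": "7045492",
     "arrival_time": "24:03:30", "departure_time": "24:03:30"},
]

legs = list(trip_legs(stop_times))
for leg in legs:
    print(leg["orig_stop_id"], leg["dest_stop_id"], leg["travel_time_sec"])
```

Repeating these legs for each service day of the trip would give the trip-day-leg rows described above.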

See Also

load_gtfs : Load GTFS data from zip file.
travel_summary_graph : Create network representation from GTFS data.

Examples:

>>> gtfs = load_gtfs("data/rome_gtfs.zip")
>>> od = get_od_pairs(gtfs, start_date="20230101", end_date="20230107")
>>> od.head(3)[['orig_stop_id', 'dest_stop_id', 'travel_time_sec']]
  orig_stop_id dest_stop_id  travel_time_sec
0      7045490      7045491            120.0
1      7045491      7045492            180.0
2      7045492      7045493            240.0
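
The calendar expansion over the closed interval [start_date, end_date] can be pictured as follows. This is a rough sketch assuming a single `calendar.txt` row with the standard weekday flag columns; `service_days` is a hypothetical helper, and exceptions from `calendar_dates.txt` are deliberately ignored here.

```python
from datetime import datetime, timedelta

# Column names of the weekday flags in calendar.txt, Monday first.
WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday")


def parse_yyyymmdd(text):
    """Parse a GTFS date string such as '20230101'."""
    return datetime.strptime(text, "%Y%m%d").date()


def service_days(calendar_row, start_date=None, end_date=None):
    """List the dates on which a service runs, both interval ends included."""
    first = parse_yyyymmdd(calendar_row["start_date"])
    last = parse_yyyymmdd(calendar_row["end_date"])
    # Optional bounds can only narrow the range declared in calendar.txt.
    if start_date is not None:
        first = max(first, parse_yyyymmdd(start_date))
    if end_date is not None:
        last = min(last, parse_yyyymmdd(end_date))
    days = []
    current = first
    while current <= last:
        if calendar_row[WEEKDAYS[current.weekday()]] == "1":
            days.append(current)
        current += timedelta(days=1)
    return days


# Made-up weekday-only service.
weekday_service = {
    "service_id": "WD",
    "monday": "1", "tuesday": "1", "wednesday": "1", "thursday": "1",
    "friday": "1", "saturday": "0", "sunday": "0",
    "start_date": "20221201", "end_date": "20230331",
}

days = service_days(weekday_service, "20230101", "20230107")
print([day.isoformat() for day in days])
```

With both bounds left as None, the loop covers the full range declared in the row, which mirrors the "inferred from calendar.txt" default.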

Graph Construction

Functions:

| Name | Description |
| --- | --- |
| travel_summary_graph | Aggregate stop-to-stop travel time & frequency into an edge list. |

travel_summary_graph

travel_summary_graph(
    gtfs,
    start_time=None,
    end_time=None,
    calendar_start=None,
    calendar_end=None,
    as_nx=False,
)

Aggregate stop-to-stop travel time & frequency into an edge list.

This function analyzes GTFS data to create a network representation of transit connections, computing average travel times and service frequencies between consecutive stops.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gtfs | dict | A dictionary produced by load_gtfs - must contain at least stop_times and stops. | required |
| start_time | str or None | Consider only trips whose departure falls inside [start_time, end_time] (format HH:MM:SS). When None the whole service day is used. | None |
| end_time | str or None | Consider only trips whose departure falls inside [start_time, end_time] (format HH:MM:SS). When None the whole service day is used. | None |
| calendar_start | str or None | Period over which service-days are counted (format YYYYMMDD). If omitted it spans the native range in calendar.txt. | None |
| calendar_end | str or None | Period over which service-days are counted (format YYYYMMDD). If omitted it spans the native range in calendar.txt. | None |
| as_nx | bool | If True return a NetworkX graph, otherwise two GeoDataFrames (nodes_gdf, edges_gdf). The latter follow the convention used in utils.py. | False |

Returns:

| Type | Description |
| --- | --- |
| tuple[GeoDataFrame, GeoDataFrame] or Graph | With as_nx=False, a (nodes, edges) pair. Nodes: every stop with a valid geometry. Edges: columns from_stop_id, to_stop_id, travel_time_sec, frequency, geometry. With as_nx=True, a NetworkX graph. |
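
The aggregation step can be sketched in plain Python: group legs by stop pair, average the travel time and count the legs. This is only an illustration under assumptions. `summarise_edges` is a hypothetical helper, the time window is applied to each leg's own departure rather than to whole trips, and "frequency" is taken to mean the number of legs observed, which may differ from how the module counts service days.

```python
from collections import defaultdict
from statistics import mean


def to_seconds(hhmmss):
    """Convert HH:MM:SS to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarise_edges(legs, start_time=None, end_time=None):
    """Return {(from_stop_id, to_stop_id): {travel_time_sec, frequency}}."""
    lower = to_seconds(start_time) if start_time is not None else None
    upper = to_seconds(end_time) if end_time is not None else None
    grouped = defaultdict(list)
    for leg in legs:
        departure = to_seconds(leg["departure_time"])
        # Closed interval: departures exactly on a bound are kept.
        if lower is not None and departure < lower:
            continue
        if upper is not None and departure > upper:
            continue
        pair = (leg["from_stop_id"], leg["to_stop_id"])
        grouped[pair].append(leg["travel_time_sec"])
    return {
        pair: {"travel_time_sec": mean(times), "frequency": len(times)}
        for pair, times in grouped.items()
    }


# Made-up legs; two of them fall outside the 07:00:00-10:00:00 window.
legs = [
    {"from_stop_id": "A", "to_stop_id": "B",
     "departure_time": "06:50:00", "travel_time_sec": 150.0},
    {"from_stop_id": "A", "to_stop_id": "B",
     "departure_time": "07:00:00", "travel_time_sec": 120.0},
    {"from_stop_id": "A", "to_stop_id": "B",
     "departure_time": "08:30:00", "travel_time_sec": 180.0},
    {"from_stop_id": "B", "to_stop_id": "C",
     "departure_time": "10:00:00", "travel_time_sec": 240.0},
    {"from_stop_id": "B", "to_stop_id": "C",
     "departure_time": "10:00:01", "travel_time_sec": 300.0},
]

edges = summarise_edges(legs, start_time="07:00:00", end_time="10:00:00")
for pair, stats in edges.items():
    print(pair, stats)
```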

See Also

get_od_pairs : Create origin-destination pairs from GTFS data.
load_gtfs : Load GTFS data from zip file.

Examples:

>>> gtfs = load_gtfs("data/rome_gtfs.zip")
>>> nodes, edges = travel_summary_graph(
...     gtfs,
...     start_time="07:00:00",
...     end_time="10:00:00",
... )
>>> print(edges.head(3)[['travel_time_sec', 'frequency']])
                         travel_time_sec  frequency
from_stop_id to_stop_id
7045490      7045491               120.0         42
7045491      7045492               180.0         42
7045492      7045493               240.0         42

You can directly obtain a NetworkX object too:

>>> G = travel_summary_graph(gtfs, as_nx=True)
>>> print(G.number_of_nodes(), G.number_of_edges())
2564 3178