vtra.preprocess package


vtra.preprocess.convert_hazard_data module

Pre-process hazard data


Convert GeoTiff raster hazard datasets to shapefiles based on masking and selecting values from
  • Single-band raster files
  • Multi-band (3-bands) raster files

Input data requirements

  1. Correct paths to all hazard datasets
  2. Single-band GeoTiff hazard raster files with:
    • values - between 0 and 1000
    • raster grid geometry
    • projection systems: Default assumed = EPSG:32648
  3. Multi-band GeoTiff hazard raster files with:
    • 3-bands
    • values - in each band between 0 and 255
    • raster grid geometry
    • projection systems: Default assumed = EPSG:32648


  1. Shapefiles whose names show the hazard models and their selected range of values
    • ID - equal to 1
    • geometry - Shapely Polygon outline of selected hazard
convert(threshold, infile, tmpfile_1, outfile)[source]

Convert GeoTiff raster file to Shapefile with geometries based on raster values above a threshold and less than 999

  • threshold - Float value of lower bound of GeoTiff threshold value to be selected
  • infile - String name of input GeoTiff file path
  • tmpfile_1 - String name of tmp file 1
  • outfile - String name of output shapefile
Shapefile with Polygon geometries of rasters based on raster values above a threshold
convert_geotiff_to_vector_with_multibands(band_colors, infile, infile_epsg, tmpfile_1, tmpfile_2, outfile)[source]

Convert multi-band GeoTiff raster file to Shapefile with geometries based on raster band color values

  • band_colors - Tuple with 3-values each corresponding to the values in raster bands
  • infile - String name of input GeoTiff file path
  • infile_epsg - Integer value of EPSG Projection number of raster
  • tmpfile_1 - String name of tmp file 1
  • tmpfile_2 - String name of tmp file 2
  • outfile - String name of output shapefile
Shapefile with Polygon geometries of rasters based on raster band values
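
The band-color selection described above can be sketched in plain Python. This is an illustration only, not the module's implementation: the helper name `mask_by_band_colors` and the nested-list raster are hypothetical, and the sketch assumes a cell is selected only when all three bands match `band_colors` exactly.

```python
def mask_by_band_colors(bands, band_colors):
    """Return a 2D boolean mask, True where all three bands equal band_colors.

    bands is a tuple of three 2D lists (one per raster band, values 0-255).
    Exact matching on all three bands is an assumption of this sketch.
    """
    target = tuple(band_colors)
    band_1, band_2, band_3 = bands
    return [
        [(v1, v2, v3) == target for v1, v2, v3 in zip(row_1, row_2, row_3)]
        for row_1, row_2, row_3 in zip(band_1, band_2, band_3)
    ]


# Hypothetical 2x2 raster with three bands.
bands = (
    [[255, 0], [255, 10]],
    [[190, 0], [190, 20]],
    [[190, 0], [0, 30]],
)
mask = mask_by_band_colors(bands, (255, 190, 190))
print(mask)  # [[True, False], [False, False]]
```

In the real workflow the selected cells would then be polygonised and written to the output shapefile; that step needs a raster library and is left out here.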
convert_geotiff_to_vector_with_threshold(from_threshold, to_threshold, infile, infile_epsg, tmpfile_1, tmpfile_2, outfile)[source]

Convert GeoTiff raster file to Shapefile with geometries based on raster threshold ranges

  • from_threshold - Float value of lower bound of GeoTiff threshold value to be selected
  • to_threshold - Float value of upper bound of GeoTiff threshold value to be selected
  • infile - String name of input GeoTiff file path
  • infile_epsg - Integer value of EPSG Projection number of raster
  • tmpfile_1 - String name of tmp file 1
  • tmpfile_2 - String name of tmp file 2
  • outfile - String name of output shapefile
Shapefile with Polygon geometries of rasters based on raster threshold ranges
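
A minimal sketch of selecting raster cells within a threshold range, using a nested list in place of a real raster. The helper `mask_by_threshold` is hypothetical, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption; the module may handle the bounds differently.

```python
def mask_by_threshold(grid, from_threshold, to_threshold):
    """Return a 2D boolean mask, True where from_threshold <= value < to_threshold.

    The inclusive lower bound and exclusive upper bound are assumptions of this sketch.
    """
    return [[from_threshold <= value < to_threshold for value in row] for row in grid]


# Hypothetical flood-depth grid; 999.0 stands in for an out-of-range value.
grid = [
    [0.0, 0.5, 1.2],
    [2.5, 999.0, 1.0],
]
mask = mask_by_threshold(grid, 1.0, 2.0)
print(mask)  # [[False, False, True], [False, False, True]]
```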
glofris_data_details(file_name, root_dir)[source]

Read names of GLOFRIS files and create attributes

  • file_name - String name of GeoTiff file
  • root_dir - String path to directory of file
df - Pandas DataFrame written to csv file with columns:
  • file_name - String
  • hazard_type - String
  • year - Integer: 2016 or 2030
  • climate_scenario - String: RCP4.5 or RCP8.5 or none
  • probability - Float: 1/(return period)
  • banded - Boolean: True or False
  • bands - Integer
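
The probability column is defined above as 1/(return period). As a small illustration, the hypothetical helper below performs that conversion; how the return period is read from a GLOFRIS file name is not shown because the naming scheme is not described here.

```python
def annual_probability(return_period_years):
    """Convert a return period in years to an annual exceedance probability."""
    if return_period_years <= 0:
        raise ValueError("return period must be positive")
    return 1.0 / return_period_years


print(annual_probability(100))  # 0.01
print(annual_probability(5))    # 0.2
```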

Process hazard data

  1. Specify the paths from where to read and write:
    • Input data
    • Hazard data
  2. Supply input data and parameters
    • Thresholds of flood hazards
    • Values of bands to be selected
    • Color code of multi-band rasters
    • Specific file names that might require some specific operations

Extract projection, number of data bands, and values from raster

  • file_path - String name of input GeoTiff file path
  • counts - Number of bands in raster
  • crs - Projection system of raster
  • data_vals - Numpy array of raster values
raster_rewrite(in_raster, out_raster, nodata)[source]

Rewrite a raster to reproject and change no data value

  • in_raster - String name of input GeoTiff file path
  • out_raster - String name of output GeoTiff file path
  • nodata - Float value of data that is treated as no data
Reproject and replace raster with nodata = -1

vtra.preprocess.create_transport_networks module

Pre-process networks


Creating post-processed transport networks with attributes

From pre-processed input Shapefiles and collected network attributes data

For all Province road networks: [‘Lao Cai’, ‘Binh Dinh’, ‘Thanh Hoa’]

For all transport modes at national scale: [‘road’, ‘rail’, ‘air’, ‘inland’, ‘coastal’, ‘multi’]

Input data requirements

Correct paths to all files and correct input parameters

Edge and node shapefiles for all Province roads and national-scale networks

All Geometries in Edge Shapefiles should be valid LineStrings

All Geometries in Node Shapefiles should be valid Points

  1. Node Shapefiles should contain following column names and attributes
    • node_id - String node ID
    • geometry - Shapely Point geometry of nodes
    • attributes - Multiple types depending upon sector and context
  2. Edge Shapefiles should contain following column names and attributes:
    • edge_id - String Edge ID
    • from_node - String node ID that should be present in node_id column
    • to_node - String node ID that should be present in node_id column
    • geometry - Shapely LineString geometry of edges
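
The requirement that every from_node and to_node appears in the node_id column can be checked before processing. The sketch below uses plain dictionaries in place of shapefile records; the helper name `find_dangling_edges` and the sample IDs are hypothetical.

```python
def find_dangling_edges(node_ids, edges):
    """Return the edge_id of every edge whose from_node or to_node is not a known node."""
    known = set(node_ids)
    return [
        edge["edge_id"]
        for edge in edges
        if edge["from_node"] not in known or edge["to_node"] not in known
    ]


# Hypothetical records standing in for rows of the node and edge shapefiles.
node_ids = ["roadn_1", "roadn_2", "roadn_3"]
edges = [
    {"edge_id": "roade_1", "from_node": "roadn_1", "to_node": "roadn_2"},
    {"edge_id": "roade_2", "from_node": "roadn_2", "to_node": "roadn_9"},
]
print(find_dangling_edges(node_ids, edges))  # ['roade_2']
```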


  1. Excel sheets with post-processed network nodes and edges
  2. Shapefiles with post-processed network nodes and edges
  3. All nodes have the following attributes:
    • node_id - String Node ID
    • name - String name in Vietnamese/English
    • tons - Float assigned cargo freight tonnage using node
    • population - Float assigned passenger/population number using node
    • capacity - Float assigned capacity in tons/passenger numbers/other units
    • geometry - Shapely Point geometry of node
  4. Attributes only present in inland and coastal port nodes
    • port_type - String name of type of port: inland or sea
    • port_class - String name of class of port: class1A (international) or class1 (domestic hub)
  5. All edges have the following attributes:
    • edge_id - String edge ID
    • g_id - Integer edge ID
    • from_node - String node ID that should be present in node_id column
    • to_node - String node ID that should be present in node_id column
    • geometry - Shapely LineString geometry of edge
    • terrain - String name of terrain of edge
    • level - Integer number for edge level: National, Provincial, Local, etc.
    • width - Float width in meters of edge
    • length - Float estimated length in kilometers of edge
    • min_speed - Float estimated minimum speed in km/hr on edge
    • max_speed - Float estimated maximum speed in km/hr on edge
    • min_time - Float estimated minimum time of travel in hours on edge
    • max_time - Float estimated maximum time of travel in hours on edge
    • min_time_cost - Float estimated minimum cost of time in USD on edge
    • max_time_cost - Float estimated maximum cost of time in USD on edge
    • min_tariff_cost - Float estimated minimum tariff cost in USD on edge
    • max_tariff_cost - Float estimated maximum tariff cost in USD on edge
    • vehicle_co - Integer number of daily vehicle counts on edge
  6. Attributes only present in Province and national roads edges
    • surface - String value for surface
    • road_class - Integer between 1 and 6
    • road_cond - String value: paved or unpaved
    • asset_type - String name of type of asset
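
The time attributes above follow from length and speed. As a sketch, assuming min_time is computed from max_speed and max_time from min_speed (time = distance / speed), a hypothetical helper could look like this:

```python
def travel_time_bounds(length_km, min_speed_kmh, max_speed_kmh):
    """Return (min_time, max_time) in hours for an edge.

    The fastest speed gives the shortest time and the slowest speed the longest.
    """
    if min_speed_kmh <= 0 or max_speed_kmh < min_speed_kmh:
        raise ValueError("speeds must satisfy 0 < min_speed <= max_speed")
    return length_km / max_speed_kmh, length_km / min_speed_kmh


min_time, max_time = travel_time_bounds(120.0, 40.0, 60.0)
print(min_time, max_time)  # 2.0 3.0
```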


  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.

Pre-process networks

  1. Specify the paths from where to read and write:
    • Input data
    • Intermediate calculations data
    • Output results
  2. Supply input data and parameters
    • Paths of the mode files: List of tuples of strings
    • Names of modes: List of strings
    • Unit weight of vehicle assumed for each mode: List of float types
    • Ranges of usage factors for each mode to represent uncertainty in cost estimations: List of tuples of float types
    • Ranges of speeds for each mode to represent uncertainty in speeds: List of tuples of float types
    • Names of all industry sectors and crops in VITRANSS2 and IFPRI datasets: List of string types
    • Names of commodity/industry columns for which min-max tonnage column names already exist: List of string types
    • Percentage of OD flow we want to send along path: Float type
  3. Give the paths to the input data files:
    • Pre-processed network shapefiles
    • Costs of modes Excel file
    • Road properties Excel file
  4. Specify the output files and paths to be created

vtra.preprocess.cvts module

vtra.preprocess.cvts_speeds module


Traffic speed assignment script vehicle_id, edge_path, time_stamp

vtra.preprocess.national_modes_od_creation module

Pre-process OD matrix


Create national scale OD matrices at node and province levels from:
  • VITRANSS2 province-scale OD data
  • IFPRI crop data at 1km resolution

Input data requirements

  1. Correct paths to all files and correct input parameters
  2. Excel file with VITRANSS2 OD data with columns:
    • o - Integer values of Origin Province ID
    • d - Integer values of Destination Province ID
    • industry and crop names - Float values of tons between OD Provinces
    • modes - Float values to infer modal split values for OD Provinces
  3. Geotiff files with IFPRI crop data:
    • tons - Float values of production tonnage at each grid cell
    • geometry - Raster grid cell geometry
  4. Shapefile of RiceAtlas data:
    • month production columns - tonnage of rice for each month
    • geometry - Shapely Polygon geometry of Provinces
  5. Shapefile of Provinces
    • od_id - Integer Province ID corresponding to OD ID
    • name_eng - String name of Province in English
    • geometry - Shapely Polygon geometry of Provinces
  6. Shapefile of Communes
    • population - Float values of populations in Communes
    • geometry - Shapely Polygon geometry of Communes
  7. Shapefiles of network nodes
    • node_id - String node ID
    • geometry - Shapely point geometry of nodes
  8. Shapefiles of network edges
    • vehicle_co - Count of vehicles only for roads
    • geometry - Shapely LineString geometry of edges


  1. Excel workbook with sheet of mode-wise and total OD flows
    • origin - String node ID of origin node
    • destination - String node ID of destination node
    • o_region - String names of origin Province
    • d_region - String names of destination Province
    • commodity_names - Float values of daily tonnages of commodities/industries between OD nodes from VITRANSS2 and IFPRI data (except rice)
    • min_rice - Float values of minimum daily tonnages of rice between OD nodes
    • max_rice - Float values of maximum daily tonnages of rice between OD nodes
    • min_tons - Float values of minimum daily tonnages between OD nodes
    • max_tons - Float values of maximum daily tonnages between OD nodes


  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.
assign_crop_od_flows_to_nodes(national_ods_df, province_path, province_name_col, calc_path, crop_data_path, crop_names, rice_prod_months, modes_df, modes, od_fracs_crops, o_id_col, d_id_col)[source]

Assign IFPRI crop values to OD nodes based on VITRANSS 2 OD distributions

  • Based on VITRANSS 2 OD distributions
  • national_ods_df - List of lists of Pandas dataframes
  • province_path - Path of province shapefile
  • province_name_col - String name of column containing province names
  • calc_path - Path to store intermediary calculations
  • crop_data_path - Path to crop datasets
  • crop_names - List of string of crop names in IFPRI datasets
  • rice_prod_months - Geopandas dataframe of RiceAtlas crop values with minimum and maximum monthly production as fraction of annual production
  • modes_df - List of Geopandas dataframes with nodes of each transport mode
  • modes - List of strings of names of transport modes
  • od_fracs_crops - Pandas dataframe of crop OD distributions and modal splits as per VITRANSS 2 data
  • o_id_col - String name of Origin province ID column
  • d_id_col - String name of Destination province ID column
  • Outputs
    national_ods_df - List of Lists of Pandas dataframes with columns:
    • origin - Origin node ID
    • o_region - Origin province name
    • destination - Destination node ID
    • d_region - Destination province name
    • min_crop - Minimum Tonnage values for the named crop
    • max_crop - Maximum Tonnage values for the named crop
assign_daily_min_max_tons_rice(crop_df, rice_prod_df)[source]

Estimate minimum and maximum daily rice tonnages

  • crop_df - Geopandas dataframe of crop points with annual tonnages
  • rice_prod_df - Geopandas dataframe of RiceAtlas crop values with minimum and maximum monthly production as fraction of annual production
crop_df - Geopandas dataframe of crop points with new columns:
  • min_rice - Minimum daily rice tonnages
  • max_rice - Maximum daily rice tonnages
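
As a rough illustration of how daily bounds could follow from annual tonnage and monthly production fractions, the sketch below scales the annual value by the minimum and maximum monthly fractions and divides by an assumed 30-day month. Both the helper name and the 30-day divisor are assumptions, not the function's confirmed logic.

```python
DAYS_PER_MONTH = 30  # assumed month length for this sketch


def daily_min_max_rice(annual_tons, min_frac, max_frac):
    """Return (min_rice, max_rice) daily tonnages.

    min_frac and max_frac are the smallest and largest monthly shares of
    annual production, as described for the RiceAtlas values.
    """
    min_rice = annual_tons * min_frac / DAYS_PER_MONTH
    max_rice = annual_tons * max_frac / DAYS_PER_MONTH
    return min_rice, max_rice


# 4800 tons per year, with 6.25% in the leanest month and 25% in the peak month.
print(daily_min_max_rice(4800.0, 0.0625, 0.25))  # (10.0, 40.0)
```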
assign_industry_od_flows_to_nodes(national_ods_df, ind_cols, modes_df, modes, od_fracs, o_id_col, d_id_col)[source]

Assign VITRANSS 2 OD flows to nodes

  • national_ods_df - List of lists of Pandas dataframes
  • ind_cols - List of strings of names of industry columns
  • modes_df - List of Geopandas dataframes with nodes of each transport mode
  • modes - List of strings of names of transport modes
  • od_fracs - Pandas dataframe of Industry OD flows and modal splits
  • o_id_col - String name of Origin province ID column
  • d_id_col - String name of Destination province ID column
national_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • o_region - Origin province name
  • destination - Destination node ID
  • d_region - Destination province name
  • ind - Tonnage values for the named industry
assign_node_weights_by_commune_population_proximity(commune_path, nodes, commune_pop_col)[source]

Assign weights to nodes based on their nearest commune populations

  • By finding the communes that intersect with the Voronoi extents of nodes
  • commune_path - Path of commune shapefile
  • nodes_in - Path of nodes shapefile
  • commune_pop_col - String name of column containing commune population values
  • nodes - Geopandas dataframe of nodes with new column called weight
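
A simplified sketch of population-based node weights. Instead of intersecting commune polygons with Voronoi extents, it assigns each commune's population to the nearest node using a single representative point per commune, which is what membership of a Voronoi cell means for a point. The helper name, coordinates, and the use of points rather than polygons are all assumptions.

```python
import math


def nearest_node_weights(nodes, communes):
    """Sum commune populations onto their nearest node.

    nodes maps node_id -> (x, y); communes is a list of ((x, y), population).
    """
    weights = {node_id: 0.0 for node_id in nodes}
    for (cx, cy), population in communes:
        nearest = min(
            nodes,
            key=lambda node_id: math.hypot(nodes[node_id][0] - cx, nodes[node_id][1] - cy),
        )
        weights[nearest] += population
    return weights


# Hypothetical nodes and commune points in projected coordinates.
nodes = {"n1": (0.0, 0.0), "n2": (10.0, 0.0)}
communes = [((1.0, 2.0), 500.0), ((9.0, 1.0), 300.0), ((2.0, -1.0), 200.0)]
print(nearest_node_weights(nodes, communes))  # {'n1': 700.0, 'n2': 300.0}
```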
assign_province_name_id_to_nodes(province_path, nodes_in, province_name_col, province_id_col)[source]

Match the nodes to their province names and province IDs

  • By finding the province that contains or is nearest to the node
  • province_path - Path of province shapefile
  • nodes_in - Path of nodes shapefile
  • province_name_col - String name of column containing province names
  • province_id_col - String name of column containing province IDs that match VITRANSS 2 OD IDs
  • nodes - Geopandas dataframe of nodes with new columns called province_name and od_id
assign_road_weights(nodes, edges_in, aadt_column)[source]

Assign weights to nodes on the road network

  • By finding the total AADT counts converging on the node
  • nodes - Geopandas dataframe of nodes
  • edges_in - Path of edges shapefile
  • aadt_column - String name of column containing AADT values
  • nodes - Geopandas dataframe of nodes with new column called weight
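
The idea of summing the AADT counts that converge on a node can be sketched with plain tuples in place of the edges shapefile. The helper name and sample values are hypothetical; each edge's count is added to both of its end nodes.

```python
from collections import defaultdict


def node_weights_from_aadt(edges):
    """Return node_id -> total AADT of all edges touching that node.

    edges is a list of (from_node, to_node, aadt) tuples.
    """
    weights = defaultdict(float)
    for from_node, to_node, aadt in edges:
        weights[from_node] += aadt
        weights[to_node] += aadt
    return dict(weights)


edges = [("n1", "n2", 1000), ("n2", "n3", 500)]
print(node_weights_from_aadt(edges))  # {'n1': 1000.0, 'n2': 1500.0, 'n3': 500.0}
```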
ifpri_crop_od_split(vitranss2_od_data_file, modes, o_id_col, d_id_col, crop_cols)[source]

Create IFPRI crop OD modal split estimates from VITRANSS2 OD values

  • Combine the crop-wise OD values with the mode-wise OD values
  • Estimate the OD modal-split from the mode-wise OD values
  • vitranss2_od_data_file - Excel file with VITRANSS2 OD data
  • modes - List of strings of mode types, e.g. [‘road’, ‘rail’, ‘air’, ‘inland’, ‘coastal’]
  • o_id_col - String name of Origin column
  • d_id_col - String name of Destination column
  • crop_cols - List of strings of crop names in VITRANSS 2 OD data
  • od_fracs_crops - Pandas dataframe of VITRANSS2 crop OD data with modal splits

Pre-process national OD matrix

  1. Specify the paths from where to read and write:
    • Input data
    • Intermediate calculations data
    • Output results
  2. Supply input data and parameters
    • Names of modes: List of strings
    • OD column names in OD file: String types
    • Names of industry columns in VITRANSS2 data: List of string types
    • Names of crop columns in VITRANSS2 data: List of string types
    • Names of crops in IFPRI crop data: List of string types
    • Names of months in Rice Atlas data: List of string types
  3. Give the paths to the input data files:
    • Network nodes files
    • VITRANSS 2 OD file
    • IFPRI crop data files
    • Rice Atlas data shapefile
    • Province boundary and stats data shapefile
    • Commune boundary and stats data shapefile
  4. Specify the output files and paths to be created
riceatlas_crop_minmax(riceatlas_crop_file, crop_month_fields)[source]

Create MIN_MAX fractions of rice production estimates for provinces

  • By reading data from the RiceAtlas data
  • And estimating the minimum and maximum monthly values > 0
  • riceatlas_crop_file - Shapefile with RiceAtlas data
  • crop_month_fields - List of strings of names of columns indicating rice monthly production
rice_prod_months - Geopandas dataframe of RiceAtlas crop values
  • min_frac - minimum month of production as fraction of annual production
  • max_frac - maximum month of production as fraction of annual production
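
A minimal sketch of the min/max fraction calculation described above, for a single province. It takes the smallest and largest monthly values greater than zero and divides each by the annual total. The helper name and the zero fallback for provinces with no production are assumptions.

```python
def monthly_min_max_fractions(monthly_tons):
    """Return (min_frac, max_frac) of annual production, ignoring zero months."""
    annual = sum(monthly_tons)
    positive = [tons for tons in monthly_tons if tons > 0]
    if annual <= 0 or not positive:
        return 0.0, 0.0  # assumed fallback when there is no production
    return min(positive) / annual, max(positive) / annual


# Hypothetical monthly rice production (tons) for one province.
monthly = [0, 0, 100, 300, 0, 0, 0, 200, 400, 0, 0, 0]
print(monthly_min_max_fractions(monthly))  # (0.1, 0.4)
```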
vitranss2_od_split(vitranss2_od_data_file, modes, o_id_col, d_id_col)[source]

Create VITRANSS2 OD modal split estimates

  • Combine the commodity-wise OD values with the mode-wise OD values
  • Estimate the OD modal-split from the mode-wise OD values
  • vitranss2_od_data_file - Excel file with VITRANSS2 OD data
  • modes - List of strings of mode types, e.g. [‘road’, ‘rail’, ‘air’, ‘inland’, ‘coastal’]
  • o_id_col - String name of Origin column
  • d_id_col - String name of Destination column
  • od_fracs - Pandas dataframe of VITRANSS 2 commodity OD data with modal splits
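
Estimating a modal split from mode-wise OD values amounts to dividing each mode's tonnage by the total across modes for that OD pair. The sketch below shows this for one OD pair; the helper name and the all-zero fallback are assumptions.

```python
def modal_split(mode_tons):
    """Return mode -> share of total tonnage for a single OD pair."""
    total = sum(mode_tons.values())
    if total == 0:
        return {mode: 0.0 for mode in mode_tons}  # assumed fallback for empty OD pairs
    return {mode: tons / total for mode, tons in mode_tons.items()}


# Hypothetical mode-wise tonnages for one origin-destination pair.
od_tons = {"road": 60, "rail": 20, "air": 0, "inland": 10, "coastal": 10}
print(modal_split(od_tons))
# {'road': 0.6, 'rail': 0.2, 'air': 0.0, 'inland': 0.1, 'coastal': 0.1}
```

The resulting shares can then be multiplied by each commodity's OD tonnage to obtain mode-specific flows.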

vtra.preprocess.province_roads_access_od_creation module

Pre-process accessibility-based provincial OD matrix


Create province scale OD matrices between roads connecting villages to nearest communes:
  • Net revenue estimates of commune villages
  • IFPRI crop data at 1km resolution

Input data requirements

  1. Correct paths to all files and correct input parameters
  2. Geotiff files with IFPRI crop data:
    • tons - Float values of production tonnage at each grid cell
    • geometry - Raster grid cell geometry
  3. Shapefile of RiceAtlas data:
    • month production columns - tonnage of rice for each month
    • geometry - Shapely Polygon geometry of Provinces
  4. Shapefile of Provinces
    • od_id - Integer Province ID corresponding to OD ID
    • name_eng - String name of Province in English
    • geometry - Shapely Polygon geometry of Provinces
  5. Shapefile of Communes
    • population - Float values of populations in Communes
    • nfrims - Float values of number of firms in Provinces
    • netrevenue - Float values of Net Revenue in Provinces
    • argi_prop - Float values of proportion of agriculture firms in Provinces
    • geometry - Shapely Polygon geometry of Communes
  6. Shapefiles of network nodes
    • node_id - String node ID
    • geometry - Shapely point geometry of nodes
  7. Shapefiles of network edges
    • vehicle_co - Count of vehicles only for roads
    • geometry - Shapely LineString geometry of edges
  8. Shapefiles of Commune center points
    • object_id - Integer ID of point
    • geometry - Shapely point geometry of points
  9. Shapefiles of Village center points
    • object_id - Integer ID of points
    • geometry - Shapely point geometry of points


  1. Excel workbook with sheet of province-wise OD flows
    • origin - String node ID of origin node
    • destination - String node ID of destination node
    • crop_names - Float values of daily tonnages of IFPRI crops (except rice) between OD nodes
    • min_rice - Float values of minimum daily tonnages of rice between OD nodes
    • max_rice - Float values of maximum daily tonnages of rice between OD nodes
    • min_croptons - Float values of minimum daily tonnages of crops between OD nodes
    • max_croptons - Float values of maximum daily tonnages of crops between OD nodes
    • min_agrirev - Float value of Minimum daily revenue of agriculture firms between OD nodes
    • max_agrirev - Float value of Maximum daily revenue of agriculture firms between OD nodes
    • min_noagrirev - Float value of Minimum daily revenue of non-agriculture firms between OD nodes
    • max_noagrirev - Float value of Maximum daily revenue of non-agriculture firms between OD nodes
    • min_netrev - Float value of Minimum daily revenue of all firms between OD nodes
    • max_netrev - Float value of Maximum daily revenue of all firms between OD nodes


  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.
assign_io_rev_costs_crops(x, cost_dataframe, rice_prod_file, crop_month_fields, province, x_cols, ex_rate)[source]

Assign crop tonnages to daily net revenues

  • x - Pandas DataFrame of values
  • cost_dataframe - Pandas DataFrame of conversion of tonnages to net revenues
  • rice_prod_file - Shapefile of RiceAtlas monthly production value
  • province - String name of province
  • x_cols - List of string names of crops
  • ex_rate - Exchange rate from VND millions to USD
  • min_croprev - Float value of Minimum daily revenue of crops
  • max_croprev - Float value of Maximum daily revenue of crops
assign_monthly_tons_crops(x, rice_prod_file, crop_month_fields, province, x_cols)[source]

Assign crop tonnages to OD pairs

  • x - Pandas DataFrame of values
  • rice_prod_file - Shapefile of RiceAtlas monthly production value
  • crop_month_fields - List of strings of month columns in Rice Atlas shapefile
  • province - String name of province
  • x_cols - List of string names of crops
  • min_croptons - Float value of Minimum daily tonnages of crops
  • max_croptons - Float value of Maximum daily tonnages of crops
crop_od_pairs(start_points, end_points, crop_name)[source]

Assign crop tonnages to OD pairs

  • start_points - GeoDataFrame of start points for Origins
  • end_points - GeoDataFrame of potential end points for Destinations
  • crop_name - String name of crop
od_pairs_df - Pandas DataFrame with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • crop - Tonnage values for the named crop
  • netrev_argi - Daily Net revenue of agriculture firms in USD
  • netrev_noargi - Daily Net revenue of non-agriculture firms in USD
crop_values_to_province_od_nodes(province_ods_df, province_geom, calc_path, crop_data_path, crop_names, nodes, sindex_nodes, prov_commune_center, sindex_commune_center, node_id, object_id)[source]

Assign IFPRI crop values to OD nodes in provinces

  • Based on finding nearest nodes to crop production sites as Origins
  • And finding nearest commune centers as Destinations
  • province_ods_df - List of lists of Pandas dataframes
  • province_geom - Shapely Geometry of province
  • calc_path - Path to store intermediary calculations
  • crop_data_path - Path to crop datasets
  • crop_names - List of string of crop names in IFPRI datasets
  • nodes - GeoDataFrame of province road nodes
  • sindex_nodes - Spatial index of province road nodes
  • prov_commune_center - GeoDataFrame of province commune center points
  • sindex_commune_center - Spatial index of commune center points
  • node_id - String name of Node ID column
  • object_id - String name of commune ID column
province_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • crop - Tonnage values for the named crop
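
The nearest-node and nearest-commune-center matching described above can be sketched without a spatial index. This is a simplified illustration: the helper names are hypothetical, distances are straight-line, the destination is assumed to be the commune center nearest to the crop site, and destinations are labelled with commune-center IDs rather than node IDs.

```python
import math
from collections import defaultdict


def nearest(point, candidates):
    """Return the ID of the candidate closest to point; candidates maps id -> (x, y)."""
    px, py = point
    return min(
        candidates,
        key=lambda cid: math.hypot(candidates[cid][0] - px, candidates[cid][1] - py),
    )


def crop_od_tonnages(crop_sites, nodes, commune_centers):
    """Aggregate crop tonnage per (origin node, destination commune center)."""
    flows = defaultdict(float)
    for location, tons in crop_sites:
        origin = nearest(location, nodes)
        destination = nearest(location, commune_centers)
        flows[(origin, destination)] += tons
    return dict(flows)


# Hypothetical road nodes, commune centers, and crop production sites.
nodes = {"n1": (0, 0), "n2": (10, 0)}
commune_centers = {"c1": (0, 5), "c2": (10, 5)}
crop_sites = [((1, 1), 20.0), ((2, 0), 5.0), ((9, 1), 7.5)]
print(crop_od_tonnages(crop_sites, nodes, commune_centers))
# {('n1', 'c1'): 25.0, ('n2', 'c2'): 7.5}
```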

Pre-process provincial-scale OD

  1. Specify the paths from where to read and write:
    • Input data
    • Intermediate calculations data
    • Output results
  2. Supply input data and parameters
    • Names of the Provinces: List of strings
    • Exchange rate to convert 2012 Net revenue in million VND values to USD in 2016
    • Names of crops in IFPRI crop data
    • Names of months in Rice Atlas data
    • Name of column for netrevenue of communes in VND millions
    • Name of column for number of firms in communes
    • Name of column for proportion of agriculture firms in communes
    • Name of Node ID column
    • Name of commune ID column
  3. Give the paths to the input data files:
    • Network nodes files
    • IFPRI crop data files
    • Rice Atlas data shapefile
    • Province boundary and stats data shapefile
    • Commune boundary and stats data shapefile
    • Population points shapefile for locations of villages
    • Commune center points shapefile
netrev_od_pairs(start_points, end_points)[source]

Assign net revenue values to OD pairs

  • start_points - GeoDataFrame of start points for Origins
  • end_points - GeoDataFrame of potential end points for Destinations
od_pairs_df - Pandas DataFrame with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • netrev_argi - Net revenue of agriculture firms
  • netrev_noargi - Net revenue of non-agriculture firms
netrevenue_values_to_province_od_nodes(province_ods_df, prov_communes, commune_sindex, netrevenue, n_firms, agri_prop, prov_pop, prov_pop_sindex, nodes, sindex_nodes, prov_commune_center, sindex_commune_center, node_id, object_id, exchange_rate)[source]

Assign commune level netrevenue values to OD nodes in provinces

  • Based on finding nearest nodes to village points with netrevenues as Origins
  • And finding nearest commune centers as Destinations
  • province_ods_df - List of lists of Pandas dataframes
  • prov_communes - GeoDataFrame of commune level statistics
  • commune_sindex - Spatial index of communes
  • netrevenue - String name of column for netrevenue of communes in VND millions
  • n_firms - String name of column for number of firms in communes
  • agri_prop - String name of column for proportion of agriculture firms in communes
  • prov_pop - GeoDataFrame of population points in Province
  • prov_pop_sindex - Spatial index of population points in Province
  • nodes - GeoDataFrame of province road nodes
  • sindex_nodes - Spatial index of province road nodes
  • prov_commune_center - GeoDataFrame of province commune center points
  • sindex_commune_center - Spatial index of commune center points
  • node_id - String name of Node ID column
  • object_id - String name of commune ID column
  • exchange_rate - Float value for exchange rate from VND million to USD
province_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • netrev_argi - Net revenue of agriculture firms
  • netrev_noargi - Net revenue of non-agriculture firms

vtra.preprocess.transport_network_inputs module

Utility functions for transport networks


Helper functions to create post-processed networks with attributes from specific types of input datasets


  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.

Assign asset types to roads assets in Vietnam

The types are assigned based on our understanding of:

  1. The reported asset code in the data

x - Pandas DataFrame with numeric asset code
asset type - Which is either of (Bridge, Dam, Culvert, Tunnel, Spillway, Road)
assign_asset_type_to_province_roads_from_file(asset_code, asset_type_list)[source]

Assign asset types to roads assets in Vietnam based on values in file

The types are assigned based on our understanding of:

  1. The reported asset code in the data

  • asset_code - Numeric value for code of asset
  • asset_type_list - List of Strings with names of asset types
asset_type - String name of type of asset
assign_assumed_width_to_national_roads_from_file(x, flat_width_range_list, mountain_width_range_list)[source]

Assign widths to national roads assets in Vietnam

The widths are assigned based on our understanding of:

  1. The class of the road which is not reliable
  2. The number of lanes
  3. The terrain of the road

  • x - Pandas DataFrame row with values
    • road_class - Integer value of road class
    • lanenum__s - Integer value of number of lanes on road
  • flat_width_range_list - List of tuples containing (from_width, to_width, assumed_width)
  • mountain_width_range_list - List of tuples containing (from_width, to_width, assumed_width)
assumed_width - Float assigned width of the road asset based on design specifications

Assign widths to Province roads assets in Vietnam

x : int value for width of asset
int assigned width of the road asset based on design specifications
assign_assumed_width_to_province_roads_from_file(asset_width, width_range_list)[source]

Assign widths to Province roads assets in Vietnam

The widths are assigned based on our understanding of:

  1. The reported width in the data which is not reliable
  2. A design specification based understanding of the assumed width based on ranges of values
  • asset_width - Numeric value for width of asset
  • width_range_list - List of tuples containing (from_width, to_width, assumed_width)
assumed_width - assigned width of the road asset based on design specifications
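
A range lookup over (from_width, to_width, assumed_width) tuples can be sketched as below. The helper name `lookup_assumed_width`, the sample ranges, the inclusive bounds, and the fallback of returning the reported width when no range matches are all assumptions for illustration.

```python
def lookup_assumed_width(asset_width, width_range_list):
    """Return the assumed width for the first range containing asset_width.

    Each entry of width_range_list is (from_width, to_width, assumed_width).
    Bounds are treated as inclusive; unmatched widths are returned unchanged.
    """
    for from_width, to_width, assumed_width in width_range_list:
        if from_width <= asset_width <= to_width:
            return assumed_width
    return asset_width


# Hypothetical design ranges in meters.
width_ranges = [(0.0, 4.25, 3.5), (4.25, 6.0, 5.0)]
print(lookup_assumed_width(4.0, width_ranges))  # 3.5
print(lookup_assumed_width(5.5, width_ranges))  # 5.0
print(lookup_assumed_width(9.0, width_ranges))  # 9.0
```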
assign_min_max_speeds_to_national_roads_from_file(x, flat_width_range_list, mountain_width_range_list)[source]

Assign speeds to national roads in Vietnam

The speeds are assigned based on our understanding of:

  1. The class of the road
  2. The estimated speed from the CVTS data
  3. The terrain of the road

x - Pandas DataFrame of values
  • road_class - Integer value of road class
  • terrain - String value of road terrain
  • est_speed - Float value of estimated speed from CVTS data
  • flat_width_range_list - List of tuples containing design speeds
  • mountain_width_range_list - List of tuples containing design speeds
  • Float minimum assigned speed in km/hr
  • Float maximum assigned speed in km/hr
assign_minmax_tariff_costs_multi_modal_apply(x, cost_dataframe)[source]

Assign tariff costs on multi-modal network links in Vietnam

  • x - Pandas dataframe with values
    • port_type - String name of port type
    • from_mode - String name of mode
    • to_mode - String name of mode
    • other_mode - String name of mode
  • cost_dataframe - Pandas Dataframe with costs
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
assign_minmax_tariff_costs_national_roads_apply(x, cost_dataframe)[source]

Assign tariff costs on national roads in Vietnam

The costs are assigned based on our understanding of:

  1. The vehicle counts on roads
  • x - Pandas dataframe with values
    • vehicle_co - Count of number of vehicles on road
  • cost_dataframe - Pandas Dataframe with costs
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
assign_minmax_tariff_costs_networks_apply(x, cost_dataframe)[source]

Assign tariff costs on networks in Vietnam

  • x - Pandas dataframe with values
    • length - Float length of edge in km
  • cost_dataframe - Pandas Dataframe with costs
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
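As a rough sketch of how a length-based tariff could yield costs in USD/ton, the example below multiplies the edge length by a per-ton-km rate. The helper name `tariff_costs_sketch` and the `min_rate`/`max_rate` arguments are hypothetical; how such rates are actually stored in `cost_dataframe` is not specified here. Plain dicts stand in for DataFrame rows, since the function only uses key lookups.

```python
def tariff_costs_sketch(x, min_rate, max_rate):
    """Return (min_tariff_cost, max_tariff_cost) in USD/ton for one edge.

    x is any row-like object supporting x['length'] (length in km).
    min_rate and max_rate are hypothetical tariffs in USD per ton-km.
    """
    return min_rate * x['length'], max_rate * x['length']


# Plain dicts stand in for DataFrame rows; the numbers are placeholders
edges = [{'length': 10.0}, {'length': 2.5}]
costs = [tariff_costs_sketch(edge, 0.5, 1.0) for edge in edges]
# Unpack the (min, max) pairs into two column-like sequences
min_tariff_cost, max_tariff_cost = zip(*costs)
```

With pandas, the same row function can be passed to `DataFrame.apply(..., axis=1)`, and the resulting pairs unpacked into two columns with `zip(*...)` in the same way.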
assign_minmax_tariff_costs_province_roads_apply(x, cost_dataframe)[source]

Assign tariff costs on Province roads in Vietnam

The costs are assigned based on our understanding of:

  1. The types of assets
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
  • x - Pandas dataframe with values
    • code - Numeric code for type of asset
    • level - Numeric code for level of asset
    • terrain - String value of the terrain of asset
  • cost_dataframe - Pandas Dataframe with costs
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
assign_minmax_time_costs_national_roads_apply(x, cost_dataframe)[source]

Assign time costs on national roads in Vietnam

The costs are assigned based on our understanding of:

  1. The vehicle counts on roads
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
  • x - Pandas dataframe with values
    • vehicle_co - Count of number of vehicles on road
    • code - Numeric code for type of asset
    • level - Numeric code for level of asset
    • terrain - String value of the terrain of asset
    • length - Float length of edge in km
    • min_speed - Float minimum assigned speed in km/hr
    • max_speed - Float maximum assigned speed in km/hr
  • cost_dataframe - Pandas Dataframe with costs
  • min_time_cost - Float minimum assigned cost of time in USD
  • max_time_cost - Float maximum assigned cost of time in USD
assign_minmax_time_costs_networks_apply(x, cost_dataframe)[source]

Assign time costs on networks in Vietnam

  • x - Pandas dataframe with values
    • length - Float length of edge in km
    • min_speed - Float minimum assigned speed in km/hr
    • max_speed - Float maximum assigned speed in km/hr
  • cost_dataframe - Pandas Dataframe with costs
  • min_time_cost - Float minimum assigned cost of time in USD
  • max_time_cost - Float maximum assigned cost of time in USD
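The relationship between length, speed, and time cost can be illustrated with a short sketch. It assumes that the cost of time is a single hourly value (`cost_per_hour`, a hypothetical parameter standing in for whatever `cost_dataframe` provides) and that travel time is simply length divided by speed. The helper name is also hypothetical.

```python
def time_costs_sketch(x, cost_per_hour):
    """Return (min_time_cost, max_time_cost) in USD for one edge.

    x supports x['length'] (km), x['min_speed'] and x['max_speed'] (km/hr).
    cost_per_hour is a hypothetical value of time in USD per hour.
    """
    # The fastest speed gives the shortest travel time, hence the minimum cost
    min_time_cost = cost_per_hour * x['length'] / x['max_speed']
    max_time_cost = cost_per_hour * x['length'] / x['min_speed']
    return min_time_cost, max_time_cost


# A plain dict stands in for a DataFrame row; the values are placeholders
edge = {'length': 120.0, 'min_speed': 40.0, 'max_speed': 60.0}
min_cost, max_cost = time_costs_sketch(edge, 2.0)
```

Note the inversion: `max_speed` feeds the minimum cost and `min_speed` the maximum cost.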
assign_minmax_time_costs_province_roads_apply(x, cost_dataframe)[source]

Assign time costs on Province roads in Vietnam

The costs are assigned based on our understanding of:

  1. The types of assets
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
  • x - Pandas dataframe with values
    • code - Numeric code for type of asset
    • level - Numeric code for level of asset
    • terrain - String value of the terrain of asset
    • length - Float length of edge in km
    • min_speed - Float minimum assigned speed in km/hr
    • max_speed - Float maximum assigned speed in km/hr
  • cost_dataframe - Pandas Dataframe with costs
  • min_time_cost - Float minimum assigned cost of time in USD
  • max_time_cost - Float maximum assigned cost of time in USD

Assign travel speeds to road assets in Vietnam

The speeds are assigned based on our understanding of:

  1. The types of assets
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
x - Pandas dataframe with values
  • code - Numeric code for type of asset
  • level - Numeric code for level of asset
  • terrain - String value of the terrain of asset
  • Float minimum assigned speed in km/hr
  • Float maximum assigned speed in km/hr

Assign road speeds to national roads

x - Pandas DataFrame of values
  • capkth__ca - String value of road class
  • vehicle_co - Float value of number of vehicles on road
  • Integer value of road class

Assign road conditions as paved or unpaved to national roads

x - Pandas DataFrame of values
String value of road as paved or unpaved

Assign terrain as flat or mountain to national roads

x - Pandas DataFrame of values
String value of terrain as flat or mountain

Assign road conditions as paved or unpaved to Province roads

x - Pandas DataFrame of values
  • code - Numeric code for type of asset
  • level - Numeric code for level of asset
String value as paved or unpaved
create_port_names(x, port_names_df)[source]

Add port names in Vietnamese to port data

  • x - Pandas DataFrame with values
    • port_type - String type of port
    • cangbenid - Integer ID of inland port
    • objectid - Integer ID of sea port
  • port_names_df - Pandas DataFrame with port names
name - Vietnamese name of port
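One way to picture the name lookup is sketched below. It is an illustration under several assumptions: the `port_type` values `'inland'` and `'sea'`, the helper name `port_name_sketch`, the empty-string fallback, and the use of a plain dict keyed by `(port_type, ID)` in place of `port_names_df` are all hypothetical. Only the pairing of `cangbenid` with inland ports and `objectid` with sea ports comes from the parameter descriptions above.

```python
def port_name_sketch(x, names_by_key):
    """Return the port name for one row, or '' when no name is known.

    names_by_key is a hypothetical dict keyed by (port_type, ID); the real
    function receives a DataFrame (port_names_df) instead.
    """
    # Inland ports are identified by cangbenid, sea ports by objectid
    port_id = x['cangbenid'] if x['port_type'] == 'inland' else x['objectid']
    return names_by_key.get((x['port_type'], port_id), '')


# Placeholder IDs and names, not real port records
names_by_key = {('inland', 12): 'Inland port A', ('sea', 3): 'Sea port B'}
inland_row = {'port_type': 'inland', 'cangbenid': 12, 'objectid': 0}
sea_row = {'port_type': 'sea', 'cangbenid': 0, 'objectid': 3}
inland_name = port_name_sketch(inland_row, names_by_key)
sea_name = port_name_sketch(sea_row, names_by_key)
```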
multi_modal_shapefile_to_dataframe(edges_in, mode_properties_file, mode_name, length_threshold, usage_factors)[source]

Create multi-modal network dataframe from inputs

  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • length_threshold - Float value of threshold in km of length of multi-modal links
  • usage_factors - Tuple of 2-float values between 0 and 1
edges - Geopandas DataFrame with network edge topology and attributes
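Since the usage factors must be a tuple of two values between 0 and 1, a caller may want to check them before building a network. The helper below is a hypothetical convenience, not part of the package; treating the bounds as inclusive is an assumption.

```python
def check_usage_factors(usage_factors):
    """Validate a sequence of two numbers, each between 0 and 1 inclusive.

    Hypothetical helper: returns the factors as a tuple of floats.
    """
    if len(usage_factors) != 2:
        raise ValueError('usage_factors must contain exactly 2 values')
    for factor in usage_factors:
        if not 0 <= factor <= 1:
            raise ValueError('each usage factor must be between 0 and 1')
    return tuple(float(factor) for factor in usage_factors)


checked = check_usage_factors((0, 0.5))
```

Failing early with a `ValueError` keeps an out-of-range factor from propagating silently into the edge attributes.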
multi_modal_shapefile_to_network(edges_in, mode_properties_file, mode_name, length_threshold, utilization_factors)[source]

Create multi-modal igraph network from inputs

  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • length_threshold - Float value of threshold in km of length of multi-modal links
  • utilization_factors - Tuple of 2-float values between 0 and 1
G - Igraph object with network edge topology and attributes
national_road_shapefile_to_dataframe(edges_in, road_properties_file, usage_factors)[source]

Create national network dataframe from inputs

  • edges_in - String path to edges file/network Shapefile
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
edges - Geopandas DataFrame with network edge topology and attributes
national_road_shapefile_to_network(edges_in, road_properties_file, usage_factors)[source]

Create national igraph network from inputs

  • edges_in - String path to edges file/network Shapefile
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
G - Igraph object with network edge topology and attributes
network_shapefile_to_dataframe(edges_in, mode_properties_file, mode_name, speed_min, speed_max, usage_factors)[source]

Create network dataframe from inputs

  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • speed_min - Float value of minimum assigned speed
  • speed_max - Float value of maximum assigned speed
  • usage_factors - Tuple of 2-float values between 0 and 1
edges - Geopandas DataFrame with network edge topology and attributes
network_shapefile_to_network(edges_in, mode_properties_file, mode_name, speed_min, speed_max, utilization_factors)[source]

Create igraph network from inputs

  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • speed_min - Float value of minimum assigned speed
  • speed_max - Float value of maximum assigned speed
  • utilization_factors - Tuple of 2-float values between 0 and 1
G - Igraph object with network edge topology and attributes
province_shapefile_to_dataframe(edges_in, road_terrain, road_properties_file, usage_factors)[source]

Create province network dataframe from inputs

  • edges_in - String path to edges file/network Shapefile
  • road_terrain - String name of terrain: flat or mountainous
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
edges - Geopandas DataFrame with network edge topology and attributes
province_shapefile_to_network(edges_in, road_terrain, road_properties_file, usage_factors)[source]

Create province igraph network from inputs

  • edges_in - String path to edges file/network Shapefile
  • road_terrain - String name of terrain: flat or mountainous
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
G - Igraph object with network edge topology and attributes
read_setor_nodes(node_file_with_ids, sector)[source]

Create port data with attributes

  • node_file_with_ids - String path of GeoDataFrame with port IDs
  • sector - String name of sector
ports_with_id - GeoPandas DataFrame with port attributes
read_waterway_ports(ports_file_with_ids, ports_file_with_names)[source]

Create port data with attributes

  • ports_file_with_ids - String path of GeoDataFrame with port IDs
  • ports_file_with_names - String path of GeoDataFrame with port names
ports_with_id - GeoPandas DataFrame with port attributes