vtra.preprocess package

Submodules

vtra.preprocess.convert_hazard_data module

Pre-process hazard data

Purpose

Convert GeoTiff raster hazard datasets to shapefiles based on masking and selecting values from
  • Single-band raster files
  • Multi-band (3-bands) raster files

Input data requirements

  1. Correct paths to all hazard datasets
  2. Single-band GeoTiff hazard raster files with:
    • values - between 0 and 1000
    • raster grid geometry
    • projection systems: Default assumed = EPSG:32648
  3. Multi-band GeoTiff hazard raster files with:
    • 3-bands
    • values - in each band between 0 and 255
    • raster grid geometry
    • projection systems: Default assumed = EPSG:32648

Results

  1. Shapefiles whose names show the hazard models and their selected range of values
    • ID - equal to 1
    • geometry - Shapely Polygon outline of selected hazard
convert(threshold, infile, tmpfile_1, outfile)[source]

Convert GeoTiff raster file to Shapefile with geometries based on raster threshold less than 999

Parameters
  • threshold - Float value of lower bound of GeoTiff threshold value to be selected
  • infile - String name of input GeoTiff file path
  • tmpfile_1 - String name of tmp file 1
  • outfile - String name of output shapefile
Outputs
Shapefile with Polygon geometries of rasters based on raster values above a threshold
convert_geotiff_to_vector_with_multibands(band_colors, infile, infile_epsg, tmpfile_1, tmpfile_2, outfile)[source]

Convert multi-band GeoTiff raster file to Shapefile with geometries based on raster band color values

Parameters
  • band_colors - Tuple with 3-values each corresponding to the values in raster bands
  • infile - String name of input GeoTiff file path
  • infile_epsg - Integer value of EPSG Projection number of raster
  • tmpfile_1 - String name of tmp file 1
  • tmpfile_2 - String name of tmp file 2
  • outfile - String name of output shapefile
Outputs
Shapefile with Polygon geometries of rasters based on raster band values
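The band-color selection described above can be pictured with a small in-memory sketch. This is not the module's implementation: the helper name select_band_colors and the use of nested lists in place of a raster library are assumptions made purely for illustration.

```python
def select_band_colors(bands, band_colors):
    """Return a 0/1 mask of cells whose three band values equal band_colors.

    Hypothetical helper. `bands` is a list of three equally sized 2D lists
    (one per raster band); `band_colors` is a 3-tuple of target values.
    """
    rows = len(bands[0])
    cols = len(bands[0][0])
    return [
        [
            1 if all(bands[b][r][c] == band_colors[b] for b in range(3)) else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 2x2 raster with three bands holding values between 0 and 255.
red = [[255, 0], [255, 255]]
green = [[0, 0], [0, 255]]
blue = [[0, 0], [0, 255]]

# Select only the cells colored pure red.
mask = select_band_colors([red, green, blue], (255, 0, 0))
```

Cells marked 1 in the mask are the ones that would be turned into polygon geometries.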
convert_geotiff_to_vector_with_threshold(from_threshold, to_threshold, infile, infile_epsg, tmpfile_1, tmpfile_2, outfile)[source]

Convert GeoTiff raster file to Shapefile with geometries based on raster threshold ranges

Parameters
  • from_threshold - Float value of lower bound of GeoTiff threshold value to be selected
  • to_threshold - Float value of upper bound of GeoTiff threshold value to be selected
  • infile - String name of input GeoTiff file path
  • infile_epsg - Integer value of EPSG Projection number of raster
  • tmpfile_1 - String name of tmp file 1
  • tmpfile_2 - String name of tmp file 2
  • outfile - String name of output shapefile
Outputs
Shapefile with Polygon geometries of rasters based on raster threshold ranges
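As a rough illustration of selecting raster values within a threshold range, the sketch below masks a small grid held in nested lists. The helper name select_threshold_range and the half-open interval (lower bound included, upper bound excluded) are assumptions, not details confirmed by the module.

```python
def select_threshold_range(grid, from_threshold, to_threshold):
    """Return a 0/1 mask of cells with from_threshold <= value < to_threshold.

    Hypothetical helper: the half-open interval is an assumption.
    """
    return [
        [1 if from_threshold <= value < to_threshold else 0 for value in row]
        for row in grid
    ]


# Toy raster of hazard values; 999 stands in for a value to be excluded.
grid = [
    [0.0, 0.5, 1.2],
    [2.5, 999, 0.9],
]
mask = select_threshold_range(grid, 1.0, 999)
```

Only the cells flagged in the mask would be carried forward into the output polygons.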
glofris_data_details(file_name, root_dir)[source]

Read names of GLOFRIS files and create attributes

Parameters
  • file_name - String name of GeoTiff file
  • root_dir - String path to directory of file
Outputs
df - Pandas DataFrame written to csv file with columns:
  • file_name - String
  • hazard_type - String
  • year - Integer: 2016 or 2030
  • climate_scenario - String: RCP4.5 or RCP8.5 or none
  • probability - Float: 1/(return period)
  • banded - Boolean: True or False
  • bands - Integer
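The probability column is defined above as 1/(return period). A minimal sketch of that conversion follows; the function name and the sample record values are hypothetical and are not taken from the GLOFRIS files.

```python
def return_period_to_probability(return_period_years):
    """Convert a return period in years to an annual probability."""
    if return_period_years <= 0:
        raise ValueError("return period must be positive")
    return 1.0 / return_period_years


# Hypothetical attribute record using the column names listed above.
record = {
    "file_name": "example_flood_map.tif",  # made-up file name
    "hazard_type": "flooding",             # made-up label
    "year": 2030,
    "climate_scenario": "RCP4.5",
    "probability": return_period_to_probability(100),
    "banded": False,
    "bands": 1,
}
```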
main()[source]

Process hazard data

  1. Specify the paths from where to read and write:
    • Input data
    • Hazard data
  2. Supply input data and parameters
    • Thresholds of flood hazards
    • Values of bands to be selected
    • Color code of multi-band rasters
    • Specific file names that might require some specific operations
raster_projections_and_databands(file_path)[source]

Extract projection, data band numbers and values from raster

Parameters
  • file_path - String name of input GeoTiff file path
Outputs
  • counts - Number of bands in raster
  • crs - Projection system of raster
  • data_vals - Numpy array of raster values
raster_rewrite(in_raster, out_raster, nodata)[source]

Rewrite a raster to reproject and change no data value

Parameters
  • in_raster - String name of input GeoTiff file path
  • out_raster - String name of output GeoTiff file path
  • nodata - Float value of data that is treated as no data
Outputs
Reproject and replace raster with nodata = -1

vtra.preprocess.create_transport_networks module

Pre-process networks

Purpose

Creating post-processed transport networks with attributes

From pre-processed input Shapefiles and collected network attributes data

For all Province road networks: [‘Lao Cai’, ‘Binh Dinh’, ‘Thanh Hoa’]

For all transport modes at national scale: [‘road’, ‘rail’, ‘air’, ‘inland’, ‘coastal’, ‘multi’]

Input data requirements

Correct paths to all files and correct input parameters

Edge and node shapefiles for all Province roads and national-scale networks

All Geometries in Edge Shapefiles should be valid LineStrings

All Geometries in Node Shapefiles should be valid Points

  1. Node Shapefiles should contain following column names and attributes
    • node_id - String node ID
    • geometry - Shapely Point geometry of nodes
    • attributes - Multiple types depending upon sector and context
  2. Edge Shapefiles should contain following column names and attributes:
    • edge_id - String Edge ID
    • from_node - String node ID that should be present in node_id column
    • to_node - String node ID that should be present in node_id column
    • geometry - Shapely LineString geometry of edges
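The requirement that every from_node and to_node appears in the node_id column can be checked before running the script. The sketch below uses plain dictionaries rather than shapefiles, and the helper name find_dangling_edges is hypothetical.

```python
def find_dangling_edges(node_ids, edges):
    """Return edge_id values whose from_node or to_node is not a known node."""
    known = set(node_ids)
    return [
        edge["edge_id"]
        for edge in edges
        if edge["from_node"] not in known or edge["to_node"] not in known
    ]


# Made-up IDs for illustration only.
node_ids = ["n1", "n2", "n3"]
edges = [
    {"edge_id": "e1", "from_node": "n1", "to_node": "n2"},
    {"edge_id": "e2", "from_node": "n2", "to_node": "n9"},  # n9 is missing
]
dangling = find_dangling_edges(node_ids, edges)
```

An empty result means the edge table is consistent with the node table.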

Results

  1. Excel sheets with post-processed network nodes and edges
  2. Shapefiles with post-processed network nodes and edges
  3. All nodes have the following attributes:
    • node_id - String Node ID
    • name - String name in Vietnamese/English
    • tons - Float assigned cargo freight tonnage using node
    • population - Float assigned passenger/population number using node
    • capacity - Float assigned capacity in tons/passenger numbers/other units
    • geometry - Shapely Point geometry of node
  4. Attributes only present in inland and coastal port nodes
    • port_type - String name of type of port: inland or sea
    • port_class - String name of class of port: class1A (international) or class1 (domestic hub)
  5. All edges have the following attributes:
    • edge_id - String edge ID
    • g_id - Integer edge ID
    • from_node - String node ID that should be present in node_id column
    • to_node - String node ID that should be present in node_id column
    • geometry - Shapely LineString geometry of edge
    • terrain - String name of terrain of edge
    • level - Integer number for edge level: National, Provincial, Local, etc.
    • width - Float width in meters of edge
    • length - Float estimated length in kilometers of edge
    • min_speed - Float estimated minimum speed in km/hr on edge
    • max_speed - Float estimated maximum speed in km/hr on edge
    • min_time - Float estimated minimum time of travel in hours on edge
    • max_time - Float estimated maximum time of travel in hours on edge
    • min_time_cost - Float estimated minimum cost of time in USD on edge
    • max_time_cost - Float estimated maximum cost of time in USD on edge
    • min_tariff_cost - Float estimated minimum tariff cost in USD on edge
    • max_tariff_cost - Float estimated maximum tariff cost in USD on edge
    • vehicle_co - Integer number of daily vehicle counts on edge
  6. Attributes only present in Province and national roads edges
    • surface - String value for surface
    • road_class - Integer between 1 and 6
    • road_cond - String value: paved or unpaved
    • asset_type - String name of type of asset

References

  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.
main()[source]

Pre-process networks

  1. Specify the paths from where to read and write:
    • Input data
    • Intermediate calculations data
    • Output results
  2. Supply input data and parameters
    • Paths of the mode files: List of tuples of strings
    • Names of modes: List of strings
    • Unit weight of vehicle assumed for each mode: List of float types
    • Ranges of usage factors for each mode to represent uncertainty in cost estimations: List of tuples of float types
    • Ranges of speeds for each mode to represent uncertainty in speeds: List of tuple of float types
    • Names of all industry sector and crops in VITRANSS2 and IFPRI datasets: List of string types
    • Names of commodity/industry columns for which min-max tonnage column names already exist: List of string types
    • Percentage of OD flow we want to send along path: Float type
  3. Give the paths to the input data files:
    • Pre-processed network shapefiles
    • Costs of modes Excel file
    • Road properties Excel file
  4. Specify the output files and paths to be created

vtra.preprocess.cvts module

vtra.preprocess.cvts_speeds module

main()[source]

Traffic speed assignment script (fields: vehicle_id, edge_path, time_stamp)

vtra.preprocess.national_modes_od_creation module

Pre-process OD matrix

Purpose

Create national scale OD matrices at node and province levels from:
  • VITRANSS2 province-scale OD data
  • IFPRI crop data at 1km resolution

Input data requirements

  1. Correct paths to all files and correct input parameters
  2. Excel file with VITRANSS2 OD data with columns:
    • o - Integer values of Origin Province ID
    • d - Integer values of Destination Province ID
    • industry and crop names - Float values of tons between OD Provinces
    • modes - Float values to infer modal split values for OD Provinces
  3. GeoTiff files with IFPRI crop data:
    • tons - Float values of production tonnage at each grid cell
    • geometry - Raster grid cell geometry
  4. Shapefile of RiceAtlas data:
    • month production columns - tonnage of rice for each month
    • geometry - Shapely Polygon geometry of Provinces
  5. Shapefile of Provinces
    • od_id - Integer Province ID corresponding to OD ID
    • name_eng - String name of Province in English
    • geometry - Shapely Polygon geometry of Provinces
  6. Shapefile of Communes
    • population - Float values of populations in Communes
    • geometry - Shapely Polygon geometry of Communes
  7. Shapefiles of network nodes
    • node_id - String node ID
    • geometry - Shapely point geometry of nodes
  8. Shapefiles of network edges
    • vehicle_co - Count of vehicles only for roads
    • geometry - Shapely LineString geometry of edges

Results

  1. Excel workbook with sheet of mode-wise and total OD flows
    • origin - String node ID of origin node
    • destination - String node ID of destination node
    • o_region - String names of origin Province
    • d_region - String names of destination Province
    • commodity_names - Float values of daily tonnages of commodities/industries between OD nodes from VITRANSS2 and IFPRI data (except rice)
    • min_rice - Float values of minimum daily tonnages of rice between OD nodes
    • max_rice - Float values of maximum daily tonnages of rice between OD nodes
    • min_tons - Float values of minimum daily tonnages between OD nodes
    • max_tons - Float values of maximum daily tonnages between OD nodes

References

  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.
assign_crop_od_flows_to_nodes(national_ods_df, province_path, province_name_col, calc_path, crop_data_path, crop_names, rice_prod_months, modes_df, modes, od_fracs_crops, o_id_col, d_id_col)[source]

Assign IFPRI crop values to OD nodes based on VITRANSS 2 OD distributions

  • Based on VITRANSS 2 OD distributions
Parameters
  • national_ods_df - List of lists of Pandas dataframes
  • province_path - Path of province shapefile
  • province_name_col - String name of column containing province names
  • calc_path - Path to store intermediary calculations
  • crop_data_path - Path to crop datasets
  • crop_names - List of string of crop names in IFPRI datasets
  • rice_prod_months - Geopandas dataframe of RiceAtlas crop values with minimum and maximum monthly production as fraction of annual production
  • modes_df - List of Geopandas dataframes with nodes of each transport mode
  • modes - List of strings of names of transport modes
  • od_fracs_crops - Pandas dataframe of crop OD distributions and modal splits as per VITRANSS 2 data
  • o_id_col - String name of Origin province ID column
  • d_id_col - String name of Destination province ID column
Outputs
national_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • o_region - Origin province name
  • destination - Destination node ID
  • d_region - Destination province name
  • min_crop - Minimum Tonnage values for the named crop
  • max_crop - Maximum Tonnage values for the named crop
assign_daily_min_max_tons_rice(crop_df, rice_prod_df)[source]

Estimate minimum and maximum daily rice tonnages

Parameters
  • crop_df - Geopandas dataframe of crop points with annual tonnages
  • rice_prod_df - Geopandas dataframe of RiceAtlas crop values with minimum and maximum monthly production as fraction of annual production
Outputs
crop_df - Geopandas dataframe of crop points with new columns:
  • min_rice - Minimum daily rice tonnages
  • max_rice - Maximum daily rice tonnages
assign_industry_od_flows_to_nodes(national_ods_df, ind_cols, modes_df, modes, od_fracs, o_id_col, d_id_col)[source]

Assign VITRANSS 2 OD flows to nodes

Parameters
  • national_ods_df - List of lists of Pandas dataframes
  • ind_cols - List of strings of names of industry columns
  • modes_df - List of Geopandas dataframes with nodes of each transport mode
  • modes - List of strings of names of transport modes
  • od_fracs - Pandas dataframe of Industry OD flows and modal splits
  • o_id_col - String name of Origin province ID column
  • d_id_col - String name of Destination province ID column
Outputs
national_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • o_region - Origin province name
  • destination - Destination node ID
  • d_region - Destination province name
  • ind - Tonnage values for the named industry
assign_node_weights_by_commune_population_proximity(commune_path, nodes, commune_pop_col)[source]

Assign weights to nodes based on their nearest commune populations

  • By finding the communes that intersect with the Voronoi extents of nodes
Parameters
  • commune_path - Path of commune shapefile
  • nodes - Path of nodes shapefile
  • commune_pop_col - String name of column containing commune population values
Outputs
  • nodes - Geopandas dataframe of nodes with new column called weight
assign_province_name_id_to_nodes(province_path, nodes_in, province_name_col, province_id_col)[source]

Match the nodes to their province names and province IDs

  • By finding the province that contains or is nearest to the node
Parameters
  • province_path - Path of province shapefile
  • nodes_in - Path of nodes shapefile
  • province_name_col - String name of column containing province names
  • province_id_col - String name of column containing province IDs that match VITRANSS 2 OD IDs
Outputs
  • nodes - Geopandas dataframe of nodes with new columns called province_name and od_id
assign_road_weights(nodes, edges_in, aadt_column)[source]

Assign weights to nodes on the road network

  • By finding the total AADT counts converging on the node
Parameters
  • nodes - Geopandas dataframe of nodes
  • edges_in - Path of edges shapefile
  • aadt_column - String name of column containing AADT values
Outputs
  • nodes - Geopandas dataframe of nodes with new column called weight
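One way to picture "total AADT counts converging on the node" is to add each edge's count to both of its end nodes. The sketch below does this over plain dictionaries; counting both ends is an assumption, the helper name is hypothetical, and the real function works on shapefile data.

```python
from collections import defaultdict


def sum_aadt_at_nodes(edges, aadt_column="vehicle_co"):
    """Return {node_id: total AADT of all edges touching that node}."""
    totals = defaultdict(float)
    for edge in edges:
        # Each edge contributes its count to both of its end nodes.
        totals[edge["from_node"]] += edge[aadt_column]
        totals[edge["to_node"]] += edge[aadt_column]
    return dict(totals)


# Made-up edges and counts for illustration only.
edges = [
    {"from_node": "a", "to_node": "b", "vehicle_co": 100},
    {"from_node": "b", "to_node": "c", "vehicle_co": 50},
]
weights = sum_aadt_at_nodes(edges)
```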
ifpri_crop_od_split(vitranss2_od_data_file, modes, o_id_col, d_id_col, crop_cols)[source]

Create IFPRI crop OD modal split estimates from VITRANSS2 OD values

  • Combine the crop-wise OD values with the mode-wise OD values
  • Estimate the OD modal-split from the mode-wise OD values
Parameters
  • vitranss2_od_data_file - Excel file with VITRANSS2 OD data
  • modes - List of strings of mode types, e.g. [‘road’, ‘rail’, ‘air’, ‘inland’, ‘coastal’]
  • o_id_col - String name of name of Origin column
  • d_id_col - String name of name of Destination column
  • crop_cols - List of strings of crop names in VITRANSS 2 OD data
Outputs
  • od_fracs_crops - Pandas dataframe of VITRANSS2 crop OD data with modal splits
main()[source]

Pre-process national OD matrix

  1. Specify the paths from where to read and write:
    • Input data
    • Intermediate calculations data
    • Output results
  2. Supply input data and parameters
    • Names of modes: List of strings
    • OD column names in OD file: String types
    • Names of industry columns in VITRANSS2 data: List of string types
    • Names of crop columns in VITRANSS2 data: List of string types
    • Names of crops in IFPRI crop data: List of string types
    • Names of months in Rice Atlas data: List of string types
  3. Give the paths to the input data files:
    • Network nodes files
    • VITRANSS 2 OD file
    • IFPRI crop data files
    • Rice Atlas data shapefile
    • Province boundary and stats data shapefile
    • Commune boundary and stats data shapefile
  4. Specify the output files and paths to be created
riceatlas_crop_minmax(riceatlas_crop_file, crop_month_fields)[source]

Create MIN_MAX fractions of rice production estimates for provinces

  • By reading data from the RiceAtlas data
  • And estimating the minimum and maximum monthly values > 0
Parameters
  • riceatlas_crop_file - Shapefile with RiceAtlas data
  • crop_month_fields - List of strings of names of columns indicating rice monthly production
Outputs
rice_prod_months - Geopandas dataframe of RiceAtlas crop values
  • min_frac - minimum month of production as fraction of annual production
  • max_frac - maximum month of production as fraction of annual production
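The min_frac and max_frac definitions above can be sketched for a single province as follows. The helper name and the zero fallback for a province with no production are assumptions; the monthly tonnages are invented.

```python
def monthly_min_max_fractions(monthly_tons):
    """Return (min_frac, max_frac) of annual production.

    Only months with production > 0 are considered, as described above.
    Returning (0.0, 0.0) when nothing is produced is an assumption.
    """
    annual = sum(monthly_tons)
    positive = [tons for tons in monthly_tons if tons > 0]
    if annual <= 0 or not positive:
        return 0.0, 0.0
    return min(positive) / annual, max(positive) / annual


# Twelve made-up monthly tonnages with two harvest seasons.
monthly_tons = [0, 0, 10, 30, 0, 0, 0, 20, 40, 0, 0, 0]
min_frac, max_frac = monthly_min_max_fractions(monthly_tons)
```

Ignoring zero months keeps the minimum from collapsing to zero in provinces with seasonal harvests.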
vitranss2_od_split(vitranss2_od_data_file, modes, o_id_col, d_id_col)[source]

Create VITRANSS2 OD modal split estimates

  • Combine the commodity-wise OD values with the mode-wise OD values
  • Estimate the OD modal-split from the mode-wise OD values
Parameters
  • vitranss2_od_data_file - Excel file with VITRANSS2 OD data
  • modes - List of strings of mode types, e.g. [‘road’, ‘rail’, ‘air’, ‘inland’, ‘coastal’]
  • o_id_col - String name of name of Origin column
  • d_id_col - String name of name of Destination column
Outputs
  • od_fracs - Pandas dataframe of VITRANSS 2 commodity OD data with modal splits
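Estimating a modal split from mode-wise OD values amounts to dividing each mode's tonnage by the total over all modes for one OD pair. The sketch below shows this for a single pair; the function name, the zero-total handling and the numbers are assumptions for illustration.

```python
def modal_split(mode_tons):
    """Return each mode's share of the total tonnage for one OD pair."""
    total = sum(mode_tons.values())
    if total == 0:
        # Assumption: an OD pair with no flow gets a zero share for every mode.
        return {mode: 0.0 for mode in mode_tons}
    return {mode: tons / total for mode, tons in mode_tons.items()}


# Made-up tonnages for one origin-destination pair.
shares = modal_split({"road": 60.0, "rail": 30.0, "inland": 10.0})
```

The shares can then be multiplied by commodity-wise OD tonnages to get mode-specific flows.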

vtra.preprocess.province_roads_access_od_creation module

Pre-process accessibility-based provincial OD matrix

Purpose

Create province scale OD matrices between roads connecting villages to nearest communes:
  • Net revenue estimates of commune villages
  • IFPRI crop data at 1km resolution

Input data requirements

  1. Correct paths to all files and correct input parameters
  2. GeoTiff files with IFPRI crop data:
    • tons - Float values of production tonnage at each grid cell
    • geometry - Raster grid cell geometry
  3. Shapefile of RiceAtlas data:
    • month production columns - tonnage of rice for each month
    • geometry - Shapely Polygon geometry of Provinces
  4. Shapefile of Provinces
    • od_id - Integer Province ID corresponding to OD ID
    • name_eng - String name of Province in English
    • geometry - Shapely Polygon geometry of Provinces
  5. Shapefile of Communes
    • population - Float values of populations in Communes
    • nfrims - Float values of number of firms in Provinces
    • netrevenue - Float values of Net Revenue in Provinces
    • argi_prop - Float values of proportion of agriculture firms in Provinces
    • geometry - Shapely Polygon geometry of Communes
  6. Shapefiles of network nodes
    • node_id - String node ID
    • geometry - Shapely point geometry of nodes
  7. Shapefiles of network edges
    • vehicle_co - Count of vehicles only for roads
    • geometry - Shapely LineString geometry of edges
  8. Shapefiles of Commune center points
    • object_id - Integer ID of point
    • geometry - Shapely point geometry of points
  9. Shapefiles of Village center points
    • object_id - Integer ID of points
    • geometry - Shapely point geometry of points

Results

  1. Excel workbook with sheet of province-wise OD flows
    • origin - String node ID of origin node
    • destination - String node ID of destination node
    • crop_names - Float values of daily tonnages of IFPRI crops (except rice) between OD nodes
    • min_rice - Float values of minimum daily tonnages of rice between OD nodes
    • max_rice - Float values of maximum daily tonnages of rice between OD nodes
    • min_croptons - Float values of minimum daily tonnages of crops between OD nodes
    • max_croptons - Float values of maximum daily tonnages of crops between OD nodes
    • min_agrirev - Float value of Minimum daily revenue of agriculture firms between OD nodes
    • max_agrirev - Float value of Maximum daily revenue of agriculture firms between OD nodes
    • min_noagrirev - Float value of Minimum daily revenue of non-agriculture firms between OD nodes
    • max_noagrirev - Float value of Maximum daily revenue of non-agriculture firms between OD nodes
    • min_netrev - Float value of Minimum daily revenue of all firms between OD nodes
    • max_netrev - Float value of Maximum daily revenue of all firms between OD nodes

References

  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.
assign_io_rev_costs_crops(x, cost_dataframe, rice_prod_file, crop_month_fields, province, x_cols, ex_rate)[source]

Assign crop tonnages to daily net revenues

Parameters
  • x - Pandas DataFrame of values
  • cost_dataframe - Pandas DataFrame of conversion of tonnages to net revenues
  • rice_prod_file - Shapefile of RiceAtlas monthly production value
  • crop_month_fields - List of strings of month columns in Rice Atlas shapefile
  • province - String name of province
  • x_cols - List of string names of crops
  • ex_rate - Exchange rate from VND millions to USD
Outputs
  • min_croprev - Float value of Minimum daily revenue of crops
  • max_croprev - Float value of Maximum daily revenue of crops
assign_monthly_tons_crops(x, rice_prod_file, crop_month_fields, province, x_cols)[source]

Assign crop tonnages to OD pairs

Parameters
  • x - Pandas DataFrame of values
  • rice_prod_file - Shapefile of RiceAtlas monthly production value
  • crop_month_fields - List of strings of month columns in Rice Atlas shapefile
  • province - String name of province
  • x_cols - List of string names of crops
Outputs
  • min_croptons - Float value of Minimum daily tonnages of crops
  • max_croptons - Float value of Maximum daily tonnages of crops
crop_od_pairs(start_points, end_points, crop_name)[source]

Assign crop tonnages to OD pairs

Parameters
  • start_points - GeoDataFrame of start points for Origins
  • end_points - GeoDataFrame of potential end points for Destinations
  • crop_name - String name of crop
Outputs
od_pairs_df - Pandas DataFrame with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • crop - Tonnage values for the named crop
  • netrev_argi - Daily Net revenue of agriculture firms in USD
  • netrev_noargi - Daily Net revenue of non-agriculture firms in USD
crop_values_to_province_od_nodes(province_ods_df, province_geom, calc_path, crop_data_path, crop_names, nodes, sindex_nodes, prov_commune_center, sindex_commune_center, node_id, object_id)[source]

Assign IFPRI crop values to OD nodes in provinces

  • Based on finding nearest nodes to crop production sites as Origins
  • And finding nearest commune centers as Destinations
Parameters
  • province_ods_df - List of lists of Pandas dataframes
  • province_geom - Shapely Geometry of province
  • calc_path - Path to store intermediary calculations
  • crop_data_path - Path to crop datasets
  • crop_names - List of string of crop names in IFPRI datasets
  • nodes - GeoDataFrame of province road nodes
  • sindex_nodes - Spatial index of province road nodes
  • prov_commune_center - GeoDataFrame of province commune center points
  • sindex_commune_center - Spatial index of commune center points
  • node_id - String name of Node ID column
  • object_id - String name of commune ID column
Outputs
province_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • crop - Tonnage values for the named crop
main()[source]

Pre-process provincial-scale OD

  1. Specify the paths from where to read and write:
    • Input data
    • Intermediate calculations data
    • Output results
  2. Supply input data and parameters
    • Names of the Provinces: List of strings
    • Exchange rate to convert 2012 Net revenue in million VND values to USD in 2016
    • Names of crops in IFPRI crop data
    • Names of months in Rice Atlas data
    • Name of column for netrevenue of communes in VND millions
    • Name of column for number of firms in communes
    • Name of column for proportion of agriculture firms in communes
    • Name of Node ID column
    • Name of commune ID column
  3. Give the paths to the input data files:
    • Network nodes files
    • IFPRI crop data files
    • Rice Atlas data shapefile
    • Province boundary and stats data shapefile
    • Commune boundary and stats data shapefile
    • Population points shapefile for locations of villages
    • Commune center points shapefile
netrev_od_pairs(start_points, end_points)[source]

Assign net revenue values to OD pairs

Parameters
  • start_points - GeoDataFrame of start points for Origins
  • end_points - GeoDataFrame of potential end points for Destinations
Outputs
od_pairs_df - Pandas DataFrame with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • netrev_argi - Net revenue of agriculture firms
  • netrev_noargi - Net revenue of non-agriculture firms
netrevenue_values_to_province_od_nodes(province_ods_df, prov_communes, commune_sindex, netrevenue, n_firms, agri_prop, prov_pop, prov_pop_sindex, nodes, sindex_nodes, prov_commune_center, sindex_commune_center, node_id, object_id, exchange_rate)[source]

Assign commune level netrevenue values to OD nodes in provinces

  • Based on finding nearest nodes to village points with netrevenues as Origins
  • And finding nearest commune centers as Destinations
Parameters
  • province_ods_df - List of lists of Pandas dataframes
  • prov_communes - GeoDataFrame of commune level statistics
  • commune_sindex - Spatial index of communes
  • netrevenue - String name of column for netrevenue of communes in VND millions
  • n_firms - String name of column for number of firms in communes
  • agri_prop - String name of column for proportion of agriculture firms in communes
  • prov_pop - GeoDataFrame of population points in Province
  • prov_pop_sindex - Spatial index of population points in Province
  • nodes - GeoDataFrame of province road nodes
  • sindex_nodes - Spatial index of province road nodes
  • prov_commune_center - GeoDataFrame of province commune center points
  • sindex_commune_center - Spatial index of commune center points
  • node_id - String name of Node ID column
  • object_id - String name of commune ID column
  • exchange_rate - Float value for exchange rate from VND million to USD
Outputs
province_ods_df - List of Lists of Pandas dataframes with columns:
  • origin - Origin node ID
  • destination - Destination node ID
  • netrev_argi - Net revenue of agriculture firms
  • netrev_noargi - Net revenue of non-agriculture firms

vtra.preprocess.transport_network_inputs module

Utility functions for transport networks

Purpose

Helper functions to create post-processed networks with attributes from specific types of input datasets

References

  1. Pant, R., Koks, E.E., Russell, T., Schoenmakers, R. & Hall, J.W. (2018). Analysis and development of model for addressing climate change/disaster risks in multi-modal transport networks in Vietnam. Final Report, Oxford Infrastructure Analytics Ltd., Oxford, UK.
  2. All input data folders and files referred to in the code below.
assign_asset_type_to_province_roads(x)[source]

Assign asset types to roads assets in Vietnam

The types are assigned based on our understanding of:

  1. The reported asset code in the data

Parameters
x - Pandas DataFrame with numeric asset code
Returns
asset type - One of (Bridge, Dam, Culvert, Tunnel, Spillway, Road)
assign_asset_type_to_province_roads_from_file(asset_code, asset_type_list)[source]

Assign asset types to roads assets in Vietnam based on values in file

The types are assigned based on our understanding of:

  1. The reported asset code in the data

Parameters
  • asset_code - Numeric value for code of asset
  • asset_type_list - List of Strings with names of asset types
Returns
asset_type - String name of type of asset
assign_assumed_width_to_national_roads_from_file(x, flat_width_range_list, mountain_width_range_list)[source]

Assign widths to national roads assets in Vietnam

The widths are assigned based on our understanding of:

  1. The class of the road which is not reliable
  2. The number of lanes
  3. The terrain of the road

Parameters
  • x - Pandas DataFrame row with values
    • road_class - Integer value of road class
    • lanenum__s - Integer value of number of lanes on road
  • flat_width_range_list - List of tuples containing (from_width, to_width, assumed_width)
  • mountain_width_range_list - List of tuples containing (from_width, to_width, assumed_width)
Returns
assumed_width - Float assigned width of the road asset based on design specifications
assign_assumed_width_to_province_roads(x)[source]

Assign widths to Province roads assets in Vietnam

Parameters
x : int value for width of asset
Returns
int assigned width of the road asset based on design specifications
assign_assumed_width_to_province_roads_from_file(asset_width, width_range_list)[source]

Assign widths to Province roads assets in Vietnam

The widths are assigned based on our understanding of:

  1. The reported width in the data which is not reliable
  2. A design specification based understanding of the assumed width based on ranges of values
Parameters
  • asset_width - Numeric value for width of asset
  • width_range_list - List of tuples containing (from_width, to_width, assumed_width)
Returns
assumed_width - assigned width of the road asset based on design specifications
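A lookup over (from_width, to_width, assumed_width) tuples might look like the sketch below. It is not the module's code: the function name, the inclusive bounds, the None fallback for widths outside every range, and the sample ranges are all assumptions.

```python
def assumed_width_from_ranges(asset_width, width_range_list, default=None):
    """Return the assumed width of the first range containing asset_width.

    Assumptions: both range bounds are inclusive, the first match wins,
    and `default` is returned when no range matches.
    """
    for from_width, to_width, assumed_width in width_range_list:
        if from_width <= asset_width <= to_width:
            return assumed_width
    return default


# Made-up width ranges in meters, for illustration only.
width_range_list = [
    (0.0, 4.25, 3.5),
    (4.25, 6.0, 5.0),
    (6.0, 8.0, 7.0),
]
width = assumed_width_from_ranges(5.1, width_range_list)
```

Keeping the ranges in a list read from file lets the design specification change without touching the lookup logic.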
assign_min_max_speeds_to_national_roads_from_file(x, flat_width_range_list, mountain_width_range_list)[source]

Assign speeds to national roads in Vietnam

The speeds are assigned based on our understanding of:

  1. The class of the road
  2. The estimated speed from the CVTS data
  3. The terrain of the road

Parameters
  • x - Pandas DataFrame of values
    • road_class - Integer value of road class
    • terrain - String value of road terrain
    • est_speed - Float value of estimated speed from CVTS data
  • flat_width_range_list - List of tuples containing design speeds
  • mountain_width_range_list - List of tuples containing design speeds
Returns
  • Float minimum assigned speed in km/hr
  • Float maximum assigned speed in km/hr
assign_minmax_tariff_costs_multi_modal_apply(x, cost_dataframe)[source]

Assign tariff costs on multi-modal network links in Vietnam

Parameters
  • x - Pandas dataframe with values
    • port_type - String name of port type
    • from_mode - String name of mode
    • to_mode - String name of mode
    • other_mode - String name of mode
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
assign_minmax_tariff_costs_national_roads_apply(x, cost_dataframe)[source]

Assign tariff costs on national roads in Vietnam

The costs are assigned based on our understanding of:

  1. The vehicle counts on roads
Parameters
  • x - Pandas dataframe with values
    • vehicle_co - Count of number of vehicles on road
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
assign_minmax_tariff_costs_networks_apply(x, cost_dataframe)[source]

Assign tariff costs on networks in Vietnam

Parameters
  • x - Pandas dataframe with values
    • length - Float length of edge in km
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
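Since the only documented row input is the edge length in km and the outputs are in USD/ton, one plausible reading is that the tariff scales linearly with length. The sketch below works under that assumption only; the helper name `tariff_cost_bounds` and the per ton-km rates are hypothetical and do not come from `cost_dataframe`.

```python
def tariff_cost_bounds(length_km, min_rate, max_rate):
    """Return (min, max) tariff cost in USD/ton for one edge.

    Assumption: rates are given in USD per ton-km, so cost scales with length.
    """
    return min_rate * length_km, max_rate * length_km


# A plain dict stands in for one row x; the rates are invented.
edge = {"length": 12.0}
min_tariff_cost, max_tariff_cost = tariff_cost_bounds(edge["length"], 0.25, 0.5)
print(min_tariff_cost, max_tariff_cost)
```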
assign_minmax_tariff_costs_province_roads_apply(x, cost_dataframe)[source]

Assign tariff costs on Province roads in Vietnam

The costs are assigned based on our understanding of:

  1. The types of assets
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
Parameters
  • x - Pandas dataframe with values
    • code - Numeric code for type of asset
    • level - Numeric code for level of asset
    • terrain - String value of the terrain of asset
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_tariff_cost - Float minimum assigned tariff cost in USD/ton
  • max_tariff_cost - Float maximum assigned tariff cost in USD/ton
assign_minmax_time_costs_national_roads_apply(x, cost_dataframe)[source]

Assign time costs on national roads in Vietnam

The costs are assigned based on our understanding of:

  1. The vehicle counts on roads
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
Parameters
  • x - Pandas dataframe with values
    • vehicle_co - Count of number of vehicles on road
    • code - Numeric code for type of asset
    • level - Numeric code for level of asset
    • terrain - String value of the terrain of asset
    • length - Float length of edge in km
    • min_speed - Float minimum assigned speed in km/hr
    • max_speed - Float maximum assigned speed in km/hr
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_time_cost - Float minimum assigned cost of time in USD
  • max_time_cost - Float maximum assigned cost of time in USD
assign_minmax_time_costs_networks_apply(x, cost_dataframe)[source]

Assign time costs on networks in Vietnam

Parameters
  • x - Pandas dataframe with values
    • length - Float length of edge in km
    • min_speed - Float minimum assigned speed in km/hr
    • max_speed - Float maximum assigned speed in km/hr
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_time_cost - Float minimum assigned cost of time in USD
  • max_time_cost - Float maximum assigned cost of time in USD
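The relationship between the length, the speed bounds, and the time cost bounds can be sketched as below. This is an illustration under stated assumptions rather than the package's formula: the helper name `time_cost_bounds` and the `cost_per_hour` value of time are hypothetical. The point it shows is that the maximum speed produces the minimum cost, and the minimum speed the maximum cost.

```python
def time_cost_bounds(length_km, min_speed, max_speed, cost_per_hour):
    """Return (min, max) time cost in USD for one edge.

    Assumption: cost = travel time in hours * a USD/hour value of time.
    """
    min_time_cost = cost_per_hour * length_km / max_speed  # fastest trip
    max_time_cost = cost_per_hour * length_km / min_speed  # slowest trip
    return min_time_cost, max_time_cost


# A plain dict stands in for one row x; all numbers are invented.
edge = {"length": 60.0, "min_speed": 40.0, "max_speed": 60.0}
min_time_cost, max_time_cost = time_cost_bounds(
    edge["length"], edge["min_speed"], edge["max_speed"], cost_per_hour=2.0
)
print(min_time_cost, max_time_cost)
```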
assign_minmax_time_costs_province_roads_apply(x, cost_dataframe)[source]

Assign time costs on Province roads in Vietnam

The costs are assigned based on our understanding of:

  1. The types of assets
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
Parameters
  • x - Pandas dataframe with values
    • code - Numeric code for type of asset
    • level - Numeric code for level of asset
    • terrain - String value of the terrain of asset
    • length - Float length of edge in km
    • min_speed - Float minimum assigned speed in km/hr
    • max_speed - Float maximum assigned speed in km/hr
  • cost_dataframe - Pandas Dataframe with costs
Returns
  • min_time_cost - Float minimum assigned cost of time in USD
  • max_time_cost - Float maximum assigned cost of time in USD
assign_minmax_travel_speeds_province_roads_apply(x)[source]

Assign travel speeds to roads assets in Vietnam

The speeds are assigned based on our understanding of:

  1. The types of assets
  2. The levels of classification of assets: 0-National, 1-Provincial, 2-Local, 3-Other
  3. The terrain where the assets are located: Flat or Mountain or No information
Parameters
x - Pandas dataframe with values
  • code - Numeric code for type of asset
  • level - Numeric code for level of asset
  • terrain - String value of the terrain of asset
Returns
  • Float minimum assigned speed in km/hr
  • Float maximum assigned speed in km/hr
assign_national_road_class(x)[source]

Assign road classes to national roads

Parameters
x - Pandas DataFrame of values
  • capkth__ca - String value of road class
  • vehicle_co - Float value of number of vehicles on road
Returns
  • Integer value of road class
assign_national_road_conditions(x)[source]

Assign road conditions as paved or unpaved to national roads

Parameters
x - Pandas DataFrame of values
Returns
String value of road as paved or unpaved
assign_national_road_terrain(x)[source]

Assign terrain as flat or mountain to national roads

Parameters
x - Pandas DataFrame of values
Returns
String value of terrain as flat or mountain
assign_province_road_conditions(x)[source]

Assign road conditions as paved or unpaved to Province roads

Parameters
x - Pandas DataFrame of values
  • code - Numeric code for type of asset
  • level - Numeric code for level of asset
Returns
String value as paved or unpaved
create_port_names(x, port_names_df)[source]

Add port names in Vietnamese to port data

Parameters
  • x - Pandas DataFrame with values
    • port_type - String type of port
    • cangbenid - Integer ID of inland port
    • objectid - Integer ID of sea port
  • port_names_df - Pandas DataFrame with port names
Returns
name - Vietnamese name of port
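A lookup of this kind might select the ID column by port type and then fetch the name. The sketch below is hypothetical: the `port_type` values `"inland"` and `"sea"`, the dict used in place of `port_names_df`, the empty-string fallback, and the placeholder names are all assumptions made for illustration.

```python
def lookup_port_name(x, names_by_id):
    """Pick the ID field that matches the port type, then look up the name.

    Hypothetical sketch: port_type values and names_by_id are assumptions.
    """
    if x["port_type"] == "inland":
        port_id = x["cangbenid"]  # documented as the inland port ID
    else:
        port_id = x["objectid"]  # documented as the sea port ID
    # Assumption: return an empty string when no name is recorded.
    return names_by_id.get((x["port_type"], port_id), "")


# Invented placeholder names keyed by (port_type, id).
names_by_id = {("inland", 7): "Port A", ("sea", 3): "Port B"}
row = {"port_type": "sea", "cangbenid": 0, "objectid": 3}
print(lookup_port_name(row, names_by_id))
```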
multi_modal_shapefile_to_dataframe(edges_in, mode_properties_file, mode_name, length_threshold, usage_factors)[source]

Create multi-modal network dataframe from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • length_threshold - Float value of threshold in km of length of multi-modal links
  • usage_factors - Tuple of 2-float values between 0 and 1
Returns
edges - Geopandas DataFrame with network edge topology and attributes
multi_modal_shapefile_to_network(edges_in, mode_properties_file, mode_name, length_threshold, utilization_factors)[source]

Create multi-modal igraph network from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • length_threshold - Float value of threshold in km of length of multi-modal links
  • utilization_factors - Tuple of 2-float values between 0 and 1
Returns
G - Igraph object with network edge topology and attributes
national_road_shapefile_to_dataframe(edges_in, road_properties_file, usage_factors)[source]

Create national network dataframe from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
Returns
edges - Geopandas DataFrame with network edge topology and attributes
national_road_shapefile_to_network(edges_in, road_properties_file, usage_factors)[source]

Create national igraph network from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
Returns
G - Igraph object with network edge topology and attributes
network_shapefile_to_dataframe(edges_in, mode_properties_file, mode_name, speed_min, speed_max, usage_factors)[source]

Create network dataframe from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • speed_min - Float value of minimum assigned speed
  • speed_max - Float value of maximum assigned speed
  • usage_factors - Tuple of 2-float values between 0 and 1
Returns
edges - Geopandas DataFrame with network edge topology and attributes
network_shapefile_to_network(edges_in, mode_properties_file, mode_name, speed_min, speed_max, utilization_factors)[source]

Create igraph network from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • mode_properties_file - String path to Excel file with mode attributes
  • mode_name - String name of mode
  • speed_min - Float value of minimum assigned speed
  • speed_max - Float value of maximum assigned speed
  • utilization_factors - Tuple of 2-float values between 0 and 1
Returns
G - Igraph object with network edge topology and attributes
province_shapefile_to_dataframe(edges_in, road_terrain, road_properties_file, usage_factors)[source]

Create province network dataframe from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • road_terrain - String name of terrain: flat or mountainous
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
Returns
edges - Geopandas DataFrame with network edge topology and attributes
province_shapefile_to_network(edges_in, road_terrain, road_properties_file, usage_factors)[source]

Create province igraph network from inputs

Parameters
  • edges_in - String path to edges file/network Shapefile
  • road_terrain - String name of terrain: flat or mountainous
  • road_properties_file - String path to Excel file with road attributes
  • usage_factors - Tuple of 2-float values between 0 and 1
Returns
G - Igraph object with network edge topology and attributes
read_setor_nodes(node_file_with_ids, sector)[source]

Create port data with attributes

Parameters
  • node_file_with_ids - String path of GeoDataFrame with node IDs
  • sector - String name of sector
Returns
ports_with_id - GeoPandas DataFrame with port attributes
read_waterway_ports(ports_file_with_ids, ports_file_with_names)[source]

Create port data with attributes

Parameters
  • ports_file_with_ids - String path of GeoDataFrame with port IDs
  • ports_file_with_names - String path of GeoDataFrame with port names
Returns
ports_with_id - GeoPandas DataFrame with port attributes