RAPIDDataset

This is a wrapper for the RAPID Qout netCDF file. Here are some basic usage examples.

class RAPIDpy.dataset.RAPIDDataset(filename, river_id_dimension='', river_id_variable='', streamflow_variable='', datetime_simulation_start=None, simulation_time_step_seconds=None, out_tzinfo=None)[source]

This class is designed to access data from the RAPID Qout NetCDF file.

filename

str – Path to the RAPID Qout NetCDF file.

river_id_dimension

Optional[str] – Name of the river ID dimension. Default is to search through a standard list.

river_id_variable

Optional[str] – Name of the river ID variable. Default is to search through a standard list.

streamflow_variable

Optional[str] – Name of the streamflow variable. Default is to search through a standard list.

datetime_simulation_start

Optional[datetime] – Datetime object giving the simulation start time.

simulation_time_step_seconds

Optional[integer] – This is the time step of the simulation output in seconds.

out_tzinfo

Optional[tzinfo] – Time zone to output data as. The dates will be converted from UTC to the time zone input. Default is UTC.
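To illustrate what the out_tzinfo conversion means for the dates you get back, here is a minimal sketch using only the Python standard library. It does not call RAPIDpy, and the fixed UTC-6 offset and the variable names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical output time zone: a fixed UTC-6 offset (illustration only).
out_tzinfo = timezone(timedelta(hours=-6))

# A timestamp as it would be stored in the Qout file, interpreted as UTC.
utc_time = datetime(1985, 1, 1, 12, 0, tzinfo=timezone.utc)

# Convert from UTC to the requested output time zone.
local_time = utc_time.astimezone(out_tzinfo)

print(local_time.isoformat())
```

The converted value represents the same instant as the UTC value; only the wall-clock reading and the attached offset change.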

Example:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    # USE FUNCTIONS TO ACCESS DATA HERE
    pass
get_qout(river_id_array=None, date_search_start=None, date_search_end=None, time_index_start=None, time_index_end=None, time_index=None, time_index_array=None, daily=False, pd_filter=None, daily_mode='mean')[source]

This method extracts streamflow data by a single river ID or by a river ID array. It has options to extract by date or by date index.

Parameters:
  • river_id_array (Optional[list or int]) – A single river ID or an array of river IDs.
  • date_search_start (Optional[datetime]) – Datetime object giving the earliest date to include.
  • date_search_end (Optional[datetime]) – Datetime object giving the latest date to include.
  • time_index_start (Optional[int]) – This is the index of the start of the time array subset. Useful for the old file version.
  • time_index_end (Optional[int]) – This is the index of the end of the time array subset. Useful for the old file version.
  • time_index (Optional[int]) – Index of the single time step to return, for when only one time step is wanted. Used internally.
  • time_index_array (Optional[list or np.array]) – Used to extract the values only for particular dates. This can come from the get_time_index_range function.
  • daily (Optional[bool]) – If True, this will convert qout to a daily average.
  • pd_filter (Optional[str]) – This is a valid pandas resample frequency filter.
  • daily_mode (Optional[str]) – You can get the daily average “mean” or the maximum “max”. Default is “mean”.
Returns:

This is a 1D or 2D array or a single value depending on your input search.

Return type:

numpy.array
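The daily option with a "mean" or "max" mode can be pictured as grouping the sub-daily flows by calendar day and reducing each group. The sketch below is not RAPIDpy's implementation and uses neither RAPIDpy nor pandas; the helper name aggregate_daily and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def aggregate_daily(times, flows, mode="mean"):
    """Group flows by calendar day and reduce each day with mean or max."""
    by_day = defaultdict(list)
    for time, flow in zip(times, flows):
        by_day[time.date()].append(flow)
    reducer = {"mean": mean, "max": max}[mode]
    return {day: reducer(values) for day, values in sorted(by_day.items())}


# Made-up 12-hourly series covering two days.
start = datetime(1985, 1, 1)
times = [start + timedelta(hours=12 * i) for i in range(4)]
flows = [10.0, 20.0, 30.0, 50.0]

daily_mean = aggregate_daily(times, flows, mode="mean")
daily_max = aggregate_daily(times, flows, mode="max")
```

With these values, the first day holds 10.0 and 20.0 and the second day holds 30.0 and 50.0, so the two modes give visibly different daily series.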

This example demonstrates how to retrieve the streamflow associated with the reach you are interested in:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
river_id = 500
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    streamflow_array = qout_nc.get_qout(river_id)

This example demonstrates how to retrieve the streamflow within a date range associated with the reach you are interested in:

from datetime import datetime
from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
river_id = 500
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    streamflow_array = qout_nc.get_qout(river_id,
                                        date_search_start=datetime(1985,1,1),
                                        date_search_end=datetime(1985,2,4))
get_river_id_array()[source]

This method returns the river ID array for this file.

Returns:An array of the river IDs
Return type:numpy.array

Example:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    river_ids = qout_nc.get_river_id_array()
get_river_index(river_id)[source]

This method retrieves the river index in the netCDF dataset corresponding to the river ID.

Returns:The index of the river ID in the file
Return type:int
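Conceptually, the river index is the position of a river ID within the file's river ID array, and that position selects the matching streamflow series. The following sketch uses plain Python lists rather than a netCDF file; the IDs, the flows, and the one-row-per-river layout are assumptions made for illustration.

```python
# Made-up river IDs in the order they are stored in a file.
river_ids = [53456, 53458, 53460]

# Made-up flows, assuming one row per river and one column per time step.
flows = [
    [1.0, 1.5, 2.0],
    [3.0, 3.5, 4.0],
    [5.0, 5.5, 6.0],
]

river_id = 53458
river_index = river_ids.index(river_id)  # position of the ID in the array
river_flows = flows[river_index]         # time series for that river
```

In this sketch, list.index raises a ValueError when the ID is not present, so a missing river ID fails loudly instead of returning a wrong series.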

Example:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
river_id = 53458

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    river_index = qout_nc.get_river_index(river_id)
get_time_array(datetime_simulation_start=None, simulation_time_step_seconds=None, return_datetime=False, time_index_array=None)[source]

This method extracts or generates an array of time. The new version of RAPID output has the time array stored. However, the old version requires the user to know when the simulation began and the time step of the output.

Parameters:
  • return_datetime (Optional[boolean]) – If true, it converts the data to a list of datetime objects. Default is False.
  • time_index_array (Optional[list or np.array]) – Used to extract the datetime values. This can come from the get_time_index_range function.
Returns:

An array of integers representing seconds since Jan 1, 1970 UTC or datetime objects if return_datetime is set to True.

Return type:

list
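For a legacy file with no stored time array, the times have to be generated from the simulation start and the output time step. A minimal sketch of that idea follows, using only the standard library. The helper name build_time_array is hypothetical, and placing the first time step exactly at the simulation start is an assumption, not a statement about RAPIDpy's behavior.

```python
from datetime import datetime, timedelta, timezone


def build_time_array(simulation_start, time_step_seconds, n_steps,
                     return_datetime=False):
    """Generate evenly spaced times starting at simulation_start (UTC)."""
    start = simulation_start.replace(tzinfo=timezone.utc)
    times = [start + timedelta(seconds=time_step_seconds * i)
             for i in range(n_steps)]
    if return_datetime:
        return times
    # Seconds since Jan 1, 1970 UTC.
    return [int(t.timestamp()) for t in times]


time_array = build_time_array(datetime(1980, 1, 1), 3600, 3)
time_datetime = build_time_array(datetime(1980, 1, 1), 3600, 3,
                                 return_datetime=True)
```

The integer form and the datetime form describe the same instants, which mirrors the two return styles described above.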

These examples demonstrate how to retrieve or generate a time array to go along with your RAPID streamflow series.

CF-Compliant Qout File Example:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    #retrieve integer timestamp array
    time_array = qout_nc.get_time_array()

    #or, to get datetime array
    time_datetime = qout_nc.get_time_array(return_datetime=True)

Legacy Qout File Example:

from datetime import datetime
from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout,
                  datetime_simulation_start=datetime(1980, 1, 1),
                  simulation_time_step_seconds=3600) as qout_nc:

    #retrieve integer timestamp array
    time_array = qout_nc.get_time_array()

    #or, to get datetime array
    time_datetime = qout_nc.get_time_array(return_datetime=True)
get_time_index_range(date_search_start=None, date_search_end=None, time_index_start=None, time_index_end=None, time_index=None)[source]

Generates a time index range based on time bounds given. This is useful for subset data extraction.

Parameters:
  • date_search_start (Optional[datetime]) – Datetime object giving the earliest date to include.
  • date_search_end (Optional[datetime]) – Datetime object giving the latest date to include.
  • time_index_start (Optional[int]) – This is the index of the start of the time array subset. Useful for the old file version.
  • time_index_end (Optional[int]) – This is the index of the end of the time array subset. Useful for the old file version.
  • time_index (Optional[int]) – Index of the single time step to return, for when only one time step is wanted. Used internally.
Returns:

This is an array used to extract a subset of data.

Return type:

index_array
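One way to picture the returned index array is as the positions in the time axis that fall between the two search dates. The sketch below is not RAPIDpy's code; the helper name time_index_range is hypothetical, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def time_index_range(times, date_search_start=None, date_search_end=None):
    """Return indices of times within the (assumed inclusive) bounds."""
    return [
        index for index, time in enumerate(times)
        if (date_search_start is None or time >= date_search_start)
        and (date_search_end is None or time <= date_search_end)
    ]


# Made-up daily time axis: Jan 1 through Jan 10, 1980.
times = [datetime(1980, 1, 1) + timedelta(days=i) for i in range(10)]
index_range = time_index_range(times,
                               date_search_start=datetime(1980, 1, 3),
                               date_search_end=datetime(1980, 1, 5))

# The indices can then be used to pull out the matching subset.
subset = [times[i] for i in index_range]
```

Leaving a bound as None in this sketch means the range is open on that side, so calling it with no bounds selects every time step.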

CF-Compliant Qout File Example:

from datetime import datetime
from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    time_index_range = qout_nc.get_time_index_range(date_search_start=datetime(1980, 1, 1),
                                                    date_search_end=datetime(1980, 12, 11))

Legacy Qout File Example:

from datetime import datetime
from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout,
                  datetime_simulation_start=datetime(1980, 1, 1),
                  simulation_time_step_seconds=3600) as qout_nc:

    time_index_range = qout_nc.get_time_index_range(date_search_start=datetime(1980, 1, 1),
                                                    date_search_end=datetime(1980, 12, 11))
is_time_variable_valid()[source]

This function returns whether or not the time variable is valid.

Returns:True if the time variable is valid, otherwise false.
Return type:boolean

Example:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    if qout_nc.is_time_variable_valid():
        # DO WORK HERE
        pass
write_flows_to_csv(path_to_output_file, river_index=None, river_id=None, date_search_start=None, date_search_end=None, daily=False, mode='mean')[source]

Write out RAPID output to CSV file.

Note

Either the river_id or the river_index parameter is required; either one can be used.

Parameters:
  • path_to_output_file (str) – Path to the output csv file.
  • river_index (Optional[int]) – Index of the river in the file that you want the streamflow for.
  • river_id (Optional[int]) – River ID that you want the streamflow for.
  • date_search_start (Optional[datetime]) – Datetime object giving the earliest date to include.
  • date_search_end (Optional[datetime]) – Datetime object giving the latest date to include.
  • daily (Optional[boolean]) – If True and the file is CF-Compliant, write out daily flows.
  • mode (Optional[str]) – You can get the daily average “mean” or the maximum “max”. Default is “mean”.

Example writing entire time series to file:

from RAPIDpy import RAPIDDataset

river_id = 3624735
path_to_rapid_qout = '/path/to/Qout.nc'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    #for writing entire time series to file
    qout_nc.write_flows_to_csv('/timeseries/Qout_3624735.csv',
                               river_id=river_id,
                               )

Example writing entire time series as daily average to file:

from RAPIDpy import RAPIDDataset

river_id = 3624735
path_to_rapid_qout = '/path/to/Qout.nc'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    #NOTE: Getting the river index is not necessary
    #this is just an example of how to use this
    river_index = qout_nc.get_river_index(river_id)

    #if file is CF compliant, you can write out daily average
    qout_nc.write_flows_to_csv('/timeseries/Qout_daily.csv',
                               river_index=river_index,
                               daily=True,
                               )

Example writing subset of time series as daily maximum to file:

from datetime import datetime
from RAPIDpy import RAPIDDataset

river_id = 3624735
path_to_rapid_qout = '/path/to/Qout.nc'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    # if file is CF compliant, you can filter by date
    qout_nc.write_flows_to_csv('/timeseries/Qout_daily_date_filter.csv',
                               river_id=river_id,
                               daily=True,
                               date_search_start=datetime(2002, 8, 31),
                               date_search_end=datetime(2002, 9, 15),
                               mode="max"
                               )
write_flows_to_gssha_time_series_ihg(path_to_output_file, connection_list_file, date_search_start=None, date_search_end=None, daily=False, mode='mean')[source]

Write out RAPID output to GSSHA time series ihg file.

Note

GSSHA project card is CHAN_POINT_INPUT

Parameters:
  • path_to_output_file (str) – Path to the output ihg file.
  • connection_list_file (str) – Path to a CSV file with a link_id, node_id, baseflow, and rapid_rivid header followed by rows of data.
  • date_search_start (Optional[datetime]) – Datetime object giving the earliest date to include.
  • date_search_end (Optional[datetime]) – Datetime object giving the latest date to include.
  • out_tzinfo (Optional[tzinfo]) – Timezone object with output time zone for GSSHA. Default is the native RAPID output timezone (UTC).
  • daily (Optional[boolean]) – If True and the file is CF-Compliant, write out daily flows.
  • mode (Optional[str]) – You can get the daily average “mean” or the maximum “max”. Default is “mean”.

Example connection list file:

link_id, node_id, baseflow, rapid_rivid
599, 1, 0.0, 80968
603, 1, 0.0, 80967
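To check a connection list before handing it to RAPIDpy, the file can be parsed with the standard csv module. This sketch reads the example above from an in-memory string; how RAPIDpy itself parses the file is not shown here, and the type conversions are assumptions based on the example values.

```python
import csv
import io

# The example connection list from above, held in memory for illustration.
connection_list_text = """link_id, node_id, baseflow, rapid_rivid
599, 1, 0.0, 80968
603, 1, 0.0, 80967
"""

# skipinitialspace drops the blank that follows each comma.
reader = csv.DictReader(io.StringIO(connection_list_text),
                        skipinitialspace=True)
connections = [
    {
        "link_id": int(row["link_id"]),
        "node_id": int(row["node_id"]),
        "baseflow": float(row["baseflow"]),
        "rapid_rivid": int(row["rapid_rivid"]),
    }
    for row in reader
]
```

Each row ties one GSSHA link and node to the RAPID river ID whose flow feeds it, so a quick parse like this can catch a missing column or a non-numeric value early.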

Example writing entire time series to file:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
connection_list_file = '/path/to/connection_list_file.csv'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    #for writing entire time series to file
    qout_nc.write_flows_to_gssha_time_series_ihg('/timeseries/Qout_3624735.ihg',
                                                 connection_list_file,
                                                 )

Example writing entire time series as daily average to file:

from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
connection_list_file = '/path/to/connection_list_file.csv'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    # if file is CF compliant, you can write out daily average
    qout_nc.write_flows_to_gssha_time_series_ihg('/timeseries/Qout_3624735.ihg',
                                                 connection_list_file,
                                                 daily=True,
                                                 )

Example writing subset of time series as daily maximum to file:

from datetime import datetime
from RAPIDpy import RAPIDDataset

path_to_rapid_qout = '/path/to/Qout.nc'
connection_list_file = '/path/to/connection_list_file.csv'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    # if file is CF compliant, you can filter by date and get daily values
    qout_nc.write_flows_to_gssha_time_series_ihg('/timeseries/Qout_daily_date_filter.ihg',
                                                 connection_list_file,
                                                 date_search_start=datetime(2002, 8, 31),
                                                 date_search_end=datetime(2002, 9, 15),
                                                 daily=True,
                                                 mode="max"
                                                 )
write_flows_to_gssha_time_series_xys(path_to_output_file, series_name, series_id, river_index=None, river_id=None, date_search_start=None, date_search_end=None, daily=False, mode='mean')[source]

Write out RAPID output to GSSHA WMS time series xys file.

Parameters:
  • path_to_output_file (str) – Path to the output xys file.
  • series_name (str) – The name for the series.
  • series_id (int) – The ID to give the series.
  • river_index (Optional[int]) – Index of the river in the file that you want the streamflow for.
  • river_id (Optional[int]) – River ID that you want the streamflow for.
  • date_search_start (Optional[datetime]) – Datetime object giving the earliest date to include.
  • date_search_end (Optional[datetime]) – Datetime object giving the latest date to include.
  • daily (Optional[boolean]) – If True and the file is CF-Compliant, write out daily flows.
  • mode (Optional[str]) – You can get the daily average “mean” or the maximum “max”. Default is “mean”.

Example writing entire time series to file:

from RAPIDpy import RAPIDDataset

river_id = 3624735
path_to_rapid_qout = '/path/to/Qout.nc'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    qout_nc.write_flows_to_gssha_time_series_xys('/timeseries/Qout_3624735.xys',
                                                 series_name="RAPID_TO_GSSHA_{0}".format(river_id),
                                                 series_id=34,
                                                 river_id=river_id,
                                                 )

Example writing entire time series as daily average to file:

from RAPIDpy import RAPIDDataset

river_id = 3624735
path_to_rapid_qout = '/path/to/Qout.nc'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    # NOTE: Getting the river index is not necessary
    # this is just an example of how to use this
    river_index = qout_nc.get_river_index(river_id)

    # if file is CF compliant, you can write out daily average
    qout_nc.write_flows_to_gssha_time_series_xys('/timeseries/Qout_daily.xys',
                                                 series_name="RAPID_TO_GSSHA_{0}".format(river_id),
                                                 series_id=34,
                                                 river_index=river_index,
                                                 daily=True,
                                                 )

Example writing subset of time series as daily maximum to file:

from datetime import datetime
from RAPIDpy import RAPIDDataset

river_id = 3624735
path_to_rapid_qout = '/path/to/Qout.nc'

with RAPIDDataset(path_to_rapid_qout) as qout_nc:
    # NOTE: Getting the river index is not necessary
    # this is just an example of how to use this
    river_index = qout_nc.get_river_index(river_id)

    # if file is CF compliant, you can filter by date and
    # get daily values
    qout_nc.write_flows_to_gssha_time_series_xys('/timeseries/Qout_daily_date_filter.xys',
                                                 series_name="RAPID_TO_GSSHA_{0}".format(river_id),
                                                 series_id=34,
                                                 river_index=river_index,
                                                 date_search_start=datetime(2002, 8, 31),
                                                 date_search_end=datetime(2002, 9, 15),
                                                 daily=True,
                                                 mode="max"
                                                 )