Module pipelines.rj_smtr.br_rj_riodejaneiro_onibus_gps.tasks

Tasks for br_rj_riodejaneiro_onibus_gps

Functions

def clean_br_rj_riodejaneiro_onibus_gps(date_range: dict) ‑> Optional[str]

Clean GPS data for a given date range.

This function deletes records from three different tables in the database:

  • rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_filtrada
  • rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_realocacao
  • rj-smtr.br_rj_riodejaneiro_veiculos.gps_sppo

Records are selected for deletion by comparing the timestamp_gps column against the provided date range.

Parameters

  • date_range (dict): A dictionary containing the start and end dates for the data to be cleaned.

Returns

  • str or None: If an error occurs during the cleaning process, the error message is returned. Otherwise, None is returned.
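
The cleaning step can be sketched as follows. This is a minimal illustration, not the module's actual implementation. The "start" and "end" keys of date_range, the inclusive BETWEEN comparison, the backtick quoting, and the helper names build_delete_statements, clean and execute are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

# Table names as listed in the function description.
TABLES = [
    "rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_filtrada",
    "rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_realocacao",
    "rj-smtr.br_rj_riodejaneiro_veiculos.gps_sppo",
]


def build_delete_statements(date_range: Dict[str, str]) -> List[str]:
    """Build one DELETE statement per table (hypothetical helper)."""
    # ASSUMPTION: date_range uses "start" and "end" keys; the real keys may differ.
    start, end = date_range["start"], date_range["end"]
    return [
        f"DELETE FROM `{table}` "
        f"WHERE timestamp_gps BETWEEN '{start}' AND '{end}'"
        for table in TABLES
    ]


def clean(date_range: Dict[str, str], execute: Callable[[str], None]) -> Optional[str]:
    """Run every statement; return the error message on failure, else None."""
    try:
        for statement in build_delete_statements(date_range):
            execute(statement)  # `execute` stands in for a database client call
    except Exception as exc:
        return str(exc)
    return None
```

Interpolating values into SQL text keeps the sketch short; production code would normally pass the dates as query parameters instead.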

def create_api_url_onibus_gps(version: str, timestamp: datetime.datetime = None) ‑> str

Generates the complete URL used to fetch data from the API.

def create_api_url_onibus_realocacao(interval_minutes: int = 10, timestamp: datetime.datetime = None, secret_path: str = 'realocacao_api') ‑> str

start_date: earliest datetime of the GPS signals evaluated
end_date: latest datetime of the GPS signals evaluated
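
One plausible reading of how interval_minutes and timestamp relate to start_date and end_date is sketched below. It assumes the window ends at timestamp and extends interval_minutes back in time. The helper name build_realocacao_url, the base_url argument, the query parameter names and the date format are hypothetical and are not confirmed by the source.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode


def build_realocacao_url(
    base_url: str, timestamp: datetime, interval_minutes: int = 10
) -> str:
    """Append a start_date/end_date window to base_url (hypothetical helper)."""
    # ASSUMPTION: the window ends at `timestamp` and spans `interval_minutes`.
    start_date = timestamp - timedelta(minutes=interval_minutes)
    end_date = timestamp
    # ASSUMPTION: the API accepts "YYYY-MM-DD HH:MM:SS" strings.
    query = urlencode(
        {
            "start_date": start_date.strftime("%Y-%m-%d %H:%M:%S"),
            "end_date": end_date.strftime("%Y-%m-%d %H:%M:%S"),
        }
    )
    return f"{base_url}?{query}"
```

The real function also takes a secret_path, which suggests the base URL and credentials come from a secret store rather than from an argument.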

def pre_treatment_br_rj_riodejaneiro_onibus_gps(status: dict, timestamp: datetime.datetime, version: int = 1, recapture: bool = False) ‑> Dict

Basic data treatment for bus GPS data. Converts Unix time to datetime and filters out stale data that may populate the API response.

Args

status : dict
dict containing the status of the request made to the API. Must contain keys: data and error
version : int, optional
Source API version. Temporary argument for testing
timestamp : datetime.datetime
Timestamp of the data capture.

Returns

df_gps
pandas.DataFrame containing the treated data.
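
The treatment described above (Unix time conversion plus a staleness filter) might look like the sketch below. It uses plain dictionaries instead of pandas to stay dependency-free. The "datahora" field name, the millisecond unit, the one-hour max_age threshold and the shape of the returned dict are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def pre_treat(
    status: Dict[str, Any],
    capture_time: datetime,
    max_age: timedelta = timedelta(hours=1),
) -> Dict[str, Any]:
    """Convert Unix times and drop stale records (hypothetical helper)."""
    # Pass an upstream error through untouched.
    if status["error"] is not None:
        return {"data": [], "error": status["error"]}

    treated = []
    for record in status["data"]:
        # ASSUMPTION: "datahora" holds a Unix timestamp in milliseconds.
        gps_time = datetime.fromtimestamp(record["datahora"] / 1000, tz=timezone.utc)
        # ASSUMPTION: a record is stale when older than `max_age` at capture time.
        if capture_time - gps_time > max_age:
            continue
        treated.append({**record, "timestamp_gps": gps_time})
    return {"data": treated, "error": None}
```

Measuring staleness against the capture timestamp rather than the current clock keeps recaptured data reproducible, which may be why the function takes timestamp as an argument.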

def pre_treatment_br_rj_riodejaneiro_onibus_realocacao(status: dict, timestamp: datetime.datetime) ‑> Dict

Basic data treatment for bus GPS relocation data. Converts Unix time to datetime and filters out stale data that may populate the API response.

Args

status : dict
dict containing the status of the request made to the API. Must contain keys: data and error

Returns

df_realocacao
pandas.DataFrame containing the treated data.