Module pipelines.rj_smtr.br_rj_riodejaneiro_onibus_gps.tasks
Tasks for br_rj_riodejaneiro_onibus_gps
Functions
def clean_br_rj_riodejaneiro_onibus_gps(date_range: dict) ‑> Optional[str]
Clean GPS data for a given date range.
This function deletes records from three different tables in the database:
- rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_filtrada
- rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_realocacao
- rj-smtr.br_rj_riodejaneiro_veiculos.gps_sppo
The records to be deleted are determined by the provided date range and the timestamp_gps column.
Parameters
- date_range (dict): A dictionary containing the start and end dates for the data to be cleaned.
Returns
- str or None: If an error occurs during the cleaning process, the error message is returned. Otherwise, None is returned.
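The delete-then-report behavior described above can be sketched as follows. This is a minimal illustration, not the task's actual code: the `date_range_start` and `date_range_end` keys, the `BETWEEN` filter, and the `run_query` callable are all assumptions made for the example. Only the table names and the `timestamp_gps` column come from the description.

```python
from typing import Callable, List, Optional

# Table names taken from the function description.
TABLES = [
    "rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_filtrada",
    "rj-smtr.br_rj_riodejaneiro_onibus_gps.sppo_aux_registros_realocacao",
    "rj-smtr.br_rj_riodejaneiro_veiculos.gps_sppo",
]


def build_delete_queries(date_range: dict) -> List[str]:
    """Build one DELETE statement per table, filtered on timestamp_gps."""
    # Hypothetical key names; the real structure of date_range is not documented.
    start = date_range["date_range_start"]
    end = date_range["date_range_end"]
    return [
        f"DELETE FROM `{table}` "
        f"WHERE timestamp_gps BETWEEN '{start}' AND '{end}'"
        for table in TABLES
    ]


def clean_gps_tables(
    date_range: dict, run_query: Callable[[str], None]
) -> Optional[str]:
    """Run every DELETE; return the error message on failure, else None."""
    try:
        for query in build_delete_queries(date_range):
            run_query(query)  # run_query is a stand-in for a database client call
    except Exception as exc:
        return str(exc)
    return None
```

Passing the query runner as an argument keeps the sketch independent of any particular database client and makes the error path easy to exercise.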
def create_api_url_onibus_gps(version: str, timestamp: datetime.datetime = None) ‑> str
Generates the complete URL used to fetch data from the API.
def create_api_url_onibus_realocacao(interval_minutes: int = 10, timestamp: datetime.datetime = None, secret_path: str = 'realocacao_api') ‑> str
start_date: minimum datetime of the evaluated GPS signal
end_date: maximum datetime of the evaluated GPS signal
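One plausible reading of `interval_minutes` and `timestamp` is that they define the window bounded by start_date and end_date. The sketch below assumes the window ends at `timestamp` and starts `interval_minutes` earlier. The base URL, the query parameter names, and the date format are hypothetical placeholders, and the real function also reads credentials from `secret_path`, which is not modeled here.

```python
from datetime import datetime, timedelta
from typing import Tuple
from urllib.parse import urlencode


def realocacao_window(
    timestamp: datetime, interval_minutes: int = 10
) -> Tuple[datetime, datetime]:
    """Return (start_date, end_date) for the evaluated GPS signals."""
    # Assumption: the window closes at the capture timestamp.
    end_date = timestamp
    start_date = timestamp - timedelta(minutes=interval_minutes)
    return start_date, end_date


def build_realocacao_url(
    base_url: str, timestamp: datetime, interval_minutes: int = 10
) -> str:
    """Append the window bounds to base_url as query parameters."""
    start_date, end_date = realocacao_window(timestamp, interval_minutes)
    # Hypothetical parameter names and date format.
    params = {
        "start_date": start_date.strftime("%Y-%m-%d %H:%M:%S"),
        "end_date": end_date.strftime("%Y-%m-%d %H:%M:%S"),
    }
    return f"{base_url}?{urlencode(params)}"
```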
def pre_treatment_br_rj_riodejaneiro_onibus_gps(status: dict, timestamp: datetime.datetime, version: int = 1, recapture: bool = False) ‑> Dict
Basic data treatment for bus GPS data. Converts Unix time to datetime and filters out stale data that may populate the API response.
Args
- status (dict): dict containing the status of the request made to the API. Must contain keys: data and error.
- version (int, optional): Source API version. Temporary argument for testing.
- timestamp (datetime.datetime): Capture data timestamp.
Returns
- df_gps: pandas.DataFrame containing the treated data.
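The two treatment steps named above, converting Unix time to datetime and dropping stale records, can be sketched as follows. To stay dependency-free the example uses plain Python lists of dicts rather than a pandas DataFrame. The `unix_time_ms` and `vehicle_id` field names, the millisecond unit, and the one-hour staleness threshold are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def treat_records(
    records: List[Dict],
    capture_time: datetime,
    max_age: timedelta = timedelta(hours=1),
) -> List[Dict]:
    """Convert Unix timestamps and keep only records newer than max_age."""
    treated = []
    for record in records:
        # Assumption: the API reports Unix time in milliseconds.
        gps_time = datetime.fromtimestamp(
            record["unix_time_ms"] / 1000, tz=timezone.utc
        )
        # A record is stale when its GPS time is too far behind the capture time.
        if capture_time - gps_time <= max_age:
            treated.append({**record, "timestamp_gps": gps_time})
    return treated


capture = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    {"vehicle_id": "A1", "unix_time_ms": int((capture - timedelta(minutes=5)).timestamp() * 1000)},
    {"vehicle_id": "B2", "unix_time_ms": int((capture - timedelta(hours=3)).timestamp() * 1000)},
]
fresh = treat_records(sample, capture)  # only the record for "A1" survives
```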
def pre_treatment_br_rj_riodejaneiro_onibus_realocacao(status: dict, timestamp: datetime.datetime) ‑> Dict
Basic data treatment for bus GPS relocation data. Converts Unix time to datetime and filters out stale data that may populate the API response.
Args
- status (dict): dict containing the status of the request made to the API. Must contain keys: data and error.
Returns
- df_realocacao: pandas.DataFrame containing the treated data.
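Both pre-treatment tasks take a status dict that must contain the keys data and error. A minimal sketch of how such a contract might be handled is shown below. The pass-through of an upstream error, the shape of the returned dict, and the `treat` callable are assumptions, not the tasks' documented behavior.

```python
from typing import Any, Callable, Dict


def treat_with_status(
    status: Dict[str, Any], treat: Callable[[Any], Any]
) -> Dict[str, Any]:
    """Apply `treat` to status["data"] unless an earlier step already failed."""
    missing = {"data", "error"}.difference(status)
    if missing:
        raise KeyError(f"status is missing keys: {sorted(missing)}")

    # Assumption: an upstream error is passed through without treating the data.
    if status["error"] is not None:
        return {"data": [], "error": status["error"]}

    try:
        return {"data": treat(status["data"]), "error": None}
    except Exception as exc:
        # Assumption: failures during treatment are reported, not raised.
        return {"data": [], "error": str(exc)}
```

Returning the same two keys on every path lets a downstream step check `error` once instead of wrapping each call in its own exception handling.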