Module pipelines.rj_cor.comando.eventos.tasks
Tasks for comando
Functions
def download_data_atividades(first_date, last_date, wait=None) ‑> pandas.core.frame.DataFrame
-
Download data from the API.
def download_data_ocorrencias(first_date, last_date, wait=None) ‑> pandas.core.frame.DataFrame
-
Download data from the API.
def get_date_interval(first_date, last_date) ‑> Tuple[dict, str]
-
If first_date and last_date are provided, format them as DD/MM/YYYY. Otherwise, get data from the last 3 days.
first_date: str YYYY-MM-DD
last_date: str YYYY-MM-DD
def get_redis_df(dataset_id: str, table_id: str, name: str, mode: str = 'prod') ‑> pandas.core.frame.DataFrame
-
Access Redis to get the last saved df and compare it to the current df; return only the rows of the current df that are not already saved.
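The "keep only rows not already saved" comparison can be sketched as a pandas anti-join. This is a minimal illustration, not the module's implementation: `rows_not_in_saved` is a hypothetical helper, and the previously saved frame is passed in directly instead of being read from Redis.

```python
import pandas as pd


def rows_not_in_saved(current: pd.DataFrame, saved: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `current` that do not appear in `saved`.

    Hypothetical helper: assumes both frames share the same columns and dtypes.
    """
    if saved.empty:
        # Nothing was saved before, so every current row is new.
        return current.copy()
    # Left-merge on all shared columns; the `_merge` indicator column
    # records whether each row was also found in `saved`.
    merged = current.merge(saved.drop_duplicates(), how="left", indicator=True)
    new_rows = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
    return new_rows.reset_index(drop=True)
```

Calling it with the current frame and the last saved frame yields only the unseen rows, which is the shape of result the description above refers to.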
def get_redis_max_date(dataset_id: str, table_id: str, name: str = None, mode: str = 'prod') ‑> str
-
Access Redis to get the last saved date and compare it to the current df.
def not_none(something: Any) ‑> bool
-
Returns True if something is not None.
def save_data(dataframe: pandas.core.frame.DataFrame) ‑> Union[str, pathlib.Path]
-
Save data to a CSV file to be uploaded to GCP.
def save_no_partition(dataframe: pandas.core.frame.DataFrame, append: bool = False) ‑> str
-
Saves a dataframe to a temporary directory and returns the path to the directory.
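Writing a dataframe to a fresh temporary directory and returning that directory's path might look like the sketch below. The helper name `save_to_temp_dir` and the file name `data.csv` are assumptions made for illustration, and the `append` option of the documented function is not modeled here.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_to_temp_dir(dataframe: pd.DataFrame, filename: str = "data.csv") -> str:
    """Write `dataframe` as CSV inside a new temporary directory.

    Hypothetical helper: returns the directory path, not the file path.
    """
    # mkdtemp creates a uniquely named directory and returns its path.
    directory = Path(tempfile.mkdtemp())
    dataframe.to_csv(directory / filename, index=False)
    return str(directory)
```

Returning the directory rather than the file lets an upload step treat the whole folder as one unit.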
def save_redis_max_date(dataset_id: str, table_id: str, name: str = None, mode: str = 'prod', redis_max_date: str = None, wait=None)
-
Access Redis to save the last date.
def treat_data_atividades(dfr: pandas.core.frame.DataFrame, dfr_redis: pandas.core.frame.DataFrame, columns: list) ‑> Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]
-
Normalize data to be similar to the old API.
def treat_data_ocorrencias(dfr: pandas.core.frame.DataFrame, redis_max_date: str) ‑> Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]
-
Rename columns and normalize data.
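As an illustration of the date handling described for get_date_interval (YYYY-MM-DD input, DD/MM/YYYY output, last 3 days as the fallback), here is a minimal sketch. The helper `format_interval`, its `today` parameter, and its return value of two strings are assumptions; the documented function returns a `Tuple[dict, str]` whose exact contents are not shown in this reference.

```python
from datetime import date, datetime, timedelta
from typing import Optional, Tuple


def format_interval(
    first_date: Optional[str],
    last_date: Optional[str],
    today: Optional[date] = None,
) -> Tuple[str, str]:
    """Return (start, end) formatted as DD/MM/YYYY.

    Hypothetical helper: `today` exists only to make the fallback testable.
    """
    if first_date and last_date:
        # Both bounds given: parse the YYYY-MM-DD strings.
        start = datetime.strptime(first_date, "%Y-%m-%d").date()
        end = datetime.strptime(last_date, "%Y-%m-%d").date()
    else:
        # Fallback: an interval covering the last 3 days.
        end = today or date.today()
        start = end - timedelta(days=3)
    return start.strftime("%d/%m/%Y"), end.strftime("%d/%m/%Y")
```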