Module pipelines.utils.dump_url.utils
General-purpose tasks for dumping data from URLs.
Functions
def generate_dump_url_schedules(interval: datetime.timedelta, start_date: datetime.datetime, labels: List[str], dataset_id: str, table_parameters: dict, batch_data_type: str = 'csv', runs_interval_minutes: int = 15) -> List[prefect.schedules.clocks.IntervalClock]
Generates multiple schedules for URL dumping, returned as a list of Prefect IntervalClock objects.
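The signature suggests one clock per table, with runs spaced apart by runs_interval_minutes. The following is a minimal sketch of that idea, not the module's actual implementation. It assumes that table_parameters maps a table ID to a dict of per-table parameters and that each clock's start is offset from the previous one. ClockSketch and sketch_dump_url_schedules are hypothetical names, and ClockSketch is a plain stand-in for prefect.schedules.clocks.IntervalClock so the example runs without Prefect installed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class ClockSketch:
    """Hypothetical stand-in for prefect.schedules.clocks.IntervalClock."""

    interval: timedelta
    start_date: datetime
    labels: List[str]
    parameter_defaults: Dict[str, object] = field(default_factory=dict)


def sketch_dump_url_schedules(
    interval: timedelta,
    start_date: datetime,
    labels: List[str],
    dataset_id: str,
    table_parameters: dict,
    batch_data_type: str = "csv",
    runs_interval_minutes: int = 15,
) -> List[ClockSketch]:
    """Build one clock per table, staggering start times (assumed behavior)."""
    clocks = []
    # Assumption: table_parameters looks like {table_id: {param: value, ...}}.
    for count, (table_id, parameters) in enumerate(table_parameters.items()):
        defaults = {
            "dataset_id": dataset_id,
            "table_id": table_id,
            "batch_data_type": batch_data_type,
            **parameters,
        }
        # Offset each table's first run so the runs do not all start at once.
        offset = timedelta(minutes=runs_interval_minutes * count)
        clocks.append(
            ClockSketch(
                interval=interval,
                start_date=start_date + offset,
                labels=labels,
                parameter_defaults=defaults,
            )
        )
    return clocks


# Usage with made-up dataset, table, and URL values.
example_clocks = sketch_dump_url_schedules(
    interval=timedelta(days=1),
    start_date=datetime(2024, 1, 1),
    labels=["example-agent"],
    dataset_id="example_dataset",
    table_parameters={
        "table_a": {"url": "https://example.com/a.csv"},
        "table_b": {"url": "https://example.com/b.csv"},
    },
)
```

Staggering the start dates is one way to keep many dumps from hitting the same agent simultaneously; whether the real function does exactly this should be checked against its source.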
def handle_dataframe_chunk(dataframe: pandas.core.frame.DataFrame, save_path: str, partition_columns: List[str], event_id: str, idx: int, build_json_dataframe: bool = False, dataframe_key_column: str = None)
Handles a single chunk of a DataFrame.