Module pipelines.utils.dump_url.utils

General-purpose tasks for dumping data from URLs.

Functions

def generate_dump_url_schedules(interval: datetime.timedelta, start_date: datetime.datetime, labels: List[str], dataset_id: str, table_parameters: dict, batch_data_type: str = 'csv', runs_interval_minutes: int = 15) -> List[prefect.schedules.clocks.IntervalClock]

Generates a list of IntervalClock schedules for URL dumping.
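
The sketch below illustrates one way such a list of schedules could be assembled. It is not the module's implementation. It assumes that table_parameters maps each table ID to a dict of per-table parameters, and that runs_interval_minutes staggers the start of each table's schedule; neither is confirmed by the signature. ClockSketch and sketch_dump_url_schedules are hypothetical names, and ClockSketch is only a stand-in for prefect.schedules.clocks.IntervalClock so that the example runs without Prefect installed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class ClockSketch:
    """Hypothetical stand-in for prefect.schedules.clocks.IntervalClock."""

    interval: timedelta
    start_date: datetime
    labels: List[str]
    parameter_defaults: Dict[str, object] = field(default_factory=dict)


def sketch_dump_url_schedules(
    interval: timedelta,
    start_date: datetime,
    labels: List[str],
    dataset_id: str,
    table_parameters: Dict[str, dict],
    batch_data_type: str = "csv",
    runs_interval_minutes: int = 15,
) -> List[ClockSketch]:
    """Build one clock per table, offsetting each start date (assumed behavior)."""
    clocks = []
    for count, (table_id, parameters) in enumerate(table_parameters.items()):
        # Per-table parameters are merged over the shared defaults.
        defaults = {
            "dataset_id": dataset_id,
            "table_id": table_id,
            "batch_data_type": batch_data_type,
            **parameters,
        }
        # Assumption: each table starts runs_interval_minutes after the previous one.
        offset = timedelta(minutes=runs_interval_minutes * count)
        clocks.append(
            ClockSketch(
                interval=interval,
                start_date=start_date + offset,
                labels=labels,
                parameter_defaults=defaults,
            )
        )
    return clocks
```

Under these assumptions, two tables with the default runs_interval_minutes would get clocks whose start dates are 15 minutes apart, which keeps their runs from starting at the same moment.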

def handle_dataframe_chunk(dataframe: pandas.core.frame.DataFrame, save_path: str, partition_columns: List[str], event_id: str, idx: int, build_json_dataframe: bool = False, dataframe_key_column: str = None)

Handles a single chunk of a dataframe.
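
The docstring does not say what handling a chunk involves. As a rough sketch only, the example below assumes that the chunk is split by partition_columns and written under save_path in Hive-style directories (column=value), with event_id and idx used to build a unique file name. The function name sketch_handle_chunk, the directory layout, and the file naming are all assumptions, and the build_json_dataframe and dataframe_key_column options are left out.

```python
from pathlib import Path
from typing import List

import pandas as pd


def sketch_handle_chunk(
    dataframe: pd.DataFrame,
    save_path: str,
    partition_columns: List[str],
    event_id: str,
    idx: int,
) -> List[Path]:
    """Write one CSV file per partition of the chunk (assumed behavior)."""
    written = []
    for keys, group in dataframe.groupby(partition_columns):
        # Depending on the pandas version, a single grouping column
        # may yield a scalar key instead of a 1-tuple.
        if not isinstance(keys, tuple):
            keys = (keys,)
        partition_dir = Path(save_path)
        for column, value in zip(partition_columns, keys):
            partition_dir = partition_dir / f"{column}={value}"
        partition_dir.mkdir(parents=True, exist_ok=True)
        # Assumption: event_id and idx make the file name unique per chunk.
        file_path = partition_dir / f"{event_id}-{idx}.csv"
        group.drop(columns=partition_columns).to_csv(file_path, index=False)
        written.append(file_path)
    return written
```

Because the file name includes idx, successive chunks of the same download would land in the same partition directories without overwriting each other.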