Azure Table Storage
Azure Table Storage is a NoSQL key-value store built for rapid development against massive semi-structured datasets. It can hold petabytes of structured data gathered from various source platforms.
Azure Table Storage works well for flexible datasets: it lets users build cloud applications without locking the data model into a particular schema.
Sprinkle supports a wide range of data sources. Clicking the “+” sign opens a list of data sources. Select Azure Table Storage, then name and create the new data source.
After the data source is named, the Configure tab requires the user to fill in the connection string and select the table type (Azure Table/Azure Cosmos Table) before testing the connection and updating it.
Users can also select Yes or No for Optimize Incremental Ingestion. If it is set to Yes, all datasets undergo a full ingestion every Sunday. If it is set to No, data is ingested incrementally and a complete ingestion never runs.
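An Azure Storage connection string is a semicolon-separated list of `Key=Value` pairs. As a minimal sketch for sanity-checking the string before pasting it into the Configure tab, the snippet below parses one with the Python standard library. The function name `parse_connection_string`, the list of required keys, and the sample account values are illustrative assumptions, not part of Sprinkle.

```python
REQUIRED_KEYS = ("AccountName", "AccountKey")  # assumed minimum for key-based auth


def parse_connection_string(conn_str: str) -> dict:
    """Split a 'Key=Value;Key=Value' connection string into a dict."""
    parts = {}
    for segment in conn_str.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: base64 account keys end in '=' padding.
        key, sep, value = segment.partition("=")
        if not sep or not key:
            raise ValueError(f"Malformed segment: {segment!r}")
        parts[key] = value
    missing = [k for k in REQUIRED_KEYS if k not in parts]
    if missing:
        raise ValueError(f"Missing keys: {', '.join(missing)}")
    return parts


# Placeholder values only; never commit a real account key.
example = (
    "DefaultEndpointsProtocol=https;AccountName=myaccount;"
    "AccountKey=abc123==;EndpointSuffix=core.windows.net"
)
print(parse_connection_string(example)["AccountName"])
```

A check like this only catches formatting mistakes. The Test Connection step in the Configure tab is still what confirms that the credentials actually work.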
In Datasets, the user must select the table from the drop-down and then choose between complete and incremental ingestion. For incremental ingestion, the time column name must also be specified. Complete ingestion does not need a time column.
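The time column is what lets an incremental run fetch only the rows changed since the previous run. The sketch below builds an OData filter of the kind the Azure Table service accepts for datetime comparisons. How Sprinkle queries the table internally is not stated here, so `build_incremental_filter` and the choice of `Timestamp` as the time column are illustrative assumptions.

```python
from datetime import datetime, timezone


def build_incremental_filter(time_column: str, last_run: datetime) -> str:
    """Return an OData filter selecting rows newer than last_run."""
    if last_run.tzinfo is None:
        raise ValueError("last_run must be timezone-aware")
    # Normalise to UTC and format as an ISO 8601 timestamp with a 'Z' suffix.
    stamp = last_run.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{time_column} gt datetime'{stamp}'"


last_run = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(build_incremental_filter("Timestamp", last_run))
# prints: Timestamp gt datetime'2024-01-01T00:00:00Z'
```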
Tables can be ingested in four ways.
- Incremental loading with Start Date
- Incremental loading with No of days
- Complete loading with Start Date
- Complete loading with No of days
Incremental loading with Start Date
With this option, the first run pulls the complete data from the given Start Date, and subsequent runs on weekdays pull data incrementally. Every Sunday morning, a complete load runs again from the Start Date, depending on the Optimize Incremental Ingestion setting.
Incremental loading with No of days
With this option, the first run pulls data for the given number of days, and subsequent runs on weekdays pull data incrementally. Every Sunday morning, a complete load runs again for the given number of days, depending on the Optimize Incremental Ingestion setting. Unlike the Start Date option, it does not pull older data, because the ingestion window is based on the number of days.
Complete loading with Start Date
With this option, every run loads data from the given Start Date.
Complete loading with No of days
With this option, every run loads data for the given No of days.
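The four modes can be summarised as a small decision function. This is a sketch of one reading of the rules described above, not Sprinkle's actual scheduler: `DatasetConfig`, `window_start`, and `plan_run` are hypothetical names, and the handling of the first run and of Sundays is an assumption drawn from the descriptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple

SUNDAY = 6  # date.weekday(): Monday is 0, Sunday is 6


@dataclass
class DatasetConfig:
    incremental: bool                  # False means complete loading
    start_date: Optional[date] = None  # set either start_date or no_of_days
    no_of_days: Optional[int] = None
    optimize: bool = True              # Optimize Incremental Ingestion = Yes


def window_start(cfg: DatasetConfig, today: date) -> date:
    """Earliest date a complete load should cover."""
    if cfg.start_date is not None:
        return cfg.start_date
    if cfg.no_of_days is not None:
        return today - timedelta(days=cfg.no_of_days)
    raise ValueError("Set either start_date or no_of_days")


def plan_run(cfg: DatasetConfig, today: date, first_run: bool) -> Tuple[str, Optional[date]]:
    """Return ("complete", window start) or ("incremental", None)."""
    sunday_reload = cfg.optimize and today.weekday() == SUNDAY
    if not cfg.incremental or first_run or sunday_reload:
        return "complete", window_start(cfg, today)
    return "incremental", None


cfg = DatasetConfig(incremental=True, no_of_days=30)
print(plan_run(cfg, date(2024, 1, 3), first_run=False))  # a Wednesday
print(plan_run(cfg, date(2024, 1, 7), first_run=False))  # a Sunday
```

With No of days, the window slides forward with each complete load, which is why older data is not pulled again. With a Start Date, the window always begins at the same fixed date.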
In the Ingestion Jobs tab, the concurrency (the number of tables that can run in parallel, up to a maximum of 7) can be set as needed before running the job. Once the job is complete, its status is updated in the tab below. Jobs can also be set to run automatically by enabling autorun. By default, the frequency is set to every night. The frequency can be changed by clicking More --> Autorun --> Change Frequency.
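To illustrate what the concurrency setting means, the sketch below runs a hypothetical per-table ingestion function over a worker pool capped at 7. `ingest_table` and `run_job` are stand-ins invented for this example. Only the limit of 7 comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

MAX_CONCURRENCY = 7  # upper limit stated above


def ingest_table(name: str) -> str:
    """Hypothetical stand-in for ingesting a single table."""
    return f"{name}: done"


def run_job(tables: List[str], concurrency: int) -> List[str]:
    """Ingest tables in parallel, clamping concurrency to the range 1..7."""
    workers = max(1, min(concurrency, MAX_CONCURRENCY))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input tables.
        return list(pool.map(ingest_table, tables))


print(run_job(["orders", "customers", "events"], concurrency=10))
```

A higher concurrency shortens the job when there are many tables, at the cost of more simultaneous requests against the storage account.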