Sprinkle

S3

Amazon S3 is a common primary storage platform for a data lake because of its virtually unlimited scalability. Sprinkle can use an S3 bucket as a cloud storage data source and ingest the files stored in it.

Sprinkle supports a wide range of data sources. Click the "+" sign to open the list of data sources, select S3, and give the new S3 data source a name.

After the data source is named, the connection tab asks for the Access Key, Secret Key, Region, and Bucket Name. Test the connection to check that the credentials are valid before updating.

If you allow access-key-based access to your storage, follow the documentation at https://amzn.to/2CS9OcK to generate an access key and secret key, and enter them in the connection details.

Region is the AWS region where the storage bucket was created, for example ap-south-1. Bucket Name is the name of the bucket created on AWS S3, e.g. Twx-Bucket.
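
Before testing the connection in the UI, the four fields can be sanity-checked locally. The following is a minimal sketch, not Sprinkle's own validation: the `S3Connection` class, the `missing_or_malformed` helper, and the region pattern are all assumptions made for illustration. Only the connection test in Sprinkle can confirm that the credentials actually work.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Connection:
    """Hypothetical container for the four connection fields."""
    access_key: str
    secret_key: str
    region: str
    bucket_name: str


def missing_or_malformed(conn: S3Connection) -> list[str]:
    """Return a list of problems; an empty list means the fields look plausible."""
    problems = []
    for field_name in ("access_key", "secret_key", "region", "bucket_name"):
        if not getattr(conn, field_name).strip():
            problems.append(f"{field_name} is empty")
    # Assumed shape of a region code such as ap-south-1: area, direction, number.
    if conn.region and not re.fullmatch(r"[a-z]{2}(-[a-z]+)+-\d", conn.region):
        problems.append(
            f"region {conn.region!r} does not look like an AWS region code"
        )
    return problems
```

A region entered as a city name (for instance "Mumbai" instead of ap-south-1) would be flagged by this check, while the example values from the text pass.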

Under Datasets, specify a table name and select the ingestion mode: complete or incremental. Complete ingestion loads all the data on every run, irrespective of what was ingested before, which takes significant time when the data is large. Incremental ingestion loads only the data that is new since the last ingestion.
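
The difference between the two modes can be sketched as follows. This is an illustration only: the text does not describe how Sprinkle detects new data, so the sketch assumes a simple "last modified after the previous run" rule. `SourceFile`, `files_to_ingest`, and the mode strings are hypothetical names.

```python
from __future__ import annotations

from datetime import datetime
from typing import NamedTuple, Optional


class SourceFile(NamedTuple):
    key: str                 # object key inside the bucket
    last_modified: datetime  # when the object was last written


def files_to_ingest(
    files: list[SourceFile],
    mode: str,
    last_run: Optional[datetime] = None,
) -> list[str]:
    """Pick the keys a run would load under the assumed rule."""
    if mode == "complete":
        # Complete ingestion: everything, regardless of earlier runs.
        return [f.key for f in files]
    if mode == "incremental":
        if last_run is None:
            # First run: nothing has been loaded yet, so load everything.
            return [f.key for f in files]
        # Later runs: only files written after the previous run.
        return [f.key for f in files if f.last_modified > last_run]
    raise ValueError(f"unknown ingestion mode: {mode!r}")
```

With two files, one written before and one after the previous run, the complete mode selects both and the incremental mode selects only the newer one.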

After selecting the ingestion mode, select the File Type: ORC, JSON, CSV, or PARQUET. Optionally, define a directory path to pull data from, so that all the files in that path are ingested, e.g. s3a://test-sprinkle-a/s3Ingest/s3Ingest13
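
To make the directory path concrete, the sketch below splits a path like the one above into a bucket and a key prefix, then selects the matching files. It is not Sprinkle's implementation: `split_s3_path`, `select_keys`, and the assumption that files carry an extension matching the File Type are all illustrative.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Assumed mapping from the File Type option to a file extension.
EXTENSIONS = {"ORC": ".orc", "JSON": ".json", "CSV": ".csv", "PARQUET": ".parquet"}


def split_s3_path(path: str) -> tuple[str, str]:
    """Split s3a://bucket/dir/subdir into (bucket, 'dir/subdir')."""
    parts = urlsplit(path)
    if parts.scheme not in ("s3", "s3a") or not parts.netloc:
        raise ValueError(f"not an S3 path: {path!r}")
    return parts.netloc, parts.path.strip("/")


def select_keys(keys: list[str], prefix: str, file_type: str) -> list[str]:
    """Keep keys under the directory prefix that match the file type."""
    extension = EXTENSIONS[file_type]
    # A trailing slash stops 's3Ingest13' from also matching 's3Ingest130'.
    directory = prefix.rstrip("/") + "/" if prefix else ""
    return [
        key for key in keys
        if key.startswith(directory) and key.lower().endswith(extension)
    ]
```

For the example path, `split_s3_path` yields the bucket `test-sprinkle-a` and the prefix `s3Ingest/s3Ingest13`; `select_keys` then keeps only the files of the chosen type inside that directory.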

In the Ingestion jobs tab, the concurrency (the number of tables that can run in parallel, up to a maximum of 7) can optionally be set before running the job. Once the job completes, its status is updated in the tab below. Jobs can also run automatically by enabling autorun. By default, the frequency is set to every night; to change it, click More --> Autorun --> Change Frequency.
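
The effect of the concurrency setting can be pictured with a bounded worker pool. This is only a sketch of the idea, assuming a thread pool; how Sprinkle schedules tables internally is not described here, and `run_ingestion` and `ingest_table` are hypothetical names.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable

MAX_CONCURRENCY = 7  # upper limit stated in the text


def run_ingestion(
    tables: list[str],
    ingest_table: Callable[[str], str],
    concurrency: int = 1,
) -> dict[str, str]:
    """Ingest tables with at most `concurrency` running at the same time."""
    if not 1 <= concurrency <= MAX_CONCURRENCY:
        raise ValueError(f"concurrency must be between 1 and {MAX_CONCURRENCY}")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() yields results in input order, so statuses line up with tables.
        statuses = pool.map(ingest_table, tables)
        return dict(zip(tables, statuses))
```

A higher concurrency shortens the total run when there are many tables, at the cost of more simultaneous load on the source and the warehouse.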

Sprinkle supports several delimiters for CSV files ingested from cloud storage. When CSV is chosen as the file type, the CSV-specific drop-downs appear.

The delimiter drop-down offers comma, tab, pipe, dash, or other.

If OTHER_CHARACTER is chosen as the CSV delimiter type, one more field appears in which the delimiter symbol can be entered.
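
To preview how a file will split under a given delimiter choice, a short local check can help. In the sketch below only OTHER_CHARACTER is an option name taken from the text; the other keys (COMMA, TAB, PIPE, DASH) and the helper functions are assumed for illustration. It uses Python's standard `csv` module, which accepts a single-character delimiter.

```python
from __future__ import annotations

import csv
import io
from typing import Optional

# Assumed option names; only OTHER_CHARACTER appears in the text.
DELIMITERS = {"COMMA": ",", "TAB": "\t", "PIPE": "|", "DASH": "-"}


def resolve_delimiter(choice: str, other_character: Optional[str] = None) -> str:
    """Translate a drop-down choice into the actual delimiter character."""
    if choice == "OTHER_CHARACTER":
        if not other_character or len(other_character) != 1:
            raise ValueError("OTHER_CHARACTER needs exactly one symbol")
        return other_character
    return DELIMITERS[choice]


def parse_rows(
    text: str, choice: str, other_character: Optional[str] = None
) -> list[list[str]]:
    """Split CSV text into rows using the chosen delimiter."""
    delimiter = resolve_delimiter(choice, other_character)
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))
```

For example, `parse_rows("id|name\n1|Asha\n", "PIPE")` splits each line on the pipe symbol, and a semicolon-separated file would be read with `parse_rows(text, "OTHER_CHARACTER", ";")`.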

Copyright © 2021 Sprinkle data