Security at Sprinkle

Overview

Sprinkle Data provides a secure environment for customers and keeps all data safe by following industry-standard security practices. Security by design is applied when building the product, including every new feature and improvement. The following sections document the various facets of security at Sprinkle.

Data retention

Customer Metadata

Customers' user data is stored in a database and is deleted immediately when the customer’s organization is removed.

Data points for storage or warehouse drivers and datasource connectors, as well as metadata for reports, queries and dashboards, are stored in a separate database for each organization. Each entry is removed immediately when the corresponding data point is removed or when the organization is removed.

Customer Data

Customer data from the customer's warehouse is never stored in Sprinkle infrastructure. Customer data ingested through Sprinkle is likewise never stored in Sprinkle infrastructure, for any datasource other than Webhooks. When a customer creates a project with Sprinkle, the project is associated with a storage. All imported data, query result outputs and any other temporary files are stored in that customer storage.

In the case of webhook-based ingestion, data is stored on Sprinkle infrastructure and is promoted to the customer’s storage every 5 minutes.

Encryption at rest

Sprinkle infrastructure runs on cloud infrastructure (GCP, Azure or AWS), which provides encryption at rest by default for data stored in databases and on managed disks.

References:

https://cloud.google.com/security/encryption-at-rest

https://docs.microsoft.com/en-us/azure/mysql/concepts-security#information-protection-and-encryption

https://docs.microsoft.com/en-us/azure/virtual-machines/disk-encryption

https://aws.amazon.com/rds/features/security/

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html#encryption-by-default

Network security / Encryption in motion

Web portal / API

  • All connections to the web portal or API are encrypted by default using TLS 1.x.
  • Any attempt to connect over an unencrypted (HTTP) channel is redirected to the encrypted channel (HTTPS).
  • All connections over TLS are authenticated from the origin server.
  • All connections and API calls are protected by a Web Application Firewall (WAF) to prevent attacks that exploit known vulnerabilities.
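
A client calling the API can also refuse to negotiate older protocol versions on its own side. The sketch below is a minimal illustration using Python's standard `ssl` module. The TLS 1.2 floor is an assumption chosen for the example, since the list above only states TLS 1.x, and `build_tls_context` is a hypothetical helper name.

```python
import ssl

def build_tls_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Return a client-side SSL context that verifies the server certificate
    and rejects protocol versions older than `minimum` (assumed floor)."""
    context = ssl.create_default_context()  # certificate and hostname checks on
    context.minimum_version = minimum
    return context

context = build_tls_context()
print(context.minimum_version.name)
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Such a context can be passed to `urllib.request.urlopen(url, context=context)` when calling an HTTPS endpoint.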

Data through connectors

  • Data from all SaaS based connectors is pulled through encrypted channels (HTTPS).
  • Customers can whitelist specific Sprinkle IPs to access data, instead of keeping the connections open to the public. Customers can also create SSH proxy tunnels instead of whitelisting IPs, to provide access to Sprinkle.
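
The allowlisting idea can be sketched conceptually as follows. This is only an illustration of the check a firewall rule performs, not Sprinkle's or any firewall's actual implementation. The addresses are placeholders from the reserved documentation ranges and stand in for the Sprinkle IPs a customer would be given; `is_allowed` is a hypothetical helper.

```python
import ipaddress

# Hypothetical allowlist: documentation-range addresses used as placeholders.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.10/32"),
    ipaddress.ip_network("198.51.100.0/28"),
]

def is_allowed(source_ip, networks=ALLOWED_NETWORKS):
    """Return True if source_ip falls inside any allowlisted network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in networks)

print(is_allowed("203.0.113.10"))   # listed single address
print(is_allowed("198.51.100.7"))   # inside the /28 block
print(is_allowed("192.0.2.44"))     # not listed
```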

Infrastructure

  • The entire production environment sits in a separate Virtual Private Network with strict firewall controls over incoming and outgoing traffic.

Credential Management

Data points for storage or warehouse drivers and datasource connectors can include keys, secrets and passwords. These are stored in encrypted form in a separate database for each organization.

Sprinkle Data requires READ permission to read data from connectors, WRITE/DELETE permission in warehouses to create and drop tables, and WRITE/DELETE permission in storage to create and delete data. If a customer grants higher permissions than required, Sprinkle Data does not use them.

User authentication

A customer’s organization in Sprinkle can choose password-based authentication or OAuth-based single sign-on authentication for its users.

Password based authentication

  • Sprinkle enforces a password policy that requires strong passwords: at least 8 characters long, with at least 1 number, 1 special character and 1 capital letter.
  • Sprinkle enforces an account lockout policy that locks a user account after 3 failed login attempts.
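
The two rules above can be sketched as follows. This is an illustrative sketch under stated assumptions, not Sprinkle's implementation: "special character" is taken to mean any character in `string.punctuation`, failed attempts are counted consecutively and reset on a successful login, and `meets_password_policy` and `LoginAttempts` are hypothetical names.

```python
import string

MIN_LENGTH = 8
MAX_FAILED_ATTEMPTS = 3

def meets_password_policy(password):
    """Check the length, digit, special character and capital letter rules."""
    return (
        len(password) >= MIN_LENGTH
        and any(ch.isdigit() for ch in password)
        and any(ch in string.punctuation for ch in password)  # assumed set
        and any(ch.isupper() for ch in password)
    )

class LoginAttempts:
    """Counts consecutive failed logins per user (in memory, illustration only)."""

    def __init__(self):
        self.failed = {}

    def record(self, user, success):
        # Assumption: a successful login resets the counter.
        self.failed[user] = 0 if success else self.failed.get(user, 0) + 1

    def is_locked(self, user):
        return self.failed.get(user, 0) >= MAX_FAILED_ATTEMPTS

attempts = LoginAttempts()
for _ in range(3):
    attempts.record("alice", success=False)
print(meets_password_policy("Str0ng!pass"), attempts.is_locked("alice"))
```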

OAuth based single sign-on

A customer’s users can sign in using single sign-on provided by Google or Microsoft.

Access control and Audit

Sprinkle provides admin-level and user-level controls, with different roles and permissions, to achieve fine-grained control over access to data. More documentation is available at User permissions and Restrictions.

No developer or engineer has access to any customer’s organization, other than technical support (only qualified staff are allowed). Technical support helps customers with debugging or setup for reported issues. Support logins are protected through single sign-on with Google, with MFA enabled.

All activities on Sprinkle resources are recorded in the audit log. More details are in the activity docs.

Human Access to Infrastructure

  • Developers and engineers helping with customer issues do not have access to any Sprinkle infrastructure other than application logs.
  • Only qualified staff are allowed to access Sprinkle infrastructure for deployments, upgrades or configuration changes to the environment.
  • Login to VMs is possible only with SSH keys; password-based login is not allowed. Logins are permitted only from a jump box, which is itself reachable only from within the VPN.
  • At Sprinkle Data, access to all infrastructure is protected through MFA.

PII handling

Sprinkle Data itself does not handle PII (personally identifiable information) separately; it is treated like any other data that a customer ingests. Sprinkle provides features that let customers exclude columns or mask columns with a 1-way hash to avoid loading PII into their warehouse.
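
To make the exclude and mask options concrete, here is a minimal sketch of the idea applied to a single row. It is not Sprinkle's implementation: the choice of SHA-256 as the 1-way hash is an assumption (the text does not name the algorithm), and `mask_value`, `sanitize_row` and the sample row are hypothetical.

```python
import hashlib

def mask_value(value):
    """Return the SHA-256 hex digest of a value (assumed hash function)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()

def sanitize_row(row, mask=(), exclude=()):
    """Drop excluded columns and replace masked columns with their hash."""
    return {
        column: mask_value(value) if column in mask else value
        for column, value in row.items()
        if column not in exclude
    }

row = {"id": 7, "email": "user@example.com", "phone": "555-0100", "plan": "pro"}
clean = sanitize_row(row, mask={"email"}, exclude={"phone"})
print(clean)
```

Because the same input always produces the same digest, a masked column can still be used for joins and distinct counts without exposing the original value.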

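import requests
from requests.auth import HTTPBasicAuth

# Replace the <...> placeholders with your own values.
auth = HTTPBasicAuth("<API_KEY>", "<API_SECRET>")
# Pass auth by keyword: the second positional argument of requests.get is params.
response = requests.get("https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE_ID>", auth=auth)

print(response.content)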

library('httr')

username = '<API KEY>'
password = '<API SECRET>'

temp = GET("https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>",
           authenticate(username,password, type = "basic"))

temp = content(temp, 'text')
temp = textConnection(temp)
temp = read.csv(temp)

/*Download the Data*/

filename resp temp;
proc http
url="https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>"
   method= "GET"  
   WEBUSERNAME = "<API KEY>"
   WEBPASSWORD = "<API SECRET>"
   out=resp;
run;

/*Import the CSV response into a dataset*/
proc import
   file=resp
   out=csvresp
   dbms=csv;
run;

/*Print the data */
PROC PRINT DATA=csvresp;
RUN;

import requests

# Use HTTPS so the API key and secret are not sent in clear text.
url = 'https://<hostname>/api/v0.4/createCSV'

username = 'API_KEY'
password = 'API_SECRET'

values = {'projectname': 'PROJECT_NAME', 'name': 'CSV_DATASOURCE_NAME'}

# Open the file in a with-block so it is closed after the upload.
with open('FILE_PATH.csv', 'rb') as f:
    r = requests.post(url, files={'file': f}, data=values, auth=(username, password))

res_json = r.json()

print(res_json['success'])

import requests

# Use HTTPS so the API key and secret are not sent in clear text.
url = 'https://<hostname>/api/v0.4/updateCSV'

username = 'API_KEY'
password = 'API_SECRET'

values = {'projectname': 'PROJECT_NAME', 'name': 'CSV_DATASOURCE_NAME'}

# Open the file in a with-block so it is closed after the upload.
with open('FILE_PATH.csv', 'rb') as f:
    r = requests.post(url, files={'file': f}, data=values, auth=(username, password))

res_json = r.json()

print(res_json['success'])

import requests

url='https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>'

username='API_KEY'
password='API_SECRET'

r=requests.get(url,auth=(username,password))
print(r)
print(r.text)

import requests
import pandas as pd

url='https://<hostname>/api/v0.4/explores/infoByFolder/<SPACE_ID>'

username='API_KEY'
password='API_SECRET'

r=requests.get(url,auth=(username,password)).json()
df = pd.DataFrame(r)
print(df)

import requests
import pandas as pd

url='https://<hostname>/api/v0.4/folders/byOrgName/<ORG_NAME>'

username='API_KEY'
password='API_SECRET'

r=requests.get(url,auth=(username,password)).json()
df = pd.DataFrame(r)
print(df.loc[:,['name','id']])

import io

import pandas as pd
import requests

url = 'https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>'

secret = 'API_SECRET'

# Authenticate with the SprinkleUserKeys header instead of basic auth.
r = requests.get(url, headers={'Authorization': 'SprinkleUserKeys ' + secret})

# Parse the CSV response body into a DataFrame.
df = pd.read_csv(io.StringIO(r.text), sep=',')

import io

import pandas as pd
import requests

url = 'https://<hostname>/api/v0.4/segment/streamresult/<SEGMENT ID>'

secret = 'API_SECRET'

# Authenticate with the SprinkleUserKeys header instead of basic auth.
r = requests.get(url, headers={'Authorization': 'SprinkleUserKeys ' + secret})

# Parse the CSV response body into a DataFrame.
df = pd.read_csv(io.StringIO(r.text), sep=',')

import requests

url = 'https://<hostname>/api/v0.4/createCSV'

values = {'projectname': 'PROJECT_NAME', 'name': 'csv_datasource_name/table_name'}

secret = 'API_SECRET'

# Open the file in binary mode and close it once the upload finishes.
with open('path/file.csv', 'rb') as f:
    r = requests.post(url, files={'file': f}, data=values,
                      headers={'Authorization': 'SprinkleUserKeys ' + secret})

res_json = r.json()

import requests

url = 'https://<hostname>/api/v0.4/updateCSV'

values = {'projectname': 'PROJECT_NAME', 'name': 'csv_datasource_name/table_name'}

secret = 'API_SECRET'

# Open the file in binary mode and close it once the upload finishes.
with open('path/file.csv', 'rb') as f:
    r = requests.post(url, files={'file': f}, data=values,
                      headers={'Authorization': 'SprinkleUserKeys ' + secret})

res_json = r.json()