Jupyter Notebook

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, narrative text, equations, and visualizations.

It is widely used for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, and machine learning.

  • Jupyter supports over 40 programming languages, including Python, R, and PySpark.
  • Your code can produce rich, interactive output: HTML, images, videos, LaTeX, and custom MIME types.
  • Leverage big data tools, such as Apache Spark, from Python, R, and PySpark. Explore the same data with pandas, scikit-learn, ggplot2, and TensorFlow.

Notebooks is a new feature in Sprinkle. Clicking it routes the user to a screen where a new notebook can be created; the notebook name and type must be selected before creating one.

   

Create Jupyter

   

To start the notebook and run your scripts, click the “Start” button. Once the notebook has started, you can import the libraries you need.

   

Start Notebook

   

How to import data from Sprinkle’s Explore and Segment reports into the notebook?

Sprinkle provides a library named “sprinkleSdk” for importing data from reports.

Use the scripts below to import the library and load report data into a data frame.

Import the Sprinkle SDK:

from sprinkleSdk import SprinkleSdk as sp

Read segment:

df = sp.read_segment('<segment_id>')


Read explore:

df = sp.read_explore('<explore_id>')

Once the data is imported, you can run any kind of analysis on it: descriptive, predictive, prescriptive, or diagnostic.
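For example, a minimal descriptive-analysis sketch with pandas. The small in-memory frame below is a hypothetical stand-in for the result of `sp.read_segment`, so the snippet is self-contained:

```python
import pandas as pd

# Hypothetical sample frame; in a real notebook this would come from
# df = sp.read_segment('<segment_id>')
df = pd.DataFrame({
    "region": ["north", "south", "north", "west"],
    "revenue": [1200.0, 950.0, 1430.0, 780.0],
})

# Descriptive analysis: summary statistics plus a group-wise rollup.
summary = df["revenue"].describe()
by_region = df.groupby("region")["revenue"].sum()

print(summary)
print(by_region)
```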

How to create a table and update an existing table in Sprinkle post-analysis?

Create table in warehouse using dataframe:

sp.create_table(user_defined_name, df)

Update existing table in warehouse:

sp.update_table(user_defined_name, df)

How to work with Spark session operations?

Get a Spark session with the default configuration:

spark = sp.getOrCreate()

Change the Spark app name while creating the default Spark session:

spark = sp.appName('some-name').getOrCreate()

Get a Spark session where you can customise the configuration:

spark = sp.sparkBuilder() \
           .appName('some-name') \
           .config("spark.some.config.option1", "some-value") \
           .config("spark.some.config.option2", "some-value") \
           .getOrCreate()

import requests
from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('<API_KEY>', '<API_SECRET>')
response = requests.get("https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE_ID>", auth=auth)

print(response.content)
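The streamresult endpoint returns the report as CSV text. A minimal sketch of parsing such a body with the standard library (the hard-coded sample below is a hypothetical stand-in for `response.content`):

```python
import csv
import io

# Hypothetical sample body standing in for the CSV text returned by
# the streamresult endpoint.
body = "order_id,amount\n101,250.0\n102,99.5\n"

# Parse each data row into a dict keyed by the header columns.
rows = list(csv.DictReader(io.StringIO(body)))

print(len(rows))
print(rows[0])
```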

library('httr')

username = '<API KEY>'
password = '<API SECRET>'

temp = GET("https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>",
           authenticate(username,password, type = "basic"))

temp = content(temp, 'text')
temp = textConnection(temp)
temp = read.csv(temp)

/*Download the Data*/

filename resp temp;
proc http
url="https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>"
   method= "GET"  
   WEBUSERNAME = "<API KEY>"
   WEBPASSWORD = "<API SECRET>"
   out=resp;
run;

/*Import the data in to csv dataset*/
proc import
   file=resp
   out=csvresp
   dbms=csv;
run;

/*Print the data */
PROC PRINT DATA=csvresp;
RUN;

import requests
import json

url='http://hostname/api/v0.4/createCSV'

username='API_KEY'
password='API_SECRET'

files={'file':open('FILE_PATH.csv','rb')}
values={'projectname':'PROJECT_NAME','name':'CSV_DATASOURCE_NAME'}

r=requests.post(url, files=files, data=values, auth=(username,password))

res_json=json.loads(r.text)

print(res_json['success'])
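Before reading the success flag, it is worth guarding against a non-JSON or error response. A minimal handling sketch (the sample string below is a hypothetical stand-in for `r.text`; the field names beyond 'success' are assumptions):

```python
import json

# Hypothetical response body standing in for r.text.
response_text = '{"success": true, "message": "datasource created"}'

res_json = json.loads(response_text)

# Branch on the success flag rather than assuming the upload worked.
if res_json.get("success"):
    print("upload ok:", res_json.get("message"))
else:
    print("upload failed:", res_json)
```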

import requests
import json

url='http://hostname/api/v0.4/updateCSV'

username='API_KEY'
password='API_SECRET'

files={'file':open('FILE_PATH.csv','rb')}
values={'projectname':'PROJECT_NAME','name':'CSV_DATASOURCE_NAME'}

r=requests.post(url, files=files, data=values, auth=(username,password))

res_json=json.loads(r.text)

print(res_json['success'])

import requests

url='https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>'

username='API_KEY'
password='API_SECRET'

r=requests.get(url,auth=(username,password))
print(r)
print(r.text)

import requests
import pandas as pd

url='https://<hostname>/api/v0.4/explores/infoByFolder/<SPACE_ID>'

username='API_KEY'
password='API_SECRET'

r=requests.get(url,auth=(username,password)).json()
df = pd.DataFrame(r)
print(df)

import requests
import pandas as pd

url='https://<hostname>/api/v0.4/folders/byOrgName/<ORG_NAME>'

username='API_KEY'
password='API_SECRET'

r=requests.get(url,auth=(username,password)).json()
df = pd.DataFrame(r)
print(df.loc[:,['name','id']])

import requests
import pandas as pd
import io

url='https://<hostname>/api/v0.4/explore/streamresult/<EXPLORE ID>'

secret='API_SECRET'

r=requests.get(url, headers={'Authorization': 'SprinkleUserKeys ' + secret})

df = pd.read_csv(io.StringIO(r.text), sep=',')

import requests
import pandas as pd
import io

url='https://<hostname>/api/v0.4/segment/streamresult/<SEGMENT ID>'

secret='API_SECRET'

r=requests.get(url, headers={'Authorization': 'SprinkleUserKeys ' + secret})

df = pd.read_csv(io.StringIO(r.text), sep=',')

import requests
import json

url='http://hostname/api/v0.4/createCSV'

files={'file':open('path/file.csv','rb')}
values={'projectname':'PROJECT_NAME','name':'csv_datasource_name/table_name'}

secret='API_SECRET'

r=requests.post(url, files=files, data=values, headers={'Authorization': 'SprinkleUserKeys ' + secret})

res_json=json.loads(r.text)

import requests
import json

url='http://hostname/api/v0.4/updateCSV'

files={'file':open('path/file.csv','rb')}
values={'projectname':'PROJECT_NAME','name':'csv_datasource_name/table_name'}

secret='API_SECRET'

r=requests.post(url, files=files, data=values, headers={'Authorization': 'SprinkleUserKeys ' + secret})

res_json=json.loads(r.text)