What is Cohort Analysis: Beginners Guide to Improving Retention


Cohort analysis studies the behavior or outcomes of a group of users over time. In Sprinkle, we can analyze cohorts by creating different groups to understand user retention and new customer acquisition over a period of time. It can also be described as a form of behavioral analytics.

For any business, cohort analysis is often used to study user retention. A cohort is a group of customers who share common characteristics. For example, a cohort can consist of the customers who joined a website or registered for a service in a given period, and it can be further categorized by demographics, age, and so on.

What is Cohort Analysis?

A cohort is a group of people with a shared trait at a particular time, e.g., customers who became customers at around the same time, a graduating class, or people being tracked during an epidemic.

Cohort analysis is an analysis focused on a specific cohort type. Cohort analysis tables are used to show cohort data to compare user groups in their lifecycle and understand the long-term relationships between user characteristics.

How to Use Cohort Analysis

Services like Google Analytics and Sprinkledata offer cohort analysis tools for marketers. These tools generate a table of users, their website acquisition date range, and their retention rate. They analyze metrics such as conversion rates, goal completions, page views, revenue, sessions, session duration, and transactions per user.

Advanced cohort analysis tools enable segmenting data into more specific groups—e.g., acquisition cohorts vs. behavioral cohorts or mobile users vs. desktop users. The date range and cohort size can be adjusted to suit the project's scope.

The steps typically involved in the analysis process include:

  • Extracting raw data: Pull raw data from a database using SQL (for example, from a MySQL database), then export it into spreadsheet software to join user attributes and segment further.
  • Creating cohort identifiers: Categorize user data according to date joined, first purchase, graduation year, and mobile device location and time.
  • Calculating lifecycle stages: After categorizing customers into cohorts, we measure the time gap between their actions to determine lifecycle stages.
  • Creating tables and graphs: Pivot tables and graphs provide insights into user data by displaying comparisons and aggregated information for multiple elements.
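
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular tool: the `events` rows are made-up sample data, and `month_index` and `retention_table` are hypothetical helper names. It assigns each user to the month of their first event (the cohort identifier), measures months elapsed since then (the lifecycle stage), and builds a retention table.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical raw events: (user_id, event_date). Illustrative rows only.
events = [
    ("u1", date(2023, 1, 5)), ("u1", date(2023, 2, 9)), ("u1", date(2023, 3, 2)),
    ("u2", date(2023, 1, 20)), ("u2", date(2023, 3, 15)),
    ("u3", date(2023, 2, 3)), ("u3", date(2023, 3, 28)),
    ("u4", date(2023, 2, 11)),
]

def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month

def retention_table(events):
    # Cohort identifier: the month of each user's first event.
    first_month = {}
    for user, day in events:
        m = month_index(day)
        first_month[user] = min(first_month.get(user, m), m)
    cohort_size = Counter(first_month.values())

    # Lifecycle stage: months elapsed since the user's cohort month.
    active = defaultdict(set)  # (cohort_month, offset) -> users active then
    for user, day in events:
        cohort = first_month[user]
        active[(cohort, month_index(day) - cohort)].add(user)

    # Table: share of each cohort that was active at each month offset.
    table = defaultdict(dict)
    for (cohort, offset), users in active.items():
        label = f"{(cohort - 1) // 12}-{(cohort - 1) % 12 + 1:02d}"
        table[label][offset] = len(users) / cohort_size[cohort]
    return {k: dict(sorted(v.items())) for k, v in sorted(table.items())}

table = retention_table(events)
for cohort, row in table.items():
    print(cohort, row)
```

Each row of the result is one acquisition cohort, and each column is the number of months since that cohort's first activity, which is the layout a pivot table or heat map would display.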

When to Use Cohort Analysis

Customer cohort analysis is beneficial to businesses and marketing. Examining trends in cohort spending over time reveals whether the quality of customers is improving. This process is referred to as lifetime value cohort analysis.

Cohort analysis for retention offers valuable insights into user behavior and website performance. Analysts can detect trends and patterns by comparing user groups and determining which behavioral adjustments lead to varying results.

Early customer churn can be reduced through cohort retention and acquisition analysis. Cohort analysis charts show when churn happens. Common causes of early churn include product dissatisfaction, subpar onboarding processes, and an inadequate user acquisition model.

To aid contact tracing, authorities and health professionals can utilize data from trusted sources (e.g., SafeGraph and X-Mode Social) and spatiotemporal cohort analysis to track individuals via their mobile devices. This analysis can be split into segments based on location, time, and device over a pre-defined period.

Types of cohort analysis

The two most common types of cohorts are:

  • Acquisition cohorts: Divide users into groups based on their signup date. Comparing retention and churn rates among these similar users can help measure success.
  • Behavioral cohorts: Divide users into groups according to their behavior in your product. This will let you observe active users by different demographics and with varied behavioral trends.

Acquisition cohorts give insight into the timing of user actions, such as when users tend to churn. Behavioral cohorts, in contrast, help explain the reasons behind those actions, because they tie retention and churn to what users actually did in the product.
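
To make the distinction concrete, here is a small Python sketch that groups the same users in both ways. The user records, the field names (`signup_month`, `used_search`, `retained`) and the helper `retention_by` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical user records; field names and values are illustrative only.
users = [
    {"id": "u1", "signup_month": "2023-01", "used_search": True,  "retained": True},
    {"id": "u2", "signup_month": "2023-01", "used_search": False, "retained": False},
    {"id": "u3", "signup_month": "2023-02", "used_search": True,  "retained": True},
    {"id": "u4", "signup_month": "2023-02", "used_search": False, "retained": True},
    {"id": "u5", "signup_month": "2023-02", "used_search": False, "retained": False},
]

def retention_by(users, key):
    """Group users by `key` and return the retained share of each group."""
    groups = defaultdict(list)
    for u in users:
        groups[u[key]].append(u["retained"])
    return {k: sum(v) / len(v) for k, v in groups.items()}

acquisition = retention_by(users, "signup_month")  # cohorts by signup date
behavioral = retention_by(users, "used_search")    # cohorts by in-product action

print(acquisition)
print(behavioral)
```

The same retention measure is computed in both cases; only the grouping key changes, from when the user arrived to what the user did.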

Five benefits of cohort analysis

Cohort analysis is a valuable tool for gaining insight into customers' behaviors and decisions within an app. The benefits of using cohort analysis include the following:

  1. Determine business health. Actiondesk co-founder and CEO Jonathan Parisot highlights cohort analysis as a key indicator of a healthy business: "It helps you identify which groups of customers are driving the most revenue, enabling you to focus on upselling them other products or services." Revenue growth without acquiring new customers strongly indicates a healthy business.
  2. Understand customers better. Through cohort analysis, businesses can unlock a richer understanding of their customers. They can spot patterns and trends beyond vanity metrics by monitoring customer conduct over time.
  3. Enhanced customer segmentation. Businesses can leverage user segmentation to create targeted marketing campaigns and provide personalized customer experiences.
  4. Increased customer retention. Cohort analysis can provide insight into retention rates and signs of churn. With this data, businesses can take proactive steps to enhance customer satisfaction.
  5. Optimize your app for increased interest. Cohort analysis illuminates trends and patterns in the customer lifecycle and can be used to enhance the user experience and bolster customer lifetime value.

Cohort Analysis in Sprinkle Data

The steps involved in performing cohort analysis are as follows:

  • Data Collection from different sources
  • Data Cleaning and Fact table creation
  • Data Modeling
  • Cohort Report creation

1. Date/Time based data

The data should carry a date and time for each individual or event. The events should track attributes relevant to the analysis, such as purchase date, sign-in date, and product name.


2. Calculating Days, Weeks or Months from the First Order Date (Data Preparation)

To calculate the second order date, we can use window functions. This tells us whether the user is a regular customer or an occasional shopper, which helps the organization decide on discounts and marketing campaigns.

We have created a fact table in which we changed the data types of several attributes and calculated when each user ordered for the second time using the lead window function. The expression to obtain the second order date:

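select * from (
  select a.*,
    lead(Order_Date) over (partition by Customer_ID order by Order_Date) as second_order_date,
    rank() over (partition by Customer_ID order by Order_Date) as order_ranking
  -- inner query keeps one row per order; row_number() avoids the ties rank() gives rows sharing an Order_Date
  from (select * from (select z.*, row_number() over (partition by z.Order_ID order by z.Order_Date) as ranking from ds_superstore_superstore z) d where ranking = 1) a
) b where order_ranking = 1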
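
For readers less familiar with window functions, the same idea can be sketched in plain Python. This is an illustration under stated assumptions, not what Sprinkle runs: the `rows` sample is made up, `first_and_second_orders` is a hypothetical helper, and we assume an order may appear on several rows, which is what the de-duplication by `Order_ID` in the query above suggests.

```python
from datetime import date

# Hypothetical order rows: (order_id, customer_id, order_date).
# Order "o1" appears twice to mimic one row per product in an order.
rows = [
    ("o1", "c1", date(2021, 1, 10)), ("o1", "c1", date(2021, 1, 10)),
    ("o2", "c1", date(2021, 4, 2)),
    ("o3", "c2", date(2021, 2, 20)),
]

def first_and_second_orders(rows):
    # Keep one entry per order, mirroring the de-duplication step.
    unique_orders = {order_id: (customer, day) for order_id, customer, day in rows}

    # Collect each customer's order dates (the "partition by Customer_ID").
    by_customer = {}
    for customer, day in unique_orders.values():
        by_customer.setdefault(customer, []).append(day)

    result = {}
    for customer, days in by_customer.items():
        days.sort()  # the "order by Order_Date"
        # The lead of the first order is the next order date, or None if absent.
        result[customer] = (days[0], days[1] if len(days) > 1 else None)
    return result

orders = first_and_second_orders(rows)
print(orders)
```

Customers with a single order get `None` as their second order date, just as `lead` returns NULL when there is no following row.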


3. Data Modeling in Sprinkle

Data modeling in Sprinkle is straightforward and does not require users to write complex code; a model can be created in a few clicks. To build cohorts from the first and second purchases, we need buckets such as the number of customers who returned within 5 months, after 5 months, and so on. We have configured a cohort interval of 5 months. The expression is as follows:

case
  when date_diff(second_order_date, first_order_date, month) <= 5 then "<= 5 months"
  when date_diff(second_order_date, first_order_date, month) between 6 and 10 then "> 5 and <= 10 months"
  when date_diff(second_order_date, first_order_date, month) between 11 and 15 then "> 10 and <= 15 months"
  when date_diff(second_order_date, first_order_date, month) between 16 and 20 then "> 15 and <= 20 months"
  when date_diff(second_order_date, first_order_date, month) between 21 and 25 then "> 20 and <= 25 months"
  when date_diff(second_order_date, first_order_date, month) > 25 then "> 25 months"
end
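
The bucketing logic can also be written as a small function, which makes the boundaries easy to check. The sketch below is illustrative: `repeat_purchase_bucket` and its `width` and `last_edge` parameters are hypothetical names, and it assumes the month gap is a whole number.

```python
def repeat_purchase_bucket(months, width=5, last_edge=25):
    """Label the gap between first and second order, given in whole months."""
    if months is None:
        # No second order: the SQL expression likewise yields NULL here.
        return None
    if months <= width:
        return f"<= {width} months"
    if months > last_edge:
        return f"> {last_edge} months"
    # Lower edge of the interval that contains `months`, e.g. 6..10 -> 5.
    lower = ((months - 1) // width) * width
    return f"> {lower} and <= {lower + width} months"

for m in (3, 5, 6, 10, 11, 25, 26, None):
    print(m, "->", repeat_purchase_bucket(m))
```

Checking the edge values (5, 6, 10, 11, 25, 26) is a quick way to confirm that every gap lands in exactly one bucket and that the label matches the range.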

The difference between the second order date and the first order date can be calculated using models in the Sprinkle tool. The expression is as follows:

 date_diff(second_order_date, first_order_date, month)
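
How a month difference is counted depends on the SQL engine behind the model. As a sketch, the function below assumes the difference is the number of calendar-month boundaries crossed, so the day of the month is ignored. `months_between` is a hypothetical helper name, and the counting rule is an assumption to verify against your warehouse's documentation.

```python
from datetime import date

def months_between(later, earlier):
    """Count calendar-month boundaries crossed from `earlier` to `later`.

    Assumption: the day of the month is ignored, so Jan 31 -> Feb 1 counts as 1.
    """
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

print(months_between(date(2021, 2, 1), date(2021, 1, 31)))
print(months_between(date(2021, 1, 31), date(2021, 1, 1)))
print(months_between(date(2022, 3, 15), date(2021, 11, 20)))
```

If your engine counts only completed months instead, orders placed late in one month and early in the next would fall one unit lower, which can move customers across bucket edges.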


4. Final report on Cohort Analysis in Segment

The report is built as a Segment in the Sprinkle tool. A segment is prepared from a model, where we shape the data by adding dimensions, measures, expressions, and so on. The user can produce the report in a few clicks by choosing filters and sorting from the drop-downs.


To see the tabular cohort analysis in Sprinkle, we need to pivot on the cohort column, i.e. “Months to repeat purchase”. The pivot, transpose, and aggregate options appear in a pop-up window.


Finally, we obtain the cohort analysis in tabular format, which gives the analyst a broader view of marketing campaigns, new user acquisition, and regular users.
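
The pivot itself is a simple count of customers per bucket. The Python sketch below shows the shape of the resulting table. It assumes a row dimension of first order year, which is our choice for illustration; the `customers` records and the `pivot_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical model output: (first_order_year, "Months to repeat purchase" bucket).
customers = [
    (2019, "<= 5 months"), (2019, "<= 5 months"), (2019, "> 5 and <= 10 months"),
    (2020, "<= 5 months"), (2020, "> 25 months"),
]

def pivot_counts(records):
    """Pivot (row_key, bucket) pairs into {row_key: {bucket: customer_count}}."""
    counts = Counter(records)
    columns = sorted({bucket for _, bucket in records})
    table = {}
    for row_key in sorted({key for key, _ in records}):
        # Counter returns 0 for combinations that never occur.
        table[row_key] = {col: counts[(row_key, col)] for col in columns}
    return table

pivot = pivot_counts(customers)
for year, row in pivot.items():
    print(year, row)
```

Every row gets the same set of columns, with zeros where no customer fell into a bucket, so the rows can be compared side by side.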



Cohort analysis is a simple and efficient way to understand marketing campaign performance, user retention, and new user acquisition. In Sprinkle, it can be achieved by creating fact tables in flows, where all data cleaning, transformations, and enrichments are done. The user can then create models on those fact tables, define measures, dimensions, and user-defined expressions, and use them in segments to produce the cohort analysis.

Written by
Soham Dutta

