The retention view lets you visualize engagement over time by tracking the number of users who complete one event after another. Retention analysis, also known as cohort analysis, can answer many questions about engagement over time, such as "After users sign up for our app, do they continue to visit day by day?" or "Are my new users more active than my old users?"

Making Your First Retention Report

To make a retention report, navigate to Analyze > Retention, then select the following:

Start Event: The start event is the event that you'd like to use as the foundation of the retention report; it is often a one-time activation event. Session, Create Account, Upload a Profile Picture, and Install App are all good examples of start events, though any defined event can serve as a start event.

Return Event: The return event is generally a repeated action that you want to see over time. Examples of repeated actions might be Login, Visit, or Read Article, though any event can be a return event.

Group By: By default, the group by clause is set to Date of Start Event, which means the retention chart will group users based on the first time they completed the start event within the date range. The group by clause supports any user-level property available to Heap, as well as behavioral properties like Has done, In Segment, etc. You can also remove the group by to see your users' aggregate retention without cohorts.

Date Range: By default, Heap will set the date range of the retention report to the previous two weeks (14 days). The retention report supports the following date ranges:

  • Past 7 Days
  • Past 30 Days
  • Past 90 Days
  • Past Year
  • Date to Now
  • Choose a Date Range (a custom date range)

Keep in mind that the date range bounds all the numbers presented in the retention report.

Granularity: This value determines the size of the interval used to track how often a person completes the return event: Day, Week, or Month. By default, this value is set to Day. Note that a week or month here means a rolling 7 or 30 days, rather than a calendar week or calendar month.
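Under these definitions, a return event's column is just the number of whole windows elapsed since that user's start event. Here is a minimal sketch of that bucketing (function and variable names are illustrative, not part of Heap's API):

```python
from datetime import datetime

# Rolling window sizes in days for each granularity (per the note above,
# Week and Month are 7- and 30-day windows, not calendar periods).
GRANULARITY_DAYS = {"Day": 1, "Week": 7, "Month": 30}

def column_index(start_event, return_event, granularity="Day"):
    """Return the retention column a return event falls into,
    relative to the user's own start event."""
    window = GRANULARITY_DAYS[granularity]
    delta_days = (return_event - start_event).total_seconds() / 86400
    return int(delta_days // window)

first = datetime(2019, 3, 26, 9, 0)
# A return event 30 hours later lands in column 1 at Day granularity...
print(column_index(first, datetime(2019, 3, 27, 15, 0), "Day"))   # 1
# ...but in column 0 at Week granularity (within the first 7 days).
print(column_index(first, datetime(2019, 3, 27, 15, 0), "Week"))  # 0
```

Because the window is anchored to each user's start event, two users in the same cohort can have different calendar dates for the same column.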

For this example, we will make a Session to Session retention report, grouped by the Date of Start Event.

Understanding the Retention Report

The graph below shows a Session to Session retention report for the past 7 days by Day. Each row represents a cohort of users. Each column represents a rolling window from the start event, defined by the granularity.

Column 0 covers the 24 hours after the user's first event; column 1 covers 24-48 hours after it. Users belong to only one cohort per report, so a user whose first session in the range was on March 26 and who came back a day later on March 27 is counted in the March 26 cohort only.
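To make the bucketing concrete, here is a small sketch (illustrative only, not Heap's implementation) that builds this kind of retention matrix at Day granularity: each user joins exactly one cohort, keyed by the date of their first event in the range, and sets guarantee a user is counted at most once per cell:

```python
from collections import defaultdict
from datetime import date, datetime

# Example input: user_id -> sorted session timestamps within the date range.
events = {
    "alice": [datetime(2019, 3, 26, 9), datetime(2019, 3, 27, 10)],
    "bob":   [datetime(2019, 3, 26, 12)],
    "carol": [datetime(2019, 3, 27, 8), datetime(2019, 3, 29, 8)],
}

# cohort (date of first event) -> column index -> set of unique users
matrix = defaultdict(lambda: defaultdict(set))

for user, timestamps in events.items():
    first = timestamps[0]              # each user joins exactly one cohort
    cohort = first.date()
    for ts in timestamps:
        # Day granularity: whole 24-hour windows since the user's first event.
        col = int((ts - first).total_seconds() // 86400)
        matrix[cohort][col].add(user)  # a set, so at most once per cell

print(len(matrix[date(2019, 3, 26)][0]))  # 2 users in the Mar 26 cohort, column 0
print(len(matrix[date(2019, 3, 26)][1]))  # 1 of them returned the next day
```

Note that alice appears in two columns of the March 26 row, but never in the March 27 row, even though she was active that day.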

Hovering over each cell gives you a description of the data.

By default, users are counted each time they do the return event. Clicking the First Time checkbox and re-running the query changes the data to count only each user's first return event.

You can also view the report as a line graph, which makes trends easier to spot at a glance.

Why is the first row in my retention report always so high?

This is because each cohort is mutually exclusive. For example, if a user is already counted in row 0, then they will not show up in row 1, even if they were active during that time frame. No user is double counted.

Above the first row in your chart is an implicit set of rows that we don't show, which would have further segmented your user activity. Because your retention report only counts user activity within a specific time frame, anyone who would have been counted in one of those rows is automatically bucketed into the first row.

Date Range, Granularity and Retention

Retention is relative to the chosen date range and granularity. Staying with the example above, if you extend the date range to include the previous week, a user whose start event appeared in the March 26 cohort may now appear in an earlier cohort. As you can see below, the number of users in the March 26 cohort has dropped from 4,557 to 2,51.

Let's change the date range to the past month and granularity to week. Each row now shows a calendar week in which the users did their first start event in the range selected. Each column is now a rolling 7-day window in which the user can do subsequent return events.

Because the date of the return event is measured relative to each user's first event, in the graph above a user who did their first event late in the week of Mar 24 - Mar 30 may complete a return event as late as April 6 and still be counted in column 0. This is an important distinction to remember when analyzing 7-day (weekly) or 30-day (monthly) retention.

Defining Cohorts

Retention analysis is most powerful when you group by cohorts beyond the default Date of Start Event. A cohort is a group of people who share a common characteristic over a period of time. For instance, in the previous example, users who first signed up on the same day make up one cohort. Almost any user-level property can define a cohort: location and event history are common examples, and you can even define cohorts based on custom properties sent to Heap via our Custom Identify API.

Heap only counts unique users in cohorts, not the total number of events or sessions. A person is counted once in a given cohort and appears in only one cohort (row) per retention report. However, an individual user is not limited to one cell within a row: if a person repeats a return event many times, those actions are reflected in multiple columns.

The example below answers the question "how does a particular user activity affect retention?" In this case, we want to know if users who upload a picture to our app are better retained over time than those who don't. Grouping by this property creates two cohorts.

As you can see from the result, people who upload are better retained: 47.13% of users who did our upload event came back a week after sign-up compared to 7.11% of those who did not. This suggests that we could pay more attention to our onboarding UX to make sure users are encouraged to upload a picture early in the process.
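Each cell's percentage is simply the number of unique returning users divided by the cohort size. The counts below are hypothetical, chosen only to reproduce the 47.13% figure above:

```python
def retention_rate(returned_users, cohort_size):
    """Cell value: percentage of the cohort that returned in a given window."""
    return 100 * returned_users / cohort_size

# Hypothetical counts: if 4,100 of 8,700 uploaders returned a week
# after sign-up, the cell reads 47.13%.
print(round(retention_rate(4100, 8700), 2))  # 47.13
```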

If we want to analyze our data without cohorts, we can do that, too! Click the x next to the group by clause to see the average retention curve across all users within the range selected.

Getting Value From Retention Analysis

The ability to derive insights from retention analysis goes beyond knowing how to generate and read a report; it depends on knowing what questions to ask. We recognize that determining which metrics to analyze is difficult, so we've created a short list of tips below. Please don't hesitate to reach out if you would like some guidance on how to analyze your data.

Tip 1: Retention analysis is particularly useful for making sure that changes to your application actually drive engagement. If you've made product improvements over time, you can use retention analysis to see whether those changes have made an impact. Ideally, your newest users will be more engaged than older cohorts, as this signals that product iterations are driving retention.

Tip 2: Retention analysis enables analyzing engagement even when masked by growth metrics. 52 Weeks of UX has a great write-up on this topic.

Tip 3: It is best to define the start event as a one-time event so that results are clear when grouping by the Date of Start Event. If you define the start event as a repeated event, such as Session, the first row of your retention table will likely be inflated. This is because cohort membership is defined by the first time a user completed the action within the date range defined (which most likely will not be all-time), so activity in the first row of your table will be artificially high due to your power users.

Tip 4: Although we use retention and engagement synonymously in our documentation, it is always good to keep in mind the difference between the two.

