Image by UX Indonesia

Google Data Analytics Capstone Project

Background:

The goal of this project is to gain exposure to how a data analyst helps stakeholders achieve their overall goals. It is an opportunity to take what we have been learning in the program and apply those skills to a real-life case study. In this report, I walk through my approach to helping the company accomplish its business task. I used the data analysis steps (ask, prepare, process, analyze, share, and act) to answer the key business questions asked by Bellabeat.

 

About the company:

Bellabeat is a high-tech company that develops wearable products that monitor biometric and lifestyle data to help women become more aware of how their bodies work, leading them to make healthier choices. The company offers smart devices that collect data on activity, calories, steps, and sleep, empowering women with knowledge about their daily habits and enhancing their health. A companion Bellabeat app gives users access to the health data recorded by their devices. The Bellabeat app also connects them to the company's line of smart wellness products.

 

Ask Phase:

The first step of the data analysis is the ask phase. This is where we define the problem we are trying to solve, making sure we fully understand the stakeholder’s expectations and focusing on the actual problem. This is the step where we take a step back and see the whole situation in context. In this case we focus on our stakeholder and the business task:

Business Task: Gain insight into how consumers use non-Bellabeat smart devices and apply it to one of the company’s products. The insights will allow us to form a marketing strategy.

 

Stakeholders: Urška Sršen (co-founder, CEO), Sando Mur (co-founder, mathematician), Bellabeat marketing analytics team

Prepare Phase:

The second step is the preparation phase where we decide what data we need to collect to answer the question and how to organize it to be useful. Using the business task as a guide will help us decide what metrics to measure, locate the data in the database, and create security measures to protect the data.

Data source: The data comes from a dataset hosted on Kaggle: FitBit Fitness Tracker Data (CC0: Public Domain, dataset made available through Mobius). The dataset contains personal fitness tracker data from 30 eligible Fitbit users who consented to the submission of their personal tracker data.

 

Data Organization: I developed a pivot table that summarizes the information contained in each of the CSV files. The key information contained in each file is as follows:

Update coming soon

Accessibility and privacy of data: The data is open source, as confirmed by Kaggle. The owner released the work to the public domain by waiving all rights to it worldwide.

 

Data Information: The data covers 33 users over a 30-day period in 2016, so it is somewhat outdated, and the small sample size may introduce bias. There is not enough demographic information to know whether the data actually represents the population that Bellabeat is targeting. The time limitation of the survey is only 2 months, so my approach to this case study will be operational and straightforward.

Process Phase:

This is the phase where we get the chance to learn even more about the data by exploring it. We start by organizing and cleaning the data as thoroughly as we can to avoid errors, inaccuracies, or inconsistencies. This might mean:

 

  • Using spreadsheet functions to find incorrectly entered data

  • Using SQL functions or R programming to check for extra spaces

  • Removing repeated entries

  • Checking as much as possible for bias in the data.
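As a rough illustration of the first two checks in the list above, here is a minimal sketch. The report's own code is written in R and is not reproduced on this page, so this is a Python stand-in that uses only the standard library. The sample rows are made up, and the column names (Id, ActivityDate, TotalSteps) and the clean_rows helper are assumptions for illustration only.

```python
import csv
import io

# Made-up sample rows; the column names are assumptions for illustration.
RAW = """Id,ActivityDate,TotalSteps
1001, 4/12/2016 ,13162
1001,4/13/2016,10735
1002,4/12/2016,-50
1002,4/13/2016,abc
"""


def clean_rows(text):
    """Trim stray whitespace and split rows into valid and suspect ones."""
    valid, suspect = [], []
    for row in csv.DictReader(io.StringIO(text)):
        # Remove extra spaces around every column name and value.
        row = {key.strip(): value.strip() for key, value in row.items()}
        steps = row["TotalSteps"]
        # A step count should be a non-negative whole number.
        if steps.isdigit():
            row["TotalSteps"] = int(steps)
            valid.append(row)
        else:
            suspect.append(row)
    return valid, suspect


valid, suspect = clean_rows(RAW)
print(len(valid), "valid rows,", len(suspect), "suspect rows")
```

Rows flagged as suspect are set aside rather than silently dropped, so they can be reviewed before deciding whether to correct or exclude them.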

 

Tool that I will be using: Because of its accessibility, I used R to analyze the data and create visualizations to get my point across.

 

Step 1: Installing and loading the libraries: I install the libraries needed to work with the data in R and analyze it. In this report, the libraries had already been installed beforehand (from above), so we are all set.

Step 2: Importing the datasets: I import the datasets that I will be analyzing. Our analysis will focus on the following:

See the full report to look at the R programming code and its remaining steps

Step 3: Checking data integrity by cleaning and formatting: Now that we understand the data structures, the next step is to check them for errors and inconsistencies that could affect our overall data integrity. We check how many users actually recorded data and how many of those records are duplicates.
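The two checks described in this step, counting users and counting duplicate records, can be sketched as follows. This is not the report's R code; it is a Python approximation using only the standard library, with made-up sample rows and assumed column names (Id, SleepDay, TotalMinutesAsleep).

```python
import csv
import io
from collections import Counter

# Made-up sample rows; the column names are assumptions for illustration.
RAW = """Id,SleepDay,TotalMinutesAsleep
1001,4/12/2016,327
1001,4/13/2016,384
1001,4/13/2016,384
1002,4/12/2016,412
"""

rows = [tuple(r) for r in csv.reader(io.StringIO(RAW))]
header, records = rows[0], rows[1:]

# Number of distinct users, based on the Id column.
id_index = header.index("Id")
n_users = len({record[id_index] for record in records})

# Every extra copy of an identical row counts as one duplicate.
counts = Counter(records)
n_duplicates = sum(count - 1 for count in counts.values())

# Keep the first occurrence of each row, preserving the original order.
deduplicated = list(dict.fromkeys(records))

print(n_users, n_duplicates, len(deduplicated))
```

Comparing the number of distinct users against the number stated in the dataset description is a quick way to spot a mismatch before any analysis begins.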

Analyze and Share:

Steps 4 and 5 of the data analysis process are combined for this case study, since the visualizations follow directly from the analysis. The analysis phase is where we think analytically about the data, sorting and formatting it to make it easier to:

  • Perform calculation

  • Combine data from multiple sources

  • Create tables with your result
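The three tasks in the list above can be sketched together in one small example: two sources are combined on a shared key, a calculation is performed, and the result is printed as a table. This is a Python illustration using only the standard library, not the R code from the full report. The sample values are made up, and the column names and the read helper are assumptions.

```python
import csv
import io
from statistics import mean

# Made-up sample data; the layout and column names are assumptions.
ACTIVITY = """Id,Date,TotalSteps
1001,4/12/2016,12000
1001,4/13/2016,8000
1002,4/12/2016,4000
"""
SLEEP = """Id,Date,TotalMinutesAsleep
1001,4/12/2016,420
1001,4/13/2016,360
1002,4/12/2016,480
"""


def read(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


# Combine the two sources on the shared (Id, Date) key (an inner join).
sleep_by_key = {(r["Id"], r["Date"]): r for r in read(SLEEP)}
merged = []
for row in read(ACTIVITY):
    match = sleep_by_key.get((row["Id"], row["Date"]))
    if match is not None:
        merged.append({
            "Id": row["Id"],
            "steps": int(row["TotalSteps"]),
            "asleep": int(match["TotalMinutesAsleep"]),
        })

# Calculate per-user averages and collect them in a small results table.
summary = {}
for user in sorted({r["Id"] for r in merged}):
    user_rows = [r for r in merged if r["Id"] == user]
    summary[user] = (
        mean(r["steps"] for r in user_rows),
        mean(r["asleep"] for r in user_rows),
    )

print("Id\tAvgSteps\tAvgMinutesAsleep")
for user, (avg_steps, avg_asleep) in summary.items():
    print(f"{user}\t{avg_steps}\t{avg_asleep}")
```

An inner join keeps only the days that appear in both sources, which is a design choice: days with activity but no sleep record are left out of the averages.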

Questions to ask ourselves are:

  • What story is my data telling me?

  • How will my data help me solve this problem?

  • What type of person is most likely to use the product?

The share phase is where we summarize the results of the analysis with clear and engaging visuals, using tools such as graphs or dashboards. This is the chance to show the stakeholders that we have solved their problem and how we got there.

 

We will analyze trends among users of the Fitbit devices and determine whether what we find can help us come up with a marketing strategy for Bellabeat.

Continue with the report to look more at the code and the results that are produced
