The Challenge
The blog owner travels around the world and sells two types of products: an e-book and a course about best practices. His blog has thousands of readers every day. He uses special logging software to monitor the blog's performance. The business strategy is simple:
- New readers visit the blog.
- They read articles.
- They subscribe to the newsletter.
- They purchase the info products.
The marketing budget is modest at the moment. He spends only about $1,000 a month in total:
- for AdWords advertising (~$500 a month for paid ads),
- for SEO (~$250 a month for editing) and
- for Reddit (~$250 a month for content creation)
Project Overview
The client wants to invest more time, work, and money, but he would like to do it in a smart way. First, let’s start our server and set up the environment.
Used Technology:
- Digital Ocean
- Bash
- Python
- Canva
Data Preprocessing and Exploration
This data set contains the raw user activity data from the blog between 1 January 2018 and 31 March 2018. It is a log with ~600,000 rows. Here is a screenshot of the raw data:
During the data cleaning process we have to create common columns, because different types of information are recorded for different events. This is the code we use to deal with it:
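One way to build such common columns, as a minimal sketch and not necessarily the code used for this project. It assumes a semicolon-separated log where the fields after the event name depend on the event type. The field layouts, the column names, and the sample lines are all hypothetical.

```python
# Common columns every parsed row will have; fields an event lacks stay None.
COLUMNS = ["timestamp", "event", "country", "user_id", "source", "topic", "price"]

# ASSUMPTION: hypothetical field layouts, keyed by (event type, field count).
# The real log may order or name its fields differently.
LAYOUTS = {
    ("read", 6): ["timestamp", "event", "country", "user_id", "source", "topic"],
    ("read", 5): ["timestamp", "event", "country", "user_id", "topic"],
    ("subscribe", 3): ["timestamp", "event", "user_id"],
    ("buy", 4): ["timestamp", "event", "user_id", "price"],
}


def parse_line(line, sep=";"):
    """Map one raw log line onto the common columns."""
    parts = line.strip().split(sep)
    event = parts[1] if len(parts) > 1 else None
    layout = LAYOUTS.get((event, len(parts)))
    if layout is None:
        raise ValueError(f"Unrecognized log line: {line!r}")
    row = dict.fromkeys(COLUMNS)  # every column starts as None
    row.update(zip(layout, parts))
    return row


# Made-up sample lines, for illustration only.
sample_lines = [
    "2018-01-01 00:01:01;read;country_7;1001;SEO;Asia",
    "2018-01-01 00:05:12;read;country_7;1001;Europe",
    "2018-01-01 00:07:41;subscribe;1001",
    "2018-01-01 00:09:03;buy;1001;8",
]
rows = [parse_line(line) for line in sample_lines]
```

Every row now has the same keys, so the later steps can filter and group without checking which event type they are looking at.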
The result is much better than the raw data. Now we can continue with the data exploration part.
General exploration gives us the big picture. It is not always directly useful, but it can help us later.
Country
country_5    25.042900
country_7    22.088402
country_2    21.755405
country_4    11.918007
country_6    10.662617
country_8     3.283648
country_3     1.526672
country_1     1.042144
Name: count, dtype: float64
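Percentages like these come from counting entries per country and dividing by the total. Here is a minimal, dependency-free sketch of that calculation. The helper name `share_by_key`, the `country` key, and the sample rows are assumptions for illustration, not the project's actual code or data.

```python
from collections import Counter


def share_by_key(rows, key):
    """Return the percentage of rows per value of `key`, largest first."""
    counts = Counter(row[key] for row in rows if row.get(key) is not None)
    total = sum(counts.values())
    return {value: 100 * count / total for value, count in counts.most_common()}


# Made-up sample rows, for illustration only.
sample_rows = [
    {"country": "country_5"},
    {"country": "country_5"},
    {"country": "country_7"},
    {"country": "country_2"},
    {"country": None},  # events without a country are ignored
]
shares = share_by_key(sample_rows, "country")
```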
Country Data Analysis
User Id Analysis
Source, Topic and Price Analysis
How many readers per country per source?
Add source and country information to buy and subscribe event
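A minimal sketch of this step, assuming that only a user's first read event records the source and country, and that buy and subscribe events carry just the user id. The row keys (`event`, `user_id`, `country`, `source`) and the sample data are hypothetical.

```python
def enrich_events(rows):
    """Copy country and source from each user's first read onto buy/subscribe rows."""
    first_read = {}
    for row in rows:
        is_first_read = row["event"] == "read" and row.get("source") is not None
        if is_first_read and row["user_id"] not in first_read:
            first_read[row["user_id"]] = (row["country"], row["source"])

    enriched = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left untouched
        if row["event"] in ("buy", "subscribe") and row["user_id"] in first_read:
            new_row["country"], new_row["source"] = first_read[row["user_id"]]
        enriched.append(new_row)
    return enriched


# Made-up sample rows, for illustration only.
sample_rows = [
    {"event": "read", "user_id": "1001", "country": "country_7", "source": "SEO"},
    {"event": "read", "user_id": "1001", "country": "country_7", "source": None},
    {"event": "subscribe", "user_id": "1001", "country": None, "source": None},
    {"event": "buy", "user_id": "1001", "country": None, "source": None},
    {"event": "buy", "user_id": "2002", "country": None, "source": None},
]
enriched = enrich_events(sample_rows)
```

Users without a recorded first read (like the hypothetical user 2002 here) keep empty country and source values, so they can be inspected separately.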
Calculate Total Revenue, Conversion Rate and readers per country
Calculate the ROI per source
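ROI here can be read as (revenue - cost) / cost. The sketch below uses the monthly budgets from the challenge description and the three months covered by the log. The revenue figures are made-up placeholders, and the source labels are assumptions about how the log names each channel.

```python
# Monthly marketing spend per source, taken from the challenge description.
MONTHLY_COST = {"AdWords": 500, "SEO": 250, "Reddit": 250}
MONTHS = 3  # the log covers 1 January 2018 to 31 March 2018


def roi_per_source(revenue_by_source, monthly_cost=MONTHLY_COST, months=MONTHS):
    """Return ROI = (revenue - cost) / cost for each source."""
    roi = {}
    for source, cost in monthly_cost.items():
        total_cost = cost * months
        revenue = revenue_by_source.get(source, 0)
        roi[source] = (revenue - total_cost) / total_cost
    return roi


# PLACEHOLDER revenue figures, not results from the real data set.
example_revenue = {"AdWords": 3000, "SEO": 1500, "Reddit": 3750}
example_roi = roi_per_source(example_revenue)
```

An ROI of 1.0 means every dollar spent returned one extra dollar on top of the spend, which makes sources with very different budgets comparable.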
Conversion Rates Per Source
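A minimal sketch of a per-source conversion rate, defined here as distinct buyers divided by distinct readers. It assumes the source has already been attached to the buy events. The row keys and the sample rows are hypothetical.

```python
from collections import defaultdict


def conversion_rate_per_source(rows):
    """Return distinct buyers / distinct readers for each source."""
    readers = defaultdict(set)
    buyers = defaultdict(set)
    for row in rows:
        source = row.get("source")
        if source is None:
            continue
        if row["event"] == "read":
            readers[source].add(row["user_id"])
        elif row["event"] == "buy":
            buyers[source].add(row["user_id"])
    # Only count buyers who also appear as readers of the same source.
    return {
        source: len(buyers[source] & users) / len(users)
        for source, users in readers.items()
    }


# Made-up sample rows, for illustration only.
sample_rows = (
    [{"event": "read", "user_id": u, "source": "Reddit"} for u in ("u1", "u2", "u3", "u4")]
    + [{"event": "read", "user_id": u, "source": "SEO"} for u in ("u5", "u6")]
    + [{"event": "buy", "user_id": "u1", "source": "Reddit"}]
    + [{"event": "buy", "user_id": "u5", "source": "SEO"}]
)
rates = conversion_rate_per_source(sample_rows)
```

Using sets of user ids means a reader who opens many articles, or a customer who buys twice, is still counted once.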
Visitors per country
Revenue per country per source
Purchase Funnel
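A funnel can be sketched by counting distinct users at each stage and dividing each count by the previous one. The stage names follow the business strategy described earlier (read, subscribe, buy). The event labels, the row keys, and the sample rows are assumptions for illustration.

```python
STAGES = ["read", "subscribe", "buy"]


def purchase_funnel(rows, stages=STAGES):
    """Return (stage, distinct users, share of previous stage) per stage."""
    users = {stage: set() for stage in stages}
    for row in rows:
        if row["event"] in users:
            users[row["event"]].add(row["user_id"])

    funnel = []
    previous = None
    for stage in stages:
        count = len(users[stage])
        # The first stage has no previous stage, so its step rate is None.
        step_rate = count / previous if previous else None
        funnel.append((stage, count, step_rate))
        previous = count
    return funnel


# Made-up sample rows: 10 readers, 4 subscribers, 1 buyer.
sample_rows = (
    [{"event": "read", "user_id": i} for i in range(10)]
    + [{"event": "subscribe", "user_id": i} for i in range(4)]
    + [{"event": "buy", "user_id": 0}]
)
funnel = purchase_funnel(sample_rows)
overall_rate = funnel[-1][1] / funnel[0][1]
```

This version counts users per stage independently. It does not check that a buyer subscribed first, which a stricter funnel would do.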
Most read topics
Visualization
From the previous section we already have the data. Let’s convert it into beautiful charts.
Let’s start with the first and most obvious question:
In which country should he prioritize his effort and why?
To answer this complex question we need a complex chart. Let’s put the most relevant KPIs together on one chart, and it will give us the answer.
Country 4 has the biggest potential due to its high revenue and high visitor conversion rate.
Country 5 is also worth considering due to its high revenue, above-average conversion rate, and high number of readers.
Reddit has the best ROI indicator
For the best ROI, focus resources on Reddit. The recommended source weighting can be found in the table.
Most visitors come from Country 2 and Country 7.
Country 5 generates the most revenue.
Within Country 5, Reddit generates the most revenue, and the same is true for the other countries.
SEO provides the best conversion rate.
The most read topic is Asia.
Asia is clearly a favorite subject for readers. Interestingly, SEO visitors are mainly interested in North America, while paid visitors are interested in Europe.
The final conversion rate is 3%.
On the purchase funnel chart you can clearly see how important conversion optimization is. 7 out of 10 visitors are not going to purchase, but there is big potential in it.
Conclusion
This has turned out to be a fairly lengthy article, but as you can see, data analysis is not a 2-minute exercise. We need to dig deep into the data to find the non-obvious correlations.
These correlations are usually shown on 3- or 4-dimensional charts, but the effort is worth it because it can give you a real competitive advantage.