计算次日、三日、七日留存

刘超 12天前 ⋅ 87 阅读   编辑

一、描述

  我们需要研究用户活跃情况,需统计如下数据,某日活跃用户(hashoperaid)在后续的一周内的存留情况(计算次日留存、三留、七留),指标定义如下

  1、某日活跃用户数即某日活跃的去重用户数
  2、N日留存用户数即某日活跃的用户在之后的第N日活跃用户数
  3、N日活跃留存率即N日留存用户数/某日活跃用户数

  例:20200701日去重用户数为10000,这批用户20200702依有8000人活跃,则次日留存率8000/10000=80%

  现有test表,包含uid、day等字段,所需获得结果如下:

日期 活跃用户数 次日留存用户数 三日留存用户数 七日留存用户数 次日留存率 三日留存率 七日留存率
20200701
20200702
20200703
20200704
20200705

二、准备

  1、建表、导入数据

create table test(
uid string,
day int
);
 
insert into test values
('1001',20200701),
('1002',20200701),
('1002',20200702),
('1001',20200702),
('1003',20200701),
('1004',20200701),
('1003',20200702),
('1004',20200702),
('1003',20200703),
('1001',20200703),
('1002',20200703),
('1002',20200704),
('1002',20200705),
('1001',20200704),
('1003',20200704);

三、实现

  1、先计算活跃用户数

select day, count(distinct uid) as userCount from test group by day;

  2、再看看次日留存,两表自关联,利用case when找到符合相差日期为1天的uid,计数,得出次日留存人数,最后用distinct去重

select a.day,count(distinct case when b.day-a.day==1 then b.uid end) as user1Keep 
from test a left join test b 
on a.uid=b.uid 
group by a.day;

  3、计算次日、三日、七日留存

select a.day,count(distinct a.uid) as userCount,
    count(distinct case when b.day-a.day==1 then b.uid end) as user1Keep,
    count(distinct case when b.day-a.day==3 then b.uid end) as user3Keep,
    count(distinct case when b.day-a.day==7 then b.uid end) as user7Keep,
    concat(count(distinct case when b.day-a.day==1 then b.uid end)/count(distinct a.uid)*100,'%') as user1Rate,
    concat(count(distinct case when b.day-a.day==3 then b.uid end)/count(distinct a.uid)*100,'%') as user3Rate,
    concat(count(distinct case when b.day-a.day==7 then b.uid end)/count(distinct a.uid)*100,'%') as user7Rate
from test a left join test b 
on a.uid=b.uid 
group by a.day;

注意:本文归作者所有,未经作者允许,不得转载

全部评论: 0

    我有话说: