Updated on 2026-07-02 GMT+08:00

Real-Time Analysis of Website User Behaviors

Scenario

A website needs to collect statistics on user behavior data in real time.

Business Requirements

Analyze user behavior data in real time, including hourly/daily statistics on unique visitors (UVs), page views (PVs), visit duration, and retention rate.

Solution

Use the incremental refresh of materialized views and partitioned materialized views.

Example

  1. Create a user behavior log table.

    1
    2
    3
    4
    5
    6
    7
    CREATE TABLE user_behavior_log (
    user_id BIGINT,
    event_type VARCHAR(50),
    page_url VARCHAR(200),
    duration INT, -- Stay duration (in seconds)
    event_time TIMESTAMP
    ) WITH (orientation = COLUMN);
    

  2. Create a materialized view by hour and use partitioned materialized views for partition rollup calculations.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    CREATE MATERIALIZED VIEW mv_user_stats_hourly
    REFRESH FAST EVERY (interval '10 min')
    ENABLE QUERY REWRITE
    WITH (orientation = COLUMN)
    PARTITION BY date_trunc('hour', event_time)
    AS
    SELECT
    DATE_TRUNC('hour', event_time) as stat_hour,
    event_type,
    page_url,
    COUNT(*) as pv, --Number of page visits
    COUNT(DISTINCT user_id) as uv, --Number of unique visitors
    AVG(duration) as avg_duration, --Average stay duration
    COUNT(CASE WHEN duration > 60 THEN 1 END) as long_stay_count --Stay duration > 1 minute
    FROM user_behavior_log
    GROUP BY DATE_TRUNC('hour', event_time), event_type, page_url;
    

  3. Collect statistics on the retention rate.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    CREATE TABLE user_behavior_log (
    CREATE MATERIALIZED VIEW mv_user_retention
    REFRESH FAST EVERY (interval '1 hour')
    ENABLE QUERY REWRITE
    AS
    SELECT
    first_date,
    days_diff,
    user_count,
    ROUND(100.0 * user_count / (SELECT COUNT(*) FROM first_visit WHERE first_date = retention_calc.first_date), 2) as retention_rate
    FROM retention_calc;