Hologres UV Compute
v0.2.0
Hologres UV/PV computation using Dynamic Tables and RoaringBitmap for real-time deduplication at scale. Use for building incremental UV/PV pipelines, RoaringBitmap-based user deduplication, flexible time-range UV aggregation, and text-to-int UID encoding for bitmap compatibility. Triggers: "hologres uv", "hologres pv", "roaringbitmap", "rb_build_agg", "rb_or_agg", "去重" (deduplication), "UV计算" (UV computation), "用户去重" (user deduplication), "bitmap去重" (bitmap deduplication), "实时UV" (real-time UV), "hg_id_encoding"
Install command: openclaw skills install hologres-uv-compute
Skill documentation
Prerequisites
This skill requires hologres-cli to be installed first:

pip install hologres-cli
export HOLOGRES_SKILL=hologres-uv-compute

All SQL execution and Dynamic Table operations depend on hologres-cli commands (hologres sql run --write, hologres dt create).

Hologres UV/PV Computation with Dynamic Table & RoaringBitmap
Build real-time, incremental UV/PV computation pipelines using Dynamic Tables and RoaringBitmap in Hologres. This approach supports flexible time-range aggregation over billions of records with low latency.
Why This Approach

Traditional COUNT DISTINCT         RoaringBitmap + Dynamic Table
---------------------------------  ---------------------------------------------
Full scan on every query           Pre-aggregated bitmaps, incremental refresh
Slow with high-cardinality UIDs    Compressed bitmap, sub-second UV queries
Cannot merge across time ranges    RB_OR_AGG merges bitmaps for any date range
Heavy resource usage               Incremental computation, minimal resources

Quick Start

-- 1. Enable RoaringBitmap extension
CREATE EXTENSION IF NOT EXISTS roaringbitmap;

-- 2. Create ODS detail table (source data)
BEGIN;
CREATE TABLE ods_app_detail (
    uid int,
    country text,
    prov text,
    city text,
    ymd text NOT NULL
)
LOGICAL PARTITION BY LIST (ymd);
CALL set_table_property('ods_app_detail', 'orientation', 'column');
CALL set_table_property('ods_app_detail', 'distribution_key', 'uid');
CALL set_table_property('ods_app_detail', 'clustering_key', 'ymd');
CALL set_table_property('ods_app_detail', 'event_time_column', 'ymd');
CALL set_table_property('ods_app_detail', 'bitmap_columns', 'country,prov,city,ymd');
COMMIT;

-- 3. Create DWS Dynamic Table (bitmap aggregation layer)
CREATE DYNAMIC TABLE dt_dws_app_rb (
    country, prov, city, rb_uid, pv, ymd
)
LOGICAL PARTITION BY LIST (ymd)
WITH (
    freshness = '5 minutes',
    auto_refresh_mode = 'incremental',
    auto_refresh_partition_active_time = '2 days',
    partition_key_time_format = 'YYYYMMDD'
)
AS
SELECT
    country, prov, city,
    RB_BUILD_AGG(uid) AS rb_uid,
    COUNT(1) AS pv,
    ymd
FROM ods_app_detail
GROUP BY country, prov, city, ymd;

-- 4. Query UV/PV for a single day
SELECT
    country, prov, city,
    RB_CARDINALITY(RB_OR_AGG(rb_uid)) AS uv,
    SUM(pv) AS pv
FROM dt_dws_app_rb
WHERE ymd = '20251223'
GROUP BY country, prov, city;
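
To make the query semantics concrete, here is a minimal Python model of the two layers. It is only a sketch: plain `set` objects stand in for RoaringBitmaps, set union plays the role of RB_OR_AGG, and `len` plays the role of RB_CARDINALITY. The function names `build_dws` and `query_uv_pv` and the sample rows are illustrative assumptions, not part of Hologres.

```python
from collections import defaultdict

# Sample detail rows: (uid, country, prov, city, ymd). Illustrative data only.
ROWS = [
    (1, "CN", "ZJ", "HZ", "20251222"),
    (2, "CN", "ZJ", "HZ", "20251222"),
    (1, "CN", "ZJ", "HZ", "20251223"),
    (1, "CN", "ZJ", "HZ", "20251223"),
    (3, "CN", "ZJ", "HZ", "20251223"),
]

def build_dws(rows):
    """Model of the DWS layer: one [uid set, pv] pair per dimension combo and day."""
    dws = defaultdict(lambda: [set(), 0])
    for uid, country, prov, city, ymd in rows:
        cell = dws[(country, prov, city, ymd)]
        cell[0].add(uid)   # stands in for RB_BUILD_AGG(uid)
        cell[1] += 1       # stands in for COUNT(1)
    return dws

def query_uv_pv(dws, start_ymd, end_ymd):
    """Model of the query layer: merge sets over a date range, then count."""
    merged = defaultdict(lambda: [set(), 0])
    for (country, prov, city, ymd), (uids, pv) in dws.items():
        if start_ymd <= ymd <= end_ymd:   # YYYYMMDD strings sort chronologically
            cell = merged[(country, prov, city)]
            cell[0] |= uids   # stands in for RB_OR_AGG(rb_uid)
            cell[1] += pv     # stands in for SUM(pv)
    return {dims: (len(uids), pv) for dims, (uids, pv) in merged.items()}

dws = build_dws(ROWS)
single_day = query_uv_pv(dws, "20251223", "20251223")
two_days = query_uv_pv(dws, "20251222", "20251223")
```

In this model the two-day UV is 3, whereas adding the two daily UVs (2 + 2) would count user 1 twice. That is the reason the DWS layer stores mergeable bitmaps rather than per-day distinct counts.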
Architecture Overview

ODS (Detail)                   DWS (Bitmap Aggregation)    Query
┌────────────────┐ Dynamic     ┌──────────────────────┐    ┌───────────────┐
│ ods_app_detail │──Table─────>│ dt_dws_app_rb        │───>│ RB_OR_AGG     │
│ uid, dims,     │ incremental │ rb_uid (bitmap),     │    │ + CARDINALITY │
│ ymd            │ refresh     │ pv, dims, ymd        │    │ = UV for any  │
└────────────────┘             └──────────────────────┘    │ time range    │
                                                           └───────────────┘
Data flow:
1. Raw events flow into ods_app_detail (partitioned by day)
2. Dynamic Table dt_dws_app_rb incrementally aggregates UIDs into bitmaps per dimension per day
3. Queries merge bitmaps across any date range using RB_OR_AGG for exact UV

ODS Detail Table Design
The source table stores raw event data, partitioned by date.

BEGIN;
CREATE TABLE ods_app_detail (
    uid int,
    country text,
    prov text,
    city text,
    ymd text NOT NULL
)
LOGICAL PARTITION BY LIST (ymd);

CALL set_table_property('ods_app_detail', 'orientation', 'column');
CALL set_table_property('ods_app_detail', 'distribution_key', 'uid');
CALL set_table_property('ods_app_detail', 'clustering_key', 'ymd');
CALL set_table_property('ods_app_detail', 'event_time_column', 'ymd');
CALL set_table_property('ods_app_detail', 'bitmap_columns', 'country,prov,city,ymd');
COMMIT;

Key design choices:

Property            Value              Reason
------------------  -----------------  --------------------------------------------
orientation         column             Columnar storage for analytical queries
distribution_key    uid                Distribute by user for aggregation locality
clustering_key      ymd                Optimize time-range scans
event_time_column   ymd                Segment key for partition pruning
bitmap_columns      dimension columns  Accelerate dimension filtering

DWS Dynamic Table (Bitmap Aggregation)
The Dynamic Table pre-aggregates UIDs into RoaringBitmaps per dimension per day using incremental refresh.

CREATE DYNAMIC TABLE dt_dws_app_rb (
    country, prov, city, rb_uid, pv, ymd
)
LOGICAL PARTITION BY LIST (ymd)
WITH (
    freshness = '5 minutes',
    auto_refresh_mode = 'incremental',
    auto_refresh_partition_active_time = '2 days',
    partition_key_time_format = 'YYYYMMDD'
)
AS
SELECT
    country, prov, city,
    RB_BUILD_AGG(uid) AS rb_uid,
    COUNT(1) AS pv,
    ymd
FROM ods_app_detail
GROUP BY country, prov, city, ymd;

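
Why bitmap aggregation suits incremental refresh can be sketched in Python, with `set` standing in for a RoaringBitmap. This is a conceptual model under an append-only assumption, not a description of how Hologres implements refresh internally; `apply_batch` is a hypothetical helper. Because union absorbs duplicates and does not depend on arrival order, folding each new batch into the stored state gives the same result as recomputing from all rows.

```python
def apply_batch(state, batch):
    """Fold newly arrived (dims, uid) rows into the stored aggregate in place.

    state maps dims -> [uid set, pv]; only keys present in the batch are touched.
    """
    for dims, uid in batch:
        cell = state.setdefault(dims, [set(), 0])
        cell[0].add(uid)   # union with the existing set; duplicates are absorbed
        cell[1] += 1       # pv is a plain running count
    return state

# Two batches arriving in different refresh cycles (illustrative, append-only data).
batch_1 = [(("CN", "20251223"), 1), (("CN", "20251223"), 2)]
batch_2 = [(("CN", "20251223"), 2), (("US", "20251223"), 7)]

incremental = {}
apply_batch(incremental, batch_1)
apply_batch(incremental, batch_2)

# Full recomputation over all rows at once, for comparison.
full = apply_batch({}, batch_1 + batch_2)

same = incremental == full
```

Here `same` is true: the second batch only touches the groups it contains, yet the stored state matches a full recomputation.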
Key Dynamic Table parameters:

Parameter                            Value        Description
-----------------------------------  -----------  --------------------------------
freshness                            5 minutes    Target data freshness
auto_refresh_mode                    incremental  Only compute new/changed data
auto_refresh_partition_active_time   2 days       Only auto-refresh recent 2 days
partition_key_time_format            YYYYMMDD     Parse partition key as date

Refresh Historical Partitions
Auto-refresh only covers active partitions. For historical data, trigger manually:

-- Full refresh a specific partition
REFRESH DYNAMIC TABLE dt_dws_