博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Junk Dimension
阅读量:7053 次
发布时间:2019-06-28

本文共 1942 字,大约阅读时间需要 6 分钟。

hot3.png

In data warehouse design, frequently we run into a situation where there are yes/no indicator fields in the source system. Through business analysis, we know it is necessary to keep such information in the fact table. However, if keep all those indicator fields in the fact table, not only do we need to build many small dimension tables, but the amount of information stored in the fact table also increases tremendously, leading to possible performance and management issues.

Junk dimension is the way to solve this problem. In a junk dimension, we combine these indicator fields into a single dimension. This way, we'll only need to build a single dimension table, and the number of fields in the fact table, as well as the size of the fact table, can be decreased. The content in the junk dimension table is the combination of all possible values of the individual indicator fields.

Let's look at an example. Assuming that we have the following fact table:

Fact Table Before Junk Dimension

In this example, TXN_CODE, COUPON_IND, and PREPAY_IND are all indicator fields. In this existing format, each one of them is a dimension. Using the junk dimension principle, we can combine them into a single junk dimension, resulting in the following fact table:

Fact Table With Junk Dimension

Note that now the number of dimensions in the fact table went from 7 to 5.

The content of the junk dimension table would look like the following:

Junk Dimension Example

In this case, we have 3 possible values for the TXN_CODE field, 2 possible values for the COUPON_IND field, and 2 possible values for the PREPAY_IND field. This results in a total of 3 x 2 x 2 = 12 rows for the junk dimension table.

By using a junk dimension to replace the 3 indicator fields, we have decreased the number of dimensions by 2 and also decreased the number of fields in the fact table by 2. This will result in a data warehousing environment that offer better performance as well as being easier to manage.

转载于:https://my.oschina.net/duanfangwei/blog/402227

你可能感兴趣的文章
cloudstack 修改显示名称
查看>>
Opennebula自定义VM 实现方法-Contextualizing Virtual Machines 2.2
查看>>
我的友情链接
查看>>
iOS 代码创建无声音乐方法
查看>>
react-native-vector-icons在TabNavigator使用示例
查看>>
Linux的Bonding配置方法
查看>>
DNS处理模块dnspython之域名轮循业务监控
查看>>
虚拟机系统的磁盘扩容妙招及案例
查看>>
Python数值和字符串
查看>>
python学习笔记一
查看>>
Gym 100283F Bakkar In The Army
查看>>
POJ 2947 2947 Widget Factory 高斯消元
查看>>
ExtJS-3.4.0系列:Ext.TabPanel
查看>>
在Eclipse中配置Heritrix-1.14.4版本
查看>>
设计模式笔记:开闭原则(OCP,The Open-Closed Principle)
查看>>
Sql Server系列:键和约束
查看>>
服务器负载感知的URL HASH
查看>>
java xml操作
查看>>
关于Android SDK无法正常下载
查看>>
LVS-DR演示
查看>>