cassandra 分区键增长限制？

本地数据01

我的分区变大意味着什么？我认为 cassandra 可以处理非常大的尺寸。为什么他们在这个例子中使用 2 个分区键？

我所做的也许两个分区键都太大了？

曼尼什·坎德尔瓦尔

您给出的示例是防止分区变得太大的方法之一。在 Cassandra 中partition key（主键的一部分）用于对相似的行集进行分组。

Here in left side data model, user_id is the partition key which means every video interaction by that user will be placed in same partition. As mentioned in example comment, if user is active and has 1000 interaction daily then in 60 days (2 months) you will have 60000 rows for that user. This may breach Cassandra permissible partition size (in terms of data size stored in single partirion).

So to avoid this situation there are many ways you can avoid partition size to grow too big. For example, you can do

Make another column from that table a part of partition key. This is done in the example above. The video_id is made part of partition key along with user_id.
Bucketing - This is the strategy which is used in time series data generally where you make multiple buckets of a partition key. For example if date is your partition key then you can create 24 buckets as date_1, date_2,.....,date_24. Now you have divided your partition key into smaller partition keys and hence you divided one big partition into 24 small partitions.

The main idea is to avoid your partition to grow too big in size. This is a data modeling technique which one should be aware of while creating data model for Cassandra.

如果仍然有较大的分区大小，则需要根据各种可用的数据建模技术重构数据模型。为此，我建议了解您的数据，估计增长率，计算分区的估计大小，如果您的数据模型不满足分区大小需求，则优化您的数据模型。

本文收集自互联网，转载请注明来源。

如有侵权，请联系 [email protected] 删除。

编辑于 2021-09-1

我来说两句

0 条评论

登录后参与评论

TOP 榜单

文章

cassandra 分区键增长限制？

cassandra 分区键增长限制？

计算数据帧R中的字符串频率

Android Studio Kotlin：提取为常量

Excel 2016图表将增长与4个参数进行比较

获取并汇总所有关联的数据

如何使用Redux-Toolkit重置Redux Store

http：// localhost：3000 /＃！/为什么我在localhost链接中得到“＃！/”。

将加号/减号添加到jQuery菜单

算术中的c ++常量类型转换

TYPO3：将 Formhandler 添加到新闻扩展

TreeMap中的自定义排序

如何开始为Ubuntu开发

在 Python 2.7 中。如何从文件中读取特定文本并分配给变量

无法使用 envoy 访问 .ssh/config

在Ubuntu和Windows中，触摸板有时会滞后。硬件问题？

遍历元素数组以每X秒在浏览器上显示

在Jenkins服务器中使用Selenium和Ruby进行的黄瓜测试失败，但在本地计算机中通过

警告消息：在matrix（unlist（drop.item），ncol = 10，byrow = TRUE）中：数据长度[16]不是列数的倍数[10]>？

未捕获的SyntaxError：带有Ajax帖子的意外令牌u

如何使用tweepy流式传输来自指定用户的推文（仅在该用户发布推文时流式传输）

尝试在Dell XPS13 9360上安装Windows 7时出错

如果从DB接收到的值为空，则JMeter JDBC调用将返回该值作为参数名称