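I'm struggling to find the right Faraday incantation for scanning an entire DDB table. The following call produces output, but it returns far fewer results than the 18M records known to be in the table.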
(far/scan
  common/client-opts
  v2-index/layer-table-name
  {:return #{:layer-key :range-key}})
=>
[{:range-key "soil&2015-07-22T15:13:09.101Z&ssurgo&v1", :layer-key "886985&886985"}
{:range-key "soil&2015-07-29T19:20:09.973Z&ssurgo&v1", :layer-key "886985&886985"}
...
{:range-key "veg&2014-05-29T16:16:31.000Z&true-color&v1", :layer-key "1674603&1674603"}
{:range-key "veg&2014-06-14T16:16:39.000Z&abs&v1", :layer-key "1674603&1674603"}]
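What do I need to do to get Faraday to process all the records? The source code suggests there is a :last-prim-kvs option, but it is not clear to me what goes in there. The primary key on this DDB table is a composite primary key made up of :layer-key and :range-key.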
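If it fits in memory, you can...
The key to the whole scheme is to set up the opts map with a :limit 99 entry along with a :span-reqs {:max 1} entry. The :span-reqs entry is completely obscure to me, but it appears to be the real driver behind what is conceptually the "page size". I have built a 10-element table, for example...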
;; This only works on the whole table because the table is small!!!!
(far/scan
  common/client-opts
  "users.robert.kuhar.wtf_far"
  {:return #{:part_key :sort_key :note}})
=>
[{:part_key "456", :sort_key "fha.abs", :note "\"456\",\"fha.abs\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "fha.rank", :note "\"456\",\"fha.rank\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "fha.raw", :note "\"456\",\"fha.raw\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "fha.true-color", :note "\"456\",\"fha.true-color\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "soil.ssurgo", :note "\"456\",\"soil.ssurgo\" created 2016-12-08T21:32:20.789Z."}
{:part_key "123", :sort_key "fha.abs", :note "\"123\",\"fha.abs\" created 2016-12-08T21:24:30.139Z."}
{:part_key "123", :sort_key "fha.rank", :note "\"123\",\"fha.rank\" created 2016-12-08T21:24:30.139Z"}
{:part_key "123", :sort_key "fha.raw", :note "\"123\",\"fha.raw\" created 2016-12-08T21:24:30.139Z."}
{:part_key "123", :sort_key "fha.true-color", :note "\"123\",\"fha.true-color\" created 2016-12-08T21:24:30.139Z."}
{:part_key "123", :sort_key "soil.ssurgo", :note "\"123\",\"soil.ssurgo\" created 2016-12-08T21:24:30.139Z."}]
If I want to walk through this table 4 elements at a time, the initial call is...
(far/scan
  common/client-opts
  "users.robert.kuhar.wtf_far"
  {:return #{:part_key :sort_key :note}
   :limit 4
   :span-reqs {:max 1}})
=>
[{:part_key "456", :sort_key "fha.abs", :note "\"456\",\"fha.abs\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "fha.rank", :note "\"456\",\"fha.rank\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "fha.raw", :note "\"456\",\"fha.raw\" created 2016-12-08T21:32:20.789Z."}
{:part_key "456", :sort_key "fha.true-color", :note "\"456\",\"fha.true-color\" created 2016-12-08T21:32:20.789Z."}]
All subsequent calls need to set :last-prim-kvs {:part_key "xxx" :sort_key "yyy"} in the opts map to tell Faraday where to pick up. For the second page, the call looks like...
(far/scan
  common/client-opts
  "users.robert.kuhar.wtf_far"
  {:return #{:part_key :sort_key :note}
   :limit 4
   :span-reqs {:max 1}
   :last-prim-kvs {:part_key "456" :sort_key "fha.true-color"}})
=>
[{:part_key "456", :sort_key "soil.ssurgo", :note "\"456\",\"soil.ssurgo\" created 2016-12-08T21:32:20.789Z."}
{:part_key "123", :sort_key "fha.abs", :note "\"123\",\"fha.abs\" created 2016-12-08T21:24:30.139Z."}
{:part_key "123", :sort_key "fha.rank", :note "\"123\",\"fha.rank\" created 2016-12-08T21:24:30.139Z"}
{:part_key "123", :sort_key "fha.raw", :note "\"123\",\"fha.raw\" created 2016-12-08T21:24:30.139Z."}]
The last page of my 10-element table is...
(far/scan
  common/client-opts
  "users.robert.kuhar.wtf_far"
  {:return #{:part_key :sort_key :note}
   :limit 4
   :span-reqs {:max 1}
   :last-prim-kvs {:part_key "123" :sort_key "fha.raw"}})
=>
[{:part_key "123", :sort_key "fha.true-color", :note "\"123\",\"fha.true-color\" created 2016-12-08T21:24:30.139Z."}
{:part_key "123", :sort_key "soil.ssurgo", :note "\"123\",\"soil.ssurgo\" created 2016-12-08T21:24:30.139Z."}]
Only 2 elements come back even though I asked for 4. Any attempt to scan beyond this point always comes back empty.
(far/scan
  common/client-opts
  "users.robert.kuhar.wtf_far"
  {:return #{:part_key :sort_key :note}
   :limit 4
   :span-reqs {:max 1}
   :last-prim-kvs {:part_key "123" :sort_key "soil.ssurgo"}})
=> []
So, as long as everything fits in memory, you can do the whole thing end to end.
;; Accumulate every page into one vector; stop at the first empty page.
(loop [accum []
       page  (far/scan
               common/client-opts
               "users.robert.kuhar.wtf_far"
               {:limit 4
                :span-reqs {:max 1}})]
  (if (empty? page)
    accum
    ;; The key of the last item on this page is where the next page starts.
    (let [last-on-page  (last page)
          last-part-key (:part_key last-on-page)
          last-sort-key (:sort_key last-on-page)]
      (recur
        (into accum page)
        (far/scan
          common/client-opts
          "users.robert.kuhar.wtf_far"
          {:limit 4
           :span-reqs {:max 1}
           :last-prim-kvs {:part_key last-part-key :sort_key last-sort-key}})))))
=>
[{:part_key "456", :sort_key "fha.abs", :note "\"456\",\"fha.abs\" created 2016-12-08T21:32:20.789Z."}
...
{:part_key "123", :sort_key "soil.ssurgo", :note "\"123\",\"soil.ssurgo\" created 2016-12-08T21:24:30.139Z."}]
I think the regrettable final answer to "How do I get far/scan to walk an entire DynamoDB table?" is that it can't. You have to build it by hand.
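For a table that does not fit in memory (such as the 18M-record one in the question), one way to hand-build this is to wrap the same paging idea in a lazy sequence so that pages are fetched only as they are consumed. The following is a minimal sketch, not something Faraday provides: paged-scan, scan-page, key-attrs, demo-items and fake-scan are all hypothetical names introduced here, and fake-scan is an in-memory stand-in so the snippet runs without DynamoDB. It assumes, as in the examples above, that the key attributes are present in each returned item and that no filter is applied, so an empty page really does mean the end of the table.

```clojure
(defn paged-scan
  "Lazy seq of all items, fetched one page at a time.
  scan-page: hypothetical 1-arg fn, takes an opts map, returns a vector of items.
  key-attrs: the primary key attribute names, e.g. [:part_key :sort_key]."
  [scan-page key-attrs opts]
  (letfn [(step [opts]
            (lazy-seq
              (let [page (scan-page opts)]
                ;; An empty page ends the sequence (assumes no filter is in use).
                (when (seq page)
                  (let [last-kvs (select-keys (last page) key-attrs)]
                    ;; The next page starts after the last key seen on this page.
                    (concat page
                            (step (assoc opts :last-prim-kvs last-kvs))))))))]
    (step opts)))

;; In-memory stand-in for a 10-item table, for illustration only.
(def demo-items
  (vec (for [p ["456" "123"]
             s ["a" "b" "c" "d" "e"]]
         {:part_key p :sort_key s})))

(defn fake-scan
  "Mimics one page of a scan over demo-items, honoring :limit and :last-prim-kvs."
  [{:keys [limit last-prim-kvs]}]
  (let [start (if last-prim-kvs
                (inc (.indexOf demo-items last-prim-kvs))
                0)]
    (vec (take limit (drop start demo-items)))))

;; Walks all three pages (4 + 4 + 2 items) of the stand-in table.
(count (paged-scan fake-scan [:part_key :sort_key] {:limit 4}))

(comment
  ;; Against a real table; assumes [taoensso.faraday :as far] is required
  ;; and common/client-opts is defined as in the snippets above.
  (doseq [item (paged-scan
                 #(far/scan common/client-opts "users.robert.kuhar.wtf_far" %)
                 [:part_key :sort_key]
                 {:limit 4 :span-reqs {:max 1}})]
    (println item)))
```

Because the result is lazy, processing items with doseq or reduce never holds more than the current page, as long as nothing retains the head of the sequence. A much larger :limit than 4 would be the sensible choice for a real table.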