提高 Neo4j 查询性能

Cassie 发表于 Dev

卡西

我有一个搜索多个实体的 Neo4j 查询，我想使用节点对象批量传递参数。但是，我的查询执行速度不是很高。如何优化此查询并使其性能更好？

WITH $nodes as nodes
    UNWIND nodes AS node
      with node.id AS id, node.lon AS lon, node.lat AS lat
    MATCH 
    (m:Member)-[mtg_r:MT_TO_MEMBER]->(mt:MemberTopics)-[mtt_r:MT_TO_TOPIC]->(t:Topic),
    (t1:Topic)-[tt_r:GT_TO_TOPIC]->(gt:GroupTopics)-[tg_r:GT_TO_GROUP]->(g:Group)-[h_r:HAS]->
    (e:Event)-[a_r:AT]->(v:Venue) 
    WHERE mt.topic_id = gt.topic_id AND 
    distance(point({ longitude: lon, latitude: lat}),point({ longitude: v.lon, latitude: v.lat })) < 4000 AND 
    mt.member_id = id
RETURN 
    distinct id as member_id, 
    lat as member_lat, 
    lon as member_lon, 
    g.group_name as group_name, 
    e.event_name as event_name, 
    v.venue_name as venue_name, 
    v.lat as venue_lat, 
    v.lon as venue_lon, 
    distance(point({ longitude: lon, 
    latitude: lat}),point({ longitude: v.lon, latitude: v.lat })) as distance

查询分析如下所示：

泰兹拉

因此，您当前的计划有 3 个并行线程。我们现在可以忽略一个，因为它有 0db 命中。

The biggest hit you are taking is the match for (mt:MemberTopics) ... WHERE mt.member_id = id. I'm guessing member_id is a unique id, so you will want to create an index on it CREATE INDEX ON :MemberTopics(member_id). That will allow Cypher to do an index lookup instead of a node scan, which will reduce the DB hits from ~30mill to ~1 (Also, in some cases, in-lining property matches is faster for more complex queries. So (mt:MemberTopics {member_id:id}) is better. It explicitly makes clear that this condition must always be true while matching, and will reinforce to use the index lookup)

The second biggest hit is the point-distance check. Right now, this is being done independently, because the node scan takes so long. Once you make the changes for MemberTopic, The planner should switch to finding all connected Venues, and then only doing the distance check on thous, so that should become cheaper as well.

Also, it looks like mt and gt are linked by a topic, and you are using a topic id to align them. If t and t1 are suppose to be the same Topic node, you could just use t for both nodes to enforce that, and then you don't need to do the id check to link mt and gt. If t and t1 are not the same node, the use of a foriegn key in your node's properties is a sign that you should have a relationship between the two nodes, and just travel along that edge (Relationships can have properties too, but the context looks a lot like t and t1 are suppose to be the same node. You can also enforce this by saying WHERE t = t1, but at that point, you should just use t for both nodes)

最后，根据查询返回的行数，您可能希望使用 LIMIT 和 SKIP 对结果进行分页。这看起来像是发送给用户的信息，我怀疑他们是否需要完整的转储。所以只返回最上面的结果，如果用户想看更多，只处理剩下的。（当结果接近一公吨时很有用）由于到目前为止您只有 21 个结果，因此现在这不会成为问题，但请记住，因为您需要扩展到 100,000 多个结果。

本文收集自互联网，转载请注明来源。

如有侵权，请联系 [email protected] 删除。

编辑于 2021-07-2

我来说两句

0 条评论

登录后参与评论

上一篇：Tensorflow FailedPreconditionError：尝试使用未初始化的值 beta1_power

TOP 榜单

文章

提高 Neo4j 查询性能

提高 Neo4j 查询性能

UITableView的项目向下滚动后更改颜色，然后快速备份

Linux的官方Adobe Flash存储库是否已过时？

用日期数据透视表和日期顺序查询

应用发明者仅从列表中选择一个随机项一次

Mac OS X更新后的GRUB 2问题

验证REST API参数

Java Eclipse中的错误13，如何解决？

带有错误“ where”条件的查询如何返回结果？

ggplot：对齐多个分面图-所有大小不同的分面

尝试反复更改屏幕上按钮的位置 - kotlin android studio

如何从视图一次更新多行（ASP.NET - Core）

计算数据帧中每行的NA

蓝屏死机没有修复解决方案

在 Python 2.7 中。如何从文件中读取特定文本并分配给变量

离子动态工具栏背景色

VB.net将2条特定行导出到DataGridView

通过 Git 在运行 Jenkins 作业时获取 ClassNotFoundException

在Windows 7中无法删除文件（2）

python中的boto3文件上传

当我尝试下载 StanfordNLP en 模型时，出现错误

Node.js中未捕获的异常错误，发生调用