提高 Neo4j 查询性能

卡西

我有一个搜索多个实体的 Neo4j 查询,我想使用节点对象批量传递参数。但是,我的查询执行速度不是很高。如何优化此查询并使其性能更好?

WITH $nodes as nodes
    UNWIND nodes AS node
      with node.id AS id, node.lon AS lon, node.lat AS lat
    MATCH 
    (m:Member)-[mtg_r:MT_TO_MEMBER]->(mt:MemberTopics)-[mtt_r:MT_TO_TOPIC]->(t:Topic),
    (t1:Topic)-[tt_r:GT_TO_TOPIC]->(gt:GroupTopics)-[tg_r:GT_TO_GROUP]->(g:Group)-[h_r:HAS]->
    (e:Event)-[a_r:AT]->(v:Venue) 
    WHERE mt.topic_id = gt.topic_id AND 
    distance(point({ longitude: lon, latitude: lat}),point({ longitude: v.lon, latitude: v.lat })) < 4000 AND 
    mt.member_id = id
RETURN 
    distinct id as member_id, 
    lat as member_lat, 
    lon as member_lon, 
    g.group_name as group_name, 
    e.event_name as event_name, 
    v.venue_name as venue_name, 
    v.lat as venue_lat, 
    v.lon as venue_lon, 
    distance(point({ longitude: lon, 
    latitude: lat}),point({ longitude: v.lon, latitude: v.lat })) as distance

查询分析如下所示:

在此处输入图片说明

泰兹拉

因此,您当前的计划有 3 个并行线程。我们现在可以忽略一个,因为它有 0db 命中。

The biggest hit you are taking is the match for (mt:MemberTopics) ... WHERE mt.member_id = id. I'm guessing member_id is a unique id, so you will want to create an index on it CREATE INDEX ON :MemberTopics(member_id). That will allow Cypher to do an index lookup instead of a node scan, which will reduce the DB hits from ~30mill to ~1 (Also, in some cases, in-lining property matches is faster for more complex queries. So (mt:MemberTopics {member_id:id}) is better. It explicitly makes clear that this condition must always be true while matching, and will reinforce to use the index lookup)

The second biggest hit is the point-distance check. Right now, this is being done independently, because the node scan takes so long. Once you make the changes for MemberTopic, The planner should switch to finding all connected Venues, and then only doing the distance check on thous, so that should become cheaper as well.

Also, it looks like mt and gt are linked by a topic, and you are using a topic id to align them. If t and t1 are suppose to be the same Topic node, you could just use t for both nodes to enforce that, and then you don't need to do the id check to link mt and gt. If t and t1 are not the same node, the use of a foriegn key in your node's properties is a sign that you should have a relationship between the two nodes, and just travel along that edge (Relationships can have properties too, but the context looks a lot like t and t1 are suppose to be the same node. You can also enforce this by saying WHERE t = t1, but at that point, you should just use t for both nodes)

最后,根据查询返回的行数,您可能希望使用 LIMIT 和 SKIP 对结果进行分页。这看起来像是发送给用户的信息,我怀疑他们是否需要完整的转储。所以只返回最上面的结果,如果用户想看更多,只处理剩下的。(当结果接近一公吨时很有用)由于到目前为止您只有 21 个结果,因此现在这不会成为问题,但请记住,因为您需要扩展到 100,000 多个结果。

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章