我有一个搜索多个实体的 Neo4j 查询,我想使用节点对象批量传递参数。但是,我的查询执行速度不是很高。如何优化此查询并使其性能更好?
WITH $nodes as nodes
UNWIND nodes AS node
with node.id AS id, node.lon AS lon, node.lat AS lat
MATCH
(m:Member)-[mtg_r:MT_TO_MEMBER]->(mt:MemberTopics)-[mtt_r:MT_TO_TOPIC]->(t:Topic),
(t1:Topic)-[tt_r:GT_TO_TOPIC]->(gt:GroupTopics)-[tg_r:GT_TO_GROUP]->(g:Group)-[h_r:HAS]->
(e:Event)-[a_r:AT]->(v:Venue)
WHERE mt.topic_id = gt.topic_id AND
distance(point({ longitude: lon, latitude: lat}),point({ longitude: v.lon, latitude: v.lat })) < 4000 AND
mt.member_id = id
RETURN
distinct id as member_id,
lat as member_lat,
lon as member_lon,
g.group_name as group_name,
e.event_name as event_name,
v.venue_name as venue_name,
v.lat as venue_lat,
v.lon as venue_lon,
distance(point({ longitude: lon,
latitude: lat}),point({ longitude: v.lon, latitude: v.lat })) as distance
查询分析如下所示:
因此,您当前的计划有 3 个并行线程。我们现在可以忽略一个,因为它有 0db 命中。
The biggest hit you are taking is the match for (mt:MemberTopics) ... WHERE mt.member_id = id
. I'm guessing member_id is a unique id, so you will want to create an index on it CREATE INDEX ON :MemberTopics(member_id)
. That will allow Cypher to do an index lookup instead of a node scan, which will reduce the DB hits from ~30mill to ~1 (Also, in some cases, in-lining property matches is faster for more complex queries. So (mt:MemberTopics {member_id:id})
is better. It explicitly makes clear that this condition must always be true while matching, and will reinforce to use the index lookup)
The second biggest hit is the point-distance check. Right now, this is being done independently, because the node scan takes so long. Once you make the changes for MemberTopic, The planner should switch to finding all connected Venues, and then only doing the distance check on thous, so that should become cheaper as well.
Also, it looks like mt and gt are linked by a topic, and you are using a topic id to align them. If t and t1 are suppose to be the same Topic node, you could just use t for both nodes to enforce that, and then you don't need to do the id check to link mt and gt. If t and t1 are not the same node, the use of a foriegn key in your node's properties is a sign that you should have a relationship between the two nodes, and just travel along that edge (Relationships can have properties too, but the context looks a lot like t and t1 are suppose to be the same node. You can also enforce this by saying WHERE t = t1
, but at that point, you should just use t for both nodes)
最后,根据查询返回的行数,您可能希望使用 LIMIT 和 SKIP 对结果进行分页。这看起来像是发送给用户的信息,我怀疑他们是否需要完整的转储。所以只返回最上面的结果,如果用户想看更多,只处理剩下的。(当结果接近一公吨时很有用)由于到目前为止您只有 21 个结果,因此现在这不会成为问题,但请记住,因为您需要扩展到 100,000 多个结果。
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句