neo4j performance compared to mysql (how can it be improved?)

Question

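This is a follow-up to "Cannot reproduce/verify the performance claims in the Graph Databases and Neo4j in Action books". I have updated the setup and the tests, but I don't want to change the original question too much.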

The whole story (including scripts etc.) is at https://baach.de/Members/jhb/neo4j-performance-compared-to-mysql

Short version: while trying to verify the performance claims made in the "Graph Databases" book, I got the following results (querying a random dataset of n people, each with 50 friends):

My results for 100k people

    depth    neo4j       mysql      python
    1        0.010       0.000      0.000
    2        0.018       0.001      0.000
    3        0.538       0.072      0.009
    4        22.544      3.600      0.330
    5        1269.942    180.143    0.758

"*": single run only

My results for 1 million people

    depth    neo4j        mysql      python
    1        0.010        0.000      0.000
    2        0.018        0.002      0.000
    3        0.689        0.082      0.012
    4        30.057       5.598      1.079
    5        1441.397*    300.000    9.791

"*": single run only
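The "python" column presumably refers to a plain in-memory implementation; the actual script lives at the linked page and is not reproduced here. As a rough illustration only, the following minimal sketch builds a random dataset of the shape described above and counts the distinct people reachable in exactly `depth` friend hops. The function names `build_friend_graph` and `count_friends_at_depth` are hypothetical, and the sketch ignores Cypher's rule that a relationship may not be repeated within a single match, so its counts can differ slightly from the database queries.

```python
import random


def build_friend_graph(n_people, n_friends, seed=0):
    """Return {person: [friends]} with n_friends distinct random friends each.

    Hypothetical helper: people are plain integers 0..n_people-1.
    Requires n_people > n_friends.
    """
    rng = random.Random(seed)
    people = range(n_people)
    graph = {}
    for person in people:
        # Draw one extra candidate so the person can be dropped if sampled.
        candidates = rng.sample(people, n_friends + 1)
        graph[person] = [c for c in candidates if c != person][:n_friends]
    return graph


def count_friends_at_depth(graph, start, depth):
    """Count distinct people reached by following exactly `depth` friend hops."""
    frontier = {start}
    for _ in range(depth):
        # Expand every person in the current frontier by one outgoing hop.
        frontier = {friend for person in frontier for friend in graph[person]}
    return len(frontier)


if __name__ == "__main__":
    # Small dataset so the example runs quickly; the question uses 100k and 1M.
    graph = build_friend_graph(n_people=10000, n_friends=50)
    for depth in range(1, 6):
        print(depth, count_friends_at_depth(graph, 0, depth))
```

Because the frontier is a set, the work per hop is bounded by the number of people times the number of friends, which is one plausible reason an in-memory version stays fast at depth 5.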

I am using neo4j 1.9.2 on a 64-bit Ubuntu, and set up neo4j.properties with these values:

    neostore.nodestore.db.mapped_memory=250M
    neostore.relationshipstore.db.mapped_memory=2048M

and neo4j-wrapper.conf with:

    wrapper.java.initmemory=1024
    wrapper.java.maxmemory=8192

My query to neo4j looks like this (using the REST API):

    start person=node:node_auto_index(noscenda_name="person123")
    match (person)-[:friend]->()-[:friend]->(friend)
    return count(distinct friend);

The node_auto_index is in place, obviously.
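The query shown is the depth 2 case; the results cover depths 1 to 5, so each depth needs one more `-[:friend]->()` hop in the pattern. A minimal sketch of how such queries could be generated follows. The helper name `friends_query` is hypothetical, and the sketch assumes names contain no double quotes, since the name is pasted into the query as a literal exactly as in the query above.

```python
def friends_query(name, depth):
    """Build the fixed-depth friend query in the same shape as the one above.

    Hypothetical helper: `name` is inserted as a string literal, so it must
    not contain double quotes.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # depth - 1 anonymous intermediate nodes, then the named end node.
    pattern = "(person)" + "-[:friend]->()" * (depth - 1) + "-[:friend]->(friend)"
    return (
        'start person=node:node_auto_index(noscenda_name="%s") '
        "match %s return count(distinct friend);" % (name, pattern)
    )


if __name__ == "__main__":
    for depth in range(1, 6):
        print(friends_query("person123", depth))
```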

Is there anything I can do to speed neo4j up (so that it is faster than mysql)?

Also, there is another benchmark on Stack Overflow with the same problem.

Ian Robinson · Answer

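I'm sorry that you can't reproduce the results. However, on a MacBook Air (1.8 GHz i7, 4 GB RAM) with a 2 GB heap and the GCR cache, but no cache warming and no other tuning, and with a similarly sized dataset (1 million users, 50 friends per person), I repeatedly get approximately 900 ms using the Traversal Framework on 1.9.2: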

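    import static org.neo4j.graphdb.DynamicRelationshipType.withName;
    import static org.neo4j.helpers.collection.IteratorUtil.count;

    import java.util.Iterator;

    import org.neo4j.graphdb.Direction;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Path;
    import org.neo4j.graphdb.index.Index;
    import org.neo4j.graphdb.traversal.Evaluation;
    import org.neo4j.graphdb.traversal.Evaluator;
    import org.neo4j.graphdb.traversal.TraversalDescription;
    import org.neo4j.kernel.Traversal;
    import org.neo4j.kernel.Uniqueness;

    public class FriendOfAFriendDepth4
    {
        // Depth-first traversal over outgoing FRIEND relationships that
        // visits every node at most once.
        private static final TraversalDescription traversalDescription =
                Traversal.description()
                        .depthFirst()
                        .uniqueness( Uniqueness.NODE_GLOBAL )
                        .relationships( withName( "FRIEND" ), Direction.OUTGOING )
                        .evaluator( new Evaluator()
                        {
                            @Override
                            public Evaluation evaluate( Path path )
                            {
                                // Include paths of length 4 and stop expanding them.
                                if ( path.length() >= 4 )
                                {
                                    return Evaluation.INCLUDE_AND_PRUNE;
                                }
                                // Shorter paths are not results; keep expanding.
                                return Evaluation.EXCLUDE_AND_CONTINUE;
                            }
                        } );

        private final Index<Node> userIndex;

        public FriendOfAFriendDepth4( GraphDatabaseService db )
        {
            this.userIndex = db.index().forNodes( "user" );
        }

        public Iterator<Path> getFriends( String name )
        {
            return traversalDescription.traverse(
                    userIndex.get( "name", name ).getSingle() )
                    .iterator();
        }

        public int countFriends( String name )
        {
            return count( traversalDescription.traverse(
                    userIndex.get( "name", name ).getSingle() )
                    .nodes().iterator() );
        }
    }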

Cypher is slower, but nowhere near as slow as you suggest: approximately 3 seconds:

    START person=node:user(name={name})
    MATCH (person)-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->(friend)
    RETURN count(friend)

Kind regards

ian

whistler · Answer

Yes, I believe the REST API is significantly slower than the regular bindings, and that is where your performance problem lies.
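For anyone who wants to time the REST path on its own, here is a minimal sketch of submitting a Cypher query with a `{name}` parameter (as in Ian's query) to the server's Cypher endpoint. It assumes a local server at the default address `http://localhost:7474/db/data/cypher`; the helper names `build_cypher_request` and `run_cypher` are hypothetical, and only the Python standard library is used.

```python
import json
import urllib.request

# Assumption: a local neo4j server listening on the default port.
CYPHER_URL = "http://localhost:7474/db/data/cypher"


def build_cypher_request(query, params, url=CYPHER_URL):
    """Build (but do not send) a POST request carrying the query and its parameters."""
    body = json.dumps({"query": query, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_cypher(query, params, url=CYPHER_URL):
    """Send the query to the server and return the decoded JSON response."""
    request = build_cypher_request(query, params, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    query = (
        "START person=node:user(name={name}) "
        "MATCH (person)-[:FRIEND]->()-[:FRIEND]->(friend) "
        "RETURN count(friend)"
    )
    # Requires a running server with a "user" index; the name is an example value.
    print(run_cypher(query, {"name": "person123"}))
```

Timing `run_cypher` against the same query run through the embedded Java API would give a rough idea of how much of the difference comes from the HTTP and JSON round trip rather than from the traversal itself.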