Creating relationships between nodes in neo4j is extremely slow
I'm using a Python script to generate and execute queries from data in a CSV file. I have a substantial amount of data to import, so speed is very important.
The problem is that merging a relationship between two nodes takes a very long time: including the Cypher that creates the relationships makes a query take around 3 seconds, versus around 100 ms without it.
Here's a small part of the query I'm trying to execute:
MERGE (s0:Chemical {`name`: "10074-g5"})
SET s0.`name`="10074-g5"
MERGE (y0:Gene {`gene-id`: "4149"})
SET y0.`name`="MAX"
SET y0.`gene-id`="4149"
MERGE (s0)-[:INTERACTS_WITH]->(y0)
MERGE (s1:Chemical {`name`: "10074-g5"})
SET s1.`name`="10074-g5"
MERGE (y1:Gene {`gene-id`: "4149"})
SET y1.`name`="MAX"
SET y1.`gene-id`="4149"
MERGE (s1)-[:INTERACTS_WITH]->(y1)
Any suggestions on why this is running so slowly? I have indexes set up on Chemical->name and Gene->gene-id, so I honestly don't understand why it is so slow.
neo4j cypher
It will help if you can PROFILE your query and (after expanding all elements) attach the plan to your question.
– InverseFalcon
Mar 22 at 0:24
@InverseFalcon Yes, this does look handy. I'm currently experimenting with LOAD CSV instead of using my scripts, as it seems to be able to go through a lot of data extremely quickly, but I'll keep this in mind if I have any more issues.
– Top Cat
Mar 22 at 12:46
Something else I've just noticed is that I can make the query significantly quicker if I put all the relationship merges into a separate query and run it after all the nodes in my CSV file have been created. I'm not sure why this is, but it allows my CSV file (which is a few million lines long) to be imported in less than a minute or so.
– Top Cat
Mar 22 at 12:52
asked Mar 21 at 17:59 by Top Cat; edited Mar 21 at 20:38
1 Answer
- Most of your SET clauses are just setting properties to the same values they already have (as guaranteed by the preceding MERGE clauses).
- The remaining SET clauses probably only need to be executed if the MERGE created a new node. So, they should probably be preceded by ON CREATE.
- You should never generate a long sequence of almost identical Cypher code. Instead, your Cypher code should use parameters, and you should pass your data as parameter(s).
- You said you have a :Gene(id) index, whereas your code actually requires a :Gene(gene-id) index.
Below is sample Cypher code that uses the dataList parameter (a list of maps containing the desired property values), which fixes most of the above issues. The UNWIND clause just "unwinds" the list into individual maps.
UNWIND $dataList AS d
MERGE (s:Chemical {name: d.sName})
MERGE (y:Gene {`gene-id`: d.yId})
ON CREATE SET y.name=d.yName
MERGE (s)-[:INTERACTS_WITH]->(y)
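Since the question mentions a Python script, here is a minimal sketch of how the dataList parameter might be built from a CSV file and sent in batches. The column names (chemical, gene_id, gene_name), the batch size, and the run callable are assumptions made for illustration; they do not come from the question. run stands for anything that accepts a query string and a parameter dict.

```python
import csv
from itertools import islice

# The parameterized query from above; $dataList is a list of maps.
QUERY = """
UNWIND $dataList AS d
MERGE (s:Chemical {name: d.sName})
MERGE (y:Gene {`gene-id`: d.yId})
ON CREATE SET y.name = d.yName
MERGE (s)-[:INTERACTS_WITH]->(y)
"""


def rows_to_maps(csv_lines):
    """Yield one parameter map per CSV row (column names are hypothetical)."""
    for row in csv.DictReader(csv_lines):
        yield {
            "sName": row["chemical"],
            "yId": row["gene_id"],
            "yName": row["gene_name"],
        }


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def import_rows(run, csv_lines, batch_size=1000):
    """Call run(query, params) once per batch; return the number of batches."""
    count = 0
    for batch in batched(rows_to_maps(csv_lines), batch_size):
        run(QUERY, {"dataList": batch})
        count += 1
    return count
```

With the official neo4j Python driver, something like session.run could be passed as run, since it accepts a query and a parameter dict. The query text stays constant, so only the parameter values change from batch to batch.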
Most of these problems have now been solved as I'm using LOAD CSV, which has sped up the non-relation part significantly. The creation of the relationships was still slow, though. I managed to speed this up by creating the nodes first, and then building the relationships in a separate query.
– Top Cat
Mar 25 at 16:36
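The two-pass approach described in this comment (nodes first, relationships in a separate query) could be sketched in Python roughly as follows. This is an illustration under assumptions, not the commenter's actual code: the input is taken to be (chemical name, gene id, gene name) tuples, and the three query strings and the function name are hypothetical.

```python
# Pass 1: create the nodes. Pass 2: connect nodes that already exist.
CREATE_CHEMICALS = "UNWIND $rows AS r MERGE (:Chemical {name: r.name})"
CREATE_GENES = (
    "UNWIND $rows AS r "
    "MERGE (g:Gene {`gene-id`: r.id}) "
    "ON CREATE SET g.name = r.name"
)
CREATE_RELS = """
UNWIND $rows AS r
MATCH (s:Chemical {name: r.sName})
MATCH (y:Gene {`gene-id`: r.yId})
MERGE (s)-[:INTERACTS_WITH]->(y)
"""


def two_pass_plan(rows):
    """Return [(query, parameter_rows), ...] in the order they should run.

    Duplicate chemicals, genes, and pairs are dropped up front, keeping
    the first occurrence of each.
    """
    chemicals, genes, pairs = {}, {}, {}
    for s_name, y_id, y_name in rows:
        chemicals.setdefault(s_name, {"name": s_name})
        genes.setdefault(y_id, {"id": y_id, "name": y_name})
        pairs.setdefault((s_name, y_id), {"sName": s_name, "yId": y_id})
    return [
        (CREATE_CHEMICALS, list(chemicals.values())),
        (CREATE_GENES, list(genes.values())),
        (CREATE_RELS, list(pairs.values())),
    ]
```

Each (query, rows) pair would then be executed in order with the rows passed as the $rows parameter. The relationship pass uses MATCH on the indexed properties because the nodes are expected to exist by then.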
answered Mar 21 at 18:56 by cybersam; edited Mar 22 at 19:11