Internal Server Error due to long XQuery duration (MarkLogic)
I am currently running some queries using the Java API provided by MarkLogic. I installed it by adding the required dependencies to my project. The connection is set up using:
DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8000, secContext, ConnectionType.DIRECT);
From here, some XQueries are run using the code shown below:
ServerEvaluationCall evl = client.newServerEval().xquery(query);
EvalResultIterator evr = evl.eval();
while (evr.hasNext()) {
    EvalResult result = evr.next();
    // Do something with the results
}
However, certain queries take a long time to process, which causes an internal server error. Other than reducing the query time, is there a way to overcome this, for instance by increasing the connection time limit?
====Update===
Query used
xquery version "1.0-ml";
let $query-opts := /comments[fn:matches(text,".*generation.*")]
return(
$query-opts, fn:count($query-opts), xdmp:elapsed-time()
)
I know the regular expression used can easily be replaced by a word-query, but for this instance I would like to use a regular expression for searching.
Example Data
<comments>
<date_commented>1998-01-14T04:32:30</date_commented>
<text>iCloud sync settings are not supposed to change after an iOS update. In the case of iOS 10.3 this was due to a bug.</text>
<uri>/comment/000000001415898</uri>
</comments>
java marklogic
Do you have issues with both select and update queries? Can you explain what kind of queries are taking more time?
– Ramachandra Reddy
Mar 27 at 7:16
It's a very simple query that checks for a certain word using word-query and returns the documents; from there I use fn:count() to determine the number of documents. But it is counting millions of documents, which takes a lot of time. @Ramachandra Reddy
– WhiteSolstice
Mar 27 at 7:30
Instead of fn:count() you could use xdmp:estimate, which does not require loading/parsing the documents into memory and should be a lot faster. docs.marklogic.com/xdmp:estimate
– Wagner Michael
Mar 27 at 8:00
Yes, I thought of using that. But the issue is that xdmp:estimate counts based on the index. So if, say, I use a path /data[some condition], it will still return the count of all document entries with the path /data, even if only 1 document is returned due to the condition applied. @Wagner Michael
– WhiteSolstice
Mar 27 at 8:09
Yeah, that's true. Can you post your query and data? You could try transforming your data so it does not have multiple nodes per fragment, so that path/data is unique per fragment.
– Wagner Michael
Mar 27 at 8:18
asked Mar 27 at 7:09 by WhiteSolstice (edited Mar 27 at 9:03)
1 Answer
Based on the data you provided, I'd use xdmp:estimate and a cts query.
xdmp:estimate(cts:search(doc(), cts:and-query((
  cts:directory-query('/comment/'),
  cts:element-word-query(xs:QName("text"), "generation")
))))
This will search all documents in your /comment/ directory for a text element containing the word generation. As you already know, this only uses indexes and does not require loading/parsing documents.
It also will not return any false positives, because there is only one text element per document/fragment (if the data you showed is representative).
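One caveat worth illustrating: the regular expression from the question and a word query are not strictly equivalent. The regex matches "generation" as a substring anywhere in the text, while a word query matches whole word tokens. The sketch below only approximates that difference in plain Java; the class and method names are made up for illustration, `\b` is a rough stand-in for MarkLogic's tokenizer, and the real behaviour also depends on database settings such as stemming and case sensitivity.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the MarkLogic API.
final class MatchComparison {
    // Same pattern as in the question; find() mirrors the unanchored
    // behaviour of fn:matches.
    private static final Pattern SUBSTRING = Pattern.compile(".*generation.*");

    // Rough approximation of a word match: whole token, case-insensitive.
    // This is an assumption, not MarkLogic's actual tokenization.
    private static final Pattern WORD =
            Pattern.compile("\\bgeneration\\b", Pattern.CASE_INSENSITIVE);

    static boolean regexMatch(String text) {
        return SUBSTRING.matcher(text).find();
    }

    static boolean wordMatch(String text) {
        return WORD.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "The next generation of devices.",
            "A regeneration plan was announced."
        };
        for (String s : samples) {
            System.out.println(s + " regex=" + regexMatch(s) + " word=" + wordMatch(s));
        }
    }
}
```

So a text containing only "regeneration" would be counted by the regex version but not by the word-based approximation, which is one reason the two counts can differ.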
One footnote on this good answer: instead of increasing the timeout (which could start failing again if the size of the database increases), the typical strategy is to page over results in multiple requests.
– ehennum
Mar 27 at 16:32
Thank you for your answer. But I would like to ask one final thing. Based on your code, I attempted to replace "cts:element....)" with "/comments[fn:matches(text,".*generation.*")]" and the time taken suddenly increased by a lot. Is there a reason for this? I tried to run "/comments[fn:matches(text,".*generation.*")]" by itself and it completed in less than a second. @Wagner Michael
– WhiteSolstice
Mar 28 at 2:08
Wildcard searches are the culprit. Depending on your use case, you can look at the options lexicon-expand, lexicon-expansion-limit and limit-check (docs.marklogic.com/cts:element-word-query) and/or wildcard indexes (docs.marklogic.com/guide/search-dev/wildcard).
– Michael Gardner
Mar 28 at 3:09
@WhiteSolstice The XPath expression loads every document fragment containing a comments element. In the second (filtering) stage, it then filters out all fragments not containing the word "generation" by regexp. This costs a lot of performance. To better understand this, compare both queries using xdmp:plan(...). I am not sure why the XPath alone is so much faster, though. It might be because the query/fragments are already cached.
– Wagner Michael
Mar 28 at 7:28
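The paging strategy mentioned in the comments could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names are hypothetical, the total would come from something like the xdmp:estimate call in the answer, and each generated query string would be sent as a separate request (for example through client.newServerEval().xquery(...) as in the question). It uses the common `cts:search(...)[$start to $end]` idiom to select one window of results per request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the MarkLogic API.
final class PagedQueries {
    /** Returns the 1-based, inclusive [start, end] window of each page. */
    static List<long[]> pageWindows(long total, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<long[]> windows = new ArrayList<>();
        for (long start = 1; start <= total; start += pageSize) {
            long end = Math.min(start + pageSize - 1, total);
            windows.add(new long[] {start, end});
        }
        return windows;
    }

    /** Builds the XQuery for one page of the search used in the answer. */
    static String pageQuery(long start, long end) {
        return "xquery version \"1.0-ml\";\n"
             + "cts:search(fn:doc(), cts:and-query((\n"
             + "  cts:directory-query('/comment/'),\n"
             + "  cts:element-word-query(xs:QName(\"text\"), \"generation\")\n"
             + ")))[" + start + " to " + end + "]";
    }

    public static void main(String[] args) {
        // 25 is a made-up total; in practice it would be the estimate.
        for (long[] w : pageWindows(25, 10)) {
            System.out.println(pageQuery(w[0], w[1]));
        }
    }
}
```

Each request then only has to return one page of documents, so the time per request stays bounded even as the database grows.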
answered Mar 27 at 9:26 by Wagner Michael