Can one recover the content of an indexed Apache Lucene field
I am indexing String entities in Apache Lucene, e.g.:
doc.add(new StringField(fieldNameSecret, fieldValueSecret, Field.Store.NO));
doc.add(new StringField(fieldNameMeta, fieldValueMeta, Field.Store.YES));
I want users to be able to search on fieldNameSecret and get fieldNameMeta back.
As the content of fieldValueSecret is sensitive, my question is: is it possible to reconstruct/restore the content of fieldValueSecret?
lucene
edited Mar 27 at 14:43 by niqueco
asked Mar 22 at 7:41 by matg
As per niqueco's answer, generally yes, with some caveats. In your particular case, you are using a StringField, so unequivocally yes. The full content of the field is indexed without analysis, so you wouldn't need to reconstruct anything; you could simply read it.
– femtoRgon
Mar 25 at 15:42
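To make the comment's point concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene: the class `ToyKeywordIndex` and its methods are hypothetical names invented for illustration. It only assumes what the comment states, namely that an unanalyzed field is indexed as one verbatim term. Because exact-match search needs that term, the term dictionary itself holds the original value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of an inverted index for an unanalyzed field (hypothetical, not Lucene). */
public class ToyKeywordIndex {
    // term -> ids of the documents containing it, kept sorted like a term dictionary
    private final TreeMap<String, List<Integer>> postings = new TreeMap<>();

    /** Indexes the whole value as one single term, with no analysis at all. */
    public void add(int docId, String value) {
        postings.computeIfAbsent(value, k -> new ArrayList<>()).add(docId);
    }

    /** Exact-match search: only works because the term is kept verbatim. */
    public List<Integer> search(String value) {
        return postings.getOrDefault(value, Collections.emptyList());
    }

    /** What anyone with read access to the index can do: list the terms per document. */
    public Map<Integer, String> dumpValues() {
        Map<Integer, String> byDoc = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            for (int docId : e.getValue()) {
                byDoc.put(docId, e.getKey());
            }
        }
        return byDoc;
    }

    public static void main(String[] args) {
        ToyKeywordIndex index = new ToyKeywordIndex();
        index.add(0, "Top-Secret Value #1");
        index.add(1, "another secret");
        System.out.println(index.search("another secret"));
        System.out.println(index.dumpValues());
    }
}
```

Nothing was "stored" in the sense of Field.Store.YES here, yet dumpValues() returns every original value unchanged, including case and punctuation.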
1 Answer
Some of the field content can be restored; how much depends on the indexing options used.
- Individual terms are stored in the index, so their presence in a particular field value is exposed.
- Stemming mangles terms, so if stemming removes plurals, the reconstructed value will not show a plural. Similarly, if you apply a LowerCaseFilter, the original case cannot be reconstructed.
- If you index with IndexOptions.DOCS_AND_FREQS, somebody will be able to tell how many times a term occurs.
- If you index with IndexOptions.DOCS_AND_FREQS_AND_POSITIONS, somebody will be able to reconstruct the ordering of terms (though they won't be able to see things discarded during analysis, like punctuation).
ADDED: As mentioned by femtoRgon, in the particular case of StringField you are explicitly asking Lucene to treat the whole field value as a single term, with no other processing. That will always expose the field value, because terms are stored (as mentioned in my first point above).
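The points above can be illustrated with a small sketch in plain Java. This is a toy positional inverted index, not Lucene's implementation: `ToyPositionalIndex`, `analyze`, `reconstruct` and `frequency` are hypothetical names, and the "analyzer" is just an assumed lowercase-and-split step standing in for a real analysis chain. It shows that positions are enough to rebuild the token order, while case and punctuation stay lost.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy positional inverted index (hypothetical, not Lucene). */
public class ToyPositionalIndex {
    // term -> (docId -> positions of that term inside the document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    /** Assumed stand-in for an analyzer: lowercase, then split on non-alphanumerics. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Records every token of the text together with its position. */
    public void add(int docId, String text) {
        List<String> tokens = analyze(text);
        for (int pos = 0; pos < tokens.size(); pos++) {
            postings.computeIfAbsent(tokens.get(pos), k -> new TreeMap<>())
                    .computeIfAbsent(docId, k -> new ArrayList<>())
                    .add(pos);
        }
    }

    /** Frequency only: how often a term occurs in a document. */
    public int frequency(int docId, String term) {
        return postings.getOrDefault(term, Collections.emptyMap())
                       .getOrDefault(docId, Collections.emptyList())
                       .size();
    }

    /** With positions: rebuilds the token sequence of a document from postings alone. */
    public String reconstruct(int docId) {
        TreeMap<Integer, String> byPosition = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, List<Integer>>> e : postings.entrySet()) {
            for (int pos : e.getValue().getOrDefault(docId, Collections.emptyList())) {
                byPosition.put(pos, e.getKey());
            }
        }
        return String.join(" ", byPosition.values());
    }

    public static void main(String[] args) {
        ToyPositionalIndex index = new ToyPositionalIndex();
        index.add(0, "Keep it secret, keep it safe!");
        System.out.println(index.frequency(0, "keep"));
        System.out.println(index.reconstruct(0));
    }
}
```

For the sample text, reconstruct(0) yields the tokens in their original order, but lowercased and without the comma or the exclamation mark. Without positions, only frequency-style information would be available.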
edited Mar 25 at 21:35
answered Mar 22 at 17:08 by niqueco