Can one recover the content of an indexed Apache Lucene field

I am indexing String entities in Apache Lucene. E.g.



doc.add(new StringField(fieldNameSecret, fieldValueSecret, Field.Store.NO));
doc.add(new StringField(fieldNameMeta, fieldValueMeta, Field.Store.YES));


I want users to be able to search on fieldNameSecret and get fieldNameMeta back.
Since the content of fieldValueSecret is sensitive, my question is: can the content of fieldValueSecret be reconstructed or restored from the index?







lucene














edited Mar 27 at 14:43 by niqueco
asked Mar 22 at 7:41 by matg












  • As per niqueco's answer: generally yes, with some caveats. In your particular case you are using a StringField, so unequivocally yes. The full content of the field is indexed without analysis, so you wouldn't need to reconstruct anything; you could simply read it.

    – femtoRgon
    Mar 25 at 15:42





























1 Answer














Some of the field content can be restored; how much depends on the indexing options used.



  • Individual terms are stored in the index, so the presence of a term in a particular field value is exposed.

  • Stemming will mangle terms, so if stemming removes plurals then the reconstructed value will not show a plural. Similarly, if you apply a LowerCaseFilter the original case cannot be reconstructed.

  • If you index with IndexOptions.DOCS_AND_FREQS, somebody will be able to tell how many times a term occurs in the field value.

  • If you index with IndexOptions.DOCS_AND_FREQS_AND_POSITIONS, somebody will be able to reconstruct the ordering of terms (although they won't be able to see things discarded during analysis, like punctuation).
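To make the points above concrete, here is a toy model in plain Java. It is not Lucene code and does not reflect Lucene's on-disk format; the class PostingsLeak and all of its methods are hypothetical names invented for this sketch. Lowercasing and punctuation stripping stand in for a lossy analysis chain, and a map from term to positions stands in for postings indexed with positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class PostingsLeak {

    /** Toy analysis chain: lowercases, then splits on anything that is not a letter or digit. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Toy postings for one field value: term -> positions at which the term occurs. */
    public static Map<String, List<Integer>> index(String text) {
        Map<String, List<Integer>> postings = new TreeMap<>();
        List<String> tokens = analyze(text);
        for (int pos = 0; pos < tokens.size(); pos++) {
            postings.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return postings;
    }

    /** What frequencies leak: how often each term occurs, but not where. */
    public static Map<String, Integer> frequencies(Map<String, List<Integer>> postings) {
        Map<String, Integer> freqs = new TreeMap<>();
        postings.forEach((term, positions) -> freqs.put(term, positions.size()));
        return freqs;
    }

    /** What positions leak: the analyzed token stream in its original order. */
    public static String reconstruct(Map<String, List<Integer>> postings) {
        TreeMap<Integer, String> byPosition = new TreeMap<>();
        postings.forEach((term, positions) -> positions.forEach(p -> byPosition.put(p, term)));
        return String.join(" ", byPosition.values());
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = index("Alice pays Bob; Bob pays Carol!");
        System.out.println(postings.keySet());      // terms only
        System.out.println(frequencies(postings));  // terms plus counts
        System.out.println(reconstruct(postings));  // terms in order
    }
}
```

For the sample input, the reconstructed string is "alice pays bob bob pays carol": the word order survives, while the original case and the punctuation do not.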

ADDED: As mentioned by femtoRgon, in the particular case of StringField you are explicitly asking Lucene to treat the whole field value as a single term, with no other processing. That will always expose the field value, because terms are stored (as mentioned in my first point above).
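The StringField case can be checked directly. The sketch below is an illustration rather than part of the original answer. It assumes Lucene 8 or later on the classpath (needed for ByteBuffersDirectory); the class name TermDump, the helpers buildDemoIndex and dumpTerms, and the field names and values are all made up for the example. It indexes one document with an unstored StringField and then walks the term dictionary of that field.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

public class TermDump {

    /** Builds a one-document in-memory index shaped like the one in the question. */
    public static Directory buildDemoIndex() throws IOException {
        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
            Document doc = new Document();
            // Hypothetical field names and values, chosen only for the demo.
            doc.add(new StringField("secret", "s3cr3t-value", Field.Store.NO));
            doc.add(new StringField("meta", "public-label", Field.Store.YES));
            writer.addDocument(doc);
        } // closing the writer commits the document
        return dir;
    }

    /** Lists every term indexed for the given field, segment by segment. */
    public static List<String> dumpTerms(Directory dir, String field) throws IOException {
        List<String> values = new ArrayList<>();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            for (LeafReaderContext ctx : reader.leaves()) {
                Terms terms = ctx.reader().terms(field);
                if (terms == null) {
                    continue; // the field has no indexed terms in this segment
                }
                TermsEnum it = terms.iterator();
                BytesRef term;
                while ((term = it.next()) != null) {
                    values.add(term.utf8ToString());
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(dumpTerms(buildDemoIndex(), "secret"));
    }
}
```

Run against the demo index, dumpTerms returns the full secret value even though the field was added with Field.Store.NO. That flag only keeps the value out of the stored fields; the term dictionary still holds it verbatim, so anyone who can read the index files can read the value.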






edited Mar 25 at 21:35
answered Mar 22 at 17:08 by niqueco




























