Regular Expression '[\w-]+(\.[\w-]+)*' doesn't get matched

I want to process some sentences from the PostgreSQL documentation and do some analysis. In the word-splitting stage, I tried to use the regex '[\w-]+(\.[\w-]+)*' proposed by Lotufo et al. in the article "Modelling the Hurried bug report reading process to summarize bug reports". Strangely, I can't get the expected result using this regex in Python.



Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.

IPython 6.4.0 -- An enhanced Interactive Python.
>>> import re
>>> result = re.findall(r'[\w-]+(\.[\w-]+)*', 'Specifies the directory to use for data storage.')
>>> print(result)


I expected to get a list of words:



['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage']


But I only got a list of empty strings:



['', '', '', '', '', '', '', '']


Does anyone have any idea what is wrong with my code? Thanks a lot.










  • You can use string.split(' ') to segment the sentence; it's easy to understand and improves runtime efficiency.

    – pandalai
    Mar 23 at 5:48

  • This regex is designed specifically for splitting computer-related words. The function string.split(' ') may work well on the sentence above, but for certain sentences it does not split the words as intended.

    – Chanlen
    Mar 23 at 8:03

  • @Chanlen, could you please post a better explanation of what you are trying to accomplish with that regex? More examples of right and wrong inputs/outputs would be great.

    – accdias
    Mar 23 at 23:05

  • Sorry for the confusion. I have edited the post, and it should be clearer now.

    – Chanlen
    Mar 25 at 0:31
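
To make the point in the comments above concrete, here is a minimal sketch of what str.split(' ') returns. The second sentence and its file name are made-up illustrations, not taken from the question.

```python
# Minimal sketch: what str.split(' ') does with punctuation.
sentence = 'Specifies the directory to use for data storage.'
split_tokens = sentence.split(' ')
print(split_tokens)
# ['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage.']

# Hypothetical sentence (an assumption, for illustration only).
technical = 'Edit postgresql.conf, then restart.'
technical_tokens = technical.split(' ')
print(technical_tokens)
# ['Edit', 'postgresql.conf,', 'then', 'restart.']
```

In both cases the trailing punctuation stays attached to the token, which is presumably why a plain split on spaces is not enough here.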


















regex python-3.x






edited Mar 25 at 0:26
Chanlen

asked Mar 23 at 3:11
Chanlen







2 Answers
This works the way you were expecting:

Python 3.7.2 (default, Jan 16 2019, 19:49:22)
[GCC 8.2.1 20181215 (Red Hat 8.2.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> split = re.compile(r'(\w+)')
>>> split.findall('Specifies the directory to use for data storage.')
['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage']
>>>

Those square brackets in your regular expression don't feel right. I guess they are the cause.







edited Mar 23 at 21:33

answered Mar 23 at 21:22
accdias












    • This regex is not equivalent to the original one. The regex is supposed to allow dots (except at the beginning and at the end of a word).

      – Aloso
      Mar 23 at 22:54

    • Thanks for the explanation @Aloso, but yours will allow a string like one.two to be matched as a single word. Is that the right thing?

      – accdias
      Mar 23 at 23:01

    • Apparently it is. Just look at the regex [\w-]+(\.[\w-]+)*. We don't know what @Chanlen needs it for, so let's not jump to any conclusions.

      – Aloso
      Mar 23 at 23:08
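
The difference discussed in these comments can be sketched as follows. The sample sentence is a made-up assumption, and the inner group is written as non-capturing, (?:...), so that re.findall returns whole matches. This is only an illustration, not a statement of what the asker ultimately needs.

```python
import re

# Hypothetical sample sentence (not from the question).
sample = 'Edit postgresql.conf to make it read-only.'

# Plain \w+ splits on every dot and hyphen.
simple_tokens = re.findall(r'\w+', sample)
print(simple_tokens)
# ['Edit', 'postgresql', 'conf', 'to', 'make', 'it', 'read', 'only']

# The question's pattern, with a non-capturing inner group, keeps
# inner dots and hyphens but drops the sentence-final period.
dotted_tokens = re.findall(r'[\w-]+(?:\.[\w-]+)*', sample)
print(dotted_tokens)
# ['Edit', 'postgresql.conf', 'to', 'make', 'it', 'read-only']
```

Whether one.two should count as a single word depends on the text being analyzed. For file names and dotted identifiers, the second pattern keeps them intact.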

















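The expected strings are matched, but they aren't in a capturing group. When a pattern contains a capturing group, re.findall returns the text captured by the group instead of the whole match, and your only group, (\.[\w-]+)*, captures nothing in this sentence. Use this regex instead:

r'([\w-]+(?:\.[\w-]+)*)'

Note that I added ?: to the inner parentheses to make them non-capturing.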







answered Mar 23 at 22:51
Aloso
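
The behavior described in the answer above can be checked with a short sketch. It reuses the sentence and the two patterns from this thread. The variable names and the finditer alternative are additions for illustration only.

```python
import re

sentence = 'Specifies the directory to use for data storage.'

original = r'[\w-]+(\.[\w-]+)*'      # one capturing group
fixed = r'([\w-]+(?:\.[\w-]+)*)'     # the regex from the answer above

# With a capturing group, findall returns the group's text, which is
# empty here because no word in the sentence contains an inner dot.
group_only = re.findall(original, sentence)
print(group_only)
# ['', '', '', '', '', '', '', '']

# The outer group now spans the whole token, so findall returns it.
whole_words = re.findall(fixed, sentence)
print(whole_words)
# ['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage']

# Alternative: keep the original pattern and read the whole match.
via_finditer = [m.group(0) for m in re.finditer(original, sentence)]
print(via_finditer == whole_words)
# True
```

Dropping the outer parentheses and keeping only the non-capturing inner group should give the same list, since findall returns whole matches when the pattern has no capturing groups.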




























            Access current req object everywhere in Node.js ExpressWhy are global variables considered bad practice? (node.js)Using req & res across functionsHow do I get the path to the current script with Node.js?What is Node.js' Connect, Express and “middleware”?Node.js w/ express error handling in callbackHow to access the GET parameters after “?” in Express?Modify Node.js req object parametersAccess “app” variable inside of ExpressJS/ConnectJS middleware?Node.js Express app - request objectAngular Http Module considered middleware?Session variables in ExpressJSAdd properties to the req object in expressjs with Typescript