Regular Expression '[\w-]+(\.[\w-]+)*' doesn't get matched
I want to process some sentences from the PostgreSQL documentation and analyze them. In the word-splitting stage, I tried to use the regex '[\w-]+(\.[\w-]+)*' proposed by Lotufo et al. in the article Modelling the Hurried bug report reading process to summarize bug reports. Strangely, I can't get the expected result with this regex in Python.
    Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
    Type "copyright", "credits" or "license" for more information.
    IPython 6.4.0 -- An enhanced Interactive Python.
    >>> import re
    >>> result = re.findall(r'[\w-]+(\.[\w-]+)*', 'Specifies the directory to use for data storage.')
    >>> print(result)
I expected to get a list of words:
    ['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage']
But I only got a list of empty strings:
    ['', '', '', '', '', '', '', '']
Does anyone have any idea what is wrong with my code? Thanks a lot.
regex python-3.x
edited Mar 25 at 0:26
asked Mar 23 at 3:11
Chanlen
You can use string.split(' ') to segment the sentence; it's easy to understand and improves runtime efficiency.
– pandalai
Mar 23 at 5:48
This regex has a special purpose: splitting computer-related words. string.split(' ') may work well on the sentence above, but for certain sentences it cannot split the words as intended.
– Chanlen
Mar 23 at 8:03
@Chanlen, could you please post a better explanation of what you are trying to accomplish with that regex? More examples of right and wrong inputs/outputs would be great.
– accdias
Mar 23 at 23:05
Sorry for the confusion. I have re-edited the post and it should be clearer now.
– Chanlen
Mar 25 at 0:31
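To make the limitation mentioned in the comments concrete, here is a minimal sketch of what splitting on spaces does to a sentence that contains technical tokens. The sample sentence is made up for illustration and is not taken from the question.

```python
# Hypothetical sample sentence containing a file name and punctuation.
sentence = 'Set data_directory in postgresql.conf, then restart.'

# Splitting on single spaces keeps punctuation glued to the neighbouring word.
tokens = sentence.split(' ')
print(tokens)
# ['Set', 'data_directory', 'in', 'postgresql.conf,', 'then', 'restart.']
```

The comma after postgresql.conf and the final full stop stay attached to their tokens, which is presumably why a pattern-based tokenizer is wanted here.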
2 Answers
This works the way you were expecting:
    Python 3.7.2 (default, Jan 16 2019, 19:49:22)
    [GCC 8.2.1 20181215 (Red Hat 8.2.1-6)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> split = re.compile(r'(\w+)')
    >>> split.findall('Specifies the directory to use for data storage.')
    ['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage']
Those square brackets in your regular expression don't look right. I guess they are the cause.
edited Mar 23 at 21:33
answered Mar 23 at 21:22
accdias
This regex is not equivalent to the original one. The original is supposed to allow dots (except at the beginning and at the end of a word).
– Aloso
Mar 23 at 22:54
Thanks for the explanation @Aloso, but yours will allow a string like one.two to be matched as a single word. Is that the right thing?
– accdias
Mar 23 at 23:01
Apparently it is. Just look at the regex [\w-]+(\.[\w-]+)*. We don't know what @Chanlen needs it for, so let's not jump to any conclusions.
– Aloso
Mar 23 at 23:08
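The difference discussed in these comments can be seen with a small comparison. This is only a sketch: the sample text is invented, and the inner group of the original pattern is written as non-capturing here so that re.findall returns whole matches.

```python
import re

# Hypothetical text with dotted and hyphenated tokens.
text = 'Edit postgresql.conf and the read-only file pg_hba.conf.'

# Plain word characters only: dots and hyphens act as separators.
simple_tokens = re.findall(r'\w+', text)

# The pattern from the question: dots are allowed inside a token,
# but not at its start or end; hyphens count as token characters.
dotted_tokens = re.findall(r'[\w-]+(?:\.[\w-]+)*', text)

print(simple_tokens)
# ['Edit', 'postgresql', 'conf', 'and', 'the', 'read', 'only', 'file', 'pg_hba', 'conf']
print(dotted_tokens)
# ['Edit', 'postgresql.conf', 'and', 'the', 'read-only', 'file', 'pg_hba.conf']
```

Whether one.two should count as one token or two depends on the analysis being done, which the thread leaves open.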
The expected strings are matched, but they aren't in a capturing group, so re.findall returns only the contents of the inner group. Use this regex instead:
    r'([\w-]+(?:\.[\w-]+)*)'
Note that I added ?: to the inner parentheses to make them non-capturing.
answered Mar 23 at 22:51
Aloso
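As a sketch of why the group matters: when a pattern contains exactly one capturing group, re.findall returns what that group captured rather than the whole match, and a group that never took part in the match yields an empty string. The variable names below are illustrative only.

```python
import re

sentence = 'Specifies the directory to use for data storage.'

# One capturing group that never matches here (no dotted words),
# so findall reports an empty string for each of the eight words.
with_capturing = re.findall(r'[\w-]+(\.[\w-]+)*', sentence)

# Non-capturing inner group: findall falls back to the whole match.
with_non_capturing = re.findall(r'[\w-]+(?:\.[\w-]+)*', sentence)

# Alternative that keeps the original pattern: ask each match object
# for group 0, which is always the full match.
with_finditer = [m.group(0) for m in re.finditer(r'[\w-]+(\.[\w-]+)*', sentence)]

print(with_capturing)      # ['', '', '', '', '', '', '', '']
print(with_non_capturing)  # ['Specifies', 'the', 'directory', 'to', 'use', 'for', 'data', 'storage']
print(with_finditer)       # same list of words
```

The outer parentheses in the answer's pattern are optional for this purpose; they simply make the whole match the single captured group.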