Why do I get a different result every time I save and extract a response string from a web service?


My code works, and I have no problem extracting what I need. My problem is that running the extraction on the live response of the web service gives a different result from running the same extraction on that response saved in a variable. I have been blocked on this for days and would appreciate any help.



NOTE: The answers to the suggested duplicate questions don't work for me; this isn't a duplicate question.



I'm consuming a web service. The response I get is stored in the variable answerService. It is a very long string, and from it I extract what is inside each span tag that has this structure:



<span style="font-weight:bold"> xxx </span>
# "xxx" is what I want to extract; this is how I get it:
arraySpan = re.findall(r'<span style="font-weight:bold">(.*?)<', answerService)


I get a list of length n, with one entry for each span that has this structure.



If I run this directly on the web service response, it does not work and I only get this result:



['áGILMENTE']


Now, if I paste the response of the web service into my code as the string sameStringOfAnswer, the result is different:



print(arraySpan)
['ADV', 'áGILMENTE']


Logically, the response is the same and never changes. Yet for some strange reason, when I fetch the response from the web service in real time I only get ['áGILMENTE'], while the result I expect is ['ADV', 'áGILMENTE'].



This is the key piece showing that two spans always arrive with the structure I need:



Here is my code:



import requests
import re
session = requests.Session()

# First request: obtain the session cookies; the CGISESSID value is passed as the csrf parameter
session.get('http://cartago.lllf.uam.es/grampal/grampal.cgi')
cookie=session.cookies.get_dict()
getId=cookie["CGISESSID"]
# Second request: call the web service with that ID
getService=requests.get("http://cartago.lllf.uam.es/grampal/grampal.cgi?m=analiza&csrf="+getId+"&e="+"ágilmente", cookies=cookie)

answerService=getService.text
#get the value of the <span>
arraySpan = re.findall(r'<span style="font-weight:bold">(.*?)<', answerService)
print(answerService)
print("array",arraySpan)

# Same code, but run on the web service response pasted in as a string literal
sameStringOfAnswer='<html xmlns="http://www.w3.org/TR/REC-html40"><head><title>Grampal </title><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><meta name="Content-Language" content="EN"><meta name="author" content="jmguirao@ugr.es"><link rel="icon" type="image/ico" href="/favicon.ico"/><style type="text/css">html,body,form,ul,li,h1,h3,pmargin:0; padding:0bodyfont-family: Arial, Helvetica, sans-serif; background-color:#fffatext-decoration: none;a:hovertext-decoration: underlineullist-style-type: nonetdpadding: 0.5pc 2pc 0pc 0pc.navfloat: right; padding: 0.5pc 0.5pc 0.5pc 0.5pc; margin-left:5px.nav lidisplay:inline; border-left: 1px solid #444; padding:0 0.4em;.nav li.firstborder-left:0.hidedisplay:noneinputtext-indent: 2pxinput[type="submit"]text-indent: 0DIV.delPagepadding: 0.5ex 5em 0.5em 5em; background-color:#ffd6ba;.delMainpadding: 2ex 0.5em 0.5pc 0.5em;.postmargin-bottom: 0.25pc; font-size: 100%; padding-top: 0.5ex;.posts, #postspadding: 0.5ex 0.5em 0.5pc 50px;.bannerpadding: 0.5ex 0 0.5pc 0.5em;background-color: #ffc6aa;clear: both.banner h1font-weight: bolder; font-size: 150%;margin:0; padding:0 0 0 26px; display: inline;h2font-weight: bolder; font-size: 140%; color: red; margin:0; padding:0 0 0 26px; display: inline;.resaltadofont-weight: bolder;font-size: 100%</style></head><body><div class="banner"><ul class="hide"><li><a href="#content">skip to content</a></li></ul><ul class="nav">Análsis de:<li class="first"><a title="Analizador morfosintáctico" href="/grampal/grampal.cgi?m=analiza&e=ágilmente">palabras</a></li><li><a title="Desambiguador contextual" href="/grampal/grampal.cgi?m=etiqueta&e=ágilmente">oraciones</a></li><li><a title="Etiquetado de textos" href="/grampal/grampal.cgi?m=xml">textos</a></li><li><a title="Formas de una palabra" href="/grampal/grampal.cgi?m=genera&e=ágilmente">Generación de formas</a></li><!--<li><a title="Transcripción fonética" 
href="/grampal/grampal.cgi?m=transcribe&e=ágilmente">Transcripción</a></li>--><li><a href="/grampal/grampal.cgi?m=etiquetario">Etiquetario</a></li><li><a href="/grampal/grampal.cgi?m=autores">Autores</a></li></ul><h1>Grampal</h1></div><div class="delPage" style="font-size: 80%;"><form method="GET" action="/grampal/grampal.cgi"><input type="hidden" name="m" value="analiza"><input type="hidden" name="csrf" value="94508700a0ae409a90718299ae00b0e0"><span class="resaltado">Palabra : </span><input name="e" size="60" value="ágilmente"><input type="submit" value="Analiza"> &nbsp;</form></div><br><h2>ágilmente</h2><div class="delMain"><div id="posts"><table><tr><td style="font-style:italic;font-size:90%">categoría&nbsp;<span style="font-weight:bold"> ADV </span></td><td style="font-style:italic;font-size:90%">lema&nbsp;<span style="font-weight:bold"> áGILMENTE </span></td></tr></table></div></div></body></html>'
arraySpan = re.findall(r'<span style="font-weight:bold">(.*?)<', sameStringOfAnswer)
print(arraySpan)


What am I doing wrong?










python






edited Mar 27 at 15:34
LogicalBranch

asked Mar 27 at 14:50
unusuario










  • 1





    Why are you using regex to parse HTML?

    – TheIncorrigible1
    Mar 27 at 14:52











  • @TheIncorrigible1 I'm new to Python. Maybe this is bad practice, but it's the way I found to extract what I need.

    – unusuario
    Mar 27 at 14:53











  • @TheIncorrigible1 Please do not mark my question as resolved. Whether or not this is bad practice, I have working code, and the problem I have could also occur if it were done differently. Please take a look at my problem; it's kind of weird.

    – unusuario
    Mar 27 at 14:57











  • Possible duplicate of RegEx match open tags except XHTML self-contained tags

    – Ralf
    Mar 27 at 14:59











  • @Ralf It is not a duplicate; please do not mark my question as one. My code works, and I have no problem extracting what I need. My problem is that running the extraction on the live response of the web service gives a different result from running the same extraction on that response saved in a variable. I have been blocked on this for days and would appreciate any help.

    – unusuario
    Mar 27 at 15:02












1 Answer

The HTML from the webservice contains:



<span style="font-weight:bold"> ADV\n </span>


But your minified code contains the tag without the newline \n:



<span style="font-weight:bold"> ADV </span>
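A difference like this is invisible in print() output, because the newline is simply rendered as a line break. As a minimal sketch (the two fragment strings and the helper first_difference are made up for illustration, built from the two tags shown above), repr() and a character-by-character comparison can make it visible:

```python
# Hypothetical fragments: the tag as the service is said to send it, and as pasted.
live_fragment = '<span style="font-weight:bold"> ADV\n </span>'
pasted_fragment = '<span style="font-weight:bold"> ADV </span>'

# repr() shows escape sequences such as \n instead of rendering them.
print(repr(live_fragment))
print(repr(pasted_fragment))
print(live_fragment == pasted_fragment)


def first_difference(a, b):
    """Return the index of the first differing character, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No mismatch in the shared prefix: strings differ only if lengths differ.
    return None if len(a) == len(b) else min(len(a), len(b))


idx = first_difference(live_fragment, pasted_fragment)
print(idx, repr(live_fragment[idx]), repr(pasted_fragment[idx]))
```

In the question's script, the same check could be run on answerService and sameStringOfAnswer.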


You can test the difference yourself:



>>> import re
>>> pattern = r'<span style="font-weight:bold">(.*?)<'
>>> re.findall(pattern, '<span style="font-weight:bold">AAA\n<')
[]
>>> re.findall(pattern, '<span style="font-weight:bold">AAA<')
['AAA']



That is why they are different. You should have mentioned that you use a minifier, as minifiers alter the HTML; you cannot run the same regex on the minified text and still expect the same output.
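If you do want to stay with a regex, the underlying reason is that "." does not match a newline by default. A minimal sketch of two common workarounds follows; the html string is a shortened stand-in for the real response (an assumption, not the actual service output), and the variable names are illustrative:

```python
import re

# Shortened stand-in for the service response, with the newline inside the first span.
html = ('<td>categoría&nbsp;<span style="font-weight:bold"> ADV\n </span></td>'
        '<td>lema&nbsp;<span style="font-weight:bold"> áGILMENTE </span></td>')

# Original pattern: "." stops at the newline, so the first span is skipped.
original = re.findall(r'<span style="font-weight:bold">(.*?)<', html)

# Workaround 1: re.DOTALL lets "." match newlines too.
with_dotall = re.findall(r'<span style="font-weight:bold">(.*?)<', html, flags=re.DOTALL)

# Workaround 2: match "anything except <", which includes newlines.
with_class = re.findall(r'<span style="font-weight:bold">([^<]*)<', html)

# Strip the surrounding whitespace (including the newline) from each capture.
cleaned = [s.strip() for s in with_dotall]

print(original)
print(cleaned)
```

Both workarounds still depend on the exact spelling of the style attribute, so they remain fragile if the markup changes.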



This whole problem would have been avoided if you had used an HTML/XML parser instead of regex, just as the linked question suggests: RegEx match open tags except XHTML self-contained tags
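As a rough sketch of the parser approach, here is a version using only the standard library's html.parser module, so nothing extra needs to be installed. The class name BoldSpanText and the shortened html string are assumptions for illustration; it does not handle nested spans, and a dedicated library such as BeautifulSoup would likely be more convenient for real pages:

```python
from html.parser import HTMLParser


class BoldSpanText(HTMLParser):
    """Collect the stripped text of every <span style="font-weight:bold">."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._buffer = None  # list of text pieces while inside a target span

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("style") == "font-weight:bold":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._buffer is not None:
            # Whitespace, including newlines, is removed here.
            self.results.append("".join(self._buffer).strip())
            self._buffer = None


# Shortened stand-in for the service response (not the real output).
html = ('<td>categoría&nbsp;<span style="font-weight:bold"> ADV\n </span></td>'
        '<td>lema&nbsp;<span style="font-weight:bold"> áGILMENTE </span></td>')

parser = BoldSpanText()
parser.feed(html)
parser.close()
print(parser.results)
```

Because the parser works on elements rather than on raw characters, a newline inside the span no longer changes which spans are found.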






  • You are a genius, I think I finally understand my problem. Although in theory I am capturing everything inside the <span>, what is the best way to get what I need from inside those <span> tags?

    – unusuario
    Mar 27 at 15:30











  • The answers in this question suggest using ([\s\S]*?) (or some variation of it) instead of (.*?).

    – Ralf
    Mar 27 at 15:41
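For readers who want to see that suggestion in action, the sketch below compares the original (.*?) pattern with the ([\s\S]*?) variant and with the re.DOTALL flag, which is one of the possible variations. The sample string raw is an assumption standing in for the web service response.

```python
import re

# Hypothetical sample standing in for the web service response.
raw = '<span style="font-weight:bold"> ADV\n </span>'

# Original pattern: "." stops at the newline, so nothing is captured.
pattern_dot = r'<span style="font-weight:bold">(.*?)<'

# Variant 1: [\s\S] matches any character, newlines included.
pattern_any = r'<span style="font-weight:bold">([\s\S]*?)<'

no_match = re.findall(pattern_dot, raw)

# strip() removes the surrounding spaces and the captured newline.
matches_any = [m.strip() for m in re.findall(pattern_any, raw)]

# Variant 2: keep (.*?) but pass re.DOTALL so "." also matches newlines.
matches_dotall = [m.strip() for m in re.findall(pattern_dot, raw, flags=re.DOTALL)]

print(no_match)        # []
print(matches_any)     # ['ADV']
print(matches_dotall)  # ['ADV']
```

Either variant makes the regex tolerate the newline, but it still depends on the exact attribute text of the tag, which is why the comments in this thread lean towards a parser.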











  • @unusuario you should read more about regex to get a good solution for your use case.

    – Ralf
    Mar 27 at 15:41











  • You should really really use a parser. Try BeautifulSoup. Here's some code that does what you want to get you started. gist.github.com/akent/86dd72a085d452e8db5f4d76c3cce2c9

    – akent
    Mar 27 at 15:46











































edited Mar 27 at 15:26

























answered Mar 27 at 15:21









Ralf

8,859 reputation, 4 gold badges, 18 silver badges, 40 bronze badges






















