How to replace hex value in a string


While importing data from a flat file, I noticed some embedded hex-values in the string (<0x00>, <0x01>).



I want to replace them with specific characters, but am unable to do so. Removing them won't work either.
What it looks like in the exported flat file: https://i.imgur.com/7MQpoMH.png
Another example: https://i.imgur.com/3ZUSGIr.png




This is what I've tried:
(Note that <0x01> stands for a non-editable entity; it is not rendered here.)



import io
with io.open('1.txt', 'r+', encoding="utf-8") as p:
    s = p.read()
# included in case it bears any significance


import re
import binascii

s = "Some string with hex: <0x01>"

s = s.encode('latin1').decode('utf-8')
# throws e.g.: >>> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 114: invalid start byte

s = re.sub(r'<0x01>', r'.', s)
s = re.sub(r'\0x01', r'.', s)
s = re.sub(r'\\0x01', r'.', s)
s = s.replace('x01', '.')
s = s.replace('<0x01>', '.')
s = s.replace('0x01', '.')


Or something along these lines, hoping to get a handle on it while iterating through the whole string:



for x in s:
    try:
        base64.encodebytes(x)
        base64.decodebytes(x)
        s.strip(binascii.unhexlify(x))
        s.decode('utf-8')
        s.encode('latin1').decode('utf-8')
    except:
        pass


Nothing seems to get the job done.



I'd expect the characters to be replaceable with the methods I've dug up, but they are not. What am I missing?
NB: I have to preserve umlauts (äöüÄÖÜ).



-- edit:



Could I be introducing the hex values myself when exporting? If so, is there a way to avoid that?



with io.open('out.txt', 'w', encoding="utf-8") as temp:
    temp.write(s)
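Writing a string with encoding="utf-8" only encodes the characters that are already in s, so a check before the export can show whether the control characters are present at that point. The following is a minimal sketch, not part of the original post: the helper name find_control_chars, its keep parameter and the sample string are all hypothetical.

```python
import unicodedata

def find_control_chars(text, keep="\t\n\r"):
    """Return (index, code point in hex) for each control character in text.

    Characters listed in `keep` (tab, newline, carriage return by default,
    which is an assumption about what counts as harmless) are ignored.
    """
    return [
        (i, hex(ord(ch)))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cc" and ch not in keep
    ]

# Hypothetical sample: umlauts are ordinary letters, \x01 is a control character.
sample = "Grüße z\x01 B.\n"
print(find_control_chars(sample))
```

An empty list right before temp.write(s) would suggest the export itself is not where the characters come from.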









python-3.x string encoding utf-8 hex

asked Mar 27 at 13:22, edited Mar 27 at 13:37
P. A. Monsaille

1 Answer

Judging from the images, these are actually control characters.
Your editor displays them greyed out, showing the value of each byte in hex notation.
Your data does not contain the four characters "0x01" but a single byte with the value 1, so unhexlify and friends won't help.
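To make the distinction concrete, here is a small sketch comparing the text an editor might display with the character that is actually in the data. Both sample strings are made up for illustration.

```python
# Hypothetical sample strings, chosen only for illustration.
displayed = "z<0x01> B."   # literal text: '<', '0', 'x', '0', '1', '>'
actual = "z\x01 B."        # one control character with code point 1

# The displayed form is six characters longer than the real data.
print(len(displayed), len(actual))

# repr() shows the escape sequence instead of the invisible character.
print(repr(actual))

# The substring "0x01" is not in the real data, but "\x01" is.
print("0x01" in actual, "\x01" in actual)
```

Printing repr(s) for a slice of the imported string is a quick way to see which of the two forms is really present.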



In Python, these characters can be written in string literals with escape sequences of the form \xHH, where HH is two hexadecimal digits.
The fragment from the first image is probably equal to the following string:

    "sich z\x01 B. irgendeine"

Your attempts to remove them were close.
s = s.replace('\x01', '.') should work.
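If more than one kind of control character shows up (the question mentions <0x00> as well), a character class can cover them in one pass. This is only a sketch under assumptions: the name replace_control_chars, the choice to keep tab, newline and carriage return, and the sample text are not from the question.

```python
import re

# Matches C0 control characters and DEL, but leaves tab (\x09),
# newline (\x0a) and carriage return (\x0d) alone. Which characters
# to keep is an assumption; adjust the class as needed.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")

def replace_control_chars(text, replacement="."):
    """Return text with every matched control character replaced."""
    return CONTROL_CHARS.sub(replacement, text)

# Hypothetical sample containing \x01, \x00 and an umlaut.
sample = "sich z\x01 B. irgendeine \x00 Prüfung\n"

single = sample.replace("\x01", ".")     # the one-character fix from above
cleaned = replace_control_chars(sample)  # all control characters at once

print(repr(single))
print(repr(cleaned))
```

Umlauts are untouched in both cases, because they are ordinary letters in a Python 3 str and fall outside the matched range.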






• Yep, that did it … thank you. FYI, I figured out that I introduced the characters myself during re.sub replacements. For example, re.sub('(?<=\w)([,.!?;])(?=\w)', u'\1 ', s) backreferenced the replaced character and thus introduced the "single byte with the value 1". The regex module apparently does a better job at this: '(?<=\w)p;,.!?(?=\w)'.

  – P. A. Monsaille
  Mar 27 at 15:18

• I don't think re.sub with \n backreferences will introduce control characters. That is, unless you misspell the backreferences as \x01, of course. Btw, if this answer solved the problem you described, consider accepting it through the tick on the left.

  – lenz
  Mar 27 at 18:57
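One plausible reading of this exchange is that the replacement string was not a raw string. In a normal Python string literal, '\1' is an octal escape for the character with code point 1, so re.sub never sees a backreference. The sketch below assumes the pattern from the first comment and a made-up input string.

```python
import re

# Pattern as quoted in the comment above; the input text is hypothetical.
pattern = r"(?<=\w)([,.!?;])(?=\w)"
text = "sich z.B. irgendeine"

# Non-raw replacement: Python turns '\1' into chr(1) before re.sub sees it,
# so the matched punctuation is replaced by a control character.
broken = re.sub(pattern, "\1 ", text)

# Raw replacement: re.sub receives a backslash followed by '1' and
# substitutes the text captured by group 1.
fixed = re.sub(pattern, r"\1 ", text)

print(repr(broken))
print(repr(fixed))
```

Here broken contains the same "z\x01 B." fragment as the string in the answer, while fixed keeps the full stop and only adds a space after it.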










answered Mar 27 at 14:03
lenz



