Extract names from text


I'm trying to extract a list of rugby players' names from a string. The string contains all of the information from a table: the headers (team names) and the name of the player in each position for each team. It also has each player's ranking (the RPI column), but I don't care about that.

Note that the numbers 1-15 indicate positions, and there are always two names following each position (home player and away player).

Here's the string:



"Team Sheets # LIO Lions RPI JAG Jaguares RPI 1 Dylan Smith 83 Juan Pablo Zeiss 59 2 Malcolm Marx 90 Julian Montoya 73 3 Carlu Sadie 78 Enrique Pieretto Heilan 54 4 Ruan Vermaak 72 Guido Petti Pagadizaval 77 5 Rhyno Herbst 72 Matias Alemanno 67 6 Marnus Schoeman 82 Juan Manuel Leguizamon 58 7 Vincent Tshituka 64 Marcos Kremer 55 8 Kwagga Smith 88 Rodrigo Bruni 62 9 Ross Cronje 74 Martin Landajo 52 10 Elton Jantjies 80 Joaquin Diaz Bonilla 62 11 Courtnall Skosan 76 Emiliano Boffelli 75 12 Franco Naude 52 Bautista Ezcurra 66 13 Wandisile Simelane 73 Matias Moroni 75 14 Sylvian Mahuza 76 Sebastian Cancelliere 65 15 Andries Coetzee 73 Joaquin Tuculet 68 Substitutes # LIO Lions RPI JAG Jaguares RPI 16 Pieter Jansen 58 Gaspar Baldunciel 61 17 Nathan McBeth 60 Santiago Garcia Botta 65 18 Frans van Wyk 58 Santiago Medrano 72 19 Stephan Lewies 81 Tomas Lavanini 68 20 James Venter 61 Tomas Lezana 62 21 Dillon Smit 61 Tomas Cubelli 63 22 Harold Vorster 69 Juan Cruz Mallia 66 23 Gianni Lombard 64 Ramiro Moyano 78"


So basically, what I want is just the list of names, with the team names as the headers, e.g.:

Lions          Jaguares

Dylan Smith    Juan Pablo Zeiss
Malcolm Marx   Julian Montoya
...            ...


Any help would be much appreciated!







Tags: r, regex














edited Mar 26 at 5:42 by Uwe Keim
asked Mar 26 at 5:40 by Liam Young







Just a suggestion: it looks like the data is already in tabular form. It might be a lot more logical and easier to read it in as a data frame. Is that so?

– R.S.
Mar 26 at 6:26












2 Answers
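While I agree with R.S.'s comment that it would be better to read the data in as a data frame directly, here's my solution using regex:

    # x is the team-sheet string from the question

    # build a "player name - RPI" pattern:
    # two or more words followed by a 1-2 digit number
    pattern <- "[a-zA-Z]+(\\s[a-zA-Z]+)+\\s+\\d{1,2}"

    # find all matches in the string and extract them
    m <- gregexpr(pattern, x)
    plyrs <- regmatches(x, m)[[1]]

    # the column headers ("LIO Lions RPI JAG Jaguares RPI <n>") also match,
    # so drop them before pairing up the players
    plyrs <- plyrs[!grepl("RPI", plyrs)]

    # strip the trailing RPI so that only the name remains
    plyrs <- sub("\\s+\\d{1,2}$", "", plyrs)

    # matches alternate home/away: odd positions are Lions, even are Jaguares
    data.frame(lions = plyrs[c(TRUE, FALSE)],
               jaguares = plyrs[c(FALSE, TRUE)],
               stringsAsFactors = FALSE)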
















answered Mar 26 at 8:04 by MarkusN
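For comparison, here is a minimal Python sketch of the same "match a name followed by a number, then split alternating matches" idea. It is not part of the answer above: the names `text`, `NAME_BEFORE_NUMBER` and `split_players` are made up for illustration, the input is a shortened excerpt of the string from the question, and the pattern assumes every player name has at least two purely alphabetic words.

```python
import re

# Excerpt of the team-sheet string from the question (shortened for brevity).
text = (
    "Team Sheets # LIO Lions RPI JAG Jaguares RPI "
    "1 Dylan Smith 83 Juan Pablo Zeiss 59 "
    "2 Malcolm Marx 90 Julian Montoya 73 "
    "Substitutes # LIO Lions RPI JAG Jaguares RPI "
    "18 Frans van Wyk 58 Santiago Medrano 72"
)

# Two or more alphabetic words that are followed by a 1-2 digit number.
# The lookahead keeps the number (the RPI) out of the match itself.
NAME_BEFORE_NUMBER = re.compile(r"[A-Za-z]+(?:\s[A-Za-z]+)+(?=\s+\d{1,2}\b)")


def split_players(sheet):
    """Return (home, away) name lists from a flattened team sheet."""
    matches = NAME_BEFORE_NUMBER.findall(sheet)
    # The column header "LIO Lions RPI JAG Jaguares RPI" is also followed by
    # a number (the first position), so drop any match containing "RPI".
    names = [m for m in matches if "RPI" not in m.split()]
    # Names alternate home, away, home, away, ...
    return names[0::2], names[1::2]


home, away = split_players(text)
print(home)
print(away)
```

Filtering on the literal word "RPI" is a shortcut that relies on the header layout shown in the question; a different header would need a different filter.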























First, you could try to turn the single long string into a table structure. Something like this (in Python) could get you started:

    import re

    data = 'Team Sheets # LIO Lions RPI JAG Jaguares RPI 1 Dylan Smith 83 Juan Pablo Zeiss 59 2 Malcolm Marx 90 Julian Montoya 73 3 Carlu Sadie 78 Enrique Pieretto Heilan 54 4 Ruan Vermaak 72 Guido Petti Pagadizaval 77 5 Rhyno Herbst 72 Matias Alemanno 67 6 Marnus Schoeman 82 Juan Manuel Leguizamon 58 7 Vincent Tshituka 64 Marcos Kremer 55 8 Kwagga Smith 88 Rodrigo Bruni 62 9 Ross Cronje 74 Martin Landajo 52 10 Elton Jantjies 80 Joaquin Diaz Bonilla 62 11 Courtnall Skosan 76 Emiliano Boffelli 75 12 Franco Naude 52 Bautista Ezcurra 66 13 Wandisile Simelane 73 Matias Moroni 75 14 Sylvian Mahuza 76 Sebastian Cancelliere 65 15 Andries Coetzee 73 Joaquin Tuculet 68 Substitutes # LIO Lions RPI JAG Jaguares RPI 16 Pieter Jansen 58 Gaspar Baldunciel 61 17 Nathan McBeth 60 Santiago Garcia Botta 65 18 Frans van Wyk 58 Santiago Medrano 72 19 Stephan Lewies 81 Tomas Lavanini 68 20 James Venter 61 Tomas Lezana 62 21 Dillon Smit 61 Tomas Cubelli 63 22 Harold Vorster 69 Juan Cruz Mallia 66 23 Gianni Lombard 64 Ramiro Moyano 78'
    # collapse runs of a repeated whitespace character into one
    data = re.sub(r'(\s)\1{1,}', r'\1', data)
    # start a new line at the first position number after each header
    data = re.sub(r'RPI\s(\d+)', r'\n\1', data)
    # put each header row on its own line
    data = re.sub(r'(#)\s', r'\n\1', data)
    # start a new line at each position number that follows an away
    # player's RPI (this also drops that RPI)
    print(re.sub(r'\d+\s(\d+)', r'\n\1', data))
















answered Mar 26 at 7:35 by Xenobiologist
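Building on the idea of recovering the table structure, the sketch below goes one step further and produces the two-column layout the question asks for, with the team names as headers. It is an illustration rather than part of the answer above: `HEADER`, `ROW` and `team_table` are hypothetical names, the input is a shortened excerpt of the question's string, and the header pattern assumes each team name is a single word placed after its three-letter code.

```python
import re

# Excerpt of the team-sheet string from the question (shortened for brevity).
text = (
    "Team Sheets # LIO Lions RPI JAG Jaguares RPI "
    "1 Dylan Smith 83 Juan Pablo Zeiss 59 "
    "2 Malcolm Marx 90 Julian Montoya 73 "
    "Substitutes # LIO Lions RPI JAG Jaguares RPI "
    "18 Frans van Wyk 58 Santiago Medrano 72"
)

# Header: "# <code> <team> RPI <code> <team> RPI"; capture the two team names.
HEADER = re.compile(r"#\s+\S+\s+(\S+)\s+RPI\s+\S+\s+(\S+)\s+RPI")
# Row: position, home name, home RPI, away name, away RPI.
# \D+? lazily takes the non-digit text (the name) up to the next number.
ROW = re.compile(r"\b(\d{1,2})\s+(\D+?)\s+(\d{1,2})\s+(\D+?)\s+(\d{1,2})\b")


def team_table(sheet):
    """Return (team_names, rows), where each row is (home_name, away_name)."""
    header = HEADER.search(sheet)
    if header is None:
        raise ValueError("no header found in team sheet")
    rows = [(m.group(2), m.group(4)) for m in ROW.finditer(sheet)]
    return header.groups(), rows


teams, rows = team_table(text)

# Print a simple left-aligned two-column table.
width = max(len(home) for home, _ in rows) + 2
print(f"{teams[0]:<{width}}{teams[1]}")
for home, away in rows:
    print(f"{home:<{width}}{away}")
```

Matching a whole row at once means the position number and both RPIs act as delimiters, so no separate step is needed to filter out the header rows.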


























