


How can a method which evaluates a list to determine if it contains specific consecutive items be improved?


















































I have a nested list of tens of millions of lists (I can use tuples also). Each list is 2-7 items long. Each item in a list is a string of 1-5 characters and occurs no more than once per list. (I use single char items in my example below for simplicity)



    # Example nestedList:

    nestedList = [
        ['a', 'e', 'O', 'I', 'g', 's'],
        ['w', 'I', 'u', 'O', 's', 'g'],
        ['e', 'z', 's', 'I', 'O', 'g']
    ]


I need to find which lists in my nested list contain a given pair of items next to each other, in either order, so I can process those lists while ignoring the rest. This needs to be as efficient as possible.



I am using the following function but it seems pretty slow and I just know there has to be a smarter way to do this.



    def isBadInList(bad, checkThisList):
        # Compare every adjacent pair of items against the bad pair, in either order.
        numChecks = len(checkThisList) - 1
        for x in range(numChecks):
            if checkThisList[x] == bad[0] and checkThisList[x + 1] == bad[1]:
                return True
            elif checkThisList[x] == bad[1] and checkThisList[x + 1] == bad[0]:
                return True
        return False


I call it like this:

    bad = ['O', 'I']

    for checkThisList in nestedList:
        result = isBadInList(bad, checkThisList)
        if result:
            doStuffToList(checkThisList)

    # isBadInList() returns True only for the first and third lists in nestedList, and False for the second.


I need a way to do this faster if possible. I can use tuples instead of lists, or whatever it takes.







Tags: python, list, set, contains, intersection














asked Mar 26 at 1:40 by sloganq, edited Mar 26 at 3:05












  • (1) I assume the string items aren't all one character long? (2) Do you plan to run this operation often for the same value of nestedList and different values of bad, or vice versa, or is everything different on each run?

    – Michael Butscher
    Mar 26 at 1:55












  • (3) How many different string items are there roughly?

    – Michael Butscher
    Mar 26 at 1:59











  • See Checking if list is a sublist.

    – martineau
    Mar 26 at 2:12











  • Each string item is 1-5 characters long. Also, for the future, I'm considering switching the string items to ints, each of which will represent a unique string item.

    – sloganq
    Mar 26 at 3:00











  • I kind of cheated to solve my problem. I found a way to represent pairs of items as integers. So now I can simply use: is intX in listY.

    – sloganq
    Mar 26 at 21:08
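The last comment does not say how the pairs were turned into integers. Purely as an illustration, and not as the asker's actual method, one way this could work is to give every distinct item a small integer id and pack the two ids of each adjacent pair into one integer. All names below (`item_id`, `pair_code`, `encode`, `BASE`) are hypothetical, and `BASE` is an assumed upper bound on the number of distinct items.

```python
from itertools import count

# Hypothetical helpers; none of these names come from the question.
_ids = {}
_next_id = count()

BASE = 1 << 20  # assumption: fewer than about one million distinct items


def item_id(item):
    """Return a stable integer for a string item, assigning one on first use."""
    if item not in _ids:
        _ids[item] = next(_next_id)
    return _ids[item]


def pair_code(a, b):
    """Encode an unordered pair of items as a single integer."""
    x, y = sorted((item_id(a), item_id(b)))
    return x * BASE + y


def encode(items):
    """Replace a list of items by the set of codes of its adjacent pairs."""
    return frozenset(pair_code(a, b) for a, b in zip(items, items[1:]))


nestedList = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g'],
]

encoded = [encode(lst) for lst in nestedList]   # done once, up front
bad_code = pair_code('O', 'I')

# The per-list check is now a single membership test.
hits = [lst for lst, codes in zip(nestedList, encoded) if bad_code in codes]
print(hits)  # the first and third lists
```

The encoding costs one pass over the data, so it only pays off if the same lists are checked against more than one pair.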





























2 Answers













































































    nestedList = [
        ['a', 'e', 'O', 'I', 'g', 's'],
        ['w', 'I', 'u', 'O', 's', 'g'],
        ['e', 'z', 's', 'I', 'O', 'g']
    ]

    # First create a map from each adjacent pair to the lists that contain it.
    pairdict = dict()

    for i in range(len(nestedList)):
        for j in range(len(nestedList[i]) - 1):
            # Store the pair in both orders so a lookup works either way.
            pair1 = (nestedList[i][j], nestedList[i][j + 1])
            if pair1 in pairdict:
                pairdict[pair1].append(i + 1)
            else:
                pairdict[pair1] = [i + 1]
            pair2 = (nestedList[i][j + 1], nestedList[i][j])
            if pair2 in pairdict:
                pairdict[pair2].append(i + 1)
            else:
                pairdict[pair2] = [i + 1]

    del nestedList

    print(pairdict.get(('e', 'z'), None))  # positions are 1-based

Build a pair from every two adjacent items and store it in a map: the key is the pair and the value is the list of (1-based) positions of the lists that contain it. You can then delete the original list if it takes too much memory, use the dict to look up any pair, and print the indexes of the lists in which it appears.

















answered Mar 26 at 2:07 by goudan
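The pair-index idea in the answer above can be written more compactly. The following is only a sketch under two assumptions that the question does not confirm: the same nestedList is queried for many different pairs, and the original lists must stay available so they can be processed afterwards. The names `build_pair_index`, `pair_index` and `matching` are made up for this example, and the indexes here are 0-based, unlike in the answer.

```python
from collections import defaultdict

nestedList = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g'],
]


def build_pair_index(lists):
    """Map each unordered adjacent pair to the indexes of the lists containing it."""
    index = defaultdict(list)
    for i, items in enumerate(lists):
        # zip(items, items[1:]) walks over neighbouring items.
        for a, b in zip(items, items[1:]):
            # A frozenset key makes ('O', 'I') and ('I', 'O') the same entry.
            index[frozenset((a, b))].append(i)
    return index


pair_index = build_pair_index(nestedList)

bad = ['O', 'I']
# .get() avoids creating an empty entry for a pair that never occurs.
matching = [nestedList[i] for i in pair_index.get(frozenset(bad), [])]
print(matching)  # the first and third lists
```

Because each item occurs at most once per list, a pair can be recorded at most once per list, so the index lists contain no repeated positions. The index costs extra memory on top of the lists themselves, so it is a trade of space for lookup time.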




































I think you could use a regex here to speed this up, although it is still a sequential operation. You have to iterate over all n lists, and for each list you have to join and scan its items, so the cost is O(n * m), where m is the length of a sublist (at most 7 here). The regex can only improve the constant factor.

    import re

    # Two consecutive characters from {O, I}; since items never repeat
    # within a list, this matches only OI or IO.
    p = re.compile('[OI]{2}|[IO]{2}')

    def is_bad(pattern, to_check):
        # Yield True or False for each sublist, in order.
        for item in to_check:
            maybe_found = pattern.search(''.join(item))
            if maybe_found:
                yield True
            else:
                yield False


    l = list(is_bad(p, nestedList))

    print(l)
    # [True, False, True]
















answered Mar 26 at 2:15 by aws_apprentice
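The regex answer above joins the items without a separator, which is fine for the single-character example but can give false matches once items are 1-5 characters long: 'aO' followed by 'Ib' joins to a string containing 'OI'. A possible adaptation, offered only as a sketch, joins the items with a separator and escapes the items in the pattern. The separator `SEP` is assumed never to occur inside an item, and `compile_pair` and `has_pair` are hypothetical names.

```python
import re

SEP = '\x1f'  # assumption: this control character never occurs inside an item


def compile_pair(bad):
    """Build a pattern matching the two items side by side, in either order."""
    a, b = (re.escape(s) for s in bad)
    sep = re.escape(SEP)
    return re.compile(f'{sep}(?:{a}{sep}{b}|{b}{sep}{a}){sep}')


def has_pair(pattern, items):
    """Return True if the pair occurs as two adjacent whole items."""
    # Separators on both ends let the first and last items match too.
    return pattern.search(SEP + SEP.join(items) + SEP) is not None


nestedList = [
    ['a', 'e', 'O', 'I', 'g', 's'],
    ['w', 'I', 'u', 'O', 's', 'g'],
    ['e', 'z', 's', 'I', 'O', 'g'],
]

pattern = compile_pair(['O', 'I'])
flags = [has_pair(pattern, lst) for lst in nestedList]
print(flags)  # [True, False, True]

# Multi-character items no longer produce a false match.
print(has_pair(pattern, ['aO', 'Ib']))  # False
```

Joining every list into a new string has its own cost, so whether this beats a plain loop over the items is something to measure on the real data rather than assume.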


























