How can I merge two dataframes not by values but by ranges of values?How to merge two dictionaries in a single expression?How can I safely create a nested directory?How can I make a time delay in Python?How do I sort a dictionary by value?How to sort a dataframe by multiple column(s)How do I concatenate two lists in Python?How to change the order of DataFrame columns?“Large data” work flows using pandasHow to iterate over rows in a DataFrame in Pandas?Select rows from a DataFrame based on values in a column in pandas

Phonetic distortion when words are borrowed among languages

What is the use of command?

How dangerous is a very out-of-true disc brake wheel?

Why aren't faces sharp in my f/1.8 portraits even though I'm carefully using center-point autofocus?

Job interview by video at home and privacy concerns

Can Familiars read and use spell scrolls?

Avoiding dust scattering when you drill

How do we know neutrons have no charge?

Sending mail to the Professor for PhD, after seeing his tweet

How is this situation not a checkmate?

Can UK supreme court justices be evaluated ideologically?

Knights and Knaves: What does C say?

Why the first octet of a MAC address always end with a binary 0?

MaxCounters solution in C# from Codility

Meaning of "fin" in "fin dai tempi"

Can a passenger predict that an airline or a tour operator is about to go bankrupt?

What is the meaning of first flight and introduction in aircraft production?

What are one's options when facing religious discrimination at the airport?

Confusion regarding control system of Mars Rover?

Is "weekend warrior" derogatory?

Why does `FindFit` fail so badly in this simple case?

Why most footers have a background color as a divider of section?

Replace zeros in a list with last nonzero value

Caro-Kann c4-c5 push



How can I merge two dataframes not by values but by ranges of values?


How to merge two dictionaries in a single expression?How can I safely create a nested directory?How can I make a time delay in Python?How do I sort a dictionary by value?How to sort a dataframe by multiple column(s)How do I concatenate two lists in Python?How to change the order of DataFrame columns?“Large data” work flows using pandasHow to iterate over rows in a DataFrame in Pandas?Select rows from a DataFrame based on values in a column in pandas






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;









1















Dataframe A:



pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)


Dataframe B:



pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)


How can I merge these dataframes on a condition like:

- Dataframe A's price is between Dataframe B's low and high prices.

AND

- Dataframe A's group is between Dataframe B's low and high groups.



Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.




EDIT:



PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.



PS2: A has millions of records, B has hundreds.










share|improve this question


























  • 1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?

    – cs95
    Mar 28 at 20:43












  • 1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.

    – Victor Valente
    Mar 28 at 20:50












  • What do you mean by merging two data? What's your expected output for the given two?

    – Quang Hoang
    Mar 28 at 21:37

















1















Dataframe A:



pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)


Dataframe B:



pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)


How can I merge these dataframes on a condition like:

- Dataframe A's price is between Dataframe B's low and high prices.

AND

- Dataframe A's group is between Dataframe B's low and high groups.



Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.




EDIT:



PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.



PS2: A has millions of records, B has hundreds.










share|improve this question


























  • 1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?

    – cs95
    Mar 28 at 20:43












  • 1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.

    – Victor Valente
    Mar 28 at 20:50












  • What do you mean by merging two data? What's your expected output for the given two?

    – Quang Hoang
    Mar 28 at 21:37













1












1








1


1






Dataframe A:



pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)


Dataframe B:



pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)


How can I merge these dataframes on a condition like:

- Dataframe A's price is between Dataframe B's low and high prices.

AND

- Dataframe A's group is between Dataframe B's low and high groups.



Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.




EDIT:



PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.



PS2: A has millions of records, B has hundreds.










share|improve this question
















Dataframe A:



pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)


Dataframe B:



pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)


How can I merge these dataframes on a condition like:

- Dataframe A's price is between Dataframe B's low and high prices.

AND

- Dataframe A's group is between Dataframe B's low and high groups.



Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.




EDIT:



PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.



PS2: A has millions of records, B has hundreds.







python pandas dataframe






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Mar 28 at 20:49







Victor Valente

















asked Mar 28 at 20:39









Victor ValenteVictor Valente

4196 silver badges15 bronze badges




4196 silver badges15 bronze badges















  • 1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?

    – cs95
    Mar 28 at 20:43












  • 1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.

    – Victor Valente
    Mar 28 at 20:50












  • What do you mean by merging two data? What's your expected output for the given two?

    – Quang Hoang
    Mar 28 at 21:37

















  • 1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?

    – cs95
    Mar 28 at 20:43












  • 1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.

    – Victor Valente
    Mar 28 at 20:50












  • What do you mean by merging two data? What's your expected output for the given two?

    – Quang Hoang
    Mar 28 at 21:37
















1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?

– cs95
Mar 28 at 20:43






1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?

– cs95
Mar 28 at 20:43














1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.

– Victor Valente
Mar 28 at 20:50






1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.

– Victor Valente
Mar 28 at 20:50














What do you mean by merging two data? What's your expected output for the given two?

– Quang Hoang
Mar 28 at 21:37





What do you mean by merging two data? What's your expected output for the given two?

– Quang Hoang
Mar 28 at 21:37












2 Answers
2






active

oldest

votes


















0
















If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).



If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.






share|improve this answer
































    0
















    You cannot easily do that for simultaneously price and group, but merge_asof can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc will only keep rows meeting other conditions.



    pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
    '(price<=high_price)&(group>=low_group)&(group<=high_group)')





    share|improve this answer


























      Your Answer






      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );














      draft saved

      draft discarded
















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55406497%2fhow-can-i-merge-two-dataframes-not-by-values-but-by-ranges-of-values%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      0
















      If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).



      If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.






      share|improve this answer





























        0
















        If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).



        If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.






        share|improve this answer



























          0














          0










          0









          If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).



          If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.






          share|improve this answer













          If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).



          If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Mar 28 at 20:57









          Matthew Salvatore ViglioneMatthew Salvatore Viglione

          6653 silver badges21 bronze badges




          6653 silver badges21 bronze badges


























              0
















              You cannot easily do that for simultaneously price and group, but merge_asof can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc will only keep rows meeting other conditions.



              pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
              '(price<=high_price)&(group>=low_group)&(group<=high_group)')





              share|improve this answer





























                0
















                You cannot easily do that for simultaneously price and group, but merge_asof can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc will only keep rows meeting other conditions.



                pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
                '(price<=high_price)&(group>=low_group)&(group<=high_group)')





                share|improve this answer



























                  0














                  0










                  0









                  You cannot easily do that for simultaneously price and group, but merge_asof can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc will only keep rows meeting other conditions.



                  pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
                  '(price<=high_price)&(group>=low_group)&(group<=high_group)')





                  share|improve this answer













                  You cannot easily do that for simultaneously price and group, but merge_asof can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc will only keep rows meeting other conditions.



                  pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
                  '(price<=high_price)&(group>=low_group)&(group<=high_group)')






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Mar 28 at 22:22









                  Serge BallestaSerge Ballesta

                  89.9k9 gold badges71 silver badges148 bronze badges




                  89.9k9 gold badges71 silver badges148 bronze badges































                      draft saved

                      draft discarded















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55406497%2fhow-can-i-merge-two-dataframes-not-by-values-but-by-ranges-of-values%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      Kamusi Yaliyomo Aina za kamusi | Muundo wa kamusi | Faida za kamusi | Dhima ya picha katika kamusi | Marejeo | Tazama pia | Viungo vya nje | UrambazajiKuhusu kamusiGo-SwahiliWiki-KamusiKamusi ya Kiswahili na Kiingerezakuihariri na kuongeza habari

                      Swift 4 - func physicsWorld not invoked on collision? The Next CEO of Stack OverflowHow to call Objective-C code from Swift#ifdef replacement in the Swift language@selector() in Swift?#pragma mark in Swift?Swift for loop: for index, element in array?dispatch_after - GCD in Swift?Swift Beta performance: sorting arraysSplit a String into an array in Swift?The use of Swift 3 @objc inference in Swift 4 mode is deprecated?How to optimize UITableViewCell, because my UITableView lags

                      Access current req object everywhere in Node.js ExpressWhy are global variables considered bad practice? (node.js)Using req & res across functionsHow do I get the path to the current script with Node.js?What is Node.js' Connect, Express and “middleware”?Node.js w/ express error handling in callbackHow to access the GET parameters after “?” in Express?Modify Node.js req object parametersAccess “app” variable inside of ExpressJS/ConnectJS middleware?Node.js Express app - request objectAngular Http Module considered middleware?Session variables in ExpressJSAdd properties to the req object in expressjs with Typescript