How can I merge two dataframes not by values but by ranges of values?How to merge two dictionaries in a single expression?How can I safely create a nested directory?How can I make a time delay in Python?How do I sort a dictionary by value?How to sort a dataframe by multiple column(s)How do I concatenate two lists in Python?How to change the order of DataFrame columns?“Large data” work flows using pandasHow to iterate over rows in a DataFrame in Pandas?Select rows from a DataFrame based on values in a column in pandas
Phonetic distortion when words are borrowed among languages
What is the use of command?
How dangerous is a very out-of-true disc brake wheel?
Why aren't faces sharp in my f/1.8 portraits even though I'm carefully using center-point autofocus?
Job interview by video at home and privacy concerns
Can Familiars read and use spell scrolls?
Avoiding dust scattering when you drill
How do we know neutrons have no charge?
Sending mail to the Professor for PhD, after seeing his tweet
How is this situation not a checkmate?
Can UK supreme court justices be evaluated ideologically?
Knights and Knaves: What does C say?
Why the first octet of a MAC address always end with a binary 0?
MaxCounters solution in C# from Codility
Meaning of "fin" in "fin dai tempi"
Can a passenger predict that an airline or a tour operator is about to go bankrupt?
What is the meaning of first flight and introduction in aircraft production?
What are one's options when facing religious discrimination at the airport?
Confusion regarding control system of Mars Rover?
Is "weekend warrior" derogatory?
Why does `FindFit` fail so badly in this simple case?
Why most footers have a background color as a divider of section?
Replace zeros in a list with last nonzero value
Caro-Kann c4-c5 push
How can I merge two dataframes not by values but by ranges of values?
How to merge two dictionaries in a single expression?How can I safely create a nested directory?How can I make a time delay in Python?How do I sort a dictionary by value?How to sort a dataframe by multiple column(s)How do I concatenate two lists in Python?How to change the order of DataFrame columns?“Large data” work flows using pandasHow to iterate over rows in a DataFrame in Pandas?Select rows from a DataFrame based on values in a column in pandas
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;
Dataframe A:
pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)
Dataframe B:
pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)
How can I merge these dataframes on a condition like:
- Dataframe A's price is between Dataframe B's low and high prices.
AND
- Dataframe A's group is between Dataframe B's low and high groups.
Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.
EDIT:
PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.
PS2: A has millions of records, B has hundreds.
python pandas dataframe
add a comment
|
Dataframe A:
pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)
Dataframe B:
pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)
How can I merge these dataframes on a condition like:
- Dataframe A's price is between Dataframe B's low and high prices.
AND
- Dataframe A's group is between Dataframe B's low and high groups.
Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.
EDIT:
PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.
PS2: A has millions of records, B has hundreds.
python pandas dataframe
1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?
– cs95
Mar 28 at 20:43
1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.
– Victor Valente
Mar 28 at 20:50
What do you mean by merging two data? What's your expected output for the given two?
– Quang Hoang
Mar 28 at 21:37
add a comment
|
Dataframe A:
pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)
Dataframe B:
pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)
How can I merge these dataframes on a condition like:
- Dataframe A's price is between Dataframe B's low and high prices.
AND
- Dataframe A's group is between Dataframe B's low and high groups.
Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.
EDIT:
PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.
PS2: A has millions of records, B has hundreds.
python pandas dataframe
Dataframe A:
pd.Dataframe(
'price': [50, 150, 250],
'group':[2, 5, 10]
)
Dataframe B:
pd.Dataframe(
'low_price': [0, 100, 200],
'high_price': [99, 199, 299],
'low_group':[0, 4, 8],
'high_group':[3, 6, 12],
'something':['A', 'B', 'C']
)
How can I merge these dataframes on a condition like:
- Dataframe A's price is between Dataframe B's low and high prices.
AND
- Dataframe A's group is between Dataframe B's low and high groups.
Is there a way to do this without iterating through rows and checking one by one? I already did it like that and it costs O(n^2), not scalable.
EDIT:
PS1: Every row in A has 0 or 1 match in B. I'm looking for an "inner" merge.
PS2: A has millions of records, B has hundreds.
python pandas dataframe
python pandas dataframe
edited Mar 28 at 20:49
Victor Valente
asked Mar 28 at 20:39
Victor ValenteVictor Valente
4196 silver badges15 bronze badges
4196 silver badges15 bronze badges
1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?
– cs95
Mar 28 at 20:43
1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.
– Victor Valente
Mar 28 at 20:50
What do you mean by merging two data? What's your expected output for the given two?
– Quang Hoang
Mar 28 at 21:37
add a comment
|
1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?
– cs95
Mar 28 at 20:43
1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.
– Victor Valente
Mar 28 at 20:50
What do you mean by merging two data? What's your expected output for the given two?
– Quang Hoang
Mar 28 at 21:37
1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?
– cs95
Mar 28 at 20:43
1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?
– cs95
Mar 28 at 20:43
1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.
– Victor Valente
Mar 28 at 20:50
1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.
– Victor Valente
Mar 28 at 20:50
What do you mean by merging two data? What's your expected output for the given two?
– Quang Hoang
Mar 28 at 21:37
What do you mean by merging two data? What's your expected output for the given two?
– Quang Hoang
Mar 28 at 21:37
add a comment
|
2 Answers
2
active
oldest
votes
If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).
If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.
add a comment
|
You cannot easily do that for simultaneously price and group, but merge_asof
can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc
will only keep rows meeting other conditions.
pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
'(price<=high_price)&(group>=low_group)&(group<=high_group)')
add a comment
|
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55406497%2fhow-can-i-merge-two-dataframes-not-by-values-but-by-ranges-of-values%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).
If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.
add a comment
|
If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).
If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.
add a comment
|
If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).
If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.
If your data is already sorted, you may be able to get it down to O(log n) if you implement some sort of binary search to find where to partition the data (Assuming the merge process doesn't take more than O(log n) as well).
If your data isn't sorted, the best you could do would be O(n), merging at the same time as you discover a merge candidate.
answered Mar 28 at 20:57
Matthew Salvatore ViglioneMatthew Salvatore Viglione
6653 silver badges21 bronze badges
6653 silver badges21 bronze badges
add a comment
|
add a comment
|
You cannot easily do that for simultaneously price and group, but merge_asof
can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc
will only keep rows meeting other conditions.
pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
'(price<=high_price)&(group>=low_group)&(group<=high_group)')
add a comment
|
You cannot easily do that for simultaneously price and group, but merge_asof
can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc
will only keep rows meeting other conditions.
pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
'(price<=high_price)&(group>=low_group)&(group<=high_group)')
add a comment
|
You cannot easily do that for simultaneously price and group, but merge_asof
can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc
will only keep rows meeting other conditions.
pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
'(price<=high_price)&(group>=low_group)&(group<=high_group)')
You cannot easily do that for simultaneously price and group, but merge_asof
can join the relevant low_price of B for each row from A in one single pass. Then a query of a selection with loc
will only keep rows meeting other conditions.
pd.merge_asof(A, B, left_on='price', right_on='low_price').query(
'(price<=high_price)&(group>=low_group)&(group<=high_group)')
answered Mar 28 at 22:22
Serge BallestaSerge Ballesta
89.9k9 gold badges71 silver badges148 bronze badges
89.9k9 gold badges71 silver badges148 bronze badges
add a comment
|
add a comment
|
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55406497%2fhow-can-i-merge-two-dataframes-not-by-values-but-by-ranges-of-values%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1) can a row in A match multiple rows in B? 2) Will every row in A match a row in B? 3) What are the lengths of A and B in your actual data?
– cs95
Mar 28 at 20:43
1) No. 2) Nope. 3) A: millions of records, B: hundreds. Thank you, I updated the description.
– Victor Valente
Mar 28 at 20:50
What do you mean by merging two data? What's your expected output for the given two?
– Quang Hoang
Mar 28 at 21:37