Grouping observations by cutoff time
I have a list of cutoff times list = [16:30:00.100, 16:30:00.200, 16:30:00.350, 16:30:00.450].
And my observations are as follows:
16:30:00.095 A
16:30:00.097 B
16:30:00.122 C
16:30:00.255 D
16:30:00.322 E
16:30:00.420 F
16:30:00.569 G
What I want to achieve is to group my observations by cutoff time. Specifically, I want to see which of my cutoff times is able to capture each observation (e.g. the first cutoff time is fast enough to catch C, but too slow for A/B). The desired output should look something like this:
cutoff          observations captured
16:30:00.100    C
16:30:00.200    D E
16:30:00.350    F
16:30:00.450    G
not possible    A B
I have tried using pd.cut, but it doesn't seem to support time resolution down to the millisecond, or at least not that I am aware of. Any help will be greatly appreciated. Thanks!
python-3.x pandas
asked Mar 28 at 3:47
Adrian Y
1 Answer
I think the idea with cut works nicely: convert the time data to timedeltas with to_timedelta, bin them with cut, replace non-matching values with fillna, and finally aggregate with join:
import pandas as pd
print(df)
time col
0 16:30:00.095 A
1 16:30:00.097 B
2 16:30:00.122 C
3 16:30:00.255 D
4 16:30:00.322 E
5 16:30:00.420 F
6 16:30:00.569 G
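# Convert the time strings to timedeltas so they can be binned
df['time'] = pd.to_timedelta(df['time'].astype(str))
L = ['16:30:00.100', '16:30:00.200', '16:30:00.350', '16:30:00.450']
# Bin edges: the cutoffs plus a maximal upper edge, so the last bin is open-ended
v = pd.to_timedelta(L + [pd.Timedelta.max])
# Label each bin with the cutoff that opens it
df['b'] = pd.cut(df['time'], bins=v, labels=L)
# Observations before the first cutoff fall in no bin and come back as NaN
df['b'] = df['b'].cat.add_categories(['not possible'])
df['b'] = df['b'].fillna('not possible')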
print (df)
time col b
0 16:30:00.095000 A not possible
1 16:30:00.097000 B not possible
2 16:30:00.122000 C 16:30:00.100
3 16:30:00.255000 D 16:30:00.200
4 16:30:00.322000 E 16:30:00.200
5 16:30:00.420000 F 16:30:00.350
6 16:30:00.569000 G 16:30:00.450
df2 = df.groupby('b')['col'].apply(', '.join).reset_index()
print (df2)
b col
0 16:30:00.100 C
1 16:30:00.200 D, E
2 16:30:00.350 F
3 16:30:00.450 G
4 not possible A, B
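For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable names `cutoffs`, `times`, `edges` and `result` are my own, and passing `observed=True` to `groupby` is an assumption made to keep the behaviour stable across pandas versions, not part of the original answer.

```python
import pandas as pd

# Sample observations copied from the question.
df = pd.DataFrame({
    "time": ["16:30:00.095", "16:30:00.097", "16:30:00.122", "16:30:00.255",
             "16:30:00.322", "16:30:00.420", "16:30:00.569"],
    "col": list("ABCDEFG"),
})
cutoffs = ["16:30:00.100", "16:30:00.200", "16:30:00.350", "16:30:00.450"]

# Timedeltas keep full sub-second precision, so milliseconds are not lost.
times = pd.to_timedelta(df["time"])

# Bin edges: the cutoffs plus a maximal upper edge for the open-ended last bin.
edges = pd.to_timedelta(cutoffs).append(pd.to_timedelta([pd.Timedelta.max]))

# Each bin (edge_i, edge_i+1] is labelled with the cutoff that opens it.
df["b"] = pd.cut(times, bins=edges, labels=cutoffs)

# Observations before the first cutoff are NaN; give them their own category.
df["b"] = df["b"].cat.add_categories(["not possible"]).fillna("not possible")

grouped = df.groupby("b", observed=True)["col"].agg(", ".join)
result = {str(key): value for key, value in grouped.items()}
print(result)
```

Note that `pd.cut` uses right-closed bins by default, so an observation landing exactly on a cutoff would be assigned to the previous cutoff; pass `right=False` if you want it counted under the cutoff it coincides with.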
thanks for the help! One further thing though: if L contains duplicates, is it possible for me to append col to the last instance of the duplicates, instead of using duplicates='drop'?
– Adrian Y
Apr 1 at 3:54
Also, is it possible to put each new element of col in the next cell, instead of separating them with ", "?
– Adrian Y
Apr 1 at 4:07
@AdrianY - Unfortunately L cannot contain duplicates for pd.cut. For the next cell, do you think you could omit df2 = df.groupby('b')['col'].apply(', '.join).reset_index()?
– jezrael
Apr 2 at 5:17
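One possible way to approach both follow-up questions is to skip pd.cut and look up each observation's position in the cutoff list with numpy's searchsorted. This is a swapped-in technique, not what the answer above does, and it is only a sketch: the duplicated cutoff list below is hypothetical, the column names `cutoff_pos` and `cutoff` are my own, and the cutoffs are assumed to be sorted in ascending order.

```python
import numpy as np
import pandas as pd

# Sample observations copied from the question.
df = pd.DataFrame({
    "time": ["16:30:00.095", "16:30:00.097", "16:30:00.122", "16:30:00.255",
             "16:30:00.322", "16:30:00.420", "16:30:00.569"],
    "col": list("ABCDEFG"),
})

# Hypothetical cutoff list containing a repeated value (assumed sorted ascending).
cutoffs = ["16:30:00.100", "16:30:00.200", "16:30:00.200",
           "16:30:00.350", "16:30:00.450"]

t = pd.to_timedelta(df["time"]).to_numpy()
c = pd.to_timedelta(cutoffs).to_numpy()

# side="right" gives the insertion point after any equal cutoffs, so pos - 1
# is the last cutoff <= t, i.e. the last instance among duplicates.
pos = np.searchsorted(c, t, side="right") - 1

# One row per observation (no join), with -1 meaning "before the first cutoff".
df["cutoff_pos"] = pos
df["cutoff"] = [cutoffs[p] if p >= 0 else "not possible" for p in pos]
print(df)
```

With side="right", an observation that falls exactly on a cutoff counts as captured by that cutoff, which differs from the default right-closed bins of pd.cut. Because nothing is aggregated here, each observation stays in its own row, which is in the spirit of the suggestion to omit the groupby step.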
answered Mar 28 at 6:31
jezrael