How to add a MultiIndex after loading csv data into a pandas dataframe?
I am trying to add additional index rows to an existing pandas dataframe after loading csv data into it.
So let's say I load my data like this:
    from io import StringIO
    import pandas as pd
    # sep=r'\s+' replaces the deprecated delim_whitespace=True
    columns = ['Relative_Pressure', 'Volume_STP']
    df = pd.read_csv(StringIO(contents), skiprows=4, sep=r'\s+', index_col=False, header=None)
    df.columns = columns
where contents is a string in csv format. The resulting DataFrame might look something like this:
For clarity, I would now like to add additional index rows to the DataFrame, as shown here:
However, in the link these multiple index rows are generated right when the DataFrame is created. I would like to add rows for, e.g., unit or descr to the columns.
How could I do this?
python pandas dataframe
asked Mar 27 at 13:15
Axel
The solution provided at your link looks ingenious, perhaps a bit too much so. Using a MultiIndex for metadata storage has a non-trivial impact on performance and makes future updates harder to maintain. The easiest solution is to provide a README for the data. A better solution is to create a subclass, but only add a metadata property, with print_metadata to print it. You can optionally override __str__ and __unicode__ to print the metadata first, and then the super().__str__ and super().__unicode__ output. But if you are distributing a library with data, it's easier to give users a text README.
– PM Hui
Mar 27 at 13:42
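The subclass idea in the comment above could look roughly like the sketch below. It is only an illustration, not the commenter's code: the class name AnnotatedFrame, the layout of the metadata dict, and the units and descriptions are all assumptions. It relies on the pandas subclassing hooks _metadata and _constructor.

```python
import pandas as pd


class AnnotatedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that carries per-column metadata."""

    # Attribute names listed here are treated as plain attributes
    # (not columns) and are propagated by pandas where it supports it.
    _metadata = ["metadata"]

    @property
    def _constructor(self):
        # Make pandas operations return AnnotatedFrame instead of DataFrame.
        return AnnotatedFrame

    def print_metadata(self):
        # Print one line per annotated column; nothing if no metadata is set.
        for column, info in (getattr(self, "metadata", None) or {}).items():
            print(f"{column}: {info}")


df = AnnotatedFrame({"Relative_Pressure": [0.01, 0.05], "Volume_STP": [10.5, 12.0]})
# Assumed units and descriptions, for illustration only.
df.metadata = {
    "Relative_Pressure": {"unit": "p/p0", "descr": "relative pressure"},
    "Volume_STP": {"unit": "cm3/g", "descr": "adsorbed volume"},
}
df.print_metadata()
```

The column labels stay a flat Index here, so ordinary selection such as df["Volume_STP"] keeps working unchanged.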
1 Answer
You can create a MultiIndex on the columns by building the index explicitly and then assigning it to the columns, separately from reading in the data.
I'll use the example from the link you provided. The first method is to create the MultiIndex when you make the dataframe:
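    import pandas as pd

    # Tuple keys in the dict become the levels of a column MultiIndex
    df = pd.DataFrame({('A', 1, 'desc A'): [1, 2, 3], ('B', 2, 'desc B'): [4, 5, 6]})
    df.columns.names = ['NAME', 'LENGTH', 'DESCRIPTION']
    df

    NAME             A      B
    LENGTH           1      2
    DESCRIPTION desc A desc B
    0                1      4
    1                2      5
    2                3      6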
As stated, this is not what you are after. Instead, you can make the dataframe (from your file, for example), then make the MultiIndex from a set of lists, and then assign it to the columns:
    df = pd.DataFrame({'desc A': [1, 2, 3], 'desc B': [4, 5, 6]})
    # Output
    #    desc A  desc B
    # 0       1       4
    # 1       2       5
    # 2       3       6

    # Create a MultiIndex from lists (one list per level)
    index = pd.MultiIndex.from_arrays((['A', 'B'], [1, 2], ['desc A', 'desc B']))

    # Assign to the columns
    df.columns = index
    # Output
    #        A      B
    #        1      2
    #   desc A desc B
    # 0      1      4
    # 1      2      5
    # 2      3      6

    # Name the column levels
    df.columns.names = ['NAME', 'LENGTH', 'DESCRIPTION']
    # Output
    # NAME             A      B
    # LENGTH           1      2
    # DESCRIPTION desc A desc B
    # 0                1      4
    # 1                2      5
    # 2                3      6
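Applied to the situation in the question, the same steps might look like the sketch below. The sample contents string, the unit strings, and the descriptions are made up for illustration; only the column names and the read_csv arguments come from the question (with sep=r"\s+" standing in for delim_whitespace=True).

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the file contents: four header lines that are
# skipped, followed by whitespace-separated numeric columns.
contents = """header line 1
header line 2
header line 3
header line 4
0.01 10.5
0.05 12.0
0.10 13.2
"""

df = pd.read_csv(StringIO(contents), skiprows=4, sep=r"\s+", header=None)

# One list per header level, each with one entry per column.
names = ["Relative_Pressure", "Volume_STP"]
units = ["p/p0", "cm3/g"]                          # assumed units
descrs = ["relative pressure", "adsorbed volume"]  # assumed descriptions

# Build the MultiIndex after loading and assign it to the columns.
df.columns = pd.MultiIndex.from_arrays(
    [names, units, descrs], names=["name", "unit", "descr"]
)

# A single column is selected with the full tuple of level values.
volume = df[("Volume_STP", "cm3/g", "adsorbed volume")]
```

Selecting with only the first level, as in df["Volume_STP"], also works and returns a DataFrame whose columns keep the remaining unit and descr levels.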
There are other ways to construct a MultiIndex, for example, from_tuples and from_product. You can read more about MultiIndexes in the documentation.
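As a rough sketch of those two constructors, reusing the column values from the example above (the second from_product level, "x" and "y", is invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({"desc A": [1, 2, 3], "desc B": [4, 5, 6]})

level_names = ["NAME", "LENGTH", "DESCRIPTION"]

# from_tuples: one tuple per column, one tuple entry per level.
by_tuples = pd.MultiIndex.from_tuples(
    [("A", 1, "desc A"), ("B", 2, "desc B")], names=level_names
)

# from_arrays: one list per level, one entry per column.
by_arrays = pd.MultiIndex.from_arrays(
    [["A", "B"], [1, 2], ["desc A", "desc B"]], names=level_names
)

# Both constructors describe the same index here.
same = by_tuples.equals(by_arrays)

df.columns = by_tuples

# from_product: every combination of the given level values, so it yields
# len(first) * len(second) entries rather than one entry per column.
product = pd.MultiIndex.from_product([["A", "B"], ["x", "y"]], names=["NAME", "AXIS"])
```

Because from_product generates all combinations, it suits regular grids of labels; for attaching one unit or description per existing column, from_arrays or from_tuples is usually the more natural fit.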
answered Mar 27 at 13:32
willk