Slicing a pandas DataFrame raises KeyError: 'n_tokens_content'. How can I locate the bad rows efficiently?
I am trying to explore this dataset with pandas 0.20.3 in Python 3.6.2.
%pylab inline
import pandas as pd
df = pd.read_csv('OnlineNewsPopularity.csv')
df['n_tokens_content'][:9]
The last line raises this error:
KeyError Traceback (most recent call last)
~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

KeyError: 'n_tokens_content'

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last) in ()
----> 1 df['n_tokens_content'][:9]

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965
   1966     def _getitem_column(self, key):

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   1969         # get column
   1970         if self.columns.is_unique:
-> 1971             return self._get_item_cache(key)
   1972
   1973         # duplicate columns & possible reduce dimensionality

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   1643         res = cache.get(item)
   1644         if res is None:
-> 1645             values = self._data.get(item)
   1646             res = self._box_item_values(item, values)
   1647             cache[item] = res

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   3588
   3589             if not isnull(item):
-> 3590                 loc = self.items.get_loc(item)
   3591             else:
   3592                 indexer = np.arange(len(self.items))[isnull(self.items)]

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2442                 return self._engine.get_loc(key)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445
   2446         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

KeyError: 'n_tokens_content'
I suspect some rows in the CSV file cause this, since the same code works for other CSV files.
If so, how can I locate the bad rows efficiently?
python python-3.x pandas dataframe
What is your goal? What are you trying to achieve?
– Erfan
Mar 27 at 21:58
This error means there is no column called n_tokens_content in the dataframe you created. You'll have to examine the dataframe (e.g., run df.columns or df.head()) to see what your column names are.
– AlexK
Mar 27 at 22:01
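The inspection suggested in the comment above could look roughly like this. It is only a sketch: a small in-memory CSV stands in for OnlineNewsPopularity.csv, and both the sample data and the helper name `find_similar_columns` are hypothetical. Converting the column index to a list prints every name in quotes, which makes stray whitespace visible.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: each header follows a comma and a space.
sample_csv = (
    "url, timedelta, n_tokens_content\n"
    "http://a, 731.0, 219\n"
    "http://b, 731.0, 255\n"
)
df = pd.read_csv(io.StringIO(sample_csv))


def find_similar_columns(frame, wanted):
    """Return column names equal to `wanted` once surrounding whitespace is ignored."""
    return [col for col in frame.columns if str(col).strip() == wanted]


# Quoted names expose leading or trailing spaces.
print(df.columns.tolist())

# The exact key is absent, which is what triggers the KeyError.
print('n_tokens_content' in df.columns)

# Near matches show which real column name to use.
matches = find_similar_columns(df, 'n_tokens_content')
print(matches)
```

If `matches` is non-empty while the exact key is missing, the problem lies in the header line of the file rather than in any data row.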
edited Mar 27 at 22:09
asked Mar 27 at 21:18
user11103981
1 Answer
Printing df.columns shows that the column is actually named ' n_tokens_content', with a leading space, so the key 'n_tokens_content' matches no column.
Input: df.columns
Output:
Index(['url', ' timedelta', ' n_tokens_title', ' n_tokens_content',
' n_unique_tokens', ' n_non_stop_words', ' n_non_stop_unique_tokens',
' num_hrefs', ' num_self_hrefs', ' num_imgs', ' num_videos',
' average_token_length', ' num_keywords', ' data_channel_is_lifestyle',
' data_channel_is_entertainment', ' data_channel_is_bus',
' data_channel_is_socmed', ' data_channel_is_tech',
' data_channel_is_world', ' kw_min_min', ' kw_max_min', ' kw_avg_min',
' kw_min_max', ' kw_max_max', ' kw_avg_max', ' kw_min_avg',
' kw_max_avg', ' kw_avg_avg', ' self_reference_min_shares',
' self_reference_max_shares', ' self_reference_avg_sharess',
' weekday_is_monday', ' weekday_is_tuesday', ' weekday_is_wednesday',
' weekday_is_thursday', ' weekday_is_friday', ' weekday_is_saturday',
' weekday_is_sunday', ' is_weekend', ' LDA_00', ' LDA_01', ' LDA_02',
' LDA_03', ' LDA_04', ' global_subjectivity',
' global_sentiment_polarity', ' global_rate_positive_words',
' global_rate_negative_words', ' rate_positive_words',
' rate_negative_words', ' avg_positive_polarity',
' min_positive_polarity', ' max_positive_polarity',
' avg_negative_polarity', ' min_negative_polarity',
' max_negative_polarity', ' title_subjectivity',
' title_sentiment_polarity', ' abs_title_subjectivity',
' abs_title_sentiment_polarity', ' shares'],
dtype='object')
Include the leading space when indexing the column: df[' n_tokens_content'][:9]
Output:
0 219
1 255
2 211
3 531
4 1072
5 370
6 960
7 989
8 97
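If typing the leading space each time is inconvenient, the names can be normalised once at load time. The following is a sketch under the assumption that a small in-memory CSV stands in for the real file (the sample data is hypothetical). `Index.str.strip` and the `skipinitialspace` option of `read_csv` are existing pandas features.

```python
import io

import pandas as pd

# Hypothetical stand-in for OnlineNewsPopularity.csv.
sample_csv = (
    "url, timedelta, n_tokens_content\n"
    "http://a, 731.0, 219\n"
    "http://b, 731.0, 255\n"
)

# Option 1: strip whitespace from the column names after loading.
df = pd.read_csv(io.StringIO(sample_csv))
df.columns = df.columns.str.strip()
first_rows = df['n_tokens_content'][:9]
print(first_rows)

# Option 2: ask the parser to skip spaces that follow each delimiter.
df2 = pd.read_csv(io.StringIO(sample_csv), skipinitialspace=True)
print(df2.columns.tolist())
```

Option 1 only touches the header, while option 2 also drops leading spaces from every field, which may or may not be wanted for string columns.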
answered Mar 27 at 22:19
SravanthiG
513 bronze badges