slicing pandas dataframe encounter KeyError: 'n_tokens_content', how to locate the bad rows efficiently?

I am trying to explore this dataset with pandas 0.20.3 in Python 3.6.2.

%pylab inline
import pandas as pd
df = pd.read_csv('OnlineNewsPopularity.csv')
df['n_tokens_content'][:9]

The last line produces this error:

KeyError                                  Traceback (most recent call last)
~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2441             try:
-> 2442                 return self._engine.get_loc(key)
   2443             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

KeyError: 'n_tokens_content'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
in ()
----> 1 df['n_tokens_content'][:9]

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965
   1966     def _getitem_column(self, key):

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   1969         # get column
   1970         if self.columns.is_unique:
-> 1971             return self._get_item_cache(key)
   1972
   1973         # duplicate columns & possible reduce dimensionality

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   1643         res = cache.get(item)
   1644         if res is None:
-> 1645             values = self._data.get(item)
   1646             res = self._box_item_values(item, values)
   1647             cache[item] = res

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   3588
   3589             if not isnull(item):
-> 3590                 loc = self.items.get_loc(item)
   3591             else:
   3592                 indexer = np.arange(len(self.items))[isnull(self.items)]

~/anaconda3/envs/tf11/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2442                 return self._engine.get_loc(key)
   2443             except KeyError:
-> 2444                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2445
   2446         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5280)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20523)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:20477)()

KeyError: 'n_tokens_content'

I think this is caused by some rows in the CSV file, as the same code works well for other CSV files.

If so, how can I locate the bad rows efficiently?

edited Mar 27 at 22:09

asked Mar 27 at 21:18

user11103981

What is your goal? What are you trying to achieve?

– Erfan
Mar 27 at 21:58

This error means there is no column called n_tokens_content in the dataframe you created. You'll have to examine the dataframe (e.g., run df.columns or df.head()) to see what your column names are.

– AlexK
Mar 27 at 22:01

python python-3.x pandas dataframe

1 Answer
Printing df.columns shows that the column name has a leading space: it is ' n_tokens_content', not 'n_tokens_content'.

Input: df.columns

Output:

Index(['url', ' timedelta', ' n_tokens_title', ' n_tokens_content',
 ' n_unique_tokens', ' n_non_stop_words', ' n_non_stop_unique_tokens',
 ' num_hrefs', ' num_self_hrefs', ' num_imgs', ' num_videos',
 ' average_token_length', ' num_keywords', ' data_channel_is_lifestyle',
 ' data_channel_is_entertainment', ' data_channel_is_bus',
 ' data_channel_is_socmed', ' data_channel_is_tech',
 ' data_channel_is_world', ' kw_min_min', ' kw_max_min', ' kw_avg_min',
 ' kw_min_max', ' kw_max_max', ' kw_avg_max', ' kw_min_avg',
 ' kw_max_avg', ' kw_avg_avg', ' self_reference_min_shares',
 ' self_reference_max_shares', ' self_reference_avg_sharess',
 ' weekday_is_monday', ' weekday_is_tuesday', ' weekday_is_wednesday',
 ' weekday_is_thursday', ' weekday_is_friday', ' weekday_is_saturday',
 ' weekday_is_sunday', ' is_weekend', ' LDA_00', ' LDA_01', ' LDA_02',
 ' LDA_03', ' LDA_04', ' global_subjectivity',
 ' global_sentiment_polarity', ' global_rate_positive_words',
 ' global_rate_negative_words', ' rate_positive_words',
 ' rate_negative_words', ' avg_positive_polarity',
 ' min_positive_polarity', ' max_positive_polarity',
 ' avg_negative_polarity', ' min_negative_polarity',
 ' max_negative_polarity', ' title_subjectivity',
 ' title_sentiment_polarity', ' abs_title_subjectivity',
 ' abs_title_sentiment_polarity', ' shares'],
 dtype='object')

Include the leading space when selecting the column: df[' n_tokens_content'][:9]

Output:

0     219
1     255
2     211
3     531
4    1072
5     370
6     960
7     989
8      97
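
Rather than typing the leading space each time, the column names can be cleaned once. The following is a minimal sketch, not part of the original answer: it uses a small in-memory CSV (a hypothetical stand-in for OnlineNewsPopularity.csv, with made-up URLs) whose header fields are separated by a comma followed by a space, which is what produces the padded names.

```python
import io

import pandas as pd

# Hypothetical stand-in for OnlineNewsPopularity.csv: fields are separated
# by ", ", so every header name after the first starts with a space.
csv_text = (
    "url, timedelta, n_tokens_content\n"
    "http://example.com/a, 731.0, 219\n"
    "http://example.com/b, 731.0, 255\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Diagnose: list the column names that carry leading or trailing whitespace.
padded = [name for name in df.columns if name != name.strip()]
print(padded)

# Option 1: strip whitespace from every column name after loading.
df.columns = df.columns.str.strip()
print(df['n_tokens_content'][:9])

# Option 2: have the parser skip spaces that follow each delimiter,
# which also applies to the header row.
df2 = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(list(df2.columns))
```

Either option lets df['n_tokens_content'] work as written in the question. The problem is in the header row, not in any data rows, so there are no bad rows to locate.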

answered Mar 27 at 22:19

SravanthiG
