The Python Oracle

Type Conversion in python AttributeError: 'str' object has no attribute 'astype'

--------------------------------------------------
Hire the world's top talent on demand or became one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game 2

--

Chapters
00:00 Type Conversion In Python Attributeerror: 'Str' Object Has No Attribute 'Astype'
00:55 Answer 1 Score 2
01:31 Accepted Answer Score 14
02:28 Answer 3 Score 1
02:37 Thank you

--

Full question
https://stackoverflow.com/questions/4191...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #typeconversion

#avk47



ACCEPTED ANSWER

Score 14


df['a'] returns a Series object that has astype as a vectorized way to convert all elements in the series into another one.

df['a'][1] returns the content of one cell of the dataframe, in this case the string '0.123'. This is now returning a str object that doesn't have this function. To convert it use regular python instruction:

type(df['a'][1])
Out[25]: str

float(df['a'][1])
Out[26]: 0.123

type(float(df['a'][1]))
Out[27]: float

As per your second question, the operator in that is at the end calling __contains__ against the series with '' as argument, here is the docstring of the operator:

help(pd.Series.__contains__)
Help on function __contains__ in module pandas.core.generic:

__contains__(self, key)
    True if the key is in the info axis

It means that the in operator is searching your empty string in the index, not the contents of it.

The way to search your empty strings is to use the equal operator:

df
Out[54]: 
    a
0  42
1    

'' in df
Out[55]: False

df==''
Out[56]: 
       a
0  False
1   True

df[df['a']=='']
Out[57]: 
  a
1  



ANSWER 2

Score 2


df['a'][1] will return the actual value inside the array, at the position 1, which is in fact a string. You can convert it by using float(df['a'][1]).

>>> df = pd.DataFrame({'a':['1.23', '0.123']})
>>> type(df['a'])
<class 'pandas.core.series.Series'>
>>> df['a'].astype(float)
0    1.230
1    0.123
Name: a, dtype: float64
>>> type(df['a'][1])
<type 'str'>

For the second question, maybe you have an empty value on your raw data. The correct test would be:

>>> df = pd.DataFrame({'a':['1', '']})
>>> '' in df['a'].values
True

Source for the second question: https://stackoverflow.com/a/21320011/5335508




ANSWER 3

Score 1


In addition to the solutions already posted you could also simply use:

df['a'].astype(float)[1]