Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
Full question
https://stackoverflow.com/questions/3692...
Accepted answer links:
[numpy.logical_or]: https://docs.scipy.org/doc/numpy/referen...
[numpy.logical_and]: https://docs.scipy.org/doc/numpy/referen...
[operator precedence]: https://docs.python.org/reference/expres...
[several logical numpy functions]: https://docs.scipy.org/doc/numpy/referen...
Answer 3 links:
[Boolean Indexing]: http://pandas.pydata.org/pandas-docs/sta...
Answer 4 links:
[Python docs]: https://docs.python.org/2/library/operat...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #dataframe #boolean #filtering
ACCEPTED ANSWER
Score 1137
The Python or and and operators need a single truth value for each operand. pandas considers the truth value of a Series ambiguous, so you should use the "bitwise" | (or) or & (and) operators instead:
df = df[(df['col'] < -0.25) | (df['col'] > 0.25)]
These are overloaded for these kinds of data structures to yield the element-wise or or and.
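As a minimal sketch of the element-wise behaviour described above (the column name col follows the answer; the sample values are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data; only the column name 'col' comes from the answer.
df = pd.DataFrame({'col': [-0.5, -0.1, 0.0, 0.3, 0.9]})

# Each comparison yields a Boolean Series; | combines them row by row.
mask = (df['col'] < -0.25) | (df['col'] > 0.25)
filtered = df[mask]

print(mask.tolist())
print(filtered['col'].tolist())
```

The mask has one Boolean per row, which is exactly what indexing with df[...] expects.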
Just to add some more explanation to this statement:
The exception is thrown when you want to get the bool of a pandas.Series:
>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
You hit a place where the operator implicitly converted the operands to bool (you used or but it also happens for and, if and while):
>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Besides these four statements, there are several Python functions that hide some bool calls (like any, all, filter, ...). These are normally not problematic with pandas.Series, but for completeness I wanted to mention these.
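A small sketch of why those built-ins are usually harmless: they iterate over the Series and call bool on each scalar element, not on the Series as a whole. The values are made up for illustration:

```python
import pandas as pd

x = pd.Series([0, 1, 2])  # illustrative values

# Built-in any/all/filter iterate over the Series and call bool() on each
# scalar element, so no Series-level truth value is ever requested.
builtin_any = any(x)
builtin_all = all(x)
non_zero = list(filter(None, x))

# Calling bool() on the Series itself is what raises.
try:
    bool(x)
    raised = False
except ValueError:
    raised = True

print(builtin_any, builtin_all, non_zero, raised)
```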
In your case, the exception isn't really helpful, because it doesn't mention the right alternatives. For and and or, if you want element-wise comparisons, you can use:
- numpy.logical_or:
  >>> import numpy as np
  >>> np.logical_or(x, y)
  or simply the | operator:
  >>> x | y
- numpy.logical_and:
  >>> np.logical_and(x, y)
  or simply the & operator:
  >>> x & y
If you're using the operators, then be sure to set your parentheses correctly because of operator precedence.
There are several logical NumPy functions which should work on pandas.Series.
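To illustrate, here is a short sketch applying a few of NumPy's logical functions to Boolean Series. The Series x and y are illustrative, since the answer does not define y:

```python
import numpy as np
import pandas as pd

# Illustrative Boolean Series; these values are assumptions for the example.
x = pd.Series([True, True, False, False])
y = pd.Series([True, False, True, False])

either = np.logical_or(x, y)        # element-wise, like x | y
both = np.logical_and(x, y)         # element-wise, like x & y
exactly_one = np.logical_xor(x, y)  # element-wise, like x ^ y
not_x = np.logical_not(x)           # element-wise, like ~x

print(list(either), list(both), list(exactly_one), list(not_x))
```

For Boolean Series like these, the function forms and the operators should give the same element-wise result; the functions mainly avoid the parenthesization pitfalls of the operators.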
The alternatives mentioned in the exception are more suitable if you encountered it in an if or while statement. I'll briefly explain each of them:
- If you want to check whether your Series is empty:
  >>> x = pd.Series([])
  >>> x.empty
  True
  >>> x = pd.Series([1])
  >>> x.empty
  False
  Python normally interprets the length of containers (like list, tuple, ...) as their truth value if they have no explicit Boolean interpretation. So if you want the Python-like check, you could write if x.size or if not x.empty instead of if x.
- If your Series contains one and only one Boolean value:
  >>> x = pd.Series([100])
  >>> (x > 50).bool()
  True
  >>> (x < 50).bool()
  False
- If you want to check the first and only item of your Series (like .bool(), but it works even for non-Boolean contents):
  >>> x = pd.Series([100])
  >>> x.item()
  100
- If you want to check whether all items or any item is not-zero, not-empty or not-False:
  >>> x = pd.Series([0, 1, 2])
  >>> x.all()   # because one element is zero
  False
  >>> x.any()   # because one (or more) elements are non-zero
  True
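The checks above can be combined into explicit conditions wherever you would otherwise write if x. The following is only a sketch: describe is a hypothetical helper, not part of pandas, and it leaves out .bool(), which I believe is deprecated in recent pandas releases:

```python
import pandas as pd

def describe(s):
    """Hypothetical helper: label a Series using explicit checks, not `if s:`."""
    if s.empty:
        return 'empty'
    if s.size == 1:
        # item() returns the single element as a Python scalar
        return f'single value: {s.item()}'
    if s.all():
        return 'all truthy'
    if s.any():
        return 'some truthy'
    return 'none truthy'

print(describe(pd.Series([], dtype='float64')))
print(describe(pd.Series([100])))
print(describe(pd.Series([0, 1, 2])))
```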
ANSWER 2
Score 132
Pandas uses bitwise & |. Also, each condition should be wrapped inside ( ).
This works:
data_query = data[(data['year'] >= 2005) & (data['year'] <= 2010)]
But the same query without parentheses around each condition does not, because & binds more tightly than >= and <=:
data_query = data[(data['year'] >= 2005 & data['year'] <= 2010)]
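A small sketch of why the second form fails, using made-up data (only the column name year comes from the answer):

```python
import pandas as pd

# Illustrative data; the values are assumptions for this example.
data = pd.DataFrame({'year': [2000, 2005, 2008, 2010, 2015]})

# With parentheses, each comparison is evaluated first, then combined with &.
ok = data[(data['year'] >= 2005) & (data['year'] <= 2010)]

# Without them, Python parses the expression as the chained comparison
#   data['year'] >= (2005 & data['year']) <= 2010
# and a chained comparison uses an implicit `and`, which calls bool() on a Series.
try:
    data[data['year'] >= 2005 & data['year'] <= 2010]
    error = None
except ValueError as exc:
    error = str(exc)

print(ok['year'].tolist())
print(error)
```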
ANSWER 3
Score 64
For Boolean logic, use & and |.
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))
>>> df
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
>>> df.loc[(df.C > 0.25) | (df.C < -0.25)]
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
To see what is happening, you get a column of Booleans for each comparison, e.g.,
df.C > 0.25
0     True
1    False
2    False
3     True
4     True
Name: C, dtype: bool
With multiple criteria, you get one Boolean column per comparison, which is why combining them with and or or is ambiguous. Those operators need a single truth value per operand, so you would first have to reduce each column to a single Boolean value, for example by checking whether any value or all values in each column are True.
# Any value in either column is True?
(df.C > 0.25).any() or (df.C < -0.25).any()
True
# Are all values in either column True?
(df.C > 0.25).all() or (df.C < -0.25).all()
False
One convoluted way to achieve the same thing is to zip all of these columns together, and perform the appropriate logic.
>>> df[[any([a, b]) for a, b in zip(df.C > 0.25, df.C < -0.25)]]
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
For more details, refer to Boolean Indexing in the documentation.
ANSWER 4
Score 15
Or, alternatively, you could use the operator module. More detailed information is in the Python documentation:
import operator
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.randn(5,3), columns=list('ABC'))
df.loc[operator.or_(df.C > 0.25, df.C < -0.25)]
          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863
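One place where the function form may be handy is folding a list of conditions into a single mask. This is only a sketch with made-up data; the list conditions and the use of functools.reduce are illustrative choices, not something the answer itself proposes:

```python
import operator
from functools import reduce

import pandas as pd

# Illustrative data, not the random frame from the answer.
df = pd.DataFrame({'A': [1, 2, 3, 4], 'C': [-0.5, 0.1, 0.3, 0.9]})

# operator.or_(a, b) is the function form of a | b, so it is element-wise too.
either = df.loc[operator.or_(df.C > 0.25, df.C < -0.25)]

# Fold several conditions into one mask with operator.and_.
conditions = [df.C > 0.25, df.A > 1, df.A < 4]
all_of = df.loc[reduce(operator.and_, conditions)]

print(either['A'].tolist())
print(all_of['A'].tolist())
```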