The Python Oracle

Data prints, but does not write to dataframe

--

Full question
https://stackoverflow.com/questions/3501...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #csv #pandas #confusionmatrix

ACCEPTED ANSWER

Score 4


The problem here is that you can't add a row to an empty df by simply assigning a scalar value to a new column:

In [55]:
import pandas as pd
stats = pd.DataFrame()
stats['TruePositive'] = 210483
stats

Out[55]:
Empty DataFrame
Columns: [TruePositive]
Index: []

You'll need to pass the desired values to the DataFrame constructor, each wrapped in a list so that it forms one row:

In [62]:
TP = 210483
FP = 153902
FN = 32845
TN = 10788
stats = pd.DataFrame({'TruePositive':[TP], 'TrueNegative':[TN], 'FalsePositive':[FP], 'FalseNegative':[FN]})
stats

Out[62]:
   FalseNegative  FalsePositive  TrueNegative  TruePositive
0          32845         153902         10788        210483
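If wrapping every value in a list feels noisy, the constructor also accepts plain scalars as long as an index is supplied. This is a minimal sketch of that variant, not part of the original answer; the index label 0 is an arbitrary choice.

```python
import pandas as pd

TP = 210483
FP = 153902
FN = 32845
TN = 10788

# With all-scalar values pandas needs an explicit index to know how many
# rows to create; index=[0] requests a single row labelled 0.
stats = pd.DataFrame(
    {'TruePositive': TP, 'TrueNegative': TN,
     'FalsePositive': FP, 'FalseNegative': FN},
    index=[0],
)
print(stats)
```

Omitting the index here raises a ValueError, because pandas cannot infer a row count from scalars alone.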

Or add a dummy row first, and then your code will work:

In [71]:
stats = pd.DataFrame()
# DataFrame.append was removed in pandas 2.0; pd.concat adds the dummy row instead
stats = pd.concat([stats, pd.DataFrame(['dummy'])], ignore_index=True)
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
stats

Out[71]:
       0  TruePositive  TrueNegative  FalsePositive  FalseNegative
0  dummy        210483         10788         153902          32845

You can then drop the dummy column by calling drop:

In [72]:
stats.drop(0, axis=1)

Out[72]:
   TruePositive  TrueNegative  FalsePositive  FalseNegative
0        210483         10788         153902          32845
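Note that drop returns a new DataFrame and leaves the original untouched, so the result has to be assigned to keep it. A small sketch, using a cut-down frame with one real column for brevity (the name `cleaned` is just illustrative):

```python
import pandas as pd

# A one-row frame with a dummy column labelled 0, as in the example above.
stats = pd.DataFrame({0: ['dummy'], 'TruePositive': [210483]})

# drop does not modify stats in place; keep the returned frame.
cleaned = stats.drop(0, axis=1)

print(list(stats.columns))
print(list(cleaned.columns))
```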

Your attempt failed because your initial df was empty. Assigning a scalar to a new column sets that value for every existing row of the column. As your df has no rows, there is nothing to fill: the column is created, but the df stays empty.
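The row-count dependence is easy to see by comparing a frame that already has rows with one that has none. This is an illustrative sketch; the names `three_rows` and `no_rows` are made up for the comparison.

```python
import pandas as pd

# A frame with three rows but no columns: the scalar is copied into each row.
three_rows = pd.DataFrame(index=range(3))
three_rows['TruePositive'] = 210483

# A frame with no rows: the column is created, but there is nothing to fill.
no_rows = pd.DataFrame()
no_rows['TruePositive'] = 210483

print(len(three_rows), len(no_rows))
```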

Another way would be to create the df with a single row (here I put NaN):

In [77]:
import numpy as np
# np.nan is the supported spelling; the np.NaN alias was removed in NumPy 2.0
stats = pd.DataFrame([np.nan])
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
stats.dropna(axis=1)

Out[77]:
   TruePositive  TrueNegative  FalsePositive  FalseNegative
0        210483         10788         153902          32845