How to filter objects for count annotation in Django?
Full question
https://stackoverflow.com/questions/3075...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
ACCEPTED ANSWER
Score 228
Conditional aggregation in Django 2.0+ allows you to further reduce the amount of faff this has been in the past. This will also use Postgres' filter logic, which is somewhat faster than a sum-case (I've seen numbers like 20-30% bandied around).
Anyway, in your case, we're looking at something as simple as:
from django.db.models import Q, Count
events = Event.objects.annotate(
    paid_participants=Count('participant', filter=Q(participant__is_paid=True))
)
There's a separate section in the docs about filtering on annotations. It covers the same ground as conditional aggregation, but closer to my example above. Either way, this is a lot healthier than the gnarly subqueries I was doing before.
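As a rough illustration of what the filtered Count computes, here is a sketch in plain SQL run through Python's sqlite3 module. The table names, column names and sample rows are assumptions made up for the demo; this is not the exact SQL Django generates. It uses the FILTER clause where the local SQLite supports it and a CASE expression otherwise, which gives the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data standing in for the Event/Participant models.
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'event1'), (2, 'event2'), (3, 'event3');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 0), (1, 1), (2, 0);
""")

# The aggregate FILTER clause needs SQLite 3.30+; fall back to CASE on older builds.
if sqlite3.sqlite_version_info >= (3, 30, 0):
    paid_count = "COUNT(p.id) FILTER (WHERE p.is_paid = 1)"
else:
    paid_count = "COUNT(CASE WHEN p.is_paid = 1 THEN p.id END)"

rows = conn.execute(f"""
    SELECT e.id, {paid_count} AS paid_participants
    FROM event AS e
    LEFT JOIN participant AS p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(rows)  # [(1, 2), (2, 0), (3, 0)]
```

Note that events 2 and 3 stay in the result with a count of 0, because the condition sits inside the aggregate rather than in a WHERE clause that would drop those rows.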
For more complex annotation filters, it may be more understandable to structure the filter as a separate queryset that is passed to an __in expression:
from django.db.models import Q, Count
paid_participants = Participant.objects.filter(is_paid=True)
events = Event.objects.annotate(
    paid_participants=Count(
        'participant', 
        filter=Q(participant__in=paid_participants)
    )
)
In this context, the Participant queryset does not need to explicitly filter on the outer Event ID the way a subquery would, because Count (and other aggregation functions) implicitly restrict themselves to the rows reached through the reverse foreign key of the current Event. In other words, with this format you specify your filters globally on the model that is being counted.
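The point about the inner queryset not needing a reference to the outer row can be sketched in SQL via sqlite3. Everything here (schema, data, the exact query shape) is an assumption for illustration and only approximates what the ORM would emit: the IN subquery selects paid participants globally, while the join and GROUP BY keep the count per event.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data standing in for the Event/Participant models.
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'event1'), (2, 'event2'), (3, 'event3');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 0), (1, 1), (2, 1), (2, 0);
""")

rows = conn.execute("""
    SELECT e.id,
           COUNT(CASE
                     -- The inner query is not correlated with the outer event row.
                     WHEN p.id IN (SELECT id FROM participant WHERE is_paid = 1)
                     THEN p.id
                 END) AS paid_participants
    FROM event AS e
    LEFT JOIN participant AS p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(rows)  # [(1, 2), (2, 1), (3, 0)]
```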
ANSWER 2
Score 97
Just discovered that Django 1.8 has a new conditional expressions feature, so now we can do this:
from django.db import models
# Sum 1 for every paid participant and 0 otherwise, per event.
events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0, output_field=models.IntegerField()
    )))
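For comparison, the sum-case idea looks roughly like this in raw SQL, again run through sqlite3 with a made-up schema and made-up rows (assumptions for the demo, not Django's real table names or generated query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data standing in for the Event/Participant models.
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'event1'), (2, 'event2'), (3, 'event3');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 0), (1, 1), (2, 0);
""")

# Each joined row contributes 1 if paid, else 0; the sum is the paid count.
rows = conn.execute("""
    SELECT e.id,
           SUM(CASE WHEN p.is_paid = 1 THEN 1 ELSE 0 END) AS paid_participants
    FROM event AS e
    LEFT JOIN participant AS p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(rows)  # [(1, 2), (2, 0), (3, 0)]
```

The ELSE 0 branch plays the role of default=0: an event with no participants still produces one joined row, which contributes 0 instead of NULL.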
ANSWER 3
Score 52
UPDATE
The sub-query approach which I mention is now supported in Django 1.11 via subquery-expressions.
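from django.db import models
from django.db.models import Count, OuterRef, Subquery
Event.objects.annotate(
    num_paid_participants=Subquery(
        # Correlated subquery: paid participants of the outer event.
        Participant.objects.filter(
            is_paid=True,
            event=OuterRef('pk')
        ).values('event')
        .annotate(cnt=Count('pk'))
        .values('cnt'),
        output_field=models.IntegerField()
    )
)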
I prefer this over aggregation (sum+case), because it should be faster and easier to optimize (with proper indexing).
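One thing worth checking with the grouped subquery is what happens for events without any paid participants. The sqlite3 sketch below uses a hypothetical schema and sample rows (assumptions, not the ORM's actual output) to show that a grouped, correlated subquery yields NULL rather than 0 in that case, and that COALESCE turns it into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data standing in for the Event/Participant models.
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER
    );
    INSERT INTO event (id, title) VALUES (1, 'event1'), (2, 'event2'), (3, 'event3');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 0), (1, 1), (2, 0);
""")

# Grouped, correlated subquery: returns no row (hence NULL) when nothing matches.
subquery = """
    SELECT COUNT(p.id)
    FROM participant AS p
    WHERE p.is_paid = 1 AND p.event_id = e.id
    GROUP BY p.event_id
"""

rows = conn.execute(f"""
    SELECT e.id,
           ({subquery}) AS raw_count,
           COALESCE(({subquery}), 0) AS safe_count
    FROM event AS e
    ORDER BY e.id
""").fetchall()

print(rows)  # [(1, 2, 2), (2, None, 0), (3, None, 0)]
```

On the ORM side, one way to get the same effect would be to wrap the Subquery in Coalesce(..., Value(0)) from django.db.models.functions.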
For older versions, the same can be achieved using .extra:
Event.objects.extra(select={'num_paid_participants': "\
    SELECT COUNT(*) \
    FROM `myapp_participant` \
    WHERE `myapp_participant`.`is_paid` = 1 AND \
            `myapp_participant`.`event_id` = `myapp_event`.`id`"
})
ANSWER 4
Score 6
I would suggest using the .values method of your Participant queryset instead.
In short, what you want to do is given by:
Participant.objects\
    .filter(is_paid=True)\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
A complete example is as follows:
Create 2 Events:
event1 = Event.objects.create(title='event1')
event2 = Event.objects.create(title='event2')
Add Participants to them:
part1l = [Participant.objects.create(event=event1, is_paid=((_%2) == 0))\
          for _ in range(10)]
part2l = [Participant.objects.create(event=event2, is_paid=((_%2) == 0))\
          for _ in range(50)]
Group all Participants by their event field:
Participant.objects.values('event')
> <QuerySet [{'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, '...(remaining elements truncated)...']>
Here distinct is needed:
Participant.objects.values('event').distinct()
> <QuerySet [{'event': 1}, {'event': 2}]>
What .values and .distinct are doing here is creating two buckets of Participants grouped by their event field. Note that those buckets contain Participants.
You can then annotate those buckets, as they contain the set of original Participants. Here we want to count the number of Participants, which is simply done by counting the ids of the elements in those buckets (since those are Participants):
Participant.objects\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
> <QuerySet [{'event': 1, 'id__count': 10}, {'event': 2, 'id__count': 50}]>
Finally, you want only Participants with is_paid being True, so you may just add a filter in front of the previous expression, and this yields the expression shown above:
Participant.objects\
    .filter(is_paid=True)\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
> <QuerySet [{'event': 1, 'id__count': 5}, {'event': 2, 'id__count': 25}]>
The only drawback is that you have to retrieve the Event afterwards as you only have the id from the method above.
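The bucket-and-count behaviour, and the extra step of mapping event ids back to events, can be mimicked in plain Python. This is only an analogy with hypothetical stand-in data (tuples and a dict instead of model instances), using the same 10 and 50 participants as the example above; it does not touch the ORM.

```python
from collections import Counter

# Hypothetical stand-ins for Participant rows: (event_id, is_paid),
# with every other participant paid, as in the example above.
participants = (
    [(1, i % 2 == 0) for i in range(10)]
    + [(2, i % 2 == 0) for i in range(50)]
)

# Filter first, then bucket by event id and count: the analogue of
# .filter(is_paid=True).values('event').annotate(Count('id')).
paid_per_event = Counter(
    event_id for event_id, is_paid in participants if is_paid
)

# The result only holds event ids, so the events are looked up afterwards.
events = {1: "event1", 2: "event2"}  # hypothetical id -> title mapping
report = {events[event_id]: n for event_id, n in paid_per_event.items()}

print(dict(paid_per_event))  # {1: 5, 2: 25}
print(report)                # {'event1': 5, 'event2': 25}
```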