Django - Joining after filtering

Suppose there is a model and a view:

class Task(models.Model):
    title = models.CharField(...)
    message = models.TextField(...)
    performer = models.ForeignKey(User, on_delete=models.CASCADE, ...)
    orgunit = models.ForeignKey(OrgUnit, on_delete=models.CASCADE, ...)
    deviation = models.ForeignKey(Deviation, on_delete=models.CASCADE, related_name='tasks', ...)
    creator = models.ForeignKey(User, on_delete=models.SET_NULL, ...)
    for_control = models.ManyToManyField('self', ...)
    # other fields

class TasksViewSet(viewsets.ModelViewSet):
    serializer_class = TaskSerializer
    pagination_class = TasksPaginator
    filterset_class = TasksFilterSet

    def get_queryset(self):
        qs = Task.objects\
            .select_related(
                'performer__position__orgunit__conformation', 'deviation',
                'creator__position__orgunit__conformation', 'orgunit__conformation',
            ).prefetch_related('for_control')

        return qs

The fields in select_related and prefetch_related are there for later serialization.

As far as I know, the joins happen first and the filtering (the WHERE clause from TasksFilterSet) happens afterwards, right? If so, this can obviously hurt performance.

So I wondered whether it would be better to filter first and then join. Of course, some joins are needed for the filtering itself, but that is not the case here. I mean something like this:

class TasksViewSet(viewsets.ModelViewSet):
    serializer_class = TaskSerializer
    pagination_class = TasksPaginator
    filterset_class = TasksFilterSet

    def get_queryset(self):
        return Task.objects.only('pk')

    def list(self, request, *args, **kwargs):
        queryset = self.filter_queryset(self.get_queryset())
        page = self.paginate_queryset(queryset)

        if page is not None:
            page = Task.objects.filter(pk__in=[task.pk for task in page]).select_related(
                'performer__position__orgunit__conformation', 'deviation',
                'creator__position__orgunit__conformation', 'orgunit__conformation',
            )
            serializer = self.get_serializer(page, many=True)
            return self.get_paginated_response(serializer.data)

        serializer = self.get_serializer(queryset, many=True)
        return Response(serializer.data)

What do you think about this?



Solution 1:[1]

The order in which you call select_related(), prefetch_related() and filter() does not matter. The resulting queryset will be the same.
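One way to see why the call order cannot matter is a toy model of a lazy query builder. The sketch below is not Django's implementation; the `LazyQuery` class and its `as_sql()` method are hypothetical names made up for illustration. It only assumes what is true of querysets in general: each chained call records state on a copy, and the SQL is assembled later from that state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LazyQuery:
    """Toy stand-in for a queryset (hypothetical, not Django code)."""
    table: str
    joins: tuple = ()   # relations requested via select_related()
    wheres: tuple = ()  # conditions requested via filter()

    def select_related(self, *relations):
        # Return a modified copy; nothing is sent to a database here.
        return replace(self, joins=self.joins + relations)

    def filter(self, condition):
        # Same idea: only remember the condition for later.
        return replace(self, wheres=self.wheres + (condition,))

    def as_sql(self):
        # The statement is built only now, from the recorded state, so the
        # order of the earlier method calls cannot influence the result.
        sql = f"SELECT * FROM {self.table}"
        for rel in self.joins:
            sql += f" JOIN {rel} ON {self.table}.{rel}_id = {rel}.id"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        return sql


a = LazyQuery("entry").filter("pub_date > :now").select_related("blog")
b = LazyQuery("entry").select_related("blog").filter("pub_date > :now")
print(a.as_sql() == b.as_sql())
```

With a real project you can check the same thing directly by printing `queryset.query` for both orderings and comparing the generated SQL.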

From the docs for select_related:

The order of filter() and select_related() chaining isn’t important. These querysets are equivalent:

Entry.objects.filter(pub_date__gt=timezone.now()).select_related('blog')
Entry.objects.select_related('blog').filter(pub_date__gt=timezone.now())
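On the database side, the join and the WHERE clause arrive as one SQL statement, and the query planner decides how to execute it. The sketch below uses the standard-library sqlite3 module as a stand-in for whatever database the project runs on; the `task` and `performer` tables, their columns, and the index are simplified assumptions, not the schema from the question. It compares a single join-plus-filter statement with a hand-written "filter first, then join" variant and prints the plan the database chose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE performer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        title TEXT,
        performer_id INTEGER REFERENCES performer(id)
    );
    CREATE INDEX task_title_idx ON task (title);
""")
conn.executemany(
    "INSERT INTO performer VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?)",
    [(i, f"title{i % 10}", i % 100 + 1) for i in range(1, 1001)],
)

# One statement containing both the join and the filter.
join_and_filter = """
    SELECT task.id, performer.name
    FROM task JOIN performer ON performer.id = task.performer_id
    WHERE task.title = ?
    ORDER BY task.id
"""
# Hand-written "filter first, then join" variant.
filter_then_join = """
    SELECT t.id, performer.name
    FROM (SELECT * FROM task WHERE title = ?) AS t
    JOIN performer ON performer.id = t.performer_id
    ORDER BY t.id
"""

rows_a = conn.execute(join_and_filter, ("title3",)).fetchall()
rows_b = conn.execute(filter_then_join, ("title3",)).fetchall()
print(rows_a == rows_b, len(rows_a))

# The plan shows how the database chose to execute the single statement.
for row in conn.execute("EXPLAIN QUERY PLAN " + join_and_filter, ("title3",)):
    print(row)
```

Both forms return the same rows. Plans differ between databases and versions, so it is worth running your database's own EXPLAIN on the SQL Django generates before restructuring the view by hand.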

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Iain Shelvington