INTPYTHON-751 Make query generation omit $expr unless required #396
Substr.as_mql = substr
Trim.as_mql = trim("trim")
TruncBase.as_mql = trunc
Cast.as_mql_expr = cast
None of these functions support `as_mql_path`. It could be added later if we try to simplify constant expressions.
    return value


def base_expression(self, compiler, connection, as_path=False, **extra):
This is the common handler for all expressions. It decides whether an `expr` is needed or not.
KeyTransformExact.as_mql_expr = key_transform_exact_expr
KeyTransformExact.as_mql_path = key_transform_exact_path
Please check alphabetization of the classes and functions (not only in this file).
}


def range_match(a, b):
    ## TODO: MAKE A TEST TO TEST WHEN BOTH ENDS ARE NONE. WHAT SHALL I RETURN?
AI says, "If either the start or end value provided to the BETWEEN operator is NULL, the entire BETWEEN condition will typically evaluate to UNKNOWN (and thus FALSE in a WHERE clause), unless explicitly handled." (I confirmed this for SQLite and PostgreSQL)
However, to match the semantics implemented here, where None is treated as the min/max date, I would expect `__range=[None, None]` not to filter out any values.
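The semantics discussed in this comment, where a `None` bound is treated as open-ended, could be sketched roughly as follows. This is only an illustration, not the PR's code: `range_match_sketch` and its `(field, bounds)` signature are hypothetical, and returning an empty match document for `[None, None]` is an assumption that follows the expectation stated above.

```python
def range_match_sketch(field, bounds):
    """Build a match-style condition for ``field__range=bounds``.

    Hypothetical helper: a None bound is treated as open-ended.
    """
    start, end = bounds
    conditions = {}
    if start is not None:
        conditions["$gte"] = start
    if end is not None:
        conditions["$lte"] = end
    # With both ends None nothing is constrained, so return the empty
    # document, which matches every document when used in a $match.
    return {field: conditions} if conditions else {}
```

Under these assumptions, `range_match_sketch("date", (None, None))` yields `{}`, so no rows are filtered out.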
    connection,
    operator=None,
    resolve_inner_expression=False,
    **extra_context,  # noqa: ARG001
Is the removal of `extra_context` strictly related to this patch? (Mainly wondering, though perhaps it could be a separate trivial PR).
Yes. It was carried over from Django: I copied the `extra_context` argument, but then realized the code never uses it, so I removed it. I agree it could go in a separate PR.
Aggregate.as_mql_expr = aggregate
Count.as_mql_expr = count
StdDev.as_mql_expr = stddev_variance
Variance.as_mql_expr = stddev_variance
I see we have `as_mql_expr()`, `as_mql_path()`, and `as_mql(..., as_path=...)`. If this is the way we keep it, it would be good to explain in the design document which objects (aggregate, func, expression, etc.) get which.
I wonder about renaming `as_mql_expr()` or `as_mql_path()` to `as_mql()` (i.e. treating one of the paths as the default). Do you think it would be more or less confusing?
Yes, that was the idea. I’ll explain it in the docs, and we might also consider renaming some methods. The core concept is:
- Every expression has an `as_mql` method.
- In some cases, it’s simpler to implement `as_mql` directly, so those methods don’t follow the common expression flow.
- For other expressions, `as_mql` is a composite function that delegates to `as_path` or `as_expr` when applied.
- The `base_expression.as_mql` method controls when these are called and performs boilerplate checks to prevent nesting an `expr` inside another `expr` (a MongoDB 6 restriction).
In short: every object has `as_mql`. Some also define `as_path` and `as_expr`. The `base_expression` coordinates how these methods are used, except for cases where `as_mql` is defined directly.
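The dispatch described in this comment might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the backend's implementation: `FieldRef`, `ExactSketch`, and the `can_use_path` property are hypothetical names, and the real `base_expression` takes a compiler and connection that are omitted here.

```python
class FieldRef:
    """Hypothetical marker for a reference to another document field."""

    def __init__(self, name):
        self.name = name


class ExactSketch:
    """Hypothetical 'exact' lookup with both translations defined."""

    def __init__(self, field, rhs):
        self.field = field
        self.rhs = rhs

    @property
    def can_use_path(self):
        # Assumption: a field-to-field comparison cannot be written as a
        # plain {field: value} match, so it needs the expression form.
        return not isinstance(self.rhs, FieldRef)

    def as_mql_path(self):
        # Index-friendly match form.
        return {self.field: self.rhs}

    def as_mql_expr(self):
        # Aggregation-expression form.
        rhs = f"${self.rhs.name}" if isinstance(self.rhs, FieldRef) else self.rhs
        return {"$eq": [f"${self.field}", rhs]}

    def as_mql(self, as_path=False):
        # Common handler: prefer the path form when the caller asks for a
        # match document and the lookup supports it.
        if as_path and self.can_use_path:
            return self.as_mql_path()
        expr = self.as_mql_expr()
        # Wrap in $expr only at the match level; callers that are already
        # inside an expression get the bare expression, so $expr never nests.
        return {"$expr": expr} if as_path else expr
```

For example, `ExactSketch("name", "Bob").as_mql(as_path=True)` gives a direct match, while `ExactSketch("a", FieldRef("b")).as_mql(as_path=True)` falls back to a single `$expr` wrapper.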
Doc here: link
django_mongodb_backend/base.py
Outdated
    return {"$or": [{a: {"$exists": False}}, {a: None}]}
    return {"$and": [{a: {"$exists": True}}, {a: {"$ne": None}}]}


mongo_operators_expr = {
`mongo_expr_operators` might be a more natural word order.
django_mongodb_backend/functions.py
Outdated
    lhs_mql = {"$convert": {"input": lhs_mql, "to": output_type}}
    if decimal_places := getattr(self.output_field, "decimal_places", None):
        lhs_mql = {"$trunc": [lhs_mql, decimal_places]}
revert
from django.db.models.sql.where import AND, OR, XOR, ExtraWhere, NothingNode, WhereNode
from pymongo.errors import BulkWriteError, DuplicateKeyError, PyMongoError

from .query_conversion.query_optimizer import convert_expr_to_match
You can delete all that code too. :-)


def regex_match(field, regex, insensitive=False):
    options = "i" if insensitive else ""
    # return {"$regexMatch": {"input": field, "regex": regex, "options": options}}
chop
def test_annotate(self):
    obj = Book.objects.create(
        author=Author(name="Shakespeare", age=55, address=Address(city="NYC", state="NY"))
    )
    book_from_ny = (
        Book.objects.annotate(city=F("author__address__city")).filter(city="NYC").first()
    )
    self.assertCountEqual(book_from_ny.city, obj.author.address.city)
If it passes in the current code, it could be added separately.
Hmm, maybe I should delete it. I created it to validate something, but the check is already covered by other tests.
Co-authored-by: Tim Graham <[email protected]>
    qs = Tour.objects.filter(exhibit__sections__number=1)
    self.assertCountEqual(qs, [self.egypt_tour, self.wonders_tour])

def test_foreign_field_exact_expr(self):
Do I have to add more tests like this? Just make a query and then check the generated SQL?
Yes, an assertion could be added to existing tests where possible.
Doc here
This PR implements a unified approach for generating MQL from Django expressions. The core idea is to centralize the control flow in a `base_expression` method, which decides whether the expression can be translated into a direct `field: value` match (index-friendly) or must fall back to `$expr`. This keeps the logic for wrapping and dispatching in one place, while each lookup/function only defines its own expression-building logic.

This approach also allows mixing direct `field: value` matches with `$expr` clauses within the same `$match`. As a result, multiple `$expr` entries may coexist alongside index-optimized conditions, depending on the shape of the query.

Most lookups now follow this pattern by simply implementing `as_mql_expr` (and optionally `as_mql_path` when a match-based translation is possible). Some special cases, such as `Col` and `Func` operators (except `KeyTransform`), among others, override the base behavior directly. This structure also leaves room for future optimizations (e.g. constant folding) without having to change the overall flow.

Additionally, since MongoDB 6 does not allow nesting `$expr` inside another `$expr`, the flow in `base_expression` ensures that such cases are flattened. In practice, expressions are generated without redundant wrapping, so the final MQL never contains `$expr` within `$expr`.

NOTE: Some polish is still pending, but the main idea and the majority of the code are already in place.
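To make the mixing and the no-nesting rule concrete, here is a small sketch of how one `$match` stage might combine both kinds of clauses. It assumes each child condition arrives tagged as either `"path"` or `"expr"`; `match_stage` and `has_nested_expr` are hypothetical helpers written for this illustration and are not part of the backend.

```python
def match_stage(children):
    """Combine compiled children into one $match stage.

    Each child is a (kind, mql) pair where kind is "path" for a direct
    {field: value} condition or "expr" for a bare aggregation expression.
    """
    clauses = []
    for kind, mql in children:
        if kind == "path":
            clauses.append(mql)  # index-friendly, used as-is
        else:
            clauses.append({"$expr": mql})  # wrapped exactly once
    return {"$match": {"$and": clauses}}


def has_nested_expr(mql, inside_expr=False):
    """Return True if an $expr key appears inside another $expr."""
    if isinstance(mql, dict):
        for key, value in mql.items():
            if key == "$expr":
                if inside_expr or has_nested_expr(value, inside_expr=True):
                    return True
            elif has_nested_expr(value, inside_expr):
                return True
        return False
    if isinstance(mql, (list, tuple)):
        return any(has_nested_expr(item, inside_expr) for item in mql)
    return False


stage = match_stage(
    [
        ("path", {"age": {"$gt": 30}}),
        ("expr", {"$eq": ["$first_name", "$last_name"]}),
    ]
)
```

Here `stage` holds one direct condition and one `$expr` clause side by side under `$and`, and `has_nested_expr(stage)` is `False` because each expression is wrapped only at the match level.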