-
Notifications
You must be signed in to change notification settings - Fork 3.6k
[feature](function) add approx_top_sum aggregation function #43643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
|
||
#pragma once | ||
|
||
#include <rapidjson/encodings.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: 'rapidjson/encodings.h' file not found [clang-diagnostic-error]
#include <rapidjson/encodings.h>
^
run buildall |
ca8d282
to
1ea36ee
Compare
run buildall |
1ea36ee
to
17ecc13
Compare
run buildall |
qt_sql """ select approx_top_k(clientip) from ${tableName}; """ | ||
qt_sql """ select approx_top_k(clientip, 10) from ${tableName}; """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I ask why deleting these tests in a feature PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, it is difficult to implement default values for the variable arguments in the function framework. It is hoped that users will directly specify the last two parameters.
17ecc13
to
23d7adc
Compare
run buildall |
@@ -43,7 +43,7 @@ class IDataType; | |||
|
|||
struct AggregateFunctionAttr { | |||
bool enable_decimal256 {false}; | |||
std::vector<std::pair<std::string, bool>> column_infos; | |||
std::vector<std::string> column_names; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this basic data structure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize the redundant data structures added in approx_top_k.
sub_writer.Key(_column_names[i].data(), _column_names[i].size()); | ||
sub_writer.String(row_str.data(), row_str.size()); | ||
} | ||
sub_writer.Key("count"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -55,9 +55,19 @@ public ApproxTopK(boolean distinct, boolean alwaysNullable, Expression... varArg | |||
|
|||
@Override | |||
public void checkLegalityBeforeTypeCoercion() { | |||
if (arity() < 1) { | |||
if (arity() < 3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change arguments num from 1 to 3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing function framework code cannot implement approx_top_k and approx_top_sum with default values.
23d7adc
to
cf48a1b
Compare
run buildall |
TeamCity be ut coverage result: |
cf48a1b
to
adf71e1
Compare
run buildall |
TeamCity be ut coverage result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…3643) Problem Summary: 1. The function approx_top_sum has been implemented. Here is an example of its usage: select approx_top_sum(c1, c2, c3, 10, 300) from tbl. ### Release note Add new function `approx_top_sum`.
…3643) Problem Summary: 1. The function approx_top_sum has been implemented. Here is an example of its usage: select approx_top_sum(c1, c2, c3, 10, 300) from tbl. Add new function `approx_top_sum`.
What problem does this PR solve?
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)