-
Notifications
You must be signed in to change notification settings - Fork 6k
lignthing/importinto: parallel reading files infos from data store #59382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @joechenrh. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #59382 +/- ##
================================================
+ Coverage 73.1105% 75.8478% +2.7373%
================================================
Files 1711 1760 +49
Lines 473248 492946 +19698
================================================
+ Hits 345994 373889 +27895
+ Misses 105966 96595 -9371
- Partials 21288 22462 +1174
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
/cc @kennytm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7/9
pkg/executor/importer/import.go
Outdated
} | ||
|
||
var err error | ||
if dataFiles, err = mydump.ParallelProcess(ctx, allFiles, e.ThreadCnt, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if dataFiles, err = mydump.ParallelProcess(ctx, allFiles, e.ThreadCnt, | |
if dataFiles, err = mydump.ParallelProcess(ctx, allFiles, e.ThreadCnt*2, |
just like lightning does, we can double it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6/11
@@ -142,6 +144,10 @@ type MDLoaderSetupConfig struct { | |||
// MaxScanFiles specifies the maximum number of files to scan. | |||
// If the value is <= 0, it means the number of data source files will be scanned as many as possible. | |||
MaxScanFiles int | |||
|
|||
// ScanFileConcurrency specifes the concurrency of scaning source files. | |||
ScanFileConcurrency int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just move it to LoaderConfig, it's always needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's used for mdLoaderSetup, so I only add it in MDLoaderSetupConfig
and modify it using WithScanFileConcurrency
, just like WithMaxScanFiles
.
/retest |
@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/retest |
@joechenrh: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/ok-to-test |
/retest |
2 similar comments
/retest |
/retest |
/cherry-pick release-8.5 |
Signed-off-by: ti-chi-bot <[email protected]>
@joechenrh: new pull request created to branch In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
In response to a cherrypick label: failed to apply #59382 on top of branch "release-8.5":
|
Signed-off-by: ti-chi-bot <[email protected]>
In response to a cherrypick label: new pull request created to branch |
Signed-off-by: ti-chi-bot <[email protected]>
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: close #56104, close #60224
Problem Summary:
What changed and how does it work?
Constructing file infos in parallel for both lightning and
IMPORT INTO
.RegionConcurrency * 2
.IMPORT INTO
, the concurrency is the task'sThreadCnt * 2
.Check List
Tests
The following results are tested by lightning with 16 core (so concurrency is 32). Data is stored on S3.
IMPORT INTO
should have the similar result.Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.