-
Notifications
You must be signed in to change notification settings - Fork 72
feat: Support Keyspace #2454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
feat: Support Keyspace #2454
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #2454 +/- ##
========================================
Coverage 79.96% 79.97%
========================================
Files 384 384
Lines 15935 15936 +1
Branches 8340 8341 +1
========================================
+ Hits 12743 12745 +2
+ Misses 1912 1910 -2
- Partials 1280 1281 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but let's wait for an approval from @godexsoft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not thoroughly reviewed the code because i don't yet understand why we need so many changes if keyspaces is supposed to be compatible with cassandra?
In general i'm not a fan of adding custom stuff to the cassandra backend.. if we need to have custom behaviour maybe we should create a separate backend and inherit where we can, modify where we need.
if (!range_) { | ||
executor_.writeSync(schema_->updateLedgerRange, ledgerSequence_, false, ledgerSequence_); | ||
executor_.writeSync(schema_->insertLedgerRange, false, ledgerSequence_); | ||
executor_.writeSync(schema_->insertLedgerRange, true, ledgerSequence_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this not needed before, do you know?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because before, the UpdateLedgerRange would insert the first range that it finds, let's say rippled started at ledger 256; this would insert 256, and the false
means it is not the latest ledger in our ledger_range
table.
Since the updateLedgerRange
can't be used in Keyspace, I simplified it to both keyspace and scylla using this insertLedgerRange schema, as I felt like it is easier to understand the logic behind it. (ie, first time clio loads up without a ledger range table, insert both the latest and non latest range into the table)
- **Required**: True | ||
- **Type**: string | ||
- **Default value**: `cassandra` | ||
- **Constraints**: The value must be one of the following: `cassandra`, `aws_keyspace`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May as well add scylladb
to the mix then.. if we are using the same backend for multiple DBs already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason we didn't add scylladb
is because scylladb
and cassandra
would be the same as it uses fully the same schemas
); | ||
}(); | ||
// AWS_keyspace supported queries | ||
} else if (settingsProvider_.get().getSettings().provider == "aws_keyspace") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please can you explain why we need so many changes and what exactly is not supported so we had to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explained in my other comment 👍
PreparedStatement updateLedgerRange = [this]() { | ||
return handle_.get().prepare( | ||
fmt::format( | ||
R"( | ||
UPDATE {} | ||
SET sequence = ? | ||
WHERE is_latest = ? | ||
IF sequence IN (?, null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was suggested by scylla team before IIRC.. this gave us a good speedup - let's not go back to the slower old version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can revert the logic so that scylladb uses this 👍
@godexsoft For some context, Keyspace is compatible with Cassandra but there's a few limitations. Specifically the places where we had:
To tackle this, I split the queries that contains the above into 2 queries. ie, 2 queries will be equivalent to the 1 from before. The goal of this PR is to get Clio to run with Keyspace while those running cassandra/scylladb would work the exact same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments to mostly make the code more readable and less error-prone
) const | ||
{ | ||
std::vector<ripple::uint256> nftIDs; | ||
if (settingsProvider_.getSettings().provider == "aws_keyspace") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's have 2 private methods, and here just call one or another under if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this marked as resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with @kuznetsss and @godexsoft , we are completely changing the structure of the backend, (ie, adding a cassandraFamily level where CassandraBackend and KeyspaceBackend is inheriting from) there's not going to be anymore checking provider
in backend, so I marked it as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, please, don't just "resolve" the issue, without any comment 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep it open until it's actually resolved, even if it's resolved not the way that reviewer (in this case me) suggested
@@ -347,6 +348,86 @@ class Schema { | |||
Statements(SettingsProviderType const& settingsProvider, Handle const& handle) | |||
: settingsProvider_{settingsProvider}, handle_{std::cref(handle)} | |||
{ | |||
// initialize scylladb supported queries | |||
if (settingsProvider_.get().getSettings().provider == "scylladb") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also have 2 private methods here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this marked as resolved?
// For ScyllaDB / Cassandra ONLY | ||
std::optional<PreparedStatement> selectAccountFromBeginningScylla; | ||
std::optional<PreparedStatement> selectAccountFromTokenScylla; | ||
std::optional<PreparedStatement> selectNFTsByIssuerScylla; | ||
|
||
// For AWS Keyspaces ONLY | ||
// NOTE: AWS keyspace is not able to load cache with accounts | ||
std::optional<PreparedStatement> selectNFTsAfterTaxonKeyspaces; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be std::variant<CassandraStatements, AWSKeyspacesStatements>
, and each one is a struct of needed statements, you will probably be able to get rid of std::optional as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this marked resolved?
@@ -88,6 +89,9 @@ struct Settings { | |||
/** @brief Size of batches when writing */ | |||
std::size_t writeBatchSize = kDEFAULT_BATCH_SIZE; | |||
|
|||
/** @brief Provider to know if we are using scylladb or keyspace */ | |||
std::string provider = kDEFAULT_PROVIDER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's create an enum for the provider.
I know that the config definition doesn't support the enumeration, but I think we shouldn't use raw strings and raw strings comparison in the code itself, because it's error-prone (see my comment above for incorrect kDEFAULT_PROVIDER
value)
Support AWS Keyspace queries