bobby-richard
Contributor

Problem: I need to capture data from multiple tenant databases in the same mongodb cluster

The connector already allowed you to capture data from multiple databases/collections, but this was not very useful without the ability to filter. The copy.existing.namespace.regex setting already exists in the kafka source connector and does exactly this. So this PR just exposes the existing setting to the flink connector.
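As a rough illustration of the kind of filtering this setting enables, the sketch below applies a namespace regex to a list of `database.collection` names. The tenant database names and the pattern are made up for the example, and full-string matching is an assumption here, not a statement about how the connector evaluates the regex.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class NamespaceFilterSketch {
    // Hypothetical pattern: any database whose name starts with "tenant_",
    // restricted to the "orders" collection.
    static final Pattern TENANT_ORDERS = Pattern.compile("tenant_[^.]+\\.orders");

    // Keeps only the namespaces ("database.collection") that the pattern matches in full.
    static List<String> filter(List<String> namespaces, Pattern pattern) {
        return namespaces.stream()
                .filter(ns -> pattern.matcher(ns).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> all = List.of(
                "tenant_a.orders", "tenant_b.orders", "tenant_a.audit", "admin.orders");
        // prints [tenant_a.orders, tenant_b.orders]
        System.out.println(filter(all, TENANT_ORDERS));
    }
}
```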

@leonardBang
Contributor

Thanks @bobby-richard for the contribution. @Jiabao-Sun, could you help review this PR?

@Jiabao-Sun
Contributor

@leonardBang OK

@Jiabao-Sun
Contributor

Thanks @bobby-richard, it's a nice improvement. This setting was not exposed before because working with multiple databases and collections is complicated for users.

The copy.existing.namespace.regex setting provides a regular expression that matches specific collections by their namespace, but on its own it does not subscribe to changes in those collections.

If we want to improve support for multiple databases and collections, I think we should also expose the pipeline configuration for MongoDBTableSource, following the multiple-source example.

https://github.com/ververica/flink-cdc-connectors/blob/e19b78691ece0af61db42800fb97be64242a77e9/flink-connector-mongodb-cdc/src/main/java/com/ververica/cdc/connectors/mongodb/MongoDBSource.java#L181-L188
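To make the pipeline suggestion concrete, here is a minimal sketch that builds a change stream pipeline as a JSON string, filtering events on the `ns.coll` field of the change event. The class and method names are hypothetical, the collection names are placeholders, and the sketch assumes names made only of letters, digits, and underscores so that no regex or JSON escaping is needed.

```java
import java.util.List;

public class PipelineSketch {
    // Builds a change stream pipeline (as JSON) that keeps only events whose
    // collection name is one of the given names.
    static String collectionFilterPipeline(List<String> collections) {
        if (collections.isEmpty()) {
            throw new IllegalArgumentException("at least one collection is required");
        }
        for (String name : collections) {
            // Assumption for this sketch: plain names only, so nothing needs escaping.
            if (!name.matches("[A-Za-z0-9_]+")) {
                throw new IllegalArgumentException("unsupported collection name: " + name);
            }
        }
        String alternatives = String.join("|", collections);
        return "[{\"$match\": {\"ns.coll\": {\"$regex\": \"^(" + alternatives + ")$\"}}}]";
    }

    public static void main(String[] args) {
        // prints [{"$match": {"ns.coll": {"$regex": "^(collection1|collection2)$"}}}]
        System.out.println(collectionFilterPipeline(List.of("collection1", "collection2")));
    }
}
```

A pipeline like this restricts which change events are delivered, while copy.existing.namespace.regex restricts which namespaces are matched by the regex, so the two would typically be configured together.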

@bobby-richard
Contributor Author

Sounds good @Jiabao-Sun, I will expose pipeline as well.

@bobby-richard changed the title from "Add support for copy.existing.namespace.regex config to mongo-cdc" to "Add support for copy.existing.namespace.regex and pipeline config to mongo-cdc" on Oct 11, 2021

@bobby-richard
Contributor Author

@Jiabao-Sun Exposed pipeline configuration as well. Nice catch!

@Jiabao-Sun
Contributor

Thanks @bobby-richard, it LGTM.
Hi @leonardBang, do you have other suggestions?

@bobby-richard
Contributor Author

@leonardBang @Jiabao-Sun any updates?

@Jiabao-Sun
Contributor

> @leonardBang @Jiabao-Sun any updates?

Thanks @bobby-richard, it's a good enhancement that lets us subscribe to and filter multiple databases and collections.

@wuchong @leonardBang Do you have any suggestions?

@@ -206,6 +213,13 @@ Connector Options
only documents in which the closed field is set to false are copied.
</td>
</tr>
<tr>
<td>copy.existing.namespace.regex</td>
Contributor


Thanks @Jiabao-Sun for the contribution. Could we use the databaselist and collectionList filters and then translate them to the underlying copy.existing.namespace.regex here, for the sake of unification? The other parts look good to me.
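One way the suggested translation could look is sketched below: a list of database names and a list of collection names are combined into a single namespace regex. The helper names are hypothetical and this is not taken from the connector's code. Names are quoted with `Pattern.quote`, and the result is only checked against `java.util.regex` here.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class NamespaceRegexSketch {
    // Joins literal names into a regex alternation, quoting each one so that
    // characters such as '.' or '$' inside a name are matched literally.
    static String alternation(List<String> names) {
        return names.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "(", ")"));
    }

    // Builds a regex for "database.collection" namespaces from the two lists.
    static String toNamespaceRegex(List<String> databases, List<String> collections) {
        return alternation(databases) + "\\." + alternation(collections);
    }

    public static void main(String[] args) {
        String regex = toNamespaceRegex(List.of("tenant_a", "tenant_b"), List.of("orders"));
        System.out.println(Pattern.matches(regex, "tenant_a.orders")); // prints true
        System.out.println(Pattern.matches(regex, "tenant_c.orders")); // prints false
    }
}
```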

@Jiabao-Sun
Contributor

closed #940

@Jiabao-Sun closed this on Mar 17, 2022

3 participants