Skip to content

Conversation

dybyte
Copy link
Contributor

@dybyte dybyte commented Sep 3, 2025

Purpose of this pull request

I observed that the testRestoreTaskWhenBinlogDelete test in MysqlCDCWithBinlogDeleteIT occasionally fails in CI.

ERROR org.apache.seatunnel.engine.server.rest.filter.ExceptionHandlingFilter - Error occurred while processing request
2025-09-03T06:28:36.6933917Z java.lang.NullPointerException: null
2025-09-03T06:28:36.6934361Z 	at org.apache.seatunnel.engine.server.rest.service.BaseService.getJobStartTime(BaseService.java:213) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6934783Z 	at org.apache.seatunnel.engine.server.rest.service.BaseService.convertToJson(BaseService.java:184) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6935241Z 	at org.apache.seatunnel.engine.server.rest.service.JobInfoService.getJobInfoJson(JobInfoService.java:69) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6935650Z 	at org.apache.seatunnel.engine.server.rest.servlet.JobInfoServlet.doGet(JobInfoServlet.java:52) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6935976Z 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) ~[seatunnel-hadoop3-3.1.4-uber.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6936290Z 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[seatunnel-hadoop3-3.1.4-uber.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6936803Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6937284Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6937783Z 	at org.apache.seatunnel.engine.server.rest.filter.ExceptionHandlingFilter.doFilter(ExceptionHandlingFilter.java:50) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6938211Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6938669Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6939175Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6939653Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6940124Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6940656Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6941136Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6941665Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6942104Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6942577Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6943041Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6943746Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6944211Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6944679Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6945048Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.Server.handle(Server.java:516) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6945477Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6945894Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6946303Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6946750Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6947254Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6947731Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6948137Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6948635Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6949136Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6949648Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6950183Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6950748Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6951220Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6951742Z 	at org.apache.seatunnel.shade.org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) ~[seatunnel-starter.jar:2.3.12-SNAPSHOT]
2025-09-03T06:28:36.6951874Z 	at java.lang.Thread.run(Thread.java:750) [?:1.8.0_342]

This failure is caused by a NullPointerException in the getJobStartTime method of BaseService.
This PR fixes the NPE in getJobStartTime, which eliminates the flakiness of the testRestoreTaskWhenBinlogDelete test.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Covered by existing test

Check list

Copy link
Member

@Hisoka-X Hisoka-X left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little curious why it would be null. But this fix also good.

@Hisoka-X
Copy link
Member

Hisoka-X commented Sep 4, 2025

Hi @dybyte . I think add a unit test would be better. But also not a big deal if not.

@dybyte
Copy link
Contributor Author

dybyte commented Sep 5, 2025

I'm a little curious why it would be null.

I think the NPE occurs because restoreJob executes asynchronously, and getJobStatus() is called inside the await() before the job is fully initialized. But I’m not entirely sure.

CompletableFuture.supplyAsync(
() -> {
try {
container.restoreJob(
"/mysqlcdc_to_mysql_with_binlog_delete.conf",
String.valueOf(jobId));
} catch (Exception e) {
log.error("Commit task exception :" + e.getMessage());
throw new RuntimeException(e);
}
return null;
});
await().atMost(2, TimeUnit.MINUTES)
.untilAsserted(
() -> {
String jobStatus = container.getJobStatus(String.valueOf(jobId));
Assertions.assertEquals("RUNNING", jobStatus);
});

@Hisoka-X
Copy link
Member

Hisoka-X commented Sep 5, 2025

I think the NPE occurs because restoreJob executes asynchronously, and getJobStatus() is called inside the await() before the job is fully initialized

Should not be. Because the error exception came from rest api. getJobStatus use RPC.

@dybyte
Copy link
Contributor Author

dybyte commented Sep 5, 2025

I have confirmed that the issue occurs because the job's start timestamp is accessed before it reaches the SCHEDULED state, as shown in the following code:

public void startJob() {
isRunning = true;
log.info("{} state process is start", getJobFullName());
updateJobState(JobStatus.SCHEDULED);
stateProcess();
}

private void updateStateTimestamps(@NonNull JobStatus targetState) {
// we must update runningJobStateTimestampsIMap first and then can update
// runningJobStateIMap
Long[] stateTimestamps = runningJobStateTimestampsIMap.get(jobId);
stateTimestamps[targetState.ordinal()] = System.currentTimeMillis();
runningJobStateTimestampsIMap.set(jobId, stateTimestamps);
}

This is all I was able to confirm. If you have any additional insights, I would greatly appreciate it. Thank you!

@Hisoka-X
Copy link
Member

Hisoka-X commented Sep 6, 2025

Oh. Looks make sense. So I think the patch is great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants