One TTL task losing heartbeat will block other tasks from getting heartbeat #57915

@YangKeao

Description

See the following code:

// updateHeartBeat updates the heartbeat for all tasks with current instance as owner
func (m *taskManager) updateHeartBeat(ctx context.Context, se session.Session, now time.Time) error {
	for _, task := range m.runningTasks {
		state := &cache.TTLTaskState{
			TotalRows:   task.statistics.TotalRows.Load(),
			SuccessRows: task.statistics.SuccessRows.Load(),
			ErrorRows:   task.statistics.ErrorRows.Load(),
		}
		if task.result != nil && task.result.err != nil {
			state.ScanTaskErr = task.result.err.Error()
		}

		intest.Assert(se.GetSessionVars().Location().String() == now.Location().String())
		sql, args, err := updateTTLTaskHeartBeatSQL(task.JobID, task.ScanID, now, state, m.id)
		if err != nil {
			return err
		}
		_, err = se.ExecuteSQL(ctx, sql, args...)
		if err != nil {
			return errors.Wrapf(err, "execute sql: %s", sql)
		}

		if se.GetSessionVars().StmtCtx.AffectedRows() != 1 {
			return errors.Errorf("fail to update task status, maybe the owner is not myself (%s), affected rows: %d",
				m.id, se.GetSessionVars().StmtCtx.AffectedRows())
		}
	}
	return nil
}

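This function returns an error as soon as one task fails to update its heartbeat, so the tasks that come after it in m.runningTasks never get their heartbeat updated in that round. Instead, we should log the error and continue with the next task.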
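A minimal sketch of the log-and-continue behavior proposed above, not the actual fix. To keep it self-contained it does not use TiDB types: `task`, `heartbeatFunc`, `updateHeartBeats`, and `errNotOwner` are hypothetical stand-ins for the running task, the per-task SQL update, the loop in updateHeartBeat, and the "affected rows != 1" failure. It assumes Go 1.20+ for `errors.Join`.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// task is a hypothetical stand-in for one running TTL scan task.
type task struct {
	JobID  string
	ScanID int64
}

// heartbeatFunc is a hypothetical stand-in for the per-task heartbeat
// update (build the SQL, execute it, check the affected rows).
type heartbeatFunc func(t task) error

// errNotOwner is a hypothetical sentinel for "affected rows != 1".
var errNotOwner = errors.New("fail to update task status, maybe the owner is not myself")

// updateHeartBeats attempts the heartbeat update for every task.
// A failure is logged and collected, and the loop moves on to the next
// task instead of returning early. The result is nil if all updates
// succeeded, otherwise the joined per-task errors.
func updateHeartBeats(tasks []task, update heartbeatFunc) error {
	var errs []error
	for _, t := range tasks {
		if err := update(t); err != nil {
			log.Printf("update heartbeat failed: job=%s scan=%d err=%v", t.JobID, t.ScanID, err)
			errs = append(errs, fmt.Errorf("job %s scan %d: %w", t.JobID, t.ScanID, err))
			continue
		}
	}
	return errors.Join(errs...)
}

func main() {
	tasks := []task{
		{JobID: "job-1", ScanID: 1},
		{JobID: "job-1", ScanID: 2}, // this one fails below
		{JobID: "job-2", ScanID: 1},
	}
	err := updateHeartBeats(tasks, func(t task) error {
		if t.JobID == "job-1" && t.ScanID == 2 {
			return errNotOwner
		}
		return nil
	})
	// The third task was still attempted even though the second failed.
	fmt.Println("aggregated error:", err)
}
```

Returning the joined error keeps the failure visible to the caller, while the `continue` ensures one task that lost its heartbeat no longer blocks the others.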

Metadata

Assignees

No one assigned

    Labels

    affects-7.1: This bug affects the 7.1.x(LTS) versions.
    affects-7.5: This bug affects the 7.5.x(LTS) versions.
    affects-8.1: This bug affects the 8.1.x(LTS) versions.
    affects-8.5: This bug affects the 8.5.x(LTS) versions.
    severity/moderate
    sig/sql-infra: SIG: SQL Infra
    type/bug: The issue is confirmed as a bug.
