[feature](routine-load) Add last task schedule time to routine load jobs #65166
Open
0AyanamiRei wants to merge 9 commits into
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: The routine load job system table already had LAST_TASK_SCHEDULE_TIME in the FE schema and thrift struct, but the BE information schema scanner still did not expose or fill the new column. The scheduler also updated the job-level timestamp before confirming that the task still belonged to the job. This change wires the BE scanner to the new thrift field, moves the job-level timestamp update to after the task validity check, and extends the routine load system table regression case to query the new column.
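The ordering fix described above can be illustrated with a minimal sketch. It is not the Doris code: `SketchJob`, `SketchScheduler`, and every method name below are hypothetical stand-ins, and the only point shown is that the job-level timestamp is written after the ownership check, so a stale task leaves it untouched.

```java
import java.time.Instant;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for a routine load job (not the Doris class).
class SketchJob {
    private final Set<Long> liveTaskIds = new HashSet<>();
    private Instant lastTaskScheduleTime; // null until a valid task is scheduled

    void addTask(long taskId) { liveTaskIds.add(taskId); }
    void removeTask(long taskId) { liveTaskIds.remove(taskId); }
    boolean ownsTask(long taskId) { return liveTaskIds.contains(taskId); }
    void recordSchedule(Instant when) { lastTaskScheduleTime = when; }

    // Empty string while no valid task has been scheduled.
    String lastTaskScheduleTimeString() {
        return lastTaskScheduleTime == null ? "" : lastTaskScheduleTime.toString();
    }
}

// Hypothetical scheduler step: validate first, then record the timestamp.
class SketchScheduler {
    static boolean schedule(SketchJob job, long taskId, Instant now) {
        if (!job.ownsTask(taskId)) {
            return false; // stale task: the job-level timestamp is not touched
        }
        job.recordSchedule(now);
        return true;
    }
}

class ScheduleOrderDemo {
    public static void main(String[] args) {
        SketchJob job = new SketchJob();
        job.addTask(1L);
        // Task 2 does not belong to the job, so nothing is recorded.
        System.out.println(SketchScheduler.schedule(job, 2L, Instant.now()));
        System.out.println(job.lastTaskScheduleTimeString().isEmpty());
        // Task 1 is valid, so the timestamp is recorded.
        System.out.println(SketchScheduler.schedule(job, 1L, Instant.now()));
        System.out.println(job.lastTaskScheduleTimeString().isEmpty());
    }
}
```

With the old ordering (record first, validate second), the first call would have left a timestamp on the job even though no task of that job was actually scheduled.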
### Release note
Add LAST_TASK_SCHEDULE_TIME to information_schema.routine_load_jobs.
### Check List (For Author)
- Test:
- Manual test: ./build-support/check-format.sh
- Manual test: git diff --check
- Regression test: Not run (requires local Doris and Kafka cluster; to be run in final validation)
- Behavior changed: Yes. information_schema.routine_load_jobs exposes LAST_TASK_SCHEDULE_TIME.
- Does this need documentation: No
run buildall
### What problem does this PR solve?
Issue Number: N/A
Related PR: apache#65166
Problem Summary: The routine load system table regression checked LAST_TASK_SCHEDULE_TIME on an invalid Kafka topic path. That job can pause while refreshing Kafka partitions before any task reaches RoutineLoadTaskScheduler, so the job-level task schedule time is expected to stay empty. This change keeps the abnormal-pause system table coverage, adds a real Kafka topic and scheduled routine load job for the LAST_TASK_SCHEDULE_TIME assertion, and marks the job-level timestamp field as transient to make its runtime-only semantics explicit.
### Release note
None
### Check List (For Author)
- Test:
- Manual test: git diff --check
- Manual test: SHOW COLUMNS FROM information_schema.routine_load_jobs LIKE 'LAST_TASK_SCHEDULE_TIME' and SELECT JOB_NAME, LAST_TASK_SCHEDULE_TIME for a scheduled routine load job
- Regression test: TMPDIR=/data/data3/huangruixin/tmp/codex-build ./run-regression-test.sh --run -d load_p0/routine_load -s test_routine_load_job_info_system_table
- Behavior changed: No
- Does this need documentation: No
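To make the "runtime-only" semantics of a transient field concrete, here is a small sketch. It rests on assumptions: the source does not show how Doris persists jobs, so plain `java.io` serialization stands in for that mechanism, and `TransientJobSketch` with its field and method names is hypothetical. It only shows that a `transient` field is skipped when the object is written and comes back at its default value, which then renders as an empty string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical job class; java.io serialization is an assumed stand-in for persistence.
class TransientJobSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    // Runtime-only: not written by serialization, so it is 0 after a reload.
    transient long lastTaskScheduleTimeMs;

    TransientJobSketch(String name) { this.name = name; }

    // Empty string while no task has been scheduled in this process lifetime.
    String lastTaskScheduleTimeString() {
        return lastTaskScheduleTimeMs <= 0 ? "" : Long.toString(lastTaskScheduleTimeMs);
    }

    // Serializes the job to bytes and reads it back, simulating a restart.
    static TransientJobSketch roundTrip(TransientJobSketch job) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(job);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientJobSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class TransientFieldDemo {
    public static void main(String[] args) {
        TransientJobSketch job = new TransientJobSketch("demo_job");
        job.lastTaskScheduleTimeMs = System.currentTimeMillis();
        TransientJobSketch reloaded = TransientJobSketch.roundTrip(job);
        System.out.println(reloaded.name);                                   // persisted field survives
        System.out.println(reloaded.lastTaskScheduleTimeString().isEmpty()); // transient field is reset
    }
}
```

The same reasoning covers the invalid-topic case in the summary: a job that pauses before any task reaches the scheduler never sets the field, so its value stays empty.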
Support LastTaskScheduleTime field in SHOW ROUTINE LOAD command to be consistent with information_schema.routine_load_jobs system table. The new field is appended as the last column after ComputeGroup, using the same getLastTaskScheduleTimeString() method to ensure consistent semantics and display format.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cheduleTime
Add regression test to verify:
1. SHOW ROUTINE LOAD includes LastTaskScheduleTime as the last column
2. The value is non-empty for scheduled jobs
3. The value is consistent between SHOW ROUTINE LOAD and information_schema.routine_load_jobs
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
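The three checks listed above can be sketched as a single helper. This is only an illustration, not the regression test from the PR (Doris regression suites are written against a live cluster): `ShowRoutineLoadCheckSketch` and `verify` are hypothetical names, the rows are assumed to have already been fetched as plain strings, and the values in `main` are placeholders rather than real output.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class ShowRoutineLoadCheckSketch {
    // Returns null when all three checks pass, otherwise a description of the first failure.
    static String verify(List<String> showColumns, List<String> showRow,
                         Map<String, String> schemaRow) {
        int last = showColumns.size() - 1;
        // Check 1: LastTaskScheduleTime is the last column of SHOW ROUTINE LOAD.
        if (last < 0 || !"LastTaskScheduleTime".equals(showColumns.get(last))) {
            return "LastTaskScheduleTime is not the last column";
        }
        // Check 2: the value is non-empty for a scheduled job.
        String shown = showRow.size() > last ? showRow.get(last) : null;
        if (shown == null || shown.isEmpty()) {
            return "LastTaskScheduleTime is empty";
        }
        // Check 3: the value matches information_schema.routine_load_jobs.
        if (!shown.equals(schemaRow.get("LAST_TASK_SCHEDULE_TIME"))) {
            return "value differs from information_schema.routine_load_jobs";
        }
        return null;
    }

    public static void main(String[] args) {
        // Placeholder rows for illustration only; real rows would come from the two queries.
        List<String> columns = Arrays.asList("Name", "ComputeGroup", "LastTaskScheduleTime");
        List<String> row = Arrays.asList("demo_job", "demo_group", "placeholder-time");
        Map<String, String> schemaRow =
            Collections.singletonMap("LAST_TASK_SCHEDULE_TIME", "placeholder-time");
        System.out.println(verify(columns, row, schemaRow) == null);
    }
}
```

Comparing the two values as strings relies on both paths using the same formatting method, which is what the SHOW ROUTINE LOAD commit describes.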
run buildall
TPC-H: Total hot run time: 30085 ms
TPC-DS: Total hot run time: 174515 ms
ClickBench: Total hot run time: 25.34 s
FE UT Coverage Report: Increment line coverage
BE Regression && UT Coverage Report: Increment line coverage, Increment coverage report
FE Regression Coverage Report: Increment line coverage
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary:
Add LAST_TASK_SCHEDULE_TIME to information_schema.routine_load_jobs so users can see the latest valid routine load task scheduling time at the job level. The scheduler records the timestamp only after confirming the task still belongs to the job, and the information schema path exposes it through FE thrift and the BE schema scanner. The regression keeps the abnormal-pause system table coverage and verifies the new field with a real Kafka-backed scheduled routine load job.
Example:
Example output:
### Release note
Add LAST_TASK_SCHEDULE_TIME to information_schema.routine_load_jobs.
### Check List (For Author)
- Test
- Behavior changed: information_schema.routine_load_jobs now exposes LAST_TASK_SCHEDULE_TIME.
- Does this need documentation?
### Check List (For Reviewer who merge this PR)