After setting up monitoring and alerting, the next improvement is automatic recovery.
This ensures your Firebird maintenance jobs can retry after failure — minimizing manual intervention and downtime.

1. Why Auto-Retry Matters

Sometimes jobs fail due to temporary issues — like a network delay, database lock, or resource constraint.
Instead of waiting for human action, an automatic retry can often resolve the issue immediately. You'll implement retry logic with a limited number of attempts for failed jobs, making your system more reliable.

2. Define Retry Settings

You can configure retry parameters globally in your application properties:
app.retry.max-attempts=3
app.retry.delay-seconds=10
This ensures retries happen a few times with a small delay between attempts, avoiding infinite loops or unnecessary database stress.
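As an optional refinement (not part of the configuration above), the fixed delay can be replaced with exponential backoff, where each attempt waits longer than the last. A minimal sketch of the delay calculation, assuming the 10-second base delay from the properties; the class name is illustrative:

```java
public class BackoffCalculator {

    // Delay before a given retry attempt (1-based), doubling the
    // base delay each time: base, 2*base, 4*base, ...
    public static long delaySeconds(long baseSeconds, int attempt) {
        return baseSeconds * (1L << (attempt - 1));
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 3; attempt++) {
            System.out.println("Attempt " + attempt + ": wait "
                    + delaySeconds(10, attempt) + "s");
        }
    }
}
```

With a base of 10 seconds, the three attempts wait 10, 20, and 40 seconds, which spreads load on the database after repeated failures.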

3. Create a Retry Utility

Let’s build a small helper to handle retry logic in a clean way:
import java.util.function.Consumer;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class RetryHandler {

    @Value("${app.retry.max-attempts:3}")
    private int maxAttempts;

    @Value("${app.retry.delay-seconds:10}")
    private int delaySeconds;

    // jobName is kept for logging retry attempts (see section 5).
    public void executeWithRetry(Runnable task, String jobName, Consumer<Exception> onFailure) {
        int attempt = 0;
        while (attempt < maxAttempts) {
            try {
                task.run();
                return;
            } catch (Exception e) {
                attempt++;
                if (attempt >= maxAttempts) {
                    onFailure.accept(e);
                } else {
                    try {
                        Thread.sleep(delaySeconds * 1000L);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        onFailure.accept(ie);
                        return;
                    }
                }
            }
        }
    }
}
This helper wraps any job logic, automatically retrying when exceptions occur.
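To see the pattern in isolation, here is a self-contained sketch of the same retry loop outside Spring; the class name, counter, and "fail twice, then succeed" task are illustrative:

```java
import java.util.function.Consumer;

public class RetryDemo {
    static int calls = 0;

    // Same loop as RetryHandler, with the attempt limit passed in directly.
    static boolean executeWithRetry(Runnable task, int maxAttempts, Consumer<Exception> onFailure) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                task.run();
                return true;              // success: stop retrying
            } catch (Exception e) {
                if (attempt == maxAttempts) {
                    onFailure.accept(e);  // exhausted: report the last error
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A task that fails twice, then succeeds on the third call.
        Runnable flaky = () -> {
            calls++;
            if (calls < 3) throw new IllegalStateException("transient failure " + calls);
        };
        boolean ok = executeWithRetry(flaky, 3,
                e -> System.out.println("gave up: " + e.getMessage()));
        System.out.println("succeeded=" + ok + " after " + calls + " calls");
        // prints "succeeded=true after 3 calls"
    }
}
```

The failure callback only fires when every attempt has been used up, which is exactly the behavior the scheduler relies on in the next step.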

4. Update Maintenance Scheduler to Use Retry

Modify the existing job scheduler to utilize the retry mechanism:
import java.time.LocalDateTime;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class MaintenanceScheduler {

    @Autowired
    private RetryHandler retryHandler;
    @Autowired
    private MaintenanceLogRepository logRepo;
    @Autowired
    private EmailAlertService emailService;
    @Autowired
    private AuditLogRepository auditRepo;

    @Scheduled(cron = "0 0 2 * * SUN")
    public void cleanAndArchiveLogs() {
        retryHandler.executeWithRetry(() -> {
            auditRepo.archiveOldLogs();
            auditRepo.deleteOldLogs();
            logRepo.save(new MaintenanceLog("Cleanup Job", "SUCCESS",
                    LocalDateTime.now(), "Logs archived and cleaned successfully."));
        }, "Cleanup Job", (Exception e) -> {
            logRepo.save(new MaintenanceLog("Cleanup Job", "FAILED",
                    LocalDateTime.now(), e.getMessage()));
            emailService.sendFailureAlert("Cleanup Job", e.getMessage());
        });
    }
}
This ensures that even if the first attempt fails, the job will retry a few times before marking it as failed.

5. Track Retry Attempts in Logs

You can enhance observability by logging retry counts:
logger.warn("Attempt {}/{} failed for job {}: {}", attempt, maxAttempts, jobName, e.getMessage());
Add this inside the catch block of the retry handler to make debugging easier in production environments.
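For illustration, here is the same loop with the per-attempt messages collected into a list instead of sent to a logger (in RetryHandler you would emit them through your logging framework, e.g. an SLF4J logger, which this sketch does not assume); the class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

public class RetryLogDemo {

    // Runs the task with retries, recording one warning line per failed attempt.
    static List<String> runWithRetryLogging(Runnable task, int maxAttempts, String jobName) {
        List<String> warnings = new ArrayList<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                task.run();
                break;
            } catch (Exception e) {
                // Mirrors: logger.warn("Attempt {}/{} failed for job {}: {}", ...)
                warnings.add("Attempt " + attempt + "/" + maxAttempts
                        + " failed for job " + jobName + ": " + e.getMessage());
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        runWithRetryLogging(
                () -> { throw new IllegalStateException("lock timeout"); },
                3, "Cleanup Job")
            .forEach(System.out::println);
    }
}
```

Seeing "Attempt 2/3" in production logs immediately tells you a job is flapping rather than failing outright.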

6. Add Recovery Strategy

When all retries fail, you can trigger a recovery procedure, such as restoring from a backup table or rescheduling the job for later:
public void recoverFailedJob(String jobName) {
    logRepo.save(new MaintenanceLog(jobName, "RECOVERY_TRIGGERED",
            LocalDateTime.now(), "Automatic recovery scheduled."));
    // Example recovery logic: re-queue or restore previous data
}
You can call this recovery method from within the failure handler.
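One way to wire that in, sketched without the Spring repositories (the stand-in recoverFailedJob below only records which job requested recovery; in the scheduler it would write the MaintenanceLog row shown above and then re-queue or restore data):

```java
import java.util.function.Consumer;

public class RecoveryDemo {
    static String lastRecovered = null;

    // Stand-in for recoverFailedJob from the section above.
    static void recoverFailedJob(String jobName) {
        lastRecovered = jobName;
    }

    public static void main(String[] args) {
        // The failure handler passed to executeWithRetry can simply
        // chain into recovery after logging and alerting.
        Consumer<Exception> onFailure = e -> {
            System.out.println("Cleanup Job failed: " + e.getMessage());
            recoverFailedJob("Cleanup Job");
        };
        onFailure.accept(new IllegalStateException("all retries exhausted"));
        System.out.println("recovery triggered for: " + lastRecovered);
        // prints "recovery triggered for: Cleanup Job"
    }
}
```

Keeping recovery inside the failure handler means it only runs after every retry has been exhausted, never on a transient error that a retry already fixed.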