Here is what is happening.

By design, DBMS_JOB retries any failed job after 1 minute, then 2 minutes, then 4 minutes, and so on, doubling the wait each time until it reaches 16 failures. At that point DBMS_JOB gives up and marks the job as "broken".

I would intercept the failure in an EXCEPTION section. There you could...
1- Send an email, page, or whatever alert you have in place to the person or team that can do something about it.
2- Set the job as "broken" so DBMS_JOB won't try again.
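
The two steps above could be sketched roughly like this. It is only a sketch under assumptions: run_nightly_load, do_nightly_load and notify_support are made-up names standing in for your own job wrapper, the real work, and whatever alerting routine you already have. It relies on the special "job" and "broken" parameters that DBMS_JOB makes available inside the "what" string; calling DBMS_JOB.BROKEN(job, TRUE) followed by a COMMIT is the other usual way to flag a job.

```sql
-- Wrapper that the job queue calls. Names here are hypothetical.
CREATE OR REPLACE PROCEDURE run_nightly_load (
  p_job    IN     BINARY_INTEGER,  -- job number, passed in by DBMS_JOB
  p_broken IN OUT BOOLEAN          -- set to TRUE to mark the job broken
) AS
BEGIN
  do_nightly_load;                 -- hypothetical: the real work
EXCEPTION
  WHEN OTHERS THEN
    -- 1- Tell someone who can act on it (hypothetical alert routine).
    notify_support('Job ' || p_job || ' failed: ' || SQLERRM);
    -- 2- Mark the job broken so it is not run again.
    --    Because the error is handled here, DBMS_JOB does not see a
    --    failure and the 1, 2, 4... minute retries never start.
    p_broken := TRUE;
END run_nightly_load;
/

-- Submitting the job: "job" and "broken" in the what string are the
-- special values supplied by DBMS_JOB at run time.
DECLARE
  l_job BINARY_INTEGER;
BEGIN
  DBMS_JOB.SUBMIT(
    job       => l_job,
    what      => 'run_nightly_load(job, broken);',
    next_date => SYSDATE,
    interval  => 'SYSDATE + 1');   -- assumed daily schedule
  COMMIT;                          -- the job is not queued until commit
END;
/
```

Once the underlying problem is fixed, the job can be re-enabled with DBMS_JOB.BROKEN(job_number, FALSE) followed by a COMMIT, or by running it manually with DBMS_JOB.RUN.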