[CentOS6][SOS JobScheduler] Controlling the flow of a JobChain based on exit codes


Create: 2013/03/29
LastUpdate: 2013/03/29

This article uses the environment shown in the figure below.
For details of the environment, go back to the menu and see the setup procedure.


With the internal API of JobScheduler, you can evaluate the exit code of a job and control the flow of a JobChain accordingly.
Here we control a JobChain that has the following 10 states.
 step1 | step2 | 200 | 300 | 400 | success | success:2 | error | exit.99 | exit.default
The exit codes of step1 and step2 are evaluated so that the order flows as follows.
[ Normal case ]
step1 → step2 → 200 → 300 → 400 → success
[ step2 exits with code 99 ]
step1 → step2 → exit.99
[ step2 exits with a code other than 0 or 99 ]
step1 → step2 → exit.default
[ step1 exits with code 1 ]
step1 → 300 → 400 → success
[ step1 exits with code 2 ]
step1 → success:2
[ step1 exits with a code other than 0, 1, or 2 ]
step1 → error
The procedure is based on the "JobScheduler FAQ".
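The routing rules above can be written down as a small lookup, which is handy as a checklist when testing each case. This is only a sketch: expectedPath is a hypothetical helper name and is not part of JobScheduler.

```javascript
// Hypothetical helper (not a JobScheduler API): returns the states an order
// is expected to visit for given exit codes of step1 and step2.
function expectedPath(step1Exit, step2Exit) {
  var tail = ["300", "400", "success"];
  switch (step1Exit) {
    case 0:
      break; // step1 succeeded, step2 decides the rest
    case 1:
      return ["step1"].concat(tail);
    case 2:
      return ["step1", "success:2"];
    default:
      return ["step1", "error"];
  }
  if (step2Exit === 0) {
    return ["step1", "step2", "200"].concat(tail);
  }
  if (step2Exit === 99) {
    return ["step1", "step2", "exit.99"];
  }
  return ["step1", "step2", "exit.default"];
}

console.log(expectedPath(0, 0).join(" -> "));
console.log(expectedPath(0, 99).join(" -> "));
console.log(expectedPath(1, 0).join(" -> "));
```

The second argument is ignored whenever step1 exits with a non-zero code, because step2 never runs in those cases.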

1. Define the job for step1


Define the job in a file with the following name.
job1.job.xml
The contents are as follows. Change the value on the exit line when testing.
<job order="yes"
     stop_on_error="no">
    <script language="shell">
        <![CDATA[
echo "** job1 is running."
exit 0
        ]]>
    </script>

    <monitor name="exitCodeDispatcher"
             ordering="0">
        <script language="javascript">
            <![CDATA[
function spooler_task_after(){
  var exitCode = spooler_task.exit_code;
  var order = spooler_task.order;

  spooler_log.info ("** Exit Code is: " + exitCode);
  var result = true;

  switch (exitCode ) {
  case 0: //example to proceed with the next node of the job chain
     spooler_log.info("** proceeding with next step");
     break;
  case 1: //example to proceed the job chain at another node
     spooler_log.info("** go to state(300)");
     order.state="300";
     break;
  case 2: //example to end with an end state
     spooler_log.info("** go to state(success:2)");
     order.state="success:2";
     break;
  default: //Other exit codes are handled as an error
     //spooler_log.info("** Exit Code of " + order.job_chain + "/" + order.id + " in node " + order.job_chain_node.state + " was " + exitCode);
     result = false;
     break;
  }
  //If you want to avoid messages like
  //2011-08-04 10:13:14.531 [ERROR]  (Task sample/job_with_exit_code:1001447) SCHEDULER-280  Process terminated with exit code 1 (0x1)
  //spooler_task.exit_code = 0;
  return result;
}
//This also could be done in a more generic way
//See example in Step job_with_exit_code_generic
            ]]>
        </script>
    </monitor>

    <run_time/>
</job>
When the exit code is not "0", an error message like the following is written to the job's log.
2011-08-04 10:13:14.531 [ERROR]  (job1:447) SCHEDULER-280  Process terminated with exit code 1 (0x1)
To suppress this message, set the exit code to "0" as follows so that the job ends normally.
spooler_task.exit_code = 0;
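To try the step1 decision logic outside JobScheduler, the monitor can be approximated with stand-in objects. The following is a minimal sketch, not the monitor itself: makeStubs and dispatchStep1 are hypothetical names, and the stubs model only the members the monitor uses (exit_code, order.state and info()).

```javascript
// Stand-ins for the objects JobScheduler provides to a monitor script.
// Only the members used by the job1 monitor are modelled (assumption).
function makeStubs(exitCode) {
  var messages = [];
  return {
    messages: messages,
    spooler_log: { info: function (msg) { messages.push(msg); } },
    spooler_task: { exit_code: exitCode, order: { state: "step1" } }
  };
}

// Same decision logic as the job1 monitor, with the objects passed in
// as parameters so that it can run without JobScheduler.
function dispatchStep1(spooler_task, spooler_log) {
  var exitCode = spooler_task.exit_code;
  var order = spooler_task.order;
  var result = true;

  spooler_log.info("** Exit Code is: " + exitCode);
  switch (exitCode) {
    case 0: // keep the state, the chain continues with next_state
      break;
    case 1: // jump to another node
      order.state = "300";
      break;
    case 2: // jump to an end state
      order.state = "success:2";
      break;
    default: // everything else is reported as an error
      result = false;
      break;
  }
  // Optional: reset the exit code to suppress the SCHEDULER-280 message.
  // The position mirrors the commented-out line in the monitor.
  spooler_task.exit_code = 0;
  return result;
}

var stubs = makeStubs(1);
var ok = dispatchStep1(stubs.spooler_task, stubs.spooler_log);
console.log(ok, stubs.spooler_task.order.state, stubs.spooler_task.exit_code);
```

Passing the objects as parameters is a choice made for this sketch only. Inside a real monitor they are available as globals, as in the job definition above.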

2. Define the job for step2


Define the job in a file with the following name.
job2.job.xml
The contents are as follows. Change the value on the exit line when testing.
<job order="yes"
     stop_on_error="no">
    <params/>

    <script language="shell">
        <![CDATA[
echo "** job2 is running."
exit 0
        ]]>
    </script>

    <monitor name="exitCodeDispatcherGeneric"
             ordering="0">
        <script language="javascript">
            <![CDATA[
function spooler_task_after(){
//You define a node with exit.<exitCode> for each possible exitCode
//If node is not defined, a default will be used
  var exitCode = spooler_task.exit_code;
  var order = spooler_task.order;

  if (exitCode != 0){
    var newState = "exit." + exitCode;
    try {//Checking whether the node is defined in the job chain configuration
      order.job_chain.node( newState );
    } catch (e) {
      spooler_log.info("** Not Found " + newState + ". go to state(exit.default)");
      newState = "exit.default";
    }
    spooler_log.info("** go to state(" + newState + ")");
    order.state = newState;
  }
  return true;
}
            ]]>
        </script>
    </monitor>

    <run_time/>
</job>

In this job, when the exit code is not 0, a state name is built from the exit code. If that state does not exist in the job chain, the order moves to "exit.default" instead.
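The fallback to "exit.default" can be checked in isolation with a stand-in for the order object. This is a sketch under stated assumptions: makeOrder and resolveExitState are hypothetical names, and the stub assumes that job_chain.node() throws for an undefined node, which is what the try/catch in the monitor relies on.

```javascript
// Stand-in for spooler_task.order (assumption: node() throws when the
// requested state is not defined in the job chain).
function makeOrder(definedStates) {
  return {
    state: "step2",
    job_chain: {
      node: function (name) {
        if (definedStates.indexOf(name) === -1) {
          throw new Error("unknown node: " + name);
        }
        return { state: name };
      }
    }
  };
}

// Builds "exit.<exitCode>" and falls back to "exit.default" when the
// job chain has no such node. Exit code 0 leaves the state untouched.
function resolveExitState(order, exitCode) {
  if (exitCode === 0) {
    return order.state;
  }
  var newState = "exit." + exitCode;
  try {
    order.job_chain.node(newState);
  } catch (e) {
    newState = "exit.default";
  }
  order.state = newState;
  return newState;
}

var states = ["exit.99", "exit.default"];
console.log(resolveExitState(makeOrder(states), 99));
console.log(resolveExitState(makeOrder(states), 5));
```

Assigning order.state exactly once, after the lookup, keeps the fallback from being overwritten by the code-specific state name.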

3. Define the job for states 200, 300, and 400


Define the job in a file with the following name.
job3.job.xml
The contents are as follows.
<job order="yes"
    stop_on_error="no">
    <script language="shell">
        <![CDATA[
echo "** job3 is running."
exit 0
        ]]>
    </script>
    <run_time/>
</job>

4. Define the JobChain


Define the job chain in a file with the following name.
job_chain1.job_chain.xml
The contents are as follows.
<job_chain orders_recoverable="yes" visible="yes">
    <job_chain_node state="step1" job="job1" next_state="step2"   error_state="error"/>
    <job_chain_node state="step2" job="job2" next_state="200"     error_state="error"/>
    <job_chain_node state="200"   job="job3" next_state="300"     error_state="error"/>
    <job_chain_node state="300"   job="job3" next_state="400"     error_state="error"/>
    <job_chain_node state="400"   job="job3" next_state="success" error_state="error"/>
    <job_chain_node state="success"/>
    <job_chain_node state="success:2"/>
    <job_chain_node state="error"/>
    <job_chain_node state="exit.99"/>
    <job_chain_node state="exit.default"/>
</job_chain>
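To see which states an order passes through from a given node, the job chain can be modelled as a plain table and followed along next_state. This is only an illustration: the chain object restates the XML above, followChain is a hypothetical helper, and it assumes every job succeeds and no monitor changes the state.

```javascript
// The job chain above as a lookup table. Nodes without "next" are end states.
var chain = {
  "step1":        { job: "job1", next: "step2",   error: "error" },
  "step2":        { job: "job2", next: "200",     error: "error" },
  "200":          { job: "job3", next: "300",     error: "error" },
  "300":          { job: "job3", next: "400",     error: "error" },
  "400":          { job: "job3", next: "success", error: "error" },
  "success":      {},
  "success:2":    {},
  "error":        {},
  "exit.99":      {},
  "exit.default": {}
};

// Follows next_state from the start node until an end state is reached.
// The step limit guards against a table that accidentally contains a cycle.
function followChain(table, start) {
  var path = [start];
  var state = start;
  var limit = Object.keys(table).length;
  while (table[state] && table[state].next && path.length <= limit) {
    state = table[state].next;
    path.push(state);
  }
  return path;
}

console.log(followChain(chain, "step1").join(" -> "));
console.log(followChain(chain, "300").join(" -> "));
```

Starting at "300" reproduces the remainder of the path for the case where step1 exits with code 1.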

5. Verify the behavior


First, check the normal case. Set the exit codes of step1 and step2 to "0".
Run the JobChain from JOC.
Start it from "step1" as shown in the figure below.


The result is shown in the figure below.
The order flowed as planned and ended at "success".


The log is as follows.
2013-03-29 11:36:29.708 [info]   (Task test/exit2/job1:5553) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:660, state=step1, on Scheduler http://centos6:4444
2013-03-29 11:36:29.709 [info]   (Task test/exit2/job1:5553) 
2013-03-29 11:36:29.709 [info]   (Task test/exit2/job1:5553) Task test/exit2/job1:5553 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job1.log
2013-03-29 11:36:29.753 [info]   (Task test/exit2/job1:5553) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:36:29.785 [info]   (Task test/exit2/job1:5553) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.JttQBk"'
2013-03-29 11:36:29.796 [info]   (Task test/exit2/job1:5553) ** Exit Code is: 0
2013-03-29 11:36:29.796 [info]   (Task test/exit2/job1:5553) ** proceeding with next step
2013-03-29 11:36:29.797 [info]   (Task test/exit2/job1:5553) ** job1 is running.
2013-03-29 11:36:29.800 [info]   (Task test/exit2/job1:5553) SCHEDULER-843  Task has ended processing of Order test/exit2/job_chain1:660, state=step1, on Scheduler http://centos6:4444
2013-03-29 11:36:29.800 [info]   set_state step2, Job /test/exit2/job2
2013-03-29 11:36:29.836 [info]   (Task test/exit2/job2:5554) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:660, state=step2, on Scheduler http://centos6:4444
2013-03-29 11:36:29.837 [info]   (Task test/exit2/job2:5554) 
2013-03-29 11:36:29.837 [info]   (Task test/exit2/job2:5554) Task test/exit2/job2:5554 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job2.log
2013-03-29 11:36:29.876 [info]   (Task test/exit2/job2:5554) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:36:29.902 [info]   (Task test/exit2/job2:5554) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.L8yAjw"'
2013-03-29 11:36:29.913 [info]   (Task test/exit2/job2:5554) ** job2 is running.
2013-03-29 11:36:29.917 [info]   (Task test/exit2/job2:5554) SCHEDULER-843  Task has ended processing of Order test/exit2/job_chain1:660, state=step2, on Scheduler http://centos6:4444
2013-03-29 11:36:29.917 [info]   set_state 200, Job /test/exit2/job3
2013-03-29 11:36:29.931 [info]   (Task test/exit2/job3:5555) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:660, state=200, on Scheduler http://centos6:4444
2013-03-29 11:36:29.931 [info]   (Task test/exit2/job3:5555) 
2013-03-29 11:36:29.931 [info]   (Task test/exit2/job3:5555) Task test/exit2/job3:5555 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job3.log
2013-03-29 11:36:29.932 [info]   (Task test/exit2/job3:5555) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:36:29.932 [info]   (Task test/exit2/job3:5555) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.NhRq16"'
2013-03-29 11:36:29.970 [info]   (Task test/exit2/job3:5555) ** job3 is running.
2013-03-29 11:36:29.970 [info]   (Task test/exit2/job3:5555) SCHEDULER-915  Process event
2013-03-29 11:36:29.971 [info]   (Task test/exit2/job3:5555) SCHEDULER-843  Task has ended processing of Order test/exit2/job_chain1:660, state=200, on Scheduler http://centos6:4444
2013-03-29 11:36:29.971 [info]   set_state 300, Job /test/exit2/job3
2013-03-29 11:36:30.046 [info]   (Task test/exit2/job3:5556) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:660, state=300, on Scheduler http://centos6:4444
2013-03-29 11:36:30.050 [info]   (Task test/exit2/job3:5556) 
2013-03-29 11:36:30.050 [info]   (Task test/exit2/job3:5556) Task test/exit2/job3:5556 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job3.log
2013-03-29 11:36:30.051 [info]   (Task test/exit2/job3:5556) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:36:30.052 [info]   (Task test/exit2/job3:5556) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.nVcFq6"'
2013-03-29 11:36:30.109 [info]   (Task test/exit2/job3:5556) ** job3 is running.
2013-03-29 11:36:30.109 [info]   (Task test/exit2/job3:5556) SCHEDULER-915  Process event
2013-03-29 11:36:30.110 [info]   (Task test/exit2/job3:5556) SCHEDULER-843  Task has ended processing of Order test/exit2/job_chain1:660, state=300, on Scheduler http://centos6:4444
2013-03-29 11:36:30.110 [info]   set_state 400, Job /test/exit2/job3
2013-03-29 11:36:30.125 [info]   (Task test/exit2/job3:5557) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:660, state=400, on Scheduler http://centos6:4444
2013-03-29 11:36:30.128 [info]   (Task test/exit2/job3:5557) 
2013-03-29 11:36:30.128 [info]   (Task test/exit2/job3:5557) Task test/exit2/job3:5557 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job3.log
2013-03-29 11:36:30.128 [info]   (Task test/exit2/job3:5557) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:36:30.129 [info]   (Task test/exit2/job3:5557) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.tIOO57"'
2013-03-29 11:36:30.184 [info]   (Task test/exit2/job3:5557) ** job3 is running.
2013-03-29 11:36:30.185 [info]   (Task test/exit2/job3:5557) SCHEDULER-915  Process event
2013-03-29 11:36:30.186 [info]   (Task test/exit2/job3:5557) SCHEDULER-843  Task has ended processing of Order test/exit2/job_chain1:660, state=400, on Scheduler http://centos6:4444
2013-03-29 11:36:30.187 [info]   set_state success
2013-03-29 11:36:30.187 [info]   SCHEDULER-945  No further job in job chain - order has been carried out
2013-03-29 11:36:30.187 [info]   SCHEDULER-940  Removing order from job chain
The result with the exit code of step2 set to "99" is shown in the figure below.
The order flowed as planned and ended at "exit.99".


The log is as follows.
2013-03-29 11:42:27.923 [info]   (Task test/exit2/job1:5559) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:661, state=step1, on Scheduler http://centos6:4444
2013-03-29 11:42:27.924 [info]   (Task test/exit2/job1:5559) 
2013-03-29 11:42:27.924 [info]   (Task test/exit2/job1:5559) Task test/exit2/job1:5559 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job1.log
2013-03-29 11:42:27.949 [info]   (Task test/exit2/job1:5559) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:42:27.976 [info]   (Task test/exit2/job1:5559) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.bveWCc"'
2013-03-29 11:42:27.986 [info]   (Task test/exit2/job1:5559) ** Exit Code is: 0
2013-03-29 11:42:27.987 [info]   (Task test/exit2/job1:5559) ** proceeding with next step
2013-03-29 11:42:27.987 [info]   (Task test/exit2/job1:5559) ** job1 is running.
2013-03-29 11:42:27.991 [info]   (Task test/exit2/job1:5559) SCHEDULER-843  Task has ended processing of Order test/exit2/job_chain1:661, state=step1, on Scheduler http://centos6:4444
2013-03-29 11:42:27.991 [info]   set_state step2, Job /test/exit2/job2
2013-03-29 11:42:28.005 [info]   (Task test/exit2/job2:5560) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:661, state=step2, on Scheduler http://centos6:4444
2013-03-29 11:42:28.005 [info]   (Task test/exit2/job2:5560) 
2013-03-29 11:42:28.005 [info]   (Task test/exit2/job2:5560) Task test/exit2/job2:5560 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job2.log
2013-03-29 11:42:28.024 [info]   (Task test/exit2/job2:5560) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:42:28.051 [info]   (Task test/exit2/job2:5560) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.QJu4Nb"'
2013-03-29 11:42:28.070 [info]   (Task test/exit2/job2:5560) ** go to state(exit.99)
2013-03-29 11:42:28.071 [info]   set_state exit.99
2013-03-29 11:42:28.071 [info]   SCHEDULER-945  No further job in job chain - order has been carried out
2013-03-29 11:42:28.071 [info]   SCHEDULER-941  Removing order from job chain. Order is still being executed by Task test/exit2/job2:5560
The result with the exit code of step1 set to "2" is shown in the figure below.
The order flowed as planned and ended at "success:2".


The log is as follows.
2013-03-29 11:44:55.255 [info]   (Task test/exit2/job1:5562) SCHEDULER-842  Task is going to process Order test/exit2/job_chain1:662, state=step1, on Scheduler http://centos6:4444
2013-03-29 11:44:55.256 [info]   (Task test/exit2/job1:5562) 
2013-03-29 11:44:55.256 [info]   (Task test/exit2/job1:5562) Task test/exit2/job1:5562 - Protocol starts in /home/jobs/sos-berlin.com/jobscheduler/scheduler/logs/task.test,exit2,job1.log
2013-03-29 11:44:55.285 [info]   (Task test/exit2/job1:5562) SCHEDULER-918  state=starting (at=never)
2013-03-29 11:44:55.307 [info]   (Task test/exit2/job1:5562) SCHEDULER-987  Starting process: '/bin/sh' '-c' '"/tmp/jobs/sos.lq8wzp"'
2013-03-29 11:44:55.318 [info]   (Task test/exit2/job1:5562) ** Exit Code is: 2
2013-03-29 11:44:55.318 [info]   (Task test/exit2/job1:5562) ** go to state(success:2)
2013-03-29 11:44:55.318 [info]   set_state success:2
2013-03-29 11:44:55.318 [info]   SCHEDULER-945  No further job in job chain - order has been carried out
2013-03-29 11:44:55.318 [info]   SCHEDULER-941  Removing order from job chain. Order is still being executed by Task test/exit2/job1:5562
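In the logs above, each transition shows up as a set_state line, so the route an order took can be read off by collecting those lines. The following is a rough sketch: statesFromLog is a hypothetical helper, and the pattern is based only on the line format seen in these logs.

```javascript
// Collects the state names from "set_state <state>" log lines.
// The pattern is derived from the log excerpts in this article (assumption).
function statesFromLog(logText) {
  var states = [];
  var lines = logText.split("\n");
  for (var i = 0; i < lines.length; i++) {
    var match = /set_state ([^,\s]+)/.exec(lines[i]);
    if (match) {
      states.push(match[1]);
    }
  }
  return states;
}

// Two lines taken from the run in which step2 exits with code 99.
var sample = [
  "2013-03-29 11:42:27.991 [info]   set_state step2, Job /test/exit2/job2",
  "2013-03-29 11:42:28.071 [info]   set_state exit.99"
].join("\n");

console.log(statesFromLog(sample).join(" -> "));
```

The start state does not appear in the result, because set_state is logged only when the order moves to a new state.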