Page 138 - IJOCTA-15-2
P. 138
End of day process optimization
constant during the 6-month time interval; and values on comparing the quality of solutions gen-
this thread allocation amount is the default mode erated by the heuristic. The details of the MIP
for each task. formulation and the Simulated Annealing algo-
The MRCPSP formulation requires a resource rithm are provided in Sections 4.1 and 4.2, re-
usage value for each resource and a duration value spectively.
for each mode of each task. For the default mode
of each task, the resource usage value is the con- 4.1. Mathematical optimization, MIP
stant value observed in the daily logs, and the du-
ration value is the median of the daily duration The data in our case study has highly varying
calculated from daily logs. task durations, with some tasks taking as short
Infrastructural data sets indicate that 40% of as 0.015 minutes for completion, and some tasks
the tasks cannot be parallelized, and must be exe- taking as long as 60 minutes for completion. For-
cuted by a single thread. For the remaining tasks, mulating this problem with a time-indexed MIP
depending on the duration of the task, we define model requires an immense number of time in-
two or three alternative candidate thread num- tervals for discretization of time; hence leads to
bers as the resource requirements of new modes. inefficiency. For this reason, we base the MIP for-
Unfortunately, the durations of the tasks in these mulation of our problem on the flow based model
5
modes have not been observed, and must there- described in. To the best of our knowledge, in-
fore be estimated. We base our estimation on the cluding the multi-mode concept in the flow-based
following inverse relation formulation is our contribution.
The model consists of the following variables:
(dur default )(thrdCnt default )
(dur new ) = (1) S i Start time of task i
thrdCnt new
x ij binary variable indicating whether
where dur refers to the duration of the task, task i is completed before task j be-
thrdCnt refers to the thread count, and the sub- gins (the completion time of task i is
scripts default refers to the observed values in the at most the start time of task j)
data set and new refers to the proposed alterna-
z im binary variable indicating whether
tives.
task i is executed in mode m
An essential characteristic of our problem is
its iterative and data-driven nature. Since every y ijm binary variable corresponding to
day different subsets of the EOD tasks are exe- x ij × z im
cuted, the scheduler needs to be called regularly f ijr the amount of resource r transferred
on a daily basis. As non-default task modes are from task i to task j.
selected and executed, observations on their daily
durations are collected in daily job logs. These
observations can be used to improve the estima- Let A = {0, . . . , n + 1} be the set of tasks
tions in (1). This enhancement plan is anticipated such that 0 is the starting task and n + 1 is the
to ultimately converge and utilize higher quality ending task; E ⊆ A × A be the set of precedence
estimations for task durations. relations such that (i, j) ∈ E indicates that task
i is a predecessor of task j, M i be the number
of execution modes of task i, R be the number
4. Methodology
of renewable resources, b imr be the amount of re-
The solution methodology applied to solve this source r task i requires in mode m, and p im be
MRCPSP has been two-fold: optimization via the estimated duration of task i in mode m. By
mixed integer programming, and approximation design, the dummy tasks i ∈ {0, n + 1} have only
via heuristics. The ultimate goal of this case a single mode, and for each of these two tasks, the
study is creating a tool, a scheduler for the bank’s amount of resource r required to execute the task
implementation. Although MIP formulation is in the default mode 1 (b i1r ) equals the availability
guaranteed to give optimal results, efficiently solv- of resource r at any given point in time.
ing a large-scale MIP to optimality requires li- The flow formulated in this model can be in-
censed commercial solvers. Therefore, for im- terpreted as follows: Given a network of tasks
plementation purposes, a heuristic algorithm has connected via the precedence relations, all of the
been devised. More specifically, similar to other available resources are gathered at task 0, split
work on MRCPSP, Simulated Annealing algo- among the tasks during their execution, and hence
rithm has been deployed. The MIP formulation flow through the tasks, and finally accumulate at
and the optimal solutions are used as benchmark task n + 1.
333

