UNIVERSIDAD PONTIFICIA COMILLAS
ESCUELA TÉCNICA SUPERIOR DE INGENIERÍA (ICAI)
INDUSTRIAL ENGINEERING
FINAL YEAR PROJECT
DEVELOPMENT OF METAHEURISTIC ALGORITHMS
FOR FLEXIBLE JOBSHOP SCHEDULING
SARA LUMBRERAS SANCHO
MADRID, June 2006
Submission of the project by the student is authorized:
Sara Lumbreras Sancho
THE PROJECT DIRECTOR
Ángel Sarabia Viejo
Signed: Date:
Approval of the Project Coordinator
Claudia Meseguer Velasco
Signed: Date:
“The general jobshop problem is a fascinating challenge. It is easy to pose and visualise, but extremely difficult to solve. The problem keeps attracting researchers who cannot believe that such a simply structured problem can be so difficult until they try.”
Richard W. Conway et al. (1967)
Acknowledgements
To everyone who has helped me during this long stage as a student. To my parents, to my sister Esther, and to the teachers and supervisors – especially some of them – that I have had from the time I was a little girl until now. To all those who have taught me, and keep teaching me, to learn, to work, to live.
I would also like to acknowledge some specific favours that have supported this particular project. Again, to Esther, for her wonderful illustrations. To Paloma, for her private biology lessons. To Josemaría, for listening attentively to every bit of progress.
And, above all, to Ángel. You have made me enjoy this immensely.
Abstract
Classical optimization methods are not well suited to NP-complete scheduling problems, for which they often require unacceptable computation times. In these cases it is advisable to apply heuristic techniques, which give up the global optimum in exchange for very satisfactory solutions in affordable times. Jobshop scheduling is one of the most interesting manifestations of this fact, both for the diversity of environments in which it arises and for its complexity, which has motivated an extensive literature.
The case under study is unusually general, covering most of the possible complexities of deterministic models. The shop is of the flexible type, which implies that an operation may be performed on several different machines, of course with different processing times. In addition, sequence-dependent set-up times and transportation times that may depend on the batch size or on the particular job have been considered.
The objective of this work is to solve the jobshop problem in this general version by developing and studying new algorithms capable of solving the problem in times far below those required by classical methods.
Solely in order to compare the behaviour of the developed algorithms with the results of traditional techniques, a linear programming model has been built. Nevertheless, the core objective was the development of the heuristics and their implementation.
To this end, a codification based on job lists has been implemented. Although the fundamental idea turned out not to be original (it is suggested, though not used, in [MESGHOUNI 04], whose author did not seem to be aware of its advantages), the implementation and the generalization to a double list, which makes its application to the flexible version of the problem possible, are original. This system presents the considerable advantage of encoding only feasible solutions, which results in simpler and more efficient algorithms.
A simulated annealing algorithm has been developed that differs from conventional implementations of this paradigm in its most basic structure, which as a novelty is articulated in two steps: each iteration initially performs an improvement search in the neighbourhood of the last accepted solution. Only if this first stage has not returned any satisfactory result is the possibility of thermodynamic jumps – the basis of annealing – subsequently enabled. In addition, a particularly advantageous neighbourhood definition and two new cooling rules – which constitute a fundamental part of this type of algorithm – are proposed: the Convection Plan and a Dynamic Rule, based respectively on the industrial applications of annealing and on the dynamic control of the jump probability. Both plans stand out for their efficiency and robustness.
A genetic algorithm was also created, for which versions of the conventional genetic operators (generation, selection, crossover, mutation and immigration) that exploit the advantages of the codification employed have been developed. In addition, a new mechanism that increases efficiency through an original imitation of the asexual reproduction processes of some insects, named parthenogenesis, is introduced.
Three hybrids that combine the previous algorithms have been designed with the aim of improving their behaviour:
A sequential family hybrid, characterized by the application of simulated annealing to a set of solutions selected as the most promising ones of the previous stage.
A hybrid named non-Darwinian because it develops the idea that mutations are not actually random but advance along improvement directions.
A compound algorithm that combines the non-Darwinian and the original versions sequentially, improving on the efficiency of all the other algorithms developed.
Furthermore, an additional generalization has been included that allows working with random times through the definition of a consistent distribution function, thus making it possible to solve a stochastic jobshop.
The possibility of failure of any of the machines is also considered, allowing robust solutions to be obtained.
The Theory of Constraints is applied to the solution to obtain supporting data for management decision making. The tool designed identifies bottlenecks in the sequence of tasks to be performed and analyses the temporal evolution of the work in process.
Finally, the quality of the solution obtained is evaluated by comparing it with a bound obtained through a proposed heuristic and with a value calculated from fitting a non-parametric distribution (another original contribution of this project) to the results of the evolution of the objective function.
Summary
Classical optimization methods are not suitable for solving NP-complete scheduling problems, for which they often require unacceptable computation times. In these cases, it is advisable to resort to heuristic techniques, which sacrifice global optimality in order to provide very satisfactory solutions in affordable times.
Jobshop scheduling is one of the most interesting manifestations of this fact, not
only because of the diversity of situations where it arises but also because of its
complexity, which has motivated extensive research. The studied case is unusually general, covering the majority of the possible complexities of deterministic models. The shop is flexible, which implies that each operation can be performed on several different machines, of course with dissimilar processing times. Sequence-dependent set-up times, as well as transportation times that can depend on the batch size or on the particular job, have been considered.
The objective of this project is to solve the jobshop problem in this general version by developing new algorithms that are capable of solving the problem in computation times far below those required by classical methods.
With the sole objective of comparing the performance of these algorithms with that of the traditional approaches, a linear programming model was built. Nevertheless, the core objective was the development and implementation of the metaheuristics.
With this aim, a codification system based on job lists was implemented. The fundamental idea turned out not to be original (it is suggested but not used in [MESGHOUNI 04], whose author does not seem to be aware of the advantages of this codification). Nevertheless, both its implementation and its generalization to a double list, which enables its application to the flexible version of the problem, are an innovation. This system shows the considerable advantage of encoding only feasible solutions, which results in simpler and more efficient algorithms.
A Simulated Annealing algorithm was developed which differs from the conventional implementations of this paradigm in its fundamental structure, innovatively composed of two steps: each iteration first performs a search for better solutions in the neighbourhood of the latest accepted solution. Only if this search does
not return any satisfactory result is the possibility of thermodynamic jumps – the basis of Simulated Annealing – enabled. In addition, a particularly advantageous neighbourhood definition is proposed. Two innovative cooling plans – which are essential to this kind of algorithm – are also introduced. Named the Convection Plan and the Dynamic Rule, they are based respectively on the industrial applications of annealing and on dynamically controlling jump probabilities. Both have shown remarkable efficiency and robustness.
A Genetic Algorithm was also created, for which new versions of the conventional genetic operators (generation, selection, crossover, mutation and immigration) were developed with the aim of taking advantage of the codification system. In addition, a new efficiency-enhancing mechanism, inspired by the asexual reproduction processes of some social insects and named parthenogenesis, is introduced.
Three hybrids that combine the previous algorithms were designed with the objective of improving their properties:
A sequential hybrid, which has the peculiarity of applying Simulated Annealing
to a family of solutions that are selected as the most promising ones of the
Genetic step of the algorithm.
A non-Darwinian version, which assumes that mutations are not random changes but are directed towards improvement.
A compound algorithm, which combines the non-Darwinian and the original genetic versions sequentially, achieving the best performance among the developed algorithms.
Furthermore, an additional generalization enables the algorithms to work with random time values drawn from a consistent distribution function, therefore solving a Stochastic Jobshop.
Machine failure is considered too, enabling the models to achieve robust solutions.
The Theory of Constraints is applied to the solutions to analyse the shop
configuration and support management decisions. The developed tool identifies
bottlenecks in the sequence and analyses the temporal evolution of work in process.
Finally, the goodness of the provided solutions is evaluated by comparing them to a bound found through a proposed heuristic and to a value that is obtained from a non-
parametric distribution fit to the algorithm performance data (which constitutes another original contribution of this project).
Index
1 MOTIVATION ...................................................................................................................................... 2
1.1 Scheduling or how to avoid missing Saturday’s golf game......................... 2
1.2 The Job Shop Problem (JSP) .............................................................................. 4
1.2.1 Introducing the problem 4
1.2.2 JSP Taxonomy 5
1.3 A little history. A complex problem............................................................ 6
1.4 A Futile Search. Review of the Classical Approaches to the JSP............... 10
1.4.1 Branch & Bound 10
1.4.2 Dynamic Programming 10
1.4.3 Trapped in a Dead End? Project Aims 11
2 QUOD NATURA NON SUNT TURPIA. HEURISTIC METHODS FROM A
HOLISTIC POINT OF VIEW............................................................................................................ 13
2.1 Nihil novi sub sole. Basic mechanisms underlying heuristic optimization
methods. ............................................................................................................ 14
2.1.1 Global exploration 15
2.1.2 Combination 15
2.1.3 Selection 15
2.1.4 Neighbour search 15
2.2 Mechanisms of Evolution................................................................................ 16
2.2.1 Genetic Algorithms (G.A.) 16
2.2.1.1 Individual 16
2.2.1.2 Chromosome, gene and allele 17
2.2.1.3 Schema 17
2.2.1.4 Selection 18
2.2.1.5 Crossover 18
2.2.1.6 Mutation 19
2.2.1.7 The result 20
2.2.1.8 To Summarize 20
2.2.1.9 Range of possibilities 20
2.2.1.10 Genetic Algorithms and the Jobshop problem. Recent works 21
2.2.2 The brightness of a diamond. Simulated Annealing (S.A.) 21
2.2.2.1 A thermodynamical basis 22
2.2.2.2 Simulated Annealing and Markov Chains 24
2.2.2.3 The cooling plan 25
2.2.2.4 Simulated Annealing and the Jobshop problem. Recent works 25
2.2.3 Other strokes of inspiration 26
2.2.4 Elitist ants, geniuses, and growing up too fast 27
2.3 Breakdown of some optimization techniques.............................................. 27
2.3.1 Taboo search 28
2.3.2 Simulated Annealing 28
2.3.3 Genetic Algorithms 28
2.3.4 Ant colony optimization 28
2.3.5 Immune Systems Optimization 28
2.3.6 The four elements applied to other classical optimization methods 29
2.4 Conclusions ....................................................................................................... 29
3 JUST TO COMPARE. A LINEAR PROGRAMMING MODEL FOR THE FLEXIBLE
JOBSHOP PROBLEM......................................................................................................................... 31
3.1 A more specific definition of the problem .................................................... 31
3.1.1 Considerations 31
3.2 Formulation ....................................................................................................... 32
3.2.1 Parameters 32
3.2.2 Variables 32
3.2.3 Equations 33
3.3 Linearizing the model...................................................................................... 34
3.4 Numerical Results ............................................................................................ 35
3.5 Conclusions ....................................................................................................... 37
4 DESIGNING A CODIFICATION. IMPORTANCE OF THE BASIS. .................................... 39
4.1.1 The hat and the rabbit: making things appear and disappear 39
4.1.2 Simplifying the calculation 40
4.1.3 Operators design 41
4.1.4 Method selection 41
4.1.5 Conclusions 41
4.2 The codification used: two lists for a schedule ............................................ 41
4.2.1 Introduction: leaving formalities behind 41
4.3 The first idea is not the best ............................................................................ 42
4.4 Calling our magician. Introducing the codification used........................... 43
4.5 Our magician’s communication problems ................................................... 46
4.6 Conclusions ....................................................................................................... 47
5 DATA GENERATION. SOME PRACTICAL ISSUES................................................................ 49
5.1 Basic parameters ............................................................................................... 49
5.2 Time related parameters.................................................................................. 49
5.3 Complexities of a generalised jobshop.......................................................... 50
5.4 Generating random jobshop data .................................................................. 50
5.4.1 Processing time 50
5.4.2 Change, transportation and Start-up times 51
5.5 Making this structure useful to work with real data .................................. 51
6 BRIEF CONSIDERATIONS ABOUT JOBSHOP PROBLEM COMPLEXITY....................... 53
6.1 Introduction....................................................................................................... 53
6.2 Linear programming model ........................................................................... 53
6.3 Heuristic algorithm complexity ..................................................................... 54
6.4 Compared problem complexity. Conclusions ............................................. 55
7 A SIMULATED ANNEALING ALGORITHM APPLIED TO THE JOBSHOP
PROBLEM ............................................................................................................................................. 58
7.1 The basis. Looking for a good neighbour first ............................................. 58
7.2 Generating a starting point ............................................................................. 61
7.3 The importance of being neighbours............................................................. 61
7.3.1 Machine list minimum movement 62
7.3.2 Job list movement 63
7.3.3 Combining both movements 64
7.3.4 The key factor 64
7.3.5 Advantages of the neighbour creating system compared to classical alternatives 65
7.4 Too many doors to knock................................................................................ 67
7.5 Thermodynamical jumps. Choosing a cooling plan ................................... 69
7.5.1 Cooling plans state of the art 69
7.5.2 Industrial considerations. Interesting ideas about the causes of the success and failure of the described cooling plans 70
7.5.3 Fixing the plan defining parameters 72
7.5.4 Dynamic cooling plan. A different proposal 73
7.6 Computation time vs. Problem Size .............................................................. 75
7.7 Conclusions ....................................................................................................... 77
8 A GENETIC ALGORITHM FOR THE JOBSHOP PROBLEM ................................................. 80
8.1 Creating the first generation ........................................................................... 80
8.1.1 Originating individuals 80
8.1.2 Diversity is the key 81
8.2 Survival of the fittest. ....................................................................................... 83
8.2.1 Natural selection 83
8.2.2 Never taking a step back 84
8.3 Mating season ................................................................................................... 85
8.3.1 Finding a partner 85
8.3.2 Breeding process 85
8.3.2.1 Machine list breeding 85
8.3.2.2 Job list breeding 86
8.4 Exploration enhancing mechanisms.............................................................. 88
8.4.1 Mutation 88
8.4.1.1 Mutation mechanism 88
8.4.1.2 Mutation probability 91
8.4.2 Immigration 92
8.4.2.1 Immigration rate 92
8.4.2.2 Immigration mechanism 94
8.5 Exploitation enhancing mechanisms ............................................................. 94
8.5.1 Introducing a new concept. Asexual reproduction 94
8.5.1.1 Social insects and evolution 94
8.5.1.2 Introducing an asexual system 95
8.6 Computation time vs. Problem size............................................................... 97
8.7 Conclusions ....................................................................................................... 99
9 TURNING THE MIXER ON. HYBRIDS...................................................................................... 101
9.1 First one, then another ................................................................................... 101
9.1.1 When is it time to make the switch? 101
9.1.2 A family of solutions 102
9.2 Non-darwinian G.A. ...................................................................................... 105
9.3 Conclusions ..................................................................................................... 108
10 ANTICIPATING MACHINE FAILURE. AN APPROACH TO THE ROBUST
JOBSHOP ............................................................................................................................................ 110
10.1 Introduction.................................................................................................... 110
10.2 Not so many combinations .......................................................................... 110
10.3 Concentrating on what is relevant.............................................................. 112
10.4 Considering a real problem ......................................................................... 114
10.5 A simple example.......................................................................................... 114
11 INTRODUCING RANDOM TIME VALUES. AN APPROACH TO THE
STOCHASTIC JOBSHOP PROBLEM .......................................................................................... 118
11.1 Randomness inside the Jobshop ................................................................. 118
11.2 Adapting the algorithm................................................................................ 121
12 FURTHER SUGGESTIONS TO THE JOBSHOP RESPONSIBLE. APPLYING T.O.C. .... 123
12.1 Introduction to the Theory of Constraints................................................. 123
12.2 Bottleneck concept applied to Jobshop production ................................. 125
12.2.1 The main difference 125
12.2.2 Traces of a bottleneck 125
12.2.3 T.O.C. Study 125
13 CAN YOU POSSIBLY KNOW HOW FAR YOU ARE FROM YOUR GOAL, IF YOU
DO NOT EVEN KNOW WHERE IT IS? ...................................................................................... 131
13.1 An obvious bound. True relaxation............................................................ 132
13.2 Can you estimate it?...................................................................................... 132
13.2.1 Introducing the problems of the data to fit 132
13.2.2 No parameters? A different kind of density distribution 135
13.2.3 Applying non-parametric density function estimation to the data provided by the algorithms 137
13.3 Conclusions .................................................................................................... 138
14 NEXT STEPS....................................................................................................................................... 140
REFERENCES........................................................................................................................................... 143
LIST OF ABBREVIATIONS USED..................................................................................................... 148
A GENETIC EXPRESSION MODELS. AN INTRODUCTION.................................................. 151
B ARTICLE PRESENTED TO C.I.O. 2006 ....................................................................................... 154
C DATA FOR THE LP EXAMPLE ..................................................................................................... 170
Index of Figures
Figure 1. Example of disjunctive graph .......... 5
Figure 2. Computational complexity of the Travelling Salesman Problem .......... 9
Figure 3. Example Gantt diagram .......... 45
Figure 4. A Gantt diagram is understandable by mere inspection .......... 47
Figure 5. LP model complexity .......... 54
Figure 6. Heuristic model complexity .......... 55
Figure 7. Compared complexity .......... 56
Figure 8. Representation of a certain solution .......... 62
Figure 9. The same solution after a machine list movement .......... 62
Figure 10. Initial solution .......... 63
Figure 11. The same solution after a job list movement .......... 64
Figure 12. Neighbourhood size in linear scale .......... 68
Figure 13. Neighbourhood size in logarithmic scale .......... 68
Figure 14. Temperature evolution followed by natural convection processes .......... 71
Figure 15. Comparison of the first order transitory and the geometrical rule .......... 72
Figure 16. Algorithm performance with different cooling plans .......... 74
Figure 17. Temperature evolution with different cooling plans .......... 74
Figure 18. Computation time/Iteration vs. Problem size .......... 76
Figure 19. Curve fitting to Computation time per iteration vs. Problem size .......... 77
Figure 20. Computation time vs. Population size .......... 82
Figure 21. Performance vs. Population size .......... 83
Figure 22. Initial solution .......... 89
Figure 23. Solution after a machine list mutation .......... 89
Figure 24. Initial solution .......... 90
Figure 25. Solution after a job list mutation .......... 91
Figure 26. Performance for several mutation rates .......... 92
Figure 27. Performance for several immigration rates .......... 93
Figure 28. Computation time vs. immigration rate .......... 93
Figure 29. Performance for several parthenogenesis rates .......... 96
Figure 30. Computation time vs. Parthenogenesis rate .......... 97
Figure 31. Computation time vs. Problem size .......... 97
Figure 32. Curve fitting to Computation time vs. Problem Size .......... 98
Figure 33. Hybrid performance with 4-member family .......... 103
Figure 34. Hybrid performance with 6-member family .......... 103
Figure 35. Hybrid performance with 12-member family .......... 104
Figure 36. Performance comparison between the first hybrid and the original G.A. .......... 104
Figure 37. Non-darwinian mutation algorithm compared to the conventional G.A. .......... 106
Figure 38. Compared performance between the non-darwinian mutation algorithm and the original G.A. .......... 106
Figure 39. Performance comparison between the conventional G.A., the non-darwinian hybrid and the compound version .......... 107
Figure 40. Performance comparison between the original, the non-darwinian and the compound version .......... 107
Figure 41. Number of possible failures vs. problem size .......... 111
Figure 42. Pareto distribution .......... 113
Figure 43. Resulting solution .......... 115
Figure 44. Resulting schedule after the considered failure .......... 116
Figure 45. Random component from a Weibull distribution .......... 120
Figure 46. Resulting distribution in a specific example .......... 120
Figure 47. Gantt diagram for the optimal schedule obtained .......... 126
Figure 48. W.I.P. evolution for that solution .......... 127
Figure 49. Gantt diagram for the solution .......... 127
Figure 50. Corresponding W.I.P. evolution .......... 128
Figure 51. Example of density function .......... 133
Figure 52. Normal fit .......... 133
Figure 53. Different widely used distributions applied to the data .......... 134
Figure 54. Examples of bins with different sizes .......... 135
Figure 55. Example of non-parametric fit .......... 137
Index of Tables
Table 1. Example of single-point crossover .......... 19
Table 2. Example of two-point crossover .......... 19
Table 3. Numerical values for processing time in the example .......... 36
Table 4. Optimal schedule for the example .......... 36
Table 5. Optimal schedule for the second example .......... 37
Table 6. Example of schedule oriented to machines .......... 42
Table 7. Circular reference in a schedule oriented to machines .......... 43
Table 8. Table specifying the machines that will perform each operation .......... 44
Table 9. Example of codification based on machine schedules .......... 66
Table 10. Neighbour solution .......... 67
Table 11. Results of the curve fitting to computation time per iteration .......... 76
Table 12. Parent 1 .......... 85
Table 13. Parent 2 .......... 86
Table 14. Child 1 .......... 86
Table 15. Child 2 .......... 86
Table 16. Results of the different fits .......... 98
Table 17. Example of performance possibilities for each operation .......... 115
1 The Jobshop Problem. Motivation
1 Motivation
1.1 Scheduling or how to avoid missing Saturday’s golf game
Johnson, a pioneer in studying
scheduling techniques, was said
to have started his research on
the two-machine flowshop (the
problem of scheduling different
jobs that have to be processed by
two different machines in the
same order) after encountering
an everyday problem. He
wished to play golf one Saturday but, as a single man, had to do a week’s laundry
before leaving for his tee time. Due to wet weather, the different batches to be washed
had to go through the washing machine first and then through the dryer. This forms a
two-machine flowshop. His goal was to finish the laundry chores as soon as possible,
so he wondered, ‘How can I order these batches so that the time I spend washing and
drying will be minimized?’ According to the story, this was the first time in which the
minimum makespan criterion was formally stated (nevertheless, the concept is as old as mankind). The anecdote concludes that he became so intrigued by this
problem that he forgot about both the laundry and the golf match and spent the rest of
the weekend working on the now famous Johnson algorithm.
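As an aside, the rule the anecdote refers to is simple enough to sketch in a few lines. The following is a minimal illustration of Johnson's rule for the two-machine flowshop, with made-up batch names and times (illustrative assumptions, not data from this project):

```python
def johnson_order(batches):
    """batches: list of (name, time_on_machine_1, time_on_machine_2).
    Returns a processing order minimizing makespan on a 2-machine flowshop."""
    front, back = [], []
    # Take batches in order of their smallest processing time on either machine.
    for name, p1, p2 in sorted(batches, key=lambda b: min(b[1], b[2])):
        if p1 <= p2:
            front.append(name)    # quick on machine 1: schedule as early as possible
        else:
            back.insert(0, name)  # quick on machine 2: schedule as late as possible
    return front + back

# Johnson's laundry: (wash, dry) times in minutes for each batch.
batches = [("whites", 30, 40), ("darks", 50, 20), ("towels", 15, 60)]
print(johnson_order(batches))     # -> ['towels', 'whites', 'darks']
```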
Scheduling is one of the vital problems in both the industrial environment and
everyday living. It is defined as the process of allocating one or more scarce resources
to different activities over a period of time. One or more criteria are established as the
goal(s) for the schedule. For instance, one may choose to minimize makespan (the time
by which all the work will be finished, as Johnson wanted to do), or to minimize the
maximum delay (if we consider it will be penalized in some way). There is a wide
range of possible optimisation criteria that will be explored in the next sections.
A quick tour around some of the different functional areas of a company will now show the presence of scheduling in each of them:
The production department needs to know the actual sequence of products to be manufactured. Thus, its staff has to be aware of the schedule resulting from the Master Production Plan.
The accounting department takes into account this production schedule and all the due dates to predict the revenues and costs the company will incur, as well as to update financial reports and make cash-flow estimates.
The marketing department can determine, for example, whether punctuality is one of the strong or weak points of the company's organization. Knowing the processing time of each product at each moment can allow it to adjust the delivery dates offered to clients.
The purchasing department can make a more accurate forecast of the material requirements and so reduce the cost of raw-material inventories.
As stated in [FRENCH 1990], a compilation of the different strategies to solve
scheduling problems, ‘Any manufacturing firm not engaged in mass production of a single
item will have scheduling problems similar to that of the job shop. Each product will have its
own route through the various work areas and machines of the factory. In the clothing industry
different styles have different requirements in cutting, sewing, pressing and packaging. In steel
mills each size of rod or girder passes through the set of rollers in its own particular order and
with its own particular temperature and pressure settings.[…] The objective in scheduling will
vary from firm to firm and often from day to day. Perhaps the aim would be to maintain an
equal level of activity in all departments so that expensive machines are seldom idle. Perhaps it
would be to achieve certain contractual target dates or simply to finish all the work as soon as
possible.’
Scheduling seems to be a problem that concerns every area of a company and has
been subject to extensive research for over fifty years. Although the aim of this project
will be limited to manufacturing scheduling (and will thus drop such interesting topics
as inventory management) the concepts used can easily be generalised to any situation
with several tasks to be accomplished under limited resources. Adopting
manufacturing terminology, a job consists of one or more activities, and a machine is a
resource that can perform at most one activity at a time.
1.2 The Job Shop Problem (JSP)
1.2.1 Introducing the problem
It was formulated for the first time by Coffman in 1976 [COFFMAN 1976], although
more constrained situations had already been the object of research. Informally, the
problem could be described as follows. We are given a set of jobs and a set of
machines. Each job consists of a chain of operations, each one of which needs to be
processed during a certain amount of time on a given machine. This duration can be
deterministic, in which case we face a Deterministic JSP, or it can follow a given probability distribution, in which case we face a Stochastic JSP. One of the most generally
accepted assumptions is that each machine can only process one operation at a time.
This is not necessarily the case. Nevertheless, a JSP with multitask machines remains
practically unexplored. This constraint will thus be assumed in this project.
Sometimes it is useful to represent the problem by Roy and Sussman’s ‘Disjunctive
graph model’ (1964). The problem is determined by three different sets: $n$, $m$ and $o$. These contain the $n$ jobs, $m$ machines and $o$ operations, respectively, defined by the problem. The machine on which each operation must be performed is denoted $M_o$. The precedence constraints for the operations of the same job are represented as $u \to v$. The graph $G(O, A, E)$ is described as follows:

$O = o \cup \{0, N+1\}$, where $0$ and $N+1$ are dummy operations with null processing time marking the start and the end of the schedule. The weight of each vertex is given by the processing time $t_{om}$.
$$A = \{(u,v) \mid u, v \in o,\; u \to v\} \;\cup\; \{(0,w) \mid w \in o,\; \nexists\, v \in o : v \to w\} \;\cup\; \{(v, N+1) \mid v \in o,\; \nexists\, w \in o : v \to w\} \qquad (1\text{-}1)$$
This means that $A$ contains arcs that connect all the operations of the same job, as well as arcs from the dummy operation ‘start’ to the first operation of each job and from the last operation of each job to the dummy operation ‘end’.
$$E = \{\{o, v\} \mid M_o = M_v\} \qquad (1\text{-}2)$$
Thus, edges in E connect operations to be processed by the same machine.
Figure 1. Example of disjunctive graph
Figure 1 shows the disjunctive graph for a jobshop with 3 machines and 3 jobs.
Operations 1, 5 and 9 are processed by machine 1, operations 2, 4 and 8 by machine 2, and
operations 3, 6 and 7 by machine 3. 0 and 10 are the fictitious initial and final
operations. Dotted lines represent edges in E and arrows denote arcs in A .
If the aim of the jobshop is minimizing makespan, the objective value of each
feasible solution (given by the combination of the two different ways in which each
disjunction – edge in $E$ – can be settled) is the length of the longest path in the graph.
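To make the graph view concrete, the toy sketch below (an illustrative assumption, not this project's implementation) evaluates one feasible solution: with every disjunction settled, the graph is acyclic and the makespan is the longest weighted path from the dummy start to the dummy end.

```python
import functools

def makespan(arcs, weights, end):
    """arcs: node -> list of successors (conjunctive arcs plus one chosen
    orientation of every disjunctive edge); weights: processing time of each
    vertex. Returns the length of the longest path from node 0 to `end`."""
    @functools.lru_cache(maxsize=None)
    def longest_to(v):
        preds = [u for u, succs in arcs.items() if v in succs]
        if not preds:                      # only the dummy start has no preds
            return 0
        return max(longest_to(u) + weights[u] for u in preds)
    return longest_to(end)

# Two 2-operation jobs (1->2 and 4->5) sharing a machine on operations 1 and 5;
# the disjunction is oriented 1 -> 5. Nodes 0 and 3 are the dummy start/end.
arcs = {0: [1, 4], 1: [2, 5], 2: [3], 4: [5], 5: [3], 3: []}
weights = {0: 0, 1: 3, 2: 2, 4: 4, 5: 1, 3: 0}
print(makespan(arcs, weights, end=3))      # -> 5
```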
1.2.2 JSP Taxonomy
As presented before, the nature of the processing time data splits the JSP into its deterministic and stochastic versions.
Production systems can be classified according to machine environment. When each
job requires only one operation, the system is known as single-stage, and it can involve either a single machine or $m$ machines operating in parallel. In the latter case, each
machine has the same function. We can consider three different classes of parallel
machines. In the case of identical parallel machines each processing time is
independent of the machine performing the job. Uniform parallel machines operate at
different speeds but are otherwise identical. When working with unrelated parallel machines, the processing times depend not only on the machine selected but also on the particular job being processed.
There are three main types of multi-stage systems. In a flowshop with s stages,
each job is processed through the stages 1 to s in that pre-determined order, which is
the same for all the jobs. In an openshop, the processing of each job also goes once
through each stage, but the sequence of stages through which a job must pass can
differ between jobs and forms part of the decision process. In a jobshop, the case that
will be studied in this project, each job has a prescribed routing through the stages, but
the routing may differ from job to job. A multiprocessor variant of multi-stage systems
can also be introduced, where each stage comprises several parallel machines.
Many other specific formulations appear in research. Some representative cases will
now be introduced.
Robust jobshops were developed in an attempt to take into account that the assumption that all the machines will always be available to process the scheduled jobs is just an idealization. As stated in [JENSEN 1999], ‘A quality solution which can be
modified easily according to changes in the environment may be more valuable than an optimal
solution that does not allow easy modifications’.
When both serial and parallel operations appear, we will face an assembly jobshop.
Dispatching rules for this kind of models were researched in [REEJA 2000].
Additional constraints could also be considered. For example, a storage limitation
will lead to buffer-constrained jobshops [HOLTHAUS 2002].
Finally, flexible jobshops are multi-stage, multiprocessor production systems, that
is, jobshops with several different machines available to perform each manufacturing
stage. This will be the particular situation surveyed in this project.
1.3 A little history. A complex problem
Much of the early work on scheduling had the aim of finding rules that lead to
optimal schedules for some simple jobshop models. One of the first examples was
Johnson’s paper on the two machine flowshop [JOHNSON 1954]. According to
Conway, Maxwell and Miller, ‘This paper is important not only for its own content but also for its influence. […] Johnson’s accomplishment is not so much the proposition of an algorithm as it is the offering of a proof that the obvious algorithm is in fact optimal’. Subsequent works studied the problem of scheduling a single machine to minimize the maximum lateness of the jobs, for which an optimal solution is obtained by sequencing the jobs according to
the earliest due date (EDD) rule of Jackson [JACKSON 1955], and that of scheduling a single machine to minimize the sum of weighted completion times of the jobs, for which an optimal solution is obtained by sequencing the jobs according to the shortest weighted processing time (SWPT) rule of Smith [SMITH 1956]. These simple problems
do not often appear in practice. However, the analysis of these simplistic models
provides valuable insights that enable more complex systems to be scheduled.
Furthermore, the performance of many production systems is often determined by the
quality of the schedules for a bottleneck machine. In these cases, simple models can be
useful for scheduling such a machine.
Since then, papers have addressed several topics. For an extensive review, this project refers to [DUDEK 1992], which gives an exhaustive and well-organised overview of the research conducted from the 1950s to the early 1990s.
When trying to get a general perspective of all that work, it appears that the quest for optimality in minimizing makespan for the $n$-job $m$-machine flowshop emerged as a natural continuation of Johnson's paper. The problem already raised interest in the 1950s with Akers' graphical approach [AKERS 1956], and the search for algorithms for
such a problem began in the early 1960’s and resulted in the work of Dudek and
Teuton (1964) [DUDEK 1964] and Smith and Dudek (1967) [SMITH 1967], who used a
combinatorial optimization approach. Meanwhile, the branch-and-bound analysts
published applications of that approach for the flowshop problem. The first papers
were by Ignall (1965) [IGNALI 1965] and Lomnicki (1965) [LOMNICKI 1965].
Additional papers on optimization were published; however, no clearly original,
innovative and effective results appeared.
This lack of significant results in optimization was not due to a lack of effort but to a
lack of success. It was not until the application of the computational complexity theory
that the real reason for this lack of success was unveiled.
According to this theory, there are two different kinds of problems depending on
the amount of resources required during computation to solve them. The most
common resources considered are time (how many steps it takes to solve them) and space (how much memory is necessary). The class P consists of all those
decision problems that can be solved on a deterministic sequential machine in an
amount of time that is polynomial in the size of the input. The class NP is the set of
decision problems that can be solved in polynomial time by a non-deterministic
machine and verified in polynomial time on a deterministic machine. These kinds of
machines, known as Turing machines, are automatons composed of some tape to be
used as memory -in which symbols from some finite alphabet can be stored-, a head -to
read and write on the tape-, a state register and an action table –that tells the machine
what symbol to write, how to move the head and the following state-. In deterministic
Turing machines the next state is determined by the current one, while in non-
deterministic ones take also into account the symbol written on the tape. In a very
informal way, the P NP≠ problem can be formulated as ‘If possible solutions can be
verified quickly, can they also be calculated quickly?’ An example of this could be
determining whether a number is a prime number. Just checking that it has a certain
non-trivial factor is much easier than figuring it out.
After years of search, no polynomial-time algorithm has been found for any NP-complete problem. Nevertheless, there is no definite proof that $P \neq NP$, and researchers do not have a common view on the subject. In a 2002 poll of scientists that asked whether $P = NP$, 61% believed the answer was no, 9% believed the answer was yes, 22% were unsure, and 8% believed the question may be independent of the currently accepted axioms, and so impossible to prove or disprove. A $1,000,000 prize has been offered for a correct solution.
NP-complete (Non-deterministic Polynomial time Complete) problems are the most
intractable of all combinatorial optimization problems. The class NP-Complete is
defined as those problems which are NP-hard and in the class NP. This concept was
first introduced by Cook in 1971. If a polynomial-time solution to a problem would make it possible to solve all NP problems in polynomial time, then this problem is said to be NP-hard: such a solution would be a kind of master key for all NP problems. No such algorithm is known to exist.
Perhaps one of the most well-known problems of this kind is the Travelling
Salesman Problem. A figure showing the computational complexity of this problem is included to illustrate these ideas (Figure 2).
[Figure: log–log plot titled ‘Travelling Salesman Problem’; x-axis: number of cities to visit (10^0 to 10^2); y-axis: possible routes (10^0 to 10^200).]
Figure 2. Computational complexity of the Travelling Salesman Problem
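The explosion the figure shows is easy to reproduce: a symmetric travelling salesman instance over $n$ cities has $(n-1)!/2$ distinct tours (fix the starting city and halve for travel direction). A small illustrative snippet (the city counts are arbitrary):

```python
import math

# Number of distinct tours of a symmetric TSP: (n - 1)!/2.
for n in (5, 10, 25, 50, 100):
    tours = math.factorial(n - 1) // 2
    print(f"{n:>3} cities: ~10^{len(str(tours)) - 1} possible routes")
```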
The number of possible solutions to evaluate in the case of the JSP is presented in section 6, given that this number depends on the solution definition used. Two different definitions are used in this project – the one of the linear programming model and the one proposed for the heuristics – so their implications on problem complexity will be analysed once they have been introduced.
One of the first works that applied computational complexity theory to the JSP, concluding that it is NP-complete, was [GONZALEZ, 1977]. In fact, the JSP seems to
be one of the hardest NP-Complete problems.
Dudek, Panwalkar and Smith made this heart-breaking statement in one of their
papers years after that search [DUDEK 1992]: ‘The consolation for those seeking optimizing
algorithms following the NP-completeness shock was the realization that the lack of success in
optimizing the general flowshop problem was the impossibility of the task rather than the
inadequacy of the effort. It is unfortunate that the theory on NP-completeness did not come
several years earlier, before so much effort was expended in the futile search for efficient
optimizing algorithms. As members of the group of researchers who searched for optimizing
algorithms, we can report that at least it was challenging and interesting even though it was
not as productive as had been hoped.’
1.4 A Futile Search. Review of the Classical Approaches to the JSP
This section will present an overview of the two main classical approaches to
solving scheduling problems, based on enumerative algorithms: Branch & Bound and
Dynamic Programming.
1.4.1 Branch & Bound
The solving process can be conveniently represented by a search tree. Each node of
the tree symbolizes a subset of feasible solutions. How the feasible solutions at a node
are partitioned into subsets, each corresponding to a descendant node of the search
tree, is specified by a branching rule.
For instance, in the case of single-machine scheduling each solution is defined as a
sequence of jobs. For such a problem, commonly used branching rules involve forward
sequencing and backward sequencing. In forward sequencing, each node corresponds
to sequences in which the jobs in the first positions are fixed. The first branching
creates nodes corresponding to every job that can appear in the first position, the
second branching creates nodes that fix the job in the second position, and so on. A
backward sequence is built analogously from the end. That is the ‘branch’ part of the algorithm.
The problem has an objective function to be minimized. A lower bounding scheme
associates a lower bound with each node of the search tree. The aim is to eliminate any
node for which the lower bound is worse than or equal to the value of the best known
feasible solution, since it cannot lead to a better solution. Lower bounds are usually
obtained by solving a relaxed problem. The most widely used methods are Lagrangean
relaxation and solving the linear programming relaxation of an integer programming
formulation. The use of dominance rules can lead to eliminating nodes before
calculating any lower bounds. The order in which the nodes of the tree are selected for
branching also has to be defined. One of the most widely used strategies is to branch from the
node with the smallest lower bound.
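The following minimal sketch illustrates the scheme on a single-machine instance minimizing total weighted tardiness, with forward-sequencing branching and a deliberately simple lower bound; all data and names are illustrative assumptions, not material from this project:

```python
jobs = {  # name: (processing time, due date, weight) -- illustrative data
    "A": (3, 4, 2), "B": (2, 3, 1), "C": (4, 6, 3),
}
best = {"value": float("inf"), "seq": None}

def weighted_tardiness(seq):
    """Total weighted tardiness of a (possibly partial) job sequence."""
    t = total = 0
    for j in seq:
        p, d, w = jobs[j]
        t += p
        total += w * max(0, t - d)
    return total

def branch(fixed, remaining):
    # Lower bound: the tardiness already incurred by the fixed prefix can
    # only grow, so it bounds every completion of this node from below.
    lb = weighted_tardiness(fixed)
    if lb >= best["value"]:
        return                              # prune this node
    if not remaining:                       # a leaf: a complete sequence
        best["value"], best["seq"] = lb, fixed
        return
    for j in sorted(remaining):             # one child per candidate next job
        branch(fixed + [j], remaining - {j})

branch([], set(jobs))
print(best["seq"], best["value"])           # -> ['A', 'C', 'B'] 9
```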
1.4.2 Dynamic Programming
In dynamic programming each partial solution is represented by a state with a
certain assigned value of the objective function. When two or more partial solutions
achieve the same state, the one with the best value is retained, while the others are
dismissed. The states have to be defined in such a way that this elimination is valid.
For instance, consider the single machine sequencing problem again. Partial solutions
would be initial partial sequences. A state can be represented by the jobs that are
sequenced and the time that the machine becomes available to process the jobs that
remain unprocessed. A decision transforms a partial solution from one state to another.
All possible decisions are considered. Applying this method recursively determines the
optimal solution value, and backtracking is then used to find the decisions that define
that solution.
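A sketch of this recursion on the same kind of illustrative single-machine instance is given below; since on one machine the completion time of a set of jobs is just the sum of their processing times, a state reduces to the subset of jobs already sequenced (all data are assumptions):

```python
from itertools import combinations

jobs = {"A": (3, 4, 2), "B": (2, 3, 1), "C": (4, 6, 3)}  # (p, due, weight)

best = {frozenset(): 0}                # best objective value for each state
for size in range(1, len(jobs) + 1):
    for subset in combinations(jobs, size):
        state = frozenset(subset)
        t = sum(jobs[j][0] for j in state)   # time the machine becomes free
        # Try every job of the subset as the one scheduled last; partial
        # solutions reaching the same state are merged, keeping the best.
        best[state] = min(
            best[state - {j}] + jobs[j][2] * max(0, t - jobs[j][1])
            for j in state
        )

print(best[frozenset(jobs)])   # optimal value (9); backtrack to get the order
```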
1.4.3 Trapped in a Dead End? Project Aims
As stated above, it is believed that such algorithms cannot efficiently solve the JSP (‘efficiently’ being commonly understood as ‘in polynomial time’). Fortunately, that was not the end of the research on scheduling. It just meant a change of view.
New interesting approaches to combinatorial problems were developed. These
methods will be reviewed in the next chapter.
This project is set in motion from all the research efforts that started with Johnson’s
work and that this chapter has tried to capture. It is born with the ambition of taking
advantage of all the already developed heuristic methods and combining them to
create a more powerful new hybrid one. Afterwards, the performance of this new
algorithm will be tested and compared with that of the currently used ones. If possible, its
effectiveness will be theoretically studied.
2 Quod natura non sunt turpia. Heuristic methods from a holistic point of view
2 Quod natura non sunt turpia (‘What is natural cannot be bad’, Latin proverb). Heuristic methods from a holistic point of view
In the author’s opinion,
there have been only two main
sources of inspiration to
mankind. The first one is man’s
own emotions. Feelings have
generated every moving
masterpiece from the beginning
of time. Nature is the second
one. A beautifully shaped conch shell can remind us of Dürer's spiral and the golden ratio, or just suggest the idea of including it in a still life. The tireless work of ants searching for food can inspire both one of Samaniego's fables and the ant colony optimization techniques that will be presented later.
When we have run out of ideas, turning to nature can give us some perspective on how to solve immensely complex optimization problems in a beautifully efficient way.
There is a qualitative difference between the classical approach to optimization and
these nature-inspired methods. While the first one searches for the global optimum point, the latter look for good solutions, with no guarantee of optimality but with affordable computation times. This is the difference between a heuristic, conceived as a series of steps that lead to the optimum, and a metaheuristic method like the ones that will be used in this project. The concepts of efficiency and effectiveness underlie the difference between them: a linear programming model of a general jobshop can be effectively solved by the dual simplex method – i.e., by following the algorithm's steps it is certain that the global optimum will eventually be reached – but that does not mean that it can do so efficiently: the time spent in the calculation can be longer than what we can afford. Therefore, this approach renounces guaranteed optimality in the solution in order
to be able to provide a good solution in a reasonable amount of time. There is an
additional difference: nature is not deterministic. A probabilistic component is
introduced. As will be presented later, this helps prevent the algorithm from getting
stuck in local optima. The next sections will describe some of these methods: the ones
that will be used in this project and some other interesting ones of amazing creativity.
As stated at the beginning of this chapter, nature is one of the main sources of inspiration for mankind. Fascinating in its perfection, constantly evolving, and always one step ahead of every scientist, it has provided the most inquisitive ones with the most interesting tracks to follow.
This idea has set in motion a series of different heuristic models, which looked to
different natural optimizers and tried to capture their essence and translate it into optimization methods.
The intuition underlying this project is that all these diverse interpretations of
nature can be split into a discrete (and very limited) set of basic mechanisms. These
patterns will be introduced in the next section, and later on they will be identified in
the most well-known optimization techniques.
2.1 Nihil novi sub sole (‘Nothing new under the sun’, Latin proverb). Basic mechanisms underlying heuristic optimization methods
Continuing with proverbs, this last one remarks on the real difficulty of actual innovation. As stated above, there is nothing new in heuristic methods but a reinterpretation of natural patterns. Moreover, they all sound similar. This section explores the similarity between them, expressed as the basic parts they share. The
nomenclature is intended to be general, but the influence of some of the particular
methods is obvious in some cases.
2 ‘Nothing new under the sun’, Latin proverb.
2.1.1 Global exploration
This basic tool could be described as a completely random search. Different solutions, with neither connection between them nor a particular search direction, are proposed with no subsequent treatment.
2.1.2 Combination
Combination is an operator that receives some solutions and returns just one that is a combination of them. Note that the properties of the output will be a combination of those of the inputs only in some cases, depending on the convex nature of the problem. Moreover, the same operator applied to the same inputs is able to return different outputs, provided that some degree of randomness is introduced in the process.
As this operator is the most difficult one to apply in practice, a different metaheuristic paradigm, called ‘Genetic expression models’, originally created in this project, is introduced in the annexes.
2.1.3 Selection
Selection consists in choosing some solutions from a set and discarding the rest. The conditions upon which this selection is done are completely free and must, as in the previous cases, include some randomness in their criterion.
2.1.4 Neighbour search
The surroundings of a given point are inspected looking for better solutions. Although there can be some randomness when selecting the points to be analysed, this search involves a certain level of direction.
These four elements are the basis of all optimization techniques. After presenting
some of the most widely used metaheuristics, they will be identified to illustrate this
idea.
2.2 Mechanisms of Evolution
2.2.1 Genetic Algorithms (G.A.)
Biologists seem to live in constant surprise. It appears as if there were a living being adapted to every habitat on earth. From abyssal fish that bear immense pressures in an absolutely dark environment to the bacteria living in the sulphuric, extremely hot water of Yellowstone’s geysers, life has reached all possible scenarios.
All those creatures came from ancestors that had nothing to do with their present appearance. Evolution has led to a specialised living being for each possible ecological niche, one that is in some way optimal for that specific environment. These mechanisms can be explored with the aim of developing new optimization techniques. This brilliant idea was conceived by Holland [HOLLAND 1975] for solving parameter optimization problems. Genetic algorithms have proven to be robust in the sense that they provide good solutions on a wide range of problems. In addition, they can be easily modified with respect to the objective function and constraints.
Generation after generation, the individuals compete with each other. Survival of the fittest (Darwinian selection) is the main idea. Fitter members of the population are more likely (but just more likely, in a probabilistic sense) to mate with each other and have offspring in subsequent generations. This mating leads to a recombination of the genetic material of the best and often drives a sequence of generations that is successively fitter.
In addition to the mating mechanism, nature introduces infrequent and completely
random modifications that increase diversity among the population. This is known as
mutation.
To implement the biological metaphor, these ideas are materialised:
2.2.1.1 Individual
Each possible solution to the problem is considered an ‘individual’. It can be ‘normal’
or a ‘monster’ depending on whether it is feasible or not. It will be codified in a way
that is believed to be efficient for the given problem. The algorithm will work with the
codification and not with the solution. This is one more difference between G.A. and
the traditional optimization methods. There will be an initial population whose size is
usually calculated using heuristic rules. This enables the algorithm to work with
several parallel optimization lines and not just one, as it happens in the classical
approaches.
2.2.1.2 Chromosome, gene and allele
The chromosome is the definition of each individual. It is represented by a string. Each element in the string is called a ‘gene’ and is occupied by one of the symbols of a certain alphabet. Each symbol in this alphabet is called an ‘allele’. The size of a chromosome is the number of genes it contains.
It is necessary to define a useful codification to translate each solution into a string of symbols. Binary codification is frequently used in general problems because of its simplicity and the possibility of using already developed genetic operators. Although it has actually been used in some jobshop scheduling work, as in [NAKANO 1991] or [YAMADA 1992], this project discards this option due to the problems that arise when trying to implement crossover genetic operators. An appropriate codification for the problem will be created.
2.2.1.3 Schema
Given the initial alphabet Ω, we will consider an extended alphabet Ω′ constituted by the former plus a new element, the metasymbol ‘*’. It indicates that the position it occupies can hold any of the elements of Ω. All the strings formed from the extended alphabet receive the name of ‘schemas’.
The order of a schema is given by the number of fixed positions in the string, and its length is the difference between the positions of the last fixed character and the first one.
The number of individuals that belong to a given schema in each generation evolves according to its fitness relative to the remaining members of the population, its length and its order.
The so-called Schema Theorem is the only mathematically rigorous support for the
G.A. theory. It was formulated by Holland as:
$E[m(s,t+1)] \;\ge\; m(s,t)\cdot \frac{u(s,t)}{\bar{f}(t)}\cdot \left(1 - p_c\cdot \frac{d(s)}{l-1}\right)\cdot (1-p_m)^{o(s)}$   (2-1)

• $m(s,t)$ = number of instances of schema $s$ in the population at time $t$.
• $\bar{f}(t)$ = average fitness of the population at time $t$.
• $u(s,t)$ = average fitness of the instances of $s$ at time $t$.
• $p_c$ = probability of the single-point crossover operator.
• $p_m$ = probability of the mutation operator.
• $l$ = length of the bit strings.
• $o(s)$ = number of defined (non-*) bits in $s$.
• $d(s)$ = distance between the leftmost and rightmost defined bits in $s$.
And, informally, it can be expressed as:
‘Short, low-order, above-average fitness schemata receive exponentially increasing trials in
subsequent generations’
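As a quick numerical check with illustrative values: taking $u(s,t)/\bar{f}(t) = 1.2$, $p_c = 0.7$, $d(s) = 1$, $l = 8$, $p_m = 0.01$ and $o(s) = 3$, the bound gives $E[m(s,t+1)] \ge m(s,t)\cdot 1.2\cdot(1 - 0.7/7)\cdot(0.99)^3 \approx 1.05\, m(s,t)$: such a short, low-order, above-average schema is expected to gain instances from one generation to the next.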
2.2.1.4 Selection
The fittest members of the population will be selected to breed. There are two main methods to replicate this mechanism. The first one, known as the roulette wheel, is based on the simple fitness-proportional rule. This stipulates that the expected number of copies of each individual in the next generation is proportional to its fitness. More specifically, if the population has a size of N members, the selection process will be carried out by building a roulette wheel with N slots. Each slot corresponds to a member and its angular width is proportional to the fitness of that individual. The wheel is spun and fixed pointers select the members that will generate the next generation. In the second one, known as the tournament, a random permutation of the existing solutions is obtained. They are then split into a definite number of groups and the fittest member of each group is selected.
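Both selection mechanisms translate into a few lines of code. The following Python sketch is a minimal illustration, assuming a maximization problem; the function names are merely descriptive:

import random

def roulette_wheel(population, fitness, n_parents):
    # Each individual occupies a slot whose width is proportional
    # to its fitness; random.choices spins the wheel n_parents times.
    return random.choices(population, weights=fitness, k=n_parents)

def tournament(population, fitness, group_size):
    # A random permutation is split into groups; the fittest member
    # of each group is selected.
    order = random.sample(range(len(population)), len(population))
    winners = []
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        best = max(group, key=lambda idx: fitness[idx])
        winners.append(population[best])
    return winners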
2.2.1.5 Crossover
It could be defined as a random function that returns a ‘child’ chromosome out of
two ‘parent’ chromosomes by recombining in some way their genetic information.
When using a binary codification, it is easy to implement single or multiple point operators. For instance, we could have:

Example of single-point crossover between the 3rd and 4th genes:
Parent 1: 110|10011    Child 1: 11001101
Parent 2: 011|01101    Child 2: 01110011
Table 1. Example of single-point crossover

Example of two-point crossover between genes 3rd-4th and 6th-7th:
Parent 1: 110|110|11    Child 1: 11001111
Parent 2: 011|011|01    Child 2: 01111001
Table 2. Example of two-point crossover
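These operators can be coded directly. The sketch below is a minimal Python illustration for binary strings with random cut points; the final comment shows the fixed cut that reproduces Table 1:

import random

def one_point_crossover(p1, p2):
    # Cut both parents at the same random point and swap the tails.
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point_crossover(p1, p2):
    # Exchange the middle segment lying between the two cut points.
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

# With cut = 3, one_point_crossover('11010011', '01101101') returns
# ('11001101', '01110011'), exactly as in Table 1.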
Other operators have been developed, such as Partially Mapped Crossover (PMC), from Goldberg and Lingle, or the Reeves operator, which is widely used when working with numerical strings. More sophisticated ideas have also been researched: in [AGGARWAL 1996], for example, one of the children is constructed so as to have the best objective function value among the feasible set of children, while the other is selected with the aim of maintaining diversity.
2.2.1.6 Mutation
Mutation consists of introducing small random changes in some individuals with a very low probability. This reduces the risk of getting trapped in local optima.
There are variations of this mechanism that include immigration (adding individuals independent of the current generation) or inversion (reversing all the genes that lie between two randomly selected positions). A sketch of mutation and inversion follows.
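A minimal Python sketch of mutation and inversion, assuming a binary string codification:

import random

def mutate(chromosome, p_m=0.01):
    # Flip each binary gene independently with a very low probability p_m.
    return ''.join(str(1 - int(g)) if random.random() < p_m else g
                   for g in chromosome)

def inversion(chromosome):
    # Reverse the genes lying between two randomly selected positions.
    a, b = sorted(random.sample(range(len(chromosome) + 1), 2))
    return chromosome[:a] + chromosome[a:b][::-1] + chromosome[b:]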
2.2.1.7 The result
Fitter members and schemas spread while the worst solutions disappear, giving fitter individuals (i.e., solutions with a better value of the objective function) generation after generation.
2.2.1.8 To Summarize
The process could be expressed in pseudo code as follows.
Initialize population;
Generation = 0;
While (Termination_Criterion == false)
    Generation++;
    Selection(Population);
    Crossover(Population);
    Mutate(Population);
2.2.1.9 Range of possibilities
There is an enormous range of possible specific implementations of G.A. Every living being’s reproductive pattern could be studied and translated into an algorithm. Some successful implementations have actually come from this idea (like the ‘Deer’s Genetic Algorithm’ used in Jose Luis Gahete’s thesis [GAHETE 2004]). However, it should be noted that these patterns can be modified in ways that nature could never contemplate. For instance, the best individual(s) of each generation, the ‘genius(es)’, could be passed directly to the next generation and treated as immortal until they are beaten in fitness by other members of the population.
2.2.1.10 Genetic Algorithms and the Jobshop problem. Recent works
In [GONÇALVES 2002], a hybrid G.A. is presented, with the particularity that after a schedule is obtained a local search heuristic is applied to improve the solution. To prevent premature convergence, an immigration mechanism is included.
[MESGHOUNI 2004] introduced a parallel representation (which could be viewed as a multi-dimensional matrix where each row corresponds to a different machine) and developed genetic operators adapted to this codification.
[FAYAD 2003] performed some research on the ‘real world’ J.S.P. – that is, using fuzzy sets to model uncertain processing times and due dates, and using G.A. to solve it.
2.2.2 The brilliance of a diamond. Simulated Annealing (S.A.)
This time, the inspiration came from the metallurgical processes that underlie the thermal treatment that enhances mechanical resistance in steel. It consists in raising the metal’s temperature to the melting point and then allowing it to cool again. The properties of the solid obtained will depend greatly on the speed of this cooling. If it happens very fast (the tempering process), the crystals formed will contain a relatively high number of imperfections. On the contrary, if the process is slow enough, the mineral can shape itself into big, perfect crystals, the ones that achieve an energy minimum, with just the energy necessary to keep molecular cohesion, which can be as beautiful as a diamond.
This concept was introduced in statistical mechanics by Metropolis in 1953 [METROPOLIS 1953]. Thirty years later, Kirkpatrick [KIRKPATRICK 1983] and Cerny [CERNY 1985] used it as a general solution approach for discrete optimization problems. Quickly after these early works, this approach was applied to a variety of problems as diverse as VLSI (very-large-scale integrated circuit) design ([JEPSEN 1983], [KIRKPATRICK 1984], [VECCHI 1983], [ROWEN 1985]), pattern recognition ([GEMAN 1984], [HINTON 1984]) and code generation [EL GAMAL 1987].
2.2.2.1 A thermodynamical basis
The annealing process is based on the laws of thermodynamics. According to them, at a specific temperature $t$, the energy increment $\Delta E$ in the studied thermodynamical system is a random variable whose distribution function is given by:

$P(\Delta E > \partial;\, t) = e^{-\partial/(k\cdot t)}$   (2-2)

where $k$ represents the Boltzmann constant.
That is equivalent to:

$P(\Delta E \le \partial;\, t) = 1 - e^{-\partial/(k\cdot t)}$   (2-3)

The probability of a big energy increment is very low and can be described through an exponential distribution function, which implies that such an increment is a rare event. It is well known that this distribution is linked to the Poisson distribution (the latter gives the probability of a number of occurrences in a certain period, while the exponential models the time between occurrences). This could be an introduction to the role that Markov chains will play in Simulated Annealing.
Metropolis’ algorithm generates a perturbation and calculates the energy increment that would result from it. If the energy decreases, the new state is accepted. If it increases, it can be accepted or not depending on the thermodynamical rule above.
Kirkpatrick adapted it into an optimization heuristic which could be seen as a variant of the well-known local search heuristics based on the concept of neighbourhood. Each solution is assimilated to a state of a thermodynamical system. In the classical approach, up-hill movements (the result of a neighbour with a higher level of energy being selected in the random search) are rejected. Consequently, there is a huge risk of getting trapped in local optima, and the final result can depend strongly on the initial point chosen to start the process.
On the contrary, simulated annealing does not disregard this kind of solution but accepts it according to the thermodynamical probability of the energy increment it involves.
To develop the metallurgical metaphor, a few more concepts are introduced. The
different ‘states of the system’ represent the space of feasible solutions of the problem.
Energy is equivalent to cost. A change of state is understood as accepting a neighbour
solution. Temperature will be the control parameter of the mechanism, therefore
creating the need for a ‘cooling plan’ that defines its evolution with time. Finally, a
frozen state represents the heuristic final solution.
A possible pseudo code expression could be as follows:

Select an initial solution $s_0$ from the space of feasible solutions $S$;
Fix the initial temperature to a positive value $t_0 > 0$ and select a cooling plan $CP(t)$;
While (Termination_Criterion == false)
    Select another state $s \in N(s_0)$ from the neighbourhood of the current one;
    Evaluate the energy increment $\Delta E = E(s) - E(s_0)$;
    If $\Delta E < 0$ then $s_0 = s$
    Else
        Generate a random number $r \in [0,1]$;
        If $r < e^{-\Delta E / t}$ then make the up-hill movement $s_0 = s$
    Change the value of the temperature if necessary: $t = CP(t)$

Note that, in the algorithm, the Boltzmann constant has disappeared. Its only function would be to scale the control parameter, which will continue to be referred to as ‘temperature’.
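The pseudo code can be turned into a minimal, runnable Python sketch; the arguments (cost function, neighbourhood and cooling plan) are placeholders to be supplied for each particular problem:

import math
import random

def simulated_annealing(s0, energy, neighbours, t0, cooling_plan, n_iter):
    # energy(s): cost of a solution; neighbours(s): the list N(s);
    # cooling_plan(t): returns the next value of the control parameter.
    s, t = s0, t0
    for _ in range(n_iter):
        candidate = random.choice(neighbours(s))
        delta_e = energy(candidate) - energy(s)
        # Down-hill movements are always accepted; up-hill movements
        # are accepted with probability exp(-delta_e / t).
        if delta_e < 0 or random.random() < math.exp(-delta_e / t):
            s = candidate
        t = cooling_plan(t)
    return s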
2.2.2.2 Simulated Annealing and Markov Chains
A Markov chain, named after Andrei Markov, is a discrete-time stochastic process (although some continuous generalizations have been developed) in which the probability of changing to a specific state depends only on the current state. The chain is a sequence of random variables. The range of these variables, i.e., the set of their possible values, is called the state space. At any time $n$, a finite Markov chain can be characterized by a matrix of probabilities whose $(x, y)$ element is given by $P(X_{n+1} = x \mid X_n = y)$, where $n$ represents the time index. If the chain is homogeneous, these probabilities will be independent of $n$. These kinds of discrete finite Markov chains can also be described by a directed graph where each arc is labelled with the probability of going from the state at one of its ends to the state at the other.
There is a wide range of practical problems in areas such as inventory management, queuing systems, maintenance and manpower planning that have been modelled as Markov chains.
A final class is a subset of states that can be reached from states outside the class, but from inside which no external state can be reached. In an irreducible Markov chain, it is possible to reach all feasible states from any initial point, which implies that no final class other than the complete chain can be defined. A periodic chain with period $d$ can be split into $d$ different sets of states in such a way that the evolution of the chain passes cyclically from $s \in S_1$ to $s \in S_2$ … to $s \in S_d$. An irreducible aperiodic chain – that is to say, a chain in which 1 is the greatest common divisor of the lengths of all its circuits – is called ‘ergodic’.
Formally, the simulated annealing algorithm can be modelled as a Markov chain with as many states as there are feasible solutions to the problem. Given a temperature $t$, the probability of moving from state $i$ to state $j$ can be represented by $p_{ij}$. This configures a stationary Markov chain. If it is ergodic, it will converge to a limiting distribution and, as the temperature cools down, get closer to a uniform distribution on the set of optimal solutions.
Anily and Federgruen [ANILY 1987] proved the convergence of this method. They stated that a simulated annealing mechanism, when using an infinitely long cooling plan, can be assimilated to a strongly ergodic Markov chain, thus being asymptotically independent of the starting state and converging to a distribution that can be identified with the global minimum.
2.2.2.3 The cooling plan
A good design of the cooling plan is essential to achieve both efficiency and
effectiveness of the algorithm. It has to include the definition of the upper and lower
bounds for the temperature parameter and the rate according to which it is lowered.
If the aim is to get a solution that is independent of the initial state, then the starting temperature must be high enough to allow all the up-hill movements proposed. In some cases, the problem provides sufficient information to estimate this initial value suitably. This happens when the maximum difference between the values of the cost function of any two points in a neighbourhood is known: this difference can be considered high enough to avoid getting trapped in local optima.
Several functions to describe the cooling plan have been developed. Nevertheless,
more sophisticated alternatives do not lead to better results.
It is necessary to be really careful when designing a cooling plan, and be prepared
to make a study of the performance resulting from several alternatives.
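As a simple illustration, a geometric cooling plan and an initial temperature estimated from the maximum cost difference within a neighbourhood could be sketched as follows (alpha and p_accept are illustrative parameters):

import math

def geometric_cooling(t, alpha=0.95):
    # A common, simple plan: the temperature decays as t_{k+1} = alpha * t_k.
    return alpha * t

def initial_temperature(max_delta, p_accept=0.9):
    # Choose t0 so that the largest expected up-hill movement max_delta
    # is accepted with probability p_accept: exp(-max_delta / t0) = p_accept.
    return -max_delta / math.log(p_accept)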
2.2.2.4 Simulated Annealing and the Jobshop problem. Recent works
In [LAARHOVEN 1992] an S.A. approach is introduced to solve the J.S.P. minimizing makespan. They use the Aarts and Van Laarhoven cooling rule described in 7.5.1, which takes into consideration the variance of the cost values generated. They chose the length of each Markov chain to be equal to the size of the largest neighbourhood. In this paper they introduce an algorithm that gives quite accurate solutions and is approximately polynomial in the size of the problem.
[AIDYN 2003] developed a ‘parallel S.A.’ model in which several solutions are worked on simultaneously, akin to the concept of ‘population’ in G.A.s.
2.2.3 Other strokes of inspiration
Genetic Algorithms and Simulated Annealing are not the only approaches that have
looked back into nature with the aim of solving complex optimization problems. As an
illustration, the Ant Colony Optimization techniques will be briefly reviewed.
These algorithms were introduced by Dorigo in 1996 [DORIGO, 1996] and are
inspired by the behaviour of colonies of real ants, in particular, how they forage for
food. One of the main underlying ideas is that these insects can communicate through
indirect means by modifying the concentration of pheromones, some highly volatile
chemicals, in their immediate environment.
The ants visit a solution depending on the distance to the point they are in (the increment or decrement in the objective function) and the pheromone concentration in it. The relative importance of these two factors is fixed through two parameters. Once the ant has completed its tour, the amount of evaporation of the pheromone is calculated, and the chemical track of this latest ant (which will be more intense for better cost values) is added. Ant by ant, the solutions are improved and finally converge to the global optimum.
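The transition rule just described can be sketched as follows; alpha and beta are the two parameters that fix the relative importance of the pheromone concentration and the heuristic desirability (the dictionary-based data structures are illustrative):

import random

def choose_next(current, candidates, tau, eta, alpha=1.0, beta=2.0):
    # An ant in `current` moves to candidate j with probability proportional
    # to tau[(current, j)]**alpha * eta[(current, j)]**beta, where tau is the
    # pheromone level and eta the desirability (e.g. the inverse distance).
    weights = [tau[(current, j)] ** alpha * eta[(current, j)] ** beta
               for j in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]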
This basic mechanism can be enhanced, for instance, by introducing a set of ‘elitist ants’ that raise the pheromone concentration in the best solutions found until that moment. This idea was proposed in [WHITE, 2003].
Another example of a biological metaphor used to build an optimization method is immune systems. These mechanisms are motivated by the theory of immunology. The biological immune system functions to protect the body against pathogens or antigens that could potentially cause harm. It works by producing antibodies that identify, bind to, and finally eliminate the pathogens. Even though the number of antigens is far larger than the number of antibodies, the biological immune system has evolved to allow it to deal with the antigens. The immune system will discover the criteria of the antigens so that in the future it can react both to those antigens it has encountered before and to entirely new ones. This idea was introduced by de Castro and Timmis in [DE CASTRO 2002], and has already been applied to the J.S.P. in recent works such as [AICKELIN 2004].
Human Guided Search, a family of interactive algorithms that take advantage of human beings’ abilities in fields where they outperform computers, has also been applied to this problem in [LESH 2003].
More processes in nature could presumably be adapted for optimization purposes,
so there is a wide range of unexplored nature-inspired techniques.
2.2.4 Elitist ants, geniuses, and growing up too fast
Elements such as elitist ants and geniuses are introduced into the general mechanisms with the aim of speeding up convergence. However, an unbalanced addition of this kind can have harmful effects on the performance of the algorithm.
As expressed in [WHITE, 2003], ‘However, by using elitist ants to reinforce the best tour the problems now takes advantage of global data with the additional problem of deciding on how many elitist ants to use. If too many elitist ants are used, then the algorithm can easily become trapped in local optima’.
This text illustrates the ‘exploration versus exploitation’ dilemma that will be recurrent throughout this project and any other dealing with similar metaheuristic techniques. In some way, it is similar to the problem that arises when a child wants to grow up too fast and burns through life stages too soon, or when we hurry to finish any delicate piece of work. It is vital to find a balance between leaving the algorithm to explore blindly without allowing it to learn, and guiding it too much, thereby making it grow too fast so that it cannot explore the paths that would lead it to the best solution.
2.3 Breakdown of some optimization techniques
This section describes the heuristic methods already introduced in terms of the four elements presented in section 2.1.
2.3.1 Taboo search
The first solution, completely random, is obtained from a Global exploration
mechanism.
The algorithm then performs neighbour search, which involves direction, until it
finds the solution.
2.3.2 Simulated Annealing
The first solution is provided by a global search method.
The movements according to the Boltzmann equation are an expression of
neighbour search.
2.3.3 Genetic Algorithms
The first generation and the immigrants are obtained by global search.
Natural selection is a very important part of the algorithm.
Children are created from their parents through a combination process.
Mutation can be understood as a kind of neighbour search.
Containing all the basic elements provides GA with a special robustness, and can
make them preferable when the characteristics of the space of solutions are not well
known.
2.3.4 Ant colony optimization
The first solution is obtained by global search.
The algorithm then continues the work by neighbour search.
2.3.5 Immune Systems Optimization
New patterns are constantly being generated by global search.
The designs that are not valid are eliminated in a selection process.
The most successful lymphocytes are copied, allowing some random changes in them: that could be understood as neighbour search. Moreover, the essence of this mechanism is clearly the same as in the mutation operation.
It seems obvious that the ideas underlying Immune Systems Optimization are
exactly the same as in GA but discarding the combination operation.
2.3.6 The four elements applied to other classical optimization methods
In these methods, the degree of randomness has disappeared.
Dynamic programming or Branch & Bound could be understood as an aggregate of global search, with the peculiarity of being exhaustive, and selection, which eliminates the branches that cannot lead to better results.
Methods like the simplex are an expression of pure neighbour search in which the level of direction is the maximum possible, completely guiding the algorithm towards the best solutions.
2.4 Conclusions
Optimization methods share some elemental mechanisms that provide them with
their effectiveness and efficiency: global search, neighbour search, selection and
combination. They all include randomness in some way.
Different methods put different weights on them, and express their essence
differently. That is the most important degree of freedom when designing an
optimization method.
When choosing a method to solve a particular problem, its characteristics can give an indication beforehand of which methods will perform better. This can be understood in terms of the extent to which the four elements are consistent with the problem and the codification used (e.g. does a combination of solutions have anything to do with the inputs? If the fitness value of a combination were always contained in the interval defined by the parents’ fitnesses, then a combination mechanism would make no sense, as no improvement could appear).
Once the elemental mechanisms are evaluated, one of the methods which use them
must be selected. The real work then begins: implementing each one of the ideas in the
way that suits the problem peculiarities best.
3 Just to compare. A linear programming model for the Flexible
Jobshop problem
Just with the aim of comparing the performance of the developed metaheuristic algorithms with the outcome of the classical approaches to the problem, an L.P. model has been created.
3.1 A more specific definition of the problem
The jobshop problem consists in finding the optimal schedule (according to a chosen criterion) for a set of $n$ jobs, each one composed of $o_i$ operations, to be performed on the set of $m$ available machines.
The general assumptions are:
One machine can perform at most one operation at a time.
The operations cannot be split: once they are started, they have to be completed before the machine performing them is available for another operation.
The description above could belong to a plain jobshop problem. The following characteristics determine the peculiarities of the specific problem to be solved:
Each operation can be performed not just on one machine but on a certain set of them. This is the reason for the ‘flexible’ adjective in the problem title.
The processing times for an operation on a given machine are independent of the sequence scheduled. However, the concept of setup time, which varies for each operation depending on the latest operation performed by the machine, is introduced.
3.1.1 Considerations
The model that will be developed in this project will take into consideration:
The setup time for a machine that was idle before, which will differ from machine to machine and could vary depending on the specific job scheduled first on that machine.
Processing times that are different for different machines.
Setup times for each machine that will depend on the job to be performed and the latest job performed on it.
3.2 Formulation
The proposed formulation will be introduced following the GAMS structure.
3.2.1 Parameters
$n$: Number of jobs to be scheduled.
$o_i$: Number of operations that compose job $i$.
$m$: Total number of available machines.
$PT_{i j_i k}$: Processing time of job $i$ in its $j$-th operation when it is performed on machine $k$.
$ST0_{i j_i k}$: Setup time when starting to perform job $i$ in its $j$-th operation on machine $k$ when the machine was idle before.
$ST_{i j_i,\,x y_x,\,k}$: Setup time when changing, on machine $k$, to the $j$-th operation of job $i$ from the $y$-th operation of job $x$.
$P_{ijk}$: Binary parameter that takes the value ‘1’ if and only if the $j$-th operation of job $i$ can be performed on machine $k$. Its value is linked to the processing times: the processing time is recorded as ‘0’ when the machine is not able to perform the operation.
$r_{i j_i}$: Order number of operation $j$ of job $i$. In the GAMS model, this is simply introduced as $ord(j_i)$.
3.2.2 Variables
$M_{i j_i k l}$: Binary variable that takes the value ‘1’ if and only if the $j$-th operation of job $i$ is performed on machine $k$ in $l$-th place.
$T_{kl}$: Time at which machine $k$ finishes its $l$-th process. These variables are defined in a recursive way.
$X_{i j_i}$: Time at which the $j$-th operation of job $i$ is finished. It is defined in a recursive way too.
$Start(k)$: Binary variable that takes the value ‘1’ if and only if machine $k$ is used.
$Stop(k,l)$: Binary variable that takes the value ‘1’ if and only if the $l$-th position is the last one for machine $k$.
3.2.3 Equations
Objective function to be minimized: the total makespan. It is greater than or equal to the time at which the last operation of each of the jobs is finished:

$TM \ge X_{i\,o_i}$   (3-1)

Each operation must be performed:

$\sum_{k,l} M_{i j_i k l} = 1$   (3-2)
Recursive definition of $T_{kl}$:

$T_{k0} = 0$   (3-3)

This corresponds to a dummy operation ‘start’. In the created GAMS model it is not necessary (when the compiler looks for the value of a variable whose index is out of bounds, ‘0’ is used).

$T_{kl} = \max\left\{ T_{k,l-1}\, ,\; \sum_{i,j_i} M_{i j_i k l}\cdot ST0_{i j_i k}\, ,\; \sum_{i,j_i} X_{i a}\cdot M_{i j_i k l} \right\} + \sum_{i,j_i} M_{i j_i k l}\cdot PT_{i j_i k} + \sum_{i,j_i}\,\sum_{x,y_x} M_{i j_i k l}\cdot M_{x y_x k,\,l-1}\cdot ST_{i j_i,\,x y_x,\,k}$   (3-4)

where $a = r_{i j_i} - 1$.
The first addend selects the later of the times at which the previous operation of that job and the previous operation on that machine were finished. The second one accounts for the processing time and the last one for the setup time.
Recursive definition of $X_{i j_i}$:

$X_{i0} = 0$   (3-5)

$X_{i j_i} = \max\left\{ X_{i,\,j_i-1}\, ,\; \sum_{k} M_{i j_i k 1}\cdot ST0_{i j_i k}\, ,\; \sum_{k,l} T_{k,\,l-1}\cdot M_{i j_i k l} \right\} + \sum_{k,l} M_{i j_i k l}\cdot PT_{i j_i k} + \sum_{k,l}\,\sum_{x,y_x} M_{i j_i k l}\cdot M_{x y_x k,\,l-1}\cdot ST_{i j_i,\,x y_x,\,k}$   (3-6)

The first addend selects the later of the times at which the previous operation of the job and the previous operation performed by the machine finish. The second one accounts for the processing time and the last one for the setup time.
Filling consecutive positions: it is necessary to compel the model to fill
consecutive job positions for each machine.
If a machine is started, it has to be stopped sometime:

$\sum_{l} Stop_{kl} = Start_k$   (3-7)

If a machine was started and has not been stopped yet, it is used:

$\sum_{i,j_i} M_{i j_i k l} \ge Start_k - \sum_{g=1}^{l-1} Stop_{kg}$   (3-8)

If a machine is started, its first position is full:

$\sum_{i,j_i} M_{i j_i k 1} = Start_k$   (3-9)

The positions are filled in ascending order:

$\sum_{i,j_i} M_{i j_i k l} \le \sum_{i,j_i} M_{i j_i k,\,l-1}$   (3-10)
3.3 Linearizing the model
Although the model explained here contains some non-linearities in its formulation, the problem can be linearized by introducing a large number of binary variables, which considerably increases computation time.
Four non-linearities can be found in the model:
$T_{kl} = \max\left\{ T_{k,l-1}\, ,\; \sum_{i,j_i} M_{i j_i k l}\cdot ST0_{i j_i k}\, ,\; \sum_{i,j_i} X_{i a}\cdot M_{i j_i k l} \right\} + \sum_{i,j_i} M_{i j_i k l}\cdot PT_{i j_i k} + \sum_{i,j_i}\,\sum_{x,y_x} M_{i j_i k l}\cdot M_{x y_x k,\,l-1}\cdot ST_{i j_i,\,x y_x,\,k}$   (3-11)
$X_{i j_i} = \max\left\{ X_{i,\,j_i-1}\, ,\; \sum_{k} M_{i j_i k 1}\cdot ST0_{i j_i k}\, ,\; \sum_{k,l} T_{k,\,l-1}\cdot M_{i j_i k l} \right\} + \sum_{k,l} M_{i j_i k l}\cdot PT_{i j_i k} + \sum_{k,l}\,\sum_{x,y_x} M_{i j_i k l}\cdot M_{x y_x k,\,l-1}\cdot ST_{i j_i,\,x y_x,\,k}$   (3-12)
Two of them are products of binary variables. The constraints introduced to
represent such a restriction will be:
$z = x\cdot y \;\Rightarrow\; \left\{\; z \le x\, ,\quad z \le y\, ,\quad z \ge x + y - 1 \;\right\}$   (3-13)
The remaining ones are products of a binary and a real variable. To describe it in a
simpler way, the problem can be expressed as a variable v that takes the value of
another one u if and only if the binary variable b takes the value 1. The way of
introducing this into the model can be described as:
$v = u\cdot b \;\Rightarrow\; \left\{\; v \le M\cdot b\, ,\quad v \ge u - M\cdot(1-b) \;\right\}$   (3-14)
Thus, the restrictions have been linearized to make the use of an L.P. solver possible. In addition, the max operator is removed by replacing the equality with inequalities that become tight at the optimum:

$z = \max\{x, y\} \;\Rightarrow\; \left\{\; z \ge x\, ,\quad z \ge y \;\right\}$   (3-15)
3.4 Numerical Results
The model will be run with different data in order to examine the numerical solutions provided.
Two cases will be considered to illustrate the performance of the GAMS program. The first one will be small, so that the optimal solution can be seen intuitively. Afterwards, a complex problem will be analyzed to show the performance of the model in that setting.
Let us start by considering two jobs (A, B) to be done on two machines (K1, K2). Job A has two operations (j1, j2) and B has two (j1, j2). In order to make it easy to solve, we will not take into account any kind of setup time.
The numerical values of each operation time are (a dash indicates that the machine cannot perform that operation):

Task    K1 [Hours]    K2 [Hours]
A.j1    3             6
A.j2    4             –
B.j1    5             4
B.j2    –             5

Table 3. Numerical values for processing time in the example
It can be easily seen that the optimal order of operations will be:
[Gantt chart: the optimal schedule for the example on machines K1 and K2, time axis in hours.]
Table 4. Optimal schedule for the example
The optimal value was 9 hours.
To get an idea of the complexity of the problem:
Number of equations: 488
Number of single variables: 197
Computation time: 0.62 secs.
Finally, the model can be tested with the amount of data that would be more usual in a real case. For the sake of simplicity, the time data are not included here but in the final annexes.
This problem is computationally intensive. The optimal solution was found by the NEOS Server (a multi-institutional effort that enables easy access to more than fifty optimization packages and can be found at http://www-neos.mcs.anl.gov/) after analyzing it for more than 24 hours:
[Gantt chart: the optimal schedule for the second example, jobs A–D on machines K1–K3, time axis up to 35 hours.]
Table 5. Optimal schedule for the second example
The optimal value obtained was 35.8 hours.
And problem complexity can be expressed as:
Number of equations: 104,260
Number of single variables: 36,400
Computation time: 23 hours.
3.5 Conclusions
The last of the numerical examples introduced kept GAMS working for more than 20 hours; the resolution of the problem could not be finished in a time acceptable for practical purposes.
In the author’s opinion, the two most important characteristics of the model in this regard are:
• The great number of binary variables.
• The impossibility of describing the model in a more intuitive way.
These have the consequence of greatly increasing computation time, as shown in the results provided.
Nevertheless, the heuristic algorithms developed in this project overcome these difficulties and achieve considerably better performance.
Nevertheless, the heuristic algorithms developed in this project overcome these
difficulties to achieve considerably better performances.
4 Designing a codification. Importance of the basis.
When starting to implement any of the optimization methods described, the first problem to tackle is how to represent the possible solutions. Although some tools based on binary codification are available and ready to use, with the time savings this represents, they are not always advisable. It is of vital importance to choose the best possible system and not to reduce the range of options to just binary. The complexity and the characteristics of the problem that make the different elementary optimization mechanisms – global search, neighbour search, selection and combination – more or less suitable strongly depend on the codification.
As will be shown below, choosing the right codification system has a huge impact on the final outcome. Therefore, a considerable time investment should be made in order to reach a sound decision. This section will explore the main reasons for this impact and will provide some advice on how to actually choose or design a codification.
4.1.1 The hat and the rabbit: making things appear and disappear
A classical definition of optimization is choosing the best solution among the possible ones. According to this, there are only two things that define an optimization problem: what is possible – delimiting the space of feasible solutions, usually by a series of constraints – and what is considered the best – defining an objective function that returns objective values when applied to the feasible solutions.
These two concepts are the sources of complexity of the problem. Thus, if one of them could be reduced or simplified, the problem would be easier to solve. In some cases, difficult problems are relaxed, which means that some of their constraints are disregarded in order to get an approximate solution or valuable information about the exact one. This is not that kind of relaxation: all the constraints are taken into account, not explicitly but implicitly.
The main idea is that if a codification makes it impossible to construct an unfeasible solution, then it will not be necessary to check any solution, because all of them will be feasible beforehand. That simplifies the model to be applied and lowers computation time.
Of course, finding such a codification would be an ideal situation, but in some cases it is possible to find an intermediate possibility in which a given set of constraints – but not all of them – can be made implicit. The rabbit is inside the hat before the magician pulls it out from its ears, but we cannot see it. We do not have to worry about implicit constraints: they are in there in case we want to pull them out.
The objective is to find a way of providing all the information needed to define a solution, but giving only the necessary and sufficient data. Giving more information raises the threat of infeasibility.
In this project, a codification of this kind has been found. Although it was considered an original innovation of this project, the author discovered that it was already briefly described in [MESGHOUNI 2004], but not used. Astonishingly, Mesghouni did not recognise its advantages and continued using the traditional codification depicted below. The codification will be introduced in further sections.
4.1.2 Simplifying the calculation
The second source of complexity in a problem is the peculiarities of the objective function. Choosing a good codification can make the objective calculations easier. As in the case of making constraints implicit, this results in simpler models and lower computation times.
4.1.3 Operators design
When starting the real work, that is, operator design, codification plays a very important role. Depending on the way the information is coded, the four elementary ideas can be implemented in a more or less intuitive way. That is, the codification should be developed taking into account the possibilities that each of the options presents with regard to the different operators.
4.1.4 Method selection
Depending on the codification and the way they are defined, some operators will clearly be more suitable for the problem. For example, if small modifications in the code can represent solutions that are in essence very different, with objective values that have nothing to do with each other, then the neighbour search mechanism makes no sense. As this element is the basis of most methods, it may then be a good idea to spend some time trying to improve the codification system.
4.1.5 Conclusions
Although some ready-to-use tools based on binary codification can be found on the market, spending some time choosing a codification is worth doing. The benefits of using the right system can be summarised as making constraints implicit, simplifying the objective calculations, making operator design easier and clarifying the choice of an optimization method.
In scheduling problems, these advantages become even clearer, given the complex nature of the problem. That was the reason behind discarding ready-made operators for this project and developing a codification. In the next section this system will be introduced.
4.2 The codification used: two lists for a schedule
4.2.1 Introduction: leaving formalities behind
When starting to face the problem of how to codify the solutions, it seems a good idea to revisit the linear programming model already developed: the formulation of the objective function and constraints can clarify the concepts prior to taking any decision. In this case, that will not be done.
Looking back at the model developed, intuition suggests that the model is more complicated than the problem. There are a huge number of auxiliary variables, and the constraints that contain them are not obvious. Nevertheless, all the information contained in them could be expressed in a much easier and more informal way, ideally the same one in which we would show the schedule to a colleague.
There is some rigidity in mathematical models that can be bypassed when developing a heuristic algorithm. It is not necessary to formalize either the constraints or the objective function in a purely mathematical way: just understanding the processes that have to be followed and implementing them correctly is enough.
This is a source of informality but also of flexibility, which widens the range of possibilities and enables the use of more efficient definitions. The codification used takes advantage of this peculiar characteristic, leaving formalities behind.
4.3 The first idea is not the best
The most intuitive way of expressing a particular jobshop schedule seems to be specifying the series of operations that each machine must perform. This is the idea that most of the mathematicians and engineers trying to solve the flexible jobshop have used.
For example, the information could be stored in a matrix in which each row corresponds to a machine. The columns then express the operations that the machine will perform, in their particular order. The notation O(3,1) denotes the 3rd job in its 1st operation. The matrix provided is an example of three machines performing three jobs of two operations each. All the operations have to be present in the matrix to make it complete.
M1: O(1,2) O(2,1)
M2: O(2,2) O(1,1)
M3: O(3,1) O(3,2)
Table 6. Example of schedule oriented to machines
It is obvious that this information is sufficient to define a schedule. However, is it necessary?
The main disadvantage of this system becomes clear when trying to carry out the example shown.
M1 starts with the 2nd operation of job 1, so it will have to wait until the 1st one is finished. M2 starts with the 2nd operation of job 2, so it will also have to wait for the 1st one to be performed. M3 performs the 1st operation of job 3 and continues with the 2nd one. Then there is no way for either of the other machines to keep on working.
We have a problem: there is a circular reference between jobs 1 and 2.
M1: O(1,2) O(2,1)
M2: O(2,2) O(1,1)
M3: O(3,1) O(3,2)
Table 7. Circular reference in a schedule oriented to machines
Neither of them can be started, and the schedule is therefore unfeasible.
If this codification is used, then looking for infeasibilities of this kind becomes compulsory. Note that as the size of the matrix grows, the cost of this checking process increases very fast. Provided that, in each row, the operations are in order (and that seems easy to achieve), each operation has to be compared with every one that appears after it, to check that the operation preceding it (within the particular job it belongs to) is not placed after the one being checked.
This process is complicated and time-consuming. It would be very useful to find a way of making this constraint implicit.
4.4 Calling our magician. Introducing the codification used
As stated in 4.3, giving more information than necessary is a source of infeasibilities. In this case, we can reduce the amount of data to a minimum and make this constraint disappear.
On the one hand, we will have a matrix that stores which machine will perform
each operation. An example that keeps the sizes of the first depicted case is included in
Table 8.
O1 O2
J1 M2 M1
J2 M1 M2
J3 M3 M3
Table 8. Table specifying the machines that will perform each operation
Then, it is necessary to specify the order in which these operations will be performed. This point is the basis of the codification developed.
The main idea is to work only with jobs, without caring about operations. That way, when the schedule refers to starting an operation of a certain job, the operation number is redundant, as it is known that it must be the first one that has not been performed yet. This avoids giving unnecessary information and eliminates the possibility of circular references.
A solution could therefore be expressed as a job list, for example 1 3 1 2 2 3. That means that the first operation to be performed will be the first one of job 1. The machine in charge of it is M2; it is idle, so it can take up the work. The next one is job 3. M3 is idle too, so it can start from the first moment. Then job 1 comes again: M1 is waiting, but will have to remain still until the first operation of the job is finished. The 2 afterwards means that the 1st operation of job 2, also assigned to M1, must wait for M1 to finish O(1,2). When it is complete, the 2nd operation of job 2 will be performed on M2, which should be free. O(3,2) is performed on M3 just when O(3,1) is finished on the same machine. A Gantt diagram depicting the schedule is presented in Figure 3.
As shown in this example, this codification completely eliminates the possibility of circular references. Moreover, the constraints of the problem have been drastically reduced. Now, every list of integers in which each job number appears as many times as there are operations composing it is a valid job list. Machine matrices only have to respect the characteristics of the machines, which can perform some operations but not the remaining ones. Both checks are now very easy – as easy as some counts in the case of the job list and a simple lookup in the machine matrix.
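The count-based check on the job list amounts to a few lines; a minimal sketch, assuming that o[i] stores the number of operations of job i as in chapter 5:

from collections import Counter

def is_valid_job_list(job_list, o):
    # A job list is valid iff each job i appears exactly o[i] times.
    counts = Counter(job_list)
    return all(counts[i] == o[i] for i in o)

# Example with three jobs of two operations each:
# is_valid_job_list([1, 3, 1, 2, 2, 3], {1: 2, 2: 2, 3: 2})  ->  True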
[Figure: Gantt diagram of the example schedule; machines on the vertical axis, time (min) on the horizontal axis, operations labelled (1,1) to (3,2).]
Figure 3. Example Gantt diagram
A small change is introduced. Instead of working with the machine matrix, it is reordered and transformed into a second list containing the machines corresponding to the operations shown in the job list. In the case of the example, the machine list would be 2 3 1 1 2 3. Thus, every solution is encoded as two lists.
The process that converts each pair of a machine list and a job list into a comprehensible schedule is simple. The first step is to identify the operation number that corresponds to each component of the job list. Then, a machine schedule and a job schedule are constructed, taking into account the latest operation scheduled for each job and for the corresponding machine. Processing, change and transportation times are selected from the data matrices. The information is structured from two different perspectives: in the form of a machine schedule (the operations each machine will perform, with their starting and finishing times) or a job schedule (the machines on which each operation will be performed, together with starting and finishing times).
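Under simplifying assumptions (processing times only, ignoring setup and transportation times), the decoding process can be sketched as follows; PT[(i, j, k)] is hypothetical notation for the processing-time data:

def decode(job_list, machine_list, PT):
    # Each entry of job_list implicitly refers to the next unperformed
    # operation of that job, so circular references cannot appear.
    next_op = {}        # next operation number of each job
    job_free = {}       # time at which each job's latest operation ends
    machine_free = {}   # time at which each machine becomes idle
    schedule = []
    for i, k in zip(job_list, machine_list):
        j = next_op.get(i, 1)
        start = max(job_free.get(i, 0.0), machine_free.get(k, 0.0))
        finish = start + PT[(i, j, k)]
        schedule.append((i, j, k, start, finish))
        next_op[i] = j + 1
        job_free[i] = finish
        machine_free[k] = finish
    return schedule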
An additional advantage of this codification is that it has a true meaning when representing the schedule. That implies that similar job lists or machine lists will correspond to similar schedules, and have similar objective values as well. This is vital, as it gives sense to the fundamental neighbour search process. Moreover, combinations of two solutions do have something to do with the original ones, but do not necessarily show intermediate objective values – one of them could be better at the beginning of the schedule and the other one at the end, so a combination of them could outperform both. This makes a combination mechanism valuable. Taking these ideas into consideration, it is evident that all four elemental optimization tools – given that global search and selection are useful in all situations – are valid to solve the problem if this codification is used.
4.5 Our magician’s communication problems
Interpreting such a schedule is not easy. All the relevant data are presented and ordered, but it is difficult – maybe not for some privileged minds – to get a quick idea of how the jobs develop through time. That is to say: encoded solutions store only the minimum necessary data, which have to undergo a purely mechanical transformation involving the time data in order to become complete schedules. Nevertheless, these schedules are not easily interpretable. For that reason, a graphical output is really valuable.
A Gantt representation expresses the duration of activities as the length of the bars in the diagram. A function which constructs this kind of graph from the schedules has been developed. The solution adopted to represent a flexible jobshop was to draw several horizontal bars identified with the available machines. Colours are used to represent jobs, and operations are labelled across the diagram. An example with eight jobs is provided in Figure 4 to illustrate the function.
This graph can be easily understood. The performance of each job can be tracked by simply following the corresponding colour. Idle times are easily localised and a rough idea of the workload can be obtained by mere inspection.
[Figure: Gantt diagram of an example with eight jobs; machines on the vertical axis, time (min) on the horizontal axis, operations labelled (1,1) to (8,2).]
Figure 4: A Gantt diagram is understandable by mere inspection
Such a tool is an indispensable help during the process of developing the algorithm and checking the reasonability of the solutions provided.
4.6 Conclusions
The codification used is based on two kinds of lists: job lists and machine lists. It has the peculiar advantage of completely eliminating the possibility of encoding unfeasible solutions. Thus, it enables simpler and quicker algorithms to be implemented.
It is easily translated into a schedule, thus enabling a very easy calculation of the objective function.
It appears to reflect the true nature of solutions and gives sense to the four elemental optimization mechanisms – global search, selection, neighbour search and combination – so all of them can be applied in the final algorithm.
As schedules are not easy to interpret, a function that returns a graphical output in the form of a Gantt diagram has been developed. This makes it possible to notice interesting characteristics of the schedule by mere inspection.
5 Data generation. Some practical issues
With the aim of making further algorithm explanations clearer, the data used to define the particularities of each jobshop problem are introduced here. This data structure has been created with the intention of easing the calculations and taking advantage of the special features of the codification proposed.
5.1 Basic parameters
The scalars that define the size of the problem are:
$n$: number of jobs to be scheduled.
$o$: a vector whose $n$ components store the number of operations that compose each job.
$m$: number of available machines.
5.2 Time related parameters
$PT_{i j_i k}$: Processing time of job $i$ in its $j$-th operation when performed on machine $k$.
$ST0_{i j_i k}$: Setup time when starting to perform job $i$ in its $j$-th operation on machine $k$, if it is the first task assigned to the machine.
$ST_{i j_i,\,x y_x,\,k}$: Setup time when changing, on machine $k$, to the $j$-th operation of job $i$ from the $y$-th operation of job $x$.
$P_{ijk}$: Binary variable that takes the value ‘1’ if and only if the $j$-th operation of job $i$ can be performed on machine $k$. The processing time is recorded as ‘0’ when the machine is not able to perform the operation.
$PList_{i j_i}$: A vector containing the identification of all the machines on which the $j$-th operation of job $i$ can be performed. It represents just a different way of expressing $P_{ijk}$, but turns out to be especially useful to work with.
$TT_{i j_i k l}$: Transportation time for job $i$ in its $j$-th operation when it will be performed on machine $k$ and its previous operation was done on machine $l$.
5.3 Complexities of a generalised jobshop
The problem that this project aims to solve is a very general version of the jobshop (works do not usually deal with setup, change and transportation times at once, and the flexible jobshop corresponds to the most complex machine configuration).
This generality leads to the need to manage complex data.
For instance, the size of the $ST_{i j_i,\,x y_x,\,k}$ matrix in the case of a problem with 8 jobs of 5 operations each and 4 machines is 8·5·8·5·4 = 6400 scalars.
Obviously, it is not possible to introduce that amount of data manually in order to get valuable information about the algorithm performance. Consequently, a simple tool that generates coherent data has been developed.
5.4 Generating random jobshop data
The data generated must be coherent with the definitions of $n$, $o$ and $m$. To adjust the level of redundancy among the machines – that is, how many machines will on average be capable of performing each activity – a value SparcityParameter is defined, where 0 would correspond to a non-flexible jobshop and 1 to having machines that are all able to perform all the operations.
To give a rough idea of the order of magnitude of the time length of the operations, a TimeParameter is also introduced. The generated time values come from a normal distribution with mean TimeParameter and standard deviation TimeParameter/3. As these values have to be positive, the simulation is repeated whenever a negative value is generated.
5.4.1 Processing time
An empty $n \times \max(o) \times m$ matrix is generated. Iteratively, for each job, operation and
machine, random numbers are generated from a uniform [0,1] distribution. If that
random value does not exceed SparcityParameter , a random normal time is included
in the matrix. If after that process there is some operation that has not been assigned to
any machine, a random one is selected to fill that gap and make the schedule possible.
$PList_{ij}$ is built from $PT_{ijk}$ by searching for the machines on which each operation
can be performed and listing them in order.
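A minimal sketch of the generation mechanism just described could look as follows in Python. Function and parameter names such as generate_processing_times and sparsity are illustrative; sparsity stands for the SparcityParameter defined above:

import numpy as np

rng = np.random.default_rng(0)

def positive_normal(mean, rng):
    # Resample a N(mean, mean/3) value until it is positive (section 5.4).
    while True:
        t = rng.normal(mean, mean / 3)
        if t > 0:
            return t

def generate_processing_times(n, o, m, sparsity, time_param, rng):
    PT = np.zeros((n, max(o), m))
    for i in range(n):
        for j in range(o[i]):
            for k in range(m):
                if rng.random() <= sparsity:
                    PT[i, j, k] = positive_normal(time_param, rng)
            if not PT[i, j].any():           # no machine assigned:
                k = rng.integers(m)          # pick one to keep it feasible
                PT[i, j, k] = positive_normal(time_param, rng)
    return PT

PT = generate_processing_times(3, [2, 3, 2], 3, sparsity=0.5,
                               time_param=10.0, rng=rng)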
5.4.2 Change, transportation and Start-up times
They are generated from $PT_{ijk}$, taking into account the different possible
alternatives and scaling values that fix their importance compared to processing times.
5.5 Making this structure useful to work with real data
As stated above, the generality of the studied problem has the disadvantage of
requiring large amounts of data. Measuring all these different possibilities is not
economically viable in a real shop.
Thus, in a real case study, this time simulation tool is also useful. Transportation,
change or start-up times can be neglected if the peculiar characteristics of the case
allow that simplification (just setting their TimeParameter to 0). Or some of the times
can be simulated taking as a starting point a smaller set of times actually measured,
assuming they will be similar in the same way bootstrapping techniques work. In order
to do that, TimeParameter should be adjusted for each case.
6 Brief considerations about Jobshop problem complexity
6.1 Introduction
As stated in the introduction section, the main difficulty that the J.S.P. resolution
poses is the combinatorial explosion of possible solutions. If the problem were to be
solved by exhaustive enumeration and evaluation, the computation time would be
linked to the total number of solutions and the time spent evaluating each one of them.
This calculation was not included in the introduction. The reason for not doing it is
that this value depends on what is considered a solution, which is linked to the
codification used. Two different systems have been used in this project –the one in the
linear programming model and the one proposed for the algorithms- so their
implications on the problem complexity will be analysed separately.
6.2 Linear programming model
In the L.P. model, each solution can be defined by its decision variables, expressed
as:
$M_{ijkl}$ : binary variable that takes the value ‘1’ if and only if the $j$-th operation of job
$i$ is performed on machine $k$ in $l$-th place.
The indexes take the values:
$i = 1,\ldots,n; \quad j = 1,\ldots,o_i; \quad k \,/\, P_{ijk} = 1; \quad l = 1,\ldots,p_k$
Where:
$n$ : number of jobs to be scheduled.
$o_i$ : operations that compose job $i$.
$m$ : total number of available machines.
$p_k$ : maximum number of operations that can be performed on machine $k$. It is
calculated as $p_k = \sum_{ij} P_{ijk}$, where $P_{ijk}$ is the binary parameter that takes the value ‘1’ if
and only if the $j$-th operation of job $i$ can be performed on machine $k$.
Then, the total amount of possible combinations of these binary variables is obtained as:
$N = 2^{\sum_{i=1}^{n} \sum_{j=1}^{o_i} \sum_{k / P_{ijk}=1} \sum_{a,b} P_{abk}}$   (6-1)
Computation time is the product of the total number of solutions and the time spent
evaluating each one, which is a linear function of $\sum_i o_i$ (see ‘From Code to Schedule’), so its
plots would just be shifted in a logarithmic scale. This effect is not included.
The results for a jobshop with total flexibility on 3 machines and composed of jobs
with 4 operations each are shown in Figure 5.
Figure 5. LP model complexity (number of possible solutions vs. number of jobs, logarithmic axes)
6.3 Heuristic algorithm complexity
As explained in ‘Designing a Codification. Importance of the basis’, the codification
system proposed has the advantage of reducing the space of solutions to the feasible
set. Although combinatorial explosion is also present, computation time will be lower
for all problem sizes.
A solution is encoded in the form of a job list and a machine list. The first one is a
permutation of the numbers from 1 to $n$ where each number $i$ appears exactly $o_i$ times.
Thus, considering just this half of the solution, the total number of possibilities is equal
to the number of permutations with repetition:
$N = \dfrac{\left(\sum_i o_i\right)!}{\prod_i o_i!}$   (6-2)
As the corresponding entry on the machine list can take $\sum_k P_{ijk}$ values, the total
amount of possible solutions is obtained as:
$N = \dfrac{\left(\sum_i o_i\right)!}{\prod_i o_i!} \cdot \prod_{ij} \sum_k P_{ijk}$   (6-3)
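As a worked illustration, equations (6-2) and (6-3) can be evaluated with a few lines of Python; the helper names are made up for this sketch:

from math import factorial, prod

def joblist_count(o):
    # permutations with repetition, equation (6-2)
    return factorial(sum(o)) // prod(factorial(oi) for oi in o)

def heuristic_count(o, machines_per_op):
    # equation (6-3): job-list permutations times the machine choices
    return joblist_count(o) * prod(prod(row) for row in machines_per_op)

o = [4, 4, 4]                         # 3 jobs with 4 operations each
mpo = [[3] * 4 for _ in o]            # total flexibility on 3 machines
print(heuristic_count(o, mpo))        # 34650 * 3**12 = 18414430650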
The results for the total amount of solutions are shown in Figure 6.
Figure 6. Heuristic model complexity (number of possible solutions vs. number of jobs, logarithmic axes)
6.4 Compared problem complexity. Conclusions
When compared, it becomes clear that the codification used mitigates problem
complexity in some way.
Moreover, as stated in ‘Designing a Codification. Importance of the basis’, no checking
process is necessary in this case, so the constant $k$ that links problem complexity in
number of solutions to complexity in computation time is lower. This effect is not
considered in the plots.
Figure 7. Compared complexity (number of possible solutions vs. number of jobs for the LP and heuristic models, logarithmic axes)
The Flexible Jobshop problem is one of the most complex NP problems in terms of
the total number of solutions. This is the main obstacle that prevents classical
optimization methods from solving it successfully. The heuristic algorithms proposed
try to overcome this difficulty and, in addition, considerably reduce the number of
possible solutions and the computational effort needed to evaluate them.
7 A Simulated Annealing algorithm applied to the Jobshop problem
One of the metaheuristic algorithms that have been implemented in this project is
Simulated Annealing (S.A.). As stated in the introduction to this method, it is inspired
by the metallurgical processes that underlie this thermal treatment, which consists in
raising the metal temperature to the melting point and then allowing it to cool down
again. The properties of the solid obtained highly depend on the speed of this cooling.
If it happens very fast (the tempering process), the crystals formed will contain a
relatively high number of imperfections. On the contrary, if the process is slow enough,
the mineral can shape into big perfect crystals, the ones that achieve an energy
minimum.
The fundamentals of this optimization method have already been reviewed, so they
will not be presented again. The characteristics of the created algorithm will be
introduced in next sections, highlighting its most interesting features and the results
that it returns when applied to the selected problem.
7.1 The basis. Looking for a good neighbour first
The proposed S.A. algorithm implements a combination of the mechanism created
by Kirkpatrick and a pure random neighbourhood search.
As described in the introduction section, S.A. relies on a thermodynamical equation,
$P(\Delta E \le \partial\,;\,t) = 1 - e^{-\partial/(k \cdot t)}$   (7-1)
where parameter $t$ is usually interpreted as the system temperature.
The probability of a big energy increment is very low, and can be described through
an exponential distribution function. Therefore, a possible pseudo-code expression for
this classical interpretation of the metaheuristic could be:
Select an initial solution $s_0$ from the space of feasible solutions $S$;
Fix the initial temperature to a positive value $t_0 > 0$ and select a cooling plan $CP(t)$;
While (Termination_Criterion == false)
    Select another state from the neighbourhood of the current one, $s \in N(s_0)$;
    Evaluate the energy increment $\Delta E = E(s) - E(s_0)$;
    If $\Delta E < 0$ then $s_0 = s$
    Else
        Generate a random number $r \in [0,1]$;
        If $r < e^{-\Delta E / t}$ then make the up-hill movement $s_0 = s$
    Change the value of the temperature if necessary, $t = CP(t)$
Nevertheless, the algorithm created in this project performs a random search
around the current point, evaluating a certain number of neighbours and moving to
one of them if it improves on the local value. Only when the selected neighbours have
been searched without success does the algorithm turn to the classical S.A.
thermodynamical equation to escape from the local optimum.
The resulting pseudo-code could be:
Select an initial solution $s_0$ from the space of feasible solutions $S$;
Fix the initial temperature to a positive value $t_0 > 0$ and select a cooling plan $CP(t)$;
While (Termination_Criterion == false)
    For (Number of neighbours to be searched)
        Select another state from the neighbourhood of the current one, $s \in N(s_0)$
        Evaluate the energy increment $\Delta E = E(s) - E(s_0)$
        If $\Delta E < 0$ then
            $s_0 = s$
            break
    Select another state from the neighbourhood of the current one, $s \in N(s_0)$;
    Evaluate the energy increment $\Delta E = E(s) - E(s_0)$;
    If $\Delta E < 0$ then $s_0 = s$
    Else
        Generate a random number $r \in [0,1]$;
        If $r < e^{-\Delta E / t}$ then make the up-hill movement $s_0 = s$
    Change the value of the temperature if necessary, $t = CP(t)$
The combination of both mechanisms has proved to be more efficient, as the
number of up-hill movements is minimized.
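A compact Python sketch of this combined scheme is given below; the problem-specific pieces (energy, neighbour, the cooling plan) are passed in as functions, and the toy usage at the end is purely illustrative:

import math, random

def simulated_annealing(initial, energy, neighbour, t0, cooling,
                        n_neighbours, max_iter):
    # s0 holds the current solution, e0 its energy (objective value)
    s0, e0 = initial, energy(initial)
    t = t0
    for _ in range(max_iter):
        improved = False
        for _ in range(n_neighbours):       # good-neighbour search first
            s = neighbour(s0)
            e = energy(s)
            if e < e0:
                s0, e0, improved = s, e, True
                break
        if not improved:                    # classical thermodynamical jump
            s = neighbour(s0)
            delta = energy(s) - e0
            if delta < 0 or random.random() < math.exp(-delta / t):
                s0, e0 = s, e0 + delta
        t = cooling(t)                      # apply the cooling plan
    return s0, e0

# purely illustrative usage: minimise x^2 with +/-1 moves
best, _ = simulated_annealing(
    initial=25, energy=lambda x: x * x,
    neighbour=lambda x: x + random.choice([-1, 1]),
    t0=100.0, cooling=lambda t: 0.95 * t,
    n_neighbours=5, max_iter=500)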
The main concepts of the algorithm will be introduced in the next sections.
7.2 Generating a starting point
The codification used eases the generation of random solutions to the problem.
They are encoded in two lists:
A job list that stores the numbers from 1 to $n$ repeated $o_j$ times each, their first
appearance corresponding to their first operation and so on.
A machine list that shows the machines that will perform the operations on the job
list.
Thus, the structure of a job list is just a permutation with repetition of the numbers
from 1 to $n$ repeated $o_j$ times each. The algorithm builds a vector composed of 1
repeated $o_1$ times, 2 repeated $o_2$ times and so on. Then, a random permutation of it
is selected.
Based on the generated job list, machines are randomly selected from $PList_{ij}$ to
perform the operations.
This yields a valid solution to the problem; the codification structure makes any
verification unnecessary. Thus, in a very simple way, a starting point for the algorithm
has been achieved. It is important to notice that LP methods spend a substantial
amount of time just looking for feasible solutions, so this represents a considerable
effort saving.
After generating the solution, its fitness is evaluated and stored.
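A sketch of this generation process, under the double-list codification described above, could be as follows; names such as random_solution and the toy PList (with 0-indexed jobs and machines) are assumptions of this sketch:

import random

def random_solution(o, PList):
    # Job list: numbers 0..n-1 repeated o[i] times each, randomly permuted
    job_list = [i for i, oi in enumerate(o) for _ in range(oi)]
    random.shuffle(job_list)
    next_op = [0] * len(o)                 # next pending operation per job
    machine_list = []
    for i in job_list:
        machine_list.append(random.choice(PList[i][next_op[i]]))
        next_op[i] += 1
    return job_list, machine_list

o = [2, 3, 2]                              # three jobs
PList = [[[0, 1], [1]],
         [[0], [0, 2], [1, 2]],
         [[2], [0, 1, 2]]]
print(random_solution(o, PList))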
7.3 The importance of being neighbours
One of the most difficult decisions when implementing a S.A. algorithm is deciding
the definition of neighbour that will be used, as it will constitute the basis for the good-
neighbour search and the thermodynamical jumps. That definition should be coherent
with the codification chosen and represent similar solutions not only in code but in
nature (see section 4).
The choice made relates to the two halves of the codification: the job and machine lists.
7.3.1 Machine list minimum movement
The minimum movement in the machine list is produced by changing just one of the
machines to another one that is also capable of performing that operation. That seems
intuitive and simple. An example that illustrates the similar nature of two solutions that
differ in a minimum movement of their machine lists is shown in Figure 8 and Figure 9.
Figure 8. Representation of a certain solution (Gantt diagram: machine vs. time in minutes)
Figure 9. The same solution after a machine list movement (Gantt diagram)
Both schedules are very similar, and it seems that a change like that could be
considered minimum.
7.3.2 Job list movement
In this case, every movement can be thought of as a permutation. It is not obvious
which kind of permutation should be allowed (i.e. a block permutation or a single
operation permutation). This project uses the second definition, obtaining a definition of
neighbour alternative to the machine one.
For example, the schedule below could be modified by changing the priority
between the first and third operations.
Figure 10. Initial solution (Gantt diagram)
The modification introduced exchanges the priority of the first operations of jobs 6
and 7, returning the result represented in Figure 11.
Figure 11. The same solution after a job list movement (Gantt diagram)
7.3.3 Combining both movements
Considering that both movements are equally important, when searching for a
random neighbour there is the same probability of moving away in each of the two
directions.
Thus, the process for obtaining a neighbour is selecting a direction randomly –
machine or job – and then performing a simple movement.
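A possible Python rendering of this neighbour-generating process is sketched below. Note that after a job-list swap the sketch re-checks the machine assignments at the swapped positions; this implementation detail is an assumption made here to preserve feasibility:

import random

def op_index(job_list, pos):
    # Which operation of its job sits at position pos (0-based count)
    return job_list[:pos + 1].count(job_list[pos]) - 1

def neighbour(job_list, machine_list, PList):
    # One minimum movement in a randomly chosen direction (job or machine)
    jl, ml = job_list[:], machine_list[:]
    if random.random() < 0.5:                     # job-list direction
        a, b = random.sample(range(len(jl)), 2)
        jl[a], jl[b] = jl[b], jl[a]               # single-operation swap
        for pos in (a, b):                        # keep machines feasible
            i, j = jl[pos], op_index(jl, pos)
            if ml[pos] not in PList[i][j]:
                ml[pos] = random.choice(PList[i][j])
    else:                                         # machine-list direction
        pos = random.randrange(len(jl))
        i, j = jl[pos], op_index(jl, pos)
        ml[pos] = random.choice(PList[i][j])      # another capable machine
    return jl, ml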
7.3.4 The key factor
As stated in ‘Designing a Codification. Importance of the basis’, choosing the kind of
algorithm to solve a given problem should be a sound decision based on the problem's
own characteristics and on how the codification used relates to the basic optimization
mechanisms –global search, neighbour search, selection and combination–.
In the particular case of S.A., the first solution is obtained by global search and the
rest of the algorithm's operation is based on neighbour search. Consequently, it
is extremely important to choose a codification system and a method for obtaining
neighbours that respond to the real characteristics of the problem. Neighbours should
be different, but very similar in nature, so the distances between solutions are
minimum.
Actually, the definition of distance underlies that of neighbour, as the
neighbourhood of a solution should correspond to the set of points immediately close
to the one considered. Moreover, it should be possible to obtain any solution from any
other just by performing a finite sequence of movements from neighbour to neighbour.
The algorithm developed works with a definition of distance as the minimum number
of job and machine list changes that have to be performed in order to transform one
schedule into another. The large amount of possible combinations makes this
calculation computationally expensive, but it is not necessary for the performance of
the algorithm: the only thing needed is to create neighbours, and that process is fairly
fast and simple with the mechanism described.
The neighbours so obtained respond to these characteristics and play a basic role in
the algorithm.
7.3.5 Advantages of the neighbour creating system compared to classical alternatives
It may seem that the distance and neighbour definitions developed are simple and
intuitive. Surprisingly, simplicity and intuitiveness are not always considered a main
goal by algorithm designers.
For instance, most heuristics work with a binary codification, for which there are
ready-to-use tools available that allow implementing the algorithm saving time and
effort. If the codification system –based on two lists– were to be maintained but
translated into binary, the commonly used definition of neighbour would be changing
one 0 to 1 or vice versa in any of the lists. The implicit distance definition would be the
Hamming distance3.
That seems equally straightforward at first sight, but it presents the particular
disadvantage of spoiling some of the most positive characteristics of the codification
system developed. For example, a jobshop constituted by three jobs with two
operations each and only one machine (so that only a job list is necessary) could have
as a possible solution the following list:
3 The Hamming distance is the sum of the positions at which two binary chains differ. E.g.,
the distance between 1 0 0 1 and 1 1 0 0 would be 0+1+0+1 = 2.
1 2 2 1 3 3 → 01 10 10 01 11 11
A simple movement could change the first 0 into a 1:
11 10 10 01 11 11 → 3 2 2 1 3 3?
This is not a feasible solution, as the second operation of job 1 is not performed and
job 3 appears once more than needed. Obviously, changing a 1 for a 0 and a 0 for a 1
does not improve the situation:
11 10 10 00 11 11 → 3 2 2 0 3 3?
Thus, a complex checking process would have to be performed after every
modification in order to ensure that the solution obtained abides by the constraints of
the problem. This definition of neighbour is therefore not useful.
In another case, if the codification used was based on machine schedules –the one
that has been applied classically to this problem-, a possible solution is depicted in
Table 9:
M1: O(2,1) O(1,2)
M2: O(2,2) O(1,1)
M3: O(3,1) O(3,2)
Table 9. Example of codification based on machine schedules
and the intuitive definition of neighbour would be moving one operation to a
machine that is actually able to perform it, or exchanging two of them. For instance, as
expressed in Table 10.
M1: O(1,2) O(2,1)
M2: O(2,2) O(1,1)
M3: O(3,1) O(3,2)
Table 10. Neighbour solution
What is the problem with this definition? Firstly, the exchange above has created a
circular reference between jobs 1 and 2 (see section 4), so a complex checking process
is necessary here too in order to avoid this kind of infeasibility. It also has the
disadvantage of returning distant relatives: the neighbours so obtained can be very
different, which is intuitive, as the process can imply movements in both directions
and not simple ones (operations can be exchanged between two different machines,
which would mean a movement of two units in the job list direction and another two
in the machine list direction according to the selected definition).
This implies that the mechanism of neighbour search loses some of its sense
(neighbours may not be very similar in nature), so it becomes closer to a global search
element. This worsens algorithm efficiency and is a threat that should always be
considered when designing this kind of heuristic.
7.4 Too many doors to knock
As highlighted above, looking for better solutions in the neighbourhood before
allowing up-hill movements enhances algorithm efficiency.
The combinatorial explosion that poses the main difficulty in the Jobshop problem
resolution also appears in this neighbour calculation.
In an example with $n$ jobs composed of $o_i$ operations each, in which each operation
$j$ of job $i$ can be performed on $m_{ij}$ machines, the number of neighbours of a certain
point is given by the expression:
$\binom{c}{2} \cdot \prod_{i,j} m_{ij}$   (7-2)
where $c = \sum_i o_i$.
The results for a jobshop with total flexibility on 3 machines and composed of jobs
with 4 operations each are shown in Figure 12 and Figure 13.
Figure 12. Neighbourhood size in linear scale (of the order of $10^{21}$ for 10 jobs)
Figure 13. Neighbourhood size in logarithmic scale
There are too many doors to knock on, so only a random selection of neighbours is
checked. The choice of this parameter fixes a degree of directionality in the search.
If too many neighbours are checked, the algorithm loses efficiency. However, if
only a few of them are analysed there can also be a loss of efficiency, because the
opportunity of using interesting information –and its directionality possibilities– is
discarded.
7.5 Thermodynamical jumps. Choosing a cooling plan
If a better solution is found among the set of searched neighbours, that kind of
search is restarted from the new solution. Only if no improvement was achieved is the
mechanism of thermodynamical jumps, the one which characterises S.A., set in
motion.
In it, the possibility of up-hill movements is opened, so the algorithm does not get
trapped in local optima.
The expression that returns the probability of accepting such a worsening of the
solution is given by the Boltzmann equation:
$P(\Delta E \le \partial\,;\,t) = 1 - e^{-\partial/(k \cdot t)}$   (7-3)
Temperature is the key variable in this expression.
7.5.1 Cooling plans state of the art
Several functions to describe the cooling plan have been developed. The most
widely used ones, already mentioned in the introduction section, are described below:
Geometrical rule:
$t_{i+1} = c \cdot t_i, \quad c \approx 1,\; c < 1$   (7-4)
Hajek plan:
$t_k = \dfrac{c}{\log(1+k)}$   (7-5)
This plan is based on the concept of local optimum depth, defined as the increment
in the cost function necessary to escape from it to another optimum state. $c$ is at least
as large as the maximum depth and $k$ is the number of iterations performed.
Despite its theoretical perfection, it does not have a good practical performance.
Aarts & Korst plan:
$t_{k+1} = \dfrac{t_k}{1 + \dfrac{t_k \ln(1+\delta)}{3\,\sigma_{t_k}}}$   (7-6)
It takes into consideration the standard deviation $\sigma_{t_k}$ of the solutions obtained at each
temperature.
Lundy & Mees plan:
$t_{k+1} = \dfrac{t_k}{1 + \beta\, t_k}$   (7-7)
with relatively small values for $\beta$. They suggest the termination criterion to be
established as
$t \le \dfrac{\varepsilon}{\log\left[(S-1)/p\right]}$   (7-8)
where $S$ is the size of the solution space. This rule returns a solution that differs
from the global optimum by less than $\varepsilon$ with a probability $p$.
In [JOHNSON, 1989], adaptive cooling schedules are tested. These plans take into
account the fact that time spent by the algorithm at very high or low temperatures
seems to be unproductive. Based on a physical analogy, Kirkpatrick, Vecchi and Gelatt
argued that more time is needed between these temperatures, when the cost is
dropping rapidly, in order to achieve equilibrium. Johnson and his colleagues found
that the effect on solutions and the timing of this schedule were indistinguishable
from those of a geometric schedule with appropriately chosen parameters.
Other ideas that appeared in research, such as using better-than-random starting
solutions and lower-than-normal starting temperatures, have not been successful.
7.5.2 Industrial considerations. Interesting ideas about the causes of the success and failure of the described cooling plans
If the starting point for all metaheuristics is turning to nature, and the basis for
S.A. is the real metallurgical process of annealing, it seems a good idea to study the
temperature patterns that industrial annealing applications follow.
Annealing is applied to metals and alloys like steel and copper to enhance ductility
and relieve residual stresses. Consulting one of the most prestigious metallurgy
publications (The American Manufacturing Society Handbook), one discovers that,
although temperature evolution has been deeply studied for the tempering process,
there is virtually no research for the annealing case. It is assumed that just leaving the
product to cool in the air or inside the oven is enough.
The ‘cooling plan’ for such a simple process, assuming that the cooling takes place
sufficiently slowly and the temperature is uniform across the object, would correspond
to the differential equation:
$\dfrac{\partial Q}{\partial t} = C\,\dfrac{\partial T_o}{\partial t} = h \cdot S \cdot (T_o(t) - T_e(t))$   (7-9)
where $Q$ represents the evacuated heat, $C$ the object's specific heat, $h$ the convection
coefficient, $S$ the object's surface, and $T_o$, $T_e$ correspond to the object and environment
temperatures respectively.
This pattern can be described as a first order transitory process, depicted in Figure 14.
Figure 14. Temperature evolution followed by natural convection processes
The simplest and most intuitive cooling schedule should correspond to this pattern,
so imitating it is worth trying.
A cooling plan that tries to capture the essence of this process should discretise the
expression:
$\dfrac{\partial Q}{\partial t} = C\,\dfrac{\partial T_o}{\partial t} = h \cdot S \cdot (T_o(t) - T_e(t))$   (7-10)
The result comes in the form of:
$T_{t+1} = T_t - k\,T_t = (1-k)\,T_t = c\,T_t$   (7-11)
where $c < 1$.
The values for the temperature turn out to be, in the continuous case:
$T(t) = T_e + (T_o - T_e)\,e^{-t/\tau}$   (7-12)
and, in the discrete case:
$T_{t+1} = T_t - \dfrac{T_t - T_e}{\tau}$   (7-13)
It can be noticed that both plans practically coincide by mere inspection of Figure 15.
Figure 15. Comparison of the first order transitory process and the geometrical rule
Reviewing the existing research on cooling plans, it seems that plans more
sophisticated than the geometrical rule do not yield better performance. This
fact could be related to the metallurgical industry belief that controlling
temperature and forcing its evolution to differ from the natural convection pattern
would not improve the results of the annealing process.
7.5.3 Fixing the plan defining parameters
The initial value of the temperature should be high enough to allow all up-hill
movements to take place. A good way to fix this starting point is to require, for
example, that an up-hill jump of 100% of the initial fitness value be allowed with a
99% probability.
$P(\Delta E \le \partial\,;\,t) = 1 - e^{-\partial/T}$   (7-14)
$0.99 = 1 - e^{-InitialFitness/T_o}$   (7-15)
$T_o = \dfrac{-InitialFitness}{\ln(0.01)}$   (7-16)
The value of $c$ should be calculated assuming that the final value of the
temperature is 0, so $T_e = 0$:
$T_{t+1} = T_t - \dfrac{T_t - T_e}{\tau} = T_t\left(1 - \dfrac{1}{\tau}\right)$   (7-17)
By analogy:
$c = 1 - \dfrac{1}{\tau}$   (7-18)
and, if the properties of first order transitory processes are taken into account,
$\tau$ should be fixed noting that the time in which the temperature descends some
99.75% is $6\tau$. Thus, this parameter controls the number of iterations that the plan
will take to be completed.
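These rules can be condensed into a few lines; the following Python sketch (with assumed function names) fixes $T_0$ and $c$ from the initial fitness and the allowed number of iterations:

import math

def geometric_plan(initial_fitness, n_iterations):
    # T0 from 0.99 = 1 - exp(-InitialFitness / T0), equations (7-15)-(7-16)
    t0 = -initial_fitness / math.log(0.01)
    # full first-order decay (~99.75%) takes 6*tau, so tau = n_iterations / 6
    tau = n_iterations / 6
    c = 1 - 1 / tau               # equation (7-18)
    return t0, c

t0, c = geometric_plan(initial_fitness=150.0, n_iterations=1000)
print(t0, c)                      # roughly 32.6 and 0.994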
7.5.4 Dynamic cooling plan. A different proposal
The geometrical rule is simple and gives good results. It fixes the temperature, not
the jump acceptance probability. The proposed alternative plan does exactly the
opposite, dynamically adapting the temperature to match the desired up-hill
movement probability. From the definition of the probability of acceptance of an
up-hill movement of a given size relative to the fitness of the latest accepted solution,
it computes the temperature. That probability is updated in a geometrical fashion,
decreasing in the same way that the geometrical plan developed controls its duration
with the aid of $\tau$.
This plan is not as intuitive as the simple geometrical rule, and in many cases it is
less effective. For example, solving a jobshop with 10 jobs composed of 3, 4, 2, 3,
4, 3, 10, 5, 6 and 2 operations respectively on 3 machines, the results obtained were as
represented in Figure 16 and Figure 17.
Figure 16. Algorithm performance with different cooling plans (acceptable geometrical rule, poor geometrical rule, dynamical plan)
Figure 17. Temperature evolution with different cooling plans
It is important to note that the temperature resulting from these plans will differ
from its theoretical evolution, given that the plan is frozen while the neighbour search
is performed, flattening the temperature plot. The more neighbour search is done, the
more distortions are introduced into the plan. These searches also have an impact on
computing time, increasing the number of inspected and visited solutions (this can be
seen in the plots above, as they show algorithm performance and temperature
evolution versus accepted solutions).
Although under some circumstances, with good parameter tuning, the geometrical
rule can give the best results in both efficiency and effectiveness, if the parameters are
not well chosen the result underperforms the dynamical plan. The values suggested
above work adequately in the majority of cases, but there are others in which the
parameter tuning is difficult.
Nevertheless, the dynamical plan has proved to be quite robust, as it directly
controls the probability of a certain jump. Some values that have recurrently proved
adequate are:
$P(\Delta E \le 0.1\,;\,t=0) = 0.25$   (7-19)
The initial probability of accepting an up-hill movement of 10% is 25%.
$P(\Delta E \le 0.1\,;\,t+1) = c \cdot P(\Delta E \le 0.1\,;\,t)$   (7-20)
$c = 1 - \dfrac{1}{\tau} = 1 - \dfrac{8}{NIterations}$   (7-21)
The probability evolves following a geometrical rule.
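A sketch of this dynamic mechanism, inverting the document's acceptance expression $p = 1 - e^{-\partial/T}$ to obtain the temperature, could be as follows (the numeric values are the ones suggested above; names are assumptions of this sketch):

import math

def dynamic_temperature(current_fitness, p_accept, jump=0.1):
    # Invert P(dE <= jump*fitness; t) = 1 - exp(-dE/T) for the temperature T
    # that accepts a relative up-hill jump `jump` with probability p_accept.
    return -jump * current_fitness / math.log(1 - p_accept)

p = 0.25                          # initial acceptance probability (7-19)
c = 1 - 8 / 1000                  # geometric decay for 1000 iterations (7-21)
fitness = 150.0
for it in range(3):
    T = dynamic_temperature(fitness, p)
    print(f"iteration {it}: p = {p:.3f}, T = {T:.2f}")
    p *= c                        # probability evolves geometrically (7-20)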
As a conclusion, it can be said that the simple geometrical rule can give the best
performance, but there are other plans –like the alternative dynamical plan proposed
in this project– that are more robust when facing changes in problem size or in the
maximum allowed number of iterations.
7.6 Computation time vs. Problem Size
The fact that the starting point for designing a cooling schedule is the maximum
allowed number of iterations implies that a simple analysis of computation time vs.
problem size cannot be performed easily, as the actual time will depend on the initial
assumption of its value. Instead, a sound statistical analysis of variance, taking into
account not only problem size but also this initial assumption, should be carried out;
that lies beyond the scope of this project. Nevertheless, the relationship between
problem size and time consumed per iteration is shown in Figure 18. Some models
have been fit to this data and are represented in Figure 19. The results of these fits are
shown in Table 11.
Fit Sum of the squared error Adjusted R-Square
Linear polynomial 0.057 0.866
Square polynomial 0.0192 0.956
9-Degree polynomial 1.04 0.999
Exponential 0.0159 0.963
Table 11. Results of the curve fitting to computation time per iteration
Figure 18. Computation time per iteration (s) vs. problem size (number of jobs)
Figure 19. Curve fitting to computation time per iteration vs. problem size (linear, quadratic, 3-degree polynomial and exponential fits)
It can be seen that computation time per iteration can be expressed as a polynomial
function of problem size.
7.7 Conclusions
Simulated Annealing can be applied to the Jobshop problem with good results. In it,
two different elemental mechanisms appear:
The first solution is provided by a global search method.
The movements according to the Boltzmann equation are an expression of
neighbour search.
The developed algorithm performs that neighbour search in two different ways:
Searching a random set of neighbours looking for better solutions. The size of
this set represents a very important parameter to the algorithm performance.
Up-hill movements are allowed by Boltzmann processes, which characterize S.A.
algorithms.
The definition of neighbourhood has to be consistent with the problem nature and
the codification used. It is essential to find a process to generate neighbours that are at
a minimum distance from the initial point –with all that this implies– at a low
computational cost. Such a process has been described.
The evolution of the temperature is the other key aspect of the algorithm. The
published research seems to conclude that very sophisticated rules do not give better
results. Moreover, they are less intuitive and more difficult to control, and usually a
geometrical rule with certain parameters gives the same results.
Turning back to the industrial applications of annealing, it can be seen that
manufacturing engineers do not care about temperature evolution in this case, leaving
the metal to cool down in the open air or inside the oven. If the cooling takes
sufficiently long, the pattern followed by the temperature can be assimilated to a first
order transitory process which, when discretised, corresponds to a geometrical rule.
The characteristics of first order transitory processes can be applied to obtain the
rules for setting the parameters that were deduced above. They have proved to be
adequate in most cases, but the performance of the algorithm highly depends on the
parameters, and their tuning is difficult under certain circumstances.
A dynamical plan, which fixes the probability of acceptance of an up-hill movement
of a given height and makes it evolve in a geometrical fashion, is more robust when
facing changes. This proposed plan can be outperformed by a geometrical rule with
certain parameters but, given that the parameter tuning is sometimes difficult, that
robustness is a very interesting characteristic.
8 A Genetic Algorithm for the Jobshop Problem
As described in section 2.2, Genetic Algorithms (G.A.) imitate the mechanisms of
evolution to solve complex optimization problems.
This project successfully applies the theory of these methods, together with new
ideas that will be highlighted, to the Jobshop problem, achieving acceptable solutions
in both efficiency and effectiveness terms.
The main elements of the developed algorithm will be presented in the next
sections, explaining their most important features.
8.1 Creating the first generation
8.1.1 Originating individuals
The first generation of individuals represents the initial point for the algorithm. The
solutions are generated through a global search method (see ‘Heuristic methods from a
holistic point of view’), that is essentially the same as in the S.A. algorithm.
The codification system developed turns out to be very useful for the generation of
random solutions to the problem. As explained in ‘Designing a codification. Importance of
the basis’, each individual is encoded in two lists:
A joblist that stores the numbers from 1 to $n$ repeated $o_j$ times each, their first
appearance corresponding to their first operation and so on.
A machinelist that shows the machines that will perform the operations on the
joblist.
A valid joblist is just a permutation with repetition of the numbers from 1 to $n$
repeated $o_j$ times each. The algorithm builds a vector composed of 1 repeated $o_1$ times,
2 repeated $o_2$ times and so on. Then, a random permutation of it is performed.
Based on the generated joblist, machines are randomly selected from $PList_{ij}$ to
perform the operations.
This process produces a valid solution to the problem. The codification structure
makes any verification unnecessary. Therefore, a starting point for the algorithm has
been obtained in a simple way. It is important to notice that LP methods spend a quite
important amount of time just looking for feasible solutions, so this represents a
considerable effort saving.
8.1.2 Diversity is the key
When a population of a given species is small, it is more probable that changes in
the environment have a negative impact on it, leading it even to extinction. The
explanation for this phenomenon is that there probably are not enough different
individuals to have a considerable set of them that is resistant to these changes.
As in biological systems, it is important to obtain a population that is sufficiently
diverse in order to guarantee the correct performance of genetic mechanisms. That
diversity is linked not only to the randomness of the solution generating process
–which is guaranteed by the design of the mechanism, as explained above– but also to
population size.
The time spent on each iteration of the algorithm is a linear function of population
size, so increasing it can lead to a loss of efficiency.
Nevertheless, too small populations imply poor diversity, which generates
convergence to sub-optimal points or –if diversity enhancement methods are used–
slow convergence, with the subsequent efficiency loss.
The time spent on this G.A. can be estimated as
$t = k \cdot S \cdot N + c$
where $S$ represents population size and $N$ the number of iterations performed, as
all the operations that compose the algorithm are applied to each member of the
population. This fact was verified by measuring computation time for different
population sizes.
Figure 20. Computation time for 100 generations (s) vs. population size (individuals)
A representative plot of the performance of the algorithm is included in Figure 21.
Through this entire chapter, the same example case will be solved with different
parameter tunings. This case is composed of 3 machines and 10 jobs containing 3, 4, 5,
6, 1, 2, 4, 4, 3 and 1 operations respectively.
The performance tests applied show that growing population sizes decrease the
probability of getting trapped in local optima, as they act as a guarantee to preserve
diversity. Computation time for each generation is inevitably higher, but the results
can be better.
Of course, this is not a direct relationship, as randomness plays a vital role in both
the algorithm performance and the initial population creation. In the plot below, it is
possible to observe that, in some cases, this effect does not appear. This is a
consequence of the random construction of the initial population and subsequent
generations. It can be concluded that a higher population size allows avoiding local
optima and speeds up convergence at the same time, decreasing the dependence of the
final result on the initial conditions.
Figure 21. Performance vs. population size (total makespan vs. iterations for population sizes 3, 5, 11, 15, 21, 25 and 31)
Thus, once the maximum computation time for the algorithm has been decided, the
population size should be chosen as the maximum value that maintains a prudent
number of iterations. The tests performed suggest that acceptable values for the
number of iterations are 80-100, independently of the size of the problem.
8.2 Survival of the fittest
8.2.1 Natural selection
The main idea underlying Evolution Theory is that natural selection makes the
worst individuals disappear. The best ones will constitute the basis for the following
generation.
This principle of survival of the fittest has to be introduced into the genetic algorithm.
There are two main methods to replicate this mechanism:
The first one, known as the roulette wheel, is based on the simple fitness
proportional rule. This stipulates that the expected number of copies of each
individual in the next generation is proportional to its fitness. More specifically,
if the population has a size of N members, the selection process will be carried
out by building a roulette wheel with N slots. Each slot corresponds to a member
and its angular width is proportional to the fitness of that individual. The wheel
is spun and fixed pointers select the members that will generate the next
generation.
In the second one, known as the tournament, a random permutation of the
existing solutions is obtained. They are then split up into a definite number of
groups and the fittest member of each group is selected.
The expected result is the same for both methods. Nevertheless, it seems that the
first one is more efficient in selecting the best individuals, as randomness is introduced
only once in the process, while in the tournament method it appears twice. For this
reason, the roulette wheel method was chosen.
A target is built, making its N slots proportional to the relative fitness of each
individual. Then, some darts are thrown randomly, selecting the individuals that will
be used as parents for the following generation.
However, the tournament method is not completely discarded, as it is included in
the immigration process that will be described in section 8.4.
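A minimal sketch of the roulette wheel follows. It throws independent darts rather than using fixed pointers, which is a simplification of the mechanism described above, and it assumes positive, higher-is-better fitness values (a makespan would have to be inverted first):

import random

def roulette_wheel(population, fitness_values, n_parents):
    # Slot widths proportional to fitness (fitness must be positive)
    total = sum(fitness_values)
    cumulative, acc = [], 0.0
    for f in fitness_values:
        acc += f
        cumulative.append(acc)
    parents = []
    for _ in range(n_parents):
        dart = random.uniform(0, total)          # throw one dart
        idx = next(i for i, c in enumerate(cumulative) if dart <= c)
        parents.append(population[idx])
    return parents

print(roulette_wheel(["A", "B", "C", "D"], [4.0, 3.0, 2.0, 1.0], n_parents=4))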
8.2.2 Never taking a step back
When implementing a S.A. algorithm, it is easy to see that its main advantage
compared to non-heuristic methods –allowing up-hill movements– is also a
shortcoming, as the best solution visited can be lost in the process (of course, it can be
stored, but the paths it could lead to are not kept open).
The G.A. developed in this project eliminates this disadvantage –and therefore
improves both efficiency and effectiveness- by guaranteeing that the fittest individual
in each generation – the queen – does not disappear. It is identified and passed through
without any change to the following generation, somehow extending its life. This
individual, the best one in fitness terms, is also used in the asexual reproduction
process, imitating the role played by the queens of social insects.
8.3 Mating season
8.3.1 Finding a partner
Once the individuals that will constitute the parents of the following generation
–the ones that have survived– have been selected, it is mating time. They look for a
partner randomly, forming a random queue and pairing up with the closest
individual.
Each pair of solutions will combine to form two children that will have similar
characteristics to them, but whose fitness cannot always be expressed as a convex
linear combination of their parents (see ‘Heuristic methods from a holistic point of view’).
8.3.2 Breeding process
The breeding operators described in section 2.2.1.5 are not valid, as the codification
used is completely different, and it is essential to find a mechanism that does not
generate infeasible solutions –monsters–.
This process is based on the codification structure, that is to say, on the job and
machine lists.
8.3.2.1 Machine list breeding
From both lists, a matrix that stores which machine performs each one of the
operations is built for each parent. After that, four random points are generated, and
the submatrix they define is exchanged between the progenitors. The resulting
machine matrices are passed to the offspring. An example depicts the idea in Table 12,
Table 13, Table 14 and Table 15.
O1 O2 O3 O4
J1 M1 M3 M2 M1
J2 M2 M3 M1
J3 M1 M3
Table 12. Parent 1
O1 O2 O3 O4
J1 M2 M1 M1 M2
J2 M3 M2 M3
J3 M3 M2
Table 13. Parent 2
O1 O2 O3 O4
J1 M1 M1 M1 M1
J2 M2 M2 M3
J3 M1 M2
Table 14. Child 1
O1 O2 O3 O4
J1 M2 M3 M2 M2
J2 M3 M3 M1
J3 M3 M3
Table 15. Child 2
The feasibility of the offspring is guaranteed because both parents are valid
solutions.
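The submatrix exchange can be sketched as follows; entries are None where a job has fewer operations, and the function name is an assumption of this sketch:

import random

def machine_matrix_crossover(m1, m2):
    # Two random row cuts and two random column cuts define a submatrix
    # that is exchanged between the parents (None marks short jobs).
    rows, cols = len(m1), len(m1[0])
    r1, r2 = sorted(random.sample(range(rows + 1), 2))
    c1, c2 = sorted(random.sample(range(cols + 1), 2))
    child1 = [row[:] for row in m1]
    child2 = [row[:] for row in m2]
    for r in range(r1, r2):
        for c in range(c1, c2):
            child1[r][c], child2[r][c] = m2[r][c], m1[r][c]
    return child1, child2

p1 = [["M1", "M3", "M2", "M1"], ["M2", "M3", "M1", None], ["M1", "M3", None, None]]
p2 = [["M2", "M1", "M1", "M2"], ["M3", "M2", "M3", None], ["M3", "M2", None, None]]
child1, child2 = machine_matrix_crossover(p1, p2)

Since both parents assign only capable machines, any exchanged entry is also a capable machine, which is why the offspring are guaranteed to be feasible.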
8.3.2.2 Job list breeding
If a bare two-point crossover is tried here (see section 2.2.1.5), the generated children
will probably be monsters. The example below illustrates this for a jobshop composed
of 3 jobs with 3, 3 and 4 operations respectively.
Parent 1:
1 2 2 3 1 3 3 1 2 3
Parent 2:
2 3 3 1 2 2 1 3 1 3
Child 1:
1 2 2 3 1 3 1 3 1 3
Child 2:
2 3 3 1 2 2 3 1 2 3
In child 1 job 1 appears 4 times (once more than necessary) and job 2 only twice
(once less than necessary). Child 2 has the opposite problems.
Therefore, subsequent changes have to be introduced in order to guarantee
offspring feasibility.
Firstly, the problems with the children generated by two-point crossover are
identified as missing or extra operations. The extra ones are suppressed by eliminating
their first appearance in the bred block. The missing ones are inserted at the beginning
of the bred block, in the same order that they followed in the original sequence.
The result in the example would be:
Child 1:
1 2 2 3 1 3 2 3 1 3
Child 2:
2 3 3 1 2 2 1 3 1 3
In this way, all the produced individuals are feasible solutions and they actually
resemble their parents.
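The repair step just described can be sketched as follows, reproducing child 1 of the example above. The function name and argument layout are assumptions of this sketch, and when several different jobs are missing their insertion order here follows job number rather than the original sequence:

def repair_joblist_child(child, block_start, block_end, required_counts):
    # Fix a two-point-crossover child: drop extra operations from the bred
    # block, then insert the missing ones at its beginning.
    counts = {job: child.count(job) for job in required_counts}
    block = child[block_start:block_end]
    for job, need in required_counts.items():      # suppress extras
        while counts[job] > need:
            block.remove(job)                      # first appearance in block
            counts[job] -= 1
    missing = [job for job, need in required_counts.items()
               for _ in range(need - counts[job])]
    return child[:block_start] + missing + block + child[block_end:]

# child 1 from the example above: job 1 extra, job 2 missing
print(repair_joblist_child([1, 2, 2, 3, 1, 3, 1, 3, 1, 3], 6, 10,
                           {1: 3, 2: 3, 3: 4}))
# -> [1, 2, 2, 3, 1, 3, 2, 3, 1, 3]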
Selection and combination are the basis for evolution, but some other elements are
introduced into the algorithm to improve its performance. Looking back at the
exploration vs. exploitation dilemma exposed in section 2, they can be classified
according to the criterion they enhance. They will be introduced in the next sections.
8.4 Exploration enhancing mechanisms
8.4.1 Mutation
Mutation operators are applied to the individuals, introducing random changes
with a low probability, thus preventing early convergence and preserving population
diversity.
The mutation process used is basically the same as the one developed for neighbour
generation in the S.A. algorithm, except that the changes generated do not represent a
movement of the minimum distance (see section 7.3, ‘The importance of being
neighbours’) but can give origin to larger distances if the mutation probability is
sufficiently high. This essentially imitates the DNA duplication process in nature,
where the probability of introducing a change is fixed for every unitary copy made.
The total number of mutations introduced will correspond on average to the product
of that fixed probability times the chain size.
The mechanism is related to the halves of the codification: job and machine lists.
8.4.1.1 Mutation mechanism
8.4.1.1.1 Machine list mutation
Machine list mutation is produced by changing just one of the machines to another
one that is also capable of performing that operation. That seems intuitive and simple.
An example is included in Figure 22 and Figure 23.
Figure 22. Initial solution (Gantt diagram)
The result of changing the machine on which operation 1 of job 8 is performed
from 2 to 1 is included in Figure 23.
Figure 23. Solution after a machine list mutation (Gantt diagram)
8.4.1.1.2 Job list mutation
In this case, mutation is structured as a random permutation of two entries of the
job list. For example, the schedule in Figure 24 could be modified by changing the
priority between two operations, resulting in Figure 25.
As highlighted above, both mutation processes can happen more than once
depending on the mutation probability introduced.
Figure 24. Initial solution (Gantt diagram)
The modification introduced exchanges the priority of the first operations of jobs 6
and 7, returning the result in Figure 25.
Figure 25. Solution after a job list mutation (Gantt diagram)
8.4.1.2 Mutation probability
The mutation rate has to be tuned with care in order to take advantage of this
mechanism. On the one hand, if the value selected is too low, its effects will not be
noticed. On the other hand, if it is too high, the algorithm becomes more similar to a
pure global search method (see ‘Heuristic methods from a holistic point of view’) and loses
efficiency: it searches new paths without trying to go further along promising
directions. Moreover, the time spent in the mutation process can be expressed as
$t = S \cdot s \cdot m$   (8-1)
where $S$ represents population size, $s$ is the problem size in total number of operations
and $m$ symbolizes the mutation rate.
Thus, increasing the mutation probability leads to larger computation times.
Performance tests return results such as the ones included in Figure 26.
Figure 26. Performance for several mutation rates (0%, 0.1%, 1%, 5%, 10%, 25%, 50%)
The results suggest that values around 3-6% are preferable.
Nevertheless, the mutation rate does not need to be constant. As it is a method for
increasing diversity, it should strengthen when the population becomes uniform. This
project fixes a minimum and a maximum value for it and takes an intermediate value
according to a linear function of the standard deviation of the fitness values of the
parent generation, as sketched below. Boundaries that have proved to give good
results are [3%, 6%].
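A possible sketch of such a variable rate follows; the normalisation of the standard deviation by the mean fitness is an assumption of this sketch, since the text only specifies a linear dependence between the bounds:

import statistics

def mutation_rate(parent_fitness, low=0.03, high=0.06):
    # Uniform populations (small standard deviation) get rates near `high`;
    # the normalisation by the mean is an assumption of this sketch.
    sd = statistics.pstdev(parent_fitness)
    mean = statistics.mean(parent_fitness) or 1.0
    uniformity = max(0.0, 1.0 - sd / mean)     # 1 when all fitnesses equal
    return low + (high - low) * uniformity

print(mutation_rate([150, 152, 149, 151]))     # nearly uniform -> close to 6%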
8.4.2 Immigration
In natural systems, immigration is a vital source of diversity. It is well known that
isolated populations tend to become uniform and lose the diversity that is the key to
being able to face environmental changes.
8.4.2.1 Immigration rate
The value that establishes the number of immigrants that will be inoculated into the
population is an important performance parameter of the algorithm. The possible
effects of an incorrect choice are exactly the same as for the mutation rate: if it is too
low, the mechanism will not be profitable; on the contrary, if it turns out to be too
high, global search grows in importance with the subsequent loss of efficiency.
Some examples for different immigration rates for the same problem are displayed
in Figure 27.
Figure 27. Performance for several immigration rates (0%, 5%, 15%, 20%, 40%, 60%)
To put the results of the plot in the right perspective, it is important to notice that
computation time has a linear dependence on the immigration rate, as shown in Figure 28.
Figure 28. Computation time (s) vs. immigration rate (%)
The tests performed indicate that the immigration rate should take a value between
5 and 15% to obtain the best results. Again, it is possible to set a variable immigration
rate; the decision taken in this work was to generate rates that vary with the standard
deviation of the current generation between those recommended levels.
8.4.2.2 Immigration mechanism
Immigrants are considered individuals from a different population. They have
nothing to do with the current local members, so they are generated randomly in the
same way the first generation was.
Once they have been born and have abandoned their land, they are humorously
imagined to draw close to the existing individuals. They form a team while they
observe the local population to identify its weakest members. Among these, each
immigrant selects a victim and attacks it. The combat, which is an implementation of
the tournament selection method (see ‘Survival of the fittest’), finishes with the
immigrant either eliminated or occupying its prey’s place.
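A rough sketch of this immigration combat, assuming a makespan-like fitness where lower values are better, could be:

def immigration(population, fitness, random_individual, rate):
    # Each randomly generated immigrant fights one of the weakest locals
    # (a tournament); lower fitness value = better in this sketch.
    n_imm = max(1, int(rate * len(population)))
    weakest = sorted(range(len(population)),
                     key=lambda i: fitness(population[i]))[-n_imm:]
    for idx in weakest:
        immigrant = random_individual()
        if fitness(immigrant) < fitness(population[idx]):
            population[idx] = immigrant        # the immigrant takes the place
    return population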
8.5 Exploitation enhancing mechanisms
8.5.1 Introducing a new concept. Asexual reproduction
8.5.1.1 Social insects and evolution
The classical definition of fitness in biology, due to Wilson, has been ‘the size of the
offspring generated’. Nevertheless, there are some cases in nature in which the
individuals seem to have renounced reproducing themselves.
Hymenoptera, a group of social insects that includes bees and ants, have their
colonies composed of three different categories of members:
Working females (the most numerous), diploid individuals –with a double set of
genetic material– which come from a fecundated ovule (that is, from sexual
reproduction). They have their reproductive system inhibited, so they will never
generate offspring, but they spend a great part of their lives looking after the
queen’s children.
Males, haploid individuals –with a single set of genetic material– which come
from a non-fecundated ovule (through an asexual reproduction process called
parthenogenesis). If they are lucky, they will have an opportunity to fecundate
the queen in its ‘nuptial flight’, which usually happens once a year.
A queen, which is entirely devoted to the reproductive process. It stores the
semen of the chosen males in a deposit inside its body and controls the rate of
sperm release in order to define the timing of fecundation and correctly alternate
sexual and asexual reproductive periods.
For these species, the vast majority of males will never have the opportunity to
reproduce, and all working females are sterile. That comes to say that the individuals
have a very low –or completely null– fitness value.
Does that sound logical? In fact, social insects have spread to almost every habitat
on earth, from rainforests to deserts. They behave like a super-organism that is
surprising for the complexity of its coordination in matters like foraging –having even
inspired Ant Colony Optimization methods– or nest building. Definitely, they cannot
be considered badly fitted beings. What is the answer, then?
W. D. Hamilton proposed that Wilson’s fitness definition was simply wrong, and
introduced the concept of kin selection.
According to him, females share some 50% of their genetic material with the queen,
about 25% with their haploid brothers and a surprising 75% with their sisters
(assuming a 50-50% mix in sexual reproduction and that the males are also sons of the
queen). Therefore, when females look after their sisters they are also preserving their
own genetic material. And 75% is a rate higher than the 50% that a mother would
share with its sexually born child, and much higher than the 25% that spinster aunts
share with their nieces.
Thus, Hamilton’s fitness definition was ‘the amount of genetically related individuals left’.
This idea of sexual and asexual reproductive patterns could also be applied to G.A.
8.5.1.2 Introducing an asexual system
The mutation process explores every random direction. What if more resources
were allocated to exploring only the most promising direction?
In the parthenogenesis mechanism created, the queen reproduces itself in an asexual
way, generating individuals that could be considered its neighbours in a S.A. context
(actually, even the same code is used). Some results of including this system are
shown in Figure 29.
This time there is no problem with increasing the rate, as the total computation
time for one iteration can be expressed as:
$t = S \cdot s \cdot \left((1-p)\cdot(K_c + K_m \cdot m) + p \cdot K_p\right)$   (8-2)
where $S$ represents population size, $s$ is the problem size in total number of
operations, $p$ corresponds to the parthenogenesis rate and $m$ symbolizes the mutation
rate. $K_c$, $K_m$ and $K_p$ are the time constants for crossover, mutation and parthenogenesis
respectively. Considering $K_c > K_p$, as the crossover process is slightly more time
consuming, increasing the parthenogenesis rate decreases computation time linearly to
some extent.
[Figure: total makespan vs. iterations for parthenogenesis rates of 0%, 5%, 10%, 15%, 20%, 40%, 60%, 80% and 90%.]
Figure 29. Performance for several parthenogenesis rates
[Figure: computation time [s] vs. parthenogenesis rate [%].]
Figure 30. Computation time vs. Parthenogenesis rate
The results suggest that parthenogenesis rates around 15% are a good choice.
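To make the mechanism concrete, the following is a minimal sketch of one generation including the parthenogenesis operator. It is written in Python purely for illustration (the project tool itself is implemented in MatLab), it assumes a permutation coding with a makespan-returning fitness function, and the order crossover used here is only a stand-in for the project's own job-list operators.

```python
import random

def roulette(pop, fit):
    """Roulette-wheel selection for minimisation: weight each individual
    by its makespan gap to the worst member of the population."""
    worst = max(fit(s) for s in pop)
    weights = [worst - fit(s) + 1e-9 for s in pop]
    return random.choices(pop, weights=weights)[0]

def neighbour(sol):
    """S.A.-style move: swap two random positions of the list."""
    i, j = random.sample(range(len(sol)), 2)
    out = sol[:]
    out[i], out[j] = out[j], out[i]
    return out

def order_crossover(a, b):
    """Keep a random slice of parent a, fill the rest in parent b's order
    (a standard operator for permutation codings; only a stand-in for the
    thesis's job-list coding)."""
    i, j = sorted(random.sample(range(len(a)), 2))
    middle = a[i:j]
    rest = [g for g in b if g not in middle]   # valid for permutations
    return rest[:i] + middle + rest[i:]

def next_generation(pop, fit, p_parth=0.15, m_rate=0.05):
    queen = min(pop, key=fit)                  # elitism: best solution survives
    n_parth = int(p_parth * (len(pop) - 1))    # asexual share of the offspring
    kids = [queen] + [neighbour(queen) for _ in range(n_parth)]
    while len(kids) < len(pop):
        child = order_crossover(roulette(pop, fit), roulette(pop, fit))
        if random.random() < m_rate:
            child = neighbour(child)           # mutation reuses the same move
        kids.append(child)
    return kids
```

With p_parth = 0.15, roughly 15% of each generation is produced asexually from the queen, in line with the rate recommended above.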
8.1 Computation time vs. Problem size
It is interesting to analyse the evolution of computation time for increasingly high
problem sizes. The performance tests carried out gave origin to the results presented in
Figure 31.
[Figure: computation time (s) vs. problem size (number of jobs).]
Figure 31. Computation time vs. Problem size
If these results are analysed statistically, several different curves can be fitted to the data, as shown in Figure 32.
[Figure: the timing data together with linear, quadratic, 9-degree polynomial and exponential fits.]
Figure 32. Curve fitting to Computation time vs. Problem Size
The results of the different fits are presented in Table 16.
Fit                     Sum of squared errors   Adjusted R-square
Linear polynomial       30095.904               0.920
Quadratic polynomial    7998.051                0.979
9-degree polynomial     1.360                   0.999
Exponential             15448.775               0.959
Table 16. Results of the different fits
From these results, it can be inferred that the evolution of computation time follows a polynomial pattern.
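The comparison summarised in Table 16 can be reproduced along the following lines. This is only a sketch: the timing values are made-up placeholders standing in for the thesis measurements.

```python
import numpy as np

# Hypothetical (problem size, seconds) measurements, not the thesis data.
sizes = np.arange(3, 21, dtype=float)
rng = np.random.default_rng(0)
times = np.maximum(1.4 * sizes**2 + rng.normal(0, 5, sizes.size), 1.0)

def adjusted_r2(y, y_hat, n_params):
    """Adjusted R-square penalises fits with many parameters."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    n = len(y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - n_params)

for degree in (1, 2, 9):                      # linear, quadratic, 9-degree
    y_hat = np.polyval(np.polyfit(sizes, times, degree), sizes)
    sse = np.sum((times - y_hat) ** 2)
    print(degree, sse, adjusted_r2(times, y_hat, degree + 1))

# Exponential fit t = a*exp(b*s), linearised as log(t) = log(a) + b*s.
b, log_a = np.polyfit(sizes, np.log(times), 1)
y_hat = np.exp(log_a + b * sizes)
print('exp', np.sum((times - y_hat) ** 2), adjusted_r2(times, y_hat, 2))
```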
8.2 Conclusions
Genetic Algorithms include at least one element of each of the four basic optimization mechanisms: global search, selection, combination and local search (see 'Heuristic methods from a holistic point of view').
The tool developed generates a random population (performing a global search action) to constitute a starting point. The size of this population is one of the most important factors influencing the performance of the algorithm, the biggest affordable sizes being preferable.
The best solution – the queen – is always passed to the next generation, so the solution never worsens between consecutive iterations. The fittest individuals are selected, in an implementation of the roulette wheel method, to be the parents of the next generation. That is considered a selection mechanism. They form couples and breed, combining their information.
Mutation and immigration are introduced as means to maintain diversity and are
respectively further local and global search systems.
An original metaphor of the asexual reproduction in social insects is introduced, with important positive effects on algorithm performance, as it further explores the possibilities of the most promising direction at every moment. This could also be considered local search.
Advisable values for the rates related to these mechanisms, as well as their impact
on computation time, have been analysed and provided.
Having a population of solutions instead of working with a single one, and implementing all four elemental mechanisms, provide G.A. with a considerable robustness that, to the best of the author's knowledge, is not matched by other heuristic methods.
9 Turning the mixer on. Hybrids
After having created a simulated annealing and a genetic algorithm, it is interesting to combine them with the aim of creating hybrids that could show a better performance. There is no guarantee that the resultant amalgamation will surpass its composing elements, but designing a hybrid could be worth trying.
9.1 First one, then another
The most intuitive way of creating a hybrid is performing both algorithms sequentially. As the G.A. usually becomes exhausted well before the S.A. starts converging, this is the natural order in which they will be combined.
Both stages of the algorithm are taken directly from the mechanisms developed in this project, using the values advised for each parameter and the dynamic cooling plan proposed – because of its robustness – with a lower-than-usual starting temperature.
9.1.1 When is it time to make the switch?
The moment at which the shift is made should not be determined beforehand. On the contrary, the genetic stage should be left to reach a point of exhaustion, in which no improvement has been made for a long time.
That point is determined by setting two variables. The first one defines a minimum number of iterations that the G.A. will perform unconditionally before being allowed to give up. The second is the number of generations with no improvement that constitutes the exhaustion alarm threshold.
The first one is set according to the time allowed to solve the problem (notice that the time spent in that stage will be linearly dependent on the number of iterations). Independently of problem or population size, values bigger than 250 do not seem efficient. For both conclusions, see the section 'The mechanisms of natural selection applied to complex problem solving. A Genetic Algorithm for the Jobshop Problem'.
The exhaustion threshold is a more delicate parameter. Experience suggests that values around 75-100 generations are a good choice.
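A minimal sketch of this switching logic (the function names are hypothetical placeholders for the corresponding routines of the project):

```python
def sequential_hybrid(ga_generation, sa_stage, min_iters=250, patience=100):
    """Run the G.A. until exhaustion, then hand over to the S.A. stage.
    `min_iters` generations are performed unconditionally; `patience` is
    the exhaustion threshold in generations without improvement."""
    best, stale, iters = float("inf"), 0, 0
    elite = None
    while iters < min_iters or stale < patience:
        elite, current_best = ga_generation()   # one G.A. generation
        stale = 0 if current_best < best else stale + 1
        best = min(best, current_best)
        iters += 1
    return sa_stage(elite)                      # S.A. continues from the elite
```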
9.1.2 A family of solutions
The first hybrid proposed tries to take advantage of the robustness that working with a population of solutions, and not with a single one, confers to the G.A. For that reason, the S.A. leg of the hybrid is formed by a small elitist population of solutions taken directly from the most promising ones of the preceding stage.
The size of that population is the third important parameter of this hybrid. If it is too high, the algorithm can lose efficiency (the computation time of this stage is approximately linearly proportional to family size), but if it is too low, the advantages of working with a family of solutions can vanish. It is interesting to notice that the best solution achieved after the full run of the algorithm usually does not descend from the member that was the most promising in the previous phase, that one being just a local optimum. This fact highlights the importance of choosing a set of solutions instead of only one. Some examples of performance are provided in Figure 33, Figure 34 and Figure 35.
[Figure: total makespan vs. iterations.]
Figure 33. Hybrid performance with 4-member family
[Figure: total makespan vs. iterations.]
Figure 34. Hybrid performance with 6-member family
[Figure: total makespan vs. iterations.]
Figure 35. Hybrid performance with 12-member family
Experience shows that family sizes between 6 and 12 are a good choice for problem sizes of around 10 jobs, though they should be bigger for more complex problems.
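The S.A. leg can then be sketched as one independent annealing run per family member; `sa_run` and `fit` are placeholders for the project's simulated annealing routine and fitness evaluation.

```python
def sa_on_family(family, sa_run, fit):
    """Anneal each member of the elite family independently and keep the
    best final solution; it frequently descends from a member other than
    the previous-stage best, which was often just a local optimum."""
    finals = [sa_run(member) for member in family]
    return min(finals, key=fit)
```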
Although this hybrid guarantees better solutions, as it continues exploring when the G.A. is exhausted, it is far more time consuming, as the comparison of average times in Figure 36 shows.
Figure 36. Performance comparison between the first hybrid and the original G.A.
9.2 Non-darwinian G.A.
Even nowadays, some biologists find that there are wonderful facts in evolution that natural selection mechanisms can hardly explain. It is interesting to turn back to the roots of evolution theory in order to revise the fundamental schools of thought that tried to explain the evidence of evolution.
Jean-Baptiste Lamarck supported the idea of the inheritance of acquired characteristics. On the contrary, Alfred Russel Wallace rejected it completely, although even Darwin would not rule it out. August Weismann, the most prominent 'neo-Darwinian' of the time after Darwin, argued that hereditary material, which he called the germ plasm, was kept utterly separate from the development of the organism. This was seen by most biologists as an extreme position, however, and variations of neo-Lamarckism, orthogenesis ('progressive' evolution), and saltationism (evolution by 'jumps' or mutations) were discussed as alternatives.
Mutation is already implemented in most G.A. models, but the usual mutation operators follow a pattern of random jumps. Implementing a Lamarckian process, in which these jumps are orientated towards the direction of improvement, is outside the standard. Again, the idea is worth trying.
By means of that mechanism, individuals can develop positive characteristics and pass them on to their offspring. This therefore takes the form of a local search mechanism.
This idea is implemented inside the mutation operator, which now consists of a short S.A. run, so the chances are that it will actually improve the solution – diminishing the probability of a disadvantageous change that could spoil some of the computation effort needed to produce the following generations. Of course, this different mutation operator takes more time than the usual one, but the time savings mentioned can surpass those efficiency losses.
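A sketch of such a Lamarckian mutation operator, written as a short simulated-annealing run (the parameter values are illustrative, and `neighbour` and `fit` are placeholders for the project's own move and evaluation functions):

```python
import math, random

def lamarckian_mutation(sol, fit, neighbour, steps=20, t0=2.0, alpha=0.9):
    """Mutation as a short S.A. run: jumps are biased towards improvement
    instead of being purely random, so offspring tend to inherit
    'acquired' good characteristics."""
    current, t = sol, t0
    for _ in range(steps):
        cand = neighbour(current)
        delta = fit(cand) - fit(current)
        if delta < 0 or random.random() < math.exp(-delta / t):
            current = cand     # accept improvements, sometimes worsenings
        t *= alpha             # cool down quickly: this is a short run
    return current
```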
The performance of this second hybrid is compared to the former G.A. developed in
Figure 37.
[Figure: total makespan vs. iteration for the conventional G.A. and the non-Darwinian G.A.]
Figure 37. Non-darwinian mutation algorithm compared to the conventional G.A.
Computation time is higher for the hybrid, although it is faster at the beginning (when the non-darwinian mutation mechanism has more room for improvement).
[Figure: improvement per second (%/s) for the original version (1) and the non-Darwinian version (2).]
Figure 38. Compared performance between the non-darwinian mutation algorithm and the original G.A.
Experience suggests setting two different stages: in the first one the non-darwinian mechanism operates, while in the second the conventional mutation operator is used (as the room for improvement has been considerably reduced by then).
An example of compared results for the first iterations of a performance is shown in
Figure 39.
[Figure: total makespan vs. iteration for the conventional G.A., the non-darwinian hybrid and the compound G.A.]
Figure 39. Performance comparison between the conventional G.A., the non-darwinian hybrid and the compound version
As the figure indicates, the compound algorithm takes advantage of the efficiency of the non-darwinian hybrid in its first stages, but then goes back to the ordinary mutation operator in order to avoid getting trapped in local optima and to allow some members of the population to mutate in disadvantageous directions.
Figure 40. Performance comparison between the original, the non-darwinian and the compound version
The results in computation time are also good, the compound version outperforming both algorithms.
9.3 Conclusions
In summary, the considered amalgamations show relatively good performances when compared to their component parts.
The sequential one shows an excellent ability to escape from local optima. Working with a family dramatically increases the possibilities of conventional S.A. methods. Moreover, although some unsuccessful trials of using better-than-normal starting points with lower-than-normal starting temperatures have been documented in the literature, here that strategy has proved very convenient. As a drawback, computation time is higher than in other cases.
The non-darwinian hybrids proposed, especially the compound one, outperform the most classical version. It is important to note that the first G.A. developed already embeds a local search mechanism that focuses on the best directions (the parthenogenesis system). It may be surprising that including a more sophisticated version of this same basic element can actually lead to further improvements.
10 Anticipating machine failure. An approach to the robust Jobshop
10.1 Introduction
One of the most interesting research lines to extend this project would be considering a robust jobshop. These models were developed in an attempt to take into consideration that the assumption that all the machines will always be available to process the scheduled jobs is just an idealization. As stated in [JENSEN 1999], 'A quality solution which can be modified easily according to changes in the environment may be more valuable than an optimal solution that does not allow easy modifications'.
The complexities of such a model are quite clear: to return a valuable solution, all
the possible failures should be considered, taking also into account the probabilities of
each one.
10.2 Not so many combinations
Each failure is determined not only by the broken machine but also by the operation it was performing when the breakdown happened and by the type of failure. The latter falls into one of the two categories considered:
The operation has to be reprocessed in another machine
The job has been entirely spoiled
The occurrence of a failure is supposed to be infrequent, so the possibility of more than one failure happening at the same instant is discarded. Although different failures can happen sequentially, and this situation could actually be studied with the help of the developed tool, in principle only simple failures will be considered.
Thus, the total space of possible breakdowns can be expressed as:

$$F = 2\cdot\sum_{k=1}^{m} p_k = 2\cdot\sum_{k=1}^{m}\sum_{i=1}^{n}\sum_{j=1}^{o_i} P_{ijk} \qquad (10\text{-}1)$$

where the factor 2 accounts for the two failure categories considered, and:
$n$: number of jobs to be scheduled.
$o_i$: number of operations that compose job $i$.
$m$: total number of available machines.
$p_k$: maximum number of operations that can be performed on machine $k$. It is calculated as $p_k = \sum_{ij} P_{ijk}$, where $P_{ijk}$ is a binary parameter that takes the value '1' if and only if the $j$-th operation of job $i$ can be performed on machine $k$.
To illustrate problem size, an example where all jobs are composed of four
operations, performed on different machines, is included in Figure 41.
[Figure: total amount of possible failures vs. number of jobs to schedule.]
Figure 41. Number of possible failures vs. problem size
As the figure shows, the total amount of possible failures increases linearly with the number of jobs if the number of operations per job and the number of machines are kept constant.
The intricacies of solving a robust jobshop do not come from the amount of
potential breakdowns but from the implicit complexity of each individual problem (see
‘Brief considerations about jobshop problem complexity’).
10.3 Concentrating on what is relevant
The most classical criterion would be to exhaustively calculate the expected makespan from the probabilities of each failure and the total makespan resulting from each recomposed schedule, and then to choose the best option. The problem of combinatorial explosion makes evaluating all the solutions impossible, so only a definite set of solutions (the best ones found by the algorithm) will be analysed.
The proposed algorithm, however, suggests reducing the space of failures considered to only the most probable ones according to statistical studies, or to those foreseen to have a bigger impact – judged, for example, from the sparsity of the generated matrices, which gives an idea of the redundancy between machines, or from the unitary machine loads.
In fact, the patterns of machine failure will probably follow the Pareto rule. This
principle (also known as the 80-20 rule, the law of the vital few and the principle of
factor sparsity) states that for many phenomena, 80% of the consequences stem from
20% of the causes. Mathematically, if a set of participants share something, there will
always be a number k between 50 and 100 such that k% is taken by (100-k)% of the
participants. k may vary from 50 in the case of an equal distribution to almost 100 in
the case of a very small number of participants taking almost all of the available
resources. The number 80 is not special, but many systems actually have k lying
around this region of intermediate imbalance in distribution. So, if machine failure
follows this rule, studying around 20% of the most important breakdowns according to
our criteria will be enough to obtain a solution that is robust in 80% of the cases.
[Figure: Pareto distribution; 20% of the total possible machine failures account for 80% of the consequences.]
Figure 42. Pareto distribution
An alternative and beautiful illustration of this phenomenon is linked to Shannon's information theory. According to Shannon, a good measure of the information contained in given data $S = \{s_1, s_2, \ldots, s_n\}$ is what is known as Shannon entropy – or neg-entropy, as an ordered system provides more information:
$$I(S) = -\sum_{i=1}^{n} p(s_i)\cdot\log_2 p(s_i) \qquad (10\text{-}2)$$
The basic unit of information is called the bit and corresponds to the information carried by the occurrence of an event with a 50% probability.
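Formula (10-2) is straightforward to evaluate; a minimal sketch, applied to the equiprobable nine-value case of the example below:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits, formula (10-2)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Nine equiprobable values carry log2(9) ≈ 3.17 bits.
print(shannon_entropy([1 / 9] * 9))
```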
This formula can be applied to the case of a manager who wants to be informed only when the ratio between real and standard costs – which is known to follow a N(1, 0.2) and is considered to take only the values [0.96 0.97 0.98 0.99 1.00 1.01 1.02 1.03 1.04] – moves away from 1 by more than 0.02 units.
The amount of information that the cost ratio contains is, when applying formula (10-2) in the case of an equiprobable distribution, $\log_2 9 = 3.16$ bits. If the normal probabilities are taken into account, this value turns out to be 1.66 bits, which means that knowing the distribution followed by the data stores 3.16 − 1.66 = 1.5 bits.
Coming back to the manager, he would be informed just some 28% of the time, receiving an amount of information of 3.3 bits. Apart from that, he knows the distribution, so he has a total knowledge of 0.28 · 3.3 + 1.5 = 2.42 bits.
Following this criterion, he gets some 2.42/3.16 = 76.5% of the maximum amount of information in just 28% of the time, so his performance increases to 76.5/28 = 273%.
In the same way, the proposed model takes into account only the most probable failures, dramatically decreasing the amount of resources needed for the calculations (for example, to 20%) while still obtaining the most robust result most of the time (for example, in 80% of the cases).
10.4 Considering a real problem
This project also develops a very useful tool that takes advantage of the positive features of the algorithms developed to analyse the robustness perspectives of a proposed solution.
The structure of the failure analysis system is as simple as repeating the jobshop resolution with the new constraints imposed by each breakdown – assuming that the out-of-order machine will not be available within the time horizon considered, and noticing that, after its loss, one or more jobs may be impossible to finish – and taking note of the worsening of the solution that the failure would lead to.
The model returns an alternative schedule to the given one in each case and computes its expected fitness. In the case that there is no possible solution, it warns that one or more jobs cannot be accomplished and returns a list of the conflicting operations and an alternative schedule for the remaining jobs.
10.5 A simple example
An 8-job 3-machine problem is analysed below.
PList, the matrix that stores the possible machines that can perform each operation,
is included in Table 17:
        Op1    Op2    Op3    Op4
Job1    1      3      2,3
Job2    1      2      1      3
Job3    3      2
Job4    3      2      1,2
Job5    1      1      3      3
Job6    1      3      1
Job7    2
Job8    1,2    1,3
Table 17. Example of performance possibilities for each operation
The pseudo-optimal schedule obtained for these constraints could be represented in
a Gantt diagram as in Figure 43.
[Figure: Gantt diagram (machine vs. time in minutes) of the pseudo-optimal schedule; operations are labelled (job, operation).]
Figure 43. Resulting solution
If machine 1 breaks down immediately after finishing the first operation of job 5, spoiling that job completely, the best schedule is represented in Figure 44.
[Figure: Gantt diagram of the recomposed schedule; only the jobs that can still be performed appear.]
Figure 44. Resulting schedule after the considered failure
Note that only the jobs that can still be performed have been programmed. The algorithm also returns a list of the operations that cannot possibly be performed because no remaining machine has the required capability – (2,3), (5,2) and (6,3). That is the reason why the number of jobs appearing in Figure 44 is reduced.
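A minimal sketch of this feasibility check, assuming the capability data of Table 17 is available as a dictionary (PList is the MatLab structure of the project; the Python names, and the set of operations assumed already completed, are illustrative and inferred from the example):

```python
# Hypothetical mirror of Table 17: job -> set of capable machines per operation.
plist = {
    1: [{1}, {3}, {2, 3}],
    2: [{1}, {2}, {1}, {3}],
    3: [{3}, {2}],
    4: [{3}, {2}, {1, 2}],
    5: [{1}, {1}, {3}, {3}],
    6: [{1}, {3}, {1}],
    7: [{2}],
    8: [{1, 2}, {1, 3}],
}

def infeasible_operations(plist, failed_machine, completed=frozenset()):
    """List the pending operations that no surviving machine can perform."""
    conflicts = []
    for job, ops in plist.items():
        for op_idx, machines in enumerate(ops, start=1):
            if (job, op_idx) in completed:
                continue                      # finished before the breakdown
            if not machines - {failed_machine}:
                conflicts.append((job, op_idx))
    return conflicts

# Machine 1 breaks down; the operations assumed already completed are
# taken from the example of Figures 43 and 44 for illustration only.
done = {(1, 1), (2, 1), (5, 1), (6, 1)}
print(infeasible_operations(plist, failed_machine=1, completed=done))
# -> [(2, 3), (5, 2), (6, 3)]
```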
11 Introducing random time values. An approach to the Stochastic Jobshop problem
Throughout all the work presented, the parameters considered have been deterministic. This section explores the possibilities of taking into account the fact that processing, change, set-up and transportation times are actually random.
11.1 Randomness inside the Jobshop
If we want to introduce randomness into our problem, considering stochastic time values seems the logical starting point. This leads, in principle, to abandoning optimisation techniques and resorting to simulation tools.
However, in the case of metaheuristic algorithms this is not true: randomness can be introduced in every part of the mechanism.
In the stochastic model developed, this is accomplished by modifying the function that evaluates the fitness of each solution. It now performs a sufficient number of iterations – a few hundred can be enough – generating each time value from a probability distribution built around the deterministic value. Thus, the resolution is structured as one resolution of the whole jobshop in which a simulation loop is performed each time a solution has to be evaluated.
The chosen distribution was the Weibull, which is often used to model the time until a given technical device fails:

$$f(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-\left(\frac{x}{\lambda}\right)^{k}} \qquad (11\text{-}1)$$
If the failure rate of the device decreases over time, the parameters should be chosen so that k < 1 (resulting in a decreasing density f). If the failure rate is constant over time, k = 1, again resulting in a decreasing density f. If, on the contrary, it increases over time, k > 1 and the density f rises towards a maximum and then decreases forever. Manufacturers will often supply the shape and scale parameters for the lifetime distribution of a particular device; if they do not, these can be estimated from statistical failure data.
In this case, the setup, transportation and processing times are simulated from the initial value of the deterministic parameter using the expression

$$T = P\cdot\left(1 + 0.05\cdot\left(r - E(r)\right)\right) \qquad (11\text{-}2)$$
where:
$T$: simulated time value.
$P$: deterministic parameter for that value.
$r$: random component of the simulated time, obtained from a Weibull distribution W(1,2) – chosen because its shape seems reasonable – and represented in Figure 45.
As keeping the deterministic parameter as the mean of the result requires a random component with zero mean, and the chosen distribution has a positive mean, the distribution is displaced by subtracting this mean value.
As long as $0.05\cdot E(r) < 1$, the resulting $T$ will be positive, and this condition holds for the parameters considered.
In addition, the importance of this random component is made proportional to the size of the deterministic parameter, with 0.05 chosen as an acceptable weight (a very small weight would make the resolution essentially the same as the deterministic one, while a too high random component would mean that the problem is not worth solving, as the dispersion of the results would be too elevated). For example, the resulting distribution for a parameter of 15 is plotted in Figure 46.
[Figure: density of the random component drawn from the Weibull distribution; mean = 0.887.]
Figure 45. Random component from a Weibull distribution
[Figure: density of the resulting simulated time for P = 15; mean = 15.]
Figure 46. Resulting distribution in a specific example
11.2 Adapting the algorithm
This leads, in principle, to abandoning optimisation techniques in favour of simulation tools. Nevertheless, in this case it is much simpler: when evaluating the fitness of a solution, inside the function F_CodeToSchedule, the time values are simulated from the deterministic parameters instead of being taken directly. The evaluation is repeated a sufficient number of times – a few hundred are enough, as stated before. The statistical distribution of the results of these iterations is obtained and the algorithm continues to operate as usual with the mean values.
The computation time taken is, ceteris paribus, a linear function of the number of iterations performed to evaluate the fitness of a solution, but this number should not be reduced excessively, in order to maintain the accuracy of the result.
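A sketch of this simulation loop (a Python stand-in; `evaluate` plays the role of the deterministic F_CodeToSchedule evaluation, and 0.886 is the mean of the W(1,2) component of formula (11-2)):

```python
import numpy as np

rng = np.random.default_rng()

def simulate_times(p, shape):
    """Formula (11-2): perturb deterministic times P with a zero-mean
    Weibull W(1,2) component (E(r) = Gamma(1.5) ≈ 0.886)."""
    r = rng.weibull(2.0, shape)          # scale lambda = 1, shape k = 2
    return p * (1 + 0.05 * (r - 0.886))

def stochastic_fitness(evaluate, times, n_sim=300):
    """Repeat the deterministic evaluation with simulated time values and
    return the mean makespan over a few hundred iterations."""
    results = [evaluate(simulate_times(times, times.shape))
               for _ in range(n_sim)]
    return float(np.mean(results))
```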
Thus, the existing models have been successfully adapted to solve the stochastic
version of the Jobshop problem.
12 Further suggestions to the Jobshop responsible. Applying T.O.C.
The models developed in this project are more ambitious than merely suggesting pseudo-optimal scheduling solutions. One of the most interesting and original add-ins is the use of T.O.C. (Theory of Constraints), applied to the final schedules to obtain data that can be very useful to highlight the best ways in which the shop machinery could be improved.
12.1 Introduction to the Theory of Constraints
In the 1970s, Eliyahu Goldratt, an Israeli physicist, introduced in his well-known book 'The Goal' a new production management system that he called Optimised Production Technology (OPT). It was based on production flow balancing and bottleneck management. Subsequently, he extended that philosophy to all the company subsystems and created the Theory of Constraints as a management system centred on the system constraints.
As a starting point, Goldratt defines the goal of a production company as simply earning money – which should be measured through indices such as net profit, profitability or liquidity. As these parameters are too general and lack operational meaning, three other indices are presented as more useful: net revenue (money entering the company system), inventories (money retained inside the system) and operating expenses (money invested in transforming inventories into revenues). In order to progress towards its goal, the company should work to increase net revenue and lower operating expenses and inventories.
Consistently with T.O.C., Systems Theory and Game Theory, it can be stated that the bare summation of local objectives in each of the production subsystems does not lead to the global optimum. Moreover, the strength of the production chain is determined by the weakest of its links, and that weakness cannot be compensated by strengthening another subsystem: of the statistical fluctuations, only the negative ones – those that cause delays – are passed through, because of the dependence relations between steps (if a step is performed faster, its output will simply have to wait to be processed in the next step, as the machinery there cannot increase its speed).
Once those 'weak links' have been identified, all management decisions have to be subordinated to them. This gives origin to an iterative process, as the constraints can change as a consequence of the management decisions.
The main principles of this kind of management can be expressed as:
It is not production capacity that should be balanced, but production flow.
Bottlenecks ('weak links') determine production capacity. The other resources should follow their rhythm, in the same way that a boy-scout group finishes its walk without stopping to wait for the slowest boys, and without drifting away from the front, when the slowest child is left to lead the group.
Non-bottleneck resource usage should be determined by the other system constraints. If that is not done, stock will be produced that will queue waiting to be processed (and, as stated above, growing inventories prevent the company from advancing towards its goal).
A bottleneck resource constrains every other resource. 'One lost hour in a bottleneck resource is one lost hour for the whole system' [GOLDRATT 84].
One hour earned in a non-bottleneck resource is an illusion: the inventory produced will just have to wait to be processed in the bottleneck one.
Varying batch size can supply the production chain with flexibility and, if it is decreased, can lower processing time.
The implementation of these ideas leads to an increase in net revenue and a decrease in inventories and production costs; consequently, the company moves nearer to its goal of making money.
12.2 Bottleneck concept applied to Jobshop production
12.2.1 The main difference
Although the Theory of Constraints (T.O.C.) body of knowledge was created for production chains, its generalised concepts can also be applied to the jobshop. The main difference between these two kinds of environments is that, in a jobshop system, neither the work sequence nor the processing times are fixed beforehand. That provides it with the flexibility to successfully perform very different kinds of jobs, but at the price of losing efficiency.
Variable sequences and processing times result in variable bottlenecks; that is to say, bottlenecks vary with the schedule. And, as the jobshop structure was chosen precisely because flexibility was needed, bottlenecks will actually vary.
This makes the T.O.C. analysis more complex, but it can also provide manufacturing managers with very interesting information.
12.2.2 Traces of a bottleneck
What kind of analysis should be performed in order to correctly identify bottlenecks? With the pseudo-optimal schedules that the algorithm returns, it is an easy job.
As Goldratt stated, 'One lost hour in a bottleneck is one lost hour for the whole system' [GOLDRATT 84]. That means that a bottleneck resource corresponds to critical activities in the CPM sense. If the schedule is in fact optimal, bottleneck resources will be charged to their limit, while non-bottleneck ones will have lower charges. Thus, the first indicator for recognising a bottleneck is that it operates at its maximum charge.
It is also important to notice that inventories tend to accumulate just before bottlenecks, waiting to be processed. Consequently, work-in-process inventories should be analysed throughout the system to detect these accumulations.
12.2.3 T.O.C. Study
All these variables are easily computed with one of the add-ins of the algorithm, named 'T.O.C. Study'. When executed, the function returns the total charge of each machine, identifies bottlenecks and plots work-in-process inventories to give a quite useful picture of what is really happening on the shop floor.
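A simplified stand-in for such a study is sketched below, computing unitary charges and the bottleneck from a schedule given as (machine, start, end) triples; the actual MatLab add-in also tracks W.I.P. over time.

```python
def toc_study(schedule, horizon):
    """Unitary machine charges and bottleneck for a schedule given as a
    list of (machine, start, end) entries; a simplified stand-in for the
    'T.O.C. Study' add-in of the project."""
    busy = {}
    for machine, start, end in schedule:
        busy[machine] = busy.get(machine, 0.0) + (end - start)
    charge = {m: t / horizon for m, t in busy.items()}   # unitary charge
    bottleneck = max(charge, key=charge.get)             # most loaded machine
    return charge, bottleneck

# Toy example: machine 1 is busy over the whole horizon, machine 2 is not.
charge, bottleneck = toc_study([(1, 0, 40), (1, 40, 90), (2, 10, 60)], horizon=90)
print(charge, bottleneck)   # {1: 1.0, 2: 0.555...} 1
```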
To depict these ideas, an example of output will be presented.
Firstly, the optimal schedule for a given problem is expressed in the Gantt diagram of Figure 47.
[Figure: Gantt diagram (machine vs. time in minutes) of the optimal schedule obtained.]
Figure 47. Gantt diagram for the optimal schedule obtained
The T.O.C. Study outputs were:
• MachineCharge: [1.0000 0.8475 0.9904], a vector which expresses the unitary charge of each machine.
• BottleNeck: 1, as machine 1 is loaded to its maximum.
A graphical output of W.I.P. (work in process) is also returned, shown in Figure 48.
[Figure: W.I.P. charge over time on machines 1, 2 and 3.]
Figure 48. W.I.P. evolution for that solution
From the graph in Figure 48, it can be seen that although machine 1 is loaded to its maximum, there are quite high accumulations of W.I.P. before machines 2 and 3. Actually, the machines are quite balanced, so no important conclusions can be drawn. See Figure 49 for a different example output.
[Figure: Gantt diagram of a second example schedule.]
Figure 49. Gantt diagram for the solution
Its results were:
• MachineCharge: [1 0.3768 0.6680]
• BottleNeck: 1
[Figure: W.I.P. charge over time on machines 1, 2 and 3.]
Figure 50. Corresponding W.I.P. evolution
This time, the identified bottleneck is still machine 1, but the W.I.P. waiting to be processed on it is much bigger than the charge on the others. That means that, for this particular job load, machine 1 constrains the whole system. Thus, if that job load were representative of the jobshop and the machinery responsible were able to buy new machines, the best choice would be another machine of type 1.
Using this kind of graph effectively helps in taking decisions about increasing capacity. However, such studies should be performed on several different schedules that can be considered representative of the future needs of the shop, in order to obtain valuable information.
In addition, knowing the schedule bottleneck makes it possible to pay more attention to that resource – performing preventive maintenance, for example – given that every delay in that machine will cause an overall delay.
In conclusion, it is useful to study the bottleneck resources of the jobshop in order to pay more attention to them and to assist managers in taking investment decisions. This project supplies a tool that provides the information necessary to correctly identify these bottlenecks.
13 Can you possibly know how far you are from your goal, if you do not even know where it is?
As stated before, to obtain the efficiency that heuristic methods can provide, a renunciation is necessary: we will never know what the global optimum is. Accepting a solution with our eyes blinded, without knowing how good it is, seems quite risky. Should we have employed more time or resources to try to improve the current solution? Would that be economically reasonable, or is the amount of effort needed not worth making?
This problem may sound familiar. In branch-and-bound methods, a bound is assigned to each branch by solving a relaxed problem. As this bound would be the best possible solution for the branch, although there is no guarantee beforehand that it can be attained, the whole branch is eliminated if that value means no improvement over the best solution found so far. This implies that, even if not all the branches are checked, there is always a value for the error level of the approximated solution.
However, the Jobshop is different. The linear programming model developed in this project adds complexity on top of the non-negligible amount already present in the problem. In the author's opinion, this is clear because the problem is much easier to understand than the LP model, and that should not happen. One of the consequences is that the constraints in the model do not have a straightforward meaning. As a result, there is no sense in trying to solve a relaxed problem in which binary variables can take values intermediate between 0 and 1, for example: such a problem has no meaning, and its solution is therefore not a valid approximation.
Nevertheless, it is not time to give up. Although this kind of error delimitation has barely been explored in the research literature, this project makes an effort to provide improvement margins for the solutions obtained, by the use of a truly relaxed problem and by an innovative statistical analysis.
13.1 An obvious bound. True relaxation
The first thing that can be done is setting two completely obvious limits. They correspond to the solutions of the truly relaxed problem, that is to say, a problem that ignores some of the constraints of the problem itself – not of the LP model. In particular, the most suitable constraints to relax are:
All the operations of a job must be performed
A machine can perform only one operation at a time
If we consider the problem only from the job perspective, without taking the machines into account, the ideal situation is having each operation performed on the capable machine where the task takes the shortest time. The completion time of each job is then the summation of its operation processing times, and the ideal schedule finishes with the longest job completion time.
Regarding only the machines, another bound can be calculated. The minimum workload of each machine is the summation of the processing times of all the operations that can only be performed on it. That also represents a lower limit for each machine, and the maximum over all machines can be used as a second approximation.
Both bounds are very loose, but definitely more useful than the relaxed MIP.
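Both bounds are cheap to compute; a sketch on a hypothetical instance (the input containers are assumptions about how the data could be arranged):

```python
def lower_bound(proc_time, exclusive_ops):
    """Two obvious lower bounds on the makespan.

    proc_time[job]: minimum processing time of each operation of the job
    (each operation on its fastest capable machine).
    exclusive_ops[machine]: processing times of the operations that can
    only be performed on that machine.
    """
    job_bound = max(sum(ops) for ops in proc_time.values())
    machine_bound = max((sum(ops) for ops in exclusive_ops.values()),
                        default=0.0)
    return max(job_bound, machine_bound)

# Hypothetical 3-job, 2-machine instance.
proc_time = {1: [5, 7, 3], 2: [4, 6], 3: [8, 2, 5]}
exclusive_ops = {1: [5, 4, 8], 2: [7, 6]}
print(lower_bound(proc_time, exclusive_ops))   # -> 17
```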
A second method, based on statistics, is introduced in the next section.
13.2 Can you estimate it?
13.2.1 Introducing the problems of the data to fit
When studying how to fit density functions to data in statistics lessons, it seems that we always know beforehand the kind of function that will suit the data best, because of previous assumptions about the data source or just because the case is obvious. For example, we can be given a set of points with an empirical density function similar to the one in Figure 51.
[Figure: empirical density of the example data.]
Figure 51. Example of density function
At first sight the data seem quite symmetrical and unimodal... so fitting a Normal distribution could be a good idea, as shown in Figure 52.
[Figure: empirical density together with a fitted Normal distribution.]
Figure 52. Normal fit
The results are quite good and it can be inferred that the data actually came from a
N(2.87,0.89) with a 98% confidence level4.
Now it is simple to apply statistical methods to obtain more interesting information. However... what happens if we are not in that privileged position, that is to say, if we have no idea beforehand of which function could fit the data? In that case parameter estimation is not enough, as we do not know which parameters should be estimated, or how.
That is the case of the objective values obtained throughout the performance of the algorithm. They are not as easy to assimilate to a specific function as the example in Figure 51. To illustrate this affirmation, some data from one of the G.A. models developed in this project are plotted in Figure 53.
[Figure: the objective-record data together with Normal, Weibull, Lognormal, Gamma and Inverse Gaussian fits.]
Figure 53. Different widely used distributions applied to the data
As can be seen from inspection of the figure, there is no classical function that can be assimilated to these data.
4 The level of confidence was returned by a Chi-Square test function developed by the author. Both the graphs and the estimation were obtained with the MatLab distribution fitting toolbox. The data for the example were also generated using MatLab.
It seems that their nature (multimodal, asymmetrical, irregularly spaced) will probably not fit into any density function. This has been assumed by most heuristic designers. However, it is not true.
13.2.2 No parameters? A different kind of density distribution
The function that has been successfully used to fit the objective function data is different from the other ones represented in the plots above. The key aspect is that it has no parameters.
Some disadvantageous aspects of classical fitting will be highlighted first, and the solution provided by non-parametric distribution fitting will then be analysed.
The first issue is that the whole aspect of a histogram may vary depending on how we select the boundaries of the bins. We could imagine that smaller bin sizes allow a more precise representation, but beyond a certain level this results in empty bins. It is difficult to determine the optimal smoothing level. To depict this idea, an example is provided in Figure 54.
[Figure: four histograms of the same objective-record data with different bin sizes.]
Figure 54. Examples of bins with different sizes
However, many alternatives are available when trying to avoid histograms. One of the simplest and most well known is the naïve estimator. It approximates the density function without setting bin limits in any way, just by placing a box of height $(2nh)^{-1}$ and width $2h$ on each observation and summing all the boxes (which gives a total area of 1). Smoothness is controlled by the choice of $h$. The naïve estimator has the disadvantage of being discontinuous – with jumps at every step. A formal expression of this estimator is:

$$\phi(x) = \frac{1}{2nh}\sum_{i=1}^{n}\Pi\!\left(\frac{x - x_i}{h}\right) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{h}\,w\!\left(\frac{x - x_i}{h}\right)$$

where $\Pi$ represents the rectangle or boxcar function and $w = \Pi/2$ the corresponding weight function.
The kernel density estimate overcomes the difficulty of discontinuity by using an estimation that takes the form:

$$\phi(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$

where $K(x)$ is usually chosen as a symmetric probability density function – with the constraints that this imposes, such as $\int K(x)\,dx = 1$.
The estimation so obtained is continuous and differentiable. Again, $h$ is used as a smoothing parameter or bandwidth. As stated in [AGNEW 04], these estimates suffer from some drawbacks, especially when applied to long-tailed distributions: the estimate tends to be noisy in the tails, as data are usually more sparse there. This effect can be mitigated by lowering $h$, but the price to pay is a loss of resolution in the centre of the distribution – missing interesting details such as multiple modes.
Further improvements can be made to this estimation. One of them is adapting the amount of smoothing to the local density of the data. The nearest neighbour method controls the level of smoothing based on the size of the box required to contain a given number of observations. It calculates the distance from each point to the nearest observation, $d(x) = \min_i |x - x_i|$; a generalization of this concept of distance is the $k$-th nearest neighbour distance $d_k(x)$. The estimation expression results:

$$\phi(x) = \frac{1}{n\,d_k(x)}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{d_k(x)}\right)$$
13.2.3 Applying non-parametric density function estimation to the data provided by the
algorithms
When comparing the results of the fit provided by this kind of estimation with the ones shown in the last section, it becomes evident that this is a much more suitable approach.
[Figure: empirical CDF of the objective-record data together with the non-parametric fit.]
Figure 55. Example of non-parametric fit
Moreover, the Kolmogorov-Smirnov test function returns the following information:
P = 0.9000
Maximum difference between the empirical and the estimated CDF: 0.1369
K-S statistic values for the 90, 95 and 99% confidence levels: [0.3107 0.2591 0.2336]
As can be seen, the approximation is quite valid, so it is possible to estimate the minimum value of the distribution. It is obtained as the point $x$ for which

$$\int_{0}^{x} \phi(t)\,dt = l$$

where $l$ is a small parameter that in this case has been fixed to 0.005.
For the example case, the estimate is 83.86, which means that the best schedule obtained (with a total makespan of 84.64) can be, in the worst case, at a 0.91% distance from the optimum.
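The same kind of analysis can be sketched with standard tools. Here scipy's Gaussian kernel density estimate stands in for the project's non-parametric fit, and the data are synthetic placeholders rather than actual objective records:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Synthetic, bimodal stand-in for the objective-record data of the G.A.
makespans = np.concatenate([rng.normal(88, 1.0, 150), rng.normal(92, 1.5, 100)])

kde = stats.gaussian_kde(makespans)          # non-parametric (kernel) fit

# Goodness of fit: Kolmogorov-Smirnov test of the data against the fit's CDF.
grid = np.linspace(makespans.min() - 5, makespans.max() + 5, 2000)
cdf = np.cumsum(kde(grid)) * (grid[1] - grid[0])   # numerical CDF of the fit
ks = stats.kstest(makespans, lambda x: np.interp(x, grid, cdf))
print(ks.statistic, ks.pvalue)

# Minimum-value estimate: the point x where the fitted CDF reaches l = 0.005.
l = 0.005
x_min = grid[np.searchsorted(cdf, l)]
print(f"estimated minimum makespan: {x_min:.2f}")
```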
13.3 Conclusions
The error that heuristic methods incur can be estimated not only by solving relaxed problems – noticing what relaxation really means in the context of the problem – but also by density function estimation.
For this second method, a non-parametric function is fitted. A function that develops such a fit and performs a Kolmogorov-Smirnov test has been developed. In addition to data that evaluate the goodness of the estimation, a minimum value is estimated.
This way, it is possible to know how far we are from our goal.
14 Next steps
Having accomplished the project objectives, it becomes easier to identify the possible lines of research that it would be logical to follow. Most of them have already been introduced, but it is interesting to state them briefly and explicitly in order to define future work.
A fuzzy jobshop: it is maybe one of the most interesting lines of research, as it reflects reality better than other models. The idea is to introduce imprecise time data, preferably modelled as trapezoidal numbers, as they are more easily understood and defined. As a consequence, the algorithm should be able to work with polygonal numbers, as these appear when operating on trapezoidal ones. Specific arithmetic operators have already been designed for these numbers, although an efficient algorithm for calculating fuzzy maximums and minimums has not been developed yet, and it would be an essential part of the algorithm. The core of the models could remain the same, but the way in which fitness is evaluated should give a fuzzy result that would be defuzzified according to a consistent function in order to calculate survival probabilities. In addition, the algorithm should be able to dynamically refresh the optimal schedule as time goes by and the information becomes more detailed – decreasing the imprecision in the data.
A more general robust jobshop: in the model developed in this project, it is assumed that the broken-down machine will take longer to repair than the total makespan of the schedule. A possible generalization could contemplate the machine becoming available again at an intermediate time, before the schedule is completed.
The assembly jobshop remains practically unexplored in the research. The complexities of its precedence constraints make the definition of an ideal codification impossible (in the sense explained in section 4.1.1), so an efficient checking process, as well as a more complex random solution generator, should be developed.
A buffer-constrained jobshop could be considered by further using the already developed T.O.C. tool, as it calculates the amount of work in process waiting to be processed on each machine. The solutions that exceed a determined level would be eliminated.
Implementation of Genetic Expression Models: this new heuristic paradigm – note that it is different from Genetic Algorithms – is introduced in the annexes but has not been implemented to test its performance. It should be tried in order to study its possibilities.
References
[AGGARWAL 1996] Aggarwal, C., J. B. Orlin and R. P. Tai. 1996. 'Optimized Crossover for the Independent Set Problem'. Operations Research, Vol. 45.
[AICKELIN 2004] Aickelin, U., E. Burke and A. Mohamed. 2004. 'Investigating Artificial Immune Systems for Job Shop Rescheduling in Changing Environments'. Poster Proceedings of ACDM 2004, Engineer's House, Bristol, UK.
[AIDYN 2003] Aydin, M. E. and T. C. Fogarty. 2003. 'Modular Simulated Annealing Algorithm for Job Shop Scheduling running on Distributed Resource Machine (DRM)'. South Bank University, SCISM, 103 Borough Road, London, SE1 0AA, UK.
[AKERS 1956] Akers, S. B. 1956. 'A Graphical Approach to Production Scheduling Problems'. Operations Research, 4.
[ANILY 1987] Anily, S. and A. Federgruen. 1987. 'Ergodicity in Parametric Nonstationary Markov Chains: An Application to Simulated Annealing Methods'. Operations Research, Vol. 35.
[CERNY 1985] Cerny, V. 1985. 'A Thermodynamical Approach to the TSP: an Efficient Simulation Algorithm'. J. Optim. Theory Appl. 45, 41-51.
[COFFMAN 1976] Coffman, E. G. 1976. 'Computer and Job-shop Scheduling Theory'. John Wiley, New York.
[DE CASTRO 2002] de Castro, L. N. and J. Timmis. 2002. 'Artificial Immune Systems: A New Computational Intelligence Approach'. Springer-Verlag.
[DORIGO 1996] Dorigo, M., V. Maniezzo and A. Colorni. 1996. ‘The ant system:
Optimization by a Colony of Cooperating Agents’ IEEE Transactions on Systems, Man and
Cybernetics-Part B, 26(1): 29-4.
[DUDEK 1964] Dudek, R. A. and O. F. Teuton. 1964. 'Development of M-Stage Decision Rule for Scheduling n Jobs through m Machines'. Operations Research, Vol. 12.
[DUDEK 1992] Dudek, R. A., S. S. Panwalkar and M. L. Smith. 1992. 'The Lessons of Flowshop Scheduling Research'. Operations Research, Vol. 40.
[EL GAMAL 1987] El Gamal, A. A., L. A. Hemachandra, I. Shperling and V. K. Wei. 1987. 'Using Simulated Annealing to Design Good Codes'. IEEE Trans. Inform. Theory 33, 116-123.
[FAYAD 2003] Fayad, C. and S. Petrovic. 2003. 'A Genetic Algorithm for the Real-World Fuzzy Job Shop Scheduling'. Engineering and Physical Sciences Research Council (EPSRC), UK, Grant No. GR/R95319/01. Sherwood Press Ltd, Nottingham.
[FRENCH 1990] S. French. 1990. ‘Sequencing and Scheduling. An Introduction to the
Mathematics of the Job-Shop’. Ellis Horwood series on Mathematics and Its Applications.
Chelsea College, University of London.
[GEMAN 1984] Geman, S. and D. Geman. 1984. 'Stochastic Relaxation, Gibbs Distribution and the Bayesian Restoration of Images'. IEEE Proc. Pattern Analysis and Machine Intelligence PAMI-6, 721-741.
[GAHETE 05] Gahete Díaz, J. L., ‘Observatorio de Internet. Modelo de Supervisión,
Optimización y Mejora Global del Encauzamiento de Datos entre Sistemas Autónomos’, thesis.
[GONÇALVES 2002] Gonçalves, J.F, J.J. de Magalhaes and M. Resende. 2002. ‘A
Hybrid Genetic Algorithm for the Job Shop Scheduling Problem’. AT&T Labs Research
Technical Report TD-5EAL6J.
[GONZALEZ 1977] Gonzalez, T. and S. Sahni. 1977. 'Flowshop and Jobshop Schedules: Complexity and Approximation'. Operations Research, Vol. 26.
[GOLDRATT 1984] Goldratt, E. 1984. 'The Goal: A Process of Ongoing Improvement'. North River Press.
[HINTON 1984] Hinton, G. E., T. J. Sejnowski and D. H. Ackley. 1984. 'Boltzmann Machines: Constraint Satisfaction Networks that Learn'. Report CMU-CS-84-119, Department of Computer Science, Carnegie-Mellon University, Pittsburgh.
[HOLLAND 1975] Holland, J.H. 1975. ‘Adaptation in Natural and Artificial Systems’.
University of Michigan Press. Ann Arbor, MI.
[HOLTHAUS 2002] Holthaus, O. and C. Rajendran. 2002. 'A Study of the Performance of Scheduling Rules in Buffer-constrained Dynamic Flowshops'. International Journal of Production Research, Volume 40, Number 13, pp. 3041-3052.
[IGNALI 1965] Ignall, E. and L. E. Schrage. 1965. 'Application of the Branch-and-Bound Technique to Some Flowshop Scheduling Problems'. Operations Research, Vol. 13.
[JACKSON 1955] Jackson, J. R. 1955. 'Scheduling a Production Line to Minimize Maximum Tardiness'. Research Report 43, Management Science Research Project, University of California, Los Angeles, USA.
[JENSEN 1999] Jensen, T. and T. K. Hansen. 1999. 'Robust Solutions to Job Shop Problems'. Proceedings of the 1999 Congress on Evolutionary Computation, pp. 1138-1144.
[JEPSEN 1983] Jepsen, D. and D. Gelatt. 1983. 'Macro Placement by Monte Carlo Annealing'. In Proc. International Conference on Computer Design, 495-498, Port Chester, N.Y.
[JOHNSON 1954] Johnson, S. M. 1954. 'Optimal Two- and Three-stage Production Schedules with Setup Times Included'. Naval Research Logistics Quarterly, 1.
[JOHNSON 1989] Johnson, D. et al. 1989. 'Graph Partitioning by Simulated Annealing'. Operations Research, Vol. 37.
[KIRKPATRICK 1984] Kirkpatrick, S. 1984. 'Simulated Annealing: Quantitative Studies'. J. Statist. Phys. 34, 975-986.
[KIRPATRIC 1983] Kirkpatrick, S., C. Gelatt and M. Vecchi. 1983. 'Optimization by Simulated Annealing'. Science 220, 671-680.
[LESH 2003] Lesh, N., L. B. Lopes, J. Marks, M. Mitzenmacher and G. T. Schafer. 2003. 'Human-Guided Search for Jobshop'. In Proceedings of the 3rd International NASA Workshop on Planning and Scheduling for Space, Houston, Texas.
[LOMNICKI 1965] Lomnicki, Z. 1965. 'A Branch-and-Bound Algorithm for the Exact Solution of the Three Machine Scheduling Problem'. Operations Research, Vol. 16.
[MESGHOUNI 2004] Mesghouni, K, S Hammadi and P. Borne. 2004. ‘Evolutionary
Algorithms for Job-Shop Scheduling’ Int. J. Appl. Math. Comput. Sci., 2004, Vol. 14, No. 1,
91–103.
[METROPOLIS 1953] Metropolis, N. and A. Rosenbluth et al. 1953. 'Equation of State Calculations by Fast Computing Machines'. J. Chem. Phys. 21, 1087-1092.
[NAKANO 1991] Nakano, R. and T. Yamada. 1991. 'Conventional Genetic Algorithm for Jobshop Problems'. Proceedings of the Fourth International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, USA, pp. 477-479.
[REEJA 2000] Reeja, M. K. and C. Rajendran. 2000. 'Dispatching Rules for Scheduling in Assembly Jobshops'. International Journal of Production Research, 38 (10): 2349-2360.
[ROWEN 1985] Rowen, C. and J. L. Hennessy. 1985. 'A Flexible Logic Implementation System'. In Proceedings 22nd Design Automation Conference, 748-752, Las Vegas, N.M.
[SMITH 1956] Smith, R. D. 1956. ‘Various Optimizers for Single-stage Production’.
Naval Research Logistics Quarterly 59-66.
[SMITH 1967] Smith, R. D. and R.A. Dudek. 1967. ‘A General Algorithm for Solution of
the n-Job m-Machine Sequencing problem of the Flowshop’. Operations Research, Vol. 1.
[VECCHI 1983] Vecchi, M.P and S. Kirkpatrick. 1983. ‘Global wiring by Simulated
Annealing’. IEEE Trans. Computer-Aided Design CAD-2, 215-222.
[WHITE 2003] White, T., S. Kaegi and T. Oda. ‘Revisiting Elitism in Ant Colony
Optimization’ Proceedings of the GECCO 2003, Genetic and Evolutionary Computation
Conference, Chicago, IL, USA, July 12-16, 2003.
[YAMADA 1992] Nakano, R. And Yamada, T. 1992. ‘A Genetic Algorithm Applied to
Large-scale Jobshop Problems’, in R.Männer et al. Parallel Problem Solving from Nature,
Amsterdam, 1992, pp.281-290.
List of Abbreviations Used
J.S.P.: Jobshop Problem
T.S.P.: Travelling Salesman Problem
P.: Polynomial time problems
N.P.: Nondeterministic Polynomial time problems
G.A.: Genetic Algorithm
P.M.C.: Partially Mapped Crossover
S.A.: Simulated Annealing
A.I.S.: Artificial Immune Systems
L.P.: Linear Programming
T.O.C.: Theory of Constraints
W.I.P.: Work in Process
Annexes
A Genetic Expression Models
A Genetic Expression Models. An Introduction.
As part of a constant search for ways to adapt to the environment, organisms regulate the way in which their genes are expressed. The information contained in the genes of a living being (its genotype) does not completely determine its actual characteristics (its phenotype).
Adaptation occurs through the mechanisms of evolution, which produce permanent changes in the genetic library, but that is not its only form. Highly evolved species such as human beings transform the environment to make it more suitable for themselves. On a smaller scale, cells regulate the expression of the genetic information they carry: the proteins encoded in the chromosomes are produced only when they are needed and in the right amounts.
This mechanism is much faster than evolution and represents a dynamic real-time equilibrium in which the concentration of each protein is kept at the necessary level according to external and internal variables. This is achieved through a complex mechanism involving inducers and repressors. Gene expression is the basis of cellular differentiation, and it plays a crucial role in the organism’s adaptability and versatility.
This system could be adapted to represent a new paradigm in metaheuristic optimization methods. Some of the ideas that could shape the metaphor are presented below.
The main actor in the system could be the polymerase enzyme, which under the right conditions binds to a promoter (the DNA region that marks the starting point of a sequence codifying a protein) and transcribes it into messenger RNA. The messenger RNA then leaves the nucleus to be translated into a protein in the ribosomes.
Both the transcription and the translation processes are regulated. Nevertheless, only transcription regulation will be considered, for the sake of simplicity. Moreover, just a single cellular function will be taken into account. How suitable a protein is for that role will be determined by the fitness value of the solution it codifies.
An infinite DNA chain codifying different proteins (the solutions to the problem) is the starting point. In diploid organisms there are two different sequences codifying the same character, known as alleles. In some cases only one allele is expressed (the dominant one), while in others the result is a combination of both.
Accordingly, a double chain of genes is generated. In some cases a combination of the two will be expressed, while in others only the dominant character will arise. For optimization purposes, the relative dominance of the alleles could depend on the fitness of the solutions that the sequences generate: if the fitnesses are very different, only the best one is considered, while if they are similar a hybrid of the two arises.
For each sequence, the messenger RNA concentration and the inhibition/induction information are recorded.
An inhibition signal could be triggered when another protein is found to be much more effective. That would decelerate and eventually stop the transcription of that chain, and could even reduce its cellular concentration.
An induction signal is generated in the opposite situation, that is, when the protein proves effective. That could set in motion a mechanism that looks for similar proteins and starts transcribing them.
New sequences are tested all the time. This could lead to finding the best way to perform the cellular function, which would correspond to the optimal solution of the problem.
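Since this annex stops at the level of metaphor, the following Python sketch is only one possible reading of it, not an implementation taken from this project. The names gene_expression_search, random_solution, mutate and fitness are illustrative, and fitness is assumed to be a positive quantity to be minimized, such as a makespan.

    import random

    def gene_expression_search(random_solution, mutate, fitness,
                               pool_size=20, iterations=500):
        # Each entry plays the role of a gene: a candidate solution plus an
        # "expression level" standing in for its messenger-RNA concentration.
        pool = []
        for _ in range(pool_size):
            s = random_solution()
            pool.append({"sol": s, "fit": fitness(s), "expr": 1.0})
        best = min(pool, key=lambda g: g["fit"])
        for _ in range(iterations):
            threshold = min(g["fit"] for g in pool)
            for g in pool:
                if g["fit"] <= 1.05 * threshold:
                    g["expr"] *= 1.2      # induction: transcribe more
                else:
                    g["expr"] *= 0.8      # inhibition: transcribe less
            # Highly expressed genes spawn "similar proteins": mutated copies
            # that replace them when they prove fitter.
            for g in sorted(pool, key=lambda g: -g["expr"])[:3]:
                s = mutate(g["sol"])
                f = fitness(s)
                if f < g["fit"]:
                    g["sol"], g["fit"] = s, f
            # Genes whose expression has collapsed are replaced by fresh DNA,
            # mimicking the continuous testing of new sequences.
            for g in pool:
                if g["expr"] < 0.05:
                    s = random_solution()
                    g["sol"], g["fit"], g["expr"] = s, fitness(s), 1.0
            cand = min(pool, key=lambda g: g["fit"])
            if cand["fit"] < best["fit"]:
                best = dict(cand)
        return best["sol"], best["fit"]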
B Article presented to C.I.O. 2006
(Congreso Nacional de Ingeniería de Organización)
B Article presented to C.I.O. 2006
X Congreso de Ingeniería de Organización
Valencia, 7-8 September 2006
Development of genetic, simulated annealing and hybrid algorithms for the scheduling of a flexible jobshop
Sara Lumbreras Sancho1, Ángel Sarabia Viejo1
1 Dpto. de Organización Industrial. Universidad Pontificia Comillas. C/ Alberto Aguilera, 25, 28015, Madrid. [email protected], [email protected].
Abstract
The jobshop problem has been the object of research for decades, owing both to the diversity of situations in which it arises and to the computational complexity it presents. Because of the latter, the classical optimization methods that prove so efficient in other settings are not suitable for its resolution, and metaheuristic algorithms are preferred for this kind of problem. This article describes several algorithms created for its resolution, based on models such as genetic algorithms and simulated annealing, but featuring original peculiarities that give them a particular efficiency.
Keywords: genetic algorithm, simulated annealing, sequencing, jobshop
1. Introduction
Classical optimization methods are not well suited to the NP-complete problems of scheduling, for which they often require unacceptable computation times. In such cases it is advisable to apply heuristic techniques, which give up the guarantee of a global optimum in exchange for very satisfactory solutions in affordable times. A selection of these techniques can be found in Pham (1998).
The sequencing of tasks in a shop is one of the most interesting manifestations of this reality, both for the diversity of environments in which it appears and for its complexity, which has motivated an extensive literature.
The aim of this work is the development and study of new algorithms capable of optimizing the schedule of a shop in times far below those required by classical methods. These differences in computation time are studied, and the possible deviation from the global optimum is bounded and estimated. Finally, the accepted solution is analyzed in order to support decision-making on possible capacity expansions in the facilities.
2. A generalized shop
The shop problem, known as the jobshop, consists of sequencing the tasks of a set of jobs, each composed of several different operations, on the set of available machines.
The case under study is unusually general, covering most of the possible complexities of deterministic models. The shop is flexible, which means that an operation can be performed on several different machines, naturally with different execution times. In addition to sequence-dependent setup times, transport times between work centers have been considered, which may depend on the batch size or on the particular job.
The objective function to be optimized allows working with simple magnitudes (such as the total processing time or the maximum tardiness) or defining a multicriteria function in which it is possible, for example, to set priorities among customers so as to avoid delaying the most important orders.
Furthermore, an additional generalization makes it possible to work with random times through the definition of a coherent distribution function, thus enabling the resolution of a stochastic shop.
The possibility of failure of one of the machines is also considered, allowing robust solutions to be obtained.
3. Comparison with classical methods. Development of a linear programming model
With the sole objective of comparing the results obtained with those that the application of classical methods would produce, a linear programming model was developed and implemented in GAMS. Even for small problem sizes (four jobs with six operations each, to be performed on three machines) it required execution times of over 24 hours.
4. Encoding used
The matrix encoding usually employed for this kind of problem, as in Mesghouni (2004), has the drawback of allowing infeasible sequences. More specifically, circular references between operations can appear, and they are computationally very expensive to identify and remove. Figure 1 shows an apparently normal sequence that could not actually be executed, because of a circular reference between jobs 1 and 2.
M1: O(1,2) O(2,1) O(3,3)
M2: O(2,2) O(1,1)
M3: O(3,1) O(3,2)
Figure 1. Example of a circular reference in a sequence
In all the heuristics developed, solutions are encoded as a double list:
• A job list, which represents only the jobs and not their operations. When the symbol of a job appears, it means that its first unperformed operation will start on the corresponding machine as soon as that machine becomes free.
• A machine list, parallel to the previous one, which complements it with the machine on which the operation symbolized by each element of the job list is to be performed.
From an encoded solution it is relatively simple to obtain all the relevant data of the sequence it represents. For example, a valid solution for a shop with three two-operation jobs and three machines could be:
Jobs: 1 3 1 2 2 3
Machines: 2 3 1 1 2 3
and the resulting sequence could be represented by a Gantt chart like the one in figure 2, in which each job has been assigned a color and the labels identify each operation in the format (job, operation) (for example, “(3,2)” means the second operation of the third job).
Figure 2. Representation of the example sequence in a Gantt chart
For a solution encoded in this way to be feasible, the job list only needs to be a permutation with repetition of the job symbols in which each job appears as many times as it has operations. The machine list only needs to respect the flexibility of the machines in the shop.
These checks are very simple and can be built into the algorithm as a guide for generating the random solution or solutions taken as the starting point. The double-list structure thus makes infeasibilities impossible, reducing the solution space to feasible solutions only. It also considerably simplifies the development of the basic operators of the algorithms and is easy to interpret. As a result, computation times are reduced; a decoding sketch is given below.
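To make the interpretation of the double list concrete, the sketch below decodes one into a schedule. It is a simplified reading in Python: setup and transport times are omitted, although the full model does include them, and decode and proc_time are illustrative names rather than the project’s code.

    from collections import defaultdict

    def decode(job_list, mach_list, proc_time):
        # proc_time[j][o][m]: processing time of operation o of job j on machine m.
        next_op = defaultdict(int)      # next unperformed operation of each job
        job_free = defaultdict(float)   # instant at which each job is available
        mach_free = defaultdict(float)  # instant at which each machine is free
        schedule = []
        for j, m in zip(job_list, mach_list):
            o = next_op[j]
            start = max(job_free[j], mach_free[m])
            end = start + proc_time[j][o][m]
            schedule.append((j, o, m, start, end))
            job_free[j] = mach_free[m] = end
            next_op[j] += 1
        makespan = max(end for *_, end in schedule)
        return schedule, makespan

    # For the article's example: decode([1, 3, 1, 2, 2, 3], [2, 3, 1, 1, 2, 3], pt)
    # with pt a nested mapping of processing times.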
5. An original genetic algorithm
This model applies the usual genetic-algorithm interpretation of the mechanisms of evolution, adapting it to the problem and adding new ideas that noticeably improve its results. The following sections describe these peculiarities.
5.1. Creation of the initial generation
The initial population of solutions is the starting point of the algorithm. The encoding used greatly simplifies this process:
• The job list is obtained as a random permutation of a string in which the symbol of each job is repeated as many times as the job has operations.
• From it, the machine list is built by randomly selecting, for each position, a work center from the set of possible ones.
This process guarantees in advance that every individual is feasible, avoiding any subsequent checking, as the sketch below illustrates.
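A minimal sketch of this construction, assuming num_ops maps each job to its number of operations and eligible[j][o] lists the machines able to perform operation o of job j (both names are illustrative):

    import random

    def random_individual(num_ops, eligible):
        # Permutation with repetition: each job symbol appears as many times
        # as the job has operations, so feasibility holds by construction.
        job_list = [j for j, n in num_ops.items() for _ in range(n)]
        random.shuffle(job_list)
        next_op = {j: 0 for j in num_ops}
        mach_list = []
        for j in job_list:
            mach_list.append(random.choice(eligible[j][next_op[j]]))
            next_op[j] += 1
        return job_list, mach_list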
It is worth noting that the population size strongly influences the optimization process. On the one hand, the total computation time of the algorithm is proportional to the number of iterations and to the population size. At the same time, and just as in biological populations, diversity is a key factor: the probability of the algorithm becoming trapped in local optima grows when the population is small. The tests carried out suggest that the best strategy is to fix the execution time and consider a prudent number of generations (100 to 200, regardless of problem size, since algorithms of this kind usually exhaust themselves or stall in their improvement after a number of iterations of that order). The population is then made as large as that time allows, which guarantees that the available time is exploited optimally in terms of both effectiveness and efficiency.
5.2. Natural selection
The survival of the fittest individuals is modeled by roulette-wheel selection: the best solutions are assigned larger sectors, so that randomly thrown darts are more likely to pick them for reproduction.
To guarantee that the best solution obtained so far (called the queen in this model) is never lost, a copy of it passes directly to the next generation without changes, regardless of whether it is chosen to join the set of parents.
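A sketch of the wheel, assuming a minimization objective such as the makespan; the transformation of fitness values into sector sizes is an assumption of this example, since the article does not fix one:

    import random

    def roulette_select(population, fitnesses, k):
        # Smaller makespan -> larger sector on the wheel.
        worst = max(fitnesses)
        weights = [worst - f + 1e-9 for f in fitnesses]
        return random.choices(population, weights=weights, k=k)

    # Elitism: the queen survives unchanged, whatever the darts decide.
    # next_gen = [queen] + offspring(roulette_select(pop, fits, len(pop) - 1))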
5.3. Crossover
The individuals selected as parents are paired up at random.
Crossover is implemented in two steps, one for each component of the double list. If a simple two-point crossover were applied to the job lists, the offspring could turn out to be monsters, that is, infeasible solutions. Therefore, a simple procedure identifies the surplus and the missing operations in each exchanged block. The former are removed at their first appearance in the exchanged block, while the latter are appended at the end, in the order in which they appeared in the original sequence. This check and subsequent modification of the chromosomes is simple and fast, and avoids the infeasibility problems caused by other crossover operators.
The machine lists are first translated into matrices that store the work center assigned to each job and operation. A randomly determined submatrix is then exchanged between the two parents.
This process guarantees two feasible offspring chromosomes that are genuinely a combination of their parents. Their characteristics will be similar in the fragments they share, and improvement is possible insofar as the best parts of each parent can be selected for a child.
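The repair of the job lists can be sketched as follows; the exact order of the removals is paraphrased from the description above, so this should be read as an illustration rather than as the project’s operator. The cut points i < j are assumed given.

    from collections import Counter

    def job_list_crossover(p1, p2, i, j):
        child = p1[:i] + p2[i:j] + p1[j:]
        target = Counter(p1)               # each job's required multiplicity
        surplus = Counter(child) - target
        # Surplus operations leave at their first appearance inside the block.
        repaired = []
        for pos, g in enumerate(child):
            if i <= pos < j and surplus[g] > 0:
                surplus[g] -= 1
                continue
            repaired.append(g)
        # Missing operations are appended at the end, in original-sequence order.
        missing = target - Counter(repaired)
        for g in p1:
            if missing[g] > 0:
                missing[g] -= 1
                repaired.append(g)
        return repaired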
5.4. Mutation
Mutation belongs to the set of mechanisms whose purpose is to intensify the exploration of new solutions. Random changes are introduced into the chromosomes, preserving diversity and preventing premature convergence of the algorithm.
Mutation is also implemented in two steps. On the one hand, the machine list can be modified by moving an operation to another work center that is also able to perform it. Changes in the job list, on the other hand, are structured as a swap of two of its elements. This definition is intuitive and responds to a minimum-distance concept, guaranteeing that the mutated solution is very similar to the original one.
The mutation probability is a delicate parameter to tune: if it is too low its effects are irrelevant, while if it is too high the algorithm comes to resemble a random search, with the loss of efficiency that this entails. The tests carried out suggest that mutation rates between 3 and 6% yield the best results. The model developed chooses a value in this interval depending on the standard deviation of the fitnesses of the previous generation, using higher values for more uniform populations.
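A sketch of the two mutation moves, with the same illustrative eligible structure as before; how the machine list is kept consistent after a job swap is not detailed in the article, so the re-draw used here is an assumption:

    import random
    import statistics

    def mutate_machine(job_list, mach_list, eligible):
        # Move one operation to another work center that can also perform it.
        p = random.randrange(len(job_list))
        j = job_list[p]
        o = job_list[:p].count(j)          # operation index at position p
        new_machs = mach_list[:]
        new_machs[p] = random.choice(eligible[j][o])
        return job_list, new_machs

    def mutate_jobs(job_list, mach_list, eligible):
        # Minimum-distance move: swap two elements of the job list, then
        # re-draw any machine assignment the swap made inconsistent.
        new_jobs, new_machs = job_list[:], mach_list[:]
        a, b = random.sample(range(len(new_jobs)), 2)
        new_jobs[a], new_jobs[b] = new_jobs[b], new_jobs[a]
        next_op = {}
        for p, j in enumerate(new_jobs):
            o = next_op.get(j, 0)
            next_op[j] = o + 1
            if new_machs[p] not in eligible[j][o]:
                new_machs[p] = random.choice(eligible[j][o])
        return new_jobs, new_machs

    def mutation_rate(prev_fits, low=0.03, high=0.06):
        # More uniform populations get rates nearer the top of the 3-6% band;
        # the precise mapping is an assumption of this sketch.
        spread = statistics.pstdev(prev_fits) / (statistics.mean(prev_fits) + 1e-9)
        return high - (high - low) * min(1.0, 10.0 * spread)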
5.5. Immigration
This mechanism also aims at preserving diversity. New random individuals are generated and may or may not join the population. In the mechanism designed, each immigrant randomly picks a prey from among the weakest individuals. Each pair then fights a duel in which the probability of surviving is proportional to the fitness, yielding a particular implementation of what is known as tournament selection. If the immigrant wins, it takes the place of the original individual.
Like the mutation rate, the number of candidate immigrants must be chosen so that their positive effects on diversity are exploited without going so far as to start losing efficiency. The tests carried out point to values between 5 and 15% as the most adequate. Again, this value is modified as the algorithm evolves, increasing when diversity decreases.
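A sketch of the duel, assuming minimization and taking the “weakest” group as the worst quartile (the article does not fix that fraction):

    import random

    def immigration(population, fitnesses, n_candidates, new_individual, fitness):
        # Indices of the weakest quartile (largest makespans).
        order = sorted(range(len(population)),
                       key=lambda i: fitnesses[i], reverse=True)
        weakest = order[:max(1, len(population) // 4)]
        for _ in range(n_candidates):
            cand = new_individual()
            f_cand = fitness(cand)
            prey = random.choice(weakest)
            # Duel: survival odds proportional to solution quality.
            w_new, w_old = 1.0 / f_cand, 1.0 / fitnesses[prey]
            if random.random() < w_new / (w_new + w_old):
                population[prey], fitnesses[prey] = cand, f_cand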
5.6. Asexual component in reproduction
In imitation of the reproductive systems of some social insects, which alternate cycles of sexual and asexual reproduction (parthenogenesis), a new mechanism is introduced in this model. A proportion of the new generation is obtained through a process that imitates parthenogenesis: it takes the queen as the starting point and generates a set of solutions modified by the mutation operator. This intensifies the search in the most promising regions, increasing efficiency.
As in the previous cases, the percentage of children obtained through this process is a parameter that must be selected carefully. This time the problem is the opposite one: if it is too high, the algorithm loses effectiveness and is more easily trapped in local optima. The tests point to 10-15% as the range within which this proportion should lie. Figure 3 illustrates the effect of this rate on the algorithm.
Figure 3. Behavior of the algorithm in its first iterations for different asexual reproduction rates
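The asexual share of a generation then reduces to a few lines; mutate stands for the operator of section 5.4 and the names are illustrative:

    def parthenogenesis(queen, mutate, share, pop_size):
        # Mutated copies of the queen intensify the search around the best
        # solution found so far; 'share' would lie in the 10-15% band above.
        return [mutate(queen) for _ in range(int(share * pop_size))]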
6. Simulated annealing algorithm
Simulated annealing (S.A.) algorithms imitate the metallurgical annealing processes used for steel or copper. The model developed contributes some innovative ideas to the classical implementations of these algorithms.
6.1. Generation of neighboring solutions
In these heuristics it is vitally important to generate neighbors located at a minimum distance from the origin point, so that they are directly related to it yet may present different fitness values. This is achieved analogously to the procedure described in the section on mutation, with the important advantage that no subsequent checks or corrections are needed.
6.2. Local search mechanism
To improve efficiency, the annealing algorithm contains a local search mechanism that, whenever a solution is selected, evaluates a set of points in its surroundings and moves to them whenever an improving direction is found. If an improvement occurs, the local search is repeated without changing the temperature. If it is unsuccessful, the classical annealing mechanism is applied, in which the probability of a worsening move is determined by its magnitude and by the temperature according to Boltzmann’s law.
The size of the set of neighbors to be inspected must be large enough to have an impact on the results, but small enough not to reduce efficiency. The tests carried out recommend values of around 30 points, as in the sketch below.
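One iteration of this scheme might look as follows; neighbour generates a minimum-distance neighbor as in section 6.1, n_probe is the roughly 30-point inspection set, and the remaining bookkeeping is an assumption of the sketch:

    import math
    import random

    def annealing_step(current, f_current, T, neighbour, fitness, n_probe=30):
        # Local search: inspect the surroundings, moving while improvement lasts.
        moved = False
        improved = True
        while improved:
            improved = False
            for _ in range(n_probe):
                cand = neighbour(current)
                f_cand = fitness(cand)
                if f_cand < f_current:
                    current, f_current = cand, f_cand
                    improved, moved = True, True
                    break
        if moved:
            return current, f_current, False     # keep the temperature frozen
        # Classical annealing: worsening moves pass with Boltzmann probability.
        cand = neighbour(current)
        f_cand = fitness(cand)
        if f_cand <= f_current or random.random() < math.exp(-(f_cand - f_current) / T):
            current, f_current = cand, f_cand
        return current, f_current, True          # temperature may now be cooled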
6.3. Cooling schedules
From the extensive literature on how the temperature should evolve along the algorithm, it can be concluded that overly sophisticated schedules are not efficient: in the great majority of cases they are outperformed by a simple geometric rule with appropriate parameters. However, those parameter values are hard to tune and the result is very sensitive to them, so the geometric rule is often discarded because of this difficulty.
The first of the proposed cooling schedules tries to imitate the one actually chosen in the metallurgical industry (it is curious that, while quenching has been studied exhaustively in order to improve the properties of the final piece, for annealing it is considered sufficient to let the metal cool in the air or in the furnace), which corresponds to natural convection. Discretizing this process leads precisely to the geometric rule, but taking into account the properties of first-order transients yields simple guidelines for setting parameters that result in efficient and robust algorithms:

c = 1 − 1/τ (1)

τ = n/6 (2)

where c is the parameter of the geometric rule, n is the maximum number of iterations allowed, and τ is the time constant of the transient, used as an intermediate parameter.
A second alternative is introduced in the form of a dynamic schedule, which sets the temperature according to the fitness of the current solution, so that the probability of an uphill jump of a given magnitude decreases through a geometric rule defined by the simple guidelines in (1) and (2). The resulting cooling schedule yields an algorithm, alternative to the previous one, that proves more efficient in some of the simulated cases.
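Equations (1) and (2) translate directly into code. With n = 600, for example, τ = 100 and c = 0.99, and after the n allowed iterations the temperature has decayed to roughly e^-6 of its initial value, the point at which a first-order transient is conventionally considered extinguished:

    def geometric_schedule(T0, n):
        # Natural-convection reading of the annealing temperature:
        # tau = n/6 and c = 1 - 1/tau, so that T_k = T0 * c**k.
        tau = n / 6.0
        c = 1.0 - 1.0 / tau
        T = T0
        for _ in range(n):
            yield T
            T *= c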
7. Hybrid algorithms
The positive characteristics of the heuristics described have been combined to obtain hybrids that are interesting from the standpoints of both effectiveness and efficiency.
7.1. Two-phase hybrid
The simplest way to combine the genetic and annealing algorithms described is to run them sequentially. Since it exhausts itself sooner, the genetic algorithm goes first; once improvement slows down, it hands over to the annealing. Determining the moment of the switch is not simple. The tests suggest fixing a minimum number of iterations that will be performed in any case (about 150-200, regardless of problem size). After that, the exhaustion alarm may be triggered, and the switch takes place once 50-75 generations have gone by without any improvement.
The annealing is then run on a family of solutions formed by the most promising members of the previous stage. Working with a population rather than with an isolated solution notably improves the effectiveness of the annealing. The family size must be chosen carefully: values in the 6-12 range are preferred for problem sizes of around 10 jobs, but they should be increased for larger problems. Figure 4 shows an execution example, while figure 5 highlights the main problem of this algorithm: although it guarantees more exploration than the previous ones, its efficiency is much lower. A sketch of the switching rule follows the figures.
Figure 4. Execution example of the two-phase hybrid with a 12-member family
Figure 5. Compared efficiency of the original genetic algorithm and the sequential hybrid
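The switching rule of the two-phase hybrid can be sketched as a simple test on the history of best fitness values per generation, with the default thresholds taken from the ranges quoted above:

    def should_switch(history, min_gens=150, stall=50):
        # history[k] is the best fitness of generation k (minimization).
        if len(history) < min_gens + stall:
            return False          # compulsory generations not yet exhausted
        # Exhaustion alarm: no improvement over the last 'stall' generations.
        return min(history[-stall:]) >= min(history[:-stall])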
7.2. Non-Darwinian mutation
The second proposed hybrid departs from the classical theory of evolution, the standard followed by genetic algorithms, by considering that the random changes introduced as mutations are not really random: the probability that they lead to an improvement is greater than the probability of a worse outcome. To this end, a small annealing run is embedded in the mutation operator, which guarantees that the resulting solution has explored its neighbors and gone through both the local search mechanism described above and the classical annealing method. An execution example is included in figure 6.
This element intensifies the search in the most promising regions, making the algorithm more efficient than any of the previous ones during the first generations; it is later overtaken by the Darwinian genetic algorithm (once a local optimum has been reached, further progress requires inspecting new points and thus increasing diversity). A third hybrid was therefore designed that switches from the non-Darwinian genetic algorithm to the more classical one after a number of generations (150-200), achieving the best efficiency results of all the models presented. Figure 7 compares the efficiency of the three hybrids developed.
Figure 6. Execution example of the first iterations of the original genetic algorithm and the non-Darwinian one
Figure 7. Compared efficiency of the three hybrids developed
8. Other model outputs
Besides returning the data of the solution obtained, the progress of the algorithm over time and statistical information on the evolution of the population, the models developed provide other interesting outputs concerning the problem.
8.1. Bounding and estimating the error
In addition to an upper bound on the error, an estimate of it is provided, based on fitting a non-parametric distribution to the results obtained along the execution of the algorithm. Distributions of this kind had never before been applied to the problem of estimating the error of heuristics, and the tool yields remarkably good results.
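The article does not detail the estimator, so the sketch below is only one plausible reading of it: fit a kernel density, a standard non-parametric choice, to the best makespans of several independent runs and read an extreme left-tail quantile as an optimistic proxy for the unknown optimum. error_estimate and all of its parameters are illustrative.

    import numpy as np
    from scipy.stats import gaussian_kde

    def error_estimate(best_per_run, accepted, quantile=0.01, grid=512):
        # Non-parametric density over the per-run best fitness values.
        kde = gaussian_kde(best_per_run)
        lo = min(best_per_run) - 3.0 * np.std(best_per_run)
        xs = np.linspace(lo, max(best_per_run), grid)
        cdf = np.cumsum(kde(xs))
        cdf /= cdf[-1]
        optimistic = xs[np.searchsorted(cdf, quantile)]
        # Estimated gap between the accepted solution and the optimum proxy.
        return max(0.0, accepted - optimistic)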
8.2. Application of the Theory of Constraints
The Theory of Constraints (Goldratt (1982)) is embodied in a tool that analyzes the evolution of work in process, identifies the bottlenecks of the shop for the workload considered, and advises where new capacity-expansion investments should be made.
9. Other generalizations of the models: stochastic shop and robust shop
The models developed are prepared to work with a stochastic shop, in which random times are simulated from a distribution that combines the deterministic time datum with a random component, proportional to it, drawn from a Weibull distribution.
In addition, the algorithms can consider the failure of one of the machines (finding the optimal alternative task sequence, as well as indicating the jobs that can no longer be carried out). Thus, by introducing a list of the most frequent failures and their respective probabilities, it is possible to find the robust solution for the shop considered, following a process similar to the one presented in Jensen (1999).
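Sampling a stochastic processing time then amounts to adding a proportional Weibull-distributed component to the deterministic datum; shape and spread are illustrative values, since the article fixes neither:

    import random

    def stochastic_time(t_det, shape=2.0, spread=0.2):
        # random.weibullvariate(scale, shape): Weibull draw with unit scale.
        noise = random.weibullvariate(1.0, shape)
        return t_det * (1.0 + spread * noise)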
References
Mesghouni, K., S. Hammadi and P. Borne (2004): Evolutionary Algorithms for Job-Shop Scheduling. Int. J. Appl. Math. Comput. Sci., Vol. 14, No. 1, 91-103.
Pham, D. T. and D. Karaboga (1998): Intelligent Optimization Techniques. Springer-Verlag.
Jensen, T. and T. K. Hansen (1999): Robust Solutions to Job Shop Problems. Proceedings of the 1999 Congress on Evolutionary Computation.
Goldratt, E. (1982): The Goal. An Ongoing Process. North River Press.
C Data for the LP example
C Data for the LP example
The example is composed of four jobs, with 4, 7, 7 and 5 operations respectively.
The data given the greatest weight were the processing times, so for the sake of simplicity only these are shown.
The structure of the data is PT(job, operation, machine), and the values are as follows:
PT(:,:,1) =
3 3 14 5 0 0 0
3 0 14 5 5 5 5
3 3 0 5 3 3 14
0 3 14 0 0 0 0
PT(:,:,2) =
4 6 4 5 0 0 0
4 0 4 0 5 0 0
0 6 4 5 4 0 4
4 0 4 5 4 0 0
PT(:,:,3) =
5 2 12 2 0 0 0
5 2 12 2 2 2 2
5 0 12 0 0 2 0
5 2 0 2 5 0 0