Complex Dynamic Flexible Job Shop Example and Performance Discussion

BeautyBrad Eddowes

unread,

Dec 31, 2019, 12:17:08 PM12/31/19

to or-tools-discuss

Greetings!

My hope with this thread is to get some advice on improving the solving performance of the attached program as well as providing it as inspiration for other ideas. It is quite developed (please excuse the messy code, this is a mad science research project at this point) and complex.

The main concept of my model is a domino effect of tasks where the first task type is required to happen and has a negative effect on a reservoir. This requires the model to enable an optional task of the next task type to have a positive effect on the reservoir. The reservoirs are done by first outlining the inactive time slices around the intervals to then infer the actives are the inverse time slices on the timeline. This is done because you cannot go through all optional intervals and say that if variable at time t is active for a particular machine that time t must be between the start and end of all intervals for that machine. You can, however, say that all inactive variables must be either before or after all intervals for a particular machine and say that the sum(inactives) == horizon - sum(durations) where sum(durations) is total time running so horizon - sum(durations) is time not running. Each task is linked to tank task so that if machine and tank are active at time t, then there is a machine&tank active (plus time and effect) being applied to a reservoir representing the tank. This, I believe, is where the complexity of my model comes in because there are multiple machines, each of which has multiple tanks they can theoretically go to. So you have n*m machine active and inactive variables plus n*t tank active and inactive variables plus n*m*t machine&tank active variables where n is the number of time slices, m is the number of machines and t is the number of tanks. All of that is per the product type, or formula because the machines process differently per product type.

All of those variable arrays of actives per machine are effectively n choose k so something that yielded a good performance improvement was saying that machine must be active a specific minimum time and then computing all n choose k combinations that have groupings of 3 or more of active variables. This reduced the difficulty from 16 million combinations to 98 thousand. I do this on lines 380->390 where a small section of multithreaded code computes all of the combinations to be used later in AddAllowedAssignments for each machine's set of active variables.

There are many commented out "model.Add..." statements that were constraints I tested by performing empirical analysis over multiple runs.

I have tried using hints but because of the relatively drastic changes in the solution when adding more formulas the hints degraded performance. It would make sense to use hint between task types but the program right now is only considering the first Equip=>Tank, Tank=>Equp relationship and its struggling where if the model becomes feasible to fully scale-out it will have 5 task=>tank=>task relationships.

As the program stands now, I believe the only libraries I'm using that aren't standard in python are ortools and matplotlib (for a gantt chart of the solution).

I hope to get input on ways to improve my model's performance because as it sits right now I cannot scale it up without an exponential increase in solving time. I also hope to get feedback on how quickly it solves on more resource equipped machines. Right now I only have access to a 2c/4t machine that has the best performance using 4 workers and a 6c/12t machine that performs best with 6 workers (using more workers degrades performance, probably due to limits of the hyperthreading architecture on this particular machine?) I hope to run this on a well equipped AWS EC2 instance soon but I don't have that set up yet.

I feel like I'm kind of at a dead-end beyond trying to change the search type but I'm not sure where I can find all the values that search_branching can take and changing search_branching doesn't seem to work with multiple workers? If anyone can explain this or point me to some documentation on changing the search types and getting that to work with multiple workers that would be awesome!

I average 55.6s using 4 workers and 52.6s using FIXED_SEARCH which seems for force 1 worker... These can be commented in and out at lines 682-691

I know this is a lot....but any and all input is welcome!

Thanks!

Solver_Simplified.py

BeautyBrad Eddowes

unread,

Dec 31, 2019, 12:27:46 PM12/31/19

to or-tools-discuss

P.S

The reason I'm only constraining the actives/inactives of machines and not the tanks is that there can be multiple overlapping intervals at a given time for a tank. More than one machine or task can add/take away from it at a time. I don't have a good way of raining in the actives and inactives proportionate to tank task durations yet.

Aishwarya MV

unread,

Jan 1, 2020, 7:31:16 AM1/1/20

to or-tools-discuss

Hi OR- Tools enthusiasts and experts out there! I have a small problem with the coding part. Let me elaborate with an example:

Consider a job shop wherein many components(with many pieces of each component) have to be machined on a given number of machines and each component has its own operations such as sizing, pocket grooving etc. and each of these operations have to be performed only after installing a setup or a fixture in the machine. Hence the fixture setup takes its own time. My question branches into two here:

1. Is there a possible way where the solver will know about the setup times and create a precedence of the setup and the operation on the same machine?

2. If any other machine is vacant during scheduling, can the solver duplicate this setup operation and run the remaining pieces of the same component in that machine?

Laurent Perron

unread,

Jan 1, 2020, 7:47:39 AM1/1/20

to or-tools-discuss

You can have a look at:

https://github.com/google/or-tools/blob/stable/examples/python/jobshop_ft06_distance_sat.py

https://github.com/google/or-tools/blob/stable/examples/python/single_machine_scheduling_with_setup_release_due_dates_sat.py

They show how to code setup transitions.

Laurent Perron | Operations Research | lpe...@google.com | (33) 1 42 68 53 00

--
You received this message because you are subscribed to the Google Groups "or-tools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to or-tools-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/or-tools-discuss/392685d7-ec96-4c0b-8212-ef0b7426e567%40googlegroups.com.

Aishwarya MV

unread,

Jan 1, 2020, 12:15:24 PM1/1/20

to or-tools-discuss

Thank you taking out your valuable time to answer my question. But as I am really new to OR tools, I am unable to understand the setup_times and job_durations array.

May I please have a better explanation on them and an overall picture of how both the programs work? It would be a huge huge favor. Thanks in advance!

On Wednesday, January 1, 2020 at 6:17:39 PM UTC+5:30, Laurent Perron wrote:

You can have a look at:

https://github.com/google/or-tools/blob/stable/examples/python/jobshop_ft06_distance_sat.py

https://github.com/google/or-tools/blob/stable/examples/python/single_machine_scheduling_with_setup_release_due_dates_sat.py

They show how to code setup transitions.
Laurent Perron | Operations Research | lpe...@google.com | (33) 1 42 68 53 00

Le mer. 1 janv. 2020 à 13:31, Aishwarya MV <aishwary...@gmail.com> a écrit :

Hi OR- Tools enthusiasts and experts out there! I have a small problem with the coding part. Let me elaborate with an example:

Consider a job shop wherein many components(with many pieces of each component) have to be machined on a given number of machines and each component has its own operations such as sizing, pocket grooving etc. and each of these operations have to be performed only after installing a setup or a fixture in the machine. Hence the fixture setup takes its own time. My question branches into two here:

1. Is there a possible way where the solver will know about the setup times and create a precedence of the setup and the operation on the same machine?

2. If any other machine is vacant during scheduling, can the solver duplicate this setup operation and run the remaining pieces of the same component in that machine?

--
You received this message because you are subscribed to the Google Groups "or-tools-discuss" group.

To unsubscribe from this group and stop receiving emails from it, send an email to or-tools...@googlegroups.com.

BeautyBrad Eddowes

unread,

Jan 1, 2020, 2:32:48 PM1/1/20

to or-tools-discuss

Another example that may help you is found here: https://groups.google.com/forum/#!msg/or-tools-discuss/ZsIQd0gjjRE/TZ0qMgGDAwAJ

This talks about adding downtime to a machine when switching between different materials. I do not have this implemented in the simplified program of my original post but I did implement a version of this. My machines need cleaning if they transition between types of products. The intensity of the cleaning means that the cleaning task will vary depending on what task->task progression the solver picks. Pay attention to the addMachDynaDown method. Here, a circuit constraint it defined between all the possible tasks on a machine. You can say that if a certain task->task progression is picked, that if they aren't the same component a time penalty must be added to the leading task to account for the machine setup. Depending on the goal you setting for the solver, like minimizing the maximum end times, the solver will try to choose the cheapest progression in terms of how it affects the end times. This could mean that it runs the same task on two machines to reduce end times. I took this a step farther and added optional cleaning tasks for my machines and impose a vary expensive time penalty if a particular task->task progression would require cleaning but if the leading task is a cleaning task, there is no time penalty to go to the next task with a different product type. The solver will then choose to add a cleaning task in order to minimize end times instead of taking the large time penalty.

Hope this helps!

Aishwarya MV

unread,

Jan 2, 2020, 4:16:07 AM1/2/20

to or-tools-discuss

I thank you for your valuable suggestion. Will consider looking into it. Thank you!

BeautyBrad Eddowes

unread,

Jan 2, 2020, 11:34:05 AM1/2/20

to or-tools-discuss

Adding to this, I setup an EC2 instance to run my program and found that on the most current version of ortools, 7.4, I get an infeasible solution. On my computer, which is running 7.3, I get solutions just fine. I can see that only a fraction of the variables created running in 7.3 are created in 7.4. What changed between 7.3 and 7.4 that would have broken this program?

On Tuesday, December 31, 2019 at 12:17:08 PM UTC-5, BeautyBrad Eddowes wrote:

BeautyBrad Eddowes

unread,

Jan 2, 2020, 6:43:20 PM1/2/20

to or-tools-discuss

I had some time to run the attached program on a 32c/32t EC2 instance today after downgrading the ortools version and these are the questions I have.

Slight changed made to the program for the EC2 runs, updated program attached. Ran a slightly harder problem because I was expecting largely improved performance using 32 cores.

1. Is it true that search_branching and num_search_workers cannot be changed at the same time?

2. What major changes occurred from 7.3 to 7.4 which may be breaking my program?

3. Seen in the attached logs (represent typical runs at their respective core counts), 4c on my personal machine and 32c on the EC2 machine yielded very similar results for time to wall on average. During the EC2 run, however, on average only 6 cores were maxed out until the solution call back started returning feasible solutions at which point all 32 cores went to 100 until it hit the wall. Is this the correct behavior? Not only that but after running multiple tests at different core/worker amounts, the 16c runs were the fastest on the 32c machine but still only 20% faster on average than the 4c machine?

4. Again, looking for pointers in the formulation of my model to improve performance.

Thanks!

EC2 32c run.txt

Personal 4c run.txt

EC2 16c run.txt

Solver_Simplified.py

Aishwarya MV

unread,

Jan 4, 2020, 1:24:05 PM1/4/20

to or-tools-discuss

Hello! Hope you all are doing good!. I am currently in the 2nd year of my Bachelor's degree in Engineering, studying the specialization of Industrial and Production Engineering.

I am trying to create a program that automatically schedules machine hours in a job shop. In this instance of a job shop, orders are received for many jobs wherein each job has to be produced in certain quantity (specified no. of pieces), and all those pieces have to go through a number of operations. Those operations have specific setup procedures, which, if carried out once on each machine on which the piece is being machined, is enough.

I have attached the program of a flexible job shop and its output.

Job 0 calls for machining 2 pieces and so does Job 1

Here task_0_0 and task_0_1 are same operations performed on piece 1 and piece 2. The same goes for all identical tasks for both the jobs.

Basically what I wish this program does is,

task_0_0 is a new operation so, it must add 60(the setup time corresponding to task_0_0) to the duration.

task_0_1 should start at 63 ( setup time of 60 + the duration required for task_0_0)

task_0_2 should start at 66

task_0_3 is given to a different machine so its duration should be incremented by the setup time.

Overall, the motive is,

if at any point of time, either the operation or the machine on which the piece is being processed changes, the setup time should be added to the duration.

As I am really new to OR tools and python as well, I tried to build a logic using whatever I am aware of, and also by trying to incorporate the program links given by you (github links). I don't know where I am going wrong but, my logic always lands me in an INFEASIBLE solution.

It will be a great help if you suggest how I go about with the logic to solve the above motive. I will be grateful for you for my entire life. Thank you for taking out your valuable time to read this huge email.

Screen Shot 2020-01-04 at 23.03.02.png

https://drive.google.com/file/d/0By7IqJv7cQpdcWVIaWlWbzNRdVVZUENtbXJVWmw5bjVCNzZZ/view?usp=sharing - Link to the program

Brad Eddowes

unread,

Jan 6, 2020, 9:58:22 AM1/6/20

to or-tools-discuss

I don't see the logic to add setup times in the code you linked too but what you want to do is check what variables and what constraints on those variables the setup time constraint could be breaking. If you are adding the setup time directly to the duration of a task, check to make sure you are not exceeding the upper bound on your duration variables. Also, the makespan variable is defined as a value from 0 to horizon where the horizon is defined as the sum of maximum durations from each task in each job. Makespan gets its value from a maximum equality constraint that says its value is to be the largest end time value. Your horizon value is only 38, your setup values make your end times exceed that variable's upper bound.

Aishwarya MV

unread,

Jan 6, 2020, 1:38:55 PM1/6/20

to or-tools-discuss

Okay.

I have some questions.

What is the difference between (start and l_start) or (end and l_end) and such other similar variables?

How can I know which one of these similar ones is enforced and on which machine is it enforced?

How can I know which one of these' durations/starts/ends I should change?

The horizon is actually being printed in the beginning of the program and hence it is seen as 38. But the horizon is actually expanding in the program.

Thank you.

Brad Eddowes

unread,

Jan 6, 2020, 1:54:49 PM1/6/20

to or-tools-discuss

1. Is it true that search_branching and num_search_workers cannot be changed at the same time?

2. What major changes occurred from 7.3 to 7.4 which may be breaking my program?

3. Seen in the attached logs (represent typical runs at their respective core counts), 4c on my personal machine and 32c on the EC2 machine yielded very similar results for time to wall on average. During the EC2 run, however, on average only 6 cores were maxed out until the solution call back started returning feasible solutions at which point all 32 cores went to 100 until it hit the wall. Is this the correct behavior? Not only that but after running multiple tests at different core/worker amounts, the 16c runs were the fastest on the 32c machine but still only 20% faster on average than the 4c machine?

4. Again, looking for pointers in the formulation of my model to improve performance.

Digging into my second question, I have narrowed down the failure in my model to the model.AddReservoirContraintwithActive statements. If I make the minimum and maximum of the reservoir very large (-999999, 999999) I can get a solution out of it and am able to read the variable values. The actives for the task that takes from the reservoir are showing up which would drive it into a negative state. As soon as I change the capacity values to have a 0 floor the solver fails. In 7.3 the solver's behavior was spot on. If it scheduled a task that drove a reservoir into the negative, below the lower capacity limit, it would then schedule another task that added to the reservoir to counteract the deficit. What changed in regard to the reservoir constraints in 7.4?

Any input on my other questions?

Thanks in advance.

Brad Eddowes

unread,

Jan 6, 2020, 2:10:34 PM1/6/20

to or-tools-discuss

The start variable is for the task while l_start is for a specific piece of equipment performing the task. The model then says that the task start == l_start if l_presence where l_presence is the model's way of choosing which machine will perform that task. In your example, scratch.py, you have machine 0 and machine1. Therefore each task you will an l_* variable of each type for each machine. The statement model.Add(sum(l_preseneces) == 1) is telling it that it can only select one set of l_* variables to represent the task.

You know which variables are chosen by looking at which l_presence variable is true for a given task.

In the example, I reference to found here, the method addMachDynaDown creates a circuit constraint that represents the progression of tasks on a machine. If the model chooses the presence value of task i to be on machine 0 and task j to be on machine 0 then the circuit constraint says that j comes after i. This is done by the model choosing to set the specific lit variable of that relationship true. Now in that logic you can say that if lit is true, and j comes after i on a particular machine, j's start or end or duration must be altered in a certain way to account for tooling.

In the code you linked to I do not see anything making adjustments to the value of the horizon variable after the print statement. It still has a value of 38 at line 173, defining the upper bound of the makespan variable and thus by the AddMaxEquality constraint on line 174, saying that the maximum end time of all tasks cannot exceed 38.

Again, in that example code, I do not see your attempt to add a circuit constraint so I'm not sure if you posted the correct one.

Aishwarya MV

unread,

Jan 7, 2020, 9:28:44 AM1/7/20

to or-tools...@googlegroups.com

Your inputs on this are very comprehensible! Thank you!. This code that I linked is a direct manipulation of a code that you and Laurent linked me to on Github. I had a lot of difficulty in incorporating the circuit constants for my program as the variables used there are different and a lot of errors used to pop up every time. I actually make logical changes on a copy of this code so that the main logic stays safe. As I am really confused with what each variable does and since I do not have changes in machine_materials(only tasks are changed), may I ask your suggestions on the variables that I can use to incorporate AddMachDynaDown function to my program ?

And one more question:

transition_time = machineDownMaterial[machine_id]

This line in the code you linked sets the transition time with respect to material linked to a machine id if I am not wrong. But in my case, I have a different down time for each task. May I ask how am I supposed to address this?

Thank you for bearing with me and my stupid questions,

Have a great day!

Message has been deleted

Aishwarya MV

unread,

Jan 7, 2020, 1:22:06 PM1/7/20

to or-tools-discuss

With the help of your inputs, I tried incorporating the AddMachDynaDown in my own way into the code. But I get this error:

line 242, in flexible_jobshop

arcs.append([0, i + 1, model.NewBoolVar('')])

TypeError: cannot concatenate 'str' and 'int' objects

Is there any way to fix this?

Thank you in advance!

P.S: I've attached the modified code.

scratch2.py

Brad Eddowes

unread,

Jan 9, 2020, 6:39:30 PM1/9/20

to or-tools-discuss

I am not getting that error when I run your program. The error I'm getting is on line 255 and is as follows

Traceback (most recent call last):

File "c:/Users/brad74/Downloads/scratch2.py", line 308, in <module>

flexible_jobshop()

File "c:/Users/brad74/Downloads/scratch2.py", line 255, in flexible_jobshop

if op_names[res[0], res[1], i] != op_names[res[0], res[1], j]: # if the task names of the two are not

KeyError: (0, 0, 2)

You are using the values found in res[] which is the list of numbers found in the name of the start variable, assigned on line 237. The numbers found in the start variable come from the variable alt_suffix defined on line 163 as the job index, task index, and alt index. You're attempting to use res[0] and res[1], which could have the values [0,1] and [0,1,2,3] respectively because you have 2 jobs and 4 tasks per job. Then you are using i and j as a third key-value and both i and j can be the values [0-8] because there is a start variable for every task from each job going to each machine. The dictionary you are mapping does not have values for every possible combination of value from the three things you are using as keys in the code. I removed the third key from the dictionary(ln260, ln44-60), changed res to res_i (ln234-238) to represent the ith start and added res_ j (ln250-254) to represent the jth start.

This caused the solution to be infeasible because as I described in a previous response, you would make variables exceed their upper bound because the upper bound is defined as the horizon (is set to 38 in your problem) and you are not adjusting the horizon to account for the much larger than 38 setup times. Therefore, to get a solution, I arbitrarily set the horizon to 999 (ln95).

Now your program appears to run correctly. It is attached.

scratch2.py

Aishwarya MV

unread,

Jan 11, 2020, 10:18:11 AM1/11/20

to or-tools...@googlegroups.com

Thank you very much for that! I really am grateful!

After I ran the code, it worked absolutely fine! But I realized that the solver has not added a setup time to any of the tasks in Job 1, due to the OnlyEnforceIf(lit) in line 298 if I am not wrong. Is there a way to get the solver to compulsorily add a setup time either when the machine changes or the task changes?

I also have another question:

In my code, if you observe tasks (0, 0) is same as (0, 1) but the operation is being carried out on different pieces. This is applicable to all those tasks that have the same name. My question is: Is there a way to get the solver to assign the tasks with same names to the same machine (Ex. If task (0,0) is carried out on machine 0, task(0, 1) should also be carried out on machine 0) ?

Thanks in advance for taking time from your busy schedule.

Hope you have a great day!

Brad Eddowes

unread,

Jan 13, 2020, 10:09:04 AM1/13/20

to or-tools-discuss

if alt_0_0_0 is for task (0,0) machine 0, alt_0_0_1 is for task (0,0) machine 1, alt_0_1_0 is for task (0,1) machine 0, and alt_0_1_1 is for task (0,1) machine 1 then you could say something like model.add(alt_0_1_0 == True).OnlyEnforceIf(alt_0_0_0) and model.add(alt_0_1_1 == True).OnlyEnforceIf(alt_0_0_1) since the algorith has to pick either alt_0_0_0 or *_1 for the first task then that choice will force the decision of the second task.

Laurent Perron

unread,

Jan 13, 2020, 10:23:04 AM1/13/20

to or-tools-discuss

model.add(alt_0_1_1 == True).OnlyEnforceIf(alt_0_0_1)

you can rewrite it as:

model.AddImplication(alt_0_0_1, alt_0_1_1)

Laurent Perron | Operations Research | lpe...@google.com | (33) 1 42 68 53 00

To unsubscribe from this group and stop receiving emails from it, send an email to or-tools-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/or-tools-discuss/5e12e3c4-3dce-4f28-9dda-7b21723dd615%40googlegroups.com.

Brad Eddowes

unread,

Jan 13, 2020, 11:36:23 AM1/13/20

to or-tools-discuss

Good catch, Laurent! I don't often think of boolean constraints first!

I missed your first question, Aishwarya, but the reason job 1 isn't seeing any setup times is because you defined on line 260 that you only get a setup time if the tasks are different. Task_1_0 and task_1_1 are the same types. task_1_2 and task 1_3 are the same but different from tasks 0 and 1 so the solver chose a different machine to avoid assigning setup times on the machine used for tasks 0 and 1. If you always want setup time, remove the if...else of lines 260-264 and just use line 262 to assign a value into the transition_time variable.

To view this discussion on the web visit https://groups.google.com/d/msgid/or-tools-discuss/5e12e3c4-3dce-4f28-9dda-7b21723dd615%40googlegroups.com.

Aishwarya MV

unread,

Jan 13, 2020, 1:33:57 PM1/13/20

to or-tools-discuss

Thank for the valuable inputs!

I tried writing model.AddImplication(alt_0_0_1, alt_0_1_1) but it gave an error as : "Unresolved reference to alt_0_0_1"

But surprisingly, now the solver groups similar tasks on the same machine (I even added another similar task to check this fact).

If I consider your suggestion to just assign a value to the transition_time variable, the solver add a setup time for each and every task which I wouldn't require. I desire that the sover adds a setup time every time the operation changes (not when the piece changes for the same operation) (Ex. When the operation changes from Sizing to Pocket or Sizing to drilling). I also need it to add a setup time at the beginning of the first operation, i.e, in this case, Sizing for each job.

How do I go about this? Should I still remove the if...else lines of 260 - 264 or is there an alternative condition that works in this case? If yes, may i know what it is and the line nos. where I should add it?

Thank You in advance.

With warm regards.

(P.S I've attached my updated code)

scratch2_updatedc.py

Reply all

Reply to author

Forward