Populate a Pyomo model from JSON/YAML

Gustavo Bittencourt

Oct 3, 2016, 12:07:55 PM
to Pyomo Forum
Hi.

I'm sorry if this question is a duplicate, but I couldn't find the answer. I saw that the Pyomo 4.1 release offers a "New JSON/YAML format for parameter data", but I didn't find any documentation about it.

I have my Pyomo abstract model and a Python script that I wrote to generate my test instances. The script outputs my sets and parameters nested inside a dictionary and saves them in a JSON file.

I need to import this JSON data into my model to populate it before solving. Does anyone have any tips on how I could do that?

Thank you in advance,

Gustavo Bittencourt

Gabriel Hackebeil

Oct 3, 2016, 12:32:54 PM
to pyomo...@googlegroups.com
Gustavo,

If you are comfortable importing your data into Python (the reverse of what you do to save it to JSON), you might enjoy starting from a ConcreteModel. You can define your model inside a function that takes concrete data objects (e.g., integers, lists, dictionaries, filenames) as arguments. For instance:

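from pyomo.environ import ConcreteModel, Set, Param, Var, Objective, Constraint

def create_my_model(a_list, a_dictionary, a_number):
    model = ConcreteModel()
    model.index = Set(initialize=a_list)
    model.p = Param(model.index, initialize=a_dictionary)
    model.q = Param(initialize=a_number)
    model.x = Var(model.index)
    # Objective: sum of p[i] * x[i] over the index set
    model.o = Objective(expr=sum(model.p[i] * model.x[i] for i in model.index))
    # x is indexed, so the lower bound q is declared once per index via a rule
    model.c = Constraint(model.index, rule=lambda m, i: m.x[i] >= m.q)
    return model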

Calling this function with data is equivalent to calling create_instance on an AbstractModel object with a filename where specially formatted data is stored. If you import your data using JSON, you can pass this data dictionary directly into your function and use it to define the concrete model. If you generate your data inside Python every time you solve your model, you can bypass this file creation process altogether (you can even use hybrid approaches).
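As a rough sketch of that JSON route: the file contents can be read with the standard json module and the pieces handed to a function like create_my_model. The JSON layout and the key names "index", "p", and "q" are assumptions made up for illustration, not a format Pyomo prescribes. One detail to watch is that JSON object keys are always strings, so integer indices have to be converted back after loading.

```python
import json

# Hypothetical JSON text, standing in for the contents of a saved instance file.
raw = '{"index": [1, 2, 3], "p": {"1": 10.0, "2": 20.0, "3": 30.0}, "q": 5}'

data = json.loads(raw)

a_list = data["index"]
# JSON object keys are always strings; convert them back to integers so
# that they match the elements of a_list.
a_dictionary = {int(key): val for key, val in data["p"].items()}
a_number = data["q"]

# These three objects could then be passed to the model-building function:
# model = create_my_model(a_list, a_dictionary, a_number)
```

For two-index parameters the same idea applies, except that the string keys would need to be parsed back into tuples, since JSON has no tuple keys.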

I believe the JSON functionality is available, but it involves nesting the data dictionary with None keys in various places that don’t make much sense to a user, so we have not documented it. Let me know if you would still rather go this route and I’ll see if I can put together a small example.
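A sketch of what that nesting can look like, assuming the in-memory layout that the Pyomo documentation describes for passing a Python dictionary to create_instance: an outer None key for the default namespace, and an inner None key for sets and scalar parameters. The helper name to_pyomo_data and the component names I, p, and q are hypothetical, and the layout should be checked against the documentation for the Pyomo version in use.

```python
def to_pyomo_data(sets, scalar_params, indexed_params):
    """Nest plain Python data under None keys (assumed layout, see above)."""
    namespace = {}
    for name, members in sets.items():
        # Sets store their members under an inner None key.
        namespace[name] = {None: list(members)}
    for name, val in scalar_params.items():
        # Scalar parameters store their single value under None as well.
        namespace[name] = {None: val}
    for name, values in indexed_params.items():
        # Indexed parameters are keyed directly by index.
        namespace[name] = dict(values)
    # The outer None key names the default namespace.
    return {None: namespace}


data = to_pyomo_data(
    sets={"I": [1, 2, 3]},
    scalar_params={"q": 5},
    indexed_params={"p": {1: 10.0, 2: 20.0, 3: 30.0}},
)

# With an AbstractModel that declares I, p, and q, this would be used as:
# instance = model.create_instance(data)
```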

Gabe
    

Gustavo Bittencourt

Oct 13, 2016, 5:52:08 PM
to pyomo...@googlegroups.com
Dear Gabe,

I just saw your answer today. Thank you very much for the kindness of your response!

I did exactly what you said, and it worked correctly! I was wondering if there are other advantages of creating abstract models instead of concrete models, beyond the ability to import .dat files.

Also, I have a two-index parameter that represents a distance matrix between vertices (i0, i1) and is stored as a Pandas DataFrame. As my model was taking a while to be generated, I profiled the code and found that it was spending 17% of the time populating this parameter and 26% of the time generating the objective function. Do you have any idea what can be done to make it faster?

Populating the parameter TB:
    def TB_init(model, i0, i1):
        return inst.TB_ii.loc[i0, i1]
    model.TB = Param(model.I, model.I, initialize=TB_init, doc='Transport cost of second echelon')

Generating objective function:
    def objective_function(model):
        return sum(model.FJ[j] * model.w[j] for j in model.J) + \
            sum(model.FK[k] * model.z[k] for k in model.K) + \
            sum(model.FV[v] * model.q[v] for v in model.V) + \
            sum(model.TB[i0, i1] * model.e[i0, i1, v]
                for i0 in model.I for i1 in model.I for v in model.V) + \
            sum((model.TA[j, k] + model.H[k] / 2) * model.x[j, k, l, v]
                for j in model.J for k in model.K for l in model.L for v in model.V) + \
            sum(model.S * model.s[l] for l in model.L)

    model.objective = Objective(
        rule=objective_function, sense=minimize, doc='Objective function')



Thank you again!

Best regards,

Gustavo

Update: Your tip about initializing parameters from dictionaries gave me the idea of converting the DataFrame to a dictionary indexed by tuples and using it to initialize my matrix. The results were awesome: from 888 ms/iteration down to 35 ms/iteration. I'll put the code and the profile times here, because they might help someone else. If you have any clue about the objective function, I'd love to try it!

Line profile:

File: Modelo.py
Function: GerarModelo at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    53       200         2249     11.2      0.0      if method == "old":
    54       100          934      9.3      0.0          def TB_init(model, i0, i1):
    55                                                       return inst.TB_ii.loc[i0, i1]
    56       100     88857385 888573.8     11.5          model.TB = Param(model.I, model.I, initialize=TB_init, doc='Transportation cost of second echelon')
    57
    58       200         2046     10.2      0.0      if method == "new":
    59       100       604928   6049.3      0.1          dic_TB = dict(((i0, i1), dist) for i0, serie in inst.TB_ii.iterrows() for i1, dist in serie.iteritems())
    60       100      2935910  29359.1      0.4          model.TB = Param(model.I, model.I, initialize=dic_TB, doc='Transportation cost of second echelon')

Code:
    # The old method - 888 ms/iter: one DataFrame .loc lookup per (i0, i1) pair
    if method == "old":
        def TB_init(model, i0, i1):
            return inst.TB_ii.loc[i0, i1]
        model.TB = Param(model.I, model.I, initialize=TB_init, doc='Transportation cost of second echelon')

    # The new method, converting the Pandas DataFrame to a dictionary - 35 ms/iter
    # (Series.iteritems() was removed in pandas 2.0; Series.items() is the equivalent)
    if method == "new":
        dic_TB = dict(((i0, i1), dist) for i0, serie in inst.TB_ii.iterrows() for i1, dist in serie.items())
        model.TB = Param(model.I, model.I, initialize=dic_TB, doc='Transportation cost of second echelon')
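Another way to build the same tuple-keyed dictionary, offered as a sketch and not as a benchmarked replacement: DataFrame.to_dict(orient="index") returns a nested {row: {column: value}} dictionary in a single call, which can then be flattened. The small DataFrame below is made-up data standing in for inst.TB_ii.

```python
import pandas as pd

# Made-up 2x2 distance matrix standing in for inst.TB_ii.
TB_ii = pd.DataFrame(
    [[0.0, 4.0], [4.0, 0.0]], index=["a", "b"], columns=["a", "b"]
)

# Pull everything out of pandas in one call: {row: {column: value}}.
nested = TB_ii.to_dict(orient="index")

# Flatten to a dictionary keyed by (i0, i1) tuples.
dic_TB = {
    (i0, i1): dist
    for i0, row in nested.items()
    for i1, dist in row.items()
}

# dic_TB could then be passed as initialize=dic_TB to Param, as in the code above.
```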




Gabriel Hackebeil

Oct 14, 2016, 12:12:17 AM
to pyomo...@googlegroups.com
> I did exactly what you said, and it worked correctly! I was wondering if there are other advantages of creating abstract models instead of concrete models, beyond the ability to import .dat files.

I find that debugging is a bit easier with ConcreteModels because you can interact with objects right after you declare them. With AbstractModels, you have to do this after you construct an instance or by adding some code inside a “rule” somewhere. There are good arguments for using either. I think it comes down to personal preference in the end.

> As my model was taking a while to be generated, I profiled the code and found that it was spending 17% of the time populating this parameter

Note that once you start using ConcreteModels, there is no longer a need to use the Param object unless you make it mutable. When you declare a Param with “mutable=True”, this allows you to update the Parameter value at a later point in time (which updates any expressions that reference it without having to reconstruct anything). So you might obtain some speedup by using the DataFrame array directly in place of where you are using that Param.

If DataFrame is implemented using numpy arrays (it might not be), it would also be a good idea to extract that data in bulk into a Python list or dictionary once at the beginning (e.g., with numpy.ndarray.tolist()). Numpy arrays are inefficient when you access them one index at a time; they are designed to be fast at vector operations.
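A minimal sketch of the bulk-extraction idea, using made-up data: ndarray.tolist() converts the whole array into nested Python lists of native floats in one call, so later element lookups never go through numpy.

```python
import numpy as np

dist = np.array([[0.0, 4.0], [4.0, 0.0]])  # made-up distance matrix

# One bulk conversion up front instead of many dist[i, j] lookups later.
rows = dist.tolist()

# Elements are now plain Python floats inside plain Python lists.
total = sum(rows[i][j] for i in range(2) for j in range(2))
```

For a DataFrame, the underlying array is available through df.to_numpy() in recent pandas versions (df.values in older ones), so the same conversion would be df.to_numpy().tolist().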
 
> and 26% of the time generating the objective function.

That expression looks like it could contain quite a large number of terms, depending on the sizes of J, K, L, and V. I don’t see any obvious ways to speed it up. One of the other developers might have a suggestion. If you’re comfortable posting a working example, we might be able to take a closer look or use it for future profiling.

Gabe
