I'm new to optimization and Gurobi. I'm trying to maximize revenue that depends on the probability output of a logistic regression model (though I'd like to use other algorithms later, if possible). My project involves distributing financial aid to prospective students of a university. As I give them aid (scholarship money), their probability of enrollment increases. I'm trying to determine which students should receive aid and how much each should get. Is what I'm trying to do possible? Is there a better way to formulate this while still incorporating an "outside" logistic regression function?
I have a dictionary that contains all of the student information needed to calculate the probability of enrollment (TEST_17_A_DICT, used in the objective function below). I'm using sklearn's logistic regression function. This dictionary contains the aid amount and the net tuition along with many constants (removed from the code for clarity).
My enrollment probability function and simplified code are below:
#--------------------------------------------------------------------
import pandas as pd
import numpy as np
from gurobipy import *
from sklearn import linear_model
def enroll_probability(TEST_17_A_DICT, ID_n, additional_aid_n):
    # `logistic` is my fitted sklearn LogisticRegression model (fitting code not shown)
    data = {'Financial_Aid': TEST_17_A_DICT[ID_n][2] + additional_aid_n,
            'NetPrice': TEST_17_A_DICT[ID_n][11] - additional_aid_n}
    index = np.arange(1)
    df = pd.DataFrame(data, index=index)
    prob = logistic.predict_proba(df)
    return prob[0][1]  # Returns a probability from 0.0 to 1.0
IDS = TEST_17_A['ID'] # All student IDs in the test dataframe
#/////////////////////GUROBI OPTIMIZATION//////////////////////
m = Model()
# Add Variables
ID_n = {} # Student ID
additional_aid_n = {} # Amount of aid to give
for i in IDS:
    ID_n[i] = m.addVar(vtype=GRB.BINARY, name="%d" % (i))
    additional_aid_n[i] = m.addVar(vtype=GRB.INTEGER, lb=0, name="aid_%d" % (i))
m.update()
m.addConstr( quicksum( additional_aid_n[i] for i in IDS ) <= 100000 ) # Total aid should not exceed 100000
m.addConstr( quicksum( ID_n[i] for i in IDS ) <= 100 ) # Only award up to 100 students
m.addConstrs((additional_aid_n[i] >= 100 for i in (IDS)), name='c') # Aid amount must be at least 100
m.addConstrs((additional_aid_n[i] <= 4000 for i in (IDS)), name='c1') # Aid amount must be at most 4000
m.update()
# Revenue: probability_of_enrollment*(NetPrice - additional_aid)
# Student NetPrice is TEST_17_A_DICT[i][11]
m.setObjective(
    quicksum(
        enroll_probability(TEST_17_A_DICT, i, additional_aid_n[i])
        * (TEST_17_A_DICT[i][11] - additional_aid_n[i])
        * ID_n[i]
        for i in IDS
    ),
    GRB.MAXIMIZE)
m.optimize()
#--------------------------------------------------------------------
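Written out, the model the code above is trying to express is roughly the following. The symbols are my own shorthand and do not appear in the code: a_i is additional_aid_n[i], z_i is ID_n[i], N_i is the student's NetPrice, F_i is their existing Financial_Aid, and p_i is the enrollment probability. The form of p_i assumes the standard logistic function, with all other (constant) features folded into c_i.

```latex
\begin{aligned}
\max_{a,\,z} \quad & \sum_{i} p_i(a_i)\,\bigl(N_i - a_i\bigr)\, z_i \\
\text{s.t.} \quad & \sum_{i} a_i \le 100000, \\
& \sum_{i} z_i \le 100, \\
& 100 \le a_i \le 4000, \quad a_i \in \mathbb{Z}, \quad z_i \in \{0, 1\} \quad \text{for all } i, \\[4pt]
\text{where} \quad & p_i(a_i) = \frac{1}{1 + \exp\!\bigl(-\bigl(c_i + \beta_1 (F_i + a_i) + \beta_2 (N_i - a_i)\bigr)\bigr)}.
\end{aligned}
```

So the objective is a product of a nonlinear function of a_i, a linear term in a_i, and a binary variable. Note that, as coded, the 100 to 4000 bounds apply to every student, not only the ones with z_i = 1.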
I'm getting this error (AttributeError: 'gurobipy.LinExpr' object has no attribute 'ndim') when enroll_probability tries to create the pandas DataFrame. I'm assuming it has to do with 'Financial_Aid':TEST_17_A_DICT[ID_n][2] + additional_aid_n, since the first term is an integer and additional_aid_n is a Gurobi variable, which makes the sum a 'gurobipy.LinExpr' rather than a number. Being new to this, it's very likely that I'm going about this entirely wrong, and I would appreciate any advice.
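To make the issue concrete, here is a minimal sketch of the same calculation done on plain numbers instead of Gurobi variables. It is only an illustration, not my real model: the coefficients are made up (standing in for a fitted model's intercept_ and coef_), the helper names are hypothetical, and the student values are invented. It evaluates the logistic probability at a fixed grid of aid amounts, so that everything handed to a solver would be a constant. One direction I could imagine is precomputing a table like this per student, but I don't know if that is the right approach.

```python
import math

# Hypothetical coefficients standing in for a fitted sklearn model's
# intercept_ and coef_ (NOT taken from my real model).
INTERCEPT = -1.0
COEF_FINANCIAL_AID = 0.0004
COEF_NET_PRICE = -0.0001


def enroll_probability_numeric(financial_aid, net_price, additional_aid):
    """Logistic enrollment probability for plain numeric inputs."""
    z = (INTERCEPT
         + COEF_FINANCIAL_AID * (financial_aid + additional_aid)
         + COEF_NET_PRICE * (net_price - additional_aid))
    return 1.0 / (1.0 + math.exp(-z))


def expected_revenue_table(financial_aid, net_price, aid_levels):
    """Expected revenue p * (net_price - aid) at each fixed aid level."""
    return {
        aid: enroll_probability_numeric(financial_aid, net_price, aid)
        * (net_price - aid)
        for aid in aid_levels
    }


# Candidate aid amounts between the 100 and 4000 bounds, in steps of 100.
AID_LEVELS = list(range(100, 4001, 100))

# One made-up student: 5000 in existing aid, 20000 net price.
table = expected_revenue_table(5000, 20000, AID_LEVELS)
best_aid = max(table, key=table.get)  # aid level with the highest expected revenue
```

Because every entry in table is an ordinary float, nothing here touches a 'gurobipy.LinExpr', which is the difference from my enroll_probability call inside setObjective.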
Thanks for any help or suggestions!
Chris