Estimation of closed-form likelihood function


Jason Hawkins

Apr 1, 2020, 9:51:17 AM
to TensorFlow Probability
Hi,

I am experimenting with TensorFlow Probability and trying to pass a closed-form likelihood to an MCMC sampler. All the examples I have seen specify a joint distribution and call log_prob() on it, e.g.,
neg_log_likelihood = -tf.reduce_sum(mixture_dist.log_prob(targets))
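For context, the joint-distribution pattern I mean looks roughly like this (a toy regression sketch with made-up data, not my actual model):

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Toy data, just for the sketch.
features = tf.constant([1., 2., 3., 4.])
targets = tf.constant([2.1, 3.9, 6.2, 8.1])

# Priors and likelihood bundled into one joint distribution...
model = tfd.JointDistributionSequential([
    tfd.Normal(loc=0., scale=1.),              # prior on the slope
    tfd.HalfNormal(scale=1.),                  # prior on the noise scale
    lambda sigma, beta: tfd.Independent(       # likelihood (args arrive in reverse order)
        tfd.Normal(loc=beta * features, scale=sigma),
        reinterpreted_batch_ndims=1),
])

# ...so the sampler's target is just the joint log density with the data pinned.
def target_log_prob_fn(beta, sigma):
    return model.log_prob([beta, sigma, targets])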

My model can be specified as a set of logistic functions, but it has a standard closed-form solution that I would like to use (in traditional optimization the closed form drastically improves speed; I'm new to Bayesian estimation, so I'm not sure how much difference the closed form makes there, but I assume it still helps).

I am following the setup here for MCMC:

and some of the examples here for setting up priors, etc.: 

https://github.com/tensorflow/probability/blob/master/tensorflow_probability/examples/jupyter_notebooks/Multilevel_Modeling_Primer.ipynb


My code is in a colab notebook here: 

https://drive.google.com/file/d/1L9JQPLO57g3OhxaRCB29do2m808ZUeex/view?usp=sharing


I get the error: OperatorNotAllowedInGraphError: iterating over tf.Tensor is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function. It would also be ideal if I could pass the starting parameter values as a single input (the example I am working from doesn't do this, but I assume it is possible).
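On the single-input point, I assume something like packing the parameters into one tensor and unpacking them inside the function would work (a rough, untested sketch):

import tensorflow as tf

@tf.function
def mmnl_log_prob_packed(params):
    # Unpack a single rank-1 state tensor into the individual parameters
    # (assumed order: mu_b_time, sigma_b_time, a_car, a_train, a_sm, b_cost, scale).
    (init_mu_b_time, init_sigma_b_time, init_a_car, init_a_train,
     init_a_sm, init_b_cost, init_scale) = tf.unstack(params)
    return mmnl_log_prob(init_mu_b_time, init_sigma_b_time, init_a_car,
                         init_a_train, init_a_sm, init_b_cost, init_scale)

# The chain state would then be a single tensor instead of a list of seven:
init_state = tf.constant([0., 0.5, 0., 0., 0., 0., 0.5])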

@tf.function
def mmnl_log_prob(init_mu_b_time, init_sigma_b_time, init_a_car, init_a_train, init_a_sm, init_b_cost, init_scale):
    # Create priors for hyperparameters
    mu_b_time = tfd.Sample(tfd.Normal(loc=init_mu_b_time, scale=init_scale), sample_shape=1).sample()
    # HalfCauchy distributions are too wide for logit discrete choice
    sigma_b_time = tfd.Sample(tfd.Normal(loc=init_sigma_b_time, scale=init_scale), sample_shape=num_idx).sample()

    # Create priors for parameters
    a_car = tfd.Sample(tfd.Normal(loc=init_a_car, scale=init_scale), sample_shape=1).sample()
    a_train = tfd.Sample(tfd.Normal(loc=init_a_train, scale=init_scale), sample_shape=1).sample()
    a_sm = tfd.Sample(tfd.Normal(loc=init_a_sm, scale=init_scale), sample_shape=1).sample()
    b_cost = tfd.Sample(tfd.Normal(loc=init_b_cost, scale=init_scale), sample_shape=1).sample()

    # Define a heterogeneous random parameter model with MultivariateNormalDiag()
    # Use MultivariateNormalDiagPlusLowRank() to define nests, etc.
    b_time = tfd.MultivariateNormalDiag(  # b_time
        loc=mu_b_time,
        scale_diag=sigma_b_time).sample()

    # Definition of the utility functions
    V1 = a_train + b_time * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
    V2 = a_sm + b_time * SM_TT_SCALED + b_cost * SM_COST_SCALED
    V3 = a_car + b_time * CAR_TT_SCALED + b_cost * CAR_CO_SCALED

    # Definition of the log-likelihood
    eV1 = tfm.multiply(tfm.exp(V1), TRAIN_AV_SP)
    eV2 = tfm.multiply(tfm.exp(V2), SM_AV_SP)
    eV3 = tfm.multiply(tfm.exp(V3), CAR_AV_SP)
    eVD = eV1 + eV2 + eV3
    l1 = tfm.multiply(tfm.truediv(eV1, eVD), tf.cast(tfm.equal(CHOICE, 1), tf.float32))
    l2 = tfm.multiply(tfm.truediv(eV2, eVD), tf.cast(tfm.equal(CHOICE, 2), tf.float32))
    l3 = tfm.multiply(tfm.truediv(eV3, eVD), tf.cast(tfm.equal(CHOICE, 3), tf.float32))
    ll = tfm.reduce_sum(tfm.log(l1 + l2 + l3))

    return ll

The function is called with the inputs:

init_state = [tf.zeros(1, name='init_mu_b_time'),
              0.5 * tf.ones(1, name='init_sigma_b_time'),
              tf.zeros(1, name='init_a_car'),
              tf.zeros(1, name='init_a_train'),
              tf.zeros(1, name='init_a_sm'),
              tf.zeros(1, name='init_b_cost'),
              0.5 * tf.ones(1, name='init_scale')]

Jason Hawkins

Apr 1, 2020, 5:08:12 PM
to TensorFlow Probability
It looks like I needed to change the location of the @tf.function decorator. I will update once it is confirmed and working properly.

Jason Hawkins

Apr 2, 2020, 11:09:26 AM
to TensorFlow Probability
The sampler now runs, but it gives me the same value for every sample of each parameter. Is it a requirement that I pass a joint distribution through the log_prob() function? I am clearly missing something. I can run the likelihood through BFGS optimization and get reasonable results (I've estimated the model via maximum likelihood with fixed parameters in other software). My function is as follows:
@tf.function
def mmnl_log_prob(init_mu_b_time, init_sigma_b_time, init_a_car, init_a_train, init_b_cost, init_scale):

    # Create priors for hyperparameters
    mu_b_time = tfd.Sample(tfd.Normal(loc=init_mu_b_time, scale=init_scale), sample_shape=1).sample()
    # HalfCauchy distributions are too wide for logit discrete choice
    sigma_b_time = tfd.Sample(tfd.Normal(loc=init_sigma_b_time, scale=init_scale), sample_shape=1).sample()

    # Create priors for parameters
    a_car = tfd.Sample(tfd.Normal(loc=init_a_car, scale=init_scale), sample_shape=1).sample()
    a_train = tfd.Sample(tfd.Normal(loc=init_a_train, scale=init_scale), sample_shape=1).sample()
    # a_sm = tfd.Sample(tfd.Normal(loc=init_a_sm, scale=init_scale), sample_shape=1).sample()
    b_cost = tfd.Sample(tfd.Normal(loc=init_b_cost, scale=init_scale), sample_shape=1).sample()

    # Define a heterogeneous random parameter model with MultivariateNormalDiag()
    # Use MultivariateNormalDiagPlusLowRank() to define nests, etc.
    b_time = tfd.Sample(tfd.MultivariateNormalDiag(  # b_time
        loc=mu_b_time,
        scale_diag=sigma_b_time), sample_shape=num_idx).sample()

    # Definition of the utility functions
    V1 = a_train + tfm.multiply(b_time, TRAIN_TT_SCALED) + b_cost * TRAIN_COST_SCALED
    V2 = tfm.multiply(b_time, SM_TT_SCALED) + b_cost * SM_COST_SCALED
    V3 = a_car + tfm.multiply(b_time, CAR_TT_SCALED) + b_cost * CAR_CO_SCALED
    print("Vs", V1, V2, V3)

    # Definition of the log-likelihood
    eV1 = tfm.multiply(tfm.exp(V1), TRAIN_AV_SP)
    eV2 = tfm.multiply(tfm.exp(V2), SM_AV_SP)
    eV3 = tfm.multiply(tfm.exp(V3), CAR_AV_SP)
    eVD = eV1 + eV2 + eV3
    print("eVs", eV1, eV2, eV3, eVD)

    l1 = tfm.multiply(tfm.truediv(eV1, eVD), tf.cast(tfm.equal(CHOICE, 1), tf.float32))
    l2 = tfm.multiply(tfm.truediv(eV2, eVD), tf.cast(tfm.equal(CHOICE, 2), tf.float32))
    l3 = tfm.multiply(tfm.truediv(eV3, eVD), tf.cast(tfm.equal(CHOICE, 3), tf.float32))
    ll = tfm.reduce_sum(tfm.log(l1 + l2 + l3))

    print("ll", ll)

    return ll

I call it from here:
nuts_samples = 1000
nuts_burnin = 500
chains = 4
## Initial step size
init_step_size = .3
init = [0., 0., 0., 0., 0., .5]

##
## NUTS (using inner step size averaging step)
##
@tf.function
def nuts_sampler(init):
    nuts_kernel = tfp.mcmc.NoUTurnSampler(
        target_log_prob_fn=mmnl_log_prob,
        step_size=init_step_size,
        )
    adapt_nuts_kernel = tfp.mcmc.DualAveragingStepSizeAdaptation(
        inner_kernel=nuts_kernel,
        num_adaptation_steps=nuts_burnin,
        step_size_getter_fn=lambda pkr: pkr.step_size,
        log_accept_prob_getter_fn=lambda pkr: pkr.log_accept_ratio,
        step_size_setter_fn=lambda pkr, new_step_size: pkr._replace(step_size=new_step_size)
        )

    samples_nuts_, stats_nuts_ = tfp.mcmc.sample_chain(
        num_results=nuts_samples,
        current_state=init,
        kernel=adapt_nuts_kernel,
        num_burnin_steps=100,
        parallel_iterations=5)

    return samples_nuts_, stats_nuts_

samples_nuts, stats_nuts = nuts_sampler(init)
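For reference, the BFGS run that gives reasonable results looks roughly like this (a simplified sketch; the packed parameter vector and its ordering are made up here):

def neg_ll(params):
    # Negative log-likelihood of a packed parameter vector.
    mu_b_time, sigma_b_time, a_car, a_train, b_cost, scale = tf.unstack(params)
    return -mmnl_log_prob(mu_b_time, sigma_b_time, a_car, a_train, b_cost, scale)

@tf.function
def neg_ll_and_grad(params):
    # BFGS needs both the objective value and its gradient at each step.
    return tfp.math.value_and_gradient(neg_ll, params)

bfgs_results = tfp.optimizer.bfgs_minimize(
    neg_ll_and_grad,
    initial_position=tf.constant([0., 0.5, 0., 0., 0., 0.5]),
    max_iterations=200)

print(bfgs_results.position)    # estimated parameters
print(bfgs_results.converged)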


Franklin Abodo

Jan 19, 2021, 4:00:40 PM
to TensorFlow Probability, jfha...@gmail.com
Hey Jason,

Do you remember what particular change alleviated your "same sample every iteration" problem? I'm having the same problem after starting to use @tf.function, even after ruling out the possibility of any Python side effects occurring in the graph construction (or so I think).

One thing that I'm doing differently is creating my JointDistribution model outside of my custom joint log-prob function and calling that same object's log_prob method each iteration. This is something that has worked for me in the past in TF1 graph mode and TF2 eager mode.
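Roughly, my setup looks like this (simplified, with placeholder data and distributions):

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

x_data = tf.constant([0., 1., 2., 3.])
y_data = tf.constant([0.1, 0.9, 2.2, 2.8])

# The joint distribution is built once, outside the custom log-prob function...
joint = tfd.JointDistributionNamed(dict(
    w=tfd.Normal(loc=0., scale=1.),
    y=lambda w: tfd.Independent(
        tfd.Normal(loc=w * x_data, scale=1.),
        reinterpreted_batch_ndims=1),
))

# ...and the sampler's target closes over that same object every iteration.
def target_log_prob_fn(w):
    return joint.log_prob({'w': w, 'y': y_data})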

Regards,

foabodo
