Feb 21, 2023, 9:24:47 PM

to TensorFlow Probability

I am getting the error below:

ValueError: Dimensions must be equal, but are 10 and 60000 for '{{node mcmc_sample_chain/dual_averaging_step_size_adaptation___init__/_bootstrap_results/transformed_kernel_bootstrap_results/NoUTurnSampler/.bootstrap_results/process_args/maybe_call_fn_and_grads/value_and_gradients/value_and_gradient/JointDistributionSequential/log_prob/add_4}} = AddV2[T=DT_FLOAT](mcmc_sample_chain/dual_averaging_step_size_adaptation___init__/_bootstrap_results/transformed_kernel_bootstrap_results/NoUTurnSampler/.bootstrap_results/process_args/maybe_call_fn_and_grads/value_and_gradients/value_and_gradient/JointDistributionSequential/log_prob/add_3, mcmc_sample_chain/dual_averaging_step_size_adaptation___init__/_bootstrap_results/transformed_kernel_bootstrap_results/NoUTurnSampler/.bootstrap_results/process_args/maybe_call_fn_and_grads/value_and_gradients/value_and_gradient/JointDistributionSequential/log_prob/Deterministic/log_prob/Log)' with input shapes: [10,10], [10,60000].

for this code:

%%time

@tf.function
def utilities(x, betas, errors):
    """
    `x * betas + errors` with broadcasting.
    """
    x = tf.cast(x, dtype=tf.float32)
    util = (tf.transpose(x) * betas) + errors
    return util

k = 10
sigma_beta = 1.
sigma_error = 1.

def alt_pooled_model(X_train):
    return tfd.JointDistributionSequential([
        tfd.HalfCauchy(loc=0., scale=sigma_beta, name="sigma_beta"),
        tfd.HalfCauchy(loc=0., scale=sigma_error, name="sigma_error"),
        tfd.Normal(loc=tf.zeros(k), scale=sigma_beta, name="beta"),
        tfd.Gumbel(loc=0., scale=sigma_error, name="error"),
        lambda error, beta: tfd.Deterministic(
            tf.math.argmax(
                tfd.Multinomial(
                    total_count=1,
                    logits=utilities(X_train, beta[..., tf.newaxis], error[..., tf.newaxis]),
                ).sample(), axis=0
            )
        ),
    ])

def target_log_prob(sigma_beta, sigma_error, beta, error):
    return alt_pooled_model(X_train).log_prob(sigma_beta, sigma_error, beta, error, best_choices)

# Use NUTS for inference
hmc = tfp.mcmc.NoUTurnSampler(
    target_log_prob_fn=target_log_prob,
    step_size=.01)

# Unconstrain the scale parameters, which must be positive
hmc = tfp.mcmc.TransformedTransitionKernel(
    inner_kernel=hmc,
    bijector=[
        tfp.bijectors.Identity(),  # sigma_beta
        tfp.bijectors.Identity(),  # sigma_error
        tfp.bijectors.Identity(),  # beta
        tfp.bijectors.Identity(),  # error
    ])

# Adapt the step size for 100 steps before burnin and main sampling
hmc = tfp.mcmc.DualAveragingStepSizeAdaptation(
    inner_kernel=hmc,
    num_adaptation_steps=100,
    target_accept_prob=.75)

# Initialize 10 chains using samples from the prior
joint_sample = alt_pooled_model(X_train).sample(10)

initial_state = [
    joint_sample[0],
    joint_sample[1],
    joint_sample[2],
    joint_sample[3],
    # tf.ones((X_train.shape[0],), dtype=tf.int32) * -1  # initialize with invalid choices
]

# Compile with tf.function and XLA for improved runtime performance
@tf.function(autograph=False, experimental_compile=True)
def run():
    return tfp.mcmc.sample_chain(
        num_results=500,
        current_state=initial_state,
        kernel=hmc,
        num_burnin_steps=200,
        trace_fn=lambda _, kr: kr,
        # trace_fn=lambda current_state, kernel_results: kernel_results
    )

samples, traces = run()

print('R-hat diagnostics: ', tfp.mcmc.potential_scale_reduction(samples))

Shape of X_train: (60000, 10)

Shape of best_choices: (60000,)

Can somebody please help me? I am not able to figure this out.

Feb 22, 2023, 12:00:42 PM

to Surbhi Gupta, TensorFlow Probability

I can get this to run with the following modifications:

def alt_pooled_model(X_train):
    return tfd.JointDistributionSequential([
        tfd.HalfCauchy(loc=0., scale=sigma_beta, name="sigma_beta"),
        tfd.HalfCauchy(loc=0., scale=sigma_error, name="sigma_error"),
        tfd.Independent(
            tfd.Normal(loc=tf.zeros(k), scale=sigma_beta),
            reinterpreted_batch_ndims=1,
            name="beta"),
        tfd.Gumbel(loc=0., scale=sigma_error, name="error"),
        lambda error, beta: tfd.Independent(
            tfd.Deterministic(
                tf.math.argmax(
                    tfd.Multinomial(
                        total_count=1,
                        logits=utilities(X_train, beta[..., tf.newaxis], error[..., tf.newaxis]),
                    ).sample(), axis=0)),
            reinterpreted_batch_ndims=1),
    ])

I added tfd.Independent(..) around terms that were erroneously contributing batch shapes.
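To make the shape problem concrete, here is a minimal NumPy sketch (NumPy stands in for TFP, the array names are hypothetical, and the 60000 responses are shrunk to 6). With 10 chains, the unwrapped beta term reports a log-prob of shape [10, 10] and the unwrapped final term one of shape [10, 60000], which is the pair of shapes in the error message. Wrapping a distribution in tfd.Independent with reinterpreted_batch_ndims=1 sums its log-prob over the last batch dimension, which the sketch mimics with a sum over the last axis.

```python
import numpy as np

num_chains, k, n = 10, 10, 6  # n stands in for the 60000 responses

# Per-element log-probs as the unwrapped distributions would report them.
beta_lp = np.zeros((num_chains, k))    # one value per chain and per item
choice_lp = np.zeros((num_chains, n))  # one value per chain and per response

try:
    beta_lp + choice_lp                # (10, 10) and (10, 6) do not broadcast
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Summing over the last axis mirrors reinterpreted_batch_ndims=1:
# each term is reduced to one log-prob per chain before the terms are added.
total_lp = beta_lp.sum(axis=-1) + choice_lp.sum(axis=-1)
print(broadcast_ok, total_lp.shape)
```

After the reduction every term has one entry per chain, so the joint log-prob can be accumulated term by term.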

But there's more weirdness here:

- The HalfCauchy distributions are not actually being used for the scale terms of beta and error. Inside the model, sigma_beta and sigma_error refer to the module-level constants (both 1.), not to the values sampled from the first two distributions.

- Most worrisome: there is a sampling procedure (the .sample() call) in the definition of the final distribution, so the log-probs of this model will not be deterministic. I'm not sure exactly what you are aiming to accomplish; if you can say more, maybe I can offer some guidance code-wise. Maybe this is of relevance?
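One standard way to avoid sampling inside a model like this, offered as a hedged sketch and not necessarily what is intended here, is the Gumbel-max trick: if every item gets its own independent standard Gumbel error, the index of the largest utility follows a categorical distribution with probabilities softmax(utilities). The noise can then be integrated out in closed form and the log-prob becomes deterministic. The utilities below are made-up numbers, and NumPy is used in place of TFP to keep the check self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
utilities = np.array([1.0, 0.0, -0.5, 2.0])  # hypothetical item utilities

# Closed form: P(item i has the largest noisy utility) = softmax(utilities)[i].
probs = np.exp(utilities - utilities.max())
probs /= probs.sum()

# Simulation: add independent standard Gumbel noise and take the argmax.
draws = 200_000
noise = rng.gumbel(loc=0.0, scale=1.0, size=(draws, utilities.size))
winners = np.argmax(utilities + noise, axis=-1)
freqs = np.bincount(winners, minlength=utilities.size) / draws

print(np.round(probs, 3))
print(np.round(freqs, 3))
```

The two printed vectors should agree up to Monte Carlo error, which is why a categorical likelihood on the utilities can replace the explicit noise draw and argmax.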

Feb 22, 2023, 12:27:00 PM

to Christopher Suter, TensorFlow Probability

Thanks Christopher.

I am trying to build a utilities model for a MaxDiff survey.

X_train has shape (60000, 10), i.e. (total responses, total number of items). The total number of items (k) is 10. The total number of responses is the number of respondents times the number of question sets per respondent (5000 × 12). Each question set shows only 5 items, and respondents select the best and worst items. X_train is a binary matrix in which 1 means the item is present in the question set and 0 means it is absent.

best_choices is of shape (60000,) and represents the best choices of the respondents for the questions.
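The data layout described above can be sketched on a toy scale as follows. The item indices and choices are invented for illustration, and only two question sets are shown instead of 60000.

```python
import numpy as np

k = 10  # total number of items

# Hypothetical question sets: the 5 item indices shown in each one.
shown = [[0, 2, 3, 7, 9],
         [1, 2, 4, 5, 8]]

# Binary design matrix: 1 where the item appears in the question set, 0 otherwise.
X_toy = np.zeros((len(shown), k), dtype=np.int32)
for row, items in enumerate(shown):
    X_toy[row, items] = 1

# Hypothetical best choice per question set; each must be one of the shown items.
best_toy = np.array([3, 8])

print(X_toy)
print(X_toy[np.arange(len(shown)), best_toy])
```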

I am using the utility function X_train * betas + error.

Each item should have a beta.

The utility function should give a utility for each of the 10 items.

Does this help? 😬

Thanks,

Surbhi Gupta

Feb 22, 2023, 12:27:53 PM

to Christopher Suter, TensorFlow Probability

The best choice should be the item with the maximum utility.

Cheers,

Surbhi Gupta

Feb 22, 2023, 12:37:30 PM

to TensorFlow Probability, Surbhi Gupta, TensorFlow Probability, c...@google.com

Do you think the Multinomial is not required?

Feb 23, 2023, 12:56:44 PM

to Surbhi Gupta, TensorFlow Probability

Hi Surbhi, I wrote down what I think you are trying to do: https://colab.research.google.com/drive/1CdFx-CeYUiSJmsA_05n9A0LDMas-x6qq

Hope this helps! Happy to clarify anything you like.
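The linked notebook is not reproduced in this thread. As a rough sketch of the kind of likelihood the survey description suggests, and not a copy of the notebook, the best choice in each question set can be scored with a softmax restricted to the items that were shown. The function name, the toy data, and the use of NumPy instead of TFP are all assumptions made for illustration.

```python
import numpy as np

def best_choice_log_lik(X, beta, best):
    """Log-likelihood of the best choices under a softmax over the shown items."""
    # Items that were not shown get a logit of -inf, i.e. zero probability.
    logits = np.where(X == 1, beta, -np.inf)
    # Log-softmax along the item axis, stabilised by subtracting the row maximum.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the chosen item in each row and add them up.
    return log_probs[np.arange(X.shape[0]), best].sum()

X_small = np.array([[1, 1, 0, 1],
                    [0, 1, 1, 1]])
beta_small = np.array([0.5, 0.0, -1.0, 1.5])
best_small = np.array([3, 1])
print(best_choice_log_lik(X_small, beta_small, best_small))
```

In TFP the per-response term could plausibly be written with tfd.Categorical(logits=...), which has a deterministic log_prob; that mapping is an assumption here, not something stated in the thread.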

Feb 27, 2023, 12:11:22 PM

to Christopher Suter, TensorFlow Probability

Hi Christopher,

Sorry for the late reply (I was travelling).

The code works, thanks for the help :)

Cheers,

Surbhi Gupta

Feb 27, 2023, 12:14:33 PM

to Christopher Suter, TensorFlow Probability

Hi Christopher,

I just have an additional question. I know it's possible to define multiple Gumbel distributions, but how do I define the covariance between all of them? Do I just pass the covariance matrix to scale?

Cheers,

Surbhi Gupta

Feb 27, 2023, 12:18:14 PM

to Surbhi Gupta, TensorFlow Probability

I don't believe we have a multivariate (correlated) Gumbel in TFP. You could do something like define a batch of them, use tfd.Independent to make a single multivariate distribution, then use tfb.MatvecScaleTriL bijector to mix them together.

Feb 27, 2023, 12:19:21 PM

to Surbhi Gupta, TensorFlow Probability

Sorry, I meant tfb.ScaleMatvecTriL.
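A minimal NumPy sketch of the mixing idea, assuming a hypothetical 3-item lower-triangular matrix: independent standard Gumbel draws are multiplied by the matrix, which is the linear map a lower-triangular matvec bijector applies. In TFP the analogous construction would presumably be a tfd.TransformedDistribution over tfd.Independent(tfd.Gumbel(...), 1) with tfb.ScaleMatvecTriL(scale_tril=...); the NumPy version below is only meant to show the resulting covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical lower-triangular mixing matrix for 3 items.
scale_tril = np.array([[1.0, 0.0, 0.0],
                       [0.5, 1.0, 0.0],
                       [0.2, -0.3, 1.0]])

# Independent standard Gumbel draws, one column per item.
base = rng.gumbel(loc=0.0, scale=1.0, size=(200_000, 3))

# y = L @ x for every draw.
mixed = base @ scale_tril.T

# A standard Gumbel has variance pi**2 / 6, so Cov(y) = (pi**2 / 6) * L @ L.T.
expected_cov = (np.pi ** 2 / 6.0) * scale_tril @ scale_tril.T
empirical_cov = np.cov(mixed, rowvar=False)

print(np.round(expected_cov, 2))
print(np.round(empirical_cov, 2))
```

Note that the mixed coordinates are correlated but, apart from the first, are no longer marginally Gumbel, since a linear combination of Gumbel variables is not itself Gumbel. Passing a covariance matrix directly as scale would not produce this; scale on tfd.Gumbel is an elementwise parameter.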
