dimension error


Surbhi Gupta

Feb 21, 2023, 9:24:47 PM
to TensorFlow Probability
I am getting the error below:
ValueError: Dimensions must be equal, but are 10 and 60000 for '{{node mcmc_sample_chain/dual_averaging_step_size_adaptation___init__/_bootstrap_results/transformed_kernel_bootstrap_results/NoUTurnSampler/.bootstrap_results/process_args/maybe_call_fn_and_grads/value_and_gradients/value_and_gradient/JointDistributionSequential/log_prob/add_4}} = AddV2[T=DT_FLOAT](mcmc_sample_chain/dual_averaging_step_size_adaptation___init__/_bootstrap_results/transformed_kernel_bootstrap_results/NoUTurnSampler/.bootstrap_results/process_args/maybe_call_fn_and_grads/value_and_gradients/value_and_gradient/JointDistributionSequential/log_prob/add_3, mcmc_sample_chain/dual_averaging_step_size_adaptation___init__/_bootstrap_results/transformed_kernel_bootstrap_results/NoUTurnSampler/.bootstrap_results/process_args/maybe_call_fn_and_grads/value_and_gradients/value_and_gradient/JointDistributionSequential/log_prob/Deterministic/log_prob/Log)' with input shapes: [10,10], [10,60000].

for this code:
%%time

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

@tf.function
def utilities(x, betas, errors):
    """
    `x * betas + errors` with broadcasting.
    """
    x = tf.cast(x, dtype=tf.float32)
    util = (tf.transpose(x) * betas) + errors
    return util

k = 10
sigma_beta = 1.
sigma_error = 1.


def alt_pooled_model(X_train):
    return tfd.JointDistributionSequential([
        tfd.HalfCauchy(loc=0., scale=sigma_beta, name="sigma_beta"),
        tfd.HalfCauchy(loc=0., scale=sigma_error, name="sigma_error"),
        tfd.Normal(loc=tf.zeros(k), scale=sigma_beta, name="beta"),
        tfd.Gumbel(loc=0., scale=sigma_error, name="error"),
        lambda error, beta: tfd.Deterministic(
                tf.math.argmax(
                    tfd.Multinomial(
                        total_count=1,
                        logits=utilities(X_train, beta[..., tf.newaxis], error[..., tf.newaxis]),
                    ).sample(), axis=0
                )
            ),
    ])


def target_log_prob(sigma_beta, sigma_error, beta, error):
    return alt_pooled_model(X_train).log_prob(sigma_beta, sigma_error, beta, error, best_choices)

# Use NUTS for inference
hmc = tfp.mcmc.NoUTurnSampler(
    target_log_prob_fn=target_log_prob,
    step_size=.01)

# Unconstrain the scale parameters, which must be positive
hmc = tfp.mcmc.TransformedTransitionKernel(
    inner_kernel=hmc,
    bijector=[
        tfp.bijectors.Identity(),  # sigma_beta
        tfp.bijectors.Identity(),  # sigma_error
        tfp.bijectors.Identity(),  # beta
        tfp.bijectors.Identity(),  # error
    ])

# Adapt the step size for 100 steps before burnin and main sampling
hmc = tfp.mcmc.DualAveragingStepSizeAdaptation(
    inner_kernel=hmc,
    num_adaptation_steps=100,
    target_accept_prob=.75)

# Initialize 10 chains using samples from the prior
joint_sample = alt_pooled_model(X_train).sample(10)
initial_state = [
    joint_sample[0],
    joint_sample[1],
    joint_sample[2],
    joint_sample[3],
#     tf.ones((X_train.shape[0],), dtype=tf.int32) * -1 # initialize with invalid choices
]

# Compile with tf.function and XLA for improved runtime performance
@tf.function(autograph=False, jit_compile=True)
def run():
    return tfp.mcmc.sample_chain(
        num_results=500,
        current_state=initial_state,
        kernel=hmc,
        num_burnin_steps=200,
        trace_fn=lambda _, kr: kr,
#         trace_fn=lambda current_state, kernel_results: kernel_results
    )

samples, traces = run()
print('R-hat diagnostics: ', tfp.mcmc.potential_scale_reduction(samples))


Shape of X_train: (60000, 10)
Shape of best_choices: (60000,)

Can somebody please help me? I am not able to figure this out.

Christopher Suter

Feb 22, 2023, 12:00:42 PM
to Surbhi Gupta, TensorFlow Probability
I can get this to run with the following modifications:

def alt_pooled_model(X_train):
    return tfd.JointDistributionSequential([
        tfd.HalfCauchy(loc=0., scale=sigma_beta, name="sigma_beta"),
        tfd.HalfCauchy(loc=0., scale=sigma_error, name="sigma_error"),
        tfd.Independent(
            tfd.Normal(loc=tf.zeros(k), scale=sigma_beta),
            reinterpreted_batch_ndims=1,
            name="beta"),
        tfd.Gumbel(loc=0., scale=sigma_error, name="error"),
        lambda error, beta: tfd.Independent(
            tfd.Deterministic(
                tf.math.argmax(
                    tfd.Multinomial(
                        total_count=1,
                        logits=utilities(X_train, beta[..., tf.newaxis], error[..., tf.newaxis]),
                    ).sample(), axis=0)),
            reinterpreted_batch_ndims=1),
    ])


I added tfd.Independent(...) around the terms that were erroneously contributing batch shapes, so that their rightmost dimension is treated as part of the event and summed out in log_prob.
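
The shapes in the error message can be reproduced without TFP. The following is a minimal sketch, assuming (as the traceback suggests) that log_prob tries to add a term of shape [10, 10] (10 chains by the 10 un-summed entries of beta) to a term of shape [10, 60000] (10 chains by 60000 responses). `broadcast_shape` and `summed_log_prob_shape` are hypothetical helpers written for this illustration, not TFP functions; the second only mimics what `reinterpreted_batch_ndims=1` does to the shape of a log-prob.

```python
def broadcast_shape(a, b):
    """Return the NumPy/TensorFlow-style broadcast of two shapes, or raise."""
    result = []
    # Walk both shapes from the trailing dimension, padding the shorter with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and 1 not in (da, db):
            raise ValueError(f"Dimensions must be equal, but are {da} and {db}")
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


def summed_log_prob_shape(shape, reinterpreted_batch_ndims=1):
    """Shape left after summing log-probs over the rightmost batch dims."""
    return shape[:len(shape) - reinterpreted_batch_ndims]


num_chains, k, num_responses = 10, 10, 60000
beta_term = (num_chains, k)                # un-summed Normal log-probs
choice_term = (num_chains, num_responses)  # un-summed Deterministic log-probs

try:
    broadcast_shape(beta_term, choice_term)
except ValueError as err:
    print("without Independent:", err)

# Summing out the rightmost dimension leaves one log-prob per chain.
fixed = broadcast_shape(summed_log_prob_shape(beta_term),
                        summed_log_prob_shape(choice_term))
print("with Independent:", fixed)
```

Once both terms reduce to one value per chain, they can be added together with the scalar-per-chain terms from the other distributions.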

But there's more weirdness here:
  - The half-Cauchy distributions are not actually being used for the scale terms of beta and error; those take their scales from the global constants sigma_beta and sigma_error.
  - Most worrisome: there is a sampling procedure in the definition of the final distribution, so the log probs of this model will not be deterministic. I'm not sure exactly what you are aiming to accomplish; if you can say more, maybe I can offer some guidance code-wise. Maybe this is of relevance?
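
On the second point, one standard way to avoid sampling inside the likelihood is the Gumbel-max identity: if each item's utility is a fixed value plus independent standard Gumbel noise, the probability that item i ends up with the largest utility equals softmax(utilities)[i]. The choice can then be scored in closed form (for example, by a categorical distribution over logits) instead of being simulated. The sketch below checks the identity numerically in plain Python. The utility values are made up for illustration, and this is not necessarily the formulation intended in this thread.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_gumbel(rng):
    """Standard Gumbel draw, computed as -log(E) with E ~ Exponential(1)."""
    return -math.log(rng.expovariate(1.0))


def simulate_choice_frequencies(utilities, num_draws, seed=0):
    """Fraction of draws in which each item has the largest noisy utility."""
    rng = random.Random(seed)
    counts = [0] * len(utilities)
    for _ in range(num_draws):
        noisy = [u + sample_gumbel(rng) for u in utilities]
        counts[noisy.index(max(noisy))] += 1
    return [c / num_draws for c in counts]


utilities = [0.5, -1.0, 2.0, 0.0]  # made-up deterministic utilities
exact = softmax(utilities)
simulated = simulate_choice_frequencies(utilities, num_draws=100_000)
for item, (p, f) in enumerate(zip(exact, simulated)):
    print(f"item {item}: softmax={p:.3f}  simulated={f:.3f}")
```

With a Gumbel scale s other than 1, the same identity holds with softmax(utilities / s).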


Surbhi Gupta

Feb 22, 2023, 12:27:00 PM
to Christopher Suter, TensorFlow Probability
Thanks, Christopher.

I am trying to build a utilities model for a MaxDiff survey.
X_train has shape (60000, 10), i.e. (total responses, total number of items). The total number of items (k) is 10. The total number of responses is the number of respondents times the number of question sets per respondent (5000 * 12 = 60000). Each question set shows only 5 items, and respondents select the best and worst items. X_train is a binary matrix in which 1 means the item is present in the question set and 0 means it is absent.
best_choices has shape (60000,) and holds the respondents' best choices for the questions.

I am using the utility function X_train * betas + error.
Each item should have a beta.
The utility function should give a utility for each of the 10 items.

Does this help? 😬

Thanks,
Surbhi Gupta

Surbhi Gupta

Feb 22, 2023, 12:27:53 PM
to Christopher Suter, TensorFlow Probability
The best choice should be the item with the maximum utility.
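
To make the "best choice is the item with the maximum utility" rule concrete, here is a minimal plain-Python sketch of the log-probability of one response. It assumes each shown item's utility is its beta plus independent standard Gumbel noise, and that items absent from the question set cannot be chosen; both are assumptions of this sketch rather than details confirmed in the thread, and `best_choice_log_prob` and the numbers are hypothetical.

```python
import math


def best_choice_log_prob(shown, betas, best_item):
    """Log-probability that `best_item` has the top utility among shown items.

    Assumes utility = beta + standard Gumbel noise for each shown item, and
    that items with shown == 0 cannot be chosen (an assumption of this sketch).
    """
    if not shown[best_item]:
        return -math.inf
    shown_betas = [b for s, b in zip(shown, betas) if s]
    # log-sum-exp over the shown items only, shifted by the max for stability.
    top = max(shown_betas)
    log_norm = top + math.log(sum(math.exp(b - top) for b in shown_betas))
    return betas[best_item] - log_norm


# One made-up response: 10 items, 5 of them shown, item 3 picked as best.
betas = [0.2, -0.5, 1.0, 0.7, 0.0, -1.2, 0.3, 0.9, -0.1, 0.4]
shown = [1, 0, 0, 1, 1, 0, 0, 1, 0, 1]
log_p = best_choice_log_prob(shown, betas, best_item=3)
print(f"log P(best = 3) = {log_p:.4f}")
```

Summing this quantity over all 60000 rows of X_train and best_choices would give a deterministic log-likelihood for the betas.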

Cheers,
Surbhi Gupta

Surbhi Gupta

Feb 22, 2023, 12:37:30 PM
to TensorFlow Probability, Surbhi Gupta, TensorFlow Probability, c...@google.com
Do you think the Multinomial is not required?

Christopher Suter

Feb 23, 2023, 12:56:44 PM
to Surbhi Gupta, TensorFlow Probability
Hi Surbhi, I wrote down what I think you are trying to do: https://colab.research.google.com/drive/1CdFx-CeYUiSJmsA_05n9A0LDMas-x6qq

Hope this helps! Happy to clarify anything you like.

Surbhi Gupta

Feb 27, 2023, 12:11:22 PM
to Christopher Suter, TensorFlow Probability
Hi Christopher,

Sorry for the late reply (I was travelling).

The code works, thanks for the help :)

Cheers,
Surbhi Gupta

Surbhi Gupta

Feb 27, 2023, 12:14:33 PM
to Christopher Suter, TensorFlow Probability
Hi Christopher,

I just have an additional question. I know it's possible to define multiple Gumbel distributions, but how do I define the covariance between all of them? Do I just pass the covariance matrix to scale?

Cheers,
Surbhi Gupta

Christopher Suter

Feb 27, 2023, 12:18:14 PM
to Surbhi Gupta, TensorFlow Probability
I don't believe we have a multivariate (correlated) Gumbel in TFP. You could do something like this: define a batch of independent Gumbels, use tfd.Independent to turn the batch into a single multivariate distribution, then use the tfb.ScaleMatvecTriL bijector to mix the components together.
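
A rough sketch of what that mixing does, written in plain Python rather than TFP: draw a vector of independent standard Gumbel values and multiply it by a lower-triangular matrix L, which is the role the TriL matvec bijector would play. Because a standard Gumbel has variance π²/6, the mixed vector has covariance (π²/6)·L·Lᵀ. Note that its components are linear combinations of Gumbels, so they are no longer exactly Gumbel-distributed. The helper names and the matrix values are made up for illustration.

```python
import math
import random


def sample_gumbel(rng):
    """Standard Gumbel draw, computed as -log(E) with E ~ Exponential(1)."""
    return -math.log(rng.expovariate(1.0))


def mix(scale_tril, noise):
    """Lower-triangular matrix-vector product L @ noise."""
    return [sum(row[j] * noise[j] for j in range(i + 1))
            for i, row in enumerate(scale_tril)]


def sample_mixed_gumbels(scale_tril, num_draws, seed=0):
    """Draw independent Gumbel vectors and mix each one with scale_tril."""
    rng = random.Random(seed)
    dim = len(scale_tril)
    return [mix(scale_tril, [sample_gumbel(rng) for _ in range(dim)])
            for _ in range(num_draws)]


def covariance(draws, i, j):
    """Unbiased sample covariance between components i and j."""
    n = len(draws)
    mean_i = sum(d[i] for d in draws) / n
    mean_j = sum(d[j] for d in draws) / n
    return sum((d[i] - mean_i) * (d[j] - mean_j) for d in draws) / (n - 1)


scale_tril = [[1.0, 0.0],
              [0.6, 0.8]]
draws = sample_mixed_gumbels(scale_tril, num_draws=50_000)
gumbel_var = math.pi ** 2 / 6
# L @ L^T for this matrix is [[1, 0.6], [0.6, 1]].
print("empirical cov[0][1]:", round(covariance(draws, 0, 1), 3))
print("target    cov[0][1]:", round(0.6 * gumbel_var, 3))
```

To hit a desired covariance matrix C, L would be the Cholesky factor of C divided by π²/6, not C itself passed as a scale.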

Christopher Suter

Feb 27, 2023, 12:19:21 PM
to Surbhi Gupta, TensorFlow Probability