Hi all.
I am trying to build a probabilistic output layer for real-valued targets so that I can use the log probability (i.e. the log likelihood) as the loss. I chose a Beta distribution for the flexibility its two parameters offer, but its support is [0, 1], so the results have to be mapped to R somehow. I am trying to do that with a bijector (the inverse of a sigmoid), although I am not sure whether this is the best approach.
The code that I have implemented is the following:
# Two non-negative outputs for the Beta concentration parameters
b = k.layers.Dense(2, activation=k.activations.relu)(x)
x = tfp.layers.DistributionLambda(
    # shift by 1e-3 so both concentrations stay strictly positive
    lambda t: tfp.distributions.Beta(1e-03 + t[..., :1], 1e-03 + t[..., 1:])
)(b)
out.append(
    tfp.layers.DistributionLambda(
        # map the [0, 1] support onto R with the inverse-sigmoid (logit) bijector
        lambda t: tfp.distributions.TransformedDistribution(
            t, tfp.bijectors.Invert(tfp.bijectors.Sigmoid())
        ),
        name=f"out_{idx}",
    )(x)
)
losses.append(lambda y_true, y_pred: -y_pred.log_prob(y_true))
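For what it's worth, here is the density I believe TransformedDistribution ends up computing with the Invert(Sigmoid) bijector: if Y = logit(X) with X ~ Beta(a, b), then log p_Y(y) = log p_X(sigmoid(y)) + log sigmoid'(y). A minimal pure-Python sketch of that change of variables (the helper names are mine, not part of the model code):

```python
import math

def beta_log_pdf(x, a, b):
    # log density of Beta(a, b) at x in (0, 1)
    return ((a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def transformed_log_pdf(y, a, b):
    # change of variables for Y = logit(X), X ~ Beta(a, b):
    # log p_Y(y) = log p_X(sigmoid(y)) + log(sigmoid(y) * (1 - sigmoid(y)))
    x = sigmoid(y)
    return beta_log_pdf(x, a, b) + math.log(x * (1.0 - x))

# at y = 0, sigmoid(y) = 0.5; Beta(2, 2) has density 1.5 there,
# and the Jacobian term is log(0.25)
print(transformed_log_pdf(0.0, 2.0, 2.0))  # log(1.5) + log(0.25)
```

If this is indeed what log_prob returns for the transformed distribution, then at least the density itself is well defined over all of R.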
The problem is that the output is not always as expected. For instance, when I compare this model against one with an IndependentNormal layer and against a plain MSE regressor, there are cases in which the Beta model reaches some accuracy while the other two simply do not learn (there is no correlation between inputs and outputs).
Moreover, I see this warning when the Beta model gets trained: WARNING:tensorflow:@custom_gradient grad_fn has 'variables' in signature, but no ResourceVariables were used on the forward pass.
I may be missing something, as I am not really used to working with tfp, so any kind of help, hint, or suggestion is appreciated.
Many thanks in advance.
Borja