HMM and JointDistributions


Julio Aguilar

Jan 15, 2014, 2:27:40 PM
to accor...@googlegroups.com
Hi there Cesar and everyone who sees this,

I'm trying to use HMM with some sort of multivariate discrete distribution (i.e. JointDistribution) for gesture recognition.

I would like to use a feature vector of 4 different features, where each feature has a different set of possible values.
Example: 
feature1 could have values like {0, 1, 2}
feature2 could have values like {0, 1, 2, 3}
feature3 could have values like {0, 1}
feature4 could have values like {100.23}

(please correct me if I could use something different)
That's why I thought of using the JointDistribution, because I can have something like this (from the framework documentation):
    // Let's create a joint distribution for two discrete variables:
    // the first of which can assume 3 distinct symbol values: 0, 1, 2
    // the second of which can assume 5 distinct symbol values: 0, 1, 2, 3, 4

    int[] symbols = { 3, 5 }; // specify the symbol counts

    // Create the joint distribution for the above variables
    JointDistribution joint = new JointDistribution(symbols);

    // Now, suppose we would like to fit the distribution (estimate
    // its parameters) from the following multivariate observations:
    double[][] observations =
    {
        new double[] { 0, 0 },
        new double[] { 1, 1 },
        new double[] { 2, 1 },
        new double[] { 0, 0 },
    };

    // Estimate the parameters
    joint.Fit(observations);

So I wrote some test code with a feature vector of 2 features, each with a different number of symbols:
double[][] observations =
{
    new double[] { 0, 0 },
    new double[] { 1, 0 },
    new double[] { 1, 1 },
    new double[] { 0, 1 },
};


ITopology topology = new Forward(states: 3, deepness: 2, random: false);
// random variable 1 = {0, 1, 2} --> zero, negative, positive
// random variable 2 = {0, 1}    --> opened, closed
int[] symbols = { 3, 2 };
JointDistribution emissions = new JointDistribution(symbols);

// estimate the parameters (prob) based on observations
emissions.Fit(observations);

m_HMM = new HiddenMarkovModel<JointDistribution>(topology, emissions);

// Create a Baum-Welch learning algorithm to teach it
BaumWelchLearning<JointDistribution> teacher = new BaumWelchLearning<JointDistribution>(m_HMM);

// and call its Run method to start teaching
double error = teacher.Run(observations);

But I get this error: "This model expects univariate observations. Parameter name: observations"

Question 1: Why is that? I thought I had specified it pretty well with the JointDistribution.

Question 2: At first I thought of using MultinomialDistribution, but I don't know how to set the probabilities, since they are unknown (or at least they are for any "hidden" MM, I think). Or could I just use dummy initial probabilities and then use the Fit method to get a better estimate?

(sort of unrelated) Question 3: I have real data. Is it a good or a bad idea to discretize it? Also, that real data is 3D. Should I translate the 3D data to 2D?

If anyone has a suggestion of another (better) approach regarding the distributions, please let me know.

Also, if any of what I said makes no sense to you, please let me know :-)

Best Regards, 
Julio

Julio Aguilar

Jan 16, 2014, 4:40:07 AM
to accor...@googlegroups.com
Hi again,

regarding question 2.

I don't think a multinomial distribution is a possible solution, because it uses the Bernoulli distribution, which is just a yes/no type of output. In that case, every feature of my vector would have to have only two possible output values, but the features of my vector have more than 2 possible values.

Again, if any of what I said makes no sense to you, please let me know :-)

Any help regarding the JointDistribution would be much appreciated.
Thanks,

Best Regards,
Julio

César

Jan 16, 2014, 5:14:27 AM
to accor...@googlegroups.com
Hi Julio!

Regarding your question, I suppose the following code would help you:

    double[][] sampleOfGesture =
    {
        new double[] { 0, 0 },
        new double[] { 1, 0 },
        new double[] { 1, 1 },
        new double[] { 0, 1 },
    };

    double[][] anotherSampleOfTheSameGesture =
    {
        new double[] { 0, 0 },
        new double[] { 1, 0 },
        new double[] { 1, 1 },
        new double[] { 0, 1 },
    };

    double[][] yetAnotherSampleOfTheSameGesture =
    {
        new double[] { 0, 0 },
        new double[] { 1, 0 },
        new double[] { 1, 1 },
        new double[] { 0, 1 },
    };

    double[][][] allGestureSamples =
    {
        sampleOfGesture, anotherSampleOfTheSameGesture, yetAnotherSampleOfTheSameGesture
    };

    ITopology topology = new Forward(states: 3, deepness: 2, random: false);

    // random variable 1 = {0, 1, 2} --> zero, negative, positive
    // random variable 2 = {0, 1}    --> opened, closed
    int[] symbols = { 3, 2 };
    JointDistribution emissions = new JointDistribution(symbols);

    var target = new HiddenMarkovModel<JointDistribution>(topology, emissions);

    // Create a Baum-Welch learning algorithm to teach it
    var teacher = new BaumWelchLearning<JointDistribution>(target);

    // and call its Run method to start teaching
    double error = teacher.Run(allGestureSamples);

In the example, as you can see, the multivariate model expects the training data to be given as several sample sequences of the same gesture. This is often the case in gesture recognition, where we can give several samples of the same gesture as different training points so the training algorithm can attempt to "generalize" over them.
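
Once the model has been trained this way, you can score a new candidate sequence against it. A minimal sketch (assuming Evaluate returns the log-likelihood of a sequence under the model; the candidate values below are made up just for illustration):

    // Score a new observation sequence against the trained model
    double[][] candidate =
    {
        new double[] { 0, 0 },
        new double[] { 1, 0 },
        new double[] { 1, 1 },
        new double[] { 0, 1 },
    };

    // A higher (less negative) log-likelihood means a better fit to this gesture
    double logLikelihood = target.Evaluate(candidate);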

By the way, I did something quite similar to what you are trying to do, and I would suggest using a different distribution. Instead of a JointDistribution, which in the framework is available only for discrete variables, I would suggest using an Independent distribution with different components. For example, in this way you could use:

    Independent emissions = new Independent(
        new GeneralDiscreteDistribution(0, 3),
        new GeneralDiscreteDistribution(0, 4),
        new GeneralDiscreteDistribution(0, 2),
        new NormalDistribution(0, 1));
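
In case it helps, here is a rough sketch of how those emissions could be plugged into the same kind of model as in the example above (reusing the Forward topology and the double[][][] training samples; treat it as a sketch rather than a final setup):

    // Sketch: an HMM with Independent emissions, trained just like before
    ITopology topology = new Forward(states: 3, deepness: 2, random: false);

    var model = new HiddenMarkovModel<Independent>(topology, emissions);
    var teacher = new BaumWelchLearning<Independent>(model);

    // allGestureSamples is the double[][][] array of training sequences
    double error = teacher.Run(allGestureSamples);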

If you would like more information, a similar approach using this kind of mixed model (with both discrete and continuous data) is described in the paper "Sign Language Recognition with Support Vector Machines and Hidden Conditional Random Fields: Going from Fingerspelling to Natural Articulated Words", Machine Learning and Data Mining in Pattern Recognition, Lecture Notes in Computer Science Volume 7988, 2013, pp 84-98. Perhaps you will find it interesting for your research :-)

Hope it helps!

Best regards,
Cesar

Julio Aguilar

Feb 6, 2014, 5:25:24 AM
to accor...@googlegroups.com
Hi there Cesar,

First of all, thank you very much for your help. It was really useful.

Now, I've got another problem.

The teacher.Run method is returning -infinity, and whatever I do to test my HMM, I get a zero probability, even though the input sequences are pretty good (almost exactly the same as the ones used to train it).

I really don't know where to look. My code is like the one you suggested, but using three GeneralDiscreteDistribution(0, 3) components. I already had one model working with only one GeneralDiscreteDistribution(0, 3) and one GeneralDiscreteDistribution(0, 2).

Before (working):

Independent emissions = new Independent(
    new GeneralDiscreteDistribution(0, 3),
    new GeneralDiscreteDistribution(0, 2));

Now (not working):

Independent emissions = new Independent(
    new GeneralDiscreteDistribution(0, 3),
    new GeneralDiscreteDistribution(0, 3),
    new GeneralDiscreteDistribution(0, 3));



Could it be that the data is not the right kind of data? In this model (with 3 features), one of the features is always zero, one switches between two values, and one between three.

The number of states I specify doesn't matter. I even changed new GeneralDiscreteDistribution(0, 3) to new GeneralDiscreteDistribution(0, 1) for the feature that is always zero to see if something changes, but I always get the same mean and variance and the -infinity value from the teacher.

Well, I hope you can help me.

Best regards,
Julio


César

Feb 6, 2014, 5:45:15 AM
to accor...@googlegroups.com
Hi Julio,

That is interesting; can I have an example of your data so I can reproduce the issue and investigate what might be going on?

Best regards,
Cesar

Julio Aguilar

Feb 6, 2014, 7:33:22 AM
to accor...@googlegroups.com
Hi Cesar,

Sure no problem. I just sent you the data.

The observations are, as in my last post, a 3D double array:

double[][][] allObservations = { firstObs, ..., lastObs };

where each observation corresponds to a file in the zip.

And as I said, I'm using the following emissions:

Independent emissions = new Independent(
    new GeneralDiscreteDistribution(0, 3),
    new GeneralDiscreteDistribution(0, 3),
    new GeneralDiscreteDistribution(0, 3));

BTW: I always get the same mean and variance before and after training.
Note: I'm not initializing the distributions (i.e. setting the mean and variance myself). I assumed that is done by the hidden Markov methods. Right?

Thank you and best regards,
Julio

CurrentFeatureVectors.zip

César

Feb 6, 2014, 10:44:31 AM
to accor...@googlegroups.com
Hi Julio,

I think I spotted one possible problem by checking the samples you attached. The GeneralDiscreteDistribution was originally conceived to work only with symbols, so the distribution function expects vectors containing values ranging from 0 up to the number of symbols passed (3). From what I tested, it is possible to get your example working by adding +1 to every value in your observation vectors, so that the vectors contain only elements ranging from 0 to 3.

It will also be necessary to set the GeneralDiscreteDistributions as

    Independent emissions = new Independent(
        new GeneralDiscreteDistribution(0, 4),
        new GeneralDiscreteDistribution(0, 4),
        new GeneralDiscreteDistribution(0, 4));
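
In practice, the +1 shift can be applied in place over your double[][][] training data before calling the teacher. A quick sketch (allObservations here is just the array from your earlier post):

    // Shift every symbol value by +1 so that all values are non-negative
    foreach (double[][] sequence in allObservations)
        foreach (double[] frame in sequence)
            for (int k = 0; k < frame.Length; k++)
                frame[k] += 1;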

Now, the point is that I had updated the GeneralDiscreteDistribution to support an offset value, so it wouldn't be necessary to manually shift the observations. However, I forgot to enable this support in the fitting function, which will be corrected in the next release. Please note, however, that the code should already work in the current version once the negative values are shifted into positive ones.