I need to write a function that takes a row vector as input, and counts the number of repetitions of each value in the vector, returning a structure with each value and the corresponding number of repetitions.
For example:
A = [1 1 1 2 2 2 2 3 3 3 3 3];
B = my_function(A)
val = [1 2 3]
rep = [3 4 5]
I am stuck on how to count the repetitions of each value. Advice please?
NOTE: This is for a piece of coursework I have to complete. Therefore, I do NOT want a code solution, I just want some advice as to how to tackle this problem.
Thanks in advance.
some hints...
lookfor hist
help unique
--
Thanks for the advice, though I didn't explain the problem very well the first time around!
The idea is to start with a row vector, such as:
X = [1 1 2 2 2 1 1 4 4 4 4]
And to produce an encoded version, represented as a structure such as:
enc =
val: [1 2 1 4]
rep: [2 3 2 4]
..so that the original row vector could be reconstructed at a later date.
For the above vector, the 'unique' function will group both sets of 1s together so that using:
enc.val = unique(X);
enc.rep = histc(X,enc.val);
will give me:
enc =
val: [1 2 4]
rep: [4 3 4]
Obviously, this is no good for reconstructing the original vector and I am stuck trying to think of a way around this as I am only just starting to learn Matlab.
Any other advice would be greatly appreciated.
Thanks
There are several ways to do this.
If your vector is always integers, then accumarray
will work. If not, or you need a tolerance, then
my own consolidator will work. Find it on the
file exchange.
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=8354&objectType=file
You might also consider run length encoding,
look for it on the file exchange too.
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6436&objectType=file
HTH,
john
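For the integer case mentioned above, here is a minimal sketch of how accumarray could count totals. It assumes every value in A is a positive integer, since accumarray uses the values as subscripts; the variable name counts is only illustrative. Note that this counts every occurrence of a value across the whole vector, so it does not keep separate runs of the same value apart.

```matlab
A = [1 1 1 2 2 2 2 3 3 3 3 3];

% Use each value of A as a subscript and add 1 at that position.
% counts(k) ends up holding the number of times k appears in A.
counts = accumarray(A(:), 1).';

% Keep only the values that actually occur.
val = find(counts);   % [1 2 3]
rep = counts(val);    % [3 4 5]
```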
as quoted from my previous reply:
rep=diff(find(diff([-Inf A Inf])))
val=A(cumsum(rep))
Bruno
Only problem is, that I need to understand how it works and write it myself, otherwise it's plagiarism.
I understand how the val=A(cumsum(rep)) line works - it just indexes A at the last value in each repetition set, effectively forming a list of single values.
Can't quite figure out the rep=diff(find(diff([-Inf A Inf]))) line at the moment, but with a bit of thought I'm sure I can get my head around it. I'll post again if not.
Thanks again.
diff([-Inf A Inf]) is zero wherever an element equals the one before it, so its nonzero entries mark the points where the value changes. The -Inf and Inf padding forces a nonzero entry at the very start and just past the end.
find(diff([-Inf A Inf])) returns the indices of those nonzero entries, which are the indices in A where each new set of repeated elements starts, plus one final index just past the end of A. The outer diff takes the difference between consecutive indices, which is a count of the number of elements in each set of repetitions.
The A(cumsum(rep)) section just indexes the last element of each set of repeated elements in A.
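To check that reading, the one-liner can be split into its intermediate steps using the example vector X from earlier in the thread. This is only a sketch for tracing the values; the names padded, d and starts are made up for illustration and are not part of the original two lines.

```matlab
X = [1 1 2 2 2 1 1 4 4 4 4];

padded = [-Inf X Inf];   % sentinels that differ from any finite value
d      = diff(padded);   % [Inf 0 1 0 0 -1 0 3 0 0 0 Inf]
starts = find(d);        % [1 3 6 8 12]: start of each run, plus numel(X)+1
rep    = diff(starts);   % [2 3 2 4]:   length of each run
val    = X(cumsum(rep)); % X([2 5 7 11]) = [1 2 1 4]: last element of each run
```

The result matches the val and rep fields of the encoded structure given in the question.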
Cheers.