Hello,
I'd do that with the functional API. You build your network as usual and then attach several output layers, like so:
from keras.layers import Input, Dense
from keras.models import Model

# N is the width of one input sample: if you have 50000 data points and
# each one is a vector of 3 elements, then N is 3
inputs = Input(shape=(N,))

# your network -- say, 2 hidden layers of 64 nodes each (whether that's
# enough depends on your problem)
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)

# one output layer per group, K in total. Each output has M elements, and
# 'softmax' turns them into a distribution over the M classes; after
# sufficient training this usually pushes one element close to 1 and the
# others close to 0
outputs = [Dense(M, activation='softmax')(x) for _ in range(K)]

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',  # applied to every output
              metrics=['accuracy'])
# fit expects one target array per output, in the same order
model.fit(inputData, [outputData1, outputData2, ..., outputDataK],
          epochs=10, batch_size=64)
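Since fit takes one target array per output, you also need to build that list of K one-hot arrays from your raw labels. A minimal NumPy sketch (the names `labels`, `targets` and the sizes K=4, M=5 are just assumptions for illustration; Keras also ships `keras.utils.to_categorical` for the same job):

```python
import numpy as np

K, M = 4, 5              # assumed: 4 groups, 5 classes per group
num_samples = 6
rng = np.random.default_rng(0)

# assumed raw labels: one integer class (0..M-1) per group, per sample
labels = rng.integers(0, M, size=(num_samples, K))

# one-hot encode each group's column into its own (num_samples, M) array;
# this list lines up with the model's K outputs
targets = [np.eye(M)[labels[:, k]] for k in range(K)]

print(len(targets), targets[0].shape)  # 4 (6, 5)
```

The resulting `targets` list is exactly what you'd pass as the second argument to model.fit.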