non-standard transition params for linear chain crf

adamc...@gmail.com

Sep 27, 2017, 4:02:39 AM
to Factorie

Hello,

We're using Factorie to create a CRF for query understanding, to both segment
words into search phrases and label them in various categories. So we have
something like the following:

sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label
//etc..

sealed trait Position
case object Begin extends Position
case object Internal extends Position


Each hidden state of the CRF is a (Label, Position) pair. The Label is the
category of a phrase, and the Position marks phrase boundaries: a phrase
boundary always precedes a Begin position and never an Internal one.

In Factorie we model this as follows, pretty much just following the
linear-chain CRF examples in the tutorials:

class Query extends Chain[Query, Token]

class Token(val word: String, label: Label, pos: Position)
  extends FeatureVectorVariable[String] with ChainLink[Token, Query] {
  // boilerplate like the example CRF code
}

object CRFLabelDomain extends CategoricalDomain[(Label, Position)]

class CRFLabel(label: Label, pos: Position, val token: Token)
  extends LabeledCategoricalVariable((label, pos)) {
  // boilerplate...
}

class FactorieCRF(val tokenDomain: CategoricalVectorDomain[String])
  extends TemplateModel with Parameters {

  // Label-to-label factor: one weight per (previous state, current state) pair.
  object transition extends DotTemplateWithStatistics2[CRFLabel, CRFLabel] {
    val weights = Weights(new la.DenseTensor2(CRFLabelDomain.size,
      CRFLabelDomain.size))
    def unroll1(label: CRFLabel): Iterable[Factor] =
      if (label.hasPrev) Factor(label.prev, label) else Nil
    def unroll2(label: CRFLabel): Iterable[Factor] =
      if (label.hasNext) Factor(label, label.next) else Nil
  }

  // Label-to-token factor: one weight per (state, token feature) pair.
  object evidence extends DotTemplateWithStatistics2[CRFLabel, Token] {
    val weights = Weights(new la.DenseTensor2(CRFLabelDomain.size,
      tokenDomain.dimensionSize))
    def unroll1(label: CRFLabel): Iterable[Factor] = Factor(label, label.token)
    def unroll2(token: Token): Iterable[Factor] =
      throw new Error("Token values shouldn't change")
  }

  this += evidence
  this += transition
}

There are two ways I want to amend this model.

1) Rule out illegal state transitions. For example, a transition from
(Name, Internal) to (Skill, Internal) is illegal: a transition to any
Internal state can only come from a previous state with an identical Label.
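
To pin down the rule itself, independent of Factorie, here is a minimal plain-Scala sketch. The `TransitionRules` object, the `State` alias and the `isLegal` helper are names made up for illustration and are not part of Factorie; the Label cases are just the four listed above.

```scala
sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label

sealed trait Position
case object Begin extends Position
case object Internal extends Position

object TransitionRules {
  // A hidden state is a (Label, Position) pair, as in the model above.
  type State = (Label, Position)

  // Hypothetical helper: true when `next` may directly follow `prev`.
  def isLegal(prev: State, next: State): Boolean = next match {
    // A new phrase may begin after any state.
    case (_, Begin) => true
    // Continuing a phrase requires the previous state to carry the same Label.
    case (label, Internal) => prev._1 == label
  }
}
```

With this, `TransitionRules.isLegal((Name, Internal), (Skill, Internal))` is false, while any transition into a Begin state is allowed.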

I tried setting the illegal transition weights to -∞, but that caused
learning to crash. It seems like those transitions should not be parameters
at all, so I think I should be able to do something like the following (with
unroll2 elided and label.prev expressed as an Option for clarity):

object transition extends Template2[CRFLabel, CRFLabel] with ??? {
  val weights = // some appropriately sized tensor

  def unroll1(label: CRFLabel): Iterable[Factor] = (label.prev, label) match {
    // first token in the query: no transition factor
    case (None, _) => Nil
    // any state may be followed by the start of a new phrase
    case (Some(fst), snd@(_, Begin)) => Factor(weights(fst, snd))
    // an Internal state must continue a phrase with the same Label
    case (Some(fst@(l1, _)), snd@(l2, Internal)) =>
      if (l1 == l2) Factor(weights(fst, snd))
      else Factor(Double.NegativeInfinity)
  }
}

Is this possible? I'm not sure how to correctly express this with Factorie.
This definitely doesn't seem like the right use of Family2.Factor. Instead
it looks like I should provide a score definition directly somehow, but
something about how to proceed here eludes me.

2) Because of limitations in our training data, we can't estimate
transitions between phrases within a query; we can only estimate approximate
distributions of likely segment lengths. So it seems reasonable to have
parameters only for the (Name, Begin) -> (Name, Internal) and
(Name, Internal) -> (Name, Internal) transitions (and likewise for each
other Label), and to replace the second case in my pseudo-Scala above with:

   case (Some(fst), snd@(_, Begin)) => 0.0


and have a transition parameter tensor with only 2*|Labels| weights.
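
To make the 2*|Labels| layout concrete, here is a plain-Scala sketch of the scoring I have in mind, with no Factorie calls. The `ReducedTransition` object, `weightIndex`, `score` and the slot ordering are all assumptions made up for illustration. The -∞ score for illegal transitions mirrors my pseudocode above, and whether Factorie's trainers can handle that is exactly what I'm unsure about.

```scala
sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label

sealed trait Position
case object Begin extends Position
case object Internal extends Position

object ReducedTransition {
  type State = (Label, Position)

  // Fixed label ordering used to index the weight vector (illustrative).
  val labels: Vector[Label] = Vector(Name, Company, Skill, Location)

  // Assumed slot layout, for the label at position i in `labels`:
  //   slot 2*i     : (label, Begin)    -> (label, Internal)
  //   slot 2*i + 1 : (label, Internal) -> (label, Internal)
  // Every other transition has no parameter, so the result is None.
  def weightIndex(prev: State, next: State): Option[Int] = (prev, next) match {
    case ((l1, pos), (l2, Internal)) if l1 == l2 =>
      val offset = pos match {
        case Begin    => 0
        case Internal => 1
      }
      Some(2 * labels.indexOf(l1) + offset)
    case _ => None
  }

  // Transition score: 0.0 into any Begin state, a learned weight for a legal
  // continuation, and -infinity for an Internal state with a different Label.
  def score(weights: Array[Double], prev: State, next: State): Double = next match {
    case (_, Begin) => 0.0
    case (_, Internal) =>
      weightIndex(prev, next).map(i => weights(i)).getOrElse(Double.NegativeInfinity)
  }
}
```

Here `weights` would be an array of length `2 * ReducedTransition.labels.size`, so only the within-phrase transitions carry learned parameters.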

Any guidance as to how I can do this?

Thank you!
adam
