non-standard transition params for linear chain crf

adamc...@gmail.com

Sep 27, 2017, 4:02:39 AM

to Factorie

Hello,

We're using Factorie to create a CRF for query understanding, to both segment
words into search phrases and label them in various categories. So we have
something like the following:

sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label
//etc..

sealed trait Position
case object Begin extends Position
case object Internal extends Position

Each hidden state of the CRF is a (Label, Position) pair. The Label is the
category of a phrase, and the Position marks phrase boundaries: a phrase
boundary always precedes a Begin position and never an Internal one.
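
To make the encoding concrete, here is a minimal sketch in plain Scala (no Factorie) of how a tagged query decodes into phrases. The `SegmentSketch` object, the `segments` helper and the four-label set are assumptions for illustration; the real label set is longer.

```scala
sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label

sealed trait Position
case object Begin extends Position
case object Internal extends Position

object SegmentSketch {
  type State = (Label, Position)

  // Group a tagged query into (Label, words) phrases. A Begin state opens a
  // new phrase; an Internal state extends the phrase that is currently open.
  // Label consistency of Internal states is NOT checked here.
  def segments(tagged: Seq[(String, State)]): List[(Label, List[String])] =
    tagged.foldLeft(List.empty[(Label, List[String])]) {
      case (acc, (word, (label, Begin))) =>
        (label, List(word)) :: acc
      case ((label, words) :: rest, (word, (_, Internal))) =>
        (label, word :: words) :: rest
      case (Nil, (word, (label, Internal))) =>
        // Malformed input (Internal with no open phrase): treat it as a Begin.
        (label, List(word)) :: Nil
    }.reverse.map { case (label, words) => (label, words.reverse) }
}
```

Each phrase starts at exactly one Begin state, so the number of phrases equals the number of Begin states in the sequence.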

In Factorie we model this as follows, closely following the linear-chain CRF
examples in the tutorials:

class Query extends Chain[Query,Token]

class Token(val word: String, label: Label, pos: Position)
  extends FeatureVectorVariable[String] with ChainLink[Token,Query] {
  //boilerplate like the example crf code
}

object CRFLabelDomain extends CategoricalDomain[(Label, Position)]
class CRFLabel(label: Label, pos: Position, val token: Token)
  extends LabeledCategoricalVariable((label, pos)) {
  //boilerplate...
}

class FactorieCRF(val tokenDomain: CategoricalVectorDomain[String])
  extends TemplateModel with Parameters {

  object transition extends DotTemplateWithStatistics2[CRFLabel, CRFLabel] {
    val weights = Weights(new la.DenseTensor2(CRFLabelDomain.size,
      CRFLabelDomain.size))
    def unroll1(label: CRFLabel): Iterable[Factor] =
      if (label.hasPrev) Factor(label.prev, label) else Nil
    def unroll2(label: CRFLabel): Iterable[Factor] =
      if (label.hasNext) Factor(label, label.next) else Nil
  }
  object evidence extends DotTemplateWithStatistics2[CRFLabel, Token] {
    val weights = Weights(new la.DenseTensor2(CRFLabelDomain.size,
      tokenDomain.dimensionSize))
    def unroll1(label: CRFLabel): Iterable[Factor] = Factor(label, label.token)
    def unroll2(token: Token): Iterable[Factor] =
      throw new Error("Token values shouldn't change")
  }
  this += evidence
  this += transition
}

There are two ways I want to amend this model.

1) Rule out illegal state transitions. For example, a transition from
(Name, Internal) to (Skill, Internal) is illegal: a transition to any Internal
state can only come from a previous state with an identical Label.
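
The legality rule can be written as a plain Scala predicate, independent of Factorie. This is only a sketch: the `TransitionRules` object and its members are hypothetical names, and the four labels stand in for the full label set.

```scala
sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label

sealed trait Position
case object Begin extends Position
case object Internal extends Position

object TransitionRules {
  type State = (Label, Position)

  // A Begin state may follow any state; an Internal state must continue a
  // phrase that carries the same Label.
  def isLegal(prev: State, next: State): Boolean = next match {
    case (_, Begin)        => true
    case (label, Internal) => prev._1 == label
  }

  // Illustrative subset of the label set.
  val labels: Seq[Label] = Seq(Name, Company, Skill, Location)
  val positions: Seq[Position] = Seq(Begin, Internal)
  val states: Seq[State] = for (l <- labels; p <- positions) yield (l, p)

  // Every (prev, next) pair that the rule allows.
  val legalPairs: Seq[(State, State)] =
    for (a <- states; b <- states if isLegal(a, b)) yield (a, b)
}
```

With 4 labels there are 8 states and 64 possible transitions, of which the rule allows 40: 32 into Begin states and 8 into Internal states.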

I tried setting the illegal transition weights to -∞, but that caused learning to
crash. It seems like those transitions just should not be parameters, so I
think I should be able to do something like the following (with unroll2
elided and label.prev expressed as an Option for clarity):

object transition extends Template2[CRFLabel, CRFLabel] with ??? {
  val weights = // some appropriately sized tensor
  def unroll1(label: CRFLabel): Iterable[Factor] = (label.prev, label) match {
    case (None, _) => Nil
    case (Some(fst), snd@(_, Begin)) => Factor(weights(fst, snd))
    case (Some(fst@(l1, _)), snd@(l2, Internal)) => if(l1 == l2)
      Factor(weights(fst, snd)) else Factor(Double.NegativeInfinity)
  }
}

Is this possible? I'm not sure how to correctly express this with Factorie.
This definitely doesn't seem like the right use of Family2.Factor. Instead
it looks like I should provide a score definition directly somehow, but
something about how to proceed here eludes me.

2) Because of limitations of our training data, we can't estimate
transitions between phrases within a query. We can only estimate
approximate distributions of likely segment lengths. So it seems reasonable to
only have parameters for the (Name, Begin) -> (Name, Internal) and
(Name, Internal) -> (Name, Internal) transitions (and likewise for every
other Label), and to replace the second case in my pseudo scala above with:

   case (Some(fst), snd@(_, Begin)) => 0.0

and have a transition parameter tensor with only 2*|Labels| weights.
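
As a sketch of the intended scoring, here is the reduced parameterization in plain Scala with a flat array of 2*|Labels| weights. The `ReducedTransitions` object, the index layout and the four-label set are assumptions for illustration; this does not show how to plug the score into a Factorie template, and it does not establish whether Factorie's trainers tolerate a fixed -∞ score.

```scala
sealed trait Label
case object Name extends Label
case object Company extends Label
case object Skill extends Label
case object Location extends Label

sealed trait Position
case object Begin extends Position
case object Internal extends Position

object ReducedTransitions {
  type State = (Label, Position)

  // Illustrative subset of the label set; order fixes the weight layout.
  val labels: Vector[Label] = Vector(Name, Company, Skill, Location)
  val numWeights: Int = 2 * labels.size

  // Assumed layout: two slots per label, slot 0 for a Begin predecessor and
  // slot 1 for an Internal predecessor.
  def weightIndex(label: Label, prevPos: Position): Int =
    2 * labels.indexOf(label) + (if (prevPos == Begin) 0 else 1)

  // Transition score: unparameterized (0.0) into any Begin state, a learned
  // weight for a same-label move into Internal, and -infinity otherwise.
  def score(weights: Array[Double], prev: State, next: State): Double =
    (prev, next) match {
      case (_, (_, Begin))                        => 0.0
      case ((l1, p1), (l2, Internal)) if l1 == l2 => weights(weightIndex(l2, p1))
      case _                                      => Double.NegativeInfinity
    }
}
```

Only the same-label moves into an Internal state read from `weights`, so the illegal transitions never appear as parameters at all.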

Any guidance as to how I can do this?

Thank you!
adam