2 new revisions:
Revision: e830a78f87e2
Branch: default
Author: Michael Gasser <gas...@cs.indiana.edu>
Date: Sun May 11 00:25:21 2014 UTC
Log: Removed old projects.
http://code.google.com/p/hltdi-l3/source/detail?r=e830a78f87e2
Revision: 5902fb7637ec
Branch: default
Author: Michael Gasser <gas...@cs.indiana.edu>
Date: Sun May 11 05:20:29 2014 UTC
Log: LGLP paper first draft mostly done.
http://code.google.com/p/hltdi-l3/source/detail?r=5902fb7637ec
==============================================================================
Revision: e830a78f87e2
Branch: default
Author: Michael Gasser <gas...@cs.indiana.edu>
Date: Sun May 11 00:25:21 2014 UTC
Log: Removed old projects.
http://code.google.com/p/hltdi-l3/source/detail?r=e830a78f87e2
Added:
/paperdrafts/lglp/coling2014.sty
Deleted:
/generation/README
/generation/dimension.py
/generation/en.py
/generation/jp-infl.html
/generation/jp.py
/generation/l3.py
/generation/latex/HomeworkStyle.sty
/generation/latex/Makefile
/generation/latex/entitlement.sty
/generation/latex/finalreport.tex
/generation/latex/progressreport0.tex
/generation/latex/progressreport1.tex
/generation/latex/proposal.tex
/generation/lex.py
/generation/linearize.py
/generation/xdg_constraint/README
/generation/xdg_constraint/__init__.py
/generation/xdg_constraint/constraints/__init__.py
/generation/xdg_constraint/constraints/constraint.py
/generation/xdg_constraint/constraints/equality.py
/generation/xdg_constraint/constraints/ordering.py
/generation/xdg_constraint/constraints/set_membership.py
/generation/xdg_constraint/constraints/summation.py
/generation/xdg_constraint/constraints/xdg.py
/generation/xdg_constraint/problem.py
/generation/xdg_constraint/solvers/__init__.py
/generation/xdg_constraint/solvers/solver.py
/generation/xdg_constraint/variable.py
/ordermodel/NOTES
/ordermodel/learnmodel.py
/ordermodel/ordermodel.py
/ordermodel/runtests.py
/ordermodel/testoncorpus.py
/ordermodel/tests/__init__.py
/ordermodel/tests/test_learnmodel.py
/ordermodel/tests/test_utils.py
/ordermodel/trainoncorpus.py
/ordermodel/utils.py
/xdg/__init__.py
/xdg/dimension.py
/xdg/l3.py
/xdg/languages/__init__.py
/xdg/languages/am.py
/xdg/languages/am.yaml
/xdg/languages/en.py
/xdg/languages/en.yaml
/xdg/languages/en_adjs.yaml
/xdg/languages/en_am.py
/xdg/languages/en_misc.yaml
/xdg/languages/en_nouns.yaml
/xdg/languages/en_nouns_prop.yaml
/xdg/languages/en_verbs.yaml
/xdg/languages/language.py
/xdg/languages/languages.py
/xdg/languages/lex.py
/xdg/languages/morpho/FST/cfilt.fst
/xdg/languages/morpho/FST/cfilt0.fst
/xdg/languages/morpho/__init__.py
/xdg/languages/morpho/fs.py
/xdg/languages/morpho/fst.py
/xdg/languages/morpho/geez/__init__.py
/xdg/languages/morpho/geez/am_conv_sera.txt
/xdg/languages/morpho/geez/geez.py
/xdg/languages/morpho/geez/sil_conv_sera.txt
/xdg/languages/morpho/geez/ti_conv_sera.txt
/xdg/languages/morpho/internals.py
/xdg/languages/morpho/letter_tree.py
/xdg/languages/morpho/logic.py
/xdg/languages/morpho/morphology.py
/xdg/languages/morpho/semiring.py
/xdg/languages/morpho/utils.py
/xdg/languages/qu.py
/xdg/languages/qu.yaml
/xdg/linearize.py
/xdg/node.py
/xdg/xdg_constraint/__init__.py
/xdg/xdg_constraint/constraints/__init__.py
/xdg/xdg_constraint/constraints/constraint.py
/xdg/xdg_constraint/constraints/equality.py
/xdg/xdg_constraint/constraints/ordering.py
/xdg/xdg_constraint/constraints/set_membership.py
/xdg/xdg_constraint/constraints/summation.py
/xdg/xdg_constraint/constraints/xdg.py
/xdg/xdg_constraint/demos/nqueens.py
/xdg/xdg_constraint/demos/palindromes.py
/xdg/xdg_constraint/demos/shorter.py
/xdg/xdg_constraint/problem.py
/xdg/xdg_constraint/solvers/__init__.py
/xdg/xdg_constraint/solvers/solver.py
/xdg/xdg_constraint/solvers/wcsp.py
/xdg/xdg_constraint/solvers/weighted.py
/xdg/xdg_constraint/variable.py
Modified:
/paperdrafts/lglp/lglp14.pdf
/paperdrafts/lglp/lglp14.tex
=======================================
--- /dev/null
+++ /paperdrafts/lglp/coling2014.sty Sun May 11 00:25:21 2014 UTC
@@ -0,0 +1,354 @@
+% File coling2014.sty
+% January and February 2014
+
+% This is the LaTeX style file for Coling 2014. It is nearly identical to
+% the style file for ACL 2014.
+%
+% Changes made: switched to single column format and removed margin around
+% abstract
+
+% This is the LaTeX style file for ACL 2014. It is nearly identical to
+% the style files for ACL 2013, EACL 2006, ACL2005, ACL 2002, ACL
+% 2001, ACL 2000, EACL 95 and EACL 99.
+%
+% Changes made include: adapt layout to A4 and centimeters, widen abstract
+
+% This is the LaTeX style file for ACL 2000. It is nearly identical to the
+% style files for EACL 95 and EACL 99. Minor changes include editing the
+% instructions to reflect use of \documentclass rather than \documentstyle
+% and removing the white space before the title on the first page
+% -- John Chen, June 29, 2000
+
+% To convert from submissions prepared using the style file aclsub.sty
+% prepared for the ACL 2000 conference, proceed as follows:
+% 1) Remove submission-specific information: \whichsession, \id,
+% \wordcount, \otherconferences, \area, \keywords
+% 2) \summary should be removed. The summary material should come
+% after \maketitle and should be in the ``abstract'' environment
+% 3) Check all citations. This style should handle citations correctly
+% and also allows multiple citations separated by semicolons.
+% 4) Check figures and examples. Because the final format is double-
+% column, some adjustments may have to be made to fit text in the column
+% or to choose full-width (\figure*) figures.
+% 5) Change the style reference from aclsub to acl2000, and be sure
+% this style file is in your TeX search path
+
+
+% This is the LaTeX style file for EACL-95. It is identical to the
+% style file for ANLP '94 except that the margins are adjusted for A4
+% paper. -- abney 13 Dec 94
+
+% The ANLP '94 style file is a slightly modified
+% version of the style used for AAAI and IJCAI, using some changes
+% prepared by Fernando Pereira and others and some minor changes
+% by Paul Jacobs.
+
+% Papers prepared using the aclsub.sty file and acl.bst bibtex style
+% should be easily converted to final format using this style.
+% (1) Submission information (\wordcount, \subject, and \makeidpage)
+% should be removed.
+% (2) \summary should be removed. The summary material should come
+% after \maketitle and should be in the ``abstract'' environment
+% (between \begin{abstract} and \end{abstract}).
+% (3) Check all citations. This style should handle citations correctly
+% and also allows multiple citations separated by semicolons.
+% (4) Check figures and examples. Because the final format is double-
+% column, some adjustments may have to be made to fit text in the column
+% or to choose full-width (\figure*) figures.
+
+% Place this in a file called aclap.sty in the TeX search path.
+% (Placing it in the same directory as the paper should also work.)
+
+% Prepared by Peter F. Patel-Schneider, liberally using the ideas of
+% other style hackers, including Barbara Beeton.
+% This style is NOT guaranteed to work. It is provided in the hope
+% that it will make the preparation of papers easier.
+%
+% There are undoubtably bugs in this style. If you make bug fixes,
+% improvements, etc. please let me know. My e-mail address is:
+% pf...@research.att.com
+
+% Papers are to be prepared using the ``acl'' bibliography style,
+% as follows:
+% \documentclass[11pt]{article}
+% \usepackage{acl2000}
+% \title{Title}
+% \author{Author 1 \and Author 2 \\ Address line \\ Address line \And
+% Author 3 \\ Address line \\ Address line}
+% \begin{document}
+% ...
+% \bibliography{bibliography-file}
+% \bibliographystyle{acl}
+% \end{document}
+
+% Author information can be set in various styles:
+% For several authors from the same institution:
+% \author{Author 1 \and ... \and Author n \\
+% Address line \\ ... \\ Address line}
+% if the names do not fit well on one line use
+% Author 1 \\ {\bf Author 2} \\ ... \\ {\bf Author n} \\
+% For authors from different institutions:
+% \author{Author 1 \\ Address line \\ ... \\ Address line
+% \And ... \And
+% Author n \\ Address line \\ ... \\ Address line}
+% To start a separate ``row'' of authors use \AND, as in
+% \author{Author 1 \\ Address line \\ ... \\ Address line
+% \AND
+% Author 2 \\ Address line \\ ... \\ Address line \And
+% Author 3 \\ Address line \\ ... \\ Address line}
+
+% If the title and author information does not fit in the area allocated,
+% place \setlength\titlebox{<new height>} right after
+% \usepackage{coling2014}
+% where <new height> can be something larger than 5cm
+
+\typeout{Conference Style for COLING 2014 -- prepared 29th January 2014}
+
+% NOTE: Some laser printers have a serious problem printing TeX output.
+% These printing devices, commonly known as ``write-white'' laser
+% printers, tend to make characters too light. To get around this
+% problem, a darker set of fonts must be created for these devices.
+%
+
+
+
+% A4 modified by Eneko; again modified by Alexander for 5cm titlebox
+\setlength{\paperwidth}{21cm} % A4
+\setlength{\paperheight}{29.7cm}% A4
+\setlength\topmargin{-0.5cm}
+\setlength\oddsidemargin{0cm}
+\setlength\textheight{24.7cm}
+\setlength\textwidth{16.0cm}
+\setlength\columnsep{0.6cm}
+\newlength\titlebox
+\setlength\titlebox{5cm}
+\setlength\headheight{5pt}
+\setlength\headsep{0pt}
+\thispagestyle{empty}
+\pagestyle{empty}
+
+
+\flushbottom \sloppy
+
+% We're never going to need a table of contents, so just flush it to
+% save space --- suggested by drstrip@sandia-2
+\def\addcontentsline#1#2#3{}
+
+% Footnote without marker for copyright/licence statement
+% Code taken from
+% http://tex.stackexchange.com/questions/30720/footnote-without-a-marker
+% which claims to have taken it, in turn, from
+% http://help-csli.stanford.edu/tex/latex-footnotes.shtml#unnumber
+% Note the comment that there may be numbering problems if
+% you are using the hyperref package.
+\def\blfootnote{\xdef\@thefnmark{}\@footnotetext}
+
+% Title stuff, taken from deproc.
+\def\maketitle{\par
+ \begingroup
+ \def\thefootnote{\fnsymbol{footnote}}
+ \def\@makefnmark{\hbox to 0pt{$^{\@thefnmark}$\hss}}
+ \@maketitle \@thanks
+ \endgroup
+ \setcounter{footnote}{0}
+ \let\maketitle\relax \let\@maketitle\relax
+ \gdef\@thanks{}\gdef\@author{}\gdef\@title{}\let\thanks\relax}
+\def\@maketitle{\vbox to \titlebox{\hsize\textwidth
+ \linewidth\hsize \vskip 0.125in minus 0.125in \centering
+ {\Large\bf \@title \par} \vskip 0.2in plus 1fil minus 0.1in
+ {\def\and{\unskip\enspace{\rm and}\enspace}%
+ \def\And{\end{tabular}\hss \egroup \hskip 1in plus 2fil
+ \hbox to 0pt\bgroup\hss \begin{tabular}[t]{c}\bf}%
+ \def\AND{\end{tabular}\hss\egroup \hfil\hfil\egroup
+ \vskip 0.25in plus 1fil minus 0.125in
+ \hbox to \linewidth\bgroup\large \hfil\hfil
+ \hbox to 0pt\bgroup\hss \begin{tabular}[t]{c}\bf}
+ \hbox to \linewidth\bgroup\large \hfil\hfil
+ \hbox to 0pt\bgroup\hss \begin{tabular}[t]{c}\bf\@author
+ \end{tabular}\hss\egroup
+ \hfil\hfil\egroup}
+ \vskip 0.3in plus 2fil minus 0.1in
+}}
+
+% margins for abstract
+\renewenvironment{abstract}%
+ {\centerline{\large\bf Abstract}%
+ \begin{list}{}%
+ {\setlength{\rightmargin}{0.6cm}%
+ \setlength{\leftmargin}{0.6cm}}%
+ \item[]\ignorespaces}%
+ {\unskip\end{list}}
+
+%\renewenvironment{abstract}{\centerline{\large\bf
+% Abstract}\vspace{0.5ex}\begin{quote}}{\par\end{quote}\vskip 1ex}
+
+
+% bibliography
+
+\def\thebibliography#1{\section*{References}
+ \global\def\@listi{\leftmargin\leftmargini
+ \labelwidth\leftmargini \advance\labelwidth-\labelsep
+ \topsep 1pt plus 2pt minus 1pt
+ \parsep 0.25ex plus 1pt \itemsep 0.25ex plus 1pt}
+ \list {[\arabic{enumi}]}{\settowidth\labelwidth{[#1]}\leftmargin\labelwidth
+ \advance\leftmargin\labelsep\usecounter{enumi}}
+ \def\newblock{\hskip .11em plus .33em minus -.07em}
+ \sloppy
+ \sfcode`\.=1000\relax}
+
+\def\@up#1{\raise.2ex\hbox{#1}}
+
+% most of cite format is from aclsub.sty by SMS
+
+% don't box citations, separate with ; and a space
+% also, make the penalty between citations negative: a good place to break
+% changed comma back to semicolon pj 2/1/90
+% \def\@citex[#1]#2{\if@filesw\immediate\write\@auxout{\string\citation{#2}}\fi
+% \def\@citea{}\@cite{\@for\@citeb:=#2\do
+% {\@citea\def\@citea{;\penalty\@citeseppen\ }\@ifundefined
+% {b@\@citeb}{{\bf ?}\@warning
+% {Citation `\@citeb' on page \thepage \space undefined}}%
+% {\csname b@\@citeb\endcsname}}}{#1}}
+
+% don't box citations, separate with ; and a space
+% Replaced for multiple citations (pj)
+% don't box citations and also add space, semicolon between multiple citations
+\def\@citex[#1]#2{\if@filesw\immediate\write\@auxout{\string\citation{#2}}\fi
+ \def\@citea{}\@cite{\@for\@citeb:=#2\do
+ {\@citea\def\@citea{; }\@ifundefined
+ {b@\@citeb}{{\bf ?}\@warning
+ {Citation `\@citeb' on page \thepage \space undefined}}%
+ {\csname b@\@citeb\endcsname}}}{#1}}
+
+% Allow short (name-less) citations, when used in
+% conjunction with a bibliography style that creates labels like
+% \citename{<names>, }<year>
+%
+\let\@internalcite\cite
+\def\cite{\def\citename##1{##1, }\@internalcite}
+\def\shortcite{\def\citename##1{}\@internalcite}
+\def\newcite{\def\citename##1{{\frenchspacing##1} (}\@internalciteb}
+
+% Macros for \newcite, which leaves name in running text, and is
+% otherwise like \shortcite.
+\def\@citexb[#1]#2{\if@filesw\immediate\write\@auxout{\string\citation{#2}}\fi
+ \def\@citea{}\@newcite{\@for\@citeb:=#2\do
+ {\@citea\def\@citea{;\penalty\@m\ }\@ifundefined
+ {b@\@citeb}{{\bf ?}\@warning
+ {Citation `\@citeb' on page \thepage \space undefined}}%
+{\csname b@\@citeb\endcsname}}}{#1}}
+\def\@internalciteb{\@ifnextchar [{\@tempswatrue\@citexb}{\@tempswafalse\@citexb[]}}
+
+\def\@newcite#1#2{{#1\if@tempswa, #2\fi)}}
+
+\def\@biblabel#1{\def\citename##1{##1}[#1]\hfill}
+
+%%% More changes made by SMS (originals in latex.tex)
+% Use parentheses instead of square brackets in the text.
+\def\@cite#1#2{({#1\if@tempswa , #2\fi})}
+
+% Don't put a label in the bibliography at all. Just use the unlabeled format
+% instead.
+\def\thebibliography#1{\vskip\parskip%
+\vskip\baselineskip%
+\def\baselinestretch{1}%
+\ifx\@currsize\normalsize\@normalsize\else\@currsize\fi%
+\vskip-\parskip%
+\vskip-\baselineskip%
+\section*{References\@mkboth
+ {References}{References}}\list
+ {}{\setlength{\labelwidth}{0pt}\setlength{\leftmargin}{\parindent}
+ \setlength{\itemindent}{-\parindent}}
+ \def\newblock{\hskip .11em plus .33em minus -.07em}
+ \sloppy\clubpenalty4000\widowpenalty4000
+ \sfcode`\.=1000\relax}
+\let\endthebibliography=\endlist
+
+% Allow for a bibliography of sources of attested examples
+\def\thesourcebibliography#1{\vskip\parskip%
+\vskip\baselineskip%
+\def\baselinestretch{1}%
+\ifx\@currsize\normalsize\@normalsize\else\@currsize\fi%
+\vskip-\parskip%
+\vskip-\baselineskip%
+\section*{Sources of Attested Examples\@mkboth
+ {Sources of Attested Examples}{Sources of Attested Examples}}\list
+ {}{\setlength{\labelwidth}{0pt}\setlength{\leftmargin}{\parindent}
+ \setlength{\itemindent}{-\parindent}}
+ \def\newblock{\hskip .11em plus .33em minus -.07em}
+ \sloppy\clubpenalty4000\widowpenalty4000
+ \sfcode`\.=1000\relax}
+\let\endthesourcebibliography=\endlist
+
+\def\@lbibitem[#1]#2{\item[]\if@filesw
+ { \def\protect##1{\string ##1\space}\immediate
+ \write\@auxout{\string\bibcite{#2}{#1}}\fi\ignorespaces}}
+
+\def\@bibitem#1{\item\if@filesw \immediate\write\@auxout
+ {\string\bibcite{#1}{\the\c@enumi}}\fi\ignorespaces}
+
+% sections with less space
+\def\section{\@startsection {section}{1}{\z@}{-2.0ex plus
+ -0.5ex minus -.2ex}{1.5ex plus 0.3ex minus .2ex}{\large\bf\raggedright}}
+\def\subsection{\@startsection{subsection}{2}{\z@}{-1.8ex plus
+ -0.5ex minus -.2ex}{0.8ex plus .2ex}{\normalsize\bf\raggedright}}
+%% changed by KO to - values to get the initial parindent right
+\def\subsubsection{\@startsection{subsubsection}{3}{\z@}{-1.5ex plus
+ -0.5ex minus -.2ex}{0.5ex plus .2ex}{\normalsize\bf\raggedright}}
+\def\paragraph{\@startsection{paragraph}{4}{\z@}{1.5ex plus
+ 0.5ex minus .2ex}{-1em}{\normalsize\bf}}
+\def\subparagraph{\@startsection{subparagraph}{5}{\parindent}{1.5ex plus
+ 0.5ex minus .2ex}{-1em}{\normalsize\bf}}
+
+% Footnotes
+\footnotesep 6.65pt %
+\skip\footins 9pt plus 4pt minus 2pt
+\def\footnoterule{\kern-3pt \hrule width 5pc \kern 2.6pt }
+\setcounter{footnote}{0}
+
+% Lists and paragraphs
+\parindent 1em
+\topsep 4pt plus 1pt minus 2pt
+\partopsep 1pt plus 0.5pt minus 0.5pt
+\itemsep 2pt plus 1pt minus 0.5pt
+\parsep 2pt plus 1pt minus 0.5pt
+
+\leftmargin 2em \leftmargini\leftmargin \leftmarginii 2em
+\leftmarginiii 1.5em \leftmarginiv 1.0em \leftmarginv .5em \leftmarginvi .5em
+\labelwidth\leftmargini\advance\labelwidth-\labelsep \labelsep 5pt
+
+\def\@listi{\leftmargin\leftmargini}
+\def\@listii{\leftmargin\leftmarginii
+ \labelwidth\leftmarginii\advance\labelwidth-\labelsep
+ \topsep 2pt plus 1pt minus 0.5pt
+ \parsep 1pt plus 0.5pt minus 0.5pt
+ \itemsep \parsep}
+\def\@listiii{\leftmargin\leftmarginiii
+ \labelwidth\leftmarginiii\advance\labelwidth-\labelsep
+ \topsep 1pt plus 0.5pt minus 0.5pt
+ \parsep \z@ \partopsep 0.5pt plus 0pt minus 0.5pt
+ \itemsep \topsep}
+\def\@listiv{\leftmargin\leftmarginiv
+ \labelwidth\leftmarginiv\advance\labelwidth-\labelsep}
+\def\@listv{\leftmargin\leftmarginv
+ \labelwidth\leftmarginv\advance\labelwidth-\labelsep}
+\def\@listvi{\leftmargin\leftmarginvi
+ \labelwidth\leftmarginvi\advance\labelwidth-\labelsep}
+
+\abovedisplayskip 7pt plus2pt minus5pt%
+\belowdisplayskip \abovedisplayskip
+\abovedisplayshortskip 0pt plus3pt%
+\belowdisplayshortskip 4pt plus3pt minus3pt%
+
+% Less leading in most fonts (due to the narrow columns)
+% The choices were between 1-pt and 1.5-pt leading
+\def\@normalsize{\@setsize\normalsize{11pt}\xpt\@xpt}
+\def\small{\@setsize\small{10pt}\ixpt\@ixpt}
+\def\footnotesize{\@setsize\footnotesize{10pt}\ixpt\@ixpt}
+\def\scriptsize{\@setsize\scriptsize{8pt}\viipt\@viipt}
+\def\tiny{\@setsize\tiny{7pt}\vipt\@vipt}
+\def\large{\@setsize\large{14pt}\xiipt\@xiipt}
+\def\Large{\@setsize\Large{16pt}\xivpt\@xivpt}
+\def\LARGE{\@setsize\LARGE{20pt}\xviipt\@xviipt}
+\def\huge{\@setsize\huge{23pt}\xxpt\@xxpt}
+\def\Huge{\@setsize\Huge{28pt}\xxvpt\@xxvpt}
=======================================
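(Aside, not part of the diff: the usage comments at the top of coling2014.sty above amount to the following minimal document skeleton. The file name `bibliography-file` and the commented-out `\titlebox` value are placeholders from those comments, not fixed requirements.)

```latex
\documentclass[11pt]{article}
\usepackage{coling2014}
% Only if the title and author block overflows the default 5cm area:
% \setlength\titlebox{6cm}
\title{Title}
\author{Author 1 \and Author 2 \\ Address line \\ Address line \And
        Author 3 \\ Address line \\ Address line}
\begin{document}
\maketitle
\begin{abstract}
  Abstract text goes here, after \maketitle, in the abstract environment.
\end{abstract}
% ... body of the paper ...
\bibliography{bibliography-file}
\bibliographystyle{acl}
\end{document}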
--- /generation/README Fri Nov 27 10:54:58 2009 UTC
+++ /dev/null
@@ -1,1 +0,0 @@
-This is where Wren, Alex and Yin are working on using the XDG formalism to generate text.
=======================================
--- /generation/dimension.py Sat Nov 28 09:39:59 2009 UTC
+++ /dev/null
@@ -1,1420 +0,0 @@
-# Implementation of dimensions and dimension-specific constraint instantiation
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# 2009.10.31
-# Added group arc principles and constraints
-
-from xdg_constraint import *
-
-class Dimension:
- """Abstract class for XDG dimensions."""
-
- def __init__(self, language, problem):
- """
- @param language: the language for the dimension, needed for
- implementation of constraints
- @type language: Language
- @param problem: the constraint satisfaction problem
- @type problem: XDGProblem
- """
- self.language = language
- self.problem = problem
- self.principles = []
- self.abbrev = ''
-
-## def set_principles(self, groups):
-## """Assign principles in groups (lists of method instances)."""
-## self.principle_groups = groups
-##
-## def get_principles_iter(self):
-## """
-## Generator for getting groups of principles.
-## """
-## for principles in self.principle_groups:
-## yield principles
-
- def set_principles(self, principles):
- self.principles = principles
-
- def get_principles(self):
- """Return all principles."""
- return self.principles
-## return reduce(lambda x, y: x + y, self.principle_groups)
-
- def get_node_vars(self, node, kind):
-        """Return the list of variables of kind variable dictionary in node for
-        this dimension."""
- dim_vars = node.vars.get(self.abbrev)
- if dim_vars:
- return dim_vars.get(kind, [])
- return []
-
- def get_lex_dim(self, lex):
- """Return the LexDim in lex for this dimension."""
- lex_dim = lex.dims.get(self.abbrev)
- if lex_dim is None:
- print "! no dimension for " + self.abbrev + " in " + str(lex)
- return lex_dim
-
-
-class ArcDimension(Dimension):
- """
- Abstract class for dimensions with arcs, such as syntax and
- semantics. These have the valency principle, but not necessarily
- tree, order, or agreement.
- """
-
- def group_arc_principle(self):
- """
- Add group arc constraints, constraining arcs with particular
- labels to have daughters that belong to the same group and
- have particular (word) labels.
- """
- if self.problem.groups:
- # Dict of arc variables by source, dest pairs
- arc_daughs = self.problem.dim_vars[self.abbrev]['arc_daughs']
- for gid, group in self.problem.groups.iteritems():
- # group is dict with various group properties
- # A list of lists of node,lex,var,entry_index tuples
- nlve = group['n_l_v_e']
- # Add node words to each sublist
- for index, (lex, word_list) in enumerate(nlve):
- nlve[index] = [word_list[0][0].word, lex, word_list]
- for word, lex, word_list in nlve:
-                # Each word_list is a list of node,lex,var,entry_index tuples for a single word
-                # For this word, check to see if its lex has a groupouts feature
-                lex_dim = self.get_lex_dim(lex)
-                lex_groupouts = lex_dim.groupouts.items()
-                if lex_groupouts:
-                    # Any node in word_list must satisfy the set of arc_label, daugh_label
-                    # constraints if its entry_var is bound to the group entry_index
-                    for node, entry_var, entry_index in word_list:
-                        # A dict of arc label : daughter label pairs
-                        for arc_label, daugh_label in lex_groupouts:
-                            # Find which nodes have daugh_label as their name
-                            matching_nodes = filter(lambda x: x[0] == daugh_label, nlve)[0]
-                            variables = [entry_var]
-                            d_entry_indices = [x[2] for x in matching_nodes[2]]
-                            for d_node, d_entry_var, d_entry_index in matching_nodes[2]:
-                                m_index = node.index
-                                d_index = d_node.index
-                                arc_var = arc_daughs.get((m_index, d_index))
-                                variables.extend([d_entry_var, arc_var])
-                            # Add the constraint for this combination of mother and daughter
-                            # entry variables and the arc variable
-                            self.problem.addConstraint(
-                                XDGConstraint(group_arc_agreeC(entry_index, arc_label, d_entry_indices),
-                                              name = gid + ':GroupArc_' + arc_label),
-                                variables)
-
- def valency_principle(self):
- """Add valency constraints."""
-        # For all nodes except end-of-sentence node, add in and out valency constraints.
- for node in self.problem.nodes[:-1]:
- for label in self.labels + ['root']:
- if label:
- # label is not None
- self._valency_principle1(node, label, ins=True)
- self._valency_principle1(node, label, ins=False)
-
- def _valency_principle1(self, node, label, ins=True):
- """Is valency satisfied for label in node's ins or outs?
-
- @param node: the node that the constraint applies to
- @type node: Node
- @param label: an arc label
- @type label: string
-        @param ins: whether the constraint applies to the ins or outs of the node
- @type ins: boolean
- @return: whether the constraint is satisfied
- @rtype: boolean
- """
- # Constraint name
- name = str(node.index) + ':'
- name += 'Valency_' + label
- if ins:
- name += '_in'
- else:
- name += '_out'
- # Variables: node mother or daughter vars: arc labels
- variables = self.get_node_vars(node, 'mother_vars') if ins \
- else self.get_node_vars(node,'daughter_vars')
- if node.n_entries == 1:
- # Only one entry; entry var is not included in constraint
- lex = node.get_entry()
- lex_dim = self.get_lex_dim(lex)
- if lex_dim:
- lex_constraint = lex_dim.ins.get(label, 0) if ins \
- else lex_dim.outs.get(label, 0)
- if variables:
-                    # No need to handle the 0 cases since the in and out label are already
- # constrained to be possible labels only
- # if lex_constraint == 0:
- # name += '0'
-                    # self.problem.addConstraint(XDGConstraint(none_equalC(label), name=name), variables)
- if lex_constraint == '!':
- # Exactly 1 of label
- name += '!'
-                        self.problem.addConstraint(XDGConstraint(one_equalC(label), name=name),
-                                                   variables)
- elif lex_constraint == '?':
- # 0 or 1 of label
- name += '?'
-                        self.problem.addConstraint(XDGConstraint(zero_or_one_equalC(label), name=name),
-                                                   variables)
- elif node.n_entries > 1:
-            # Node is ambiguous, so entry variable is included with daughter/mother variables in constraint
- variables = [node.entry_var] + variables
- name += '_lex'
-            self.problem.addConstraint(XDGConstraint(valency_constraintC(label, node, self.abbrev, ins), name=name),
-                                       variables)
-
-class Syntax(ArcDimension):
- """Class for the syntax dimension."""
-
- def __init__(self, language=None, problem=None):
- """Create arc label list and principle groups."""
- Dimension.__init__(self, language, problem)
- self.abbrev = 'syn'
- self.labels = language.labels.get(self.abbrev, []) + [None]
-
-        # Principles are grouped so that some can have priority over others.
-        self.set_principles([self.tree_principle, self.projectivity_principle,
-                             self.group_arc_principle,
-                             self.order_principle, self.valency_principle,
-                             self.agr_principle])
-
- def tree_principle(self):
- """Add constraints that no node can have more than one mother."""
-        # For all nodes except end-of-sentence node, there must be only one mother
-        for node in self.problem.nodes[:-1]:
-            self.problem.addConstraint(XDGConstraint(one_exists, name=str(node.index) + ':Tree'),
-                                       self.get_node_vars(node, 'mother_vars'))
-        # End-of-sentence node has only one daughter
-        self.problem.addConstraint(XDGConstraint(one_exists, name='EOS:Tree'),
-                                   self.get_node_vars(self.problem.nodes[-1], 'daughter_vars'))
-
- def projectivity_principle(self):
-        """For any arc with at least one node in the middle, prevent internal
-        nodes from having mothers outside the interval."""
- arc_dict = self.problem.dim_vars[self.abbrev]['arc_vars']
- for node in self.problem.nodes[:-1]:
-            # Get all daughter vars with daughters separated by at least one node
- daughs = self.get_node_vars(node, 'daughter_vars')
- name = str(node.index) + ':'
- for daugh in daughs:
- # For each arc variable, check whether the distance between
- # the children >= 2
- source, dest = arc_dict[daugh]
- if source - dest >= 2:
- # Left arc
- # Check the arcs into nodes in the interval
- for interval_index in range(dest+1, source):
- interval_node = self.problem.nodes[interval_index]
-                        for int_node_moth in self.get_node_vars(interval_node, 'mother_vars'):
-                            # Only worry about arcs coming from nodes outside the interval
-                            int_source, int_dest = arc_dict[int_node_moth]
-                            if int_source < dest or int_source > source:
-                                cname = name + str(source) + '|' + str(dest) + ':Proj'
-                                self.problem.addConstraint(XDGConstraint(no_cross, name=cname),
-                                                           [daugh, int_node_moth])
-##                                print 'Left', node, source, dest, interval_node, int_node_moth
-##                            else:
-##                                print 'XLeft', node, source, dest, interval_node, int_node_moth
-                elif dest - source >= 2:
-                    # Right arc
-                    # Check arcs in the nodes in the interval
-                    for interval_index in range(source+1, dest):
-                        interval_node = self.problem.nodes[interval_index]
-                        for int_node_moth in self.get_node_vars(interval_node, 'mother_vars'):
-                            # Only worry about arcs from nodes outside the interval
-                            int_source, int_dest = arc_dict[int_node_moth]
-                            if int_source > dest or int_source < source:
-                                cname = name + str(source) + '|' + str(dest) + ':Proj'
-                                self.problem.addConstraint(XDGConstraint(no_cross, name=cname),
-                                                           [daugh, int_node_moth])
-##                                print 'Right', node, source, dest, interval_node, int_node_moth
-##                            else:
-##                                print 'XRight', node, source, dest, interval_node, int_node_moth
-
- def order_principle(self):
-        """Create constraints for the order pairs in each node's order attribute.
-
- Variables:
- daughter arc_labels
- if node has multiple lex entries: entry_var
- """
- # For all nodes except the end-of-sentence node
- for node in self.problem.nodes[:-1]:
- if node.n_entries > 1:
-                # Node is ambiguous; create a single constraint for the node
-                self.problem.addConstraint(XDGConstraint(order_constraintC(node, self.abbrev),
-                                                         name=str(node.index) + ':Order_lex'),
-                                           [node.entry_var] + self.get_node_vars(node, 'daughter_vars'))
- elif node.n_entries == 1:
-                # Only 1 entry for node; create a separate constraint for each ordering pair
- entry = node.entries[0]
- for order_labels in self.get_lex_dim(entry).order:
- self._order_principle1(node, order_labels)
-
- def _order_principle1(self, node, order_labels):
- """
-        Is ordering satisfied for the order_labels pair (either of which may be '^')?
-
- @param node: node for which the constraint is being created
- @type node: Node
- @param order_labels: partial order pairs for node entry
- @type order_labels: list of tuples of two strings
- @return: whether the constraint is satisfied
- @rtype: boolean
- """
- variables = self.get_node_vars(node, 'daughter_vars')
- daughters = self.get_node_vars(node, 'var_daughters')
- name = str(node.index) + ':'
- if variables:
- if order_labels[0] == '^':
- # Node with second label must follow this node
-                self.problem.addConstraint(XDGConstraint(before_allC(node.index, daughters, order_labels[1]),
-                                                         name=str(node.index) + ':Order_D>^'),
-                                           variables)
- elif order_labels[1] == '^':
- # Node with first label must precede this node
-                self.problem.addConstraint(XDGConstraint(all_beforeC(node.index, daughters, order_labels[0]),
-                                                         name=str(node.index) + ':Order_D<^'),
-                                           variables)
- else:
- # Constraint relates two daughters of this node
-                self.problem.addConstraint(XDGConstraint(all_before_allC(daughters, order_labels[0], order_labels[1]),
-                                                         name=str(node.index) + ':Order_D<D'),
-                                           variables)
-
- def agr_principle(self):
- """Create constraints for the labels in each node's agree field."""
- # For all nodes except the end-of-sentence node
- for node in self.problem.nodes[:-1]:
- if node.n_entries == 1:
-                # Only 1 entry for node; its entry var not included in constraint
- entry = node.get_entry()
- agree = self.get_lex_dim(entry).agree
- entry_agrs = self.get_lex_dim(entry).agrs
-                # Entry must have both agree and agrs (English past verbs have agree but not agrs)
- if agree and entry_agrs:
- # Node has agree constraints.
- if self.get_node_vars(node, 'max_agrs') > 1:
-                        # Node's agr var must be included in the constraint.
-                        # Make a single constraint for all agr dicts in agrs.
- self._agr_principle_mult(node, entry_agrs, agree)
- else:
- # There's only one agr
-                        # What happens depends on whether has_mult_agrs is true for the language.
- if self.language.has_mult_agrs:
-                            # Make a separate constraint for each agr dict in agrs whose label is in agree.
-                            for agr_label, agr in entry_agrs[0].iteritems():
-                                if agr_label in agree:
-                                    self._agr_principle1(node, agr_label, agr)
- else:
- # Make a constraint for each agree label.
- for agr_label in agree:
-                                self._agr_principle1(node, agr_label, entry_agrs[0])
-
-            elif any([(self.get_lex_dim(entry).agree and self.get_lex_dim(entry).agrs)\
-                      for entry in node.entries]):
- # Node is ambiguous; each entry has its own agrs.
- # Node's entry_var must be included in constraint.
- # Make a single constraint for all node entries.
- self._agr_principle_lex(node)
-
-    def _agr_principle1(self, node, agr_label, agr):
-        """Does the daughter with agr_label agree with this node?
-
-        Constraint variables: at least arc_labels.
-        Other possible variables: daughter entry_vars, daughter agr_vars.
-        4 possible var combinations.
-
-        @param agr_label: arc label constraining mother-daughter agreement
-        @type  agr_label: string
-        @param node:      node that this agreement constraint is for
-        @type  node:      Node
-        @param agr:       agrs for node (which is unambiguous)
-        @type  agr:       list of tuples (or dicts if language has multiple agrs)
-        @return:          whether the constraint is satisfied
-        @rtype:           boolean
-        """
-        # Variables always end with daughter arc vars
-        variables = self.get_node_vars(node, 'daughter_vars')
-        constraint = None
-        name = str(node.index) + ':'
-        # Does the language have multiple agrs for a single word/node?
-        mult_agrs = self.language.has_mult_agrs
-        # Node associated with each daughter arc var
-        daugh_nodes = [self.problem.nodes[d] for d in self.get_node_vars(node, 'var_daughters')]
-        # Is any daughter ambiguous?
-        any_daugh_mult_entries = any([daughter.n_entries > 1 for daughter in daugh_nodes])
-        # Does any daughter have multiple agrs?
-        any_daugh_mult_agrs = any([self.get_node_vars(daughter, 'max_agrs') > 1 for daughter in daugh_nodes])
-        if any_daugh_mult_entries:
-            # Some daughter is ambiguous.
-            variables = [daughter.entry_var for daughter in daugh_nodes] + variables
-            if any_daugh_mult_agrs:
-                # Some daughter has multiple agrs.
-                variables = [self.get_node_vars(daughter, 'agr_var') for daughter in daugh_nodes] + variables
-                constraint = XDGConstraint(agree_daugh_entry_agrC(daugh_nodes, self.abbrev, agr_label,
-                                                                  agr, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_Dlex_agr')
-            else:
-                # No daughter has multiple agrs.
-                constraint = XDGConstraint(agree_daugh_entryC(daugh_nodes, self.abbrev, agr_label,
-                                                              agr, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_Dlex')
-        elif any_daugh_mult_agrs:
-            # Some daughter has multiple agrs.
-            variables = [self.get_node_vars(daughter, 'agr_var') for daughter in daugh_nodes] + variables
-            constraint = XDGConstraint(agree_daugh_agrC(daugh_nodes, self.abbrev, agr_label,
-                                                        agr, mult_agrs=mult_agrs),
-                                       name=name + 'Agr_Dagr')
-        else:
-            # No daughter has multiple agrs.
-            constraint = XDGConstraint(agree_daughC(daugh_nodes, self.abbrev, agr_label,
-                                                    agr, mult_agrs=mult_agrs),
-                                       name=name + 'Agr')
-
-        # Add the constraint.
-        self.problem.addConstraint(constraint, variables)
-
-    def _agr_principle_mult(self, node, agrs, agree):
-        """Do any of the children of node agree on the multiple features in agree?
-
-        Constraint variables: at least node's agr_var and daughter arc vars.
-        Other possible variables: daughter entry_vars, daughter agr_vars.
-        4 possible var combinations.
-
-        @param node:  node that this agreement constraint is for
-        @type  node:  Node
-        @param agrs:  node's list of agrs (node is unambiguous)
-        @type  agrs:  list of tuples (or dicts if language has multiple agrs for words)
-        @param agree: daughter agreement constraints (arc labels) for node
-        @type  agree: list of strings
-        """
-        # Variables always end with daughter arc vars.
-        variables = self.get_node_vars(node, 'daughter_vars')
-        constraint = None
-        # Does the language have multiple agrs for a single word/node?
-        mult_agrs = self.language.has_mult_agrs
-        name = str(node.index) + ':'
-        # Node associated with each daughter arc var.
-        daugh_nodes = [self.problem.nodes[d] for d in self.get_node_vars(node, 'var_daughters')]
-        # Node's agr var is always the first variable.
-        agr_variable = self.get_node_vars(node, 'agr_var')
-        # Is any daughter ambiguous?
-        any_daugh_mult_entries = any([daughter.n_entries > 1 for daughter in daugh_nodes])
-        # Does any daughter have multiple agrs?
-        any_daugh_mult_agrs = any([self.get_node_vars(daughter, 'max_agrs') > 1 for daughter in daugh_nodes])
-        if any_daugh_mult_entries:
-            # Some daughter is ambiguous; add daughter entry vars to variables.
-            variables = [daughter.entry_var for daughter in daugh_nodes] + variables
-            if any_daugh_mult_agrs:
-                # Some daughter has multiple agrs; add daughter agr vars to variables.
-                variables = [self.get_node_vars(daughter, 'agr_var') for daughter in daugh_nodes] + variables
-                constraint = XDGConstraint(agree_agr_daugh_entry_agrC(daugh_nodes, self.abbrev,
-                                                                      agrs, agree, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_mult_Dlex_agr')
-            else:
-                constraint = XDGConstraint(agree_agr_daugh_entryC(daugh_nodes, self.abbrev,
-                                                                  agrs, agree, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_mult_Dlex')
-        elif any_daugh_mult_agrs:
-            # Some daughter has multiple agrs; add daughter agr vars to variables.
-            variables = [self.get_node_vars(daughter, 'agr_var') for daughter in daugh_nodes] + variables
-            constraint = XDGConstraint(agree_agr_daugh_agrC(daugh_nodes, self.abbrev,
-                                                            agrs, agree, mult_agrs=mult_agrs),
-                                       name=name + 'Agr_mult_Dagr')
-        else:
-            constraint = XDGConstraint(agree_agr_daughC(daugh_nodes, self.abbrev,
-                                                        agrs, agree, mult_agrs=mult_agrs),
-                                       name=name + 'Agr_mult')
-        variables = [agr_variable] + variables
-        # Add the constraint.
-        self.problem.addConstraint(constraint, variables)
-
-    def _agr_principle_lex(self, node):
-        """Does any entry for this ambiguous node have an agree that is satisfied by the node's daughters?
-
-        Constraint variables: at least node's entry_var and daughter arc vars.
-        Other possible variables: node's agr_var, daughter entry_vars, daughter agr_vars.
-        8 possible var combinations.
-
-        @param node: node that this agreement constraint is for
-        @type  node: Node
-        """
-        # Variables always end with daughter arc vars.
-        variables = self.get_node_vars(node, 'daughter_vars')
-        # Does the language have multiple agrs for a single word/node?
-        mult_agrs = self.language.has_mult_agrs
-        # Node associated with each daughter arc var.
-        daugh_nodes = [self.problem.nodes[d] for d in self.get_node_vars(node, 'var_daughters')]
-        # Agr and entry vars for node
-        agr_variable = self.get_node_vars(node, 'agr_var')
-        entry_variable = node.entry_var
-        # Is any daughter ambiguous?
-        any_daugh_mult_entries = any([daughter.n_entries > 1 for daughter in daugh_nodes])
-        # Does any daughter have multiple agrs?
-        any_daugh_mult_agrs = any([self.get_node_vars(daughter, 'max_agrs') > 1 for daughter in daugh_nodes])
-        if any_daugh_mult_entries:
-            # Some daughter is ambiguous; add daughter entry vars to variables.
-            variables = [daughter.entry_var for daughter in daugh_nodes] + variables
-        if any_daugh_mult_agrs:
-            # Some daughter has multiple agrs; add daughter agr vars to variables.
-            variables = [self.get_node_vars(daughter, 'agr_var') for daughter in daugh_nodes] + variables
-        # Node is ambiguous, so add mother entry var to variables.
-        variables = [entry_variable] + variables
-        if self.get_node_vars(node, 'max_agrs') > 1:
-            # Node has multiple agrs, so add agr var to variables
-            variables = [agr_variable] + variables
-        constraint = None
-        name = str(node.index) + ':'
-        if self.get_node_vars(node, 'max_agrs') > 1:
-            if any_daugh_mult_entries:
-                if any_daugh_mult_agrs:
-                    # Ambiguous daughters, daughters with multiple agrs, and mother has multiple agrs.
-                    constraint = XDGConstraint(agree_entry_agr_daugh_entry_agrC(daugh_nodes, node,
-                                                                                self.abbrev, mult_agrs=mult_agrs),
-                                               name=name + 'Agr_mult_lex_Dlex_agr')
-                else:
-                    # Ambiguous daughters and mother has multiple agrs.
-                    constraint = XDGConstraint(agree_entry_agr_daugh_entryC(daugh_nodes, node,
-                                                                            self.abbrev, mult_agrs=mult_agrs),
-                                               name=name + 'Agr_mult_lex_Dlex')
-            elif any_daugh_mult_agrs:
-                # Daughters with multiple agrs and mother has multiple agrs.
-                constraint = XDGConstraint(agree_entry_agr_daugh_agrC(daugh_nodes, node,
-                                                                      self.abbrev, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_mult_lex_Dagr')
-            else:
-                # Mother has multiple agrs.
-                constraint = XDGConstraint(agree_entry_agr_daughC(daugh_nodes, node,
-                                                                  self.abbrev, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_mult_lex')
-        elif any_daugh_mult_entries:
-            if any_daugh_mult_agrs:
-                # Ambiguous daughters and daughters with multiple agrs.
-                constraint = XDGConstraint(agree_entry_daugh_entry_agrC(daugh_nodes, node,
-                                                                        self.abbrev, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_lex_Dlex_agr')
-            else:
-                # Ambiguous daughters.
-                constraint = XDGConstraint(agree_entry_daugh_entryC(daugh_nodes, node,
-                                                                    self.abbrev, mult_agrs=mult_agrs),
-                                           name=name + 'Agr_lex_Dlex')
-        elif any_daugh_mult_agrs:
-            # Daughters with multiple agrs.
-            constraint = XDGConstraint(agree_entry_daugh_agrC(daugh_nodes, node,
-                                                              self.abbrev, mult_agrs=mult_agrs),
-                                       name=name + 'Agr_lex_Dagr')
-        else:
-            # No additional vars.
-            constraint = XDGConstraint(agree_entry_daughC(daugh_nodes, node,
-                                                          self.abbrev, mult_agrs=mult_agrs),
-                                       name=name + 'Agr_lex')
-
-        # Add the constraint.
-        self.problem.addConstraint(constraint, variables)
-
-class Semantics(ArcDimension):
-    """Class for the semantics dimension."""
-
-    def __init__(self, language=None, problem=None):
-        """Create arc label list and principle groups."""
-        Dimension.__init__(self, language, problem)
-        self.abbrev = 'sem'
-        self.labels = language.labels.get(self.abbrev, []) + [None]
-
-        # Principles are grouped so that some can have priority over others.
-        self.set_principles([self.root_principle, self.mother_principle, self.valency_principle])
-
-    def root_principle(self):
-        """There must be at least one root."""
-        self.problem.addConstraint(XDGConstraint(at_least_one, name='Root'),
-                                   self.get_node_vars(self.problem.nodes[-1], 'daughter_vars'))
-
-    def mother_principle(self):
-        """Each node other than EOS must have at least one mother."""
-        for node in self.problem.nodes[:-1]:
-            self.problem.addConstraint(XDGConstraint(at_least_one, name=str(node.index) + ':Mother'),
-                                       self.get_node_vars(node, 'mother_vars'))
-
-class IFDimension(Dimension):
-    """Abstract class for interface dimensions."""
-
-    def __init__(self, language=None, problem=None, dim1=None, dim2=None):
-        """Assign the two dimensions that this is the interface for."""
-        Dimension.__init__(self, language, problem)
-        self.dim1 = dim1
-        self.dim2 = dim2
-
-class SynSem(IFDimension):
-    """Class for the syntax-semantics interface."""
-
-    def __init__(self, language=None, problem=None, sem=None, syn=None):
-        IFDimension.__init__(self, language, problem, sem, syn)
-        self.abbrev = 'synsem'
-        self.labels = language.labels.get(self.abbrev, []) + [None]
-
-        # Principles are grouped so that some can have priority over others.
-        self.set_principles([self.linking_end_principle])
-
-    def linking_end_principle(self):
-        """Instantiate linking-end constraints for nodes where applicable."""
-        # Only check nodes up to EOS node
-        for node in self.problem.nodes[:-1]:
-            entries = node.entries
-            if len(entries) == 1:
-                # No ambiguity
-                entry = entries[0]
-                lex_dim = self.get_lex_dim(entry)
-                # There may be no synsem dimension in this entry
-                if lex_dim:
-                    args = lex_dim.arg
-                    if args:
-                        node_dim1 = node.vars[self.dim1.abbrev]
-                        node_dim2 = node.vars[self.dim2.abbrev]
-                        sem_daugh_arcs = node_dim1['daughter_vars']
-                        syn_daugh_arcs = node_dim2['daughter_vars']
-                        sem_daugh_indices = node_dim1['var_daughters']
-                        syn_daugh_indices = node_dim2['var_daughters']
-                        # args is a dict with sem keys and lists of syn values
-                        for sem_arc, syn_arcs in args.iteritems():
-                            # Out arc in sem labeled sem_arc must correspond to
-                            # out arc in syn labeled one of syn_arcs
-                            self.problem.addConstraint(
-                                XDGConstraint(arg_matchC(len(sem_daugh_arcs), sem_arc, syn_arcs,
-                                                         sem_daugh_indices, syn_daugh_indices),
-                                              name='Arg'),
-                                sem_daugh_arcs + syn_daugh_arcs)
-            else:
-                # Ambiguity; does any entry have an arg constraint?
-                any_arg = False
-                for entry in entries:
-                    lex_dim = self.get_lex_dim(entry)
-                    if lex_dim:
-                        args = lex_dim.arg
-                        if args:
-                            any_arg = True
-                # At least one entry does have an arg constraint
-                if any_arg:
-                    node_dim1 = node.vars[self.dim1.abbrev]
-                    node_dim2 = node.vars[self.dim2.abbrev]
-                    sem_daugh_arcs = node_dim1['daughter_vars']
-                    syn_daugh_arcs = node_dim2['daughter_vars']
-                    sem_daugh_indices = node_dim1['var_daughters']
-                    syn_daugh_indices = node_dim2['var_daughters']
-                    self.problem.addConstraint(
-                        XDGConstraint(arg_match_entryC(node, self.abbrev,
-                                                       sem_daugh_indices, syn_daugh_indices,
-                                                       len(sem_daugh_arcs)),
-                                      name='ArgEntry'),
-                        [node.entry_var] + sem_daugh_arcs + syn_daugh_arcs)
-
-
-### ----------------------------------------------------------------------
-### CONSTRAINT PREDICATES
-###
-### Functions whose identifiers end in C curry over the constraint variables
-### ----------------------------------------------------------------------
-
-# ----------------------------------------------------------------------
-# Tree, root, projectivity predicates
-# ----------------------------------------------------------------------
-
-def one_exists(*values):
-    """Is exactly one of values non-false (not 0, None, etc.)?
-
-    @param values: arc labels
-    @type  values: list of strings
-    @return:       whether the constraint is satisfied
-    @rtype:        boolean
-    """
-    any_found = False
-    for v in values:
-        if v:
-            if any_found:
-                return False
-            else:
-                any_found = True
-    return any_found
-
-def at_least_one(*values):
-    """Is at least one of values non-false (not 0, None, etc.)?
-
-    @param values: arc labels
-    @type  values: list of strings
-    @return:       whether the constraint is satisfied
-    @rtype:        boolean
-    """
-    return any(values)
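As a standalone illustration of how these two predicates behave on tuples of arc-label variable values (with None marking the absence of an arc), here is a minimal re-implementation that runs on its own, independent of the constraint machinery in the diff:

```python
# Standalone sketches of the two predicates above, for illustration only.
def one_exists(*values):
    """Exactly one of values is non-false."""
    return sum(1 for v in values if v) == 1

def at_least_one(*values):
    """At least one of values is non-false."""
    return any(values)

# A node's daughter arc variables might take values like these,
# where None means "no arc to this daughter".
print(one_exists('subj', None, None))    # one arc: satisfied
print(one_exists('subj', 'obj', None))   # two arcs: violated
print(at_least_one(None, None, 'det'))   # some arc exists: satisfied
```

The early-exit loop in the original is equivalent; the counting form is just more compact for a sketch.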
-
-def no_cross(inside, outside):
-    """Would the outside arc cross the inside arc? Crossing is only possible
-    if outside or inside is null (or inside allows non-projectivity).
-
-    @param inside:  arc label
-    @type  inside:  string
-    @param outside: label for an arc that would cross inside
-    @type  outside: string
-    @return:        whether the outside arc fails to cross the inside arc, or,
-                    if it does cross, whether this is legal
-    @rtype:         boolean
-    """
-    return not inside or not outside
-
-# ----------------------------------------------------------------------
-# Group predicates
-# ----------------------------------------------------------------------
-
-def group_entries_agree(entry_vars, group_entries):
-    """In group_entries, each subgroup of indices represents a single word in the group.
-    Variable values in entry_vars (not yet grouped) correspond to these indices.
-    There must either be no matches at all (the group doesn't apply to this sentence)
-    or there must be exactly one match in each subgroup (the group appears once in the sentence).
-
-    @param entry_vars:    values of entry_vars for nodes potentially in group
-    @type  entry_vars:    list of ints
-    @param group_entries: for each word in group, list of entry indices for each node
-    @type  group_entries: list of lists of ints
-    @return:              whether the group entry constraint is satisfied
-    @rtype:               boolean
-    """
-    # Combine the entry_vars with corresponding group_entry indices
-    entry_var_i = 0
-    var_entries = []
-    for group_entry_sublist in group_entries:
-        var_entry_sublist = []
-        for entry in group_entry_sublist:
-            var_entry_sublist.append((entry_vars[entry_var_i], entry))
-            entry_var_i += 1
-        var_entries.append(var_entry_sublist)
-    # Count up the number of entry, group_entry matches for each word in the group
-    n_group_entries = [len(filter(lambda x: x[0] == x[1], sublist)) for sublist in var_entries]
-    # Either all are 0 or all are 1
-    if all([n == 1 for n in n_group_entries]) or all([n == 0 for n in n_group_entries]):
-        return True
-    return False
-
-def group_entries_agreeC(group_entries):
-    """Return a function of entry variables that tells whether the group entry constraint is satisfied.
-
-    @param group_entries: for each word in group, list of entry indices for each node
-    @type  group_entries: list of lists of ints
-    @return:              function that checks the constraint
-    @rtype:               function; # of arguments is the number of entry variables for the group
-    """
-    return lambda *variables: group_entries_agree(variables, group_entries)
-
-def group_arc_agree(m_entry, m_g_entry, g_arc_label, daughters):
-    """Are mother-daughter arc constraints satisfied for a group?
-
-    Succeed if the mother entry does not match the mother group entry,
-    or if exactly one of the daughters has a matching entry index and is on the right arc.
-
-    @param m_entry:     mother entry index
-    @type  m_entry:     int
-    @param m_g_entry:   mother entry index for group
-    @type  m_g_entry:   int
-    @param g_arc_label: arc label for which constraint applies
-    @param daughters:   (daughter_entry, arc_label, daughter_group_entry) triples
-    @type  daughters:   list of triples of strings and ints
-    """
-    if m_entry == m_g_entry:
-        # Mother's entry matches the entry for this group
-        # Count the daughter matches
-        matches = 0
-        for d_entry, arc, d_g_entry in daughters:
-            if d_entry == d_g_entry:
-                # If the daughter's entry is its entry for the group,
-                # its arc from mother must have the right label
-                if arc != g_arc_label or matches:
-                    return False
-                else:
-                    matches = 1
-            elif arc == g_arc_label:
-                # If the arc matches but the daughter's entry is wrong, fail
-                return False
-        # There must be exactly one match
-        return matches == 1
-    # Succeed if the mother's entry fails to match its entry for the group
-    return True
-
-def group_arc_agreeC(entry_index, arc_label, d_entry_indices):
-    """Return a function that checks whether the group arc constraint is satisfied.
-
-    Function variables are the mother entry and alternating daughter entries
-    and corresponding mother->daughter arcs.
-
-    @param entry_index:     group entry index for mother node
-    @type  entry_index:     int
-    @param arc_label:       mother->daughter arc label for which constraint applies
-    @type  arc_label:       string
-    @param d_entry_indices: group entry index for each daughter
-    @type  d_entry_indices: list of ints
-    """
-    grouping = range(0, len(d_entry_indices)*2, 2)
-    grouping_indices = zip(grouping, d_entry_indices)
-    return lambda *variables: group_arc_agree(variables[0], entry_index, arc_label,
-                                              # Group the daughter variables in pairs,
-                                              # adding on the group entry indices
-                                              [variables[1:][x:x+2] + (y,) for x, y in grouping_indices])
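The pair-grouping at the end of group_arc_agreeC is the densest step: the flat variable tuple (mother entry first, then alternating daughter entry and arc values) is cut into per-daughter triples with each daughter's group entry index reattached. A toy illustration of just that slicing, with made-up index values:

```python
# Hypothetical values, just to show the slicing: one group entry index
# per daughter, and a flat tuple of (mother entry, d1 entry, d1 arc, ...).
d_entry_indices = [3, 7]
variables = (1, 3, 'obj', 7, 'det')

grouping = range(0, len(d_entry_indices) * 2, 2)   # start offset of each pair
triples = [tuple(variables[1:][x:x + 2]) + (y,)
           for x, y in zip(grouping, d_entry_indices)]
print(triples)  # [(3, 'obj', 3), (7, 'det', 7)]
```

Each triple is then consumed by group_arc_agree as (daughter_entry, arc_label, daughter_group_entry).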
-
-# ----------------------------------------------------------------------
-# Valency predicates
-# ----------------------------------------------------------------------
-
-def valency_constraint(arc_labels, entry_index, label, node, dim_abbrev, ins):
-    """Is the valency constraint satisfied?
-
-    @param arc_labels:  daughter or mother arc labels
-    @type  arc_labels:  list of strings
-    @param entry_index: index for the node's lexical entry
-    @type  entry_index: int
-    @param label:       label the constraint applies to
-    @type  label:       string
-    @param node:        node the constraint applies to
-    @type  node:        Node
-    @param dim_abbrev:  abbreviation for a dimension
-    @type  dim_abbrev:  string
-    @param ins:         whether the constraint applies to ins or outs of node
-    @type  ins:         boolean
-    @return:            whether the constraint is satisfied
-    @rtype:             boolean
-    """
-    # Node entry
-    entry = node.entries[entry_index]
-    entry_dim = entry.dims.get(dim_abbrev)
-    if not entry_dim:
-        return True
-    # Ins or outs constraint for label in entry
-    entry_constraint = entry_dim.ins.get(label, 0) if ins else entry_dim.outs.get(label, 0)
-    if entry_constraint == 0:
-        return none_equal(arc_labels, label)
-    elif entry_constraint == '!':
-        return one_equal(arc_labels, label)
-    elif entry_constraint == '?':
-        return one_equal(arc_labels, label, True)
-    # The '*' constraint always returns True
-    return True
-
-def valency_constraintC(label, node, dim_abbrev, ins):
-    """Curry valency_constraint over node entry var and (daughter or mother) arc label vars."""
-    return lambda *values: valency_constraint(values[1:], values[0], label, node, dim_abbrev, ins)
-
-def none_equal(values, value):
- """Are none of values equal to value?
-
- @param values: any objects
- @type values: sequence of objects
- @param value: any object
- @type value: object
- @return: whether value is not equal to any of values
- @rtype: boolean
- """
- for v in values:
- if v == value:
- return False
- return True
-
-def none_equalC(value):
- """Curry none_equal over values (arc labels)."""
- return lambda *values: none_equal(values, value)
-
-def one_equal(values, value, orzero=False):
- """Is one (or zero or one) of values equal to value?
-
- @param values: any objects
- @type values: sequence of objects
- @param value: any object
- @type value: object
- @param orzero: whether 0 occurrences of value in values is acceptable
- @type orzero: boolean
- @return: whether value is equal to 1 (or 0 or 1) of values
- @rtype: boolean
- """
- any_found = False
- for v in values:
- if v == value:
- if any_found:
- return False
- else:
- any_found = True
- return orzero or any_found
-
-def one_equalC(value):
- """Curry one_equal over values (arc labels), with orzero=False."""
- return lambda *values: one_equal(values, value)
-
-def zero_or_one_equalC(value):
- """Curry one_equal over values (arc labels) with orzero=True."""
- return lambda *values: one_equal(values, value, True)
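The "…C" naming convention introduced earlier (currying a predicate over its constraint variables) is easiest to see with these small predicates: the curried form closes over the fixed parameters and leaves a function of the variable values alone. A self-contained sketch:

```python
# Standalone re-implementations, for illustration of the currying pattern.
def none_equal(values, value):
    """No element of values equals value."""
    return all(v != value for v in values)

def none_equalC(value):
    # Close over value; the solver calls the result with only the
    # arc-label variable values.
    return lambda *values: none_equal(values, value)

no_obj = none_equalC('obj')
print(no_obj('subj', None, 'adv'))  # True: no 'obj' arc
print(no_obj('subj', 'obj'))        # False
```

The same pattern produces one_equalC and zero_or_one_equalC from one_equal.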
-
-# ----------------------------------------------------------------------
-# Order predicates
-# ----------------------------------------------------------------------
-
-def before_all(labels, index, indices, label):
-    """Are all of indices with label after index?
-
-    @param labels:  arc labels
-    @type  labels:  list of strings
-    @param index:   node index
-    @type  index:   int
-    @param indices: child indices
-    @type  indices: list of ints
-    @param label:   arc label
-    @type  label:   string
-    """
-    children = [c for c, v in zip(indices, labels) if v == label]
-    return not children or index < min(children)
-
-def before_allC(index, indices, label):
-    """Curry before_all over arc labels."""
-    return lambda *labels: before_all(labels, index, indices, label)
-
-def all_before(labels, index, indices, label):
-    """Are all of indices with label before index?
-
-    @param labels:  arc labels
-    @type  labels:  list of strings
-    @param index:   node index
-    @type  index:   int
-    @param indices: child indices
-    @type  indices: list of ints
-    @param label:   arc label
-    @type  label:   string
-    """
-    children = [c for c, v in zip(indices, labels) if v == label]
-    return not children or index > max(children)
-
-def all_beforeC(index, indices, label):
-    """Curry all_before over arc labels."""
-    return lambda *labels: all_before(labels, index, indices, label)
-
-def all_before_all(labels, indices, label1, label2):
-    """Are all of indices with label1 before all indices with label2?
-
-    @param labels:  arc labels
-    @type  labels:  list of strings
-    @param indices: child indices
-    @type  indices: list of ints
-    @param label1:  arc label
-    @type  label1:  string
-    @param label2:  arc label
-    @type  label2:  string
-    """
-    ind_val = zip(indices, labels)
-    children1 = [c for c, v in ind_val if v == label1]
-    if not children1:
-        return True
-    children2 = [c for c, v in ind_val if v == label2]
-    if not children2:
-        return True
-    return max(children1) < min(children2)
-
-def all_before_allC(indices, label1, label2):
-    """Curry all_before_all over arc labels."""
-    return lambda *labels: all_before_all(labels, indices, label1, label2)
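A self-contained sketch of the ordering check: given the daughter indices and the labels on their arcs, every daughter reached via label1 must precede every daughter reached via label2 (the name matches the predicate above, but this is an independent re-implementation):

```python
def all_before_all(labels, indices, label1, label2):
    """All indices whose arc carries label1 precede all with label2."""
    pairs = list(zip(indices, labels))
    c1 = [i for i, v in pairs if v == label1]
    c2 = [i for i, v in pairs if v == label2]
    # Vacuously satisfied when either label is absent
    if not c1 or not c2:
        return True
    return max(c1) < min(c2)

# Daughters at sentence positions 0, 2, 4 with these arc labels:
print(all_before_all(('det', 'adj', None), (0, 2, 4), 'det', 'adj'))  # True
print(all_before_all(('adj', 'det', None), (0, 2, 4), 'det', 'adj'))  # False
```

This corresponds to an order pair like ('det', 'adj') in a lexical entry, e.g. the CN class in the en.py lexicon later in this diff.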
-
-def order_constraint(arc_labels, node, dim_abbrev, entry_index):
-    """Are the order constraints in the mother node's entry satisfied by arc_labels?
-
-    @param arc_labels:  labels for children arcs
-    @type  arc_labels:  list of strings
-    @param node:        mother node for which the constraint is implemented
-    @type  node:        Node
-    @param dim_abbrev:  abbreviation for a dimension
-    @type  dim_abbrev:  string
-    @param entry_index: index of lexical entry for node
-    @type  entry_index: int
-    @return:            whether the order constraint is satisfied
-    @rtype:             boolean
-    """
-    entry = node.entries[entry_index]
-    entry_dim = entry.dims.get(dim_abbrev)
-    if not entry_dim:
-        return True
-    daughters = node.vars[dim_abbrev].get('var_daughters', [])
-    index = node.index
-    # All ordering constraints must be satisfied
-    for label1, label2 in entry_dim.order:
-        if label1 == '^':
-            # Second label must follow this node
-            if not before_all(arc_labels, index, daughters, label2):
-                return False
-        elif label2 == '^':
-            # First label must precede this node
-            if not all_before(arc_labels, index, daughters, label1):
-                return False
-        elif not all_before_all(arc_labels, daughters, label1, label2):
-            return False
-    return True
-
-def order_constraintC(node, dim_abbrev):
-    """Curry order_constraint over children arc vars and node entry_var."""
-    return lambda *variables: order_constraint(variables[1:], node, dim_abbrev, variables[0])
-
-# ----------------------------------------------------------------------
-# Agreement predicates
-# ----------------------------------------------------------------------
-
-def feat_agree(agr1, agr2):
-    """Do the two agreement tuples 'agree'?
-
-    For now, they should be equal (if both are non-null). Later, use unification.
-
-    @param agr1: ordered feature values
-    @type  agr1: list of strings or ints
-    @param agr2: ordered feature values
-    @type  agr2: list of strings or ints
-    """
-    return not agr1 or not agr2 or (agr1 == agr2)
-
-def agree_daughters(arc_labels, daughters, dim_abbrev, agr_label, agr,
-                    daugh_entry_indices=[], daugh_agr_indices=[],
-                    mult_agrs=True):
-    """Do all daughters on arcs with agr_label match agr?
-
***The diff for this file has been truncated for email.***
=======================================
--- /generation/en.py Fri Nov 27 11:00:30 2009 UTC
+++ /dev/null
@@ -1,153 +0,0 @@
-# English greatly simplified
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# 2009.10.31
-# Added groups: 'had an argument', 'BREAK the ice'
-
-from lex import *
-
-### A language
-ENGLISH = Language('English', 'en',
- has_mult_agrs=False,
- morph_processing=False,
- labels={
- 'syn': ['subj', 'obj', 'adv', 'det', 'adj'],
- 'sem': ['arg1', 'arg2', 'del', 'mod']
- #'synsem': ['arg1','arg2']
- })
-
-### A simple, primitive lexicon for English
-ENGLISH.set_lexicon(
- Lexicon(
-        {# Some lexical classes for POS
-         'V':  # Doubles as intransitive verb (but should that enforce 'subj': 0 ?)
-             Lex(gram='V',
-                 dims={'syn': LexDim(ins={'root': '!'},
-                                     outs={'subj': '!', 'adv': '*'},
-                                     agree=['subj'],
-                                     order=[('subj', '^'), ('^', 'adv')]),
-                       'sem': LexDim(ins={'root': '!'},
-                                     outs={'arg1': '!'})}),
-         'V_T': Lex(gram='V_T',
-                    classes=['V'],
-                    dims={'syn': LexDim(outs={'obj': '!'},
-                                        order=[('^', 'obj'), ('obj', 'adv')]),
-                          'sem': LexDim(outs={'arg2': '!'}),
-                          'synsem': LexDim(arg={'arg1': ['subj'], 'arg2': ['obj']})}),
-         'V_TI': Lex(gram='V_TI',
-                     classes=['V'],
-                     dims={'syn': LexDim(outs={'obj': '?'},
-                                         order=[('^', 'obj'), ('obj', 'adv')]),
-                           'sem': LexDim(outs={'arg2': '?'})}),
-         'V_3SG': Lex(gram='V_3SG',
-                      dims={'syn': LexDim(agrs=[(3, 'sg')],
-                                          feats={'tns': 'pres'})}),
-         'V_BS': Lex(gram='V_BS',
-                     dims={'syn': LexDim(agrs=[(1,), (2,), (3, 'pl')],
-                                         feats={'tns': 'pres'})}),
-         'V_PS': Lex(gram='V_PS',
-                     dims={'syn': LexDim(feats={'tns': 'past'})}),
-         'N': Lex(gram='N',
-                  dims={'syn': LexDim(ins={'subj': '?', 'obj': '?'}),
-                        'sem': LexDim(ins={'arg1': '?', 'arg2': '?'})}),
-         'PN': Lex(gram='PN',
-                   classes=['N'],
-                   dims={'syn': LexDim(agrs=[(3, 'sg')])}),
-         'CN': Lex(gram='CN',
-                   classes=['N'],
-                   dims={'syn': LexDim(outs={'adj': '*', 'det': '?'},
-                                       order=[('det', 'adj'), ('adj', '^'), ('det', '^')]),
-                         'sem': LexDim(outs={'mod': '*'}),
-                         'synsem': LexDim(arg={'mod': ['adj']})}),
-         'N_PL': Lex(gram='N_PL',
-                     classes=['CN'],
-                     dims={'syn': LexDim(agrs=[(3, 'pl')])}),
-         'N_SG': Lex(gram='N_SG',
-                     classes=['CN'],
-                     dims={'syn': LexDim(agrs=[(3, 'sg')])}),
-         'N_MS': Lex(gram='N_MS',
-                     classes=['CN'],
-                     dims={'syn': LexDim(agrs=[(3, 'sg')])}),
-         'N_ADJ': Lex(gram='N_PL',
-                      classes=['N'],
-                      dims={'syn': LexDim(agrs=[(3, 'pl')],
-                                          outs={'det': '!'},
-                                          order=[('det', '^')])}),
-         'T_ADV': Lex(gram='T_ADV',
-                      dims={'syn': LexDim(ins={'adv': '!'}),
-                            # there should probably be other options
-                            'sem': LexDim(ins={'root': '?'})}),
-         'ADJ': Lex(gram='ADJ',
-                    dims={'syn': LexDim(ins={'adj': '!'}),
-                          'sem': LexDim(ins={'mod': '!'})}),
-
-         'DET': Lex(gram='DET',
-                    dims={'syn': LexDim(ins={'det': '!'}),
-                          'sem': LexDim(ins={'del': '!'})}),
-
-         # Lexeme entries
-
-         'EAT': Lex(lexeme='EAT', classes=['V_TI'],
-                    dims={'synsem': LexDim(arg={'arg1': ['subj'], 'arg2': ['obj']})}),
-         'BREAK': [Lex(lexeme='BREAK', classes=['V_T']),
-                   Lex(lexeme='BREAK', classes=['V'],
-                       dims={'synsem': LexDim(arg={'arg1': ['subj']})}),
-                   Lex(lexeme='BREAK', classes=['V_T'],
-                       gid='g2',
-                       dims={'syn': LexDim(groupouts={'obj': 'ice'})})],
-
-         # Word lexical entries
-
-         'eats': [Lex(word='eats', classes=['EAT', 'V_3SG'])],
-         'eat': [Lex(word='eat', classes=['EAT', 'V_BS'])],
-         'break': [Lex(word='break', classes=['BREAK', 'V_BS'])],
-         'breaks': [Lex(word='breaks', classes=['BREAK', 'V_3SG'])],
-         'argue': [Lex(word='argue', classes=['V_BS', 'V'])],
-         'had': [Lex(word='had', classes=['V_PS', 'V_T']),
-                 Lex(word='had', classes=['V_PS', 'V_T'],
-                     gid='g1',
-                     dims={'syn': LexDim(groupouts={'obj': 'argument'})})],
-         'Mary': [Lex(word='Mary', classes=['PN'])],
-         'people': [Lex(word='people', classes=['N_PL'])],
-         'often': [Lex(word='often', classes=['T_ADV'])],
-         'yogurt': [Lex(word='yogurt', classes=['N_MS'])],
-         'ice': [Lex(word='ice', classes=['N_MS']),
-                 Lex(word='ice', classes=['N_MS'],
-                     gid='g2', gwords=3,
-                     dims={'syn': LexDim(groupouts={'det': 'the'})})],
-         'old': [Lex(word='old', classes=['N_ADJ']),
-                 Lex(word='old', classes=['ADJ'])],
-         'tall': [Lex(word='tall', classes=['ADJ'])],
-         'fresh': [Lex(word='fresh', classes=['ADJ'])],
-         'the': [Lex(word='the', classes=['DET']),
-                 Lex(word='the', classes=['DET'],
-                     gid='g2')],
-         'an': [Lex(word='an', classes=['DET']),
-                Lex(word='an', classes=['DET'],
-                    gid='g1')],
-         'man': [Lex(word='man', classes=['V_T', 'V_BS']),
-                 Lex(word='man', classes=['N_SG'])],
-         'girl': [Lex(word='girl', classes=['N_SG'])],
-         'army': [Lex(word='army', classes=['N_SG'])],
-         'argument': [Lex(word='argument', classes=['N_SG']),
-                      Lex(word='argument', classes=['N_SG'],
-                          gid='g1', gwords=3,
-                          dims={'syn': LexDim(groupouts={'det': 'an'})})],
-         'mans': [Lex(word='mans', classes=['V_T', 'V_3SG'])],
-         'boats': [Lex(word='boats', classes=['N_PL'])]}))
=======================================
--- /generation/jp-infl.html Sun Dec 6 01:04:53 2009 UTC
+++ /dev/null
@@ -1,277 +0,0 @@
-<html>
-<head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-</head>
-<body>
-<!--
-TODO:
-* move special polite under R-五段 ?
-* maybe rearrange as weak verbs somehow?
-* rearrange major rows of inflectional stuff, somehow...
-* add ~tai for expressing desire
-* add other common verb compounds? (simau, oku, verbs of giving...)
--->
-
-<table summary="TBD" frame="box" border="1" cellspacing="0">
-<caption>The Predicate System</caption>
-<tr>
- <th colspan="3" >Root Stability</th>
- <th colspan="1" >Weak</th>
- <th colspan="2" >Strong</th>
- <th colspan="4" >Weak</th>
- <th colspan="13">Strong (but with phonotactics)</th>
- <th colspan="1" >Weak</th>
- <th></th>
-</tr>
-<tr>
- <th colspan="3" >Part of Speech</th>
- <th colspan="19">Verb</th>
- <th colspan="2" >Copula</th>
- <th >Adjective</th>
-</tr>
-<tr>
- <th colspan="3" rowspan="3" valign="top">Conjugation Class</th>
- <th rowspan="2">二段</th>
- <th colspan="2" rowspan="2">一段</th>
- <th rowspan="2">カ変</th>
- <th colspan="3" >サ変</th>
- <th rowspan="3">special<br/>polite<br/>(~ar.u)</th>
- <th colspan="10" rowspan="2">五段</th>
- <th colspan="2" rowspan="2">丁寧語</th>
- <th rowspan="2"></th>
- <th rowspan="3">形容詞</th>
-</tr>
-<tr>
- <th>Modern</th>
- <th colspan="2">Classic</th>
-</tr>
-<tr>
- <th>得る</th>
- <th>上</th><th>下</th>
- <th>来る</th>
- <th colspan="2">S</th><th>Z</th>
- <th>R</th><th>T</th><th>W</th><th>(U)</th><th>B</th><th>M</th><th>N</th><th>G</th><th>K</th><th>S</th>
- <th>ます</th><th>です</th><th>だ</th>
-</tr>
-<tr>
- <th colspan="99" class="hr"></th>
-</tr>
-
-<tr>
- <th rowspan="20">Infl.</th>
- <th rowspan="9">imperfective†</th>
- <th>indicative (terminal)</th>
- <td>e-ru</td><td rowspan="2" colspan="6" style="color:green;">-ru</td><!-- but it's 愛っす、熱す;感じる、信じる so WTF? -->
- <td rowspan="2" colspan="13">-u</td>
- <td>da</td>
- <td>-i <-si></td>
-</tr>
-<tr>
- <th>indicative (attributive)</th>
- <td>u-ru</td>
- <td>na</td>
- <td>-i <-ki></td>
-</tr>
-<tr>
- <th rowspan="2">conditional<br/>(hypothetical stem)</th> <!-- Used for subjunctive reasons (the speaker's will, judgment, permission, view, order, or request; expresses regret or habitual with matrix predicate in past tense); also for "natural" or predictable consequence. -->
- <td>u-reba</td><td colspan="6" style="color:green;">-reba</td> <!-- but it's 愛せば;熱すれば、感ずれば、信ずれば so WTF? -->
- <td colspan="11">-eba</td>
- <td><mas-eba></td> <!-- Seems to only be with weak verbs 見ませば、しませば、、-->
- <td style="background-color: black;"> </td><td>nar-a(ba)</td>
- <td>-ker-eba</td>
-</tr>
-<tr>
- <!-- there's ◯◯(する)なら for addressing a topic introduced by someone else. The sentence preceded by "nara" often expresses the speaker's advice, suggestion, requirement, or judgment.-->
- <td>?</td><td colspan="19" style="color:green;">*u nara</td>
- <td style="background-color: black;"> </td>
- <td>?</td>
-</tr>
-<tr>
- <th>enumeration</th>
- <td>?</td><td colspan="19" style="color:green;">*u nari</td>
- <td>nar-i</td>
- <td>-i nari</td>
-</tr>
-<tr>
- <th>imperative</th>
- <td colspan="3" style="color:blue;">-ro</td>
- <td><span style="color:blue;">ko</span>-<span style="color:red;">i</span></td><td colspan="3" style="color:blue;">-ro</td>
- <td style="color: red;">[∅]i</td><td colspan="10">-e</td>
- <td><mas-e></td><td>?</td><td><nar-e></td>
- <td><-kar-e></td>
-</tr>
-<tr>
- <th rowspan="2">subjunctive<br/>(volitional)</th>
- <td colspan="7" style="color:purple;">-you</td>
- <td colspan="11">-ou</td>
- <td colspan="2">-you</td><td>dar-ou</td>
- <td><-kar-ou></td>
-</tr>
-<tr>
- <td>?</td><td colspan="4">-ru darou</td><td>?</td><td>?</td>
- <td colspan="11">-u darou</td>
- <td colspan="3" style="background-color: black;"> </td>
- <td>-i darou</td>
-</tr>
-<tr>
- <th>participle<br/>(continuative)</th>
- <td>?</td><td colspan="6" style="color:purple;">-∅</td>
- <td>[∅]i</td><td colspan="10">-i</td>
- <td colspan="2">?</td><td>ni</td>
- <td>-ku</td>
-</tr>
-<tr>
- <th colspan="99" class="hr"></th>
-</tr>
-
-<tr>
- <th rowspan="5">perfective</th>
- <th>indicative</th>
- <td colspan="7" style="color:purple;">-ta</td>
- <td colspan="4">[t]ta</td><td>-uta</td><td colspan="3">[n]da</td><td>[∅]ida</td><td>[∅]ita</td>
- <td colspan="3">-ita</td><td>dat-ta</td>
- <td>-kat-ta</td>
-</tr>
-<tr>
- <th>conditional<br/>(provisional)</th>
- <td colspan="22" style="color:purple;">*ta (na)ra</td>
-</tr>
-<tr>
- <th>enumeration</th>
- <td colspan="22" style="color:purple;">*ta (na)ri</td>
-</tr>
-<tr>
- <th>subjunctive</th>
- <td colspan="18" style="color:purple;">*ta darou</td>
- <td colspan="2" style="background-color: black;"> </td>
- <td colspan="2">*ta darou</td>
-</tr>
-<tr>
- <th>participle<br/>(gerundive)</th>
- <td colspan="7" style="color:purple;">-te</td>
- <td colspan="4">[t]te</td><td>-ute</td><td colspan="3">[n]de</td><td>[∅]ide</td><td>[∅]ite</td><td>-ite</td>
- <td colspan="2"><-ite></td><td>de</td>
- <td>-ku-te</td>
-</tr>
-<tr>
- <th colspan="99" class="hr"></th>
-</tr>
-
-<tr>
- <th rowspan="4">irrealis</th>
- <th>indicative</th>
- <td></td><td colspan="2"></td>
- <td></td><td colspan="3"></td>
- <td colspan="11"></td>
- <td>mas-en</td><td></td><td></td>
- <td></td>
-</tr>
-<tr>
- <th>imperative</th>
- <td>?</td><td colspan="2">-ru na</td>
- <td>?</td><td colspan="2">?</td><td>?</td>
- <td colspan="11">-u na</td>
- <td colspan="2">?</td><td>?</td>
- <td>?</td>
-</tr>
-<tr>
- <th>subjunctive</th>
- <td>?</td><td colspan="2">(-ru) mai</td>
- <td>?</td><td colspan="2">?</td><td>?</td>
- <td colspan="12">-u mai</td>
- <td>?</td><td>?</td>
- <td>?</td>
-</tr>
-<tr>
- <th>participle</th>
- <td>?</td><td colspan="6" style="color:orange;">-zu</td>
- <td colspan="3">-a-zu</td><td colspan="2">-wa-zu</td><td colspan="6">-a-zu</td>
- <td>?</td><td>?</td><td>?</td>
- <td>?</td>
-</tr>
-<!-- negative past participle of 五段/一段 is negative present +"de" (-naide rather than -nakute, I believe)-->
-<tr>
- <th colspan="99" class="hr"></th>
-</tr>
-
-<tr>
- <th rowspan="5">Deriv.</th>
- <th>形容詞</th>
- <th>negative</th>
- <td colspan="7" style="color:blue;">-na.i</td>
- <td colspan="3">-a-na.i</td><td colspan="2">-wa-na.i</td><td colspan="6">-a-na.i</td>
- <td style="background-color: black;"> </td><td style="color: red;">de (ha) na.i desu<br/>de (ha) ar-i-mas-en</td><td style="color:red;">de (ha) na.i</td>
- <td>-ku na.i</td>
-</tr>
-<tr>
- <th>丁寧語</th>
- <th>distal</th>
- <td colspan="7" style="color:purple;">-mas.u</td>
- <td>[∅]i-mas.u</td><td colspan="10">-i-mas.u</td>
- <td colspan="2" style="background-color: black;"> </td><td>des.u</td>
- <td>-i desu</td>
-</tr>
-<!-- also see -u/-ta+desyou for present/past distal volitional, and similar-->
-<tr>
- <th rowspan="3">一段</th>
- <th>causative</th>
- <td colspan="4" style="color:orange;">-sase.ru</td>
- <td style="color:red;">sase.ru</td><td><span style="color:orange;">-se</span>-sase.ru</td><td><span style="color:orange;">-ze</span>-sase.ru</td>
- <td colspan="3">-a-se.ru</td><td colspan="2">-wa-se.ru</td><td colspan="6">-a-se.ru</td>
- <td rowspan="4" colspan="4" style="background-color: black;"> </td>
-</tr>
-<tr>
- <th>passive</th>
- <td colspan="4" style="color:orange;">-rare.ru</td>
- <td style="color:red;">sare.ru</td><td><span style="color:orange;">-se</span>-rare.ru</td><td><span style="color:orange;">-ze</span>-rare.ru</td>
- <td colspan="3">-a-re.ru</td><td colspan="2">-wa-re.ru</td><td colspan="6">-a-re.ru</td>
-</tr>
-<tr>
- <th>potential</th>
- <td>?</td><td colspan="3" style="color:orange;">-(ra)re.ru</td>
- <td style="color:red;">deki.ru</td><td><span style="color:orange;">-se</span>-rare.ru</td><td><span style="color:orange;">-ze</span>-rare.ru</td>
- <td colspan="11">-e.ru</td>
-</tr>
-</table>
-
-<ul>
- <li><> — indicates rare/historical forms</li>
- <li>[] — indicates stem changes</li>
- <li>() — indicates optionally/regularly elided</li>
- <li><span style="color:red;">red</span> — irregular/suppletive</li>
- <li>カ変/サ変/二段 roots</li>
- <ul>
- <li><span style="color:green;">green</span> — ku-/su-/zu-/*</li>
- <li><span style="color:blue;">blue</span> — ko-/si-/ji-/e-</li>
- <li><span style="color:purple;">purple</span> — ki-/si-/ji-/e-</li>
- <li><span style="color:orange;">orange</span> — ko-/se-/ze-/e- (Classic-S only differs from Modern-S in these derivational forms)</li>
- </ul>
- <li>† — N.B. the 未然形 stem for negative&volitional is often called "imperfective"</li>
-</ul>
-
-<h3>Examples</h3>
-<dl>
- <dt>二段</dt><dd>得る「うる/える」 (the only remaining one)</dd>
- <dt>カ変</dt><dd>来る「くる」 (the only one)</dd>
- <dt>Modern S-サ変</dt><dd>する (the only one)</dd>
- <dt>Classic S-サ変</dt><dd>愛す「あいす」、熱す「ねっす」 (others?)</dd>
- <dt>Classic Z-サ変</dt><dd>感じる「かんじる」、信じる「しんじる」
(others?)</dd>
- <dt>Special Polite</dt><dd>*ござる、下さる「くださる」、いらっしゃる、なさる、仰る「おっしゃる」 (the only five)</dd>
- <dt>U-五段</dt><dd>乞う「こう」、恋う「こう」、問う「とう」 (others?)</dd>
-</dl>
-
-<h3>Miscellaneous Irregularities</h3>
-<dl>
- <dt>下一段</dt>
- <dd>呉れる「くれる」 imperative is くれ (not the expected くれろ)</dd>
- <dt>Special Polite</dt>
- <dd>Many optional irregular perfective forms: いらした/~て (for いらっしゃった/~て), 管(?)すった/~て (for 下さった/~て)</dd>
- <dt>R-五段</dt>
- <dd>ある negative is ない (not the expected あらない), but あらぬ/あらず
are still used</dd>
- <dt>K-五段</dt>
- <dd>行く「いく」 perfective forms are いった、いって (not the expected いいた、いいて)</dd>
- <dt>形容詞</dt>
- <dd>いい has root よ- for all other forms</dd>
-</dl>
-</body>
-</html>
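The K-五段 perfective cells above (`[∅]ita`/`[∅]ite`) and the 行く irregularity listed under "Miscellaneous Irregularities" can be illustrated with a toy conjugator. This is a hypothetical sketch of a few table cells only; the function and table names are invented and were not part of the deleted generation code.

```python
# Toy sketch of two cells from the table above: K-row godan (五段)
# perfective forms, plus the iku irregularity (itta/itte rather than
# the expected iita/iite).  Names here are hypothetical.
K_GODAN_PERFECTIVE = {'indicative': 'ita', 'gerundive': 'ite'}

def conjugate_k_godan(dict_form, inflection):
    """Conjugate a K-row godan verb given in dictionary form (ending in -ku)."""
    assert dict_form.endswith('ku')
    if dict_form == 'iku':  # irregular, per "Miscellaneous Irregularities"
        return {'indicative': 'itta', 'gerundive': 'itte'}[inflection]
    root = dict_form[:-2]   # strip the -ku ending
    return root + K_GODAN_PERFECTIVE[inflection]
```

For example, 書く (kaku) yields kaita/kaite, while 行く (iku) takes the irregular itta/itte.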
=======================================
--- /generation/jp.py Tue Dec 15 09:11:08 2009 UTC
+++ /dev/null
@@ -1,392 +0,0 @@
-#!/usr/bin/env python
-# -*- coding: UTF-8 -*-
-# A primitive lexicon for tokenized Japanese
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-from lex import *
-
-# N.B. the syn dimension has the tree_principle so every node can have at most one parent. Thus we can simplify situations like {(a|b|c)!} by using {a?,b?,c?}
-#
-# However, the sem dimension only has the mother_principle, so we can't use that trick there. And even for syn, we can't use that trick for outs.
-
-# I'm not sure if と is really a complementizer (when used with, e.g., 言う), but we use that for lack of a better name for the case.
-JAPANESE = Language('Japanese', 'jp',
- has_mult_agrs=True,
- morph_processing=False,
- labels={
- 'syn': ['top', 'adv', 'ADJ', 'rel',
- 'N', 'NO', 'NA', 'subj', 'subj2', 'obj', 'dat', 'gen', 'comp'],
- 'sem': ['arg1', 'arg2', 'arg3', 'del', 'mod'],
- 'synsem': []})
-
-JAPANESE.set_lexicon(Lexicon({
-##### Some lexical classes for POS
-### Predicates
-# BUG: we can't break out the 'outs' to a superclass to share them among all options of PRED. It doesn't get inherited. This is evidenced by sentences with topics.
- 'PRED': [
- Lex(gram='PRED',
- # TODO: sentential particles, but only if root (or quote)
- dims={
- 'syn': LexDim(
- ins={'root':'?'},
- outs={'top':'*', 'adv':'*'},
- order=[('top','^'), ('top','adv'), ('adv','^')]),
- 'sem': LexDim(
- ins={'root':'?'}),
- 'synsem': LexDim()}),
- # TODO: this should probably impose the 'attributive' form for morphological generation
- Lex(gram='PRED',
- dims={
- 'syn': LexDim(
- ins={'rel':'?'},
- outs={'top':'*', 'adv':'*'},
- order=[('top','^'), ('top','adv'), ('adv','^')]),
- 'sem': LexDim(
- ins={'mod':'?'}),
- 'synsem': LexDim(
- arg={'mod': ['rel']})})],
-#__ Predicates with subjects (i.e. not the copula)
- 'PRED_SUBJ': Lex(gram='PRED_SUBJ', classes=['PRED'],
- dims={
- 'syn': LexDim(
- outs={'subj':'?'},
- order=[('top','subj'), ('subj','^')]),
- 'sem': LexDim(
- outs={'arg1':'?'}),
- 'synsem': LexDim(
- arg={'arg1': ['subj']})}),
-#__ Predicates with two subjects
- 'PRED_AFFECTIVE': Lex(gram='PRED_AFFECTIVE', classes=['PRED_SUBJ'],
- dims={
- 'syn': LexDim(
- outs={'subj2':'?'},
- order=[('top','subj2'), ('subj','subj2'), ('subj2','^')]),
- 'sem': LexDim(
- outs={'arg2':'?'}),
- 'synsem': LexDim(
- arg={'arg2': ['subj2']})}),
-
-### 動詞 -- Verbal predicates
- 'V': Lex(gram='V', classes=['PRED_SUBJ']),
-#__ Operational transitive verbs
- 'V_T': Lex(gram='V_T', classes=['V'],
- dims={
- 'syn': LexDim(
- outs={'obj':'?'},
- order=[('top','obj'), ('obj','^')]),
- 'sem': LexDim(
- outs={'arg2':'?'}),
- 'synsem': LexDim(
- arg={'arg2': ['obj']})}),
-#__ Indirect transitive verbs (can be combined with V_T or V_A)
- 'V_I': Lex(gram='V_I', classes=['V'],
- dims={
- 'syn': LexDim(
- outs={'dat':'?'},
- order=[('top','dat'), ('dat','^')]),
- 'sem': LexDim(
- outs={'arg3':'?'}),
- 'synsem': LexDim(
- arg={'arg3': ['dat']})}),
-#__ Affective transitive verbs
- 'V_A': Lex(gram='V_A', classes=['V','PRED_AFFECTIVE']),
-
-
-### 形容詞 -- Adjectives
- 'ADJ': Lex(gram='ADJ', classes=['PRED_SUBJ'],
- dims={'syn': LexDim(
- ins={'ADJ':'?'}, # cf. tree_principle
- order=[('ADJ','^')])}),
-#__ Transitive adjectives
- 'ADJ_A': Lex(gram='ADJ_A', classes=['ADJ', 'PRED_AFFECTIVE']),
-
-
-### 名詞や形容動詞や -- Nouns etc.
-# This is the generic noun super-class; it supports neither NA nor NO modifiers.
- 'N': Lex(gram='N',
- dims={
- 'syn': LexDim(
- ins={'N': '?'},
- outs={'rel':'?', 'gen':'?', 'comp':'?'},
- order=[('rel','^'), ('gen','^'), ('comp','^'), ('^','N')]),
- 'sem': LexDim(
- ins={'del':'!'}, # BUG: HACK!
- outs={'mod':'?'}),
- 'synsem': LexDim(
- arg={'mod': ['rel', 'gen', 'comp']})}), # BUG: doesn't seem to work
-#__ 名詞 -- no-nouns (so not including "no", "kono", "sono",... "kore", "sore",...)
- 'N_NO': Lex(gram='N_NO', classes=['N'],
- dims={'syn': LexDim(ins={'NO':'?'})}),
-#__ 形容動詞 -- na-nouns
- 'N_NA': Lex(gram='N_NA', classes=['N'],
- dims={'syn': LexDim(ins={'NA':'?'})}),
-
-### Agreement features
-# BUG: these don't work quite right
- '+ANIM': Lex(gram='+ANIM', dims={'sem': LexDim(agrs=[{'anim': '+'}])}),
- '-ANIM': Lex(gram='-ANIM', dims={'sem': LexDim(agrs=[{'anim': '-'}])}),
-
-### Topic/Focus particles
- 'TOP': Lex(gram='TOP',
- dims={
- 'syn': LexDim(ins={'top':'!'}, order=[('^','top')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()}),
- 'ha': [
- Lex(word='ha', classes=['TOP'],
- dims={'syn': LexDim(outs={'N':'!'}, order=[('N','^')])}),
- Lex(word='ha', classes=['TOP'],
- dims={'syn': LexDim(outs={'dat':'!'}, order=[('dat','^')])}),
- Lex(word='ha', classes=['TOP'],
- dims={'syn': LexDim(outs={'comp':'!'}, order=[('comp','^')])}),
- Lex(word='ha', classes=['TOP'],
- dims={'syn': LexDim(outs={'adv':'!'}, order=[('adv','^')])})],
- 'mo': [
- Lex(word='mo', classes=['TOP'],
- dims={'syn': LexDim(outs={'N':'!'}, order=[('N','^')])}),
- Lex(word='mo', classes=['TOP'],
- dims={'syn': LexDim(outs={'dat':'!'}, order=[('dat','^')])}),
- Lex(word='mo', classes=['TOP'],
- dims={'syn': LexDim(outs={'comp':'!'}, order=[('comp','^')])}),
- Lex(word='mo', classes=['TOP'],
- dims={'syn': LexDim(outs={'adv':'!'}, order=[('adv','^')])})],
-
-### Nominal particles (and homonyms)
- 'N_PART': Lex(gram='N_PART',
- dims={
- 'syn': LexDim(outs={'N':'!'}, order=[('N','^')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()}),
- 'ga': [
- Lex(word='ga', classes=['N_PART'],
- dims={'syn': LexDim(
- ins={'subj':'?', 'subj2':'?'}, # cf. tree_principle
- order=[('^','subj'), ('^','subj2')])}),
- # TODO: The sentential particle
- Lex(word='ga', classes=['PRED_PART'])],
- 'wo': [Lex(word='wo', classes=['N_PART'],
- dims={'syn': LexDim(ins={'obj':'!'}, order=[('^','obj')])})],
- 'ni': [
- # TODO: ordering constraint dative vs <形容動詞>に adverbs (but not other advs)
- # The dative (are N_NA really prohibited, or just unlikely?)
- Lex(word='ni',
- dims={
- 'syn': LexDim(
- outs={'NO':'!'},
- ins={'dat':'!'},
- order=[('NO','^'), ('^','dat')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()}),
- # The adverbial
- Lex(word='ni',
- dims={
- 'syn': LexDim(
- outs={'NA':'!'},
- ins={'adv':'!'},
- order=[('NA','^'), ('^','adv')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()})],
- 'de': [
- Lex(word='de', classes=['N_PART'],
- dims={'syn': LexDim(ins={'adv':'!'}, order=[('^','adv')])})
- # TODO: the copula gerund
- ],
- 'no': [
- # The genitive
- Lex(word='no',
- dims={
- 'syn': LexDim(
- outs={'NO':'!'},
- ins={'gen':'!'},
- order=[('NO','^'), ('^','gen')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()}),
- # The fused noun
- Lex(word='no',
- dims={
- 'syn': LexDim(
- ins={'N': '!'},
- outs={'NO':'!'},
- order=[('NO','^'), ('^','N')]),
- 'sem': LexDim(
- ins={'del': '!'}, # BUG! HACK!
- outs={'mod':'!'}),
- 'synsem': LexDim(
- arg={'mod': ['NO']}) }),
- # The unfused noun (TODO: also extended predicate)
- Lex(word='no',
- dims={
- 'syn': LexDim(
- ins={'N': '!'},
- outs={'rel':'!'},
- order=[('rel','^'), ('^','N')]),
- 'sem': LexDim(
- ins={'del': '!'}, # BUG! HACK!
- outs={'mod':'!'}),
- 'synsem': LexDim(
- arg={'mod': ['rel']}) })],
- 'to': [Lex(word='to', classes=['N_PART'],
- dims={'syn': LexDim(ins={'comp':'!'}, order=[('^','comp')])})],
-
-### Copula
-# TODO
- 'na': [
- Lex(word='na',
- dims={
- 'syn': LexDim(
- outs={'NA':'!'},
- ins={'gen':'!'},
- order=[('NA','^'), ('^','gen')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()})
- # TODO: attributive copula version (must ensure the caller is a
noun)
- ],
- 'da': [Lex(word='da', classes=['PRED'],
- # BUG: shouldn't be used for attributive position
- dims={'syn': LexDim(outs={'N':'!'}, order=[('N','^')])})],
- 'desu': [
- Lex(word='desu', classes=['PRED'],
- dims={'syn': LexDim(outs={'N':'!'}, order=[('N','^')])}),
- Lex(word='desu', classes=['PRED'],
- dims={'syn': LexDim(outs={'ADJ':'!'}, order=[('ADJ','^')])})],
-
-### Sentential particles
-# TODO
- 'PRED_PART': Lex(gram='PRED_PART',
- dims={
- 'syn': LexDim(),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()}),
- 'ka': [Lex(word='ka', classes=['PRED_PART'])],
- 'ne': [Lex(word='ne', classes=['PRED_PART'])],
- 'yo': [Lex(word='yo', classes=['PRED_PART'])],
- 'wa': [Lex(word='wa', classes=['PRED_PART'])],
- #'ga': <see nominal particles above>,
-
-
-### Word lexical entries
- 'iku': [Lex(word='iku', classes=['V_I'])],
- 'kau': [Lex(word='kau', classes=['V_T'])],
- 'suru': [Lex(word='suru', classes=['V_T'])],
- 'nomu': [Lex(word='nomu', classes=['V_T'])],
- 'taberu': [Lex(word='taberu', classes=['V_T'])],
- 'wakaru': [Lex(word='wakaru', classes=['V_A'])],
- 'dekiru': [Lex(word='dekiru', classes=['V_A'])],
- # TODO: auxiliary version of aru
- 'aru': [Lex(word='aru', classes=['V_A'],
- dims={'sem': LexDim(
- # BUG: this doesn't work just because we don't construct arg1,arg2 correctly!
- agrs=[{'anim': '+', 'anim2': '-'}],
- agree=[('anim','arg1','anim'), ('anim2','arg2','anim')])})],
- # TODO: the other iru
- 'iru': [Lex(word='iru', classes=['V_A'],
- dims={'sem': LexDim(
- agrs=[{'anim': '+', 'anim2': '-'}],
- agree=[('anim','arg1','anim'), ('anim2','arg2','anim')])})],
- 'atarasi': [Lex(word='atarasi', classes=['ADJ'])],
- 'aoi': [Lex(word='aoi', classes=['ADJ'])],
- 'kirei': [Lex(word='kirei', classes=['N_NA'])],
- 'benri': [Lex(word='benri', classes=['N_NA'])],
- 'kore': [Lex(word='kore', classes=['N', '-ANIM'])],
- 'sore': [Lex(word='sore', classes=['N', '-ANIM'])],
- 'are': [Lex(word='are', classes=['N', '-ANIM'])],
- 'dore': [Lex(word='dore', classes=['N', '-ANIM'])],
- 'kono': [Lex(word='kono',
- dims={
- 'syn': LexDim(ins={'gen':'!'}, order=[('^','gen')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()})],
- 'sono': [Lex(word='sono',
- dims={
- 'syn': LexDim(ins={'gen':'!'}, order=[('^','gen')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()})],
- 'ano': [Lex(word='ano',
- dims={
- 'syn': LexDim(ins={'gen':'!'}, order=[('^','gen')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()})],
- 'dono': [Lex(word='dono',
- dims={
- 'syn': LexDim(ins={'gen':'!'}, order=[('^','gen')]),
- 'sem': LexDim(ins={'del':'!'}),
- 'synsem': LexDim()})],
- 'dare': [Lex(word='dare', classes=['N_NO', '+ANIM'])],
- 'watasi': [Lex(word='watasi', classes=['N_NO', '+ANIM'])],
- 'neko': [Lex(word='neko', classes=['N_NO', '+ANIM'])],
- 'eego': [Lex(word='eego', classes=['N_NO', '-ANIM'])],
- 'kuruma': [Lex(word='kuruma', classes=['N_NO', '-ANIM'])],
- 'zassi': [Lex(word='zassi', classes=['N_NO', '-ANIM'])],
- 'koohii': [Lex(word='koohii', classes=['N_NO', '-ANIM'])],
- 'sakana': [Lex(word='sakana', classes=['N_NO', '-ANIM'])],
- }))
-
-# ----------------------------------------------------------------------
-# Some test sentences
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- from l3 import XDGProblem
- GRAMMATICAL = [
- "wakaru",
- "watasi ga wakaru",
- "watasi ha wakaru",
- "sore ga wakaru",
- "sore ha wakaru",
- "kore to sore desu",
- "watasi no kuruma desu",
- "kirei na kuruma desu",
- "atarasi no desu",
- "atarasi kuruma desu",
- "kuruma ga atarasi desu",
- "kuruma de iku",
- "kuruma ni iku",
- "kirei ni iku",
- "neko ga sakana wo taberu",
- "sakana wo neko ga taberu",
- "neko ha sakana wo taberu",
- "sakana ha neko ga taberu",
- "watasi ga eego ga dekiru",
- "watasi ha eego ga dekiru", # BUG: need to kill subj
- "eego ha watasi ga dekiru", # BUG: need to kill subj2
- "watasi ha eego ha dekiru",
- "eego ha watasi ha dekiru",
- "neko ga iru", # TODO: prefer animacy
- "zassi ga aru", # BUG: animacy constraint doesn't work (kill subj)
- "sono kuruma desu"]
- UNGRAMMATICAL = [
- "watasi ga ha wakaru",
- "koohii wo ha nomu",
- "watasi na kuruma desu",
- "kirei no kuruma desu",
- "kirei kuruma desu",
- "kuruma wo watasi ha kau",
- "eego ga watasi ha dekiru",
- "zassi ga iru", # BUG: animacy constraints don't work
- "sore no kuruma desu",
- "sono no kuruma desu"]
-
- def test_all(dimensions=None):
- print 'TESTING GRAMMATICAL SENTENCES'
- for sentence in GRAMMATICAL:
- XDGProblem(language=JAPANESE, sentence=sentence).solve(dimensions=dimensions, verbose=1)
- print '\n\nTESTING UNGRAMMATICAL SENTENCES'
- for sentence in UNGRAMMATICAL:
- XDGProblem(language=JAPANESE, sentence=sentence).solve(dimensions=dimensions, verbose=1)
-
- test_all()
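The valency marks used throughout these entries ('!' obligatory, '?' optional, '*' any number) read as cardinality constraints on arcs of a given label. A minimal sketch of that interpretation follows; the function name is hypothetical and does not come from the deleted modules.

```python
def valency_ok(mark, count):
    """Check an arc count against an XDG-style valency mark:
    '!' = exactly one arc, '?' = at most one, '*' = any number."""
    if mark == '!':
        return count == 1
    if mark == '?':
        return count <= 1
    if mark == '*':
        return count >= 0
    raise ValueError('unknown valency mark: %r' % mark)
```

Under this reading, an entry like `outs={'obj':'?'}` permits zero or one outgoing `obj` arc, while `outs={'N':'!'}` demands exactly one.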
=======================================
--- /generation/l3.py Fri Nov 27 10:54:58 2009 UTC
+++ /dev/null
@@ -1,754 +0,0 @@
-# Cross-linguistic components of XDG, implemented with python_constraint.
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# Michael Gasser <gas...@cs.indiana.edu>
-# 2009.09.12
-#
-# 2009.09.19
-# -- Added way to handle words lacking lexical entries
-# (but this should probably assign them to some lexical category
-# so we don't get the unknown word behaving like an adjective and a verb).
-# -- Added grouping of principles within a dimension to make solving more
-# efficient.
-# For Syntax, tree, order, and valency principles may be enough to
-# converge on a single solution, or at least to constrain the variables
-# a lot.
-# -- Add hierarchy and inheritance in Lexicon
-#
-# 2009.10.23
-# -- Added multiple dimensions (dimensions now in separate module)
-#
-# 2009.10.25
-# -- Sentences can now be input as strings (without EOS punctuation).
-#
-# 2009.10.31
-# -- Representation of groups
-# -- Group constraints
-# -- Pretty-printing of solutions
-# -- Constraints are all created on problem init, rather than in solve()
-#
-# TODO
-# 1 Agreement should really be with unification.
-# Use same simplified NLTK unification as with morphology processing.
-# 2 Generation principles.
-# 3 Subclasses of XDGConstraint for principles.
-
-import time
-from en import ENGLISH
-from xdg_constraint import Problem, XDGConstraint
-from dimension import ArcDimension, SynSem, Syntax, Semantics, agr_match_entry, group_entries_agreeC
-
-### ----------------------------------------------------------------------
-### Problems
-### ----------------------------------------------------------------------
-
-class XDGProblem(Problem):
- """Class for an XDG constraint satisfaction problem.
-
- Example:
- >>> problem = XDGProblem(sentence='the man eats the yogurt')
- >>> problem.solve()
-
- SOLUTION 0
- ...
-
- @param solver: a CS solver, defaults to BacktrackingSolver
- @type solver: Solver
- @param sentence: list of words or string with spaces separating words
- @type sentence: list or string
- @param dimensions: dimensions that are part of this problem
- @type dimensions: list of Dimension subclasses
- @param parsing: whether this is a parsing (vs. a generation) problem
- (only parsing supported so far)
- @type parsing: boolean
- """
- def __init__(self, solver=None, sentence=[],
- dimensions = [(SynSem, Syntax, Semantics)],
- language=ENGLISH, parsing=True):
- Problem.__init__(self, solver=solver)
- # Create a dummy variable to use when a variable is needed but
- # only one value is possible for it
- self.dummy_var = '?dummy'
- self.addVariable('?dummy', [0])
- self.language = language
- # If sentence is a string, split it at spaces and add EOS marker
- if sentence and isinstance(sentence, list):
- self.sentence = sentence
- else:
- self.sentence = sentence.split() + ['.']
- self.nodes = []
- # For each tuple, instantiate arc dimensions (all but first) and
- # then interface dimension (first)
- self.dimensions = []
- for dims in dimensions:
- arc_dim2, arc_dim1 = [dim(language, self) for dim in dims[1:]]
- if_dim = dims[0](language, self, arc_dim1, arc_dim2)
- self.dimensions.extend([arc_dim1, arc_dim2, if_dim])
- # self.dimensions = [dimension(language, self) for dimension in dimensions]
- self.parsing = parsing
- self.lexicon = language.lexicon if self.language else None
- # Variables not specific to dimensions (entry)
- self.variables = {}
- # Variables specific to particular dimensions; make a sub-dictionary
- # for each one
- self.dim_vars = dict([(dim.abbrev, {}) for dim in self.dimensions])
- # Create Nodes for the sentence
- self.create_nodes()
- # Lexicalize the nodes and find groups
- self.groups = self.lexicalize()
- # Create variables for each possible arc
- self.create_arc_variables()
- # Implement principles (creating constraints) for each dimension
- for dimension in self.dimensions:
- for principle in dimension.get_principles():
- principle()
-## for constraint, vars in self._constraints:
-## print constraint, vars
-
- def create_nodes(self):
- """Create a node for each word in sentence."""
- self.nodes = []
- for index, word in enumerate(self.sentence):
- self.nodes.append(Node(word=word, index=index))
-
- def lexicalize(self):
- """Lexicalize each node and return a list of groups found."""
- if not self.lexicon:
- raise ValueError, 'No lexicon stored!'
- # Keep track of groups found during lexicalization
- groups = {}
- # For all nodes except the end-of-sentence node
- for node in self.nodes[:-1]:
- # Find lexical entries, including groups
- new_groups = node.lexicalize1(self.lexicon, self.dimensions)
- for gid, (gwords, lex) in new_groups.items():
- if gid in groups:
- # Check to see if the Lex is the same for another Node
- # (we have to check the id because different lexes could be clones of the same
- # original entry)
- matching_group_lexs = filter(lambda l: lex.id == l.id, groups[gid]['lex'])
- if matching_group_lexs:
- # If so, append this to the list of nodes there
- groups[gid]['lex'][matching_group_lexs[0]].append(node)
- else:
- # If not, start a new list of nodes with this node
- groups[gid]['lex'][lex] = [node]
- if gwords:
- # Number of words found is not 0; replace old value
- groups[gid]['gwords'] = gwords
- else:
- # This group is new; initialize it with lex, node, and gwords
- groups[gid] = {}
- groups[gid]['lex'] = {}
- groups[gid]['gwords'] = gwords
- groups[gid]['lex'][lex] = [node]
- # Check whether each group found has all of its words
- for gid, gdict in groups.items():
- gwords = gdict.get('gwords', 0)
- if not gwords or len(gdict['lex']) != gwords:
- # Not enough words for this group; get rid of it
- del groups[gid]
- # For each group that survives, add it to the entries of the nodes in it,
- # and record the group vars needed for group constraints
- group_vars = []
- for gid, gdict in groups.items():
- group1_vars = [gid]
- for lex, nodes in gdict['lex'].items():
- group1_lex_vars = []
- for node in nodes:
- # Index of the group entry for this node
- group_entry = len(node.entries)
- # Store this node along with the entry index
- group1_lex_vars.append([node, group_entry, lex])
- node.entries.append(lex)
- group1_vars.append(group1_lex_vars)
- # Add the vars for this group to the list
- group_vars.append(group1_vars)
- for node in self.nodes[:-1]:
- # Finish lexicalization: entry and agr variables
- node.lexicalize2(self.dimensions)
- # Create entry variable; this applies to all dimensions
- if node.n_entries <= 1:
- # Set node's entry var to dummy if there's only 1 entry
- node.entry_var = self.dummy_var
- else:
- # Otherwise add the node's entry var to variables
- self.addVariable(node.entry_var, range(node.n_entries))
- # Store the entry var in variables dict with (index, 'entry') as key
- self.variables[node.index] = node.entry_var
- # Other node variables are specific to particular dimensions
- for dimension in self.dimensions:
- dim_abbrev = dimension.abbrev
- # Node var dict for this dimension
- dim_vars = node.vars[dim_abbrev]
- # Only create mother and daughter vars for ArcDimensions
- if isinstance(dimension, ArcDimension):
- dim_vars['mother_vars'] = []
- dim_vars['daughter_vars'] = []
- dim_vars['var_daughters'] = []
- # Agreement: only for Syntax
- if isinstance(dimension, Syntax):
- if 'agr' not in self.dim_vars[dim_abbrev]:
- self.dim_vars[dim_abbrev]['agr'] = {}
- if 'agr_var' in dim_vars:
- agr_var = dim_vars.get('agr_var')
- # The node already has an agr var for this
- # dimension: values are indices for agr dicts
- self.addVariable(agr_var, range(dim_vars['max_agrs']))
- # Store the agr var in variables dict with
- # (index, dim_abbrev, 'agr') as key
- self.dim_vars[dim_abbrev]['agr'][node.index] = agr_var
- else:
- # There is only 1 agr for the node; use dummy var for this
- dim_vars['agr_var'] = self.dummy_var
- # Constrain the agr index to be less than
- # the lengths of agrs in particular entries
- if node.n_entries > 1 and dim_vars['max_agrs'] > 1:
- name = str(node.index) + ':' + dim_abbrev + ':Agr~Entry'
- self.addConstraint(XDGConstraint(agr_match_entry(node, dim_abbrev), name=name),
- [dim_vars['agr_var'], node.entry_var])
- for dimension in self.dimensions:
- # Make variable dicts for EOS node too
- EOS_vars = {}
- self.nodes[-1].vars[dimension.abbrev] = EOS_vars
- if isinstance(dimension, ArcDimension):
- EOS_vars['mother_vars'] = []
- EOS_vars['daughter_vars'] = []
- EOS_vars['var_daughters'] = []
- # Create the group constraints for each group
- for group in group_vars:
- self.group_entry_constraint(group[0], group[1:], groups)
- return groups
-
- def group_entry_constraint(self, gid, nodes, groups):
- """Create the constraint that requires the same entry for all
nodes belonging to group.
-
- Also add to self.groups the dict that simplifies group_arc_principle.
-
- @param gid: Group id
- @type gid: string
- @param nodes: lists of node, entry-index pairs for each group node/word
- @type nodes: lists of lists of Node, int pairs
-
- """
- # Entry variables for the constraint
- variables = []
- # List of lists of group entries for each node
- group_entries = []
- # List of lists of variable, entry pairs
- variable_entries = []
- for node_ls in nodes:
- group_entry_sublist = []
- var_entry_sublist = []
- for node, index, lex in node_ls:
- entry_var = node.entry_var
- variables.append(entry_var)
- group_entry_sublist.append(index)
- var_entry_sublist.append((node, entry_var, index))
- group_entries.append(group_entry_sublist)
- variable_entries.append([lex, var_entry_sublist])
- # Store this in the dict that will become self.groups: "noun_lex_var_entry"
- # It's needed for group arc principle in syntax and semantics
- groups[gid]['n_l_v_e'] = variable_entries
- # Exactly one node in each sublist must take the group entry if
any does
-
self.addConstraint(XDGConstraint(group_entries_agreeC(group_entries),
name=gid + ':Group'),
- variables)
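An aside for readers skimming this removed module: `group_entries_agreeC` is defined elsewhere in the file, so only its contract is visible here, in the comment "exactly one node in each sublist must take the group entry if any does". A hypothetical standalone reconstruction of a predicate with that contract — assuming assigned values arrive in the same order the entry variables were collected — might look like this (illustrative names, not the deleted implementation):

```python
def group_entries_agree(group_entries):
    """Build a predicate over flattened entry-index assignments.

    group_entries is a list of sublists; each sublist holds, per node,
    the entry index that would select that node's group entry.  The
    predicate accepts an assignment iff either no node selects its group
    entry, or every sublist has exactly one node doing so.
    Hypothetical reconstruction based only on the comment in the
    deleted code, not the original group_entries_agreeC.
    """
    def check(*assigned):
        pos = 0
        any_group = False
        matches_per_sublist = []
        for sublist in group_entries:
            vals = assigned[pos:pos + len(sublist)]
            pos += len(sublist)
            # Count nodes in this sublist whose assigned entry index
            # is their group-entry index
            n_matches = sum(1 for v, idx in zip(vals, sublist) if v == idx)
            matches_per_sublist.append(n_matches)
            any_group = any_group or n_matches > 0
        if not any_group:
            return True
        return all(n == 1 for n in matches_per_sublist)
    return check
```

With two sublists `[[0, 1], [2]]`, the assignment `(0, 5, 2)` satisfies the predicate (one match per sublist), `(5, 5, 5)` satisfies it vacuously, and `(0, 1, 2)` fails because the first sublist has two matches.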
-
- def create_arc_variables(self):
- """Create variables for arc labels."""
- # Create separate arcs for each dimension that has them
- for dimension in self.dimensions:
- # Only if this is an ArcDimension
- if isinstance(dimension, ArcDimension):
- dim_abbrev = dimension.abbrev
- dim_vars = self.dim_vars[dim_abbrev]
- # Initialize dicts for storing vars in problem dimension
- if 'arc_vars' not in dim_vars:
- dim_vars['arc_vars'] = {}
- if 'arc_daughs' not in dim_vars:
- dim_vars['arc_daughs'] = {}
- # Check here for whether dimension has arcs
- # For all nodes except the end-of-sentence node
- for index, node1 in enumerate(self.nodes[:-1]):
- index1 = node1.index
- str1 = str(index1)
- # Variables for this dimension in node1
- vars1 = node1.vars[dim_abbrev]
- # Outs and ins for node1 on dimension
- outs1 = vars1.get('outs', [])
- ins1 = vars1.get('ins', [])
- haslex1 = node1.n_entries > 0
- # Create arc variables in both directions to other nodes
- for node2 in self.nodes[index+1:-1]:
- vars2 = node2.vars[dim_abbrev]
- haslex2 = node2.n_entries > 0
- outs2 = vars2.get('outs', [])
- ins2 = vars2.get('ins', [])
- index2 = node2.index
- str2 = str(index2)
- # Intersections of ins and outs of node1 and node2
- if haslex1 and haslex2:
- outs2ins1 = outs2 & ins1
- outs1ins2 = outs1 & ins2
- elif haslex1:
- outs2ins1 = ins1 - set(['root'])
- outs1ins2 = outs1
- else:
- outs2ins1 = outs2
- outs1ins2 = ins2 - set(['root'])
- # Arc into node1, only if outs2 & ins1 is not empty
- if outs2ins1:
- # String name for the arc variable
- var2 = dim_abbrev + ':' + str2 + '->' + str1
- # Values constrained to be labels in the in-out intersection
- self.addVariable(var2, list(outs2ins1) + [None])
- # Store the arc var in variables dict with index pair as key
- # and vice versa
- dim_vars['arc_daughs'][(index2, index1)] = var2
- dim_vars['arc_vars'][var2] = (index2, index1)
- # Add the variable to mother and daughter var lists in nodes
- vars1.get('mother_vars').append(var2)
- vars2.get('daughter_vars').append(var2)
- # Add daughter index to var_daughters list in mother
- vars2.get('var_daughters').append(index1)
- # Arc out of node1, only if ins2 & outs1 is not empty
- if outs1ins2:
- # String name for the arc variable
- var1 = dim_abbrev + ':' + str1 + '->' + str2
- # Values constrained to be labels in the in-out intersection
- self.addVariable(var1, list(outs1ins2) + [None])
- # Store the arc var in variables dict with index pair as key
- # and vice versa
- dim_vars['arc_daughs'][(index1, index2)] = var1
- dim_vars['arc_vars'][var1] = (index1, index2)
- # Add the variable to mother and daughter var lists in nodes
- vars1.get('daughter_vars').append(var1)
- vars2.get('mother_vars').append(var1)
- # Add daughter index to var_daughters list in mother
- vars1.get('var_daughters').append(index2)
- # Create arcs from end-of-sentence node
- EOS = self.nodes[-1]
- EOS_index = EOS.index
- str2 = str(EOS_index)
- EOS_vars = EOS.vars[dim_abbrev]
- # String name for arc from EOS to other node
- var = dim_abbrev + ':' + str2 + '->' + str1
- # Only possible arc label is 'root'
- domain = [None]
- if 'root' in ins1:
- domain.append('root')
- # Del arcs possible in Semantics (later also Syntax?)
- if 'del' in ins1:
- domain.append('del')
- self.addVariable(var, domain)
- # Add the variable to mother and daughter var lists in nodes
- dim_vars['arc_daughs'][(EOS_index, index1)] = var
- dim_vars['arc_vars'][var] = (EOS_index, index1)
- vars1.get('mother_vars').append(var)
- EOS_vars.get('daughter_vars').append(var)
- # For consistency, add daughter index to var_daughters list in EOS
- EOS_vars.get('var_daughters').append(index1)
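The scheme in `create_arc_variables` above — an arc variable named `dim:i->j` exists only when the out labels of node i intersect the in labels of node j, and its domain is that intersection plus `None` for "no arc" — can be summarized in a small standalone sketch. `arc_domains` and its dict-of-label-sets arguments are illustrative names, not part of the deleted code:

```python
def arc_domains(dim, outs, ins):
    """Compute arc-variable domains for a dimension.

    outs/ins map node index -> set of possible out/in arc labels.
    For every ordered pair (i, j), i != j, a variable 'dim:i->j' is
    created only when outs[i] & ins[j] is non-empty; its domain is
    that label intersection plus None (meaning no arc), mirroring
    the addVariable calls in the deleted create_arc_variables.
    """
    domains = {}
    for i in outs:
        for j in ins:
            if i == j:
                continue
            labels = outs[i] & ins[j]
            if labels:
                name = '%s:%d->%d' % (dim, i, j)
                domains[name] = sorted(labels) + [None]
    return domains
```

For example, with node 1 offering out labels `{'subj', 'root'}` and node 2 accepting in label `{'subj'}`, only `syn:1->2` is created, with domain `['subj', None]`.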
-
-## def solutions_constrain_vars(self, solutions):
-## """Use intermediate solutions to constrain variable domains."""
-## # Initialize new variable dict (with sets as values)
-## variables = dict([(var, set()) for var in self._variables.keys()])
-## for solution in solutions:
-## for var, domain in solution.iteritems():
-## variables[var].add(domain)
-## # Re-assign variables
-## for var in self._variables:
-## self._variables[var] = Domain(list(variables[var]))
-##
-## def any_unconstrained_var(self):
-## """Does any variable have more than one value in its domain?"""
-## return any([len(x) > 1 for x in self._variables.values()])
-##
-## def constrain_and_solve(self, dimension, principle_iter, verbose=0):
-## """
-## Make constraints and find solutions for the next group of principles
-## for one dimension.
-## """
-## try:
-## # Get the next group of dimensions if there is one
-## principles = principle_iter.next()
-## # Create the constraints for each of the dimensions.
-## for principle in principles:
-## principle()
-## # Run the solver and return the solutions
-## return self.getSolutions(verbose=verbose)
-##
-## except StopIteration:
-## return None
-
-## def solve_dim(self, dimension, verbose=0):
-## """
-## Find solutions for a single dimension, implementing constraints
-## and solving by principle groups.
-## """
-## solutions = []
-## for index, principles in enumerate(dimension.get_principles_iter()):
-## print 'GROUP', index, 'PRINCIPLES'
-## # Start over with new constraints
-## self._constraints = []
-## # Implement constraints for each principle in group
-## for principle in principles:
-## principle()
-## # Find solutions, given these principles
-## solutions = self.getSolutions(verbose=verbose)
-## # Constrain variable domains based on solutions
-## self.solutions_constrain_vars(solutions)
-## return solutions
-
- def solve(self, dimensions=None, verbose=0):
- """Find solutions to problem and pretty-print them.
- @param dimensions: dimensions to solve for
- @type dimensions: list of dimension abbrevs (strings)
- @param verbose: whether to print out verbose messages
- @type verbose: int -- 0: terse, 1: summary msg, 2: lots of msgs
- """
- if dimensions:
- dimensions = [dim for dim in self.dimensions if dim.abbrev in dimensions]
- else:
- dimensions = self.dimensions
- print '\nSOLVING', self.sentence
- XDGConstraint.calls = 0
- if verbose:
- t1 = time.time()
- solutions = self.getSolutions(verbose=verbose)
- # Record time here to avoid timing printing
- if verbose:
- time_diff = time.time() - t1
- if len(solutions) > 10:
- print '\nFound', len(solutions), 'solutions'
- else:
- self.print_solutions(solutions)
- if verbose:
- print
- print 'Time: %0.3f ms' % (time_diff * 1000.0,)
- print 'Calls:', XDGConstraint.calls
- return solutions
-
- ### Pretty printing solutions
-
- def print_solutions(self, solutions):
- """
- Pretty-print solutions.
-
- @param solutions: problem solutions
- @type solutions: list of solution dicts
- """
- if not solutions:
- print 'NO SOLUTION FOUND'
- for index, solution in enumerate(solutions):
-# if len(solutions) > 1:
- print '\n+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++'
- print '+ SOLUTION', index, ' +'
- print '+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++'
- self.print_solution(solution)
-
- def print_solution(self, solution):
- """Pretty-print solution.
-
- @param solution: problem solution
- @type solution: solution dict
- """
- nodes_str, positions = self.nodes_string()
- # Dimension-independent variables (entry)
- if self.variables:
- entries = [solution.get(self.variables.get(i, None)) for i in range(len(self.nodes)-1)]
- print
- print self.node_var_string('?entry', entries, positions)
- print nodes_str
- # Dimension-specific variables
- for dim, variables in self.dim_vars.iteritems():
- if variables:
- print '\n_______________________ DIMENSION', dim.upper(), '_______________________'
- # Agr variables if there are any
- agr_values = variables.get('agr', None)
- if agr_values:
- values = [solution.get(agr_values.get(i, None)) for i in range(len(self.nodes)-1)]
- print self.node_var_string('?agr', values, positions)
- print nodes_str
- for (src, dest), var in variables['arc_daughs'].iteritems():
- if isinstance(src, int) and isinstance(dest, int):
- val = solution.get(var)
- if val != None and var != self.dummy_var:
- print self.arc_string(val, src, dest, positions)
-
- def node_var_string(self, var, values, positions):
- '''Return a string with values for variable var written in positions.
-
- @param var: variable name
- @type var: string
- @param values: values for variable for each node in solution
- @type values: list of values or None
- @param positions: positions of nodes in string representation
- @type positions: list of ints
- @return string beginning with var containing values for var where appropriate
- @rtype string
- '''
- string = var
- curr_pos = len(string) - 1
- for value, position in zip(values, positions):
- if value != None:
- string += ' ' * (position - curr_pos - 1) + str(value)
- else:
- string += ' ' * (position - curr_pos)
- curr_pos = position
- return string
-
- def nodes_string(self):
- '''Return a string that distributes the words in the sentence evenly.
-
- @return nodes string and positions of centers of strings
- @rtype tuple: (string, list of ints)
- '''
- width = max([len(node.word) for node in self.nodes]) + 3
- word1 = self.nodes[0].word
- position = len(word1)/2 + 6
- string = ' ' + word1 + ' ' * ((width - len(word1))/2)
- positions = [position]
- for node in self.nodes[1:]:
- word = node.word
- word_len = len(word)
- len_diff = width - word_len
- s = word.ljust(word_len + len_diff/2)
- string += s.rjust(width)
- position += width
- positions.append(position)
- return string, positions
-
- def arc_string(self, label, start, end, positions):
- """Return a string representing an arc with label start to end position in positions.
-
- @param label: arc label
- @type label: string
- @param start: node index for beginning of arc
- @type start: int
- @param end: node index for end of arc
- @type end: int
- @param positions: centers of nodes in string
- @type positions: list of ints
- @return string representing arc
- @rtype string
- """
- width = abs(positions[end] - positions[start])
- direction = end > start
- return self.arc_string1(label, width, direction, min([positions[start], positions[end]]))
-
- def arc_string1(self, label, width, direction, start=0):
- """Return a string representing an arc with label and width in direction starting from start.
-
- @param label: arc label
- @type label: string
- @param width: width of arc string
- @type width: int
- @param direction: direction of arc
- @type direction: boolean (True: right, False: left)
- @param start: position of left end of arc
- @type start: int
- @return string representing arc
- @rtype string
- """
- left = ' ' * start
- label_len = len(label)
- shaft = '-' * ((width - label_len) / 2)
- if direction:
- string = left + shaft + label + shaft + '>'
- return string.ljust(width, '-')
- else:
- string = left + '<' + shaft + label + shaft
- return string.rjust(width, '-')
-
-### ----------------------------------------------------------------------
-### CONSTRAINTS
-### ----------------------------------------------------------------------
-
-### ----------------------------------------------------------------------
-### Nodes
-### ----------------------------------------------------------------------
-
-class Node:
- """Class for nodes, the basic input to a constraint satisfaction problem."""
-
- def __init__(self, word='', index=0, node_set=[], entry_index=0, entries=[]):
- self.word = word
- self.index = index
- self.node_set = node_set
- self.entry_index = entry_index
- self.entries = entries or []
- # Dictionary of dimension-specific variables, each organized into
- # ins, outs, daughter_vars, mother_vars, var_daughters, and agr_var
- self.vars = {}
- # Variables for daughter arcs
- self.daughter_vars = []
- # Variables for mother arcs
- self.mother_vars = []
- # Daughter indices corresponding to daughter_vars
- self.var_daughters = []
- # Variable for index of lexical entry
- self.entry_var = None
- # Variable for agr dicts
- self.agr_var = None
-
- def lexicalize1(self, lexicon, dimensions):
- """Find lexical entries for node and return groups found.
-
- @param lexicon: the lexicon for the language
- @type lexicon: Lexicon
- @param dimensions: list of dimensions
- @type dimensions: list of instances of Dimension subclasses (e.g., Syntax)
- @return list of ids of groups found
- @rtype list of strings
- """
- # Keep track of group lexical entries by their gid
- groups = {}
- # Find all entries, single word and group
- entries = lexicon.get_lex(self.word)
- for entry in entries:
- # Clone the entry because it's going to be mutated during inheritance
- entry = entry.clone()
- # Any additional entries found from classes during inheritance
- add_entries = []
- # Any groups that are found
- new_groups = {}
- # This mutates the lexicon itself and accumulates new groups and other entries
- lexicon.inherit(entry, dimensions, groups=new_groups, add_entries=add_entries)
- if new_groups:
- # Group entries; don't add to self.entries yet
- groups.update(new_groups)
- if not entry.is_group():
- # The entry itself is not a group; add it to self.entries
- self.entries.append(entry)
- # Add any new entries found to the node
- self.entries.extend(add_entries)
- return groups
-
- def lexicalize2(self, dimensions):
- """Complete lexicalization of node: entry and agr variables."""
- self.n_entries = len(self.entries)
- if self.n_entries == 0:
- print '\nWord', self.word, 'not found in lexicon!'
- elif self.n_entries > 1:
- # Make a variable for the lexical entry index, common to all dimensions
- entry_var = str(self.index) + '_entry'
- self.entry_var = str(self.index) + '_entry'
- # For each dimension, initialize variable lists
- for dimension in dimensions:
- dim_abbrev = dimension.abbrev
- # Initialize the specs for this dimension
- dct = {}
- # Create a variable for agr if any entry has more than one possibility
- if isinstance(dimension, Syntax):
- dct['max_agrs'] = self.get_max_agr(dim_abbrev)
- if dct['max_agrs'] > 1:
- dct['agr_var'] = str(self.index) + '_agr'
- # For ArcDimensions, sets of all possible ins and outs for node (redundant)
- if isinstance(dimension, ArcDimension):
- dct['ins'] = self.get_ins(dim_abbrev)
- dct['outs'] = self.get_outs(dim_abbrev)
- # For IFDimensions, add an arg list?
- # Make this the vars dict for this dimension
- self.vars[dim_abbrev] = dct
-
- def __str__(self):
- return 'Node' + str(self.index)
-
- def get_entry(self, index=0):
- """Return the lexical entry with the given index."""
- if self.n_entries == 0:
- return None
- return self.entries[index]
-
- def get_max_agr(self, dim):
- """The length of the longest agrs list of dicts among entries for a given dimension.
- @param dim: dimension abbreviation, e.g., 'syn'
- @type dim: string
- """
- if self.n_entries == 0:
- return 0
- return max([(len(entry.get_dim(dim).agrs) if entry.get_dim(dim) else 0) \
- for entry in self.entries])
-
- def get_outs(self, dim):
- """Return the set of out arc labels from all entries for a given dimension.
- @param dim: dimension abbreviation, e.g., 'syn'
- @type dim: string
- """
- outs = set()
- for entry in self.entries:
- if entry.get_dim(dim):
- outs = outs.union(set(entry.get_dim(dim).outs.keys()))
- return outs
-
- def get_ins(self, dim):
- """Return the set of in arc labels from all entries for a given dimension.
- @param dim: dimension abbreviation, e.g., 'syn'
- @type dim: string
- """
- ins = set()
- for entry in self.entries:
- if entry.get_dim(dim):
- ins = ins.union(set(entry.get_dim(dim).ins.keys()))
- return ins
-
-
-# ----------------------------------------------------------------------
-# Some test sentences
-# ----------------------------------------------------------------------
-
-GRAMMATICAL = ['Mary eats yogurt',
- 'Mary eats',
- 'Mary eats yogurt often',
- 'people eat yogurt',
- 'tall people eat yogurt',
- 'the man eats old yogurt',
- 'the people had an argument',
- 'an army had an argument',
- 'the people argue',
- 'the girl breaks the ice',
- 'the old man the boats',
- 'the old man mans the boats',
- 'the old man mans tall boats',
- # John is not in the lexicon
- 'John eats yogurt',
- # raw is not in the lexicon
- 'Mary eats raw yogurt']
-
-UNGRAMMATICAL = ['people eats yogurt',
- 'Mary eat yogurt',
- 'people eat often yogurt',
- 'eats yogurt often',
- 'the old the boats']
-
-def test_all(dimensions=None):
- print 'TESTING GRAMMATICAL SENTENCES'
- for sentence in GRAMMATICAL:
- XDGProblem(sentence=sentence).solve(dimensions=dimensions, verbose=1)
- print '\n\nTESTING UNGRAMMATICAL SENTENCES'
- for sentence in UNGRAMMATICAL:
- XDGProblem(sentence=sentence).solve(dimensions=dimensions, verbose=1)
-
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
- print "\n"
- test_all()
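The ASCII arc renderer `arc_string1` near the end of the deleted module is easy to exercise on its own. The sketch below is a close adaptation of the code shown in the diff (integer division written as `//` so it also runs under Python 3), not new behaviour:

```python
def arc_string1(label, width, direction, start=0):
    """Render a labeled dependency arc of the given width.

    direction True draws a rightward arrow ('---label--->'),
    False a leftward one ('<---label---').
    Adapted from the deleted XDG pretty-printer; the only change
    is '/' -> '//' for Python 3 integer division.
    """
    left = ' ' * start
    # Shaft on each side of the label fills out the requested width
    shaft = '-' * ((width - len(label)) // 2)
    if direction:
        return (left + shaft + label + shaft + '>').ljust(width, '-')
    return (left + '<' + shaft + label + shaft).rjust(width, '-')
```

For example, `arc_string1('subj', 10, True)` yields `---subj--->` and `arc_string1('subj', 10, False)` yields `<---subj---`, which is how the solver's dependency arcs are drawn under the word line.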
=======================================
--- /generation/latex/HomeworkStyle.sty Fri Oct 23 07:34:39 2009 UTC
+++ /dev/null
@@ -1,151 +0,0 @@
-% ~~~~~ This file is an article style sheet.
-\NeedsTeXFormat{LaTeX2e}
-\ProvidesPackage{HomeworkStyle}[2009/09/06 wren's standard stylization]
-% Only changed headers/footers since 2007/10/07
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Adjust border spacing, must come before the header adjustments
-\RequirePackage{calc}
-
-%% STYLE: The \setlength syntax is LaTeX (good),
-%% the other way with or without "=" is TeX (not as good)
-\setlength{\textheight }{8.9in} % Height of body
-\setlength{\textwidth }{6.5in} % Width of body, headers, footers
-\setlength{\marginparwidth}{.75in}
-
-\setlength{ \topmargin }{0pt} % Margin between top and header
-\addtolength{\topmargin }{-\headheight}
-\addtolength{\topmargin }{-\headsep}
-\addtolength{\headheight }{3pt} % For the fancyhdr bars
-\setlength{ \headsep }{0.2in} % Separator between header and body
-
-\setlength{\oddsidemargin }{(\paperwidth-\textwidth)/2 - 1in}
-\setlength{\evensidemargin}{\oddsidemargin} % BUG: we're borked without this
-
-\setlength{\footskip}{\paperheight-\textheight-\headheight-\headsep
- -2\topmargin-2in}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Adjust other spacings
-
-\setlength{\parindent}{0.5in} % Individual files should unset this if desired
-\setlength{\parskip }{0.2in} % If a variable value is used, ala:
- % {1ex plus 0.5ex minus 0.2ex}
- % Then this should be moved to after the ToC if
- % there is one
-
-%% Line spacing: 1 is single (default), 1.3 is one and a half, 1.6 is double.
-%\linespread{1.6}
-
-%% Don't do double spaces after periods.
-%% (Remember to do "i.e.\ " etc if this is commented out)
-%\frenchspacing
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Other random counter adjustments
-
-%% Number subsubsections
-\setcounter{secnumdepth}{3}
-
-%% Show subsubsections in table of contents
-\setcounter{tocdepth}{3}
-
-%% Disallow page breaks at hyphens (this will give some underfull vbox's,
-%% so an alternative is to use \brokenpenalty=100 and manually search
-%% for and fix such page breaks)
-%% can't use \setcounter for some reason...
-\brokenpenalty=10000
-
-%% Try to ensure no stranded figures alone on a page
-\renewcommand{\topfraction}{0.85}
-\renewcommand{\textfraction}{0.1}
-\renewcommand{\floatpagefraction}{0.75}
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~ General packages
-
-%% Decent fonts
-\RequirePackage{amsmath,amssymb,amsthm,latexsym}
-\RequirePackage{pslatex} % OBSOLETE: Times, Helvetica, special narrow Courier
-\RequirePackage{mathptmx} % mathtimes
-
-%% Decent bibliographies
-\RequirePackage[sectionbib,sort]{natbib}
-\bibliographystyle{plainnat}
-\bibpunct{[}{]}{;}{a}{,}{,}
-\RequirePackage{url}
-% lsalike for natbib
-\newcommand{\quotecite}[2][]{\citeauthor*{#2}'s \citetext{\citealp[#1]{#2}}}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Define \subtitle, \theauthor, \thetitle commands for reusability
-\RequirePackage{entitlement} % local.
- % STYLE: better than \usepackage in *.{sty,cls}
-
-% The bug's not in entitlement.
-% If we comment the above (and use the below) it's still there
-%\newcommand{\theauthor}{$\langle$Insert Author Here$\rangle$}
-%\newcommand{\thetitle}{$\langle$Insert Title Here$\rangle$}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Adjust Headers and Footers
-
-\RequirePackage{fancyhdr} % This gives us decent header and footer control
-\RequirePackage{lastpage} % This gives us the last page, after enough re-runs
-\pagestyle{fancy}
-
-%% For 'plain' pages like the \maketitle and such
-%\fancypagestyle{plain}{
-% % put special \fancyhead[]{} and such in here
-%}
-
-%% For regular pages. Areas are {H,F}x{L,C,R}x{E,O}
-\fancyhead[L]{\thetitle}
-\fancyhead[C]{}
-\fancyhead[R]{\theauthor}
-\fancyfoot[L]{\today}
-\fancyfoot[C]{}
-\fancyfoot[R]{page \thepage\ of \pageref{LastPage}}
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~ Hyperlinks in PDF (Must come later than most packages)
-%% cf <http://www.tex.ac.uk/cgi-bin/texfaq2html?label=hyperdupdest>
-
-\RequirePackage[
- letterpaper,breaklinks,bookmarks,
- naturalnames,
- %
- colorlinks=true,
- linkcolor=blue,
- citecolor=blue,
- filecolor=blue,
- menucolor=blue,
- pagecolor=blue,
- urlcolor=blue,
- %
- citebordercolor={1 1 1},
- filebordercolor={1 1 1},
- linkbordercolor={1 1 1},
- menubordercolor={1 1 1},
- pagebordercolor={1 1 1},
- urlbordercolor={1 1 1},
- %
- pdftitle={\thetitle},
- pdfauthor={\theauthor},
- pdfcreator={LaTeX with hyperref},
- %pdfsubject={...},
- %pdfkeywords={..., ...}
- %
- % cf <http://www.tex.ac.uk/cgi-bin/texfaq2html?label=pdfpagelabels>
- plainpages=false,% Fix intermixing "ii" and "2" pages
- pdfpagelabels]% Print as "ii (4 of 40)"
- %
- {hyperref}
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fin.
=======================================
--- /generation/latex/Makefile Fri Oct 23 07:34:39 2009 UTC
+++ /dev/null
@@ -1,166 +0,0 @@
-# wren ng thornton, <wr...@cpan.org> ~ 2009.09.06
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# This GNU makefile should greatly ease building LaTeX stuff
-# cf <http://theory.uwinnipeg.ca/gnu/make/make_14.html#SEC13>
-#
-# N.B. Recursive $(MAKE) requires the caller be in the same directory
-# with this file, because there's no real way to pass down the -f
-# flag. So I've done my best to avoid recursive make.
-
-LHS2TEX = lhs2TeX
-LHSFLAGS =
-LATEX = pdflatex
-LATEXFLAGS = -file-line-error -halt-on-error
-CLEAN_SUFFIXES = aux,log,toc,lof,lot,bbl,blg,out,dvi,ps,ptb,xyc,cb,idx,ist,ilg,ind,glo,glg,gls
-EXCALIBUR = '/Applications/TeX/Excalibur (4.0.5)/Excalibur.app'
-RM = rm -f
-
-# First undefine all the rules for suffixes, then add ours in
-.SUFFIXES:
-.SUFFIXES: .lhs .tex .pdf
-# Just to make sure these are always run
-# (even if someone's named a file after them)
-.PHONY: all spellcheck clean realclean open
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# A pretty wrapper and usage printer
-# (the ifndef was hoisted out around the 'all' target to avoid recursive make)
-
-ifndef FILE
-all:
- @echo '*** Since each project is unique I can'\''t do it all'
- @echo ' You must pass a *.pdf goal or define a FILE when calling me'
- @echo ' If you'\''re using `pstricks'\'' then also set LATEX=latex'
- @echo
- @exit 1
-else
-all: $(basename $(FILE)).pdf
-endif
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# TODO: this needs better fleshing out
-
-%.tex: %.lhs
-ifeq ($(shell which '$(LHS2TEX)'),)
- @echo '*** Can'\''t find `$(LHS2TEX)'\''!'
- @echo
- @exit 1
-endif
-
- @if [ -e $*.tex ]; then \
- echo '*** Already found a .tex file.' ;\
- echo '*** Are you sure you want to do this?' ;\
- exit 1 ;\
- fi
-
- @echo ; echo ; echo '*** Run lhs2tex to get something to work with'
- $(LHS2TEX) $(LHSFLAGS) -o $*.tex $*.lhs
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# The real meat of this all
-%.pdf: %.tex
-ifeq ($(shell which '$(LATEX)'),)
- @echo '*** Can'\''t find `$(LATEX)'\''!'
- @echo
- @exit 1
-endif
-
- @echo ; echo ; echo '*** Run once just to start'
- @yes x | $(LATEX) $(LATEXFLAGS) $< >$*.firstRunErrorLog \
- || ( err=$$? ;\
- cat $*.firstRunErrorLog ;\
- rm $*.firstRunErrorLog ;\
- exit $$err )
- @rm $*.firstRunErrorLog
-
- @# Can't seem to get the $(shell) of grep to work, it's not getting the file
- @if grep '^[^%]*\\bibliography{' $< 2>&1 >/dev/null ; then \
- echo ; echo ; echo '*** BibTeX it' ;\
- bibtex $* ;\
- echo ; echo ; echo '*** Run again, for real this time' ;\
- yes x | $(LATEX) $(LATEXFLAGS) $< >/dev/null ;\
- fi
- @# Sometimes the above latex dies, needs to be uncommented
- @# We should make this into a flag for quietness.
-
- @# These files would be generated by \makeglossary
- @# (See also \makegloss and bibtex?
- @# <http://www.socher.org/index.php/Main/CompleteLatexThesisFramework>)
- @if [ -e $*.ist -a -e $*.glo ] ; then \
- echo ; echo ; echo '*** Make glossary' ;\
- makeindex $*.glo -s $*.ist -t $*.glg -o $*.gls ;\
- fi
-
- @# These ones are made by \makeindex
- @if [ -e $*.idx ] ; then \
- echo ; echo ; echo '*** Make index';\
- makeindex $* ;\
- fi
-
- @echo ; echo ; echo '*** Run a final time to get the references right'
- @# BUG: needs an extra time if index/glossary (or for some reason...)
- yes x | $(LATEX) $(LATEXFLAGS) $<
-
-ifeq ($(LATEX),latex) # BUG: if $(LATEX) contains but is not exactly...
- @echo ; echo ; echo '*** You used `latex'\'': Converting from dvi to pdf'
-
- @# N.B. the -o flag is needed to not send it to the printer!
- @# The -G0 is to fix ligature mangling for {fi},{fl}, etc
- @# cf <http://www.tex.ac.uk/cgi-bin/texfaq2html?label=charshift>
- dvips -Ppdf -G0 -o $*.ps $*.dvi
-
- @# cf <http://www.cs.toronto.edu/~murray/compnotes/latex.html>
- ps2pdf \
- -dAutoFilterColorImages=false \
- -sColorImageFilter=FlateEncode \
- -sPAPERSIZE=a4 \
- $*.ps
-endif
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# Run Excalibur (without needing to type out the damn path)
-# N.B. Doesn't play nice with Literate Haskell
-spellcheck:
-ifdef FILE
- open -a $(EXCALIBUR) $(basename $(FILE)).tex
-else
- @echo '*** Spellcheck what?! Define FILE first.'
- @echo
- @exit 1
-endif
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# Run `open` (without needing to type out the file name again)
-open:
-ifdef FILE
- @echo ; echo ; echo '*** Opening'
- open $(basename $(FILE)).pdf
-else
- @echo '*** Open what?! Define FILE first.'
- @echo
- @exit 1
-endif
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# Remove all the auxiliary files
-clean:
-ifdef FILE
- @echo ; echo ; echo '*** Cleaning'
- $(RM) $(basename $(FILE)).{$(CLEAN_SUFFIXES)}
-else
- @echo '*** I won'\''t try that! Define FILE first.'
- @echo
- @exit 1
-endif
-
-
-# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-# Also remove the pdf
-realclean: clean
- $(RM) $(basename $(FILE)).pdf
=======================================
--- /generation/latex/entitlement.sty Fri Oct 23 07:34:39 2009 UTC
+++ /dev/null
@@ -1,112 +0,0 @@
-% ~~~~~ Because we're entitled to have things work!
-%% This file is an article style sheet.
-\NeedsTeXFormat{LaTeX2e}
-\ProvidesPackage{entitlement}[2009/10/11 Better preamble commands and access to internal variables]
-
-
-%% Much of this is TeX which is bad style, but I don't know how
-%% else to do preambling
-%% N.B. This all requires being in a real .sty and can't be \input{}ed
-%% otherwise you'll get weird errors about \@
-%%
-%% Much of this is modified from:
-%% <http://tug.org/mail-archives/texhax/2007-June/008649.html>
-%% cf also: <http://www.math.nagoya-u.ac.jp/en/journal/manual-02.html>
-%% and also: <http://64.233.169.104/search?q=cache:_IvpCpWo3NAJ:www.cmis.csiro.au/ismm2002/submission/kapproc.tex>
-
-%% BUG: \today and \date don't seem to play together nicely anymore.
-%% What went wrong? In \maketitle it doesn't get reformatted,
-%% in fancyhdr it really is today! We don't touch them and so
-%% I'm not sure if it's our fault or someone else's, but I'd
-%% guess it's probably ours.
-%% Update: Apparently it's not us. It's not \maketitle or fancyhdr
-%% either, WTF?
-%% At least this makes it consistent, though it does remove the
-%% auto-reformatting
-\gdef\the@date{\today}% default to \today, before we override \today
-\def\date#1{%
- \gdef\the@date{#1}%
- \gdef\today{#1}% At least redefine \today to be consistent with \maketitle
-}
-%% The version of \today in article is:
-%\def\today{\ifcase\month\or
-% January\or February\or March\or April\or May\or June\or
-% July\or August\or September\or October\or November\or December\fi
-% \space\number\day, \number\year}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% Format: \iffull<macro><body when macro is full>\fi
-% <http://tug.org/mail-archives/texhax/2007-June/008649.html>
-%
-% The {} around #1 are necessary to fix an obscure bug where
-% if #1 begins with \textit{ then bizarre things happen like
-% printing an \Omega and ignoring the first characters of the body
-% For another fix, use \ifempty instead
-%
-%\def\iffull#1{\if{#1}\relax\else}
-
-% Format: \ifempty<macro><body when macro is empty>\else<body when full>\fi
-% <http://www.tex.ac.uk/cgi-bin/texfaq2html?label=empty>
-%
-%\def\ifempty#1{%
-% \def\tempempty{}%
-% \def\temparg{#1}%
-% \ifx\tempempty\temparg%
-%}
-
-% Both of those definitions are buggy and will miss certain things. This bug shows up while redefining \title below, where the ":\\" is printed regardless of whether \@subtitle is empty or not. This version seems to work better:
-% <http://www.physics.wm.edu/~norman/latexhints/conditional_macros.html>
-%
-% Format: \ifx<macro>\@empty<body when empty>\else<body when full>\fi
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Define a \subtitle which will appear in \maketitle but not \thetitle
-\gdef\@subtitle{}
-\def\subtitle#1{\gdef\@subtitle{#1}}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Redefine \title so it uses the \subtitle and so \thetitle works right
-% TODO: make it further into \title[short-title]{full-title} ala \section
-% TODO: allow changing the ":\\" delimiter
-\gdef\the@title{}
-\def\title#1{%
- \gdef\the@title{#1}% Copy it for when \maketitle unsets \@title
- \gdef\@title{#1\ifx\@subtitle\@empty\else:\\\@subtitle\fi}%
-}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ Redefine \author so \theauthor works right
-\gdef\the@author{}
-\def\author#1{%
- \gdef\the@author{#1}% Copy it for when \maketitle unsets \@author
- \gdef\@author{#1}%
-}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~ New commands to give users access to the internal variables
-% set by \author and \title
-\newcommand{\theauthor}{\the@author}
-\newcommand{\thetitle}{\the@title}
-
-
-%% The old bug-fix kept for reference:
-%% The \global\let will actually copy things over, unlike
-%% \newcommand{\the@author}{\@author}
-%% This doesn't work if called within an environment, so take "global" liberally
-%% It can however be called within the preamble because it has no arguments
-%% though it seems to be something about \global\let vs \gdef
-%
-%\newcommand{\debugmaketitle}{%
-% \global\let\the@author\@author
-% \global\let\the@title\@title
-% \renewcommand{\theauthor}{\the@author}% it used to use \@author directly
-% \renewcommand{\thetitle}{\the@title}% it used to use \@title directly
-%}
-
-
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fin.
=======================================
--- /generation/latex/finalreport.tex Wed Dec 16 05:57:22 2009 UTC
+++ /dev/null
@@ -1,157 +0,0 @@
-\documentclass[12pt]{article}
-\usepackage{HomeworkStyle}
-
-% Declarative formatting for commentation
-\newcommand{\todo}[1]{\textit{ #1}\marginpar{\textbf{TODO}}}
-\newcommand{\comment}[1]{\textit{ #1}\marginpar{\textbf{Comment}}}
-% Uncomment these lines to remove commentation.
-\renewcommand{\todo}[1]{}
-\renewcommand{\comment}[1]{}
-
-% These two styles are defined separately so they can be more easily
overridden/altered
-\newcommand{\lingtrans}[1]{`#1'}
-\newcommand{\lingforeign}[1]{\textit{#1}}
-% Use this declarative markup to indicate lexical forms (and optionally
their glosses)
-\newcommand{\lf}[2][]{\lingforeign{#2}\ifx#1\else\ \lingtrans{#1}\fi}
-
-\author{Wren N.\,G.\,Thornton, Alex Rudnick, and Yin Wang}
-\title{Final Report}
-\subtitle{Surface realization for XDG}
-\date{15 December 2009}
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\begin{document}
-
-\maketitle
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\section{Introduction}
-\label{s:intro}
-
-XDG is a dependency grammar formalism which uses constraint programming to resolve parse trees. Often in dependency formalisms there are conflicts between desired analyses. \todo{move this elsewhere, or remove it} For example, English light verb clusters are best analyzed ---from a morphosyntactic perspective--- as a linear chain from the subject through the light verbs in left-to-right order, ending with the main verb; but from a semantic perspective it would be better for the subject to depend on the main verb directly. \comment{but we want this part for explaining how generation ought to work} In order to resolve such conflicts, XDG uses multiple dimensions as part of a complete analysis of the given sentence.
-
-In general, because of its constraint-based nature, it should be possible to run XDG backwards, using the semantic dimension (or an external semantic representation) to determine the sentences that would be parsed into that semantic representation. Although the literature on XDG has occasionally mentioned this possibility (Debusmann's dissertation mentions it in passing, and it is discussed in the paper by Pelizzoni and das Gracas Volpe Nunes), as far as we know, generation with XDG has not yet been implemented.
-
-\comment{proposal:
- %
- We will investigate using XDG for text generation. Literature on XDG
often mentions the possibility of using XDG for generation as well as
parsing, but it is only mentioned in passing in Ralph Debusmann's
dissertation and the paper from Pelizzoni and das Gracas Volpe Nunes.
- %
- As a starting point we will adapt your XDG parsing code to “run
backwards”, producing text from XDG graph structures. One principal issue here is in efficiently linearizing the nodes. Common algorithms for
topological sorting are in O(|V| + |E|), indicating the need for minimizing
the number of edges considered (i.e. by ignoring edges which could be
derived by transitivity from other edges), since the number of edges will
be a limiting factor for using multiple dimensions in generation.
Additionally, these algorithms only find a single ordering and will need
extending in order to enumerate all orderings. In order to demonstrate
generality of the linearization algorithm and other aspects of generation,
we will build small grammars for several distinct languages. The first
language will come from a small subset of English, and at least one other
will come from a language with free word order such as Japanese or Quechua.
- %
- After settling on a linearization algorithm and developing these
grammars, time permitting, we would like to extend the Python
implementation of XDG to handle multiple dimensions. This is necessary for
the generation strategy suggested in the aforementioned dissertation and
paper, where a semantic dimension is used to drive constraints in the
syntactic dimensions. It's also necessary to make use of existing XDG
grammars for parsing, since they often separate syntax into ID and LP
dimensions in order to handle issues with word ordering.
-}
-
-So for this project, we have investigated how to take a dependency parse of a sentence and recover the corresponding text. This is not completely trivial, because there can be many possible orderings for a particular dependency parse. Our task was to recover what the original sentence might have been, given a dependency graph.
-
-We also looked into dependency parsing for two other languages, Japanese
and Quechua. For Japanese, some interesting progress has been made, and in
the near future, we expect a working Quechua parser.
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\comment{
-\section{Possible solutions and related work}
-\label{s:related}
-%
-Discussion of a few of the things from the Suggested Reading List?
-%
-Pelizzoni and das Gracas Volpe Nunes say...
-}
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\section{Linearizing a dependency graph}
-\label{s:linearizer}
-
-Given a dependency graph output by the XDG parser, we want to find all of
the possible orderings that correspond to that graph, given the constraints
specified by the rules given in the lexical entries for the words in the
graph. Since there might be multiple lexical entries (even different parts of speech) corresponding to a given string of characters, we refer to the parsing solution to find out which entry is currently being used, and
then look up the relevant ordering rules.
-
-In particular, the \texttt{order} rules in the lexical entries constrain
possible orderings. For example, in the English lexicon, the entry for
verbs, V, contains these rules for syntax:
-
-\begin{verbatim}
- ins: {root: '!'}
- outs: {sb: '!', adv: '*', prp: '*'}
- agree: [[sb, sb, ^]]
- order: [(sb, ^), (^, adv), (prp, adv)]
-\end{verbatim}
-
-Three ordering rules are specified here. The \texttt{\^} character indicates ``this word'' and describes the position of the current word relative to its daughter nodes in the graph. We see that (1) the ``sb'' (subject) node of this verb must come before the current word, (2) the ``adv'' node must come after this word, and (3) if there are both ``prp'' and ``adv'' links, then the preposition must come first. Notably, the preposition daughter node could be anywhere else in the sentence, as long as it comes before the adverb.
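To make the semantics of these rules concrete, here is a small Python sketch. It is illustrative only: the function names and the flat daughter dictionary are invented for the example, and the actual linearizer works through the constraint solver rather than by brute-force enumeration.

```python
from itertools import permutations

def expand_order_rules(head, daughters, rules):
    """Turn lexical order rules like ('sb', '^') into pairwise
    precedence constraints over concrete words; '^' stands for the
    head word itself.  Rules whose labels are absent from the parse
    (e.g. an unused 'prp') simply do not apply."""
    def resolve(label):
        return head if label == '^' else daughters.get(label)
    pairs = []
    for before, after in rules:
        a, b = resolve(before), resolve(after)
        if a is not None and b is not None:
            pairs.append((a, b))
    return pairs

def linearizations(words, pairs):
    """Enumerate every word order satisfying all precedence pairs
    (brute force, for illustration)."""
    for perm in permutations(words):
        pos = {w: i for i, w in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in pairs):
            yield list(perm)

# The verb entry's rules: order: [(sb, ^), (^, adv), (prp, adv)]
rules = [('sb', '^'), ('^', 'adv'), ('prp', 'adv')]
pairs = expand_order_rules('eats', {'sb': 'John', 'adv': 'quickly'}, rules)
print(list(linearizations(['John', 'eats', 'quickly'], pairs)))
```

With only three words and two applicable precedence pairs, a single ordering survives; the unused `prp` rule drops out because no preposition daughter is present.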
-
-Once we've gone through each node in the solution and found its ordering constraints, we just have to assign each word a unique position in the sentence. At first, we did this with python-constraint's \texttt{FunctionConstraint} objects, but looking to improve the speed of this search, we turned our attention to the constraint solver itself, refactoring the code for extensibility, and developed the \texttt{LessThanConstraint}.
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\section{Modifications to the constraint solver (LessThan constraint)}
-\label{s:lessthan}
-
-Using the built-in \texttt{FunctionConstraint}, the constraint solver has
no option other than to enumerate all possible variable assignments and to
check whether they satisfy the constraint. With factorially many possible
orderings, this causes unacceptable performance for long or even medium-length sentences. By introducing a new constraint type which is not forced to
check assignments individually, but rather can check large collections of
assignments at once, we can significantly improve this upper bound. For
selecting a word ordering in surface realization, it is important to notice
that the possible positions form a total linear order. We can take
advantage of this structure to significantly reduce the possible orderings
which need to be considered. The \texttt{LessThanConstraint} class
encapsulates this structure. The semantics are such that
\texttt{p.addConstraint(lambda w1, w2: w1 < w2, [W1,W2])} and
\texttt{p.addConstraint(LessThanConstraint(W1,W2))} are equivalent, except
the latter has improved efficiency.
-
-The \texttt{LessThanConstraint} class improves efficiency by tracking the
upper and lower bounds of the possible ranges for each variable. The upper
bound of the lesser variable can be reduced to the upper bound of the
greater variable, since otherwise there is no possible assignment for the
greater variable. Dually, the lower bound of the greater variable can be
raised to the lower bound of the lesser variable, because otherwise there
is no assignment for the lesser variable. Additionally, it is easy to
fail-fast or succeed-fast if there is no overlap in the spans of the two
variables, which can be detected quickly by looking at the extreme values.
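A minimal sketch of this bounds tracking follows. The interface here is simplified and invented for illustration; the real class plugs into python-constraint's constraint API rather than exposing a bare `propagate` method.

```python
class LessThanConstraint:
    """Illustrative sketch: narrow the domains of two position
    variables so that lesser < greater, without enumerating
    individual assignments."""

    def __init__(self, lesser, greater):
        self.lesser, self.greater = lesser, greater

    def propagate(self, domains):
        lo, hi = domains[self.lesser], domains[self.greater]
        # The lesser variable must leave room below the greater's maximum,
        new_lo = {v for v in lo if v < max(hi)}
        # and the greater must sit above the lesser's minimum.
        new_hi = {v for v in hi if v > min(lo)}
        if not new_lo or not new_hi:
            return None  # fail fast: the two spans cannot be ordered
        domains[self.lesser], domains[self.greater] = new_lo, new_hi
        return domains

domains = {'W1': {0, 1, 2}, 'W2': {0, 1, 2}}
LessThanConstraint('W1', 'W2').propagate(domains)
print(domains)  # W1 loses position 2, W2 loses position 0
```

The fail-fast case falls out of the same bounds test: if no value of the lesser variable lies below the greater variable's maximum (or vice versa), propagation reports failure immediately.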
-
-The current implementation is sub-optimal because the solver itself has
inadequate support for performing arc consistency and ensuring that all
constraints have propagated fully before resorting to search. However, this sub-optimality has not proved costly for the sentences we have tested. Additional improvements could be obtained by defining a \texttt{TotalOrderConstraint} class which accepts more than two variables and uses integer linear programming (e.g., the simplex algorithm) to solve the system of linear inequalities at once. For totally connected subgraphs
of constraints, as is common in languages with strict constituent order, a
single \texttt{TotalOrderConstraint} instance would be significantly more
efficient than the quadratic number of \texttt{LessThanConstraint}
instances.
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\section{Graph representation for quick lookups}
-\label{s:graph}
-
-Originally the linearizer used the solution data structure from the parser
directly. The solution is represented as a dictionary in the following
format:
-
-\begin{verbatim}
- { "1 -> 0" : "subj",
- ... ...
- "1 -> 2" : "obj",
- ... ...
- "1 -> 3" : "adv",
- ... ...
- }
-\end{verbatim}
-
-The keys are the two nodes' indices encoded in a string, and the value is a string representing the relationship between the two nodes. A simple observation is that this structure actually represents a graph with labeled edges. The motivation for changing it is that this representation carries a performance penalty. When we build the constraints, we first need to turn the dictionary into lists using \texttt{dict.keys()} and \texttt{dict.values()}, and then look up the relation by scanning through the lists, which takes linear time to find a match. For example, if we want to find the `subj' arc for the word with index 0, we scan the first list for keys with ``0'' on the right-hand side of the arrow, then find the one(s) with `subj' as the value, and finally take the left-hand side of the arrow. The encoding of the indices as strings also makes it cumbersome to retrieve the node indices.
-
-So one of our performance improvements is to use a fast and convenient representation for building the constraints. Because we don't want to interfere with existing code for L$^3$, we do a preliminary conversion into the new data structure and work on that structure from then on. The data structure is simply a graph with $O(1)$ lookup time for both nodes and edges. The nodes are indexed by their names, so we can find the node for a given word instantly. We can also find related words instantly using arc labels such as `subj'. Using this representation, the constraint-building time is reduced from $O(mr)$ to $O(r)$, where $r$ is the number of rules we need to generate and $m$ is the number of edges in the graph. Since the number of edges is often quadratic in the number of nodes, this improves performance considerably.
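A sketch of the conversion from the parser's dictionary into such a graph (the helper names `Node` and `build_graph` are invented here for illustration):

```python
class Node:
    """A dependency node with O(1) access to daughters by arc label."""
    def __init__(self, name):
        self.name = name
        self.arcs = {}            # arc label -> daughter Node

    def getDest(self, label):
        return self.arcs.get(label)

def build_graph(solution, words):
    """Convert the parser's {"1 -> 0": "subj", ...} solution into a
    graph of Nodes indexed by word, replacing linear scans over the
    key and value lists with dictionary lookups."""
    nodes = {w: Node(w) for w in words}
    for key, label in solution.items():
        head, dep = (words[int(i)] for i in key.split(' -> '))
        nodes[head].arcs[label] = nodes[dep]
    return nodes

words = ['John', 'eats', 'apples']
graph = build_graph({'1 -> 0': 'subj', '1 -> 2': 'obj'}, words)
print(graph['eats'].getDest('subj').name)  # John, without scanning arcs
```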
-
-The implementation is very simple. The graph contains a dictionary (hashmap) whose values are the nodes, indexed by their names, and each node contains a dictionary whose values are the arcs that lead to other nodes, indexed by the label of the arc from this node. This representation is much like ``attributes'' or ``fields'' in a record type in programming languages.
-
-Although it turns out to be efficient for our use, we are not yet sure whether the graph representation is a better idea at the whole-project (L$^3$) level.
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\section{Extra: more lexica!}
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\subsection{Japanese}
-\label{s:japanese}
-
-We have a passable basic lexicon for romanized tokenized Japanese. The
tokenization separates basic words and nominal particles, as in common
romanization styles. The morphology of verbs and verb clusters is not
analyzed and would be handled outside of the XDG framework in future
work.\footnote{%
- We have performed some analysis of the various predicate classes in
preparation for developing a morphological analyzer (cf.,
\texttt{./generation/jp-infl.html}). In particular, this analysis has
highlighted many more classes and irregular forms than are commonly taught
in L2 Japanese classes. Work on coding up the analyzer has not begun
however.%
-} Currently, only imperfect direct verb forms are in the lexicon. The main
focus of the lexicon is on resolving the use of nominal and topical
particles and gross syntactic features such as argument structure for
predicates.
-
-A large class of nominal particles encode case marking for nouns. These
particles include \lf{ga}, \lf{wo}, \lf{ni}, \lf{de}, \lf{to}, and
\lf{no}\slash\lf{na}. Non-case marking particles (e.g., \lf{yori},
\lf{hodo}) and particles for locations and extents (e.g., \lf{made},
\lf{made ni}, \lf{kara}) have not been included in the lexicon as yet. The
arc labels in XDG can be viewed as message passing, where incoming arcs are
``requests'' for a particular structure, and outgoing arcs are
``requirements'' of the current node in order to complete the requested
structure. From this perspective, the defining characteristic of nominal
particles is that they answer a request for a particular case and require a
noun.
-
-Even when restricting ourselves to six nominal case particles, many difficulties arise. The nominative marker is used to mark both arguments to
transitive affective predicates, and moreover there are restrictions on the
order between the arguments unlike the usual free constituent order of
Japanese. The dative, instrumental, and one form of the genitive are
homonymous with the continuative, gerundive, and attributive forms of the
copula.\footnote{%
- Actually they are historically the same. However, for modern analysis it
can sometimes be helpful to distinguish their uses as case marking from
their uses as forms of the copula.%
-} The dative and genitive particles also have different grammatical uses
and syntactic restrictions for the two classes of nouns. \lf{No}-nouns
marked with \lf{ni} are dative arguments, whereas \lf{na}-nouns marked with
\lf{ni} are generally adverbial modifiers\footnote{%
- This comes from the continuative form of the copula. Continuative forms
of other predicates also form adverbial modifiers. However, \lf{no}-nouns
are unable to combine with the continuative form of the copula to form an
adverb.%
-} though they may (rarely, due to semantic reasons) be dative arguments as
well. The genitive construction is formed with the particle \lf{no} for
\lf{no}-nouns and \lf{na} for \lf{na}-nouns. It is ungrammatical to use
\lf{no} with \lf{na}-nouns. And while it is possible to use \lf{na} with
\lf{no}-nouns, it is fairly obscure and the semantics are as a relative
clause construction rather than a genitive construction.\footnote{%
- It is the common analysis to describe the \lf{na} particle for
\lf{na}-nouns as an alternative form of \lf{no}. However, another analysis
is that \lf{na}-nouns cannot participate in genitive constructions, and the
use of \lf{na} is the same as for relative clauses with \lf{no}-nouns. This
analysis removes the ambiguity about which particle should be used for
genitive constructions, however it creates a similar ambiguity at the
semantic level regarding whether a genitive or attributive modifier should
be used (for \lf{no}-nouns where there is a choice). For the purpose of
translation the common analysis may be more helpful. Or, because the
\texttt{N$_\text{\texttt{NA}}$ + na} construction will be typically
translated as an adjectival modifier in Western languages, it may be better
to distinguish the three constructions (\texttt{N$_\text{\texttt{NO}}$ +
no}, \texttt{N$_\text{\texttt{NO}}$ + na}, \texttt{N$_\text{\texttt{NA}}$ +
na}) completely rather than choosing either of these two methods of
simplification.%
-} The genitive construction is further complicated by elision rules. When
the genitive \lf{no} particle is followed by a homonymous null noun, the
two fuse together (\texttt{no$_\text{\texttt{GEN}}$ + no$_\text{\texttt{N}}
\Rightarrow$ no$_\text{\texttt{GEN+N}}$}). Similarly, the deictic series
\lf{kore}, \lf{sore}, \lf{are}, \lf{dore} combine with the genitive to
become \lf{kono}, \lf{sono}, \lf{ano}, and \lf{dono}. In both of these
situations, the version without elision is ungrammatical.
-
-The current version of the Japanese lexicon correctly handles all of these
complications and corner cases. I had attempted to handle the differences
for the two noun classes by using the agreement constraints of XDG, but
could not get it to work. The working version adds two new arc labels for
specifically selecting a \lf{na}-noun or \lf{no}-noun, as opposed to the
arc label for selecting any noun which is used by the other particles.
-
-An additional consideration I began working on is capturing the effects of animacy on semantically valid parses. To handle animacy, the agreement constraints must be used to avoid an explosion of arc labels. However, I could not get the agreement constraints to work here either, so this is left to future work.
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\subsection{Quechua}
-\label{s:quechua}
-
-In addition to Japanese, we've just started working on handling parsing
for Quechua, which is an interesting task because Quechua is such a
morphologically rich language, and much of the work of parsing it happens
at a word-analysis level. Most of our work on this front has been in
software engineering, connecting AntiMorpho with the XDG parser by analogy
with how Dr. Gasser has already connected HornMorpho to do parsing for
Amharic. The process isn't very far along yet; when trying to parse a
Quechua sentence, the XDG parser successfully loads the relevant Quechua
morphology data, but some bugs remain and we haven't (as of this writing)
managed to parse Quechua sentences. We hope to be able to handle simple
sentences in the very near future, though, such as the ones in the Lonely
Planet Quechua phrase book.
-
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\section{Credit assignment problem}
-\label{s:credit}
-
-Wren did all of the work for the Japanese lexicon and refactoring the
constraint library, and almost all of the work for the LessThanConstraint.
Alex did the (nascent) Quechua lexicon (such as it is) and most of the work
on the linearizer itself. Yin worked on the graph representation for
speeding up the linearizer.
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fin.
-\end{document}
=======================================
--- /generation/latex/progressreport0.tex Sat Oct 24 02:34:06 2009 UTC
+++ /dev/null
@@ -1,40 +0,0 @@
-\documentclass[12pt]{article}
-\usepackage{HomeworkStyle}
-
-\author{Wren N.\,G.\,Thornton, Alex Rudnick, and Yin Wang}
-\title{Progress Report}
-\subtitle{Surface realization for XDG}
-\date{23 October 2009}
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\begin{document}
-
-\maketitle
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Thus far, we've mostly been reading and discussing, but there's
-been some programming as well. In particular, we've read through Mike's
-existing XDG code, the Pelizzoni and Gracas Volpe Nunes papers,
-Ralph Debusmann's thesis, and Sandra K\"{u}bler's dependency parsing
-book. The suggested reading list has been building here:
-\url{http://code.google.com/p/hltdi-l3/wiki/SuggestedReading}
-
-Working with Mike's existing XDG code, we've built a first pass for
-the linearizer. So far, it takes one of the example English sentences,
parses it with the \texttt{XDGSolver} to get a dependency graph, and then
applies the ``mother''--``daughter'' ordering constraints in order to
produce a linearization (or several possible linearizations).
-
-We've also discussed ways of improving the efficiency of solving the
linearization problem by introducing specialized \texttt{Constraint}
subclasses. The simplest of these implements a strict ordering between two
variables. Having this as a specialized class, rather than using a
functional constraint, allows for forward propagation of domain changes
without needing to resort to backtracking search, thus improving the
efficiency of the ordering solver. For example, if $X$ can take the values
$\{2,3,5\}$ and $Y$ can take the values $\{1,3,4\}$ and we have the
constraint $X<Y$, then we can immediately remove 5 from $X$'s domain (since
$\neg\exists y\in Y\!.\: 5<y$) and can immediately remove 1 from $Y$'s
domain (since $\neg\exists x\in X\!.\: x<1$). Extending this, we can
introduce a class for more than two variables which must be in a total
order. This covers the common case for ordering constraints in English, and
allows us to replace $n*(n-1)/2$ binary constraints with a single $n$-ary
constraint. This constraint could be solved efficiently with something like
the simplex algorithm. Continuing to the largest-grain extreme, we could
have a single ordering constraint which is constructed from all the
constraints of all lexemes. This would allow us to use graph-theoretic
algorithms in $O\!(V\!+E)$ without ever resorting to backtracking search,
while still being integrated with the constraint-based architecture. The
trick here would be in being able to return more than one ordering (via
backtracking search, for other constraints to make use of).
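The two-variable pruning in the example above can be written directly as a small sketch (the function name is invented; a real implementation would track min/max bounds rather than test every pair):

```python
def prune_less_than(x_dom, y_dom):
    """Forward propagation for X < Y: keep only values of X with
    some larger partner in Y, and values of Y with some smaller
    partner in X."""
    x_new = {x for x in x_dom if any(x < y for y in y_dom)}
    y_new = {y for y in y_dom if any(x < y for x in x_dom)}
    return x_new, y_new

# The example above: X in {2,3,5}, Y in {1,3,4} with X < Y
print(prune_less_than({2, 3, 5}, {1, 3, 4}))  # 5 and 1 are pruned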
-
-In the immediate future, we plan to build in support for other sorts
-of constraints, such as ones that specify the relative positions
-of two different daughters, for a given mother --- this should allow
-us to try to linearize all of the English sample sentences sensibly.
-
-In the coming weeks, we'd also like to work on pinning down what
-to do about unicode characters (Python itself supports unicode
-strings, but will the rest of the code?), tiny lexica for other
-languages (Japanese, Quechua, possibly Chinese), and considering
-multiple dimensions.
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fin.
-\end{document}
=======================================
--- /generation/latex/progressreport1.tex Wed Nov 25 06:36:06 2009 UTC
+++ /dev/null
@@ -1,49 +0,0 @@
-\documentclass[12pt]{article}
-\usepackage{HomeworkStyle}
-
-\author{Wren N.\,G.\,Thornton, Alex Rudnick, and Yin Wang}
-\title{Progress Report}
-\subtitle{Surface realization for XDG}
-\date{24 November 2009}
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\begin{document}
-
-\maketitle
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Since the last progress report, we've improved the linearization code
quite a
-bit; now we can produce linearizations of sentences based on all the
ordering
-information in the lexicon. Previously, the linearizer could only handle
-constraints that involved the head word, but now if a lexical entry calls
for
-its dependents to be in a certain order, that order is respected --
-implementing these constraints was very convenient, using the new graph
-representation.
-
-The graph representation of the dependency parses enables us to access an
arc's
-destination in constant time, given the label on that arc. Now we can say,
for
-example, \texttt{node.getDest(``subj'')} to get the subject of a verb node,
without
-scanning through all the arcs. The graph representation is helpful both for
-convenience and for efficiency; without it, it takes $O(nm)$ time to
generate
-all the constraints, where $n$ is the number of words and $m$ is the
number of
-arcs. With the graph-based representation, it takes only $O(m)$ to generate
-the constraints. Although currently we convert the parser's output to this
-representation only locally in the linearizer code, the graph
representation
-could be of general use elsewhere in L3, for quick lookups.
-
-We also replaced the functional constraints in the linearizer with the new
-LessThanConstraint class, which cuts down the run time for producing
-linearizations (and interestingly, causes them to be generated in a
different
-order), and fixed some bugs based on our misconceptions of how to look up
the
-lexical entry being used for a given token in the sentence. Now, being
able to
-look up the correct lexical entries, we generate many fewer
linearizations, and
-they seem much more sensible. However, adjectives and determiners are still
-under-constrained, in that they don't necessarily come right before the
nouns
-they depend on, even when they should. We'll have to come up with a way to
-account for this.
-
-For our next step, we're starting on tiny lexica for Japanese and Quechua.
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fin.
-\end{document}
=======================================
--- /generation/latex/proposal.tex Fri Oct 23 07:34:39 2009 UTC
+++ /dev/null
@@ -1,25 +0,0 @@
-\documentclass[12pt]{article}
-\usepackage{HomeworkStyle}
-
-\usepackage{setspace}
-
-\author{Wren N.\,G.\,Thornton and Alex Rudnick}
-\title{Project proposal}
-\subtitle{Surface realization for XDG}
-\date{18 September 2009}
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\begin{document}
-
-\maketitle
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-We will investigate using XDG for text generation. Literature on XDG often
mentions the possibility of using XDG for generation as well as parsing,
but it is only mentioned in passing in Ralph Debusmann's dissertation and
the paper from Pelizzoni and das Gracas Volpe Nunes.
-
-As a starting point we will adapt your XDG parsing code to ``run
backwards'', producing text from XDG graph structures. One principal issue
here is in efficiently linearizing the nodes. Common algorithms for
topological sorting are in $O(|V|+|E|)$, indicating the need for minimizing
the number of edges considered (i.e.\ by ignoring edges which could be
derived by transitivity from other edges), since the number of edges will
be a limiting factor for using multiple dimensions in generation.
Additionally, these algorithms only find a single ordering and will need
extending in order to enumerate all orderings. In order to demonstrate
generality of the linearization algorithm and other aspects of generation,
we will build small grammars for several distinct languages. The first
language will come from a small subset of English, and at least one other
will come from a language with free word order such as Japanese or Quechua.
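The extension from single-answer topological sorting to enumerating all orderings can be sketched by backtracking over the zero-in-degree frontier (the function name is invented; run time grows with the number of valid orders, so pruning transitively derivable edges still matters):

```python
def all_topological_orders(nodes, edges):
    """Yield every linearization consistent with the precedence
    edges (a, b), where a must precede b."""
    indeg = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for a, b in edges:
        succ[a].append(b)
        indeg[b] += 1

    def backtrack(prefix, placed):
        if len(prefix) == len(nodes):
            yield list(prefix)
            return
        for n in nodes:
            if n not in placed and indeg[n] == 0:
                placed.add(n)
                for b in succ[n]:
                    indeg[b] -= 1      # n is placed; release successors
                prefix.append(n)
                yield from backtrack(prefix, placed)
                prefix.pop()
                for b in succ[n]:
                    indeg[b] += 1      # undo for the next branch
                placed.discard(n)

    yield from backtrack([], set())

# Free order between 'a' and 'b'; both must precede 'c'.
print(list(all_topological_orders(['a', 'b', 'c'],
                                  [('a', 'c'), ('b', 'c')])))
```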
-
-After settling on a linearization algorithm and developing these grammars,
time permitting, we would like to extend the Python implementation of XDG
to handle multiple dimensions. This is necessary for the generation
strategy suggested in the aforementioned dissertation and paper, where a
semantic dimension is used to drive constraints in the syntactic
dimensions. It's also necessary to make use of existing XDG grammars for
parsing, since they often separate syntax into ID and LP dimensions in
order to handle issues with word ordering. To the extent that our different
languages can be parsed into the same semantic representation, that
representation can be used as an interlingua for generating syntactic
representations, thus providing a basic translation system.
-
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-%% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fin.
-\end{document}
=======================================
--- /generation/lex.py Fri Nov 27 10:54:58 2009 UTC
+++ /dev/null
@@ -1,394 +0,0 @@
-# Lexicon and grammars for XDG.
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# Michael Gasser <gas...@cs.indiana.edu>
-# 2009.09.12
-#
-# 2009.10.23
-# Added LexDim for dimensions within lexical entries
-#
-# 2009.10.31
-# Added groups (instances of Lex with gid and gwords attributes)
-# Fixed inheritance so it handles multiple entries in lexical classes
-
-import copy
-
-class Language:
- """
- Language class, including several language-specific morphological
- properties.
- """
-
- def __init__(self, name, abbrev,
- has_mult_agrs=True,
- morph_processing=True,
- lexicon=None,
- labels={}):
- """
- @param name: string identifying language
- @type name: string
- @param abbrev: short string identifying language
- @type abbrev: string
- @param has_mult_agrs: whether words can have multiple agr
- values, for example, English, Spanish,
- and Malay can't; Amharic, K'iche',
- and Quechua can
- @type has_mult_agrs: boolean
- @param morph_processing: whether words are morphologically
- analyzed or generated, for example,
- they are not for English or Malay(?);
- they are for Spanish, Amharic,
- K'iche, and Quechua
- @type morph_processing: boolean
- @param labels: a map from dimension abbreviations to labels
- @type labels: dict from string to list of strings
- """
-
- self.name = name
- self.abbrev = abbrev
- self.has_mult_agrs = has_mult_agrs
- self.morph_processing = morph_processing
- self.lexicon = lexicon
- self.labels = labels
-
- def set_lexicon(self, lexicon):
- """Assign this language's lexicon.
-
- @param lexicon: the language's lexicon
- @type lexicon: Lexicon
- """
- self.lexicon = lexicon
- lexicon.language = self
-
-
-class Lexicon(dict):
- """The single lexicon associated with a language."""
-
- def __init__(self, lexemes={}):
- """
- @param lexemes: lexemes making up the lexicon
- @type lexemes: dict of Lexs or lists of Lexs, with word
- forms as keys
- """
- self.language = None
- dict.__init__(self, lexemes)
-
- def add_lex(self, word, lexeme):
- """Add a new lexeme to the lexicon.
-
- @param word: form associated with lex
- @type word: string
- @param lexeme: new lexeme
- @type lexeme: list of Lexs
- """
- self[word] = lexeme
-
- def get_lex(self, word):
- """Return the lexeme associated with the word form.
-
- @param word: a word form
- @type word: string
- @return: the lexeme for word, None if there isn't one
- @rtype: list of Lexs
- """
- return self.get(word, [])
-
- def inherit(self, lex, dimensions, groups={}, add_entries=[]):
- """Inherit properties to lex from its classes.
-
- @param lex: a lexical entry
- @type lex: Lex
- @param dimensions: list of dimensions
- @type dimensions: list of instances of Dimension subclasses
- @param groups: dict of gid: # words, group lex
- @type groups: dict of pairs -- gid: (gwords, lex)
- @param add_entries: list of new entries found in classes
- @type add_entries: list of Lexs
- """
- if lex.gid:
- groups[lex.gid] = (lex.gwords, lex)
- self.inherit1(lex, dimensions, classes=lex.classes, checked=[],
- groups=groups, add_entries=add_entries)
-
- def inherit1(self, lex, dimensions, classes=[], checked=[], groups={}, add_entries=[]):
- """
- Inherit properties to lex from classes, accumulating new
- groups and other entries that are found along the way.
-
- @param lex: a lexical entry
- @type lex: Lex
- @param classes: list of superclasses
- @type classes: list of strings (class names)
- @param checked: list of Lexs
- @type checked: list of Lexs
- @param groups: dict -- group id: (gwords, lex)
- @type groups: dict from string to (int, Lex) pairs
- @param add_entries: list of new entries created
- @type add_entries: list of Lexs
- """
- if classes:
- # Name for the first class
- class_name = classes[0]
- remaining = classes[1:]
- # Entries for the first class
- class1 = self.get_lex(class_name)
- if not class1:
- raise ValueError, "No class %s" % class_name
- # Make it a list in case it's just one class
- if not isinstance(class1, list):
- class1 = [class1]
- # Create new cloned lexes if there's more than one class entry
- lexes = [lex] + [lex.clone() for i in range(len(class1)-1)]
- # For each combination of lex and class entry...
- for l, cls in zip(lexes, class1):
- # If the class entry has not already been checked
- if cls not in checked:
- # Check to see whether this is a group entry
- gid = cls.gid
- # If so, record the gid and number of words in groups,
- # but don't add it to the lex's entries
- if gid:
- groups[gid] = (cls.gwords, l)
- # Add a non-group entry to the new entries
- # unless this is the original, uncloned lex
- elif l != lex:
- add_entries.append(l)
- # Inherit the properties from this class entry
- # to the lex
- l.inherit_properties(cls, dimensions)
- # Add the name of the class to the class list
- # for this lex if it's not there
- if class_name not in l.classes:
- l.classes.append(class_name)
- # Recurse using this class's classes and the
- # remaining classes
- clss = set(cls.classes)
- # Combine new classes with remaining ones
- clss.update(remaining)
- clss = list(clss)
- # Append this class entry to the list of checked entries
- checked.append(cls)
- self.inherit1(l, dimensions, classes=clss,
- checked=checked[:], groups=groups,
- add_entries=add_entries)
-
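The recursive walk in inherit1 — merge each class's properties into the entry, then continue with that class's own superclasses — can be sketched with a simplified dict-based analogue (hypothetical names; the real method also handles cloning, groups, and per-dimension LexDim merging):

```python
def inherit(entry, classes, lexicon):
    """Breadth-first merge of class properties into entry; the
    entry's own values take precedence over inherited ones."""
    seen = set()
    queue = list(classes)
    while queue:
        name = queue.pop(0)
        if name in seen:
            continue
        seen.add(name)
        cls = lexicon[name]
        for key, val in cls.get("props", {}).items():
            entry.setdefault(key, val)  # entry wins on conflict
        queue.extend(cls.get("classes", []))  # recurse on superclasses
    return entry

lexicon = {
    "VERB": {"props": {"pos": "v"}, "classes": []},
    "TRANS": {"props": {"outs": "obj"}, "classes": ["VERB"]},
}
entry = inherit({"word": "see"}, ["TRANS"], lexicon)
```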
-class Lex:
- """
- Class for lexical entries (including groups) and classes of
- lexical entries in a lexicon.
- """
-
- ID = 0
-
- def __init__(self, word='', lexeme='', gram='', dims=None, classes=None,
- gid='', gwords=0, id=''):
- """
- @param word: wordform for an individual entry
- @type word: string
- @param lexeme: abstract form for a lexeme shared by entries
- @type lexeme: string
- @param gram: abstract identifier for a grammatical
- morpheme or class
- @type gram: string
- @param dims: dimension specifications
- @type dims: dictionary of LexDims
- @param classes: list of classes for this entry or class
- @type classes: list of names of classes
- @param gid: id if this is a group, '' otherwise
- @type gid: string
- @param gwords: number of words in group
- @type gwords: int
- @param id: specified id for this Lex (only in cloning)
- @type id: string
- """
- self.word = word
- self.lexeme = lexeme
- self.gram = gram
- self.dims = dims or {}
- self.classes = classes or []
- self.gid = gid
- self.gwords = gwords
- # Create a unique ID for this Lex (needed to recognize
- # copies of same Lex)
- if id:
- self.id = id
- else:
- self.id = 'l' + str(Lex.ID)
- Lex.ID += 1
-
- def __str__(self):
- """Print name for lexical entry or class."""
- return ("Lex: %s, classes: %s" %
- ((self.word or self.lexeme or self.gram),
- str(self.classes)))
-
- def clone(self):
- """Return a copy of this Lex. Only applies to word-level entries."""
- copied = Lex(word=self.word, classes=self.classes[:], id=self.id,
- gid=self.gid, gwords=self.gwords)
- # Make copies of LexDims
- copied.dims = {}
- for abbrev, dim in self.dims.iteritems():
- copied.dims[abbrev] = dim.clone()
- return copied
-
- def get_dim(self, dim_label):
- """Return the dimension with dim_label.
-
- @param dim_label: dimension label: syn, sem, synsem,
- a language label or pair
- @type dim_label: string
- @return: the dimension associated with the label
- @rtype: LexDim
- """
- return self.dims.get(dim_label)
-
- def inherit_properties(self, cls, dimensions):
- """Inherit properties of cls to this Lex for each dimension.
-
- @param cls: a lexical class of this Lex
- @type cls: instance of Lex representing a lexical class
- @param dimensions: list of dimensions
- @type dimensions: list of instances of Dimension
- """
- # First inherit cross-dimensional properties
- if cls.gid and not self.gid:
- self.gid = cls.gid
- self.gwords = cls.gwords
- for dimension in dimensions:
- dim_abbrev = dimension.abbrev
- cls_dim_vars = cls.dims.get(dim_abbrev)
- # Only update if class has something for this dimension
- if cls_dim_vars:
- dim_vars = self.dims.get(dim_abbrev)
- if not dim_vars:
- dim_vars = LexDim(label=dim_abbrev)
- self.dims[dim_abbrev] = dim_vars
- # ins, outs, order, agreement only for subclasses of ArcDimension
-# if isinstance(dimension, ArcDimension):
- # Update ins and outs
- cls_ins = cls_dim_vars.ins
- if cls_ins:
- for feat, val in cls_ins.iteritems():
- dim_vars.ins[feat] = val
- cls_outs = cls_dim_vars.outs
- if cls_outs:
- for feat, val in cls_outs.iteritems():
- dim_vars.outs[feat] = val
- # If cls is a group entry, update groupouts
- if cls.gid:
- cls_groupouts = cls_dim_vars.groupouts
- if cls_groupouts:
- for feat, val in cls_groupouts.iteritems():
- dim_vars.groupouts[feat] = val
- # Update order
- cls_order = cls_dim_vars.order
- if cls_order:
- for order in cls_order:
- if order not in dim_vars.order:
- dim_vars.order.append(order)
- # Update agreement
- cls_agree = cls_dim_vars.agree
- if cls_agree:
- for agree in cls_agree:
- if agree not in dim_vars.agree:
- dim_vars.agree.append(agree)
- cls_agrs = cls_dim_vars.agrs
- if cls_agrs and not dim_vars.agrs:
- dim_vars.agrs = cls_agrs
- # Update arg
- cls_arg = cls_dim_vars.arg
- if cls_arg:
- for arg, args in cls_arg.iteritems():
- dim_vars.arg[arg] = args
- # Update features
- cls_feats = cls_dim_vars.feats
- if cls_feats:
- for feat, val in cls_feats.iteritems():
- dim_vars.feats[feat] = val
-
- def is_group(self):
- """Is this a lexical entry for a group? It is if it has a gid."""
- return self.gid
-
-
-class LexDim:
- """Class for lexical specifications for particular dimensions."""
-
- def __init__(self, label='', name='',
- ins=None, outs=None, order=None, agree=None, agrs=None,
- arg=None, groupouts=None, feats=None):
- """
- @param label: label for the dimension type
- @type label: string
- @param name: name for the word on this dimension,
- e.g., when semantics needs to be labeled
- differently or when the dimensions represent
- different languages
- @type name: string
- @param ins: labels that can (must) come into nodes
- with this entry
- @type ins: dict of label constraint pairs;
- constraint: '!', '?', '*', or a specific word (string)
- @param outs: labels that can (must) come out of nodes with this entry
- @type outs: dict of label constraint pairs (constraint: '!', '?', '*')
- @param groupouts: labels for arcs with their daughter labels for groups
- @type groupouts: dict == arc label : daughter label
- @param order: list of order tuples
- @type order: list of tuples of two strings (arc labels
- or '^', representing 'this node')
- @param agree: list of arc labels (or arc label lists
- if multiple agrs are possible for language)
- @type agree: list of strings (or lists of strings)
- @param agrs: list of agr features (or arc_label, feature
- dicts if multiple agrs are possible for language)
- @type agrs: list of tuples of strings and/or ints (or
- dicts of tuples of strings and/or ints)
- @param arg: for an interface dimension, mappings of
- args on the two related dimensions
- @type arg: dictionary of lists of strings
- @param feats: dictionary of features
- @type feats: string: string dict
- """
- self.label = label or ''
- self.name = name or ''
- self.ins = ins or {}
- self.outs = outs or {}
- self.groupouts = groupouts or {}
- self.order = order or []
- self.agree = agree or []
- self.agrs = agrs or []
- self.arg = arg or {}
- self.feats = feats or {}
-
- def __str__(self):
- """Print name for lexical dimension."""
- return ("LexDim: %s, ins: %s, outs: %s" %
- ((self.label or self.name),
- str(self.ins),
- str(self.outs)))
-
- def clone(self):
- """Make a copy of this LexDim."""
- copied = LexDim(label=self.label, name=self.name,
- ins=copy.deepcopy(self.ins), outs=copy.deepcopy(self.outs),
- groupouts=copy.deepcopy(self.groupouts),
- order=self.order[:],
- agree=self.agree[:], agrs=copy.deepcopy(self.agrs),
- feats=copy.deepcopy(self.feats))
- return copied
=======================================
--- /generation/linearize.py Sat Jan 16 05:05:47 2010 UTC
+++ /dev/null
@@ -1,316 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-
-# Code to take an XDG "solution" to a sentence, which doesn't specify a total
-# ordering, and produce a linear ordering.
-
-# This has lots of problems, at the moment. Right now, it doesn't know how to
-# get the correct word sense out of an XDG parse, for lexemes with multiple
-# word senses.
-# Unimplemented, for now, is building constraints out of rules that don't refer
-# to the current word ("^"). This is implementable; I'll just have to sit down
-# and do it.
-
-# Also, I'd like to look into doing this with a more serious graph-theory
-# library, like networkx (networkx.lanl.gov)
-
-# Wren suggests that we define a new type of constraint such that you give it a
-# list of words (POS types), where that defines a total ordering for those
-# things.
-# This would encode information like:
-# transitive verbs go like: subject ^ objects adverbs.
-# ... and there might be other words in there too.
-
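The task described in these comments — turn a solution's pairwise precedence facts into a total order — can be illustrated without the constraint library by brute-force search over permutations (a hypothetical standalone sketch, not the solver this module actually uses):

```python
from itertools import permutations

def linearize(words, before_pairs):
    """Return every ordering of words consistent with the pairwise
    constraints in before_pairs, where (i, j) means word i must
    precede word j (indices into words)."""
    orders = []
    for perm in permutations(range(len(words))):
        pos = {w: p for p, w in enumerate(perm)}  # word index -> position
        if all(pos[i] < pos[j] for i, j in before_pairs):
            orders.append([words[k] for k in perm])
    return orders

# determiner before noun, noun before verb
result = linearize(["ran", "dog", "the"], [(2, 1), (1, 0)])
```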
-import l3
-from xdg_constraint import Problem
-from xdg_constraint import *
-from en import ENGLISH
-
-DUMMY = "?dummy"
-
-def linearizeAndPrint(sentence):
- print
- print "SENTENCE:", sentence
-
- problem = l3.XDGProblem(sentence=sentence)
- lexicon = ENGLISH.lexicon
- solutions = None
-
- solutions = problem.solve()
-
- i = 0
- for soln in solutions:
- if len(solutions) > 1:
- print
- print "now linearizing solution #", i
- i += 1
-
- linearization = Problem()
-
- # words is a list of variables for the constraint solver.
- words = map(indexToVar, range(len(sentence)))
-
- # each word is going to be associated with a unique position.
- linearization.addVariables(words, range(len(sentence)))
- linearization.addConstraint(AllDifferentConstraint())
-
- addOrderConstraints(problem, soln, sentence, linearization)
- addProjectivityConstraints(problem, soln, sentence, linearization)
-
- linearizations = linearization.getSolutions()
- printLinearizations(linearizations, sentence)
-
-def addOrderConstraints(problem, soln, sentence, linearization):
- """Given the XDGProblem, solution to that problem, the original sentence,
- and the already-instantiated linearization problem, enforce the ordering
- constraints specified in the lexical entries."""
-
- words = map(indexToVar, range(len(sentence)))
- # construct graph from solution
- sg = solutionToGraph(problem, soln)
-
- for n in sg.nodes.values():
- var1 = n.name
- index1 = varToIndex(var1)
- node = problem.nodes[index1]
-
- if not node.entries:
- rules1 = []
- else:
- entryindex = soln[node.entry_var]
- lexical_entry = node.entries[entryindex]
- rules1 = lexical_entry.dims['syn'].order
-
- for (left,right) in rules1:
- if (left == "^"):
- rights = n.getDest(right)
-
- for var2 in rights:
- # print "rule:", var1, "before", right,
- # print "(", sentence[index1], "before",
- # print sentence[varToIndex(var2.name)], ")"
- variables = [var1, var2.name]
- linearization.addConstraint(LessThanConstraint(),
- variables)
- elif (right == "^"):
- lefts = n.getDest(left)
- for var2 in lefts:
- # print "rule:", left, "before", var1,
- # print "(", sentence[varToIndex(var2.name)],
- # print "before", sentence[index1], ")"
- variables = [var2.name, var1]
- linearization.addConstraint(LessThanConstraint(),
- variables)
- else:
- lefts = n.getDest(left)
- rights = n.getDest(right)
-
- for var2 in lefts:
- for var3 in rights:
- # print "rule:", left, "before", right,
- # print "(for the word", sentence[index1], ")"
- variables = [var2.name, var3.name]
- linearization.addConstraint(LessThanConstraint(),
- variables)
-
-def addProjectivityConstraints(problem, soln, sentence, linearization):
- """For every arc h->d, for every node w:
- w can be between h and d only if h->w (or h->n, where n is one of w's
- ancestors in the graph.)"""
- # If h governs one of w's ancestors (or w), then w can be between h and d.
-
- edges = getSyntaxArcs(soln)
- words = map(indexToVar, range(len(sentence)))
- heads = headMap(edges)
-
- # head and dep are names of variables. We ignore labels.
- for w in words:
- ancestors = getAncestors(w, heads)
- for (head, dep, label) in edges:
- cantGoBetween = True
- for ancestor in ancestors:
- if (head, ancestor) == (head, dep) or head == ancestor:
- cantGoBetween = False
- # if cantGoBetween, then install the constraint where w can't go
- # between head and dep.
- if cantGoBetween:
- variables = [head, w, dep]
- linearization.addConstraint(
- lambda head,w,dep: not (head < w and w < dep),
- variables)
- linearization.addConstraint(
- lambda head,w,dep: not (dep < w and w < head),
- variables)
-
-def headMap(edges):
- """Take the list produced by getSyntaxArcs and return a dictionary from
- dependent nodes to head nodes."""
-
- out = {}
- for head,dep,label in edges:
- out[dep] = head
- return out
-
-def getAncestors(w, heads):
- """Return a list of the ancestors of the variable w, starting with w."""
- out = []
- here = w
- # keep going until we find a thing that's not governed.
- while True:
- out.append(here)
- if here in heads.keys():
- here = heads[here]
- else:
- break
-
- if here in out:
- print "getAncestors: loop in dependency graph?"
- break
- return out
-
-def getSyntaxArcs(soln):
- """Return a list of tuples of the form (word1, word2, label), indicating
- that there's an arc from word1 to word2 with that label."""
-
- out = []
-
- for var,label in soln.iteritems():
- if label != None and var != DUMMY:
-
- if not var.startswith("syn:"): continue
- if "->" not in var: continue
-
- index1, index2 = map(int, var.split(":")[1].split("->"))
- word1 = indexToVar(index1)
- word2 = indexToVar(index2)
-
- out.append( (word1, word2, label) )
- return out
-
-
-# generate a constraint variable using index number
-# int -> string
-def indexToVar(idx):
- return "word" + str(idx);
-
-# get the index from the constraint variable name
-# string -> int
-def varToIndex(v):
- return int(v[4:]);
-
-# Graph nodes are indexed by their names such as "word1", "word2" etc
-class Graph:
- def __init__(self):
- self.nodes = {}
- def addNode(self, n):
- self.nodes[n.name] = n
- def getNode(self, name):
- return self.nodes[name]
-
-# Node arcs are dictionaries mapping from arc types (such as "subj", "obj")
-# to other Nodes -- a list of Nodes, in fact.
-class Node:
- def __init__(self, s):
- self.name = s
- self.arcs = {}
- def addArc(self, other, arclabel):
- if arclabel in self.arcs:
- self.arcs[arclabel].append(other)
- else:
- self.arcs[arclabel] = [other]
- def getDest(self, arclabel):
- """Get the destination of an arc; for example getDest("subj") will get
- the subjects of the node. Note that this returns a list."""
- if arclabel in self.arcs:
- return self.arcs[arclabel]
- else:
- return []
-
-# convert from a solution to a graph
-def solutionToGraph(problem, soln):
- g = Graph()
- for var,val in soln.iteritems():
- if val != None and var != problem.dummy_var:
- if "->" not in var:
- continue
- # print "Skipping this link for now. Should implement it:",
- # print var
- index1, index2 = map(int, var.split(":")[1].split("->"))
- v1 = indexToVar(index1)
- v2 = indexToVar(index2)
- if v1 not in g.nodes:
- n1 = Node(v1)
- g.addNode(n1)
- if v2 not in g.nodes:
- n2 = Node(v2)
- g.addNode(n2)
- g.getNode(v1).addArc(g.getNode(v2), val)
- return g
-
-def printGraph(g):
- print "** printing graph **"
- for n in g.nodes.values():
- printNode(n)
- print "***"
-
-def printNode(n):
- print "node", n.name
- for k in n.arcs:
- print k, ":", n.arcs[k].name
-
-def printLinearizations(linearizations, sentence):
- """This seems like cheating somewhat, but we really just need to know which
- word corresponds with which variable."""
-
- i = 0
- if not linearizations:
- print "no linearizations?"
-
- for linearization in linearizations:
- indexes = orderList(linearization)
- wordforms = map(lambda(i): sentence[i], indexes)
-
- print "linearization", str(i)+":",
- i += 1
-
- print " ".join(wordforms)
-
-def orderList(linearization):
- rev_order = {};
- for key,val in linearization.iteritems():
- rev_order[val] = key;
- indices = linearization.values();
- indices.sort();
- order_list = [];
- for i in indices:
- order_list.append(varToIndex(rev_order[i]))
- return order_list
-
-def listify(sentence):
- if isinstance(sentence, list):
- return sentence
- else:
- return sentence.split() + ['.']
-
-def main():
- for sentence in l3.GRAMMATICAL:
- linearizeAndPrint(listify(sentence))
-
-if __name__ == "__main__":
- main()
=======================================
--- /generation/xdg_constraint/README Fri Nov 27 08:28:21 2009 UTC
+++ /dev/null
@@ -1,32 +0,0 @@
-=== A General Overview of the xdg_constraint library ===
-
-The entry point to the library is the 'Problem' class. This class
-denotes an instance of a constraint satisfaction problem to be
-solved. A problem consists of (1) a set of variables each of which
-is associated with a domain of possible values, (2) a set of
-constraints relating variables to each other, or just to themselves,
-and (3) a solver for finding complete assignments of variables which
-satisfy the constraints.
-
-A 'Variable' (or any hashable object) denotes the name of a variable.
-Each variable is associated with a 'Domain', though this is done
-externally to the Variable class at the moment. A domain is a set
-of values, with the ability to temporarily hide values and to set
-breakpoints to revert back to (thus un-hiding variables hidden since
-the breakpoint was set).
-
-A 'Constraint' is a (set-theoretic multivariate) relation on
-variables. Each constraint can do initial pre-processing at the
-start of running the solver, and each constraint can also do forward
-checking. However, the utility of pre-processing and forward checking
-is limited by the solvers being insufficiently smart.
-
-A 'Solver' is an algorithm for searching all possible variable
-bindings in order to find one that satisfies all constraints of the
-problem. Pre-processing is run for each constraint before beginning
-the search, thus allowing some constraints (e.g., unary constraints)
-to be satisfied immediately and removed from the set of constraints
-to verify. Ideally forward checking (or generalized arc consistency)
-would allow us to propagate changes as far as possible before
-resorting to guess-and-check; however, the current solvers aren't
-terribly smart yet.
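As a rough illustration of that decomposition, the surface interface (Problem, addVariables, addConstraint, a solver returning assignments) can be mirrored in a self-contained toy with a brute-force solver — no domain hiding, breakpoints, pre-processing, or forward checking, so it sketches the interface only, not this library's implementation:

```python
from itertools import product

class Problem:
    """Toy CSP: named variables with finite domains, constraints as
    predicates over listed variables, exhaustive solver."""
    def __init__(self):
        self.domains = {}
        self.constraints = []  # list of (predicate, variable names)

    def addVariables(self, names, domain):
        for name in names:
            self.domains[name] = list(domain)

    def addConstraint(self, func, variables):
        self.constraints.append((func, variables))

    def getSolutions(self):
        names = list(self.domains)
        solutions = []
        # try every complete assignment; keep those satisfying all constraints
        for values in product(*(self.domains[n] for n in names)):
            assignment = dict(zip(names, values))
            if all(f(*(assignment[v] for v in vs))
                   for f, vs in self.constraints):
                solutions.append(assignment)
        return solutions

p = Problem()
p.addVariables(["a", "b"], [1, 2])
p.addConstraint(lambda a, b: b > a, ["a", "b"])
solutions = p.getSolutions()
```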
=======================================
--- /generation/xdg_constraint/__init__.py Fri Nov 27 03:02:07 2009 UTC
+++ /dev/null
@@ -1,42 +0,0 @@
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-
-# We're auto-exporting all submodules' exports. This pollutes the
-# namespace, but it makes things easier to import and retains Gustavo
-# Niemeyer's original interface.
-
-# Import things so we can get their __all__
-import xdg_constraint.problem
-import xdg_constraint.variable
-import xdg_constraint.solvers
-import xdg_constraint.constraints
-
-# Import things so we can re-export their __all__
-from xdg_constraint.problem import *
-from xdg_constraint.variable import *
-from xdg_constraint.solvers import *
-from xdg_constraint.constraints import *
-
-__all__ = []
-__all__.extend(problem.__all__)
-__all__.extend(variable.__all__)
-__all__.extend(solvers.__all__)
-__all__.extend(constraints.__all__)
=======================================
--- /generation/xdg_constraint/constraints/__init__.py Fri Nov 27 03:25:43 2009 UTC
+++ /dev/null
@@ -1,48 +0,0 @@
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-
-# We're auto-exporting all submodules' exports. This pollutes the
-# namespace, but it makes things easier to import and retains Gustavo
-# Niemeyer's original interface.
-
-# N.B. we need to import the module so we can use its name unqualified.
-# Otherwise we can't access its __all__ for some reason
-from xdg_constraint.constraints import constraint
-from xdg_constraint.constraints import equality
-from xdg_constraint.constraints import ordering
-from xdg_constraint.constraints import set_membership
-from xdg_constraint.constraints import summation
-from xdg_constraint.constraints import xdg
-
-from xdg_constraint.constraints.constraint import *
-from xdg_constraint.constraints.equality import *
-from xdg_constraint.constraints.ordering import *
-from xdg_constraint.constraints.set_membership import *
-from xdg_constraint.constraints.summation import *
-from xdg_constraint.constraints.xdg import *
-
-__all__ = []
-__all__.extend(constraint.__all__)
-__all__.extend(equality.__all__)
-__all__.extend(ordering.__all__)
-__all__.extend(set_membership.__all__)
-__all__.extend(summation.__all__)
-__all__.extend(xdg.__all__)
=======================================
--- /generation/xdg_constraint/constraints/constraint.py Fri Nov 27 03:25:43 2009 UTC
+++ /dev/null
@@ -1,729 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@group Constraints: Constraint,
- FunctionConstraint,
- AllDifferentConstraint,
- AllEqualConstraint,
- MaxSumConstraint,
- ExactSumConstraint,
- MinSumConstraint,
- InSetConstraint,
- NotInSetConstraint,
- SomeInSetConstraint,
- SomeNotInSetConstraint,
- XDGConstraint
-"""
-from xdg_constraint.variable import Unassigned
-
-__all__ = ["Constraint", "FunctionConstraint",
- "AllDifferentConstraint", "AllEqualConstraint", "MaxSumConstraint",
- "ExactSumConstraint", "MinSumConstraint", "InSetConstraint",
- "NotInSetConstraint", "SomeInSetConstraint",
- "SomeNotInSetConstraint"]
-
-# ----------------------------------------------------------------------
-# Constraints
-# ----------------------------------------------------------------------
-class Constraint(object):
- """
- Abstract base class for constraints
- """
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- verbose=False):
- """
- Perform the constraint checking
-
- If the forwardcheck parameter is not false, besides telling if
- the constraint is currently broken or not, the constraint
- implementation may choose to hide values from the domains of
- unassigned variables to prevent them from being used, and thus
- prune the search space.
-
- @param variables: Variables affected by this constraint,
- in the same order provided by the user
- @type variables: sequence
- @param domains: Dictionary mapping variables to their domains
- @type domains: dict
- @param assignments: Dictionary mapping assigned variables
- to their current assumed value
- @type assignments: dict
- @param forwardcheck: Boolean value stating whether forward
- checking should be performed or not
- @type forwardcheck: boolean
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- @return: Boolean value stating if this constraint is currently
- broken or not
- @rtype: bool
- """#"""
- return True
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- """
- Preprocess variable domains
-
- This method is called before starting to look for solutions,
- and is used to prune domains with specific constraint logic
- when possible. For instance, any constraints with a single
- variable may be applied on all possible values and removed,
- since they may act on individual values even without further
- knowledge about other assignments.
-
- @param variables: Variables affected by this constraint,
- in the same order provided by the user
- @type variables: sequence
- @param domains: Dictionary mapping variables to their domains
- @type domains: dict
- @param constraints: List of pairs of (constraint, variables)
- @type constraints: list
- @param vconstraints: Dictionary mapping variables to a list of
- constraints affecting the given variables.
- @type vconstraints: dict
- """#"""
- if len(variables) == 1:
- variable = variables[0]
- domain = domains[variable]
- for value in domain[:]:
- if not self(variables, domains, {variable: value}):
- domain.remove(value)
- constraints.remove((self, variables))
- vconstraints[variable].remove((self, variables))
-
- def forwardCheck(self, variables, domains, assignments,
- _unassigned=Unassigned):
- """
- Helper method for generic forward checking
-
- Currently, this method acts only when there's a single
- unassigned variable.
-
- @param variables: Variables affected by this constraint,
- in the same order provided by the user
- @type variables: sequence
- @param domains: Dictionary mapping variables to their domains
- @type domains: dict
- @param assignments: Dictionary mapping assigned variables to their
- current assumed value
- @type assignments: dict
- @return: Boolean value stating if this constraint is currently
- broken or not
- @rtype: bool
- """#"""
- unassignedvariable = _unassigned
- for variable in variables:
- if variable not in assignments:
- if unassignedvariable is _unassigned:
- unassignedvariable = variable
- else:
- break
- else:
- if unassignedvariable is not _unassigned:
- # Remove from the unassigned variable domains all
- # values which break our variable's constraints.
- domain = domains[unassignedvariable]
- if domain:
- for value in domain[:]:
- assignments[unassignedvariable] = value
- if not self(variables, domains, assignments):
- domain.hideValue(value)
- del assignments[unassignedvariable]
- if not domain:
- return False
- return True
-
-
-class FunctionConstraint(Constraint):
- """
- Constraint which wraps a function defining the constraint logic
-
- Examples:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> def func(a, b):
- ... return b > a
- >>> problem.addConstraint(func, ["a", "b"])
- >>> problem.getSolution()
- {'a': 1, 'b': 2}
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> def func(a, b):
- ... return b > a
- >>> problem.addConstraint(FunctionConstraint(func), ["a", "b"])
- >>> problem.getSolution()
- {'a': 1, 'b': 2}
- """#"""
-
- def __init__(self, func, assigned=True):
- """
- @param func: Function wrapped and queried for constraint logic
- @type func: callable object
- @param assigned: Whether the function may receive unassigned
- variables or not
- @type assigned: bool
- """
- self._func = func
- self._assigned = assigned
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- _unassigned=Unassigned, verbose=False):
- parms = [assignments.get(x, _unassigned) for x in variables]
- missing = parms.count(_unassigned)
- if missing:
- return ((self._assigned or self._func(*parms)) and
- (not forwardcheck or missing != 1 or
- self.forwardCheck(variables, domains, assignments)))
- return self._func(*parms)
-
-
-class AllDifferentConstraint(Constraint):
- """
- Constraint enforcing that values of all given variables are different
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(AllDifferentConstraint())
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- _unassigned=Unassigned, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- seen = {}
- for variable in variables:
- value = assignments.get(variable, _unassigned)
- if value is not _unassigned:
- if value in seen:
- return False
- seen[value] = True
- if forwardcheck:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in seen:
- if value in domain:
- domain.hideValue(value)
- if not domain:
- return False
- return True
-
-
-class AllEqualConstraint(Constraint):
- """
- Constraint enforcing that values of all given variables are equal
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(AllEqualConstraint())
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)], [('a', 2), ('b', 2)]]
- """#"""
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- _unassigned=Unassigned, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- singlevalue = _unassigned
- for value in assignments.values():
- if singlevalue is _unassigned:
- singlevalue = value
- elif value != singlevalue:
- return False
- if forwardcheck and singlevalue is not _unassigned:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- if singlevalue not in domain:
- return False
- for value in domain[:]:
- if value != singlevalue:
- domain.hideValue(value)
- return True
-
-
-class MaxSumConstraint(Constraint):
- """
- Constraint enforcing that values of given variables sum up to at
- most a given amount
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(MaxSumConstraint(3))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)], [('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __init__(self, maxsum, multipliers=None):
- """
- @param maxsum: Value to be considered as the maximum sum
- @type maxsum: number
- @param multipliers: If given, variable values will be multiplied by
- the given factors before being summed to be checked
- @type multipliers: sequence of numbers
- """
- self._maxsum = maxsum
- self._multipliers = multipliers
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- Constraint.preProcess(self, variables, domains,
- constraints, vconstraints)
- multipliers = self._multipliers
- maxsum = self._maxsum
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- domain = domains[variable]
- for value in domain[:]:
- if value*multiplier > maxsum:
- domain.remove(value)
- else:
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value > maxsum:
- domain.remove(value)
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- multipliers = self._multipliers
- maxsum = self._maxsum
- sum = 0
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- if variable in assignments:
- sum += assignments[variable]*multiplier
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > maxsum:
- return False
- if forwardcheck:
- for variable, multiplier in zip(variables, multipliers):
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value*multiplier > maxsum:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- for variable in variables:
- if variable in assignments:
- sum += assignments[variable]
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > maxsum:
- return False
- if forwardcheck:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value > maxsum:
- domain.hideValue(value)
- if not domain:
- return False
- return True
-
-
-class ExactSumConstraint(Constraint):
- """
- Constraint enforcing that values of given variables sum exactly
- to a given amount
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(ExactSumConstraint(3))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __init__(self, exactsum, multipliers=None):
- """
- @param exactsum: Value to be considered as the exact sum
- @type exactsum: number
- @param multipliers: If given, variable values will be multiplied by
- the given factors before being summed to be checked
- @type multipliers: sequence of numbers
- """
- self._exactsum = exactsum
- self._multipliers = multipliers
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- Constraint.preProcess(self, variables, domains,
- constraints, vconstraints)
- multipliers = self._multipliers
- exactsum = self._exactsum
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- domain = domains[variable]
- for value in domain[:]:
- if value*multiplier > exactsum:
- domain.remove(value)
- else:
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value > exactsum:
- domain.remove(value)
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- multipliers = self._multipliers
- exactsum = self._exactsum
- sum = 0
- missing = False
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- if variable in assignments:
- sum += assignments[variable]*multiplier
- else:
- missing = True
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > exactsum:
- return False
- if forwardcheck and missing:
- for variable, multiplier in zip(variables, multipliers):
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value*multiplier > exactsum:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- for variable in variables:
- if variable in assignments:
- sum += assignments[variable]
- else:
- missing = True
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > exactsum:
- return False
- if forwardcheck and missing:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value > exactsum:
- domain.hideValue(value)
- if not domain:
- return False
- if missing:
- return sum <= exactsum
- else:
- return sum == exactsum
-
-
-class MinSumConstraint(Constraint):
- """
- Constraint enforcing that values of given variables sum at least
- to a given amount
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(MinSumConstraint(3))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)], [('a', 2), ('b', 2)]]
- """#"""
-
- def __init__(self, minsum, multipliers=None):
- """
- @param minsum: Value to be considered as the minimum sum
- @type minsum: number
- @param multipliers: If given, variable values will be multiplied by
- the given factors before being summed to be checked
- @type multipliers: sequence of numbers
- """
- self._minsum = minsum
- self._multipliers = multipliers
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- for variable in variables:
- if variable not in assignments:
- return True
- else:
- multipliers = self._multipliers
- minsum = self._minsum
- sum = 0
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- sum += assignments[variable]*multiplier
- else:
- for variable in variables:
- sum += assignments[variable]
- if type(sum) is float:
- sum = round(sum, 10)
- return sum >= minsum
-
-
-class InSetConstraint(Constraint):
- """
- Constraint enforcing that values of given variables are present in
- the given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(InSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)]]
- """#"""
-
- def __init__(self, set):
- """
- @param set: Set of allowed values
- @type set: set
- """
- self._set = set
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- # preProcess() will remove it.
- raise RuntimeError, "Can't happen"
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- set = self._set
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value not in set:
- domain.remove(value)
- vconstraints[variable].remove((self, variables))
- constraints.remove((self, variables))
-
-
-class NotInSetConstraint(Constraint):
- """
- Constraint enforcing that values of given variables are not present in
- the given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(NotInSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 2), ('b', 2)]]
- """#"""
-
- def __init__(self, set):
- """
- @param set: Set of disallowed values
- @type set: set
- """
- self._set = set
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- # preProcess() will remove it.
- raise RuntimeError, "Can't happen"
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- set = self._set
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value in set:
- domain.remove(value)
- vconstraints[variable].remove((self, variables))
- constraints.remove((self, variables))
-
-
-class SomeInSetConstraint(Constraint):
- """
- Constraint enforcing that at least some of the values of given
- variables must be present in a given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(SomeInSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)], [('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __init__(self, set, n=1, exact=False):
- """
- @param set: Set of values to be checked
- @type set: set
- @param n: Minimum number of assigned values that should be present
- in set (default is 1)
- @type n: int
- @param exact: Whether the number of assigned values which are
- present in set must be exactly C{n}
- @type exact: bool
- """
- self._set = set
- self._n = n
- self._exact = exact
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- set = self._set
- missing = 0
- found = 0
- for variable in variables:
- if variable in assignments:
- found += assignments[variable] in set
- else:
- missing += 1
- if missing:
- if self._exact:
- if not (found <= self._n <= missing+found):
- return False
- else:
- if self._n > missing+found:
- return False
- if forwardcheck and self._n-found == missing:
- # All unassigned variables must be assigned to
- # values in the set.
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if value not in set:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- if self._exact:
- if found != self._n:
- return False
- else:
- if found < self._n:
- return False
- return True
-
-
-class SomeNotInSetConstraint(Constraint):
- """
- Constraint enforcing that at least some of the values of given
- variables must not be present in a given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(SomeNotInSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)], [('a', 2), ('b', 2)]]
- """#"""
-
- def __init__(self, set, n=1, exact=False):
- """
- @param set: Set of values to be checked
- @type set: set
- @param n: Minimum number of assigned values that should not be present
- in set (default is 1)
- @type n: int
- @param exact: Whether the number of assigned values which are
- not present in set must be exactly C{n}
- @type exact: bool
- """
- self._set = set
- self._n = n
- self._exact = exact
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- set = self._set
- missing = 0
- found = 0
- for variable in variables:
- if variable in assignments:
- found += assignments[variable] not in set
- else:
- missing += 1
- if missing:
- if self._exact:
- if not (found <= self._n <= missing+found):
- return False
- else:
- if self._n > missing+found:
- return False
- if forwardcheck and self._n-found == missing:
- # All unassigned variables must be assigned to
- # values not in the set.
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if value in set:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- if self._exact:
- if found != self._n:
- return False
- else:
- if found < self._n:
- return False
- return True
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
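The deleted FunctionConstraint above combines a wrapped predicate with forward checking: when exactly one constrained variable is still unassigned, values that cannot satisfy the predicate are hidden from its domain. A minimal standalone sketch of that pruning step (the `forward_check` helper below is illustrative, not part of the deleted module, and uses plain list removal in place of the library's `Domain.hideValue()`):

```python
# Illustrative sketch of FunctionConstraint's forward-checking step:
# prune the lone unassigned variable's domain values that cannot
# satisfy the wrapped predicate.
def forward_check(func, variables, domains, assignments):
    """Prune the single unassigned variable's domain in place;
    return False if its domain is wiped out (a dead end)."""
    unassigned = [v for v in variables if v not in assignments]
    if len(unassigned) != 1:
        return True              # only the one-missing case is pruned here
    var = unassigned[0]
    for value in domains[var][:]:
        trial = dict(assignments, **{var: value})
        if not func(*(trial[v] for v in variables)):
            domains[var].remove(value)   # stands in for hideValue()
    return bool(domains[var])

domains = {"a": [1, 2], "b": [1, 2]}
ok = forward_check(lambda a, b: b > a, ["a", "b"], domains, {"a": 1})
```

With `a` fixed to 1 and the predicate `b > a`, only `b == 2` survives the pruning, mirroring the doctest solution `{'a': 1, 'b': 2}` above.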
=======================================
--- /generation/xdg_constraint/constraints/equality.py Fri Nov 27 03:02:07 2009 UTC
+++ /dev/null
@@ -1,110 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@group Constraints: AllDifferentConstraint,
- AllEqualConstraint
-"""
-from xdg_constraint.constraints.constraint import Constraint
-from xdg_constraint.variable import Unassigned
-
-__all__ = ["AllDifferentConstraint", "AllEqualConstraint"]
-
-# ----------------------------------------------------------------------
-# Constraints
-# ----------------------------------------------------------------------
-class AllDifferentConstraint(Constraint):
- """
- Constraint enforcing that values of all given variables are different
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(AllDifferentConstraint())
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- _unassigned=Unassigned, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- seen = {}
- for variable in variables:
- value = assignments.get(variable, _unassigned)
- if value is not _unassigned:
- if value in seen:
- return False
- seen[value] = True
- if forwardcheck:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in seen:
- if value in domain:
- domain.hideValue(value)
- if not domain:
- return False
- return True
-
-
-class AllEqualConstraint(Constraint):
- """
- Constraint enforcing that values of all given variables are equal
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(AllEqualConstraint())
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)], [('a', 2), ('b', 2)]]
- """#"""
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- _unassigned=Unassigned, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- singlevalue = _unassigned
- for value in assignments.values():
- if singlevalue is _unassigned:
- singlevalue = value
- elif value != singlevalue:
- return False
- if forwardcheck and singlevalue is not _unassigned:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- if singlevalue not in domain:
- return False
- for value in domain[:]:
- if value != singlevalue:
- domain.hideValue(value)
- return True
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
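For reference, the doctest results in the two deleted classes above can be reproduced by brute-force enumeration over the doctest domains — a standalone sketch, not the deleted solver:

```python
# Enumerate all assignments over {1, 2} and keep those where the values
# are pairwise different (AllDifferent) or all equal (AllEqual).
from itertools import product

variables = ["a", "b"]
domain = [1, 2]

def solutions(predicate):
    return sorted(
        sorted(zip(variables, combo))
        for combo in product(domain, repeat=len(variables))
        if predicate(combo)
    )

all_different = solutions(lambda vs: len(set(vs)) == len(vs))
all_equal = solutions(lambda vs: len(set(vs)) == 1)
```

`all_different` and `all_equal` match the doctest outputs shown in the two class docstrings above.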
=======================================
--- /generation/xdg_constraint/constraints/ordering.py Fri Nov 27 03:02:07 2009 UTC
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-
-from xdg_constraint.variable import Unassigned
-from xdg_constraint.constraints.constraint import Constraint
-
-__all__ = ["LessThanConstraint"]
-
-# ----------------------------------------------------------------------
-# Constraints
-# ----------------------------------------------------------------------
-class LessThanConstraint(Constraint):
- """
- Class for binary strict ordering constraints.
- """
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- verbose=False):
- """
- Perform the constraint checking
- """
- if len(variables) != 2:
- raise Exception("LessThanConstraint only supports two variables")
-
- if variables[0] in assignments:
- value0 = assignments[variables[0]]
- if variables[1] in assignments:
- value1 = assignments[variables[1]]
- return value0 < value1
- else:
- for value1 in domains[variables[1]][:]:
- if value0 < value1:
- return True
- elif forwardcheck:
- domains[variables[1]].hideValue(value1)
- return False
- else:
- if variables[1] in assignments:
- value1 = assignments[variables[1]]
- for value0 in domains[variables[0]][:]:
- if value0 < value1:
- return True
- elif forwardcheck:
- domains[variables[0]].hideValue(value0)
- return False
- else:
- # Ensured by preProcess(), but see the bug note
- return True
-
- # BUG: this won't get *everything* because the Solvers don't run preProcess to convergence
- def preProcess(self, variables, domains, constraints, vconstraints):
- """
- Preprocess variable domains
- """
- if len(variables) != 2:
- raise Exception("LessThanConstraint only supports two variables")
- domain0 = domains[variables[0]]
- domain1 = domains[variables[1]]
-
- # Prune impossible assignments
- minimum0 = min(domain0)
- # BUG: can't use filter() because that makes it into a list
- for value1 in domain1[:]:
- if not minimum0 < value1:
- # print "Removing " + repr(value1) + " from " + repr(variables[1])
- domain1.remove(value1)
-
- if not domain1:
- return
- maximum1 = max(domain1)
- # BUG: can't use filter() because that makes it into a list
- for value0 in domain0[:]:
- if not value0 < maximum1:
- # print "Removing " + repr(value0) + " from " + repr(variables[0])
- domain0.remove(value0)
-
-
- # Remove this constraint if it will always be satisfied
- if not domain0:
- return
- maximum0 = max(domain0)
- minimum1 = min(domain1)
- if maximum0 < minimum1:
- # print "Removing constraint " + repr(variables[0]) + " < " + repr(variables[1])
- constraints.remove((self, variables))
- for variable in variables:
- vconstraints[variable].remove((self, variables))
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
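The `preProcess()` pruning removed above amounts to two passes: drop values of the second variable not greater than the minimum of the first variable's domain, then drop values of the first not smaller than the maximum of the surviving second domain. A standalone sketch (the function name is illustrative, not from the deleted module):

```python
# Domain pruning for a binary x < y constraint, mirroring the deleted
# LessThanConstraint.preProcess() logic on plain lists.
def prune_less_than(domain0, domain1):
    minimum0 = min(domain0)
    # y must exceed the smallest possible x
    domain1 = [v for v in domain1 if minimum0 < v]
    if domain1:
        maximum1 = max(domain1)
        # x must be below the largest surviving y
        domain0 = [v for v in domain0 if v < maximum1]
    return domain0, domain1

d0, d1 = prune_less_than([1, 2, 3], [1, 2, 3])
```

For two copies of {1, 2, 3}, the pruning leaves {1, 2} for the smaller variable and {2, 3} for the larger one; as the BUG note above observes, a single pass like this does not prune to convergence.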
=======================================
--- /generation/xdg_constraint/constraints/set_membership.py Fri Nov 27 03:02:07 2009 UTC
+++ /dev/null
@@ -1,259 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@group Constraints: InSetConstraint,
- NotInSetConstraint,
- SomeInSetConstraint,
- SomeNotInSetConstraint
-"""
-from xdg_constraint.constraints.constraint import Constraint
-
-__all__ = ["InSetConstraint", "NotInSetConstraint",
- "SomeInSetConstraint", "SomeNotInSetConstraint"]
-
-# ----------------------------------------------------------------------
-# Constraints
-# ----------------------------------------------------------------------
-class InSetConstraint(Constraint):
- """
- Constraint enforcing that values of given variables are present in
- the given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(InSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)]]
- """#"""
-
- def __init__(self, set):
- """
- @param set: Set of allowed values
- @type set: set
- """
- self._set = set
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- # preProcess() will remove it.
- raise RuntimeError, "Can't happen"
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- set = self._set
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value not in set:
- domain.remove(value)
- vconstraints[variable].remove((self, variables))
- constraints.remove((self, variables))
-
-
-class NotInSetConstraint(Constraint):
- """
- Constraint enforcing that values of given variables are not present in
- the given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(NotInSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 2), ('b', 2)]]
- """#"""
-
- def __init__(self, set):
- """
- @param set: Set of disallowed values
- @type set: set
- """
- self._set = set
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- # preProcess() will remove it.
- raise RuntimeError, "Can't happen"
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- set = self._set
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value in set:
- domain.remove(value)
- vconstraints[variable].remove((self, variables))
- constraints.remove((self, variables))
-
-
-class SomeInSetConstraint(Constraint):
- """
- Constraint enforcing that at least some of the values of given
- variables must be present in a given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(SomeInSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)], [('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __init__(self, set, n=1, exact=False):
- """
- @param set: Set of values to be checked
- @type set: set
- @param n: Minimum number of assigned values that should be present
- in set (default is 1)
- @type n: int
- @param exact: Whether the number of assigned values which are
- present in set must be exactly C{n}
- @type exact: bool
- """
- self._set = set
- self._n = n
- self._exact = exact
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- set = self._set
- missing = 0
- found = 0
- for variable in variables:
- if variable in assignments:
- found += assignments[variable] in set
- else:
- missing += 1
- if missing:
- if self._exact:
- if not (found <= self._n <= missing+found):
- return False
- else:
- if self._n > missing+found:
- return False
- if forwardcheck and self._n-found == missing:
- # All unassigned variables must be assigned to
- # values in the set.
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if value not in set:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- if self._exact:
- if found != self._n:
- return False
- else:
- if found < self._n:
- return False
- return True
-
-
-class SomeNotInSetConstraint(Constraint):
- """
- Constraint enforcing that at least some of the values of given
- variables must not be present in a given set
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(SomeNotInSetConstraint([1]))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)], [('a', 2), ('b', 2)]]
- """#"""
-
- def __init__(self, set, n=1, exact=False):
- """
- @param set: Set of values to be checked
- @type set: set
- @param n: Minimum number of assigned values that should not be present
- in set (default is 1)
- @type n: int
- @param exact: Whether the number of assigned values which are
- not present in set must be exactly C{n}
- @type exact: bool
- """
- self._set = set
- self._n = n
- self._exact = exact
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- set = self._set
- missing = 0
- found = 0
- for variable in variables:
- if variable in assignments:
- found += assignments[variable] not in set
- else:
- missing += 1
- if missing:
- if self._exact:
- if not (found <= self._n <= missing+found):
- return False
- else:
- if self._n > missing+found:
- return False
- if forwardcheck and self._n-found == missing:
- # All unassigned variables must be assigned to
- # values not in the set.
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if value in set:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- if self._exact:
- if found != self._n:
- return False
- else:
- if found < self._n:
- return False
- return True
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
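The feasibility test shared by the two deleted `SomeInSet`/`SomeNotInSet` classes above reduces to a bound check on `found` (assigned values matching the membership test) and `missing` (unassigned variables). A standalone sketch, using `None` to mark unassigned values (names are illustrative):

```python
# Feasibility check for "at least / exactly n values in a set", as in the
# deleted SomeInSetConstraint: a partial assignment survives iff n is
# still reachable, and a complete one iff n is actually met.
def some_in_set_viable(values, allowed, n=1, exact=False):
    found = sum(1 for v in values if v is not None and v in allowed)
    missing = sum(1 for v in values if v is None)   # None = unassigned
    if missing:
        if exact:
            return found <= n <= missing + found
        return n <= missing + found
    return found == n if exact else found >= n
```

For example, `[1, None]` against the set `{1}` is still viable for `n=1`, while the complete assignment `[2, 2]` is not; `SomeNotInSetConstraint` applies the same bound with the membership test negated.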
=======================================
--- /generation/xdg_constraint/constraints/summation.py Fri Nov 27 03:02:07 2009 UTC
+++ /dev/null
@@ -1,270 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@group Constraints: MaxSumConstraint,
- ExactSumConstraint,
- MinSumConstraint
-"""
-from xdg_constraint.constraints.constraint import Constraint
-
-__all__ = ["MaxSumConstraint", "ExactSumConstraint", "MinSumConstraint"]
-
-# ----------------------------------------------------------------------
-# Constraints
-# ----------------------------------------------------------------------
-class MaxSumConstraint(Constraint):
- """
- Constraint enforcing that values of given variables sum up to at
- most a given amount
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(MaxSumConstraint(3))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 1)], [('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __init__(self, maxsum, multipliers=None):
- """
- @param maxsum: Value to be considered as the maximum sum
- @type maxsum: number
- @param multipliers: If given, variable values will be multiplied by
- the given factors before being summed to be checked
- @type multipliers: sequence of numbers
- """
- self._maxsum = maxsum
- self._multipliers = multipliers
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- Constraint.preProcess(self, variables, domains,
- constraints, vconstraints)
- multipliers = self._multipliers
- maxsum = self._maxsum
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- domain = domains[variable]
- for value in domain[:]:
- if value*multiplier > maxsum:
- domain.remove(value)
- else:
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value > maxsum:
- domain.remove(value)
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- multipliers = self._multipliers
- maxsum = self._maxsum
- sum = 0
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- if variable in assignments:
- sum += assignments[variable]*multiplier
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > maxsum:
- return False
- if forwardcheck:
- for variable, multiplier in zip(variables, multipliers):
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value*multiplier > maxsum:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- for variable in variables:
- if variable in assignments:
- sum += assignments[variable]
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > maxsum:
- return False
- if forwardcheck:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value > maxsum:
- domain.hideValue(value)
- if not domain:
- return False
- return True
-
-
-class ExactSumConstraint(Constraint):
- """
- Constraint enforcing that values of given variables sum exactly
- to a given amount
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(ExactSumConstraint(3))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)]]
- """#"""
-
- def __init__(self, exactsum, multipliers=None):
- """
- @param exactsum: Value to be considered as the exact sum
- @type exactsum: number
- @param multipliers: If given, variable values will be multiplied by
-                            the given factors before being summed to be checked
- @type multipliers: sequence of numbers
- """
- self._exactsum = exactsum
- self._multipliers = multipliers
-
- def preProcess(self, variables, domains, constraints, vconstraints):
- Constraint.preProcess(self, variables, domains,
- constraints, vconstraints)
- multipliers = self._multipliers
- exactsum = self._exactsum
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- domain = domains[variable]
- for value in domain[:]:
- if value*multiplier > exactsum:
- domain.remove(value)
- else:
- for variable in variables:
- domain = domains[variable]
- for value in domain[:]:
- if value > exactsum:
- domain.remove(value)
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- multipliers = self._multipliers
- exactsum = self._exactsum
- sum = 0
- missing = False
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- if variable in assignments:
- sum += assignments[variable]*multiplier
- else:
- missing = True
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > exactsum:
- return False
- if forwardcheck and missing:
- for variable, multiplier in zip(variables, multipliers):
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value*multiplier > exactsum:
- domain.hideValue(value)
- if not domain:
- return False
- else:
- for variable in variables:
- if variable in assignments:
- sum += assignments[variable]
- else:
- missing = True
- if type(sum) is float:
- sum = round(sum, 10)
- if sum > exactsum:
- return False
- if forwardcheck and missing:
- for variable in variables:
- if variable not in assignments:
- domain = domains[variable]
- for value in domain[:]:
- if sum+value > exactsum:
- domain.hideValue(value)
- if not domain:
- return False
- if missing:
- return sum <= exactsum
- else:
- return sum == exactsum
-
-
-class MinSumConstraint(Constraint):
- """
- Constraint enforcing that values of given variables sum at least
- to a given amount
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2])
- >>> problem.addConstraint(MinSumConstraint(3))
- >>> sorted(sorted(x.items()) for x in problem.getSolutions())
- [[('a', 1), ('b', 2)], [('a', 2), ('b', 1)], [('a', 2), ('b', 2)]]
- """#"""
-
- def __init__(self, minsum, multipliers=None):
- """
- @param minsum: Value to be considered as the minimum sum
- @type minsum: number
- @param multipliers: If given, variable values will be multiplied by
-                            the given factors before being summed to be checked
- @type multipliers: sequence of numbers
- """
- self._minsum = minsum
- self._multipliers = multipliers
-
- def __call__(self, variables, domains, assignments,
- forwardcheck=False, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- for variable in variables:
- if variable not in assignments:
- return True
- else:
- multipliers = self._multipliers
- minsum = self._minsum
- sum = 0
- if multipliers:
- for variable, multiplier in zip(variables, multipliers):
- sum += assignments[variable]*multiplier
- else:
- for variable in variables:
- sum += assignments[variable]
- if type(sum) is float:
- sum = round(sum, 10)
- return sum >= minsum
-
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
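The three sum constraints deleted above share the semantics their doctests verify. They can be reproduced independently with a brute-force enumeration (a standalone sketch; `solutions` is an illustrative helper, not part of the deleted API):

```python
# Brute-force check of the deleted sum constraints' semantics: enumerate
# every assignment over the domain and keep those satisfying the predicate.
from itertools import product

def solutions(variables, domain, predicate):
    """All satisfying assignments, as sorted item lists (doctest format)."""
    found = []
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if predicate(assignment):
            found.append(sorted(assignment.items()))
    return sorted(found)

domain = [1, 2]
max_sols = solutions(["a", "b"], domain, lambda s: sum(s.values()) <= 3)
exact_sols = solutions(["a", "b"], domain, lambda s: sum(s.values()) == 3)
min_sols = solutions(["a", "b"], domain, lambda s: sum(s.values()) >= 3)
```

These reproduce the doctest outputs of MaxSumConstraint, ExactSumConstraint, and MinSumConstraint respectively; the real classes additionally prune domains in `preProcess` and during forward checking.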
=======================================
--- /generation/xdg_constraint/constraints/xdg.py Fri Nov 27 03:27:44 2009
UTC
+++ /dev/null
@@ -1,69 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@group Constraints: XDGConstraint
-"""
-from xdg_constraint.constraints.constraint import Constraint, FunctionConstraint
-from xdg_constraint.variable import Unassigned
-
-__all__ = ["XDGConstraint"]
-
-# ----------------------------------------------------------------------
-# Constraints
-# ----------------------------------------------------------------------
-class XDGConstraint(FunctionConstraint):
-    """Subclass of FunctionConstraint allowing messages when constraint is called."""
-
- # Number of constraint calls
- calls = 0
-
- def __init__(self, func, assigned=True, name=''):
- FunctionConstraint.__init__(self, func, assigned=assigned)
- # Assign a name to the constraint
-
-        self.name = name
-
- def __call__(self, variables, domains, assignments, forwardcheck=False,
- _unassigned=Unassigned, verbose=0):
- """
- @param verbose: whether to print out info about constraint
- @type verbose: boolean
- """
- if verbose:
- # Increment counter
- XDGConstraint.calls += 1
- if verbose > 1:
-                # Print out some useful stuff about the constraint, including what it returns
- assg = [assignments.get(x, _unassigned) for x in variables]
- if not assg.count(_unassigned):
- print ' Returned ', self._func(*assg)
- else:
- print ' No value for some variable'
-        return FunctionConstraint.__call__(self, variables, domains, assignments,
-                                           forwardcheck=forwardcheck,
-                                           _unassigned=_unassigned)
-
- def __str__(self):
-        return 'XDGConstraint ' + self.name
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
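XDGConstraint's only additions over FunctionConstraint are a class-level call counter, a printable name, and verbose reporting. That wrapper pattern can be sketched standalone (illustrative names, not the deleted class itself):

```python
# Minimal sketch of the XDGConstraint wrapper pattern: decorate a predicate
# with a shared call counter and a name, delegating the actual constraint
# check to the wrapped function.
class CountingConstraint:
    calls = 0  # class-level counter shared by all instances (cf. XDGConstraint.calls)

    def __init__(self, func, name=""):
        self._func = func
        self.name = name

    def __call__(self, *args):
        CountingConstraint.calls += 1  # count every constraint evaluation
        return self._func(*args)

    def __str__(self):
        return "CountingConstraint " + self.name

ordered = CountingConstraint(lambda a, b: b > a, name="b>a")
results = [ordered(1, 2), ordered(2, 2)]
```

Counting on the class rather than the instance gives a global tally across all constraints, which is what makes `XDGConstraint.calls` useful for profiling a whole search.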
=======================================
--- /generation/xdg_constraint/problem.py Fri Nov 27 11:35:07 2009 UTC
+++ /dev/null
@@ -1,367 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@sort: Problem
-"""
-import copy
-from xdg_constraint.solvers import BacktrackingSolver
-from xdg_constraint.constraints import Constraint, FunctionConstraint
-from xdg_constraint.variable import Domain
-
-__all__ = ["Problem"]#, "Variable", "Domain", "Unassigned"]
-
-# ----------------------------------------------------------------------
-# Problems
-# ----------------------------------------------------------------------
-class Problem(object):
- """
- Class used to define a problem and retrieve solutions. A problem
- consists of (1) a set of variables each of which is associated
- with a domain of possible values, (2) a set of constraints
- relating variables to each other, or just to themselves, and
- (3) a solver for finding complete assignments of variables which
- satisfy the constraints.
- """
-
- def __init__(self, solver=None):
- """
- @param solver: Problem solver used to find solutions
- (default is L{BacktrackingSolver})
- @type solver: instance of a L{Solver} subclass
- """
- self._solver = solver or BacktrackingSolver()
- self._constraints = []
- self._variables = {}
-
- def reset(self):
- """
- Reset the current problem definition
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariable("a", [1, 2])
- >>> problem.reset()
- >>> problem.getSolution()
- >>>
- """
- del self._constraints[:]
- self._variables.clear()
-
- def setSolver(self, solver):
- """
- Change the problem solver currently in use
-
- Example:
-
- >>> solver = BacktrackingSolver()
- >>> problem = Problem(solver)
- >>> problem.getSolver() is solver
- True
-
- @param solver: New problem solver
- @type solver: instance of a C{Solver} subclass
- """
- self._solver = solver
-
- def getSolver(self):
- """
- Obtain the problem solver currently in use
-
- Example:
-
- >>> solver = BacktrackingSolver()
- >>> problem = Problem(solver)
- >>> problem.getSolver() is solver
- True
-
- @return: Solver currently in use
- @rtype: instance of a L{Solver} subclass
- """
- return self._solver
-
- def addVariable(self, variable, domain):
- """
- Add a variable to the problem
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariable("a", [1, 2])
- >>> problem.getSolution() in ({'a': 1}, {'a': 2})
- True
-
- @param variable: Object representing a problem variable
- @type variable: hashable object
- @param domain: Set of items defining the possible values that
- the given variable may assume
- @type domain: list, tuple, or instance of C{Domain}
- """
- if variable in self._variables:
- raise ValueError, "Tried to insert duplicated variable %s" % \
- repr(variable)
- if type(domain) in (list, tuple):
- domain = Domain(domain)
- elif isinstance(domain, Domain):
- domain = copy.copy(domain)
- else:
- raise TypeError, "Domains must be instances of subclasses of "\
- "the Domain class"
- if not domain:
- raise ValueError, "Domain is empty"
- self._variables[variable] = domain
-
- def addVariables(self, variables, domain):
- """
- Add one or more variables to the problem
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2, 3])
- >>> solutions = problem.getSolutions()
- >>> len(solutions)
- 9
- >>> {'a': 3, 'b': 1} in solutions
- True
-
-        @param variables: Any object containing a sequence of objects
-                          representing problem variables
- @type variables: sequence of hashable objects
- @param domain: Set of items defining the possible values that
- the given variables may assume
- @type domain: list, tuple, or instance of C{Domain}
- """
- for variable in variables:
- self.addVariable(variable, domain)
-
- def addConstraint(self, constraint, variables=None):
- """
- Add a constraint to the problem
-
- Example:
-
- >>> problem = Problem()
- >>> problem.addVariables(["a", "b"], [1, 2, 3])
- >>> problem.addConstraint(lambda a, b: b == a+1, ["a", "b"])
- >>> solutions = problem.getSolutions()
- >>>
-
- @param constraint: Constraint to be included in the problem
-        @type constraint: instance of a L{Constraint} subclass or a
- function to be wrapped by L{FunctionConstraint}
- @param variables: Variables affected by the constraint (default to
- all variables). Depending on the constraint type
- the order may be important.
- @type variables: set or sequence of variables
- """
- if not isinstance(constraint, Constraint):
- if callable(constraint):
- constraint = FunctionConstraint(constraint)
- else:
- raise ValueError, "Constraints must be instances of "\
- "subclasses of the Constraint class"
- self._constraints.append((constraint, variables))
-
- def getSolution(self, verbose=False):
- """
- Find and return a solution to the problem
-
- Example:
-
- >>> problem = Problem()
- >>> problem.getSolution() is None
- True
- >>> problem.addVariables(["a"], [42])
- >>> problem.getSolution()
- {'a': 42}
-
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- @return: Solution for the problem
- @rtype: dictionary mapping variables to values
- """
- domains, constraints, vconstraints = self._getArgs()
- if not domains:
- return None
- return self._solver.getSolution(domains, constraints, vconstraints,
- verbose=verbose)
-
- def getSolutions(self, verbose=False):
- """
- Find and return all solutions to the problem
-
- Example:
-
- >>> problem = Problem()
- >>> problem.getSolutions() == []
- True
- >>> problem.addVariables(["a"], [42])
- >>> problem.getSolutions()
- [{'a': 42}]
-
-        @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- @return: All solutions for the problem
- @rtype: list of dictionaries mapping variables to values
- """
- domains, constraints, vconstraints = self._getArgs()
- if not domains:
- return []
-        return self._solver.getSolutions(domains, constraints, vconstraints,
- verbose=verbose)
-
- def getSolutionIter(self, verbose=False):
- """
- Return an iterator to the solutions of the problem
-
- Example:
-
- >>> problem = Problem()
- >>> list(problem.getSolutionIter()) == []
- True
- >>> problem.addVariables(["a"], [42])
- >>> iter = problem.getSolutionIter()
- >>> iter.next()
- {'a': 42}
- >>> iter.next()
- Traceback (most recent call last):
- File "<stdin>", line 1, in ?
- StopIteration
-
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- """
- domains, constraints, vconstraints = self._getArgs()
- if not domains:
- return iter(())
- return self._solver.getSolutionIter(domains, constraints,
- vconstraints, verbose=verbose)
-
- def _getArgs(self):
- """
- Returns a triple of:
- (1) map from variables to domains;
- (2) set of constraint/constrained-variables pairs,
- where constrained-variables is a set of variables;
- (3) map from variables to subsets of (2), where the variable
- appears in all constrained-variables of its subset.
- """
- domains = self._variables.copy()
-
- allvariables = domains.keys()
- constraints = []
- for constraint, variables in self._constraints:
- if not variables:
- variables = allvariables
- constraints.append((constraint, variables))
-
- vconstraints = {}
- for variable in domains:
- vconstraints[variable] = []
- for constraint, variables in constraints:
- for variable in variables:
- vconstraints[variable].append((constraint, variables))
-
- for constraint, variables in constraints[:]:
- constraint.preProcess(variables, domains,
- constraints, vconstraints)
- for var, domain in domains.iteritems():
- domain.resetState()
- if not domain:
- print 'EMPTY DOMAIN FOR VARIABLE '+var+' !'
- return None, None, None
- #doArc8(getArcs(domains, constraints), domains, {})
- return domains, constraints, vconstraints
-
-
-def getArcs(domains, constraints):
- """
- Return a dictionary mapping pairs (arcs) of constrained variables
-
- @attention: Currently unused.
- """
- arcs = {}
- for x in constraints:
- constraint, variables = x
- if len(variables) == 2:
- variable1, variable2 = variables
- arcs.setdefault(variable1, {})\
- .setdefault(variable2, [])\
- .append(x)
- arcs.setdefault(variable2, {})\
- .setdefault(variable1, [])\
- .append(x)
- return arcs
-
-
-def doArc8(arcs, domains, assignments):
- """
- Perform the ARC-8 arc checking algorithm and prune domains
-
- @attention: Currently unused.
- """
- check = dict.fromkeys(domains, True)
- while check:
- variable, _ = check.popitem()
- if variable not in arcs or variable in assignments:
- continue
- domain = domains[variable]
- arcsvariable = arcs[variable]
- for othervariable in arcsvariable:
- arcconstraints = arcsvariable[othervariable]
- if othervariable in assignments:
- otherdomain = [assignments[othervariable]]
- else:
- otherdomain = domains[othervariable]
- if domain:
- changed = False
- for value in domain[:]:
- assignments[variable] = value
- if otherdomain:
- for othervalue in otherdomain:
- assignments[othervariable] = othervalue
- for constraint, variables in arcconstraints:
- if not constraint(variables, domains,
- assignments, True):
- break
- else:
- # All constraints passed. Value is safe.
- break
- else:
- # All othervalues failed. Kill value.
- domain.hideValue(value)
- changed = True
- del assignments[othervariable]
- del assignments[variable]
- #if changed:
- # check.update(dict.fromkeys(arcsvariable))
- if not domain:
- return False
- return True
-
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
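`_getArgs` builds the `vconstraints` index the solvers rely on: each variable maps to every (constraint, variables) pair that mentions it, and a constraint registered without an explicit variable list applies to all variables. That indexing step can be sketched in isolation (helper name is illustrative):

```python
# Sketch of the vconstraints index built by Problem._getArgs: map each
# variable to the (constraint, variables) pairs that constrain it.
def build_vconstraints(all_variables, constraints):
    vconstraints = {v: [] for v in all_variables}
    for constraint, cvars in constraints:
        if not cvars:                      # no explicit list: constrain everything
            cvars = list(all_variables)
        for v in cvars:
            vconstraints[v].append((constraint, cvars))
    return vconstraints

successor = lambda a, b: b == a + 1
vc = build_vconstraints(["a", "b", "c"], [(successor, ["a", "b"])])
```

Here `vc["c"]` stays empty: a variable untouched by any constraint contributes only its domain, while `len(vconstraints[v])` is exactly the degree the solvers use for variable ordering.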
=======================================
--- /generation/xdg_constraint/solvers/__init__.py Fri Nov 27 03:02:07 2009
UTC
+++ /dev/null
@@ -1,32 +0,0 @@
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-
-# We're auto-exporting all submodules' exports. This pollutes the
-# namespace, but it makes things easier to import and retains Gustavo
-# Niemeyer's original interface.
-
-# N.B. we need to import the module so we can use its name unqualified.
-# Otherwise we can't access its __all__ for some reason
-from xdg_constraint.solvers import solver
-from xdg_constraint.solvers.solver import *
-
-__all__ = []
-__all__.extend(solver.__all__)
=======================================
--- /generation/xdg_constraint/solvers/solver.py Fri Nov 27 03:25:43 2009
UTC
+++ /dev/null
@@ -1,441 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@group Solvers: Solver,
- BacktrackingSolver,
- RecursiveBacktrackingSolver,
- MinConflictsSolver
-"""
-import random
-
-__all__ = ["Solver", "BacktrackingSolver", "RecursiveBacktrackingSolver",
- "MinConflictsSolver"]
-
-# ----------------------------------------------------------------------
-# Solvers
-# ----------------------------------------------------------------------
-class Solver(object):
- """
- Abstract base class for solvers
-
- @sort: getSolution, getSolutions, getSolutionIter
- """
-
- def getSolution(self, domains, constraints, vconstraints,
- verbose=False):
- """
- Return one solution for the given problem
-
- @param domains: Dictionary mapping variables to their domains
- @type domains: dict
- @param constraints: List of pairs of (constraint, variables)
- @type constraints: list
- @param vconstraints: Dictionary mapping variables to a list of
- constraints affecting the given variables.
- @type vconstraints: dict
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- """
- raise NotImplementedError, \
- "%s is an abstract class" % self.__class__.__name__
-
- def getSolutions(self, domains, constraints, vconstraints,
- verbose=False):
- """
- Return all solutions for the given problem
-
- @param domains: Dictionary mapping variables to domains
- @type domains: dict
- @param constraints: List of pairs of (constraint, variables)
- @type constraints: list
- @param vconstraints: Dictionary mapping variables to a list of
- constraints affecting the given variables.
- @type vconstraints: dict
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- """
- raise NotImplementedError, \
-              "%s provides only a single solution" % self.__class__.__name__
-
- def getSolutionIter(self, domains, constraints, vconstraints,
- verbose=False):
- """
- Return an iterator for the solutions of the given problem
-
- @param domains: Dictionary mapping variables to domains
- @type domains: dict
- @param constraints: List of pairs of (constraint, variables)
- @type constraints: list
- @param vconstraints: Dictionary mapping variables to a list of
- constraints affecting the given variables.
- @type vconstraints: dict
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- """
- raise NotImplementedError, \
- "%s doesn't provide iteration" % self.__class__.__name__
-
-
-class BacktrackingSolver(Solver):
- """
- Problem solver with backtracking capabilities
-
- Examples:
-
- >>> result = [[('a', 1), ('b', 2)],
- ... [('a', 1), ('b', 3)],
- ... [('a', 2), ('b', 3)]]
-
- >>> problem = Problem(BacktrackingSolver())
- >>> problem.addVariables(["a", "b"], [1, 2, 3])
- >>> problem.addConstraint(lambda a, b: b > a, ["a", "b"])
-
- >>> solution = problem.getSolution()
- >>> sorted(solution.items()) in result
- True
-
- >>> for solution in problem.getSolutionIter():
- ... sorted(solution.items()) in result
- True
- True
- True
-
- >>> for solution in problem.getSolutions():
- ... sorted(solution.items()) in result
- True
- True
- True
- """#"""
-
- def __init__(self, forwardcheck=True):
- """
-        @param forwardcheck: If false forward checking will not be requested
- to constraints while looking for solutions
- (default is true)
- @type forwardcheck: bool
- """
- self._forwardcheck = forwardcheck
-
- def getSolutionIter(self, domains, constraints, vconstraints,
- verbose=False):
- """
- Return an iterator for the solutions of the given problem
-
- @param domains: Dictionary mapping variables to domains
- @type domains: dict
- @param constraints: List of pairs of (constraint, variables)
- @type constraints: list
- @param vconstraints: Dictionary mapping variables to a list of
- constraints affecting the given variables.
- @type vconstraints: dict
- @param verbose: whether to print out constraint info (added by MG)
- @type verbose: boolean
- """
- forwardcheck = self._forwardcheck
- assignments = {}
-
- queue = []
-
- while True:
-            # Constrain the order in which next unassigned variable is selected
-            # Mix the Degree and Minimum Remaining Values (MRV) heuristics
- lst = [(-len(vconstraints[variable]), # Degree
- len(domains[variable]), # MRV
- variable) for variable in domains]
- lst.sort()
- for item in lst:
- # the last item is variable
- if item[-1] not in assignments:
- # Found unassigned variable
- variable = item[-1]
- values = domains[variable][:]
- if forwardcheck:
-                        # list of domains of all unassigned variables other than variable
-                        pushdomains = [domains[x] for x in domains
-                                       if x not in assignments and
-                                       x != variable]
- else:
- pushdomains = None
- # Go on after the first unassigned variable is found
- break
- else:
- # No unassigned variables. We've got a solution. Go back
- # to last variable, if there's one.
- yield assignments.copy()
- if not queue:
- return
- variable, values, pushdomains = queue.pop()
- if pushdomains:
- for domain in pushdomains:
- domain.popState()
-
- while True:
- # We have a variable. Do we have any values left?
- if not values:
- # No. Go back to last variable, if there's one.
- del assignments[variable]
- while queue:
- variable, values, pushdomains = queue.pop()
- if pushdomains:
- for domain in pushdomains:
- domain.popState()
- if values:
- break
- del assignments[variable]
- else:
- return
-
- # Got a value. Check it.
- assignments[variable] = values.pop()
-
- if pushdomains:
- for domain in pushdomains:
- domain.pushState()
-
- if verbose > 1:
-                    print 'Running constraints for new var=val:', variable, assignments[variable]
- for constraint, variables in vconstraints[variable]:
- if verbose > 1:
-                        print 'Trying constraint', constraint, 'vars', variables
- print ' assg', assignments
- print ' domains', domains
- if not constraint(variables, domains, assignments,
- pushdomains, verbose=verbose):
- # Value is not good.
- break
- else:
- break
-
- if pushdomains:
- for domain in pushdomains:
- domain.popState()
-
- # Push state before looking for next variable.
- queue.append((variable, values, pushdomains))
-
- raise RuntimeError, "Can't happen"
-
- def getSolution(self, domains, constraints, vconstraints,
- verbose=False):
- iter = self.getSolutionIter(domains, constraints, vconstraints,
- verbose=verbose)
- try:
- return iter.next()
- except StopIteration:
- return None
-
- def getSolutions(self, domains, constraints, vconstraints,
- verbose=False):
-        return list(self.getSolutionIter(domains, constraints, vconstraints,
- verbose=verbose))
-
-
-class RecursiveBacktrackingSolver(Solver):
- """
- Recursive problem solver with backtracking capabilities
-
- Examples:
-
- >>> result = [[('a', 1), ('b', 2)],
- ... [('a', 1), ('b', 3)],
- ... [('a', 2), ('b', 3)]]
-
- >>> problem = Problem(RecursiveBacktrackingSolver())
- >>> problem.addVariables(["a", "b"], [1, 2, 3])
- >>> problem.addConstraint(lambda a, b: b > a, ["a", "b"])
-
- >>> solution = problem.getSolution()
- >>> sorted(solution.items()) in result
- True
-
- >>> for solution in problem.getSolutions():
- ... sorted(solution.items()) in result
- True
- True
- True
-
- >>> problem.getSolutionIter()
- Traceback (most recent call last):
- ...
-    NotImplementedError: RecursiveBacktrackingSolver doesn't provide iteration
- """#"""
-
- def __init__(self, forwardcheck=True):
- """
-        @param forwardcheck: If false forward checking will not be requested
- to constraints while looking for solutions
- (default is true)
- @type forwardcheck: bool
- """
- self._forwardcheck = forwardcheck
-
-
- def recursiveBacktracking(self, solutions, domains, vconstraints,
- assignments, single):
-
-        # Mix the Degree and Minimum Remaining Values (MRV) heuristics
- lst = [(-len(vconstraints[variable]),
- len(domains[variable]), variable) for variable in domains]
- lst.sort()
- for item in lst:
- if item[-1] not in assignments:
- # Found an unassigned variable. Let's go.
- break
- else:
- # No unassigned variables. We've got a solution.
- solutions.append(assignments.copy())
- return solutions
-
- variable = item[-1]
- assignments[variable] = None
-
- forwardcheck = self._forwardcheck
- if forwardcheck:
-            pushdomains = [domains[x] for x in domains if x not in assignments]
- else:
- pushdomains = None
-
- for value in domains[variable]:
- assignments[variable] = value
- if pushdomains:
- for domain in pushdomains:
- domain.pushState()
- for constraint, variables in vconstraints[variable]:
- if not constraint(variables, domains, assignments,
- pushdomains):
- # Value is not good.
- break
- else:
- # Value is good. Recurse and get next variable.
-                self.recursiveBacktracking(solutions, domains, vconstraints,
- assignments, single)
- if solutions and single:
- return solutions
- if pushdomains:
- for domain in pushdomains:
- domain.popState()
- del assignments[variable]
- return solutions
-
-
-    def getSolution(self, domains, constraints, vconstraints, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- solutions = self.recursiveBacktracking([], domains, vconstraints,
- {}, True)
- return solutions and solutions[0] or None
-
-
-    def getSolutions(self, domains, constraints, vconstraints, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- return self.recursiveBacktracking([], domains, vconstraints,
- {}, False)
-
-
-class MinConflictsSolver(Solver):
- """
- Problem solver based on the minimum conflicts theory
-
- Examples:
-
- >>> result = [[('a', 1), ('b', 2)],
- ... [('a', 1), ('b', 3)],
- ... [('a', 2), ('b', 3)]]
-
- >>> problem = Problem(MinConflictsSolver())
- >>> problem.addVariables(["a", "b"], [1, 2, 3])
- >>> problem.addConstraint(lambda a, b: b > a, ["a", "b"])
-
- >>> solution = problem.getSolution()
- >>> sorted(solution.items()) in result
- True
-
- >>> problem.getSolutions()
- Traceback (most recent call last):
- ...
- NotImplementedError: MinConflictsSolver provides only a single solution
-
- >>> problem.getSolutionIter()
- Traceback (most recent call last):
- ...
- NotImplementedError: MinConflictsSolver doesn't provide iteration
- """#"""
-
- def __init__(self, steps=1000):
- """
- @param steps: Maximum number of steps to perform before giving up
- when looking for a solution (default is 1000)
- @type steps: int
- """
- self._steps = steps
-
-
-    def getSolution(self, domains, constraints, vconstraints, verbose=False):
- """
- @attention: the verbose parameter is ignored
- """
- assignments = {}
- # Initial assignment
- for variable in domains:
- assignments[variable] = random.choice(domains[variable])
- for _ in xrange(self._steps):
- conflicted = False
- lst = domains.keys()
- random.shuffle(lst)
- for variable in lst:
- # Check if variable is not in conflict
- for constraint, variables in vconstraints[variable]:
- if not constraint(variables, domains, assignments):
- break
- else:
- continue
- # Variable has conflicts. Find values with less conflicts.
- mincount = len(vconstraints[variable])
- minvalues = []
- for value in domains[variable]:
- assignments[variable] = value
- count = 0
- for constraint, variables in vconstraints[variable]:
- if not constraint(variables, domains, assignments):
- count += 1
- if count == mincount:
- minvalues.append(value)
- elif count < mincount:
- mincount = count
- del minvalues[:]
- minvalues.append(value)
- # Pick a random one from these values.
- assignments[variable] = random.choice(minvalues)
- conflicted = True
- if not conflicted:
- return assignments
- return None
-
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
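Both backtracking solvers above pick the next variable by sorting on negated degree (most constraints first) with domain size (MRV) as tiebreaker. That ordering key can be exercised in isolation (illustrative helper name; `vconstraints` values stand in for constraint lists):

```python
# Sketch of the variable-ordering heuristic used by the deleted solvers:
# sort by (-degree, domain size, variable), i.e. most-constrained first,
# ties broken by Minimum Remaining Values.
def variable_order(domains, vconstraints):
    lst = [(-len(vconstraints[v]), len(domains[v]), v) for v in domains]
    lst.sort()
    return [item[-1] for item in lst]  # the last tuple element is the variable

domains = {"a": [1, 2, 3], "b": [1, 2], "c": [1, 2, 3]}
vconstraints = {"a": ["c1", "c2"], "b": ["c1"], "c": []}
order = variable_order(domains, vconstraints)
```

"a" comes first despite its larger domain because it participates in more constraints; only among equal degrees does the smaller domain win, which is why the solver recomputes this list after each assignment as domains shrink.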
=======================================
--- /generation/xdg_constraint/variable.py Fri Nov 27 03:02:07 2009 UTC
+++ /dev/null
@@ -1,121 +0,0 @@
-#!/usr/bin/env python
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# Based on the python_constraint.py library (GNU GPLv2 or later)
-# Copyright (c) 2005 Gustavo Niemeyer <niem...@conectiva.com>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-"""
-@var Unassigned: Helper object instance representing unassigned values
-
-@sort: Variable, Domain
-"""
-
-__all__ = ["Variable", "Domain", "Unassigned"]
-
-# ----------------------------------------------------------------------
-# Variables
-# ----------------------------------------------------------------------
-class Variable(object):
- """
- Helper class for variable definition
-
- Using this class is optional, since any hashable object,
- including plain strings and integers, may be used as variables.
- """
-
- def __init__(self, name):
- """
- @param name: Generic variable name for problem-specific purposes
- @type name: string
- """
-
- self.name = name
-
- def __repr__(self):
- return self.name
-
-Unassigned = Variable("Unassigned")
-
-
-# ----------------------------------------------------------------------
-# Domains
-# ----------------------------------------------------------------------
-class Domain(list):
- """
- Class used to control possible values for variables
-
- When list or tuples are used as domains, they are automatically
- converted to an instance of that class.
- """
-
- def __init__(self, set):
- """
- @param set: Set of values that the given variables may assume
- @type set: set of objects comparable by equality
- """
- list.__init__(self, set)
- self._hidden = []
- self._states = []
-
- def resetState(self):
- """
- Reset to the original domain state, including all possible values
- """
- self.extend(self._hidden)
- del self._hidden[:]
- del self._states[:]
-
- def pushState(self):
- """
- Save current domain state
-
- Variables hidden after that call are restored when that state
- is popped from the stack.
- """
- self._states.append(len(self))
-
- def popState(self):
- """
- Restore domain state from the top of the stack
-
- Variables hidden since the last popped state are then available
- again.
- """
- diff = self._states.pop()-len(self)
- if diff:
- self.extend(self._hidden[-diff:])
- del self._hidden[-diff:]
-
- def hideValue(self, value):
- """
- Hide the given value from the domain
-
- After that call the given value won't be seen as a possible value
- on that domain anymore. The hidden value will be restored when the
- previous saved state is popped.
-
- @param value: Object currently available in the domain
- """
- list.remove(self, value)
- self._hidden.append(value)
-
-
-# ----------------------------------------------------------------------
-if __name__ == "__main__":
- import doctest
- doctest.testmod()
-
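The Domain stack discipline above is easy to misread: pushState records only the visible length, hideValue moves values into _hidden, and popState re-extends the list from the tail of _hidden. A stripped-down replica (for illustration only; it mirrors the deleted class's names and logic) showing the intended round-trip:

```python
class Domain(list):
    # Minimal replica of the deleted Domain class, for illustration only.
    def __init__(self, values):
        list.__init__(self, values)
        self._hidden = []
        self._states = []

    def pushState(self):
        # Record how many values are currently visible.
        self._states.append(len(self))

    def hideValue(self, value):
        # Remove from the visible list but remember it for restoration.
        list.remove(self, value)
        self._hidden.append(value)

    def popState(self):
        # Restore everything hidden since the matching pushState.
        diff = self._states.pop() - len(self)
        if diff:
            self.extend(self._hidden[-diff:])
            del self._hidden[-diff:]

d = Domain([1, 2, 3])
d.pushState()
d.hideValue(2)
print(list(d))    # [1, 3]
d.popState()
print(sorted(d))  # [1, 2, 3]
```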
=======================================
--- /ordermodel/NOTES Wed Aug 15 06:47:54 2012 UTC
+++ /dev/null
@@ -1,11 +0,0 @@
-TODO(alexr): understand the relationship between this order model you've been
-kicking around and Kendall's tau.
-
-
-http://en.wikipedia.org/wiki/Kendall's_tau_rank_correlation_coefficient
-
-It seems like there's something sensible here, but this needs some more
-rethinking.
-
-Of course, the more important thing to think about, before inventing new order
-models, is how to best make the constraint solving explore likely-good branches
-of the search space first. (and return them as the top translations!)
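The connection the TODO gestures at: per word pair, OrderModel's counts are the raw ingredients of Kendall's tau, which scores the agreement of two rankings by counting concordant vs. discordant pairs. A minimal sketch of tau-a (not part of the deleted code; a library such as scipy.stats.kendalltau would also do):

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs,
    for two equal-length sequences of comparable ranks."""
    assert len(xs) == len(ys)
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Identical rankings agree on every pair; reversed ones disagree on every pair.
print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```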
=======================================
--- /ordermodel/learnmodel.py Wed Jan 25 23:24:37 2012 UTC
+++ /dev/null
@@ -1,17 +0,0 @@
-#!/usr/bin/env python3
-
-from ordermodel import OrderModel
-
-def main():
- model = OrderModel()
- model.add_sentence("learn that model now seriously".split())
-
- print("ALL OCCURRENCES")
- for tup in model.occurrences.items():
- print(tup)
- print("ALPHAFIRST OCCURRENCES")
- for tup in model.alphafirst.items():
- print(tup)
-
-if __name__ == "__main__":
- main()
=======================================
--- /ordermodel/ordermodel.py Fri Feb 10 14:48:55 2012 UTC
+++ /dev/null
@@ -1,79 +0,0 @@
-#!/usr/bin/env python3
-
-import math
-from collections import defaultdict
-import pickle
-
-from utils import makewordpair
-
-class OrderModel:
- """Very dumbest possible thing: count the number of times two words appear
- together in the same sentence. Count the fraction of the time that the
- alphabetically-prior one appears first."""
-
- def __init__(self):
- self.occurrences = defaultdict(lambda:0)
- self.alphafirst = defaultdict(lambda:0)
-
- def add_sentence(self, sentence):
- """Given a sentence (list of words, already tokenized), update
- occurrences and alphafirst for every pair of words."""
- assert type(sentence) == list
-
- for i,w1 in enumerate(sentence):
- for j,w2 in enumerate(sentence):
- # don't double-count
- if i >= j: continue
- wordpair = makewordpair(w1, w2)
- self.occurrences[wordpair] += 1
- ## if left word sorts before the right word...
- if sentence[i] == wordpair[0]:
- self.alphafirst[wordpair] += 1
-
- def comes_before(self, w1, w2):
- """Return the frequentist probability that w1 comes before w2, as a
- negative logprob."""
- wordpair = makewordpair(w1,w2)
-
- if wordpair not in self.occurrences:
- print("unknown wordpair:", wordpair)
- tolog = 0.5
- else:
- p_lowerfirst = (self.alphafirst[wordpair] /
- self.occurrences[wordpair])
- tolog = p_lowerfirst if w1 < w2 else (1 - p_lowerfirst)
-
- ## Sort of dumb, but very basic smoothing.
- tolog = max(tolog, 0.01)
- return -1.0 * math.log(tolog, 2)
-
- def order_prob(self, words):
- """Given a list of words, return the joint probability of their
- order."""
- print(words)
- joint = 1.0
- for (w1,w2) in zip(words[:-1], words[1:]):
- joint += self.comes_before(w1, w2)
- return joint
-
- def save(self, fn):
- """Pickle self into the specified filename."""
-
- ## Have to change the dictionaries to not be defaultdicts for pickling.
- self.occurrences = dict(self.occurrences.items())
- self.alphafirst = dict(self.alphafirst.items())
-
- with open(fn, "wb") as outfile:
- pickle.dump(self, outfile)
-
- self.occurrences = defaultdict(lambda:0, self.occurrences)
- self.alphafirst = defaultdict(lambda:0, self.alphafirst)
-
-def load(fn):
- """Get an OrderModel object from file."""
- with open(fn, "rb") as infile:
- model = pickle.load(infile)
-
- model.occurrences = defaultdict(lambda:0, model.occurrences)
- model.alphafirst = defaultdict(lambda:0, model.alphafirst)
- return model
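The model in ordermodel.py reduces to two pairwise counters keyed by an alphabetically canonical word pair. A condensed, self-contained sketch of the same counting logic (illustrative only; it replicates the deleted test_ambiguous_orders scenario, where a pair seen in both orders yields P = 0.5, i.e. a negative logprob of 1 bit):

```python
import math
from collections import defaultdict

def makewordpair(w1, w2):
    # Canonical (alphabetical) key, as in the deleted utils.py.
    return (w1, w2) if w1 < w2 else (w2, w1)

occurrences = defaultdict(int)  # pair -> co-occurrence count
alphafirst = defaultdict(int)   # pair -> times the alphabetically-prior word came first

def add_sentence(sentence):
    # Count each unordered pair once per sentence position pair.
    for i, w1 in enumerate(sentence):
        for w2 in sentence[i + 1:]:
            pair = makewordpair(w1, w2)
            occurrences[pair] += 1
            if w1 == pair[0]:
                alphafirst[pair] += 1

add_sentence("learn that model now seriously".split())
add_sentence("seriously now model that learn".split())
pair = makewordpair("that", "model")
p = alphafirst[pair] / occurrences[pair]  # P(alphabetically-prior word first)
print(-math.log(p, 2))  # 1.0: "model"/"that" observed once in each order
```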
=======================================
--- /ordermodel/runtests.py Wed Jan 18 23:21:19 2012 UTC
+++ /dev/null
@@ -1,9 +0,0 @@
-#!/usr/bin/env python3
-
-import unittest
-
-from tests.test_learnmodel import TestLearnWordOrder
-from tests.test_utils import TestUtils
-
-if __name__ == "__main__":
- unittest.main()
=======================================
--- /ordermodel/testoncorpus.py Fri Feb 10 14:48:55 2012 UTC
+++ /dev/null
@@ -1,33 +0,0 @@
-#!/usr/bin/env python3
-
-import sys
-
-import ordermodel
-import pickle
-
-def testonfile(model, fn):
- with open(fn, "r") as infile:
- for line in infile:
- line = line.strip().lower()
- if not line: continue
- withtags = line.split()
- words = [word for (word,tag)
- in [chunk.rsplit("/",1) for chunk in withtags]]
-
- print(model.order_prob(words))
-
-## usage: testoncorpus.py modelpickle testsetpickle
-def main():
- with open(sys.argv[2], "rb") as infile:
- testset = pickle.load(infile)
- print(testset)
-
- model = ordermodel.load(sys.argv[1])
- print("loaded!!")
-
- for fn in testset:
- print("testing on fn")
- testonfile(model, fn)
-
-if __name__ == "__main__":
- main()
=======================================
--- /ordermodel/tests/test_learnmodel.py Fri Feb 10 14:48:55 2012 UTC
+++ /dev/null
@@ -1,26 +0,0 @@
-import unittest
-
-import math
-from collections import defaultdict
-
-from ordermodel import OrderModel
-
-class TestLearnWordOrder(unittest.TestCase):
- def setUp(self):
- pass
-
- def test_foo(self):
- self.assertIn("V", ["foo", "V"])
-
- def test_onesentence(self):
- model = OrderModel()
- model.add_sentence("learn that model now seriously".split())
- p = model.comes_before("that", "model")
- self.assertEqual(p, -math.log(1.0, 2))
-
- def test_ambiguous_orders(self):
- model = OrderModel()
- model.add_sentence("learn that model now seriously".split())
- model.add_sentence("seriously now model that learn".split())
- p = model.comes_before("that", "model")
- self.assertEqual(p, -math.log(0.5, 2))
=======================================
--- /ordermodel/tests/test_utils.py Wed Jan 18 23:21:19 2012 UTC
+++ /dev/null
@@ -1,17 +0,0 @@
-import unittest
-
-from utils import makewordpair
-
-class TestUtils(unittest.TestCase):
- def setUp(self):
- pass
-
- def test_makewordpair(self):
- pair = makewordpair("foo","foo")
- self.assertEqual(pair, ("foo","foo"))
-
- pair = makewordpair("betabet","alphabet")
- self.assertEqual(pair, ("alphabet","betabet"))
-
- pair = makewordpair("alphabet", "betabet")
- self.assertEqual(pair, ("alphabet","betabet"))
=======================================
--- /ordermodel/trainoncorpus.py Fri Feb 10 14:48:55 2012 UTC
+++ /dev/null
@@ -1,49 +0,0 @@
-#!/usr/bin/env python3
-
-import sys
-import glob
-import random
-import math
-import pickle
-
-from ordermodel import OrderModel
-
-# change this so it works for you.
-BROWNPATH = "/home/alexr/brown/"
-
-def pick_training_and_test():
- allfiles = glob.glob(BROWNPATH + "/ca??")
- testsize = math.floor(len(allfiles) / 10)
- testset = random.sample(allfiles, testsize)
- trainingset = [fn for fn in allfiles if fn not in testset]
- return (trainingset, testset)
-
-def add_sentences_from_file(fn, model):
- """Assumes that lines are a bunch of space-separated word/tag pairs.
- Lowercases each line, then adds the words (just the words) to the model.
- Probably soon we'll do something with the tags."""
-
- with open(fn, "r") as infile:
- for line in infile:
- line = line.strip().lower()
- if not line: continue
- withtags = line.split()
- words = [word for (word,tag)
- in [chunk.rsplit("/",1) for chunk in withtags]]
- model.add_sentence(words)
-
-def main():
- trainingset, testset = pick_training_and_test()
-
- model = OrderModel()
- for fn in trainingset:
- add_sentences_from_file(fn, model)
- print(model.comes_before("he", "traveled"))
-
- model.save("brown_training.pickle")
-
- with open("testset.pickle", "wb") as outfile:
- pickle.dump(testset, outfile)
-
-if __name__ == "__main__":
- main()
=======================================
--- /ordermodel/utils.py Wed Jan 18 23:21:19 2012 UTC
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/usr/bin/env python3
-
-def makewordpair(w1, w2):
- return (w1, w2) if (w1 < w2) else (w2, w1)
=======================================
--- /xdg/__init__.py Wed Dec 2 21:59:55 2009 UTC
+++ /dev/null
@@ -1,18 +0,0 @@
-"""
-This file is part of L3XDG.
-
- L3XDG is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- L3XDG is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with L3XDG. If not, see <http://www.gnu.org/licenses/>.
-
-Author: Michael Gasser <gas...@cs.indiana.edu>
-"""
=======================================
--- /xdg/dimension.py Fri Sep 24 21:43:47 2010 UTC
+++ /dev/null
@@ -1,2364 +0,0 @@
-# Implementation of dimensions and dimension-specific constraint instantiation
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2009 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# 2009.10.31
-# Added group arc principles and constraints
-#
-# 2009.11.15
-# Added SynSyn Dimension
-# Made agreement work by simple one-level unification
-#
-# 2009.11.17
-# Made abbrev a top-level attribute of Dimension
-# Made dimension labels depend on language
-#
-# 2009.11.22
-# Unification of real feature structures for agreement
-# constraints
-#
-# 2009.11.29
-# Languages with multiple agreements like Amharic now
-# roughly work, though the agreement principles/constraints
-# are very ugly (and probably still buggy).
-#
-# 2009.12.03
-# Cleaned up agreement principle/constraints
-#
-# 2009.12.04
-# Massively simplified principles and constraints (especially agreement),
-# by giving each node an entry variable and at least one agr variable.
-#
-# 2009.12.18
-# LinkingEnd principle/constraint fixed (now the link into the daughter
-# on dimension2 can come from any node).
-#
-# 2009.12.21
-# Implemented LinkingMother and LinkingDaughterEnd principles/constraints
-#
-# 2009.12.22
-# Fixed linking principles so they take the two dimensions as arguments,
-# allowing for linking in both "directions".
-#
-# 2009.12.28
-# LinkingAboveBelow1or2Start principle/constraint
-#
-# 2010.01.05
-# Order constraints in lexical entries can be expressed in lists longer
-# than 2; this gets converted to pairs in order_principle()
-#
-# Added ID and LP dimensions (for English), but not yet IDLP.
-#
-# 2010.01.18
-# Added IDLP dimension.
-# Implemented ClimbingPrinciple (required for IDLP)
-#
-# 2010.01.23
-# Implemented BarriersPrinciple (required for IDLP)
-#
-# 2010.01.30
-# Implemented GovernmentPrinciple (required for ID)
-#
-# 2010.01.31
-# Dag/Tree: includes check for cycles (only pairs of nodes)
-#
-# 2010.02.09
-# IDSem: relating ID and Sem dimensions (just like SynSem so far)
-#
-# 2010.02.10
-# LinkingBelow1or2Start
-#
-# 2010.02.13
-# Fixed a bug in valency constraints (now checks entries of nodes
-# on the other ends of arcs)
-#
-# 2010.02.14
-# Agreement principle/constraint changed to accommodate agrs dicts.
-# There is now a separate agreement constraint for each daughter of
-# a node with an agree attribute that can have the right in arc
-# and the right features.
-#
-# 2010.02.16
-# Agreement principle/constraint changed to accommodate agree
-# variables and multiple daughter agreement features.
-#
-# 2010.02.21
-# More fixes, changes in agreement principle and constraint
-# to accommodate multiple children features in agree attribute and
-# agree variables
-# Improvements in valency and govern principles.
-#
-# 2010.03.05
-# Fixed and extended LinkingMother principle.
-#
-# 2010.04.01
-# Added a PrincipleError class to handle errors in instantiating principles.
-#
-# 2010.04.12
-# Fixed bug in AgreementPrinciple for the case when the entry variable has
-# a value that is irrelevant to the principle and only the default value of agrs
-# should succeed.
-#
-# 2010.04.05
-# Fixed bug in MotherPrinciple for the case where there are no possible arcs
-# into some node; this raises a PrincipleError exception.
-#
-# 2010.04.18
-# Lots of changes in how empty nodes interact with constraints and a new constraint,
-# merge_empty_arc for "merged" empty nodes
-#
-# 2010.04.27
-# Merge2Principle: in SynSem; head noun of a relative clause has two arg* arcs into
-# it in Semantics
-#
-# 2010.05.02
-# Helper function featlist_agree.
-
-from xdg_constraint import *
-from node import Empty
-from languages.morpho.fs import simple_unify
-from languages.morpho.semiring import TOP
-import itertools
-
-# All possible dimension abbreviations
-DIMENSIONS = ['id', 'lp', 'pa', 'idlp', 'idsem', 'syn', 'sem', 'synsem', 'synsyn']
-
-class Dimension(object):
- """Abstract class for XDG dimensions."""
-
- def __init__(self, language, problem, abbrev=''):
- """
- @param language: the language for the dimension, needed for
- implementation of constraints
- @type language: Language
- @param problem: the constraint satisfaction problem
- @type problem: XDGProblem
- """
- self.language = language
- self.abbrev = abbrev or self.__class__.abbrev
- self.problem = problem
- self.principles = []
-
- def __str__(self):
- return 'Dimension'
-
- def set_principles(self, princs):
- """Assign the principles for this dimension."""
- self.principles = princs
-
- def get_principles(self):
- """Return all principles."""
- return self.principles
-
- def has_principle(self, princ_name):
- """Does this dimension have a principle with a particular name?
- @param princ_name: a principle (method) name, e.g., "tree_principle"
- @type princ_name: string
- @return: whether this principle is among the dimension's principles
- @rtype: boolean
- """
- return princ_name in [princ.__name__ for princ in self.principles]
-
- def get_node_vars(self, node, kind):
- """Return the list of variables of the given kind from node's variable
- dictionary for this dimension."""
- return node.vars.get(self.abbrev, {}).get(kind, [])
-
- def get_lex_dim(self, lex):
- """Return the LexDim in lex for this dimension."""
- return lex.dims.get(self.abbrev)
-
- def get_dim_var_dict(self, kind):
- """Return variable dict of a particular kind for this dimension."""
- return self.problem.dim_vars[self.abbrev].get(kind, {})
-
-class ArcDimension(Dimension):
- """
- Abstract class for dimensions with arcs, such as syntax and semantics.
-
- These have the valency and group arc principles, but not necessarily tree, order, or agreement.
- """
-
- def group_arc_principle(self):
- """Add group arc constraints, constraining arcs with particular labels to have
- daughters that belong to the same group and have particular (word) labels."""
- if self.problem.groups:
- # Dict of arc variables by source, dest pairs
- arc_daughs = self.get_dim_var_dict('arc_daughs')
- for gid, group_obj in self.problem.groups.items():
- lex_lists = []
- for lex, nodes in group_obj.lex.items():
- word = nodes[0][0].word
- node_list = []
- for node, group_entry in nodes:
- entry_var = node.entry_var
- node_list.append([node, entry_var, group_entry])
- lex_lists.append([word, lex, node_list])
- for word, lex, node_list in lex_lists:
- # For this word, check to see if its lex has a groupouts feature
- # If not, no constraint is necessary
- lex_dim = self.get_lex_dim(lex)
- lex_groupouts = list(lex_dim.groupouts.items())
- if not lex_groupouts:
- continue
- # Any node in node_list must satisfy the set of arc_label, daugh_label
- # constraints if its entry_var is bound to the group's entry_index
- for node, entry_var, entry_index in node_list:
- # A dict of arc label : daughter label pairs
- for arc_label, daugh_label in lex_groupouts:
- # Find which nodes have daugh_label as their name
- matching_nodes = [x for x in lex_lists if x[0] == daugh_label][0]
- variables = [entry_var]
- d_entry_indices = [x[2] for x in matching_nodes[2]]
- for d_node, d_entry_var, d_entry_index in matching_nodes[2]:
- m_index = node.index
- d_index = d_node.index
- arc_var = arc_daughs.get((m_index, d_index))
- variables.extend([d_entry_var, arc_var])
- # Add the constraint for this combination of mother and daughter
- # entry variables and the arc variable
- constraint = XDGConstraint(group_arc_agreeC(entry_index, arc_label, d_entry_indices),
- name = gid + ':GroupArc_' + arc_label,
- cls='group_arc')
- self.problem.addConstraint(constraint, variables)
-
- def set_labels(self):
- """Assign arc labels for the language."""
- # Strip off the language prefix if there is one
- abbrev = self.abbrev.split('-')[-1]
- self.labels = self.language.labels.get(abbrev)
-
- def root_principle(self):
- """There must be at least one root."""
- self.problem.addConstraint(XDGConstraint(at_least_one, name='Root_' + self.abbrev, cls='root'),
- self.get_node_vars(self.problem.eos, 'daughter_vars'))
-
- def dag_principle(self):
- """The graph must be a DAG."""
- self.no_cycles()
- self.mothers()
-
- def no_cycles(self):
- """There must be no cycles. (We may need to check beyond pairs.)"""
- daugh_vars = self.get_dim_var_dict('arc_daughs')
- for node1 in self.problem.get_nodes():
- for node2 in self.problem.get_nodes():
- if node1 == node2:
- continue
- node12_arc = daugh_vars.get((node1.index, node2.index))
- node21_arc = daugh_vars.get((node2.index, node1.index))
- if not node12_arc or not node21_arc:
- continue
- name = str(node1.index) + ',' + str(node2.index) + ':Cycle'
- self.problem.addConstraint(XDGConstraint(no_cycle, name=name, cls='cycles'),
- [node12_arc, node21_arc])
-
- def mothers(self):
- """Add constraint that each node other than EOS must have at least one mother (part of DAG principle)."""
- for node in self.problem.get_nodes(eos=False):
- vars = self.get_node_vars(node, 'mother_vars')
- if not vars:
- print(">>> {0} has no possible mothers for {1}; could be a problem! <<<".format(str(node), str(self)))
-# raise PrincipleError(str(node) + ' has no possible mothers on ' + str(self))
- self.problem.addConstraint(XDGConstraint(at_least_one,
- name=str(node.index) + ':Mother', cls='mothers'),
- vars)
-
- def tree_principle(self):
- """Add constraints that no node can have more than one mother."""
- # For all nodes except end-of-sentence node, there must be only one mother
- for node in self.problem.get_nodes(eos=False):
- vars = self.get_node_vars(node, 'mother_vars')
- if not vars:
- print(">>> {0} has no possible mothers for {1}; could be a problem! <<<".format(str(node), str(self)))
-# raise PrincipleError(str(node) + ' has no possible mothers on ' + str(self))
- self.problem.addConstraint(XDGConstraint(one_exists,
- name=str(node.index) + ':Tree', cls='tree'),
- vars)
- # End-of-sentence node has only one daughter
- self.problem.addConstraint(XDGConstraint(one_exists, name='EOS:Tree', cls='tree'),
- self.get_node_vars(self.problem.eos, 'daughter_vars'))
- # There can be no cycles
- self.no_cycles()
-
- def valency_principle(self):
- """Add valency constraints."""
- # For all nodes except end-of-sentence node, add in and out valency constraints.
- for node in self.problem.get_nodes(eos=False):
- for label in self.labels + ['root']:
- if label:
- # label is not None
- self._valency_principle1(node, label, ins=True)
- self._valency_principle1(node, label, ins=False)
-
- def _valency_principle1(self, node, label, ins=True):
- """Is valency satisfied for label in node's ins or outs?
-
- @param node: the node that the constraint applies to
- @type node: Node
- @param label: an arc label
- @type label: string
- @param ins: whether the constraint applies to the ins or outs of the node
- @type ins: boolean
- @return: whether the constraint is satisfied
- @rtype: boolean
- """
- # Set of possible labels (all entries)
- labels = self.get_node_vars(node, 'ins' if ins else 'outs')
- if label in labels:
- dim_abbrev = self.abbrev
- # Check whether the label has anything other than a '*' constraint
- constraint = False
- i = 0
- while not constraint and i < len(node.entries):
- entry = node.entries[i]
- entry_dim = entry.dims.get(dim_abbrev)
- i += 1
- if not entry_dim:
- continue
- entry_constraint = entry_dim.ins.get(label, '*') if ins else entry_dim.outs.get(label, '*')
- if entry_constraint not in ['*']: # , '%!', '%', '%%']:
- # Found a *real* constraint
- # (Valency for '%*' is handled by EmptyConstraint)
- constraint = True
- if constraint:
-# print('Creating valency constraint', node, label)
- # Constraint name
- name = '{0}:{1}:Valency'.format(node.index, label)
- if ins:
- name += '_in'
- else:
- name += '_out'
- name += ':' + dim_abbrev
- # Variables: node mother or daughter vars: arc labels
- arc_variables = self.get_node_vars(node, 'mother_vars') if ins else self.get_node_vars(node,'daughter_vars')
- end_indices = self.get_node_vars(node, 'var_mothers') if ins else self.get_node_vars(node, 'var_daughters')
- # Nodes on the other ends of the arcs
- ends = [self.problem.get_node(index) for index in end_indices]
- # Hold on; before we go and create a constraint for the label, let's make sure there's a node
- # on the other end that can have a link into (out of) it with that label
- other_arcs = set()
- for end in ends:
- other_arcs.update(self.get_node_vars(end, 'outs' if ins else 'ins'))
- if arc_variables and label in other_arcs:
- # More variables: entry variables for mothers or daughters
- end_entry_vars = [end.entry_var for end in ends]
- variables = [node.entry_var] + arc_variables + end_entry_vars
- constraint = XDGConstraint(valency_constraintC(label, node, dim_abbrev, ins, ends),
- name=name, cls='valency')
- self.problem.addConstraint(constraint, variables)
-
- def arc_projectivity_principle(self):
- """For any arc in a proj feature with at least one node in the middle, prevent internal
- nodes from having mothers outside the interval."""
- arc_dict = self.get_dim_var_dict('arc_vars')
- for mother in self.problem.get_nodes(eos=False):
- entries = mother.entries
- # Check whether any entry has a proj feature
- proj = []
- for entry in entries:
- lex_dim = self.get_lex_dim(entry)
- if lex_dim and lex_dim.proj:
- proj.extend(lex_dim.proj)
- # At least one entry does have a proj constraint,
- # so create constraints if there is an arc out of mother
- # with at least node in between
- if proj:
- mother_entry_var = mother.entry_var
- # Get all daughter vars with daughters separated by at least one node
- daugh_vars = self.get_node_vars(mother, 'daughter_vars')
- for daugh_var in daugh_vars:
- # For each arc variable, check whether the distance between
- # the children >= 2.
- source, dest = arc_dict[daugh_var]
- if source - dest >= 2:
- # Left arc passing over at least one other node
- # Check the arcs into nodes in the interval
- for interval_index in range(dest+1, source):
- interval_node = self.problem.get_nodes()[interval_index]
- for int_node_moth_var in self.get_node_vars(interval_node, 'mother_vars'):
- # Only worry about arcs coming from nodes outside the interval
- int_source, int_dest = arc_dict[int_node_moth_var]
- if int_source < dest or int_source > source:
- cname = str(source) + '|' + str(dest) + ':Proj'
- self.problem.addConstraint(XDGConstraint(projectsC(mother, self.abbrev),
- name=cname, cls='arc_projectivity'),
- [mother_entry_var, daugh_var, int_node_moth_var])
- elif dest - source >= 2:
- # Right arc passing over at least one other node
- # Check arcs in the nodes in the interval
- for interval_index in range(source+1, dest):
- interval_node = self.problem.get_nodes()[interval_index]
- for int_node_moth_var in self.get_node_vars(interval_node, 'mother_vars'):
- # Only worry about arcs from nodes outside the interval
- int_source, int_dest = arc_dict[int_node_moth_var]
- if int_source > dest or int_source < source:
- cname = str(source) + '|' + str(dest) + ':Proj'
- self.problem.addConstraint(XDGConstraint(projectsC(mother, self.abbrev),
- name=cname, cls='arc_projectivity'),
- [mother_entry_var, daugh_var, int_node_moth_var])
-
- def projectivity_principle(self):
- """For any arc with at least one node in the middle, prevent internal
- nodes from having mothers outside the interval."""
- arc_dict = self.get_dim_var_dict('arc_vars')
- for node in self.problem.get_nodes(eos=False):
- # Get all daughter vars with daughters separated by at least one node
- daughs = self.get_node_vars(node, 'daughter_vars')
- name = str(node.index) + ':'
- for daugh in daughs:
- # For each arc variable, check whether the distance between
- # the children >= 2.
- # (Note that source is daugh index.)
- source, dest = arc_dict[daugh]
- # Ignore empty nodes
- if dest < 0:
- continue
- if source - dest >= 2:
- # Left arc
- # Check the arcs into nodes in the interval
- for interval_index in range(dest+1, source):
- interval_node = self.problem.get_nodes()[interval_index]
- for int_node_moth in self.get_node_vars(interval_node, 'mother_vars'):
- # Only worry about arcs coming from nodes outside the interval
- int_source, int_dest = arc_dict[int_node_moth]
- if int_source < dest or int_source > source:
- cname = name + str(source) + '|' + str(dest) + ':Proj'
- self.problem.addConstraint(XDGConstraint(no_cross, name=cname, cls='projectivity'),
- [daugh, int_node_moth])
- elif dest - source >= 2:
- # Right arc
- # Check arcs in the nodes in the interval
- for interval_index in range(source+1, dest):
- interval_node = self.problem.get_nodes()[interval_index]
- for int_node_moth in self.get_node_vars(interval_node, 'mother_vars'):
- # Only worry about arcs from nodes outside the interval
- int_source, int_dest = arc_dict[int_node_moth]
- if int_source > dest or int_source < source:
- cname = name + str(source) + '|' + str(dest) + ':Proj'
- self.problem.addConstraint(XDGConstraint(no_cross, name=cname, cls='projectivity'),
- [daugh, int_node_moth])
-
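Both projectivity principles above enforce the same graph property: no arc may pass over a node whose mother lies outside the arc's span. Restated outside the constraint machinery, on a plain head-index array (an illustrative sketch, not part of the deleted module):

```python
def is_projective(heads):
    """heads[i] = index of node i's mother (the root points to itself).
    An analysis is projective if every node strictly between an arc's
    endpoints has its mother inside that interval."""
    for dep, head in enumerate(heads):
        lo, hi = min(dep, head), max(dep, head)
        for mid in range(lo + 1, hi):
            if not (lo <= heads[mid] <= hi):
                return False
    return True

# Node 1 is the root and heads both neighbors: projective.
print(is_projective([1, 1, 1]))     # True
# Arc 2->0 spans node 1, whose mother (3) lies outside [0, 2]: crossing arcs.
print(is_projective([1, 3, 0, 3]))  # False
```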
- def order_principle(self):
- """Create constraints for the order pairs in each node's order attribute.
-
- Variables:
- daughter arc_labels
- if node has multiple lex entries: entry_var
- """
- # For all nodes except the end-of-sentence node
- for node in self.problem.get_nodes(eos=False):
- # First replace any ordered lists longer than two with all ordered pairs
- for entry in node.entries:
- entry_dim = self.get_lex_dim(entry)
- if not entry_dim:
- continue
- orders = entry_dim.order
- if orders and any([len(o) > 2 for o in orders]):
- # More than a pair; create ordered pairs from the list
- ordered_pairs = []
- for order in orders:
- # Already an ordered pair; no change needed
- if len(order) == 2:
- ordered_pairs.append(order)
- # 3 or more items; create all pairs
- else:
- ordered_pairs.extend([list(o) for o in itertools.combinations(order, 2)])
- # Replace the old list with the new one
- entry_dim.order = ordered_pairs
- # Assume node is ambiguous; create a single constraint for the node if there are
- # any daughters
- d_vars = self.get_node_vars(node, 'daughter_vars')
- daughter_vars = d_vars[:]
- # Remove any of these that go to empty nodes (which don't have any location)
- arc_dict = self.get_dim_var_dict('arc_vars')
- for d_var in d_vars:
- if arc_dict.get(d_var)[1] < 0:
- # The destination's index is negative...
- daughter_vars.remove(d_var)
- if daughter_vars:
- self.problem.addConstraint(XDGConstraint(order_constraintC(node, self.abbrev),
- name=str(node.index) + ':Order', cls='order'),
- [node.entry_var] + daughter_vars)
-
- def agreement_principle(self):
- """Create constraints for the labels in each node's agree field."""
- # For all nodes except the end-of-sentence node
- for node in self.problem.get_nodes(eos=False):
- # Figure out whether this is a node for which there needs to be an agreement
- # constraint
- if any([(self.get_lex_dim(entry).agree and self.get_lex_dim(entry).agrs)\
- for entry in node.entries]):
- for entry_index, entry in enumerate(node.entries):
- self._agreement_principle1(node, entry, entry_index)
-
- def _agreement_principle1(self, node, entry, entry_index):
- """
- Create agreement constraints.
-
- @param node: a node
- @type node: instance of Node
- @param entry: entry of node
- @type entry: Lex
- @param entry_index: index of entry
- @type entry_index: int
-
- """
- lexdim = self.get_lex_dim(entry)
- if not lexdim.agree:
- return
- # Only create a constraint if there's an agree attribute
- for arc_label, moth_feat, daugh_feat in lexdim.agree_gen():
-# print('Agree principle for', arc_label, moth_feat, daugh_feat)
- # daugh_feat is either a single feature or a list of features, in which case we have to include
- # the agree var among the variables
- mult_daugh_feats = isinstance(daugh_feat, list)
- # Node and arc variables associated with each daughter arc.
- daugh_nodes = [self.problem.get_node(d) for d in self.get_node_vars(node, 'var_daughters')]
- daugh_arcs = self.get_node_vars(node, 'daughter_vars')
- # Agree var for the node
- agree_daugh_var = self.get_node_vars(node, 'agree_daugh_var').get(moth_feat)
- # Agr var for the node
- moth_agr_var = self.get_node_vars(node, 'agr_var').get(moth_feat)
-# print(' Mother agree var', moth_agr_var, 'agree_daugh_var', agree_daugh_var)
- # Make the default constraint: applies if there is no mother agreement variable
- # or if none of the daughter arcs matches arc_label
- if mult_daugh_feats and agree_daugh_var:
- # Get a default value for the agree variable: '^' or the first one
- var_values = self.problem._variables[agree_daugh_var]
- dflt_agree_daugh_value = '^' if '^' in var_values else var_values[0]
- name = str(node.index) + '[' + str(entry_index) + ']:' + moth_feat + '|' + arc_label + ':dflt:Agree'
- variables = [agree_daugh_var] + daugh_arcs
- constraint = XDGConstraint(agree_defaultC(arc_label, dflt_agree_daugh_value, moth_agr_var),
- name=name, cls='agree')
- self.problem.addConstraint(constraint, variables)
- if not moth_agr_var:
- # Nothing to agree with; skip the constraint
- continue
- # Assume daughters are ambiguous; add entry vars for all daughters to variables.
- # Assume some daughter has multiple agrs; add daughter agr vars to variables.
- for daughter, daugh_arc in zip(daugh_nodes, daugh_arcs):
-# if isinstance(daughter, Empty):
-# continue
- if not mult_daugh_feats:
- daugh_agr_var = self.get_node_vars(daughter, 'agr_var').get(daugh_feat, '')
- if daugh_agr_var:
- name = str(node.index) + '[' + str(entry_index) + ']:' + arc_label + ':' + str(daughter.index) + ':Agree'
- variables = [moth_agr_var, node.entry_var, daugh_agr_var, daughter.entry_var, daugh_arc]
- function = agree_constraintC(node, entry, entry_index, arc_label=arc_label,
- moth_feat=moth_feat, daugh_feat=daugh_feat,
- daughter=daughter, dim_abbrev=self.abbrev)
- constraint = XDGConstraint(function, name=name, cls='agree')
- self.problem.addConstraint(constraint, variables)
- elif agree_daugh_var:
- # Constraint requires the agree variable as well as the agr variable
- # List of the daughter's [feat, agr variable] pairs
- daugh_agr_vars_list = list(self.get_node_vars(daughter, 'agr_var').items())
- # Remove feat, variable pairs for which daughter has no explicit value
-# print('daugh_agr_vars_list', daugh_agr_vars_list)
- for feat, var in daugh_agr_vars_list[:]:
- any_expl = any([self.get_lex_dim(d_entry).any_agr(feat) for d_entry in daughter.entries])
- if not any_expl:
- daugh_agr_vars_list.remove((feat, var))
- # List of daughter features and list of daughter agr vars
- daugh_agr_vars_list = list(zip(*daugh_agr_vars_list))
-# print('daugh_agr_vars_list', daugh_agr_vars_list)
- # List of daughter agr vars
- daugh_agr_vars = list(daugh_agr_vars_list[1])
- # Dict of daughter features and positions in daughter agr vars list
- daugh_agr_var_dct = dict([[feat, index] for index, feat in enumerate(daugh_agr_vars_list[0])])
- name = str(node.index) + '[' + str(entry_index) + ']:' + moth_feat + '|' + arc_label + ':' + str(daughter.index) + ':Agree'
- variables = [moth_agr_var, node.entry_var, agree_daugh_var, daughter.entry_var, daugh_arc] + daugh_agr_vars
- func = agree_multfeats_constraintC(node, entry, entry_index,
- arc_label=arc_label, moth_feat=moth_feat, daugh_feat=daugh_feat,
- daugh_agr_var_dct=daugh_agr_var_dct,
- daughter=daughter, dim_abbrev=self.abbrev)
- constraint = XDGConstraint(func, name=name, cls='agree')
- self.problem.addConstraint(constraint, variables)
-
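The transpose-and-index idiom used in _agreement_principle1 (splitting a list of (feature, agr-variable) pairs with `zip(*pairs)`, then building a feature-to-position dict) can be shown on its own. The variable values below are illustrative stand-ins, not data from the original lexicon:

```python
# Daughter (feature, agr-variable) pairs, as built from the 'agr_var' dict
daugh_agr_vars_list = [('num', 'n3:num'), ('gen', 'n3:gen')]

# zip(*pairs) transposes the pair list into (features, variables)
feats, agr_vars = list(zip(*daugh_agr_vars_list))

# Flat list of agr variables, in feature order
daugh_agr_vars = list(agr_vars)

# Map each feature to its position in the variable list, so a constraint
# function can look up the right variable by feature name
daugh_agr_var_dct = {feat: index for index, feat in enumerate(feats)}

# daugh_agr_vars == ['n3:num', 'n3:gen']
# daugh_agr_var_dct == {'num': 0, 'gen': 1}
```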
- def cross_agreement_principle(self):
- '''Constrain daughter feature in agree attribute for node1 to be the same as mother feature
- in agree attribute for node2, a daughter of node1.'''
- for node in self.problem.get_nodes(eos=False):
- # Arc variables to daughters and mothers
- daugh_arc_vars = self.get_node_vars(node, 'daughter_vars')
- moth_arc_vars = self.get_node_vars(node, 'mother_vars')
- # Daughters corresponding to variables:
- daugh_nodes = [self.problem.get_node(d) for d in self.get_node_vars(node, 'var_daughters')]
- mother_nodes = [self.problem.get_node(m) for m in self.get_node_vars(node, 'var_mothers')]
- own_entry_var = node.entry_var
- own_agree_vars = self.get_node_vars(node, 'agree_daugh_var')
- own_agr_vars = self.get_node_vars(node, 'agr_var')
- # Does any entry for this node have a cross-agreement feature?
- for entry_i, entry in enumerate(node.entries):
- entry_dim = self.get_lex_dim(entry)
- entry_cross = entry_dim.cross
- entry_agree = entry_dim.agree
- entry_agrs = entry_dim.agrs
- if not entry_cross:
- continue
-# print('Cross entry', entry, entry_i, 'cross features', entry_cross)
-# print('Agree', entry_agree, 'var dict', own_agree_vars)
-# print('Agrs', entry_agrs, 'var dict', own_agr_vars)
- for cross_arc, own_feats in entry_cross.items():
-# print(' Cross arc', cross_arc, end= ' ')
- for own_feat in own_feats:
-# print('cross feat', own_feat)
- # For every possible combination of mother with agree var value of own_feat
- # and daughter with possible in arc with arc label
- for mother, moth_arc_var in zip(mother_nodes, moth_arc_vars):
- #! moth_arc_var is used to get appropriate agree_daugh_var
- moth_entry_var = mother.entry_var
- moth_agree_vars = self.get_node_vars(mother, 'agree_daugh_var')
- # List of the mother's [feat, agree variable] pairs
- moth_agree_vars_list = list(moth_agree_vars.items())
- # List of mother features and list of mother agree vars
- moth_agree_vars_list = list(zip(*moth_agree_vars_list))
- # List of mother agreement vars
- if not moth_agree_vars_list:
- # What does this mean exactly?
- continue
- moth_agree_vars = list(moth_agree_vars_list[1])
- # Dict of mother features and positions in mother agree vars list
- moth_agree_vars_dict = dict([[feat, index] for index, feat in enumerate(moth_agree_vars_list[0])])
-# print(' Mother', mother, 'mother->own', moth_arc_var, moth_entry_var, 'agree_vars',
-# moth_agree_vars, moth_agree_vars_dict)
- for moth_entry_i, moth_entry in enumerate(mother.entries):
- moth_entry_dim = self.get_lex_dim(moth_entry)
- moth_entry_agree = moth_entry_dim.agree
- if not moth_entry_agree:
- continue
-# print(' Mother entry', moth_entry, moth_entry_i, 'agree', moth_entry_agree)
- # Find daughters that can have arc as label on an in arc
- for daughter, daugh_arc_var in zip(daugh_nodes, daugh_arc_vars):
- if cross_arc not in self.get_node_vars(daughter, 'ins'):
- continue
- daugh_entry_var = daughter.entry_var
- for daugh_entry_i, daugh_entry in enumerate(daughter.entries):
- daugh_entry_dim = self.get_lex_dim(daugh_entry)
- daugh_entry_agr = daugh_entry_dim.agrs
- daugh_agr_vars = self.get_node_vars(daughter, 'agr_var')
-# print(' Daughter', daugh_entry, daugh_entry_i, 'agrs', daugh_entry_agr, daugh_agr_vars,
-# 'own->daughter', daugh_arc_var)
-# print('CONSTRAINT:',
-# 'own->daughter(', daugh_arc_var, ') =', cross_arc,
-# '; moth_agree_vars[moth_agree_vars_dict[moth_arc_var]] =', own_feat,
-# '; own_agree_moth(', entry_agree[cross_arc][0], ') =', own_feat)
- variables = [daugh_entry_var, moth_entry_var, own_entry_var,
- moth_arc_var, daugh_arc_var] + moth_agree_vars
- name = str(node.index) + ':CrossAgree'
- function = crossagreeC(entry_i, moth_entry_i, daugh_entry_i,
- cross_arc, own_feat, entry_agree,
- moth_agree_vars_dict)
- constraint = XDGConstraint(function, name=name, cls='cross')
- self.problem.addConstraint(constraint, variables)
-
- def government_principle(self):
- # For all nodes except the end-of-sentence node
- for mother in self.problem.get_nodes(eos=False):
- # Arc variables to daughters:
- daugh_arc_vars = self.get_node_vars(mother, 'daughter_vars')
- # Daughters corresponding to variables
- daugh_nodes = [self.problem.get_node(d) for d in self.get_node_vars(mother, 'var_daughters')]
- moth_entry_var = mother.entry_var
- for entry_i, entry in enumerate(mother.entries):
- entry_dim = self.get_lex_dim(entry)
- gov = entry_dim.govern
- if not gov:
- continue
-# print('Found govern feature')
- for gov_arc, (gov_feat, gov_val) in gov.items():
- # Only make a constraint if gov_arc is a possible out arc for node
- if not entry_dim.outs.get(gov_arc):
- continue
- # entry has a govern attribute, so create constraints for daughters
- for daugh_node, daugh_arc_var in zip(daugh_nodes, daugh_arc_vars):
- # gov_arc must be a possible arc for daugh_node (some entry)
- if gov_arc not in self.get_node_vars(daugh_node, 'ins'):
- continue
- daugh_agr_var = self.get_node_vars(daugh_node, 'agr_var').get(gov_feat)
-# daugh_node.vars[self.abbrev].get('agr_var').get(gov_feat)
- if not daugh_agr_var:
- continue
- daugh_entry_var = daugh_node.entry_var
- variables = [daugh_arc_var, moth_entry_var, daugh_entry_var, daugh_agr_var]
- constraint = XDGConstraint(govern_entryC(mother=mother, moth_entry_index=entry_i, moth_entry=entry,
- daughter=daugh_node,
- gov_arc=gov_arc, gov_feat=gov_feat, gov_val=gov_val,
- dim_abbrev=self.abbrev),
- name=str(mother.index) + '[' + str(entry_i) + ']:' + gov_arc + ':' + str(daugh_node.index) + ':Govern',
- cls='govern')
- self.problem.addConstraint(constraint, variables)
-
-class Syntax(ArcDimension):
- """Class for the syntax dimension."""
-
- abbrev = 'syn'
-
- def __init__(self, language=None, problem=None, abbrev=''):
- """Create arc label list and principle groups."""
- Dimension.__init__(self, language, problem, abbrev=abbrev)
- self.set_labels()
- if not self.labels:
- # Default arc labels
- self.labels = ['sb', 'ob', 'adv', 'det', 'adj', 'prp', 'pob', None]
- self.set_principles([self.tree_principle,
- # Only for some languages
- self.projectivity_principle,
- self.group_arc_principle,
- self.order_principle,
- self.valency_principle,
- self.agreement_principle,
- self.government_principle,
- self.cross_agreement_principle])
-
- def __str__(self):
- return 'Syntax'
-
-class ID(ArcDimension):
- """Class for the immediate dominance dimension."""
-
- abbrev = 'id'
-
- def __init__(self, language=None, problem=None, abbrev=''):
- """Create arc label list and principle groups."""
- Dimension.__init__(self, language, problem, abbrev=abbrev)
- self.set_labels()
- if not self.labels:
- # Default arc labels
- self.labels = ['sb', 'ob', 'adv', 'det', 'adj', 'prp', 'pob', None]
- self.set_principles([self.tree_principle,
- self.group_arc_principle,
- self.valency_principle,
- self.agreement_principle,
- self.government_principle,
- self.cross_agreement_principle])
-
- def __str__(self):
- return 'ID'
-
-class LP(ArcDimension):
- """Class for the linear precedence dimension."""
-
- abbrev = 'lp'
-
- def __init__(self, language=None, problem=None, abbrev=''):
- """Create arc label list and principle groups."""
- Dimension.__init__(self, language, problem, abbrev=abbrev)
- self.set_labels()
- if not self.labels:
- # Default arc labels
- self.labels = ['adjf', 'compf', 'detf', 'fadvf', 'lbf', 'mf1', 'mf2', 'nf', 'padjf',
- 'padvf', 'prepcf', 'rbf', 'relf', 'root', 'rprof', 'tadvf', 'vf', 'vvf', None]
- self.set_principles([self.tree_principle,
- self.projectivity_principle,
- self.order_principle,
- self.valency_principle])
-
- def __str__(self):
- return 'LP'
-
-class Semantics(ArcDimension):
- """Class for the semantics dimension."""
-
- abbrev = 'sem'
-
- def __init__(self, language=None, problem=None, abbrev=''):
- """Create arc label list and principle groups."""
- Dimension.__init__(self, language, problem, abbrev=abbrev)
- self.set_labels()
- if not self.labels:
- # Default arc labels
- self.labels = ['arg1', 'arg2', 'del', 'mod', 'loc', None]
- self.set_principles([self.root_principle,
- self.dag_principle,
- self.valency_principle])
-
- def __str__(self):
- return 'Semantics'
-
-class IFDimension(Dimension):
- """Abstract class for interface dimensions."""
-
- def __init__(self, language=None, problem=None, dim1=None, dim2=None, abbrev=''):
- """Assign two dimensions that this is the interface for."""
- Dimension.__init__(self, language, problem, abbrev=abbrev or self.__class__.abbrev)
- self.dim1 = dim1
- self.dim2 = dim2
-
- def linking_daughter_end_principle(self, dim1, dim2):
- """Instantiate LinkingDaughterEnd constraints for nodes where applicable.
- """
- # Only check nodes up to EOS node
- for mother in self.problem.get_nodes(eos=False):
- entries = mother.entries
- # Assume node ambiguity; does any entry have an ldend constraint?
- any_ldend = False
- for entry in entries:
- lex_dim = self.get_lex_dim(entry)
- if lex_dim and lex_dim.ldend:
- any_ldend = True
- # At least one entry does have an ldend constraint
- if any_ldend:
- mother_dim1 = mother.vars[dim1.abbrev]
- mother_dim2 = mother.vars[dim2.abbrev]
- dim1_daugh_arcs = mother_dim1['daughter_vars']
- dim2_daugh_arcs = mother_dim2['daughter_vars']
- dim1_daugh_indices = mother_dim1['var_daughters']
- dim2_daugh_indices = mother_dim2['var_daughters']
- self.problem.addConstraint(
- XDGConstraint(linking_daughter_endC(mother, self.abbrev,
- dim1_daugh_indices, dim2_daugh_indices,
- len(dim1_daugh_arcs)),
- name='LinkingDaughterEnd', cls='linkingDE'),
- [mother.entry_var] + dim1_daugh_arcs + dim2_daugh_arcs)
-
- def linking_end_principle(self, dim1, dim2):
- """For each dim1 arc label in each arg feature, constrain the dim2 daughter
- to have an in arc with one of the corresponding dim2 labels.
- """
- # Only check nodes up to EOS node
- for mother in self.problem.get_nodes(eos=False):
- entries = mother.entries
- # Check whether any entry has an arg feature
- any_arg = False
- for entry in entries:
- lex_dim = self.get_lex_dim(entry)
- if lex_dim and lex_dim.arg:
- any_arg = True
- # At least one entry does have an arg constraint,
- # so create constraints
- if any_arg:
- mother_dim1 = mother.vars[dim1.abbrev]
- moth_entry_var = mother.entry_var
- daugh_arcs1 = mother_dim1['daughter_vars']
- daughters1 = mother_dim1['var_daughters']
- # For each arc to a daughter from mother...
- for daugh_index, daugh_arc in zip(daughters1, daugh_arcs1):
- daughter = self.problem.get_node(daugh_index)
- daugh_dim2 = daughter.vars[dim2.abbrev]
- daugh_in_arcs = daugh_dim2['mother_vars']
- # ... make a constraint with mother entry variable,
- # arc to daughter and daughter in arcs as variables
- self.problem.addConstraint(
- XDGConstraint(linking_end_matchC(mother, daughter, self.abbrev),
- name='LinkingEnd', cls='linkingE'),
- [moth_entry_var, daugh_arc] + daugh_in_arcs)
-
- def linking_mother_principle(self, dim1, dim2):
- """For each dim1 arc label in each mod feature, constrain the dim1 daughter to be
- the dim2 mother of the dim1 mother.
- """
- for mother in self.problem.get_nodes(eos=False):
- entries = mother.entries
- # Check whether any entry has a mod feature
- any_mod = False
- for entry in entries:
- lex_dim = self.get_lex_dim(entry)
- if lex_dim and lex_dim.mod:
- any_mod = True
- if not any_mod:
- continue
- # At least one entry does have a mod feature,
- # so create constraints: variables are
- # daughter arcs on dim1 and mother arcs on dim2
- mother_dim1 = mother.vars[dim1.abbrev]
- moth_entry_var = mother.entry_var
- daugh_arcs1 = mother_dim1['daughter_vars']
- daughters1 = mother_dim1['var_daughters']
- mother_dim2 = mother.vars[dim2.abbrev]
- moth_arcs2 = mother_dim2['mother_vars']
- mothers2 = mother_dim2['var_mothers']
- # Variables are the entry variable for the dim1 node, the dim1 daughter
- # arc variables out of that node and the dim2 mother arc variables into
- # that node.
- variables = [moth_entry_var] + daugh_arcs1 + moth_arcs2
- func = linking_motherC(mother, daughters1, mothers2, self.abbrev, dim2.abbrev)
- constraint = XDGConstraint(func, name='LinkingMother', cls='linkingM')
- self.problem.addConstraint(constraint, variables)
-
- def linking_above_end_principle(self, dim1, dim2):
- """For each dim1 arc label in each laend feature, constrain the dim1 daughter to be
- the dim2 descendant of the dim1 mother, with the first dim2 arc's label one of the
- labels associated with the dim1 arc label.
- """
- for mother in self.problem.get_nodes(eos=False):
- entries = mother.entries
- # Check whether any entry has an laend feature
- any_laend = False
- for entry in entries:
- lex_dim = self.get_lex_dim(entry)
- if lex_dim and lex_dim.laend:
- any_laend = True
- # At least one entry does have an laend constraint,
- # so create constraints
- if any_laend:
- mother_dim1 = mother.vars[dim1.abbrev]
- moth_entry_var = mother.entry_var
- daugh_arcs1 = mother_dim1['daughter_vars']
- daughters1 = mother_dim1['var_daughters']
- all_dim2_arcs = self.problem.dim_vars[dim2.abbrev]['arc_var_list']
- # For each arc to a daughter from mother in dim1...
- for daugh_index, daugh_arc in zip(daughters1, daugh_arcs1):
- daughter = self.problem.get_node(daugh_index)
- daugh_dim2 = daughter.vars[dim2.abbrev]
- daugh_out_arcs = daugh_dim2['daughter_vars']
- n_daugh_out = len(daugh_out_arcs)
- # ... make a constraint with mother entry variable,
- # arc to daughter and daughter out arcs and all arcs as variables
- self.problem.addConstraint(
- XDGConstraint(linking_above_end_matchC(mother, daughter, n_daugh_out,
- self.abbrev, dim2.abbrev,
- self.problem),
- name='LinkingAboveEnd', cls='linkingAE'),
- [moth_entry_var, daugh_arc] + daugh_out_arcs + all_dim2_arcs)
-
- def linking_above_below_1or2_start_principle(self, dim1, dim2):
- """Given a m->d arc on dim1 with lab12s attribute and label l1, there must be an arc into
- d or its mother from m or an ancestor of it with one of the corresponding labels l2."""
- for mother1 in self.problem.get_nodes(eos=False):
- entries1 = mother1.entries
- # Check whether any entry has a lab12s feature
- any_lab12s = False
- for entry1 in entries1:
- lex_dim = self.get_lex_dim(entry1)
- if lex_dim and lex_dim.lab12s:
- any_lab12s = True
- # At least one entry does have an lab12s constraint,
- # so create constraints
- if any_lab12s:
- mother_dim1 = mother1.vars[dim1.abbrev]
- moth1_entry_var = mother1.entry_var
- daugh_arcs1 = mother_dim1['daughter_vars']
- daughters1 = mother_dim1['var_daughters']
- all_dim2_arcs = self.problem.dim_vars[dim2.abbrev]['arc_var_list']
- # For each arc to a daughter from mother in dim1...
- for daugh_index, daugh_arc in zip(daughters1, daugh_arcs1):
- daughter1 = self.problem.get_node(daugh_index)
- # ... make a constraint with variables: mother1 entry,
- # daughter_arc1, and all dimension2 arcs
- name = '{0}->{1}:LinkingAB12S'.format(mother1.index, daugh_index)
- variables = [moth1_entry_var, daugh_arc] + all_dim2_arcs
- self.problem.addConstraint(
- XDGConstraint(linking_above_below_1or2start_matchC(mother1, self.abbrev,
- daughter1, dim2.abbrev,
- self.problem),
- name=name, cls='linkingAB12S'),
- variables)
-
- def linking_below_1or2_start_principle(self, dim1, dim2):
- """Given a m->d arc on dim1 with lb12s attribute and label l1, there must be an arc into
- d or its mother from m with one of the corresponding labels l2."""
- for mother1 in self.problem.get_nodes(eos=False):
- entries1 = mother1.entries
- # Check whether any entry has an lb12s feature
- any_lb12s = False
- for entry1 in entries1:
- lex_dim = self.get_lex_dim(entry1)
- if lex_dim and lex_dim.lb12s:
- any_lb12s = True
- # At least one entry does have an lb12s constraint,
- # so create constraints
- if any_lb12s:
- mother_dim1 = mother1.vars[dim1.abbrev]
- moth1_entry_var = mother1.entry_var
- daugh_arcs1 = mother_dim1['daughter_vars']
- daughters1 = mother_dim1['var_daughters']
- all_dim2_arcs = self.problem.dim_vars[dim2.abbrev]['arc_var_list']
- # For each arc to a daughter from mother in dim1...
- for daugh_index, daugh_arc in zip(daughters1, daugh_arcs1):
- daughter1 = self.problem.get_node(daugh_index)
- # ... make a constraint with variables: mother1 entry,
- # daughter_arc1, and all dimension2 arcs
- variables = [moth1_entry_var, daugh_arc] + all_dim2_arcs
- func = linking_below_1or2start_matchC(mother1, self.abbrev,
- daughter1, dim2.abbrev,
- self.problem)
- name = '{0}->{1}:LinkingB12S'.format(mother1.index, daugh_index)
- constraint = XDGConstraint(func, name=name, cls='linkingB12S')
- self.problem.addConstraint(constraint, variables)
-
- def climbing_principle(self, dim1, dim2):
***The diff for this file has been truncated for email.***
=======================================
--- /xdg/l3.py Fri Sep 24 21:43:47 2010 UTC
+++ /dev/null
@@ -1,1611 +0,0 @@
-# Cross-linguistic components of XDG, implemented with python_constraint.
-#
-# This file is part of the HLTDI L^3 project
-# for morphology, parsing, generation, and translation with XDG
-# Copyright (C) 2010 The HLTDI L^3 Team <gas...@cs.indiana.edu>
-#
-# This program is free software: you can redistribute it and/or
-# modify it under the terms of the GNU General Public License as
-# published by the Free Software Foundation, either version 3 of
-# the License, or (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program. If not, see <http://www.gnu.org/licenses/>.
-#
-# Michael Gasser <gas...@cs.indiana.edu>
-# 2009.09.12
-#
-# 2009.09.19
-# -- Added way to handle words lacking lexical entries
-# (but this should probably assign them to some lexical category
-# so we don't get the unknown word behaving like an adjective and a verb).
-# -- Added grouping of principles within a dimension to make solving more efficient.
-# For Syntax, tree, order, and valency principles may be enough to
-# converge on a single solution, or at least to constrain the variables
-# a lot.
-# -- Added hierarchy and inheritance in Lexicon
-#
-# 2009.10.23
-# -- Added multiple dimensions (dimensions now in separate module)
-#
-# 2009.10.25
-# -- Sentences can now be input as strings (without EOS punctuation).
-#
-# 2009.10.31
-# -- Representation of groups
-# -- Group constraints
-# -- Pretty-printing of solutions
-# -- Constraints are all created on problem init, rather than in solve()
-#
-# 2009.11.18
-# -- Added support for Multiling instance in XDGProblem constructor
-# -- Made dimensions attribute more flexible
-#
-# 2009.12.02
-# -- Debugged new agreement stuff:
-# agr vars for features (sb, etc.)
-# two kinds of agrs: lists of FSs and dicts of lists of FSs
-# inheritance and get_lex
-# -- Added Amharic morphology; integrated morph analysis into parsing
-#
-# 2009.12.04
-# -- Each node now has an entry var and arg vars, even when there is
-# no ambiguity. This greatly simplifies principles and constraints.
-#
-# 2009.12.09
-# -- YAML representation of lexicons (see languages/lex.py)
-#
-# 2009.12.19
-# -- End-of-sentence characters are now treated like ordinary "words".
-# See also lex.py.
-#
-# 2009.12.20
-# -- Node lexicalization combines explicit wordform entries (with 'word'
-# attributes) with output of morphological analysis (accessing
-# 'lexeme' entries in lexicon); see also languages/lex.py, am.yaml
-#
-# 2009.12.29
-# -- Various enhancements, fixes in linking principles (see dimension.py),
-# subject relative clauses in English now work
-#
-# 2010.01.02
-# -- Languages and Multilings now know what dimensions they include, based on
-# what's in their lexicon (see languages/lex.py), and XDGProblems by default
-# use these dimensions. To restrict an XDGProblem to fewer dimensions, use
-# a list of *dimension abbreviations*, rather than classes, e.g.,
-# XDGProblem(language='en', dimensions=['syn'], sentence='people argue')
-#
-# 2010.01.23
-# -- ID, LP, and IDLP dimensions and their principles more or less complete
-# (see dimension.py).
-# -- Fixed an error in Node.get_ins() and Node.get_outs() that included arcs
-# with '0' constraint, e.g., adv: 0
-#
-# 2010.01.31
-# -- Apparently finished with the principles needed for Syntax (with the addition of
-# Government and cycles check).
-# -- Multiple entries are causing extreme slowness (see times at end).
-#
-# 2010.02.04
-# -- For simplicity, XDGProblem is now XDG.
-#
-# 2010.02.06
-# -- Empty nodes and valency constraint for them;
-# tested on subject and object in Amharic
-# (still no agrs attributes or government constraint)
-#
-# 2010.02.13
-# -- Empty nodes of two types ('%' and '%!'); details later
-# -- Node entries that are incompatible with any classes are
-# are ignored.
-#
-# 2010.02.14
-# -- Agrs features are all dicts rather than lists (converted to dicts if in
-# the old format).
-# -- Govern attribute is now dict -- {arc_label: [daughter_agr_feat, FS], ...}
-#
-# 2010.02.16
-# -- Agree attribute can now have a list as the last (daughter feature) list element
-# (a single string is still also possible).
-# -- Agree variable created for this case (needed for the cross-agreement principle,
-# coming soon); all nodes have an agree_daugh_var dict with mother features as
-# keys
-#
-# 2010.03.25
-# -- Split nodes and groups off into separate module: node.py
-#
-# 2010.03.27
-# -- Cross-dimensions (dimensions shared by the component languages of a Multiling)
-# accommodated (apparently)
-#
-# 2010.04.03
-# -- A start toward morphological generation
-# prob = XDG('am', ...)
-# prob.gen_word('wdd', '[tm=j_i,sb=[+p1]]')
-# ልውደድ
-#
-# 2010.04.12
-# -- Fixed a problem with Agree principle that allowed multiple values for an agr
-# variable to succeed with a non-matching lexical entry, e.g. ice in
-# They broke the ice.
-# -- Added big English lexicon.
-# -- Found a new bug with groups: only a single group entry is created for a word,
-# even when it branches at a class node, e.g. broke in
-# They broke the ice
-# should have separate group entries for ROOT, REL, and SUB verb classes
-#
-# 2010.04.18
-# -- Lots of changes to how empty nodes are created and how their constraints
-# are instantiated. In particular there are merged empty nodes with their own
-# constraints
-
-import time
-from languages import *
-from dimension import *
-from node import *
-
-###
-### Dictionaries of arc dimension classes and interface dimension classes
-### with abbreviations as keys
-###
-
-ARC_DIMENSION_CLASSES = {}
-IF_DIMENSION_CLASSES = {}
-
-def make_dim_dict(parent, dct, interface=False):
- """Make a dictionary of dimension classes with abbreviations as keys.
- @param parent: dimension class
- @type parent: subclass of Dimension
- @param dct: dimension dictionary
- @type dct: dictionary with dimension abbreviations as keys
- """
- for cls in parent.__subclasses__():
- # Store all subclasses of parent that have an abbreviation
- if 'abbrev' in cls.__dict__:
- # For interface dimensions, store the class and
- # abbreviations of its two related dimensions
- if interface:
- value = (cls, cls.dim1, cls.dim2)
- # For arc dimensions, store only the class
- else:
- value = cls
- dct[cls.abbrev] = value
- # Do this recursively for subclasses of cls
- make_dim_dict(cls, dct, interface=interface)
-
-## Make separate dictionaries for arc dimension classes
-## and interface dimension classes
-make_dim_dict(ArcDimension, ARC_DIMENSION_CLASSES)
-make_dim_dict(IFDimension, IF_DIMENSION_CLASSES, True)
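The registration pattern used by make_dim_dict() above (walking the subclass tree and keying each concrete class by its own `abbrev`) can be illustrated with a self-contained sketch. The class names here are stand-ins, not the real dimension hierarchy:

```python
class Dimension:
    """Stand-in for the abstract Dimension base class."""

class ArcDimension(Dimension):
    """Abstract: defines no 'abbrev' of its own, so it is not registered."""

class Syntax(ArcDimension):
    abbrev = 'syn'

class LP(ArcDimension):
    abbrev = 'lp'

def make_dim_dict(parent, dct):
    """Recursively store every subclass of parent that defines its own
    'abbrev'. Checking cls.__dict__ (rather than hasattr) keeps a subclass
    that merely inherits abbrev from re-registering under its parent's key."""
    for cls in parent.__subclasses__():
        if 'abbrev' in cls.__dict__:
            dct[cls.abbrev] = cls
        # Recurse so deeper subclasses are registered too
        make_dim_dict(cls, dct)

classes = {}
make_dim_dict(Dimension, classes)
# classes == {'syn': Syntax, 'lp': LP}
```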
-
-### Constants for different kinds of processing
-PARSE = 0
-GENERATE = 1
-TRANSLATE = 2
-
-### ----------------------------------------------------------------------
-### Problems
-### ----------------------------------------------------------------------
-
-class XDG(Problem):
- """Class for an XDG constraint satisfaction problem.
-
- Example:
- >>> problem = XDG(language='en', sentence='the man eats the yogurt')
- >>> problem.solve()
-
- SOLUTION 0
- ...
-
- """
- def __init__(self, language='en', sentence='', analysis=None,
- solver=None, dimensions=None, process=PARSE, trace=False,
- pause=True, reload_lex=False):
- """Initialize a problem with a sentence, a language, some dimensions, and a direction.
-
- @param solver: a CS solver, defaults to BacktrackingSolver
- @type solver: Solver
- @param sentence: list of words or string with spaces separating words
- (the input to parsing)
- @type sentence: list or string
- @param analysis: an analysis of a sentence (the input to generation)
- @type analysis: instance of Analysis
- @param dimensions: dimensions that are part of this problem if they are
- other than the full set associated with the language
- @type dimensions: list of dimension abbreviations (strings)
- @param language: the language or multiling for the problem
- @type language: language abbreviation (string)
- @param process: which kind of processing (parse, generate, translate)
- @type process: int
- @param pause: whether to pause between initialization steps
- @param reload_lex: whether to reload the lexicon for the language
- """
- Problem.__init__(self, solver=solver)
- self.language = languages.get_language(language, reload=reload_lex)
- ## Whether to parse, generate, or translate
- # If language is a Multiling, translate automatically
- if isinstance(self.language, languages.Multiling):
- self.process = TRANSLATE
- else:
- self.process = process
- ## Whether to show inheritance during lexicalization
- self.trace = trace
- ## Specify the input and output languages
- ## Change these below if self.language is a multiling
- self.input_language = self.language
- self.output_language = self.language
- self.nodes = []
- # Empty nodes added because they have no overt form
- self.empty = []
-
- ### Create dimensions, either using the dimensions for the language
- ### or restricting them to the ones in an explicit list
- self.set_dimensions(dimensions)
- print('Dimensions', [(dim.language.abbrev, dim.abbrev) for dim in self.dimensions])
-
- ### If there's no sentence, stop here.
- if not sentence:
- return
-
- ### Create the input sentence if parse, input analysis if generation
- # If sentence is a string, split it at spaces and add EOS marker
- if process in (PARSE, TRANSLATE):
- ## For both parsing and translation, there is an input sentence
- if sentence and isinstance(sentence, list):
- self.sentence = sentence
- else:
- self.sentence = sentence.split()
- last_word = self.sentence[-1]
- # Split off sentence-final punctuation or add it
- self.sentence[-1:] = self.input_language.separate_eos(last_word)
- print('Sentence: ', end=' ')
- for word in self.sentence:
- print(word, end=' ')
- print()
- else:
- raise NotImplementedError('Generation is not currently handled in L3 :-(')
- ## Generation
-# self.eos = Node(word=self.input_language.eos_chars[0], problem=self)
-
- self.lexicon = self.language.lexicon if self.language else None
-
- ### Initialize variables
- # Variables not specific to dimensions (entry)
- self.variables = {}
- # Variables specific to particular dimensions; make a sub-dictionary
- # for each one
- self.dim_vars = dict([(dim.abbrev, {}) for dim in self.dimensions])
- # Initialize constraint call counts
- XDGConstraint.calls_by_class = {}
- XDGConstraint.calls = 0
-
- ### Begin processing
- if pause: input('CONTINUE? ')
- ## if parse, create nodes for the sentence, lexicalize,
- ## and create arc variables and empty nodes
- if process in (PARSE, TRANSLATE):
- self.create_nodes()
- print('Lexicalizing')
- # Lexicalize the nodes and find any groups
- self.lexicalize()
- # Create variables for each possible arc
- self.create_arc_variables()
- # Create empty nodes if called for in node entries
- self.create_empty_nodes()
-
- ### Instantiate principles, create remaining constraints
- if pause: input('CONTINUE? ')
- print('Instantiating principles')
- # Implement principles (creating constraints) for each dimension
- for dimension in self.dimensions:
- for principle in dimension.get_principles():
- principle()
-
- def set_dimensions(self, dimensions=None):
- '''Assign the problem's dimensions.
- @param dimensions: list of dimensions if all of those associated
- with the language are not to be included
- @type dimensions: None or a list of dimension abbreviations (strings)
- or, if the language is a multiling, a list
- of language, dimension list pairs (all strings)
- '''
- self.dimensions = []
- # The language is a multiling: the dimensions are those found in the
- # multiling's lexicon, which can include both arc and interface dimensions
- # within each language, as well as one (or more?) interface dimensions
- # connecting dimensions of the languages
- if self.language and isinstance(self.language, languages.Multiling):
- # The two languages
- language1 = self.language.languages[0]
- language2 = self.language.languages[1]
- # List of (language, dimension list) pairs
- dims = dimensions or self.language.dimensions
- interlingua_dim = None
- if self.language.interlingua:
- dim_abbrev = self.language.interlingua
- dim_class = ARC_DIMENSION_CLASSES.get(dim_abbrev)
- interlingua_dim = dim_class(self.language, self, abbrev=dim_abbrev)
- self.dimensions.append(interlingua_dim)
-# # List of (lang1, lang2, dim) triples for interface dimensions joining the languages
-# if_dims = self.language.if_dimensions
-# # If there are any cross-dimensions, instantiate them
-# cross_dims = {}
-# for dim_abb in self.language.cross_dimensions:
-# dim_class = ARC_DIMENSION_CLASSES.get(dim_abb)
-# dim = dim_class(self.language, self, abbrev=dim_abb)
-# cross_dims[dim_abb] = dim
-# self.dimensions.append(dim)
- # Create language-specific dimensions
- for lang_abbrev, dimensions in dims:
- # The Language object
- language = languages.get_language(lang_abbrev)
- self.set_language_dimensions(language, dimensions=dimensions,
- interlingua=self.language.interlingua,
- interlingua_dim=interlingua_dim)
-# cross_dims=cross_dims,
-# dim_prefix=lang_abbrev)
-# # Create interface dimensions
-# for lang1_abb, lang2_abb, dim_abb in if_dims:
-# # The interface dimension class and abbreviations for the two dimensions it joins
-# if_dim_class, dim1_abb, dim2_abb = IF_DIMENSION_CLASSES.get(dim_abb)
-# # The dimension objects to be associated with this interface dimension
-# dim1 = [dim for dim in self.dimensions if dim.abbrev == lang1_abb + '-' + dim1_abb][0]
-# dim2 = [dim for dim in self.dimensions if dim.abbrev == lang2_abb + '-' + dim2_abb][0]
-# # Instantiate the interface dimension, specifying the name
-# dim = if_dim_class(self.language, self, dim1, dim2,
-# abbrev = lang1_abb + '_' + lang2_abb + '-' + dim_abb)
-# self.dimensions.append(dim)
- self.input_language = language1
- self.output_language = language2
- else:
- self.set_language_dimensions(self.language, dimensions=dimensions)
-
- def get_language_dimensions(self, language, dimensions=None):
- """
- Return the dimension classes for a given language or for a list of
- dimension abbreviations.
-
- @param language: language
- @type language: instance of Language
- @param dimensions: list of dimension abbreviations
- @type dimensions: list of strings
- @return lists of pairs of dimension abbreviations and dimension classes
- @rtype pair of lists of tuples: (string, Dimension subclass)
- """
- # Abbreviations for language's dimensions
- dims = dimensions or language.dimensions
- if_dims = [(dim, IF_DIMENSION_CLASSES.get(dim)) for dim in dims if dim in IF_DIMENSION_CLASSES]
- arc_dims = [(dim, ARC_DIMENSION_CLASSES.get(dim)) for dim in dims if dim in ARC_DIMENSION_CLASSES]
- return arc_dims, if_dims
-
- def set_language_dimensions(self, language, dimensions=None,
-# cross_dims=None,
- interlingua='',
- interlingua_dim = None,
- dim_prefix=''):
- """
- Add the dimensions for a given language to the problem dimensions.
- @param language: language (either the only one or one of the sub-languages
- in a multiling)
- @type language: instance of Language
- @param dimensions: None or a list of dimension abbreviations which
- override the dimension of language
- @type dimensions: list of strings
- @param cross_dims: (abbrev, dim class) for all cross-dimensions
- @type cross_dims: list of (string, Dimension class) pairs
- @param dim_prefix: language name to prefix to dimension abbreviation if
- the language is a multiling sub-language
- @type dim_prefix: string
- """
- arc_dim_classes, if_dim_classes = self.get_language_dimensions(language, dimensions=dimensions)
- arc_dims = {}
- if_dims = []
-# cross_dims = cross_dims or {}
- for arc_dim_abbrev, arc_dim_class in arc_dim_classes:
- if arc_dim_abbrev == interlingua:
- continue
- abbrev = arc_dim_abbrev
- if dim_prefix:
- abbrev = dim_prefix + '-' + abbrev
- dim_inst = arc_dim_class(language, self, abbrev=abbrev)
- arc_dims[arc_dim_abbrev] = dim_inst
- for if_dim_abbrev, (if_dim_class, dim1_abbrev, dim2_abbrev) in if_dim_classes:
- abbrev = if_dim_abbrev
- if dim_prefix:
- abbrev = dim_prefix + '-' + abbrev
- if dim1_abbrev == interlingua:
- arc_dim1 = interlingua_dim
- else:
- arc_dim1 = arc_dims.get(dim1_abbrev)
-# or cross_dims.get(dim1_abbrev)
- if dim2_abbrev == interlingua:
- arc_dim2 = interlingua_dim
- else:
- arc_dim2 = arc_dims.get(dim2_abbrev)
-# or cross_dims.get(dim2_abbrev)
- if arc_dim1 and arc_dim2:
- if_dims.append(if_dim_class(language, self, arc_dim1, arc_dim2, abbrev=abbrev))
- else:
- print('No dimensions to relate for', if_dim_abbrev)
- self.dimensions.extend(list(arc_dims.values()) + if_dims)
-
- def create_nodes(self):
- """Create a node for each word in sentence."""
- self.nodes = []
- # Each word is a string or a list of analyses
- for index, word in enumerate(self.sentence):
- self.nodes.append(Node(word=word, index=index, problem=self,
- multiling=self.process==TRANSLATE))
- # The end-of-sentence node
- self.eos = self.nodes[-1]
- self.eos.eos = True
-
- def lexicalize(self):
- """Lexicalize each node and instantiate each of the complete
groups that are found."""
- if not self.lexicon:
- raise ValueError('No lexicon stored!')
- # Keep track of groups found during first stage of lexicalization
- group_objs = {}
- # For all nodes except the end-of-sentence node
- for node in self.get_nodes():
- # Find lexical entries, including groups
- new_groups = node.lexicalize1(self.lexicon, self.dimensions, self.input_language,
- trace=self.trace)
- for gid, (gwords, lex) in list(new_groups.items()):
- if gid in group_objs:
- # Check to see if the Lex is the same for another Node
- # (we have to check the id because different lexes could be clones of the same
- # original entry)
- matching_group_lexs = [l for l in group_objs[gid].lex if lex.id == l.id]
- if matching_group_lexs:
- # If so, append this to the list of nodes there
- group_objs[gid].lex[matching_group_lexs[0]].append([node, 0])
- else:
- # If not, start a new list of nodes with this node
- group_objs[gid].lex[lex] = [[node, 0]]
- if gwords:
- # Number of words found is not 0; replace old value
- group_objs[gid].nwords = gwords
- else:
- # This group is new; initialize it with lex, node, and gwords
- group_objs[gid] = Group(problem=self, gid=gid, nwords=gwords,
- lex={lex: [[node, 0]]})
-
- # Check whether each group found has all of its words
- for gid, group in list(group_objs.items()):
- gwords = group.nwords
-# print('Checking group', group, 'gwords', gwords, 'group.lex', group.lex)
- if not gwords or len(group.lex) != gwords:
- del group_objs[gid]
-
-# print('Surviving groups', group_objs)
-
- # For each group that survives, add it to the entries of the nodes in it,
- # and record the entry index for each node that corresponds to the group.
-
- for gid, group in group_objs.items():
- for lex, nodes in group.lex.items():
- for node_index in nodes:
- node = node_index[0]
- # Index of the group entry for this node
- group_entry = len(node.entries)
- node_index[1] = group_entry
- # Store this node along with the entry index
- node.entries.append(lex)
-
- ## Go back and run lexicalize2 on nodes
- for node in self.get_nodes():
- # Finish lexicalization: entry and agr variables
- node.lexicalize2(self.dimensions)
- # Create entry variable; this applies to all dimensions
- self.addVariable(node.entry_var, list(range(node.n_entries)))
- # Store the entry var in variables dict with index as key
- self.variables[node.index] = node.entry_var
- # Other node variables are specific to particular dimensions
- for dimension in self.dimensions:
- dim_abbrev = dimension.abbrev
- # Node var dict for this dimension
- node_dim_vars = node.vars[dim_abbrev]
- # Dimension variables for problem
- prob_dim_vars = self.dim_vars[dim_abbrev]
- # Only create mother and daughter vars for ArcDimensions
- if isinstance(dimension, ArcDimension):
- # These are needed later when arc variables are created
- node_dim_vars['mother_vars'], node_dim_vars['daughter_vars'] = [], []
- node_dim_vars['var_daughters'], node_dim_vars['var_mothers'] = [], []
- # Implement the Agr Principle: only for ID, Syntax;
- # Create agr variables and constraints
- if dimension.has_principle('agreement_principle') and not node.eos:
- # Create the dict of agr variables if it doesn't exist.
- if 'agr' not in prob_dim_vars:
- prob_dim_vars['agr'] = {}
- if 'agree' not in prob_dim_vars:
- prob_dim_vars['agree'] = {}
- agr_var = node_dim_vars.get('agr_var')
- # Maximum possible number of agrs over different entries
- max_agrs = node_dim_vars['max_agrs']
- agree_daugh_var = node_dim_vars.get('agree_daugh_var')
- agree_daugh_values = node_dim_vars["agree_daugh_values"]
- # Agrs and agree are dicts of values.
- # agr var values are indices for agr dicts.
- # agree var values are indices for particular daughter features,
- # for example, sb, ob, pob
- for feat, a_v in agr_var.items():
- max_agr1 = max_agrs[feat]
- if node.index not in prob_dim_vars['agr']:
- prob_dim_vars['agr'][node.index] = {}
- self.addVariable(a_v, list(range(max_agr1)))
- prob_dim_vars['agr'][node.index][feat] = a_v
- # Constrain the agr index to be less than the lengths of
- # agrs in particular entries if there are any agrs at all.
- if node.n_entries > 1:
- constr_name = str(node.index) + ':' + dim_abbrev + ':' + feat + ':Agr'
- self.addConstraint(XDGConstraint(agr_match_entry_featC(node, feat, dim_abbrev),
- name=constr_name, cls='agr'),
- [a_v, node.entry_var])
- for m_feat, a_v in agree_daugh_var.items():
- values1 = agree_daugh_values[m_feat]
- if node.index not in prob_dim_vars['agree']:
- prob_dim_vars['agree'][node.index] = {}
- self.addVariable(a_v, list(values1))
- prob_dim_vars['agree'][node.index][m_feat] = a_v
-
- # Create the group constraints for each group
- for group_obj in group_objs.values():
- self.group_entry_constraint(group_obj.gid, group_obj)
-
- self.groups = group_objs
-
- def create_empty_nodes(self):
- """Check the outs constraints for each node, creating an empty
node for each '%' or '%!' encountered."""
- # Keep track of the empty nodes created and their heads and arcs
- empty_nodes = {}
- empty_constraints = []
- for dim in self.dimensions:
- if not isinstance(dim, ArcDimension):
- continue
- dim_abbrev = dim.abbrev
- dim_vars = self.dim_vars[dim_abbrev]
- merge_empty = []
- # Only arc dimensions have an "outs" attribute
- for node in self.get_nodes(eos=False):
- for entry_i, entry in enumerate(node.entries):
- entry_dim = entry.dims.get(dim_abbrev)
- if not entry_dim:
- continue
- entry_var = node.entry_var
- for arc, constraint in entry_dim.outs.items():
- if constraint == '%%':
- # Merge empty node constraint, do the merging after all empty nodes are
- # created
- merge_empty.append((node, entry_var, entry_i, arc))
- elif isinstance(constraint, str) and '%' in constraint:
- # Only create the empty node if a suitable one hasn't already been created
- # from another dimension (that is, one with the same orig_node and in arc)
- for empty in self.empty:
- if empty.orig_node == node:
- empty_dim = empty.entries[0].dims.get(dim_abbrev)
- if empty_dim and empty_dim.ins.get(arc, 0) != 0:
- continue
- empty_node = self.create_empty_node(node, entry, entry_i, arc, constraint == '%',
- dim_abbrev, dim_vars, dim)
- if empty_node:
- empty_nodes[empty_node] = {dim: [(node, entry, entry_i, arc, constraint)]}
-
- self.dummies = []
- n_empty = len(self.empty)
-
- # Create dummy nodes (from %% constraints) to attempt to merge with existing empty nodes
- for node, entry, entry_i, arc in merge_empty:
- head_vars = node.vars[dim_abbrev]
- n_empty += 1
- empty_index = -n_empty
- head_index = node.index
- head_entry_var = node.entry_var
- arc_var = dim_abbrev + ':' + str(head_index) + '->' + str(empty_index)
- # The empty node's word label
- label = '%' + arc.upper()
- # Create the dummy empty node
- dummy = Empty(label=label, problem=self, index=empty_index,
- life_arc=arc_var, orig_node=node, orig_entry_i=entry_i,
- arc_label=arc, dummy=True)
- # Initialize it (for all dimensions)
- dummy.initialize(self.dimensions, self.lexicon)
- # Stop here if no lexical entry was found
- if not dummy.entries:
- return
- # Look for existing nodes that are compatible with the dummy empty node
- merged = []
- for other_empty in self.empty:
-# print('Attempting to merge {0} with {1}'.format(dummy, other_empty))
- if other_empty.merge(dummy, self.dimensions):
- empty_nodes[other_empty][dim].append((node, entry, entry_i, arc, '%%'))
- merged.append(other_empty)
-# print('MERGED', dummy, 'with', other_empty)
-# else:
-# print('FAILED TO MERGE', dummy, 'with', other_empty)
- empty_constraints.append((dim_abbrev, node, head_vars, head_entry_var, entry_i, arc, merged))
- self.dummies.append(dummy)
-
- # Create the arc vars and constraints (except %%) for empty nodes
- for empty_node, dims in empty_nodes.items():
- for dim, arc_props in dims.items():
- for head, head_entry, head_entry_i, arc, constraint in arc_props:
- self.create_empty_vars(head, head_entry, head_entry_i, empty_node, arc,
- constraint, dim, dim.abbrev)
-
- # Now create the %% constraints
- for dim_abbrev, node, head_vars, entry_var, entry_i, arc, empty_nodes in empty_constraints:
- self.merge_empty_constraint(node, node.index, entry_var, entry_i, arc,
- [e.index for e in merged], dim_abbrev,
- head_vars, self.dim_vars[dim_abbrev])
-
- def create_empty_node(self, head, entry_var, entry_i, arc, alternate, dim_abbrev, dim_vars, dim):
- """Create an empty node with an in arc from head with label arc.
- @param alternate: whether there is a choice between the link to the empty node and a link
- to another non-empty node
- """
- head_vars = head.vars[dim_abbrev]
- empty_index = -(len(self.empty) + 1)
- head_index = head.index
- # Variable for the arc which governs whether the empty node "lives"
- arc_var = dim_abbrev + ':' + str(head_index) + '->' + str(empty_index)
- # The empty node's word label
- label = '%' + arc.upper()
- # Create the empty node
- empty_node = Empty(label=label, problem=self, index=empty_index,
- life_arc=arc_var, orig_node=head, orig_entry_i=entry_i,
- arc_label=arc)
- # Initialize it (for all dimensions)
- empty_node.initialize(self.dimensions, self.lexicon)
- # Stop here if no lexical entry was found
- if not empty_node.entries:
- return
-
- # Add the empty node's entry variable
- self.addVariable(empty_node.entry_var, list(range(len(empty_node.entries))))
- # Add the empty node to the problem's empty node list
- self.empty.append(empty_node)
- empty_vars = empty_node.vars[dim_abbrev]
-
- return empty_node
-
- def create_empty_vars(self, head, head_entry, head_entry_i, empty, arc, constraint,
- dim, dim_abbrev):
- """Create agr and arc variables for empty node on all relevant dimensions."""
- # Add any agreement variables on this dimension
- dim_vars = self.dim_vars[dim_abbrev]
- empty_vars = empty.vars[dim_abbrev]
- head_vars = head.vars[dim_abbrev]
- head_entry_var = head.entry_var
- empty_index = empty.index
- head_index = head.index
- if dim.has_principle('agreement_principle'):
- if 'agr' not in dim_vars:
- dim_vars['agr'] = {}
- agr_var = empty_vars.get('agr_var')
- # Maximum possible number of agrs over different entries
- max_agrs = empty_vars['max_agrs']
- # Agrs is a dict of values.
- # agr var values are indices for agr dicts.
- for feat, a_v in agr_var.items():
- if a_v in self._variables:
- # The variable might already have been created
- continue
- max_agr1 = max_agrs[feat]
- if empty_index not in dim_vars['agr']:
- dim_vars['agr'][empty_index] = {}
- self.addVariable(a_v, list(range(max_agr1)))
- dim_vars['agr'][empty_index][feat] = a_v
-
- self.empty_arc(head, head_index, head_entry_var, head_entry_i,
- head_vars, empty, empty_index, empty_vars,
- dim_abbrev, dim_vars, arc, None, constraint)
-
- self.add_empty_dimensions(empty, empty_index,
- head, head_index, head_entry_i, head_entry_var,
- constraint, dim_abbrev)
-
- def add_empty_dimensions(self, empty_node, empty_index,
- head, head_index, head_entry_i, head_entry_var,
- arc_constraint, dim1_abbrev):
- '''Create variables for dimensions other than the one that led to the
- creation of the empty node.'''
- # See if arcs and variables for any other dimensions need to be created
- for dim in self.dimensions:
- dim_abbrev = dim.abbrev
- if dim_abbrev == dim1_abbrev:
- continue
- head_vars = head.vars[dim_abbrev]
- entry = empty_node.entries[0]
- entry_dim = entry.dims.get(dim_abbrev)
- empty_vars = empty_node.vars[dim_abbrev]
- if not entry_dim or not entry_dim.ins:
- continue
- dim_vars = self.dim_vars[dim_abbrev]
-
- # Add any agreement variables on this dimension
- if dim.has_principle('agreement_principle'):
- if 'agr' not in dim_vars:
- dim_vars['agr'] = {}
- agr_var = empty_vars.get('agr_var')
- # Maximum possible number of agrs over different entries
- max_agrs = empty_vars['max_agrs']
- # Agrs is a dict of values.
- # agr var values are indices for agr dicts.
- for feat, a_v in agr_var.items():
- max_agr1 = max_agrs[feat]
- if empty_index not in dim_vars['agr']:
- dim_vars['agr'][empty_index] = {}
- self.addVariable(a_v, list(range(max_agr1)))
- dim_vars['agr'][empty_index][feat] = a_v
-
- # Get the ins attribute for other dimensions of the empty node
- for arc, constraint in entry_dim.ins.items():
- if constraint == 0:
- continue
- # Create the arc and arc constraint using the constraint character
- # for the *head* out (not the empty node in), which is ignored
- # unless 0
- self.empty_arc(head, head_index, head_entry_var, head_entry_i,
- head_vars, empty_node, empty_index, empty_vars,
- dim_abbrev, dim_vars, arc, None, constraint)
-
- def empty_arc(self, head, head_index, head_entry_var, head_entry_i, head_vars,
- empty_node, empty_index, empty_vars,
- dim_abbrev, dim_vars,
- arc, arc_var, constraint):
- '''Create an arc variable into an empty node and do all of the
- necessary bookkeeping.'''
- arc_var = arc_var or dim_abbrev + ':' + str(head_index) + '->' + str(empty_index)
- var_exists = arc_var in self._variables
- if var_exists:
- # The variable already exists; just add arc as a new value
- # if it's not already in the domain
- if arc not in self._variables[arc_var]:
- self.extendVariable(arc_var, [arc])
- else:
- # Don't add the arc to miscellaneous dicts if it's already there
- self.addVariable(arc_var, [arc, None])
- dim_vars['arc_daughs'][(head_index, empty_index)] = arc_var
- dim_vars['arc_vars'][arc_var] = (head_index, empty_index)
- empty_vars['mother_vars'].append(arc_var)
- empty_vars['var_mothers'].append(head_index)
- var_index = len(dim_vars['arc_var_list'])
- if head_index not in dim_vars['daughter_arc_vars']:
- dim_vars['daughter_arc_vars'][head_index] = []
- if empty_index not in dim_vars['mother_arc_vars']:
- dim_vars['mother_arc_vars'][empty_index] = []
- dim_vars['daughter_arc_vars'][head_index].append((var_index, empty_index))
- dim_vars['mother_arc_vars'][empty_index].append((var_index, head_index))
- dim_vars['arc_var_list'].append(arc_var)
- head_vars['daughter_vars'].append(arc_var)
- head_vars['var_daughters'].append(empty_index)
-
- if '%' in constraint and constraint != '%%':
- self.empty_constraint(head, head_index, head_entry_var, head_entry_i,
- arc, empty_node, empty_index, constraint,
- dim_abbrev, head_vars, dim_vars)
-
- def empty_constraint(self, head, head_index, entry_var, entry_i,
- arc, empty, empty_index, constraint,
- dim_abbrev, head_vars, dim_vars):
- """If head->empty is arc, there is no other out arc from head with
label arc.
- If head->empty is None, some other out arc from head must have
label arc.
- """
- alternate = constraint == '%'
- head_empty_arc = dim_vars['arc_daughs'][(head_index, empty_index)]
- # Remove head_empty_arc from daughter_arcs
- daughter_arcs = [arc_var for arc_var in head_vars.get('daughter_vars') if arc_var != head_empty_arc]
-# print('Empty constraint', head, head_index, arc, empty, constraint)
- constraint = XDGConstraint(empty_arcC(arc, entry_i, alternate, constraint),
- name=str(head_index) + '[' + str(entry_i) + ']:' + arc + ':EmptyValency',
- cls='valency')
- self.addConstraint(constraint, [entry_var, head_empty_arc] + daughter_arcs)
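The docstring of `empty_constraint` above describes a disjunctive valency condition on the head's out arcs. As a hypothetical, self-contained sketch of just that two-sentence semantics (ignoring the entry-variable and `alternate` refinements; the names `empty_arc_ok`, `head_empty_value`, and `other_daughter_values` are illustrative, not from the original `empty_arcC`):

```python
def empty_arc_ok(head_empty_value, other_daughter_values, arc):
    """Assumed core of the EmptyValency condition described above."""
    if head_empty_value == arc:
        # The empty node carries the arc: no other daughter may repeat it
        return arc not in other_daughter_values
    # The empty node does not carry it: some other out arc must have it
    return arc in other_daughter_values
```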
-
- def merge_empty_constraint(self, head, head_index, entry_var, entry_i, arc,
- empty_indices, dim_abbrev, head_vars, dim_vars):
- """If head->empty is arc for any of empties, there is no other out arc from head with label arc.
- If head->empty is None for all empties, some other out arc from head must have label arc.
- """
- head_empty_arcs = [dim_vars['arc_daughs'][(head_index, empty_index)] for empty_index in empty_indices]
- n_empty = len(head_empty_arcs)
- # Remove head_empty_arcs from daughter_arcs
- daughter_arcs = [arc_var for arc_var in head_vars.get('daughter_vars') if arc_var not in head_empty_arcs]
-# print('Merge empty constraint, head:{0}, head_entry: {3}, arc:{1}, empty_indices:{2}'.format(head, arc, empty_indices, entry_i))
- variables = [entry_var] + head_empty_arcs + daughter_arcs
- constraint = XDGConstraint(merge_empty_arcC(arc, entry_i, n_empty),
- name=str(head_index) + '[' + str(entry_i) + ']:' + arc + ':MergeEmptyValency',
- cls='valency')
- self.addConstraint(constraint, variables)
-
- def group_entry_constraint(self, gid, group_obj):
- """Create the constraint that requires the same entry for all
nodes belonging to group.
- Also add to groups the dict that simplifies group_arc_principle.
-
- @param gid: group id
- @type gid: string
- @param group_obj: a Group object
- """
- # Entry variables for the constraint
- variables = []
- # List of lists of group entries for each node
- group_entries = []
- # List of lists of variable, entry pairs
- for lex, node_ls in group_obj.lex.items():
- group_entry_sublist = []
- for node, index in node_ls:
- entry_var = node.entry_var
- variables.append(entry_var)
- group_entry_sublist.append(index)
- group_entries.append(group_entry_sublist)
- # Exactly one node in each sublist must take the group entry if any does
- self.addConstraint(XDGConstraint(group_entries_agreeC(group_entries),
- name=gid + ':Group', cls='group'),
- variables)
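The comment above states the intended semantics of the group constraint: a group entry is either taken by no node at all, or by exactly one node in every sublist. The factory `group_entries_agreeC` itself is not shown in this diff, so the following is an assumed reconstruction of that condition as a standalone checker (`group_entries_agree`, `assignment`, and `hits_per_sublist` are illustrative names, not from the original code):

```python
def group_entries_agree(assignment, group_entries):
    """Assumed semantics of the group constraint: either no node selects
    its group entry, or exactly one node in every sublist does.
    assignment[i] is the entry index chosen for the i-th entry variable,
    in the same order in which the sublists were flattened."""
    idx = 0
    hits_per_sublist = []
    for sublist in group_entries:
        # Count how many nodes in this sublist chose their group entry
        hits = 0
        for group_entry in sublist:
            if assignment[idx] == group_entry:
                hits += 1
            idx += 1
        hits_per_sublist.append(hits)
    # All-or-nothing: no group entry anywhere, or exactly one per sublist
    return (all(h == 0 for h in hits_per_sublist)
            or all(h == 1 for h in hits_per_sublist))
```

For example, with `group_entries = [[2], [1, 3]]`, the assignment `[2, 1, 0]` satisfies the condition (one hit per sublist), `[0, 0, 0]` satisfies it vacuously, and `[2, 0, 0]` violates it.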
-
- def create_arc_variables(self):
- """Create variables for arc labels."""
- # Create separate arcs for each dimension that has them
- for dimension in self.dimensions:
- # Only if this is a dimension that has arcs
- if not isinstance(dimension, ArcDimension):
- continue
- dim_abbrev = dimension.abbrev
- dim_vars = self.dim_vars[dim_abbrev]
- # Initialize dicts for storing vars in problem dimension
- for attrib in ['arc_vars', 'arc_daughs', 'daughter_arc_vars', 'mother_arc_vars']:
- if attrib not in dim_vars:
- dim_vars[attrib] = {}
- if 'arc_var_list' not in dim_vars:
- # List of arc vars
- dim_vars['arc_var_list'] = []
- # Check here for whether dimension has arcs
- # For all nodes except the end-of-sentence node
- for index, node1 in enumerate(self.get_nodes()):
- index1 = node1.index
- str1 = str(index1)
- # Variables for this dimension in node1
- vars1 = node1.vars[dim_abbrev]
- # Outs and ins for node1 on dimension
- outs1 = vars1.get('outs', [])
- ins1 = vars1.get('ins', [])
- haslex1 = node1.n_entries > 0
- # Create arc variables in both directions to other nodes
- for node2 in self.get_nodes()[index+1:]:
- vars2 = node2.vars[dim_abbrev]
- haslex2 = node2.n_entries > 0
- outs2 = vars2.get('outs', [])
- ins2 = vars2.get('ins', [])
- index2 = node2.index
- str2 = str(index2)
- # Intersections of ins and outs of node1 and node2
- if haslex1 and haslex2:
- outs2ins1 = outs2 & ins1
- outs1ins2 = outs1 & ins2
- elif haslex1:
- outs2ins1 = ins1 - set(['root'])
- outs1ins2 = outs1
- else:
- outs2ins1 = outs2
- outs1ins2 = ins2 - set(['root'])
- # Arc into node1, only if outs2 & ins1 is not empty
- if outs2ins1:
- self.make_arc_vars(str1, str2, index1, index2, vars1, vars2,
- dim_vars, dim_abbrev, outs2ins1)
- # Arc out of node1, only if ins2 & outs1 is not empty
- if outs1ins2:
- self.make_arc_vars(str2, str1, index2, index1, vars2, vars1,
- dim_vars, dim_abbrev, outs1ins2)
-
- def make_arc_vars(self, str1, str2, index1, index2, vars1, vars2, dim_vars, dim_abbrev,
- outs2ins1):
- # String name for the arc variable
- var2 = dim_abbrev + ':' + str2 + '->' + str1
- # Values constrained to be labels in the in-out intersection
- self.addVariable(var2, list(outs2ins1) + [None])
- # Store the arc var in variables dict with index pair as key
- # and vice versa
- dim_vars['arc_daughs'][(index2, index1)] = var2
- dim_vars['arc_vars'][var2] = (index2, index1)
- # Add the variable to mother and daughter var lists in nodes
- vars1['mother_vars'].append(var2)
- vars2['daughter_vars'].append(var2)
- # Add daughter index to var_daughters list in mother
- vars2['var_daughters'].append(index1)
- # Add mother index to var_mothers list in daughter
- vars1['var_mothers'].append(index2)
- # Add the var to the variable list
- var_index = len(dim_vars['arc_var_list'])
- if index2 not in dim_vars['daughter_arc_vars']:
- dim_vars['daughter_arc_vars'][index2] = []
- if index1 not in dim_vars['mother_arc_vars']:
- dim_vars['mother_arc_vars'][index1] = []
- dim_vars['daughter_arc_vars'][index2].append((var_index, index1))
- dim_vars['mother_arc_vars'][index1].append((var_index, index2))
- dim_vars['arc_var_list'].append(var2)
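Both `make_arc_vars` above and `empty_arc` earlier build arc-variable names with the scheme `<dimension>:<mother index>-><daughter index>`, where negative indices denote empty nodes. A minimal standalone sketch of that convention and its inverse (`parse_arc_var` is a hypothetical helper for illustration, not part of the original code):

```python
def arc_var_name(dim_abbrev, mother_index, daughter_index):
    # The naming scheme used when arc variables are registered:
    # "<dimension>:<mother index>-><daughter index>"
    return dim_abbrev + ':' + str(mother_index) + '->' + str(daughter_index)

def parse_arc_var(name):
    # Hypothetical inverse: recover (dimension, mother, daughter)
    dim, _, arc = name.partition(':')
    mother, _, daughter = arc.partition('->')
    return dim, int(mother), int(daughter)

print(arc_var_name('syn', 2, -1))  # syn:2->-1  (negative index: empty node)
```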
-
- def solve(self, dimensions=None, verbose=1, raw=False, draw=True,
- trace_var=''):
- """Find solutions to problem and pretty-print them.
- @param dimensions: dimensions to solve for
- @type dimensions: list of dimension abbrevs (strings)
- @param verbose: whether to print out verbose messages
- @type verbose: int -- 0: terse, 1: summary msg, 2: lots of msgs
- @param raw: whether to return "raw" solutions
- @type raw: boolean
- """
- if dimensions:
- dimensions = [dim for dim in self.dimensions if dim.abbrev in dimensions]
- else:
- dimensions = self.dimensions
- print('\nSOLVING', end=' ')
- for word in self.sentence:
- if isinstance(word, tuple):
- print(word[0], end=' ')
- else:
- print(word, end=' ')
- print()
- XDGConstraint.calls = 0
- for cls in XDGConstraint.calls_by_class:
- XDGConstraint.calls_by_class[cls] = 0
- if verbose:
- t1 = time.time()
- solutions = self.getSolutions(verbose=verbose, trace_var=trace_var)
- # Record time here to avoid timing printing
- if verbose:
- time_diff = time.time() - t1
- solution_objects = [Solution(self, solution, self.get_nodes(), self.sentence, self.dimensions) \
- for solution in solutions]
- if len(solutions) > 10:
- print('\nFound', len(solutions), 'solutions')
- elif draw:
- self.print_solutions(solutions, solution_objects)
- if verbose:
- # Print out information about number of calls
- print()
- print('Time: %0.3f ms' % (time_diff * 1000.0,))
- print('Total calls:', XDGConstraint.calls)
- if XDGConstraint.calls_by_class:
- print('Calls by class:')
- calls = list(XDGConstraint.calls_by_class.items())
- calls.sort(key=lambda x: x[1], reverse=True)
- for c in calls:
- print(' ', c[0], c[1])
- if raw:
- return solutions
- else:
- return [(s.nodes, s.arcs) for s in solution_objects]
-
- def get_nodes(self, eos=True, empty=False):
- """Return the sentence's nodes.
- @param eos: whether to include end-of-sentence node
- @type eos: boolean
- @param empty: whether to include empty nodes
- @type empty: boolean
- """
- nodes = [node for node in self.nodes if eos or node != self.eos]
- if empty:
- nodes += self.empty
- return nodes
-
- def get_node(self, index):
- if index < 0:
***The diff for this file has been truncated for email.***
=======================================
--- /xdg/languages/__init__.py Mon Jan 25 05:34:33 2010 UTC
+++ /dev/null
@@ -1,19 +0,0 @@
-"""
-This file is part of L3XDG.
-
- L3XDG is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- L3XDG is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with L3XDG. If not, see <http://www.gnu.org/licenses/>.
-
-Author: Michael Gasser <gas...@cs.indiana.edu>
-"""
-from . import languages
=======================================
--- /xdg/languages/am.py Mon Apr 26 22:12:47 2010 UTC
+++ /dev/null
@@ -1,215 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
-This file is part of L3XDG.
-
- L3XDG is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- L3XDG is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with L3XDG. If not, see <http://www.gnu.org/licenses/>.
-
-Author: Michael Gasser <gas...@cs.indiana.edu>
-"""
-# 2009.11.18
-#
-# 2009.11.30
-# Updated to include morphology
-#
-# 2010.04.15
-# Semantics added (some time back)
-# Labels updated to include 'rel': from root into
-# relative clause verb
-
-from .language import *
-from .morpho import *
-from .morpho.geez import *
-
-AMHARIC = Language('Amharic', 'Am', morph_processing=True,
- labels={'syn': ['sb', 'ob', 'top', 'pob', 'adv', 'sub',
-# 'sbrel',
- # links FROM modifiers to noun
- 'det', 'adj', 'rel',
- # del for 0 arguments?
- 'del',
- 'prp', 'mv', None],
-# 'id': ['adj', 'adv', 'comp', 'det', 'iob', 'ob', 'part', 'pmod', 'pob1',
-# 'pob2', 'prpc', 'rel', 'root', 'sub', 'sb', 'vbse', 'vinf', 'vprt', None],
-# 'lp': ['adjf', 'compf', 'detf', 'fadvf', 'lbf', 'mf1', 'mf2', 'nf', 'padjf',
-# 'padvf', 'prpcf', 'rbf', 'relf', 'root', 'rprof', 'tadvf', 'vf', 'vvf', None],
- 'sem': ['arg1', 'arg2', 'arg3', 'del', 'rel', 'vmod', 'nmod', 'loc', 'coref', None]},
- postproc=lambda form: sera2geez(GEEZ_SERA['am'][1], form, lang='am'),
- preproc=lambda form: geez2sera(GEEZ_SERA['am'][0], form, lang='am'),
- eos_chars=['።', '፧', '!', '?'],
- seg_units=[["a", "e", "E", "i", "I", "o", "u", "H", "w", "y", "'", "`", "_", "|", "*"],
- {"b": ["b", "bW"], "c": ["c", "cW"], "C": ["C", "CW"],
- "d": ["d", "dW"], "f": ["f", "fW"], "g": ["g", "gW"],
- "h": ["h", "hW"], "j": ["j", "jW"], "k": ["k", "kW"],
- "l": ["l", "lW"], "m": ["m", "mW"], "n": ["n", "nW"],
- "p": ["p", "pW"], "P": ["P", "PW"],
- "N": ["N", "NW"], "q": ["q", "qW"], "r": ["r", "rW"],
- "s": ["s", "sW"], "S": ["S", "SW"], "t": ["t", "tW"],
- "T": ["T", "TW"], "v": ["v", "vW"], "x": ["x", "xW"],
- "z": ["z", "zW"], "Z": ["Z", "ZW"],
- "^": ["^s", "^S", "^h", "^hW", "^sW", "^SW"]}])
-
-AMHARIC.set_morphology(Morphology((),
- pos_morphs=['cop', 'n', 'v'],
-                                  # Exclude ^ and - (because it can be used in compounds)
-                                  punctuation=r'[“‘”’–—:;/,<>?.!%$()[\]{}|#@&*\_+=\"፡።፣፤፥፦፧፨]',
- # Include digits?
- characters=r'[a-zA-Zሀ-ፚ\'`^]'))
-
-# Function that extracts root and relevant features from analysis
-AMHARIC.extract = lambda analyses: extract(analyses)
-
-# Functions that simplify Amharic orthography
-AMHARIC.morphology.simplify = lambda word: simplify(word)
-AMHARIC.morphology.orthographize = lambda word: orthographize(word)
-
-def simplify(word):
- """Simplify Amharic orthography."""
-    word = word.replace("`", "'").replace('H', 'h').replace('^', '').replace('_', '')
- return word
-
-def orthographize(word):
- '''Convert phonological romanization to orthographic.'''
- word = word.replace('_', '').replace('I', '')
- return word
-
-def v_get_citation(root, fs, simplified=False, guess=False, vc_as=False):
-    '''Return the canonical (prf, 3sm) form for the root and feature structure fs.
-
- If vc_as is True, preserve the voice and aspect of the original word.
- '''
- if root == 'al_e':
- return "'ale"
- # Return root if no citation is found
- result = root
- # Unfreeze the feature structure
- fs = fs.unfreeze()
-    # Update the feature structure to incorporate the default (with or without vc and as)
-    fs.update(AMHARIC.morphology['v'].citationFS if vc_as else AMHARIC.morphology['v'].defaultFS)
- # Refreeze the feature structure
- fs.freeze()
-    # Find the first citation form compatible with the updated feature structure
-    citation = AMHARIC.morphology['v'].gen(root, fs, from_dict=False,
-                                           simplified=simplified, guess=guess)
- if citation:
- result = citation[0][0]
- elif not vc_as:
- # Verb may not occur in simplex form; try passive
- fs = fs.unfreeze()
- fs.update({'vc': 'ps'})
- fs.freeze()
-        citation = AMHARIC.morphology['v'].gen(root, fs, from_dict=False,
-                                               simplified=simplified, guess=guess)
- if citation:
- result = citation[0][0]
- return result
-
-def n_get_citation(root, fs, simplified=False, guess=False, vc_as=False):
-    '''Return the canonical (prf, 3sm) form for the root and feature structure fs.
-
- If vc_as is True, preserve the voice and aspect of the original word.
- '''
- if fs.get('v'):
- # It's a deverbal noun
-        return v_get_citation(root, fs, simplified=simplified, guess=guess, vc_as=vc_as)
- else:
- return root
-
-def extract(analyses):
- """Extract relevant features from analyses."""
- results = []
- for analysis in analyses:
- pos = analysis[0]
- if pos == 'n':
- results.append(n_extract(analysis))
- elif pos == 'v':
- results.append(v_extract(analysis))
- elif pos == 'cop':
- results.append(cop_extract(analysis))
- else:
- # Known word with no analysis
- results.append(analysis[1])
- return results
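[Editor's note: as a standalone illustration of the dispatch in extract() above, here is a minimal sketch with plain dicts standing in for FeatStruct, and without the 'expl' filtering; the analysis tuples in the example are invented for illustration.]

```python
# Minimal stand-in for the extract() dispatcher above: an analysis is a
# (pos, root, citation, features) tuple; features are plain dicts here
# rather than FeatStruct objects (an assumption, for self-containment).

def v_extract(analysis):
    """Keep only the verb features the grammar cares about."""
    citation, fs = analysis[2], analysis[3]
    res = {feat: val for feat in ['sb', 'ob', 'tm', 'sub', 'rel', 'neg', 'rl']
           if (val := fs.get(feat)) is not None}
    return citation, res

def extract(analyses):
    """Dispatch each analysis on its part of speech."""
    results = []
    for analysis in analyses:
        if analysis[0] == 'v':
            results.append(v_extract(analysis))
        else:
            # Known word with no analysis: pass the form through unchanged
            results.append(analysis[1])
    return results

# Invented example: one relevant feature (tm) kept, one irrelevant (as) dropped.
print(extract([('v', 'sbr', 'sebere', {'tm': 'prf', 'as': 'smp'})]))
```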
-
-def cop_extract(analysis):
-# root = analysis[1]
-# citation = analysis[2]
- fs = analysis[3]
-# res = FeatStruct()
- res = dict()
- for feat in ['sb']:
- val = fs.get(feat)
- if val != None:
- res[feat] = val
- return 'ነው', res
-
-def v_extract(analysis):
- '''Extract the features we care about from the analysis.'''
-# root = analysis[1]
- citation = analysis[2]
- fs = analysis[3]
-# res = FeatStruct()
- res = dict()
- for feat in ['sb', 'ob', 'tm', 'sub', 'rel', 'neg', 'rl']:
- val = fs.get(feat)
- if val != None:
- if isinstance(val, FeatStruct):
- if val.get('expl', True) != False:
- res[feat] = val
- else:
- res[feat] = val
- return citation, res
-
-def n_extract(analysis):
- '''Extract the features we care about from the analysis.'''
-# root = analysis[1]
- citation = analysis[2]
- fs = analysis[3]
-# res = FeatStruct()
- res = dict()
- res['^'] = FeatStruct()
- for feat in ['rl', 'poss']:
- val = fs.get(feat)
- if val != None and val.get('expl', True) != False:
- res[feat] = val
- for feat in ['def', 'gen', 'plr', 'pp']:
- res['^'][feat] = fs.get(feat)
- return citation, res
-
-## Functions converting between feature structures and simple dicts
-#AMHARIC.morphology['v'].anal_to_dict = lambda root, anal: v_anal_to_dict(root, anal)
-#AMHARIC.morphology['v'].dict_to_anal = lambda root, anal: v_dict_to_anal(root, anal)
-
-## Default feature structures for POSMorphology objects
-## Used in generation and production of citation form
-AMHARIC.morphology['v'].defaultFS = \
-    FeatStruct("[pos=v,tm=prf,as=smp,vc=smp,"\
-               + "sb=[-p1,-p2,-plr,-fem],ob=[-expl,-p1,-p2,-plr,-fem,-b,-l,-prp,-frm],"\
-               + "cj1=None,cj2=None,pp=None,ax=None,-neg,-rel,-sub,-def,-acc,rl=[-acc,-p,-adv,-comp]]")
-AMHARIC.morphology['v'].FS_implic = {'rel': ['sub'], 'cj1': ['sub'], 'pp': ['rel', 'sub'],
-                                     'def': ['rel', 'sub'], 'l': ['prp'], 'b': ['prp'], 'ob': [['expl']]}
-# defaultFS with voice and aspect unspecified
-AMHARIC.morphology['v'].citationFS = \
-    FeatStruct("[pos=v,tm=prf,sb=[-p1,-p2,-plr,-fem],ob=[-expl],cj1=None,cj2=None,pp=None,ax=None,-neg,-rel,-sub,-def,-acc,rl=[-p,-acc,-adv,-comp]]")
-AMHARIC.morphology['n'].defaultFS = \
-    FeatStruct("[pos=n,-acc,-def,-neg,-fem,as=smp,cnj=None,der=[-ass],-dis,-gen,-plr,poss=[-expl,-p1,-p2,-plr,-fem,-frm],pp=None,v=None,vc=smp,rl=[-acc,-p,-gen]]")
-AMHARIC.morphology['n'].FS_implic = {'poss': [['expl']]}
-# defaultFS with voice and aspect unspecified
-AMHARIC.morphology['n'].citationFS = \
-    FeatStruct("[-acc,-def,-neg,-fem,cnj=None,der=[-ass],-dis,-gen,-plr,poss=[-expl],pp=None,v=inf,rl=[-acc,-p,-gen]]")
-AMHARIC.morphology['cop'].defaultFS = \
-    FeatStruct("[cj2=None,-neg,ob=[-expl],-rel,sb=[-fem,-p1,-p2,-plr,-frm],-sub,tm=prs]")
-
-## Functions that return the citation forms for words
-AMHARIC.morphology['v'].citation = lambda root, fss, simplified, guess, vc_as: v_get_citation(root, fss, simplified, guess, vc_as)
-AMHARIC.morphology['n'].citation = lambda root, fss, simplified, guess, vc_as: n_get_citation(root, fss, simplified, guess, vc_as)
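[Editor's note: the two orthography helpers deleted above are self-contained string rewrites; here they are isolated as a runnable sketch. The example romanizations are invented for illustration, not attested SERA forms.]

```python
# Standalone copies of the two orthography helpers from am.py above.

def simplify(word):
    """Simplify Amharic (SERA) orthography: collapse laryngeal/sibilant variants."""
    return word.replace("`", "'").replace('H', 'h').replace('^', '').replace('_', '')

def orthographize(word):
    """Convert phonological romanization to orthographic: drop gemination (_) and epenthetic I."""
    return word.replace('_', '').replace('I', '')

print(simplify("me^Ha_l"))      # ^ removed, H -> h, _ removed: "mehal"
print(orthographize("yIhe_d"))  # I and _ removed: "yhed"
```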
=======================================
--- /xdg/languages/am.yaml Mon May 3 00:26:04 2010 UTC
+++ /dev/null
@@ -1,216 +0,0 @@
-# 2010.05.01
-# Removed govern constraint for relative verb -> head noun
-
-- word: ROOT
- syn:
- outs: {root: '!', del: '*'}
- sem:
- outs: {root: '*', rel: '*', del: '*'}
-
-#### Grammar
-### Verbs
-- gram: V
- # Main clause verbs; -sub (subordinate)
- label: V_MAIN
- syn:
- ins: {root: '!'}
- # not subordinate
- agrs: {sub: [False]}
- # all possible outs
- outs: {adv: '*', pob: '*', sb: '?', ob: '?', top: '?'}
-    govern: {sb: [rl, '[-acc,-p]'], top: [rl, '[-acc,-p]'], pob: [rl, '[-acc,+p]'], ob: [rl, '[-p]']}
- # everything precedes the verb but is unordered otherwise
- order: [[adv, ^], [pob, ^], [sb, ^], [ob, ^], [top, ^]]
- sem:
- ins: {root: '!'}
- outs: {vmod: '*', arg1: '?', arg2: '?', arg3: '?'}
- synsem:
- # Default sem-syn role mappings
- lb12s: {arg3: [pob], arg1: [sb, top], arg2: [ob]}
-- gram: V
- # Relative clause verbs; +rel; sb is relativized; so clause sb is empty
- label: V_REL_SB
- syn:
- ins: {root: 0, sb: '?', ob: '?', pob: '?', top: '?'}
- agrs: {rel: [True]}
- # sb features agree with those of modified noun (on sb arc)
- agree: {sb: [sb, ^]}
- # sb must agree with same feature in head and with sb node (noun)
- cross: {sb: [sb]}
- outs: {adv: '*', pob: '*', ob: '?', top: '?', sb: '%%'}
- # verb government, and restriction on modified noun
- # no government of head noun (sb)
-    govern: {top: [rl, '[-acc,-p]'], pob: [rl, '[-acc,+p]'], ob: [rl, '[-p]']}
- # everything precedes the verb, except the noun it modifies (sb)
- order: [[adv, ^], [pob, ^], [^, sb], [ob, ^], [top, ^]]
- sem:
- ins: {rel: '!'}
- outs: {arg1: '!', arg2: '?', arg3: '?', vmod: '*'}
- synsem:
- ldend: {arg1: [sb]}
- lb12s: {arg3: [pob], arg2: [ob]}
- merge: {sb: [arg1, arg2, arg3]}
-- gram: V
- # Relative clause verbs; +rel; ob is relativized; so clause ob is empty
- label: V_REL_OB
- syn:
- ins: {root: 0, sb: '?', ob: '?', pob: '?', top: '?'}
- # Usually definite too, but not always?
- agrs: {rel: [True]}
- # ob features agree with those of modified noun (on ob arc)
- agree: {ob: [ob, ^]}
- # ob must agree with same feature in head and in ob node (noun)
- cross: {ob: [ob]}
- outs: {adv: '*', pob: '*', sb: '?', ob: '%%', top: '?'}
- # verb government, and restriction on modified noun
- # no government of head noun (ob)
-    govern: {sb: [rl, '[-acc,-p]'], top: [rl, '[-acc,-p]'], pob: [rl, '[-acc,+p]']}
- # everything precedes the verb, except the noun it modifies (ob)
- order: [[adv, ^], [pob, ^], [sb, ^], [^, ob], [top, ^]]
- sem:
- ins: {rel: '!'}
- outs: {arg1: '?', arg2: '!', arg3: '?', vmod: '*'}
- synsem:
- ldend: {arg2: [ob]}
- lb12s: {arg1: [sb, top], arg3: [pob]}
- merge: {ob: [arg1, arg2, arg3]}
-## syntactic subjects and topics: V -> {V_SB, V_TOP, V_IMPRS}; determined lexically
-## V_TOP is not incompatible with V_SB and V_IMPRS: አለው, ደከመው
-# verbs that require syntactic subjects
-- gram: V_SB
- classes: [V]
- syn:
- outs: {sb: '%'}
- agree: {sb: [sb, [^, sb, ob, pob]]}
- sem:
- outs: {arg1: '!'}
-# synsem:
-# lb12s: {arg1: [sb]}
-# verbs that require syntactic topics, agreeing with morphological objects
-# some have subjects; others don't
-# አለው, አመመው, ደከመው
-- gram: V_TOP
- classes: [V]
- syn:
- outs: {top: '%'}
- agree: {top: [ob, [^, sb, ob, pob]]}
- sem:
- outs: {arg1: '%'}
- synsem:
- lb12s: {arg1: [top]}
-# verbs that disallow syntactic subjects
-# ዘነበ, አመመው, ደከመው
-- gram: V_IMPRS
- classes: [V]
- syn:
- outs: {sb: 0}
-## syntactic objects: V -> {V_T, V_TI, V_I}; determined lexically
-# verbs that require syntactic objects, agreeing with morphological objects
-# also disallow topics
-# ወሰደ
-- gram: V_T
- classes: [V]
- syn:
- outs: {ob: '%', top: 0}
- agree: {ob: [ob, [^, sb, ob, pob]]}
- sem:
- outs: {arg2: '!'}
-# synsem:
-# lb12s: {arg2: [ob]}
-# reflexive verbs that permit syntactic objects
-# ታጠበ
-- gram: V_REFL
- classes: [V]
- syn:
- outs: {ob: '?'}
- sem:
- outs: {arg2: '!'}
- synsem:
- lb12s: {arg2: [ob]}
-# verbs that permit syntactic objects
-# በላ, ተጫወተ, አመመው
-- gram: V_TI
- classes: [V]
- syn:
- outs: {ob: '?'}
- agree: {ob: [ob, [^, sb, ob, pob]]}
- sem:
- outs: {arg2: '?'}
- lb12s: {arg2: [ob]}
-# verbs that disallow syntactic objects (and topics)
-# ሄደ, ሞተ, ተደረገ
-- gram: V_I
- classes: [V]
- syn:
- agrs: ['[ob=[-expl]]']
- outs: {ob: 0, top: 0}
- sem:
- outs: {arg2: 0}
-## joint categories
-- gram: V_wesede
- classes: [V_T, V_SB]
-- gram: V_bela
- classes: [V_TI, V_SB]
-- gram: V_mote
- classes: [V_I, V_SB]
-- gram: V_ameme
- classes: [V_TI, V_TOP]
- syn:
- outs: {sb: 0}
-- gram: V_alle
- classes: [V_TOP, V_SB, V_I]
-- gram: V_taTebe
- classes: [V_SB, V_REFL]
-- gram: V_dekeme
- classes: [V_TOP, V_IMPRS]
-- gram: V_zenebe
- classes: [V_IMPRS, V_TI]
-
-### Nouns
-- gram: N
- syn:
-    ins: {sb: '?', ob: '?', pob: '?', top: '?', det: '?', adj: '*', rel: '*'}
-  sem:
-    # Can come from multiple relative clauses as well as main or complement clauses
-    ins: {arg1: '*', arg2: '*', arg3: '*'}
-- gram: N_C
- classes: [N]
- syn:
- agrs: ['[-p1,-p2]']
- sem:
- outs: {nmod: '*'}
- synsem:
- # mod: {nmod: [rel, adj]}
- # For now treat adjectives differently from relative clauses
- mod: {nmod: [adj]}
-- gram: PRO
- classes: [N]
-
-#### Lexemes
-### Sublexicons
-- sublexicon: am_nouns
-- sublexicon: am_verbs
-
-#### Wordforms
-
-#### Empty nodes
-## The agrs rl features are probably wrong; a merged empty node
-## can be subject and object (both +acc and -acc??)
-- word: '%SB'
- syn:
- ins: {sb: '?'}
- agrs: {^: [T], rl: [T]}
- sem:
- ins: {arg1: '?'}
-- word: '%OB'
- syn:
- ins: {ob: '?'}
- agrs: {^: [T], rl: [T]}
- sem:
- ins: {arg2: '?'}
-- word: '%TOP'
- syn:
- ins: {top: '?'}
- agrs: {^: [T], rl: [T]}
- sem:
- ins: {arg1: '?'}
=======================================
--- /xdg/languages/en.py Sat Apr 3 21:31:24 2010 UTC
+++ /dev/null
@@ -1,39 +0,0 @@
-"""
-This file is part of L3XDG.
-
- L3XDG is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- L3XDG is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
-    along with L3XDG. If not, see <http://www.gnu.org/licenses/>.
-
-Author: Michael Gasser <gas...@cs.indiana.edu>
-"""
-# English greatly simplified
-#
-# 2009.10.31
-# Added groups: 'had an argument', 'BREAK the ice'
-#
-# 2010.01.05
-# Added id and lp dimensions
-
-from .language import *
-
-ENGLISH = Language('English', 'En', morph_processing=False,
-                   labels={'syn': ['sb', 'ob', 'adv', 'det', 'adj', 'prp', 'pob', 'mv', 'rel', 'sub', None],
-                           'id': ['adj', 'adv', 'comp', 'det', 'iob', 'ob', 'part', 'pmod', 'pob1',
-                                  'pob2', 'prpc', 'rel', 'root', 'sub', 'sb', 'vbse', 'vinf', 'vpspt',
-                                  # Beyond diss grammar
-                                  'vprog',  # Progressive -ing
-                                  None],
-                           'lp': ['adjf', 'compf', 'detf', 'fadvf', 'lbf', 'mf1', 'mf2', 'nf', 'padjf',
-                                  'padvf', 'prpcf', 'rbf', 'relf', 'root', 'rprof', 'tadvf', 'vf', 'vvf',
-                                  None],
-                           'sem': ['arg1', 'arg2', 'arg3', 'del', 'mod', 'loc', 'coref', None]})
=======================================
--- /xdg/languages/en.yaml Sun May 2 00:45:20 2010 UTC
+++ /dev/null
@@ -1,527 +0,0 @@
-##
-## 2010.01.30: added barrier restriction on relative clause verbs
-##
-## 2010.01.31
-## new relative pronouns, including constraints on that, who, whom
-
-#### End-of-sentence node
-- word: ROOT
- id:
- outs: {root: '!', del: '*'}
- lp:
- outs: {root: '!'}
- sem:
- outs: {root: '*', del: '*'}
-#### Grammar
-### Verbs
-- gram: V
- idlp:
- # Slots for subject, dir object, ind object
- arg: {mf1: [iob], mf2: [ob]}
- ldend: {vf: [sb]}
-## V -> {V_MAIN, V_AUX, V_AUQ}
-## distinguished by order in LP, presence of PRO_WH, vbse/vpspt/vinf/vprog
-- gram: V_MAIN
- classes: [V]
- id:
- outs: {adv: '*', pob1: '?', pmod: '*'}
- lp:
-    order: [[compf, rprof, vvf, vf, fadvf, ^, lbf, mf1, mf2, rbf, padvf, tadvf, nf]]
-- gram: V_AUX
- classes: [V]
- id:
- outs: {vbse: '?', vpspt: '?', vprog: '?', vinf: '?'}
- lp:
-    order: [[compf, rprof, vvf, vf, ^, fadvf, lbf, mf1, mf2, rbf, padvf, tadvf, nf]]
- sem:
- ins: {del: '!'}
-- gram: V_QAUX
- classes: [V]
- id:
-    outs: {vbse: '?', vpspt: '?', vprog: '?', vinf: '?'}  # what about progressive -ing?
- lp:
- # Has to have a PRO_WH in the vvf slot
- outs: {vvf: '!'}
-    order: [[compf, rprof, vvf, ^, vf, fadvf, lbf, mf1, mf2, rbf, padvf, tadvf, nf]]
- sem:
- ins: {del: '!'}
-## V -> {V_FIN, V_NONFIN}
-# V_FIN: must have a subject, but not necessarily a vf
-- gram: V_FIN
- classes: [V]
- id:
- outs: {sb: '!'}
- agree: [[sb, sb, ^]]
- lp:
-    outs: {lbf: '?', fadvf: '*', mf1: '?', mf2: '?', rbf: '*', padvf: '*', tadvf: '*', nf: '?'}
- idlp:
- blocks: [adv, comp, pmod, sub, vbse, vinf, vprog, vpspt]
- sem:
- outs: {arg1: '!'}
- idsem:
- lb12s: {arg1: [sb]}
-## V_FIN -> {V_FIN_ROOT, V_FIN_REL, V_FIN_SUB}
-- gram: V_FIN_ROOT
- classes: [V_FIN]
- id:
- ins: {root: '!'} # '?' in Diss: ch. 9
- lp:
- ins: {root: '!'} # '?' in Diss: ch. 9
- outs: {vvf: '?', vf: '!'}
- sem:
- ins: {root: '!'}
-- gram: V_FIN_REL
- classes: [V_FIN]
- id:
- ins: {rel: '!'} # '?' in Diss: ch. 9
- lp:
- ins: {relf: '!'} # '?' in Diss: ch. 9
- outs: {rprof: '!', vf: '?'} # Doesn't handle the case of no relpro
- sem:
- ins: {mod: '!'}
-# You can't extract anything from a relative clause.
-# NOT in Diss grammar.
-# Needed to prevent:
-# people that eat eat yogurt .
-# ----ob--->
- idlp:
- blocks: [sb, ob]
- sem:
- ins: {mod: '!'}
-- gram: V_FIN_SUB
- classes: [V_FIN]
- id:
- ins: {sub: '!'} # '?' in Diss: ch. 9
- outs: {comp: '?'}
- lp:
- ins: {vvf: '?', nf: '?'}
- outs: {compf: '?', vf: '?'}
-## V_FIN_ROOT -> {V_AUX_FIN (1), V_MAIN_FIN}
-## V_AUX -> {V_AUX_FIN (2)}
-## V_MAIN -> {V_MAIN_FIN (2)}
-- gram: V_AUX_FIN
- label: V_AUX_FIN_ROOT
- classes: [V_AUX, V_FIN_ROOT]
-- gram: V_AUX_FIN
- label: V_AUX_FIN_REL
- classes: [V_AUX, V_FIN_REL]
-- gram: V_MAIN_FIN
- label: V_MAIN_FIN_ROOT
- classes: [V_MAIN, V_FIN_ROOT]
-- gram: V_MAIN_FIN
- label: V_MAIN_FIN_REL
- classes: [V_MAIN, V_FIN_REL]
-- gram: V_MAIN_FIN
- label: V_MAIN_FIN_SUB
- classes: [V_MAIN, V_FIN_SUB]
-# Non-finite verbs
-- gram: V_NONFIN
- classes: [V]
-## V_NONFIN -> {V_BSE, V_INF, V_PSPT, V_ING}
-# Base form infinitives
-- gram: V_BSE
- classes: [V_NONFIN]
- id:
- ins: {vbse: '!'}
- lp:
- ins: {lbf: '?'}
- outs: {lbf: '?'}
-# Full infinitives
-- gram: V_INF
- classes: [V_NONFIN]
- id:
- ins: {vinf: '!'}
- outs: {part: '!'}
- lp:
- ins: {rbf: '?'}
-    outs: {vvf: '!', lbf: '?', fadvf: '*', mf1: '?', mf2: '?', rbf: '*', padvf: '*', tadvf: '*', nf: '?'}
-# Past participles; later split into (adverbial) participles and passive participles
-- gram: V_PSPT
- classes: [V_NONFIN]
- id:
- ins: {vpspt: '!'}
- lp:
- ins: {lbf: '?'}
- outs: {lbf: '?'}
-# V_ING -> {V_GER | V_PROG | V_PRPT}
-# gerunds
-- gram: V_ING
- classes: [V_NONFIN]
- label: V_GER
-# progressive -ing
-- gram: V_ING
- classes: [V_NONFIN]
- label: V_PROG
- id:
- ins: {vprog: '!'}
- lp:
- ins: {lbf: '?'}
- outs: {lbf: '?'}
-# adverbial present participles (where do these go in LP?)
-- gram: V_ING
- classes: [V_NONFIN]
- label: V_PRPT
-## Verb valency classes
-- gram: V_T
- classes: [V_MAIN]
- id:
- outs: {ob: '!'}
- sem:
- outs: {arg2: '!', arg3: 0}
- idsem:
- lb12s: {arg1: [sb], arg2: [ob]}
-- gram: V_TI
- classes: [V_MAIN]
- id:
- outs: {ob: '?'}
- sem:
- outs: {arg2: '?', arg3: 0}
- idsem:
- lb12s: {arg1: [sb], arg2: [ob]}
-## Ditransitive verbs; inherit from transitive to get object
-- gram: V_DT
- classes: [V_T]
- id:
- outs: {iob: '!'}
- lp:
- outs: {mf1: '!'}
- sem:
- outs: {arg3: '!'}
- idsem:
- lb12s: {arg1: [sb], arg2: [ob], arg3: [iob]}
-## Intransitive verbs: no object
-- gram: V_I
- classes: [V_MAIN]
- id:
- outs: {ob: 0, iob: 0}
- sem:
- outs: {arg2: 0, arg3: 0}
-## Complement verbs
-- gram: V_SUBC
- classes: [V_I]
- id:
- outs: {sub: '!'}
-
-## Passive verbs (complete later; for now just a place to keep pob2)
-- gram: V_PAS
- id:
- outs: {pob2: '?'}
-## V_MAIN_FIN -> {V_3SG, V_NON3SG, V_PS}
-# subclasses of finite verbs for particular tenses
-- gram: V_3SG
- classes: [V_MAIN_FIN]
- id:
- agrs: {sb: ['[-p1,-p2, -pl]'],
- tns: ['pres']}
-- gram: V_NON3SG
- classes: [V_MAIN_FIN]
- id:
- agrs: {sb: ['[+p1,-p2]', '[-p1,+p2]', '[-p1,-p2,+pl]'],
- tns: ['pres']}
-- gram: V_PS
- classes: [V_MAIN_FIN]
- id:
- agrs: {tns: ['past']}
-## ambiguity of verb stem
-- gram: V_STEM
- label: V_STEM_FIN
- classes: [V_NON3SG]
-- gram: V_STEM
- label: V_STEM_BSE
- classes: [V_BSE]
-- gram: V_STEM
- label: V_STEM_INF
- classes: [V_INF]
-### Nouns
-# N: all possible arcs
-- gram: N
- id:
- ins: {sb: '?', ob: '?', iob: '?', prpc: '?'}
- outs: {adj: '*', det: '?', rel: '*', pmod: '*'}
- lp:
- ins: {vvf: '?', vf: '?', mf1: '?', mf2: '?', prpcf: '?'}
- outs: {detf: '?', adjf: '*', padjf: '?', relf: '?'}
- order: [[detf, adjf, ^, padjf, relf]]
- idlp:
- arg: {detf: [det], adjf: [adj], padjf: [pmod], relf: [rel]}
- blocks: [det, adj, pmod, rel]
- sem:
- ins: {arg1: '?', arg2: '?', arg3: '?'}
- outs: {mod: '*'} # , coref: '*'}
- idsem:
- # Is this the right linking principle?
- lb12s: {mod: [rel, adj]}
-#        # Doesn't work because LB12S doesn't constrain the dim2 link to be at the end
-#        lb12s: {coref: [sb, ob, pob]}
-# Personal pronouns shouldn't inherit from N because of case
-# restrictions.
-## N -> {PRO, N_P, N_C, N_MS, N_ADJ}
-# Pronouns: no det, adj, rel, pmod
-- gram: PRO
- classes: [N]
- id:
- outs: {adj: 0, rel: 0, det: 0, pmod: 0}
- lp:
- outs: {adjf: 0, relf: 0, detf: 0, padjf: 0}
- sem:
- outs: {mod: 0}
-# Proper nouns: no det, adj, or pmod NORMALLY
-- gram: N_P
- classes: [N]
- id:
- outs: {det: 0, adj: 0}
- agrs: ['[-p1,-p2, -pl]']
- lp:
- outs: {detf: 0, adjf: 0}
-# Common nouns
-- gram: N_C
- classes: [N]
-# Count nouns
-- gram: N_CT
- classes: [N_C]
-# Adjectives behaving like nouns: "the rich"; how to constrain det?
-- gram: N_ADJ
- classes: [N]
- id:
- # Almost always for people
- agrs: ['[-p1,-p2, +pl,+hum]']
- outs: {det: '!', rel: 0, adj: 0, pmod: 0}
- lp:
- outs: {detf: '!', relf: 0, adjf: 0, padjf: 0}
-## PRO -> {PRO_PRS, PRO_REL, PRO_WH}
-- gram: PRO_PRS
- classes: [PRO]
- id:
- agrs: ['[+hum]']
-- gram: PRO_REL
- classes: [PRO]
- lp:
-    ins: {rprof: '?', prepcf: '?', vvf: 0, vf: 0, mf1: 0, mf2: 0}  # what about det, mf: ...whose book ... ?
- idlp:
- # Relative pronoun slot is for obj or subj or prep comp
- arg: {rprof: [sb, ob, iob, prpc]}
-# sem:
-# # Coreference link from modified noun
-# ins: {'coref': '!'}
-# who, whom, what (inherit some of this from N later)
-- gram: PRO_WH
- classes: [PRO]
- lp:
- ins: {vvf: '?', vf: '?', prepcf: '?'}
-## N_CT -> {N_PL, N_SG}
-## N_SG -> {N_SG_CT, N_MS}
-# Plural nouns: 3p agreement
-- gram: N_PL
- classes: [N_CT]
- id:
- agrs: ['[-p1,-p2, +pl]']
-# N_SG -> {N_MS, N_SG_CT}
-# Singular nouns: 3s agreement
-- gram: N_SG
- classes: [N_C]
- id:
- agrs: ['[-p1,-p2, -pl]']
-# Mass nouns: 3s agreement; how to prevent indef det?
-- gram: N_MS
- classes: [N_SG]
- id:
- agrs: ['[-hum]']
-# Singular count nouns: must have a determiner
-- gram: N_SG_CT
- classes: [N_SG, N_CT]
- id:
- outs: {det: '!'}
- agrs: ['[-p1,-p2, -pl]']
- lp:
- outs: {detf: '!'}
-# N_P -> {N_P_PL, N_P_NM}
-# place names
-- gram: N_P_PLC
- classes: [N_P]
- id:
- agrs: ['[-hum]']
-# human names
-- gram: N_P_HUM
- classes: [N_P]
- id:
- agrs: ['[+hum]']
-### Adverbs
-- gram: ADV
- id:
- ins: {adv: '!'}
- sem:
- ins: {root: '?', mod: '?'}
-# frequency: often, etc.
-- gram: ADV_F
- classes: [ADV]
- lp:
- ins: {fadvf: '!'}
- idlp:
- arg: {fadvf: [adv]} # or ldend??
-# "time": now, etc.
-- gram: ADV_T
- classes: [ADV]
- lp:
- ins: {tadvf: '!'}
- idlp:
- arg: {tadvf: [adv]} # or ldend??
-# place or manner: carefully, etc.
-- gram: ADV_P
- classes: [ADV]
- lp:
- ins: {padvf: '!'}
- idlp:
- arg: {padvf: [adv]} # or ldend??
-### Adjectives
-- gram: ADJ
- id:
- ins: {adj: '!'}
- lp:
- ins: {adjf: '!'}
- sem:
- ins: {mod: '!'}
-- gram: ADJer
- classes: [ADJ]
-- gram: ADJest
- classes: [ADJ]
-- gram: ADJ_PRED
- classes: [ADJ]
-- gram: ADJ_ATR
- classes: [ADJ]
-### Prepositions (include pob2?)
-- gram: PREP
- id:
- ins: {pob1: '?', pob2: '?', pmod: '?'}
- outs: {prpc: '!'}
- lp:
- outs: {prpcf: '?'} # ? because the pcomp may end up elsewhere
- order: [[^, prpcf]]
- idlp:
- # Slot for prep comp
- arg: {prpcf: [prpc]}
- # Let all prepositions be handled by case, etc.
- sem:
- ins: {del: '!'}
-## PREP -> {POBJ, PMOD}
-- gram: POBJ
- classes: [PREP]
- id:
- ins: {pob1: '?', pob2: '?', pmod: 0}
- lp:
- ins: {rbf: '?', vvf: '?', rprof: '?'}
-- gram: PMOD
- classes: [PREP]
- id:
- ins: {pmod: '!', pob1: 0, pob2: 0}
-## PMOD -> {PADJ, PADV}
-- gram: PADJ
- classes: [PMOD]
- lp:
- ins: {padjf: '!'} # ? in Diss: ch. 9
-- gram: PADV
- classes: [PMOD]
- lp:
- ins: {padvf: '?', vvf: '?', rprof: '?'}
-### Determiners
-- gram: DET
- id:
- ins: {det: '!'}
- lp:
- ins: {detf: '!'}
- # This shouldn't really happen to demonstratives and possessives
- sem:
- ins: {del: '!'}
-### Particles
-- gram: PART
- id:
- ins: {part: '!'}
-### Conjunctions
-- gram: CONJ
- id:
- ins: {}
- lp:
- ins: {}
-
-#### Lexemes
-
-#### Words
-
-### Verbs
-
-### Particles
-# infinitive "to"
-- word: to
- classes: [PART]
- lp:
- ins: {vvf: '!'}
- sem:
- ins: {del: '!'}
-### Nouns
-
-# Test novel word categories
-#- word: scrabbits
-# classes: [V]
-# old: ambiguous (no N_ADJ in sublexicons yet)
-- word: old
- classes: [N_ADJ]
-### Pronouns
-### Complementizers
-- word: that
- id:
- ins: {comp: '?'}
- lp:
- ins: {compf: '?'}
- idlp:
- ldend: {compf: [comp]}
- sem:
- ins: {del: '!'}
-### Adverbs
-### Adjectives
-### Determiners
-- word: the
- classes: [DET]
-- word: an
- classes: [DET]
-### Prepositions
-
-#### Groups
-# BREAK the ice
-- - lexeme: ~V~break
- classes: [V_T]
- id:
- groupouts: {ob: ice}
- - word: the
- classes: [DET]
- - word: ice
- classes: [N_MS]
- id:
- groupouts: {det: the}
-# LEAD the way
-- - lexeme: ~V~lead
- classes: [V_T]
- id:
- groupouts: {ob: way}
- - word: the
- classes: [DET]
- - word: way
- classes: [N_SG_CT, ~N~way]
- id:
- groupouts: {det: the}
-# had an argument
-- - word: had
- classes: [V_PS, ~V~have]
- id:
- groupouts: {ob: argument}
- - word: an
- classes: [DET]
- - word: argument
- classes: [N_SG_CT, ~N~argument]
- id:
- groupouts: {det: an}
-
-#### Sublexicons
-- sublexicon: en_nouns
-- sublexicon: en_nouns_prop
-- sublexicon: en_verbs
-- sublexicon: en_adjs
-- sublexicon: en_misc
=======================================
--- /xdg/languages/en_adjs.yaml Mon Apr 12 05:47:13 2010 UTC
+++ /dev/null
@@ -1,17982 +0,0 @@
-- lexeme: ~A~able
- classes: [ADJ]
-- lexeme: ~A~ample
- classes: [ADJ]
-- lexeme: ~A~angry
- classes: [ADJ]
-- lexeme: ~A~bad
- classes: [ADJ]
-- lexeme: ~A~bald
- classes: [ADJ]
-- lexeme: ~A~balmy
- classes: [ADJ]
-- lexeme: ~A~bandy
- classes: [ADJ]
-- lexeme: ~A~base
- classes: [ADJ]
-- lexeme: ~A~bawdy
- classes: [ADJ]
-- lexeme: ~A~beastly
- classes: [ADJ]
-- lexeme: ~A~beefy
- classes: [ADJ]
-- lexeme: ~A~big
- classes: [ADJ]
-- lexeme: ~A~bitchy
- classes: [ADJ]
-- lexeme: ~A~black
- classes: [ADJ]
-- lexeme: ~A~bland
- classes: [ADJ]
-- lexeme: ~A~bleak
- classes: [ADJ]
-- lexeme: ~A~blond
- classes: [ADJ]
-- lexeme: ~A~bloody
- classes: [ADJ]
-- lexeme: ~A~blue
- classes: [ADJ]
-- lexeme: ~A~blunt
- classes: [ADJ]
-- lexeme: ~A~bold
- classes: [ADJ]
-- lexeme: ~A~bonny
- classes: [ADJ]
-- lexeme: ~A~bony
- classes: [ADJ]
-- lexeme: ~A~boozy
- classes: [ADJ]
-- lexeme: ~A~bossy
- classes: [ADJ]
-- lexeme: ~A~bouncy
- classes: [ADJ]
-- lexeme: ~A~brainy
- classes: [ADJ]
-- lexeme: ~A~brash
- classes: [ADJ]
-- lexeme: ~A~brassy
- classes: [ADJ]
-- lexeme: ~A~brave
- classes: [ADJ]
-- lexeme: ~A~brawny
- classes: [ADJ]
-- lexeme: ~A~breezy
- classes: [ADJ]
-- lexeme: ~A~brief
- classes: [ADJ]
-- lexeme: ~A~bright
- classes: [ADJ]
-- lexeme: ~A~brisk
- classes: [ADJ]
-- lexeme: ~A~bristly
- classes: [ADJ]
-- lexeme: ~A~broad
- classes: [ADJ]
-- lexeme: ~A~brown
- classes: [ADJ]
-- lexeme: ~A~bubbly
- classes: [ADJ]
-- lexeme: ~A~bulky
- classes: [ADJ]
-- lexeme: ~A~bumpy
- classes: [ADJ]
-- lexeme: ~A~burly
- classes: [ADJ]
-- lexeme: ~A~busy
- classes: [ADJ]
-- lexeme: ~A~calm
- classes: [ADJ]
-- lexeme: ~A~canny
- classes: [ADJ]
-- lexeme: ~A~catchy
- classes: [ADJ]
-- lexeme: ~A~catty
- classes: [ADJ]
-- lexeme: ~A~chalky
- classes: [ADJ]
-- lexeme: ~A~chatty
- classes: [ADJ]
-- lexeme: ~A~cheap
- classes: [ADJ]
-- lexeme: ~A~cheeky
- classes: [ADJ]
-- lexeme: ~A~chilly
- classes: [ADJ]
-- lexeme: ~A~choosy
- classes: [ADJ]
-- lexeme: ~A~choppy
- classes: [ADJ]
-- lexeme: ~A~chubby
- classes: [ADJ]
-- lexeme: ~A~chunky
- classes: [ADJ]
-- lexeme: ~A~clammy
- classes: [ADJ]
-- lexeme: ~A~classy
- classes: [ADJ]
-- lexeme: ~A~clean
- classes: [ADJ]
-- lexeme: ~A~clear
- classes: [ADJ]
-- lexeme: ~A~clever
- classes: [ADJ]
-- lexeme: ~A~close
- classes: [ADJ]
-- lexeme: ~A~cloudy
- classes: [ADJ]
-- lexeme: ~A~clumsy
- classes: [ADJ]
-- lexeme: ~A~coarse
- classes: [ADJ]
-- lexeme: ~A~cocky
- classes: [ADJ]
-- lexeme: ~A~cold
- classes: [ADJ]
-- lexeme: ~A~comely
- classes: [ADJ]
-- lexeme: ~A~comfy
- classes: [ADJ]
-- lexeme: ~A~common
- classes: [ADJ]
-- lexeme: ~A~cool
- classes: [ADJ]
-- lexeme: ~A~corny
- classes: [ADJ]
-- lexeme: ~A~costly
- classes: [ADJ]
-- lexeme: ~A~cosy
- classes: [ADJ]
-- lexeme: ~A~coy
- classes: [ADJ]
-- lexeme: ~A~cozy
- classes: [ADJ]
-- lexeme: ~A~crafty
- classes: [ADJ]
-- lexeme: ~A~cranky
- classes: [ADJ]
-- lexeme: ~A~crazy
- classes: [ADJ]
-- lexeme: ~A~creaky
- classes: [ADJ]
-- lexeme: ~A~creamy
- classes: [ADJ]
-- lexeme: ~A~creepy
- classes: [ADJ]
-- lexeme: ~A~crinkly
- classes: [ADJ]
-- lexeme: ~A~crisp
- classes: [ADJ]
-- lexeme: ~A~crude
- classes: [ADJ]
-- lexeme: ~A~cruel
- classes: [ADJ]
-- lexeme: ~A~crumbly
- classes: [ADJ]
-- lexeme: ~A~crusty
- classes: [ADJ]
-- lexeme: ~A~cuddly
- classes: [ADJ]
-- lexeme: ~A~curly
- classes: [ADJ]
-- lexeme: ~A~cushy
- classes: [ADJ]
-- lexeme: ~A~cute
- classes: [ADJ]
-- lexeme: ~A~daft
- classes: [ADJ]
-- lexeme: ~A~dainty
- classes: [ADJ]
-- lexeme: ~A~damp
- classes: [ADJ]
-- lexeme: ~A~dank
- classes: [ADJ]
-- lexeme: ~A~dark
- classes: [ADJ]
-- lexeme: ~A~deadly
- classes: [ADJ]
-- lexeme: ~A~deaf
- classes: [ADJ]
-- lexeme: ~A~dear
- classes: [ADJ]
-- lexeme: ~A~deep
- classes: [ADJ]
-- lexeme: ~A~dense
- classes: [ADJ]
-- lexeme: ~A~dewy
- classes: [ADJ]
-- lexeme: ~A~dim
- classes: [ADJ]
-- lexeme: ~A~dingy
- classes: [ADJ]
-- lexeme: ~A~dire
- classes: [ADJ]
-- lexeme: ~A~dirty
- classes: [ADJ]
-- lexeme: ~A~dishy
- classes: [ADJ]
-- lexeme: ~A~dizzy
- classes: [ADJ]
-- lexeme: ~A~dotty
- classes: [ADJ]
-- lexeme: ~A~dowdy
- classes: [ADJ]
-- lexeme: ~A~drab
- classes: [ADJ]
-- lexeme: ~A~draughty
- classes: [ADJ]
-- lexeme: ~A~dreamy
- classes: [ADJ]
-- lexeme: ~A~dreary
- classes: [ADJ]
-- lexeme: ~A~dressy
- classes: [ADJ]
-- lexeme: ~A~drowsy
- classes: [ADJ]
-- lexeme: ~A~drunk
- classes: [ADJ]
-- lexeme: ~A~dry
- classes: [ADJ]
-- lexeme: ~A~dull
- classes: [ADJ]
-- lexeme: ~A~dumb
- classes: [ADJ]
-- lexeme: ~A~dusty
- classes: [ADJ]
-- lexeme: ~A~early
- classes: [ADJ]
-- lexeme: ~A~earthy
- classes: [ADJ]
-- lexeme: ~A~easy
- classes: [ADJ]
-- lexeme: ~A~edgy
- classes: [ADJ]
-- lexeme: ~A~eerie
- classes: [ADJ]
-- lexeme: ~A~empty
- classes: [ADJ]
-- lexeme: ~A~faddy
- classes: [ADJ]
-- lexeme: ~A~faint
- classes: [ADJ]
-- lexeme: ~A~fair
- classes: [ADJ]
-- lexeme: ~A~fancy
- classes: [ADJ]
-- lexeme: ~A~far
- classes: [ADJ]
-- lexeme: ~A~fast
- classes: [ADJ]
-- lexeme: ~A~fat
- classes: [ADJ]
-- lexeme: ~A~fatty
- classes: [ADJ]
-- lexeme: ~A~feeble
- classes: [ADJ]
-- lexeme: ~A~few
- classes: [ADJ]
-- lexeme: ~A~fierce
- classes: [ADJ]
-- lexeme: ~A~filthy
- classes: [ADJ]
-- lexeme: ~A~fine
- classes: [ADJ]
-- lexeme: ~A~firm
- classes: [ADJ]
-- lexeme: ~A~fishy
- classes: [ADJ]
-- lexeme: ~A~fit
- classes: [ADJ]
-- lexeme: ~A~fizzy
- classes: [ADJ]
-- lexeme: ~A~flabby
- classes: [ADJ]
-- lexeme: ~A~flaky
- classes: [ADJ]
-- lexeme: ~A~flashy
- classes: [ADJ]
-- lexeme: ~A~flat
- classes: [ADJ]
-- lexeme: ~A~flimsy
- classes: [ADJ]
-- lexeme: ~A~floppy
- classes: [ADJ]
-- lexeme: ~A~flowery
- classes: [ADJ]
-- lexeme: ~A~fluffy
- classes: [ADJ]
-- lexeme: ~A~foggy
- classes: [ADJ]
-- lexeme: ~A~fond
- classes: [ADJ]
-- lexeme: ~A~foul
- classes: [ADJ]
-- lexeme: ~A~foxy
- classes: [ADJ]
-- lexeme: ~A~frail
- classes: [ADJ]
-- lexeme: ~A~frank
- classes: [ADJ]
-- lexeme: ~A~freaky
- classes: [ADJ]
-- lexeme: ~A~free
- classes: [ADJ]
-- lexeme: ~A~fresh
- classes: [ADJ]
-- lexeme: ~A~friendly
- classes: [ADJ]
-- lexeme: ~A~frisky
- classes: [ADJ]
-- lexeme: ~A~frosty
- classes: [ADJ]
-- lexeme: ~A~frothy
- classes: [ADJ]
-- lexeme: ~A~fruity
- classes: [ADJ]
-- lexeme: ~A~full
- classes: [ADJ]
-- lexeme: ~A~funny
- classes: [ADJ]
-- lexeme: ~A~furry
- classes: [ADJ]
-- lexeme: ~A~fussy
- classes: [ADJ]
-- lexeme: ~A~fuzzy
- classes: [ADJ]
-- lexeme: ~A~gassy
- classes: [ADJ]
-- lexeme: ~A~gaudy
- classes: [ADJ]
-- lexeme: ~A~gawky
- classes: [ADJ]
-- lexeme: ~A~gay
- classes: [ADJ]
-- lexeme: ~A~gentle
- classes: [ADJ]
-- lexeme: ~A~ghastly
- classes: [ADJ]
-- lexeme: ~A~giddy
- classes: [ADJ]
-- lexeme: ~A~gloomy
- classes: [ADJ]
-- lexeme: ~A~glossy
- classes: [ADJ]
-- lexeme: ~A~glum
- classes: [ADJ]
-- lexeme: ~A~good
- classes: [ADJ]
-- lexeme: ~A~gooey
- classes: [ADJ]
-- lexeme: ~A~gory
- classes: [ADJ]
-- lexeme: ~A~grand
- classes: [ADJ]
-- lexeme: ~A~grassy
- classes: [ADJ]
-- lexeme: ~A~grave
- classes: [ADJ]
-- lexeme: ~A~gray
- classes: [ADJ]
-- lexeme: ~A~greasy
- classes: [ADJ]
-- lexeme: ~A~great
- classes: [ADJ]
-- lexeme: ~A~greedy
- classes: [ADJ]
-- lexeme: ~A~green
- classes: [ADJ]
-- lexeme: ~A~grey
- classes: [ADJ]
-- lexeme: ~A~grim
- classes: [ADJ]
-- lexeme: ~A~grimy
- classes: [ADJ]
-- lexeme: ~A~gritty
- classes: [ADJ]
-- lexeme: ~A~groovy
- classes: [ADJ]
-- lexeme: ~A~grotty
- classes: [ADJ]
-- lexeme: ~A~grubby
- classes: [ADJ]
-- lexeme: ~A~gruff
- classes: [ADJ]
-- lexeme: ~A~grumpy
- classes: [ADJ]
-- lexeme: ~A~guilty
- classes: [ADJ]
-- lexeme: ~A~hairy
- classes: [ADJ]
-- lexeme: ~A~handsome
- classes: [ADJ]
-- lexeme: ~A~handy
- classes: [ADJ]
-- lexeme: ~A~happy
- classes: [ADJ]
-- lexeme: ~A~hard
- classes: [ADJ]
-- lexeme: ~A~hardy
- classes: [ADJ]
-- lexeme: ~A~harsh
- classes: [ADJ]
-- lexeme: ~A~haughty
- classes: [ADJ]
-- lexeme: ~A~hazy
- classes: [ADJ]
-- lexeme: ~A~healthy
- classes: [ADJ]
-- lexeme: ~A~hearty
- classes: [ADJ]
-- lexeme: ~A~heavy
- classes: [ADJ]
-- lexeme: ~A~hefty
- classes: [ADJ]
-- lexeme: ~A~high
- classes: [ADJ]
-- lexeme: ~A~hilly
- classes: [ADJ]
-- lexeme: ~A~hoarse
- classes: [ADJ]
-- lexeme: ~A~holy
- classes: [ADJ]
-- lexeme: ~A~homely
- classes: [ADJ]
-- lexeme: ~A~hot
- classes: [ADJ]
-- lexeme: ~A~humble
- classes: [ADJ]
-- lexeme: ~A~hungry
- classes: [ADJ]
-- lexeme: ~A~husky
- classes: [ADJ]
-- lexeme: ~A~icy
- classes: [ADJ]
-- lexeme: ~A~idle
- classes: [ADJ]
-- lexeme: ~A~jolly
- classes: [ADJ]
-- lexeme: ~A~juicy
- classes: [ADJ]
-- lexeme: ~A~keen
- classes: [ADJ]
-- lexeme: ~A~kind
- classes: [ADJ]
-- lexeme: ~A~kindly
- classes: [ADJ]
-- lexeme: ~A~kinky
- classes: [ADJ]
-- lexeme: ~A~knotty
- classes: [ADJ]
-- lexeme: ~A~lame
- classes: [ADJ]
-- lexeme: ~A~lanky
- classes: [ADJ]
-- lexeme: ~A~large
- classes: [ADJ]
-- lexeme: ~A~late
- classes: [ADJ]
-- lexeme: ~A~lazy
- classes: [ADJ]
-- lexeme: ~A~leafy
- classes: [ADJ]
-- lexeme: ~A~leaky
- classes: [ADJ]
-- lexeme: ~A~lean
- classes: [ADJ]
-- lexeme: ~A~lengthy
- classes: [ADJ]
-- lexeme: ~A~lewd
- classes: [ADJ]
-- lexeme: ~A~light
- classes: [ADJ]
-- lexeme: ~A~likely
- classes: [ADJ]
-- lexeme: ~A~little
- classes: [ADJ]
-- lexeme: ~A~lively
- classes: [ADJ]
-- lexeme: ~A~loamy
- classes: [ADJ]
-- lexeme: ~A~lofty
- classes: [ADJ]
-- lexeme: ~A~lonely
- classes: [ADJ]
-- lexeme: ~A~long
- classes: [ADJ]
-- lexeme: ~A~loony
- classes: [ADJ]
-- lexeme: ~A~loose
- classes: [ADJ]
-- lexeme: ~A~loud
- classes: [ADJ]
-- lexeme: ~A~lousy
- classes: [ADJ]
-- lexeme: ~A~lovely
- classes: [ADJ]
-- lexeme: ~A~low
- classes: [ADJ]
-- lexeme: ~A~lowly
- classes: [ADJ]
-- lexeme: ~A~lucky
- classes: [ADJ]
-- lexeme: ~A~lumpy
- classes: [ADJ]
-- lexeme: ~A~lush
- classes: [ADJ]
-- lexeme: ~A~lusty
- classes: [ADJ]
-- lexeme: ~A~mad
- classes: [ADJ]
-- lexeme: ~A~mangy
- classes: [ADJ]
-- lexeme: ~A~manly
- classes: [ADJ]
-- lexeme: ~A~marshy
- classes: [ADJ]
-- lexeme: ~A~mean
- classes: [ADJ]
-- lexeme: ~A~meaty
- classes: [ADJ]
-- lexeme: ~A~meek
- classes: [ADJ]
-- lexeme: ~A~mellow
- classes: [ADJ]
-- lexeme: ~A~merry
- classes: [ADJ]
-- lexeme: ~A~messy
- classes: [ADJ]
-- lexeme: ~A~mighty
- classes: [ADJ]
-- lexeme: ~A~mild
- classes: [ADJ]
-- lexeme: ~A~milky
- classes: [ADJ]
-- lexeme: ~A~minute
- classes: [ADJ]
-- lexeme: ~A~misty
- classes: [ADJ]
-- lexeme: ~A~moldy
- classes: [ADJ]
-- lexeme: ~A~moody
- classes: [ADJ]
-- lexeme: ~A~mossy
- classes: [ADJ]
-- lexeme: ~A~mouldy
- classes: [ADJ]
-- lexeme: ~A~mousy
- classes: [ADJ]
-- lexeme: ~A~much
- classes: [ADJ]
-- lexeme: ~A~mucky
- classes: [ADJ]
-- lexeme: ~A~muddy
- classes: [ADJ]
-- lexeme: ~A~murky
- classes: [ADJ]
-- lexeme: ~A~mushy
- classes: [ADJ]
-- lexeme: ~A~musty
- classes: [ADJ]
-- lexeme: ~A~muzzy
- classes: [ADJ]
-- lexeme: ~A~narrow
- classes: [ADJ]
-- lexeme: ~A~nasty
- classes: [ADJ]
-- lexeme: ~A~natty
- classes: [ADJ]
-- lexeme: ~A~naughty
- classes: [ADJ]
-- lexeme: ~A~near
- classes: [ADJ]
-- lexeme: ~A~neat
- classes: [ADJ]
-- lexeme: ~A~needy
- classes: [ADJ]
-- lexeme: ~A~new
- classes: [ADJ]
-- lexeme: ~A~nice
- classes: [ADJ]
-- lexeme: ~A~nifty
- classes: [ADJ]
-- lexeme: ~A~nimble
- classes: [ADJ]
-- lexeme: ~A~noble
- classes: [ADJ]
-- lexeme: ~A~noisy
- classes: [ADJ]
-- lexeme: ~A~nosy
- classes: [ADJ]
-- lexeme: ~A~nutty
- classes: [ADJ]
-- lexeme: ~A~odd
- classes: [ADJ]
-- lexeme: ~A~oily
- classes: [ADJ]
-- lexeme: ~A~old
- classes: [ADJ]
-- lexeme: ~A~pale
- classes: [ADJ]
-- lexeme: ~A~paltry
- classes: [ADJ]
-- lexeme: ~A~pasty
- classes: [ADJ]
-- lexeme: ~A~patchy
- classes: [ADJ]
-- lexeme: ~A~paunchy
- classes: [ADJ]
-- lexeme: ~A~perky
- classes: [ADJ]
-- lexeme: ~A~pesky
- classes: [ADJ]
-- lexeme: ~A~petty
- classes: [ADJ]
-- lexeme: ~A~pimply
- classes: [ADJ]
-- lexeme: ~A~pink
- classes: [ADJ]
-- lexeme: ~A~plain
- classes: [ADJ]
-- lexeme: ~A~pleasant
- classes: [ADJ]
-- lexeme: ~A~plucky
- classes: [ADJ]
-- lexeme: ~A~plummy
- classes: [ADJ]
-- lexeme: ~A~plump
- classes: [ADJ]
-- lexeme: ~A~plush
- classes: [ADJ]
-- lexeme: ~A~podgy
- classes: [ADJ]
-- lexeme: ~A~poky
- classes: [ADJ]
-- lexeme: ~A~polite
- classes: [ADJ]
-- lexeme: ~A~poor
- classes: [ADJ]
-- lexeme: ~A~posh
- classes: [ADJ]
-- lexeme: ~A~potty
- classes: [ADJ]
-- lexeme: ~A~pretty
- classes: [ADJ]
-- lexeme: ~A~pricey
- classes: [ADJ]
-- lexeme: ~A~prickly
- classes: [ADJ]
-- lexeme: ~A~prim
- classes: [ADJ]
-- lexeme: ~A~princely
- classes: [ADJ]
-- lexeme: ~A~prosy
- classes: [ADJ]
-- lexeme: ~A~proud
- classes: [ADJ]
-- lexeme: ~A~pudgy
- classes: [ADJ]
-- lexeme: ~A~puffy
- classes: [ADJ]
-- lexeme: ~A~puny
- classes: [ADJ]
-- lexeme: ~A~pure
- classes: [ADJ]
-- lexeme: ~A~quaint
- classes: [ADJ]
-- lexeme: ~A~queasy
- classes: [ADJ]
-- lexeme: ~A~queer
- classes: [ADJ]
-- lexeme: ~A~quick
- classes: [ADJ]
-- lexeme: ~A~quiet
- classes: [ADJ]
-- lexeme: ~A~racy
- classes: [ADJ]
-- lexeme: ~A~rainy
- classes: [ADJ]
-- lexeme: ~A~randy
- classes: [ADJ]
-- lexeme: ~A~rare
- classes: [ADJ]
-- lexeme: ~A~rash
- classes: [ADJ]
-- lexeme: ~A~ratty
- classes: [ADJ]
-- lexeme: ~A~ready
- classes: [ADJ]
-- lexeme: ~A~red
- classes: [ADJ]
-- lexeme: ~A~reedy
- classes: [ADJ]
-- lexeme: ~A~remote
- classes: [ADJ]
-- lexeme: ~A~rich
- classes: [ADJ]
-- lexeme: ~A~ripe
- classes: [ADJ]
-- lexeme: ~A~risky
- classes: [ADJ]
-- lexeme: ~A~rocky
- classes: [ADJ]
-- lexeme: ~A~roomy
- classes: [ADJ]
-- lexeme: ~A~ropey
- classes: [ADJ]
-- lexeme: ~A~rosy
- classes: [ADJ]
-- lexeme: ~A~rough
- classes: [ADJ]
-- lexeme: ~A~round
- classes: [ADJ]
-- lexeme: ~A~rowdy
- classes: [ADJ]
-- lexeme: ~A~ruddy
- classes: [ADJ]
-- lexeme: ~A~rude
- classes: [ADJ]
-- lexeme: ~A~runny
- classes: [ADJ]
-- lexeme: ~A~rusty
- classes: [ADJ]
-- lexeme: ~A~sad
- classes: [ADJ]
-- lexeme: ~A~safe
- classes: [ADJ]
-- lexeme: ~A~saintly
- classes: [ADJ]
-- lexeme: ~A~salty
- classes: [ADJ]
-- lexeme: ~A~sandy
- classes: [ADJ]
-- lexeme: ~A~sane
- classes: [ADJ]
-- lexeme: ~A~saucy
- classes: [ADJ]
-- lexeme: ~A~scaly
- classes: [ADJ]
-- lexeme: ~A~scanty
- classes: [ADJ]
-- lexeme: ~A~scarce
- classes: [ADJ]
-- lexeme: ~A~scary
- classes: [ADJ]
-- lexeme: ~A~scatty
- classes: [ADJ]
-- lexeme: ~A~schmaltzy
- classes: [ADJ]
-- lexeme: ~A~scrappy
- classes: [ADJ]
-- lexeme: ~A~scratchy
- classes: [ADJ]
-- lexeme: ~A~scrawny
- classes: [ADJ]
-- lexeme: ~A~screwy
- classes: [ADJ]
-- lexeme: ~A~scruffy
- classes: [ADJ]
-- lexeme: ~A~seamy
- classes: [ADJ]
-- lexeme: ~A~seedy
- classes: [ADJ]
-- lexeme: ~A~severe
- classes: [ADJ]
-- lexeme: ~A~sexy
- classes: [ADJ]
-- lexeme: ~A~shabby
- classes: [ADJ]
-- lexeme: ~A~shady
- classes: [ADJ]
-- lexeme: ~A~shaggy
- classes: [ADJ]
-- lexeme: ~A~shaky
- classes: [ADJ]
-- lexeme: ~A~shallow
- classes: [ADJ]
-- lexeme: ~A~shapely
- classes: [ADJ]
-- lexeme: ~A~sharp
- classes: [ADJ]
-- lexeme: ~A~shifty
- classes: [ADJ]
-- lexeme: ~A~shiny
- classes: [ADJ]
-- lexeme: ~A~shoddy
- classes: [ADJ]
-- lexeme: ~A~short
- classes: [ADJ]
-- lexeme: ~A~shrewd
- classes: [ADJ]
-- lexeme: ~A~shrill
- classes: [ADJ]
-- lexeme: ~A~shy
- classes: [ADJ]
-- lexeme: ~A~sickly
- classes: [ADJ]
-- lexeme: ~A~silky
- classes: [ADJ]
-- lexeme: ~A~silly
- classes: [ADJ]
-- lexeme: ~A~simple
- classes: [ADJ]
-- lexeme: ~A~sketchy
- classes: [ADJ]
-- lexeme: ~A~skimpy
- classes: [ADJ]
-- lexeme: ~A~skinny
- classes: [ADJ]
-- lexeme: ~A~slack
- classes: [ADJ]
-- lexeme: ~A~sleazy
- classes: [ADJ]
-- lexeme: ~A~sleek
- classes: [ADJ]
-- lexeme: ~A~sleepy
- classes: [ADJ]
-- lexeme: ~A~slick
- classes: [ADJ]
-- lexeme: ~A~slight
- classes: [ADJ]
-- lexeme: ~A~slim
- classes: [ADJ]
-- lexeme: ~A~slimy
- classes: [ADJ]
-- lexeme: ~A~slippery
- classes: [ADJ]
-- lexeme: ~A~sloppy
- classes: [ADJ]
-- lexeme: ~A~slovenly
- classes: [ADJ]
-- lexeme: ~A~slow
- classes: [ADJ]
-- lexeme: ~A~slummy
- classes: [ADJ]
-- lexeme: ~A~slushy
- classes: [ADJ]
-- lexeme: ~A~sly
- classes: [ADJ]
-- lexeme: ~A~small
- classes: [ADJ]
-- lexeme: ~A~smart
- classes: [ADJ]
-- lexeme: ~A~smelly
- classes: [ADJ]
-- lexeme: ~A~smoky
- classes: [ADJ]
-- lexeme: ~A~smooth
- classes: [ADJ]
-- lexeme: ~A~smug
- classes: [ADJ]
-- lexeme: ~A~smutty
- classes: [ADJ]
-- lexeme: ~A~snappy
- classes: [ADJ]
-- lexeme: ~A~sneaky
- classes: [ADJ]
-- lexeme: ~A~snooty
- classes: [ADJ]
-- lexeme: ~A~snug
- classes: [ADJ]
-- lexeme: ~A~soft
- classes: [ADJ]
-- lexeme: ~A~soggy
- classes: [ADJ]
-- lexeme: ~A~sooty
- classes: [ADJ]
-- lexeme: ~A~soppy
- classes: [ADJ]
-- lexeme: ~A~sorry
- classes: [ADJ]
-- lexeme: ~A~sparse
- classes: [ADJ]
-- lexeme: ~A~speedy
- classes: [ADJ]
-- lexeme: ~A~spicy
- classes: [ADJ]
-- lexeme: ~A~spiky
- classes: [ADJ]
-- lexeme: ~A~spindly
- classes: [ADJ]
-- lexeme: ~A~spongy
- classes: [ADJ]
-- lexeme: ~A~spooky
- classes: [ADJ]
-- lexeme: ~A~spotty
- classes: [ADJ]
-- lexeme: ~A~sprightly
- classes: [ADJ]
-- lexeme: ~A~springy
- classes: [ADJ]
-- lexeme: ~A~stale
- classes: [ADJ]
-- lexeme: ~A~starry
- classes: [ADJ]
-- lexeme: ~A~stately
- classes: [ADJ]
-- lexeme: ~A~steady
- classes: [ADJ]
-- lexeme: ~A~stealthy
- classes: [ADJ]
-- lexeme: ~A~steamy
- classes: [ADJ]
-- lexeme: ~A~steely
- classes: [ADJ]
-- lexeme: ~A~steep
- classes: [ADJ]
-- lexeme: ~A~stern
- classes: [ADJ]
-- lexeme: ~A~sticky
- classes: [ADJ]
-- lexeme: ~A~stiff
- classes: [ADJ]
-- lexeme: ~A~stingy
- classes: [ADJ]
-- lexeme: ~A~stocky
- classes: [ADJ]
-- lexeme: ~A~stodgy
- classes: [ADJ]
-- lexeme: ~A~stony
- classes: [ADJ]
-- lexeme: ~A~stormy
- classes: [ADJ]
-- lexeme: ~A~stout
- classes: [ADJ]
-- lexeme: ~A~strange
- classes: [ADJ]
-- lexeme: ~A~strict
- classes: [ADJ]
-- lexeme: ~A~stringy
- classes: [ADJ]
-- lexeme: ~A~strong
- classes: [ADJ]
-- lexeme: ~A~stuffy
- classes: [ADJ]
-- lexeme: ~A~sturdy
- classes: [ADJ]
-- lexeme: ~A~subtle
- classes: [ADJ]
-- lexeme: ~A~sugary
- classes: [ADJ]
-- lexeme: ~A~sulky
- classes: [ADJ]
-- lexeme: ~A~sultry
- classes: [ADJ]
-- lexeme: ~A~sunny
- classes: [ADJ]
-- lexeme: ~A~sure
- classes: [ADJ]
-- lexeme: ~A~surly
- classes: [ADJ]
-- lexeme: ~A~swanky
- classes: [ADJ]
-- lexeme: ~A~sweet
- classes: [ADJ]
-- lexeme: ~A~swift
- classes: [ADJ]
-- lexeme: ~A~tacky
- classes: [ADJ]
-- lexeme: ~A~tall
- classes: [ADJ]
-- lexeme: ~A~tame
- classes: [ADJ]
-- lexeme: ~A~tangy
- classes: [ADJ]
-- lexeme: ~A~tasty
- classes: [ADJ]
-- lexeme: ~A~tatty
- classes: [ADJ]
-- lexeme: ~A~teeny
- classes: [ADJ]
-- lexeme: ~A~tender
- classes: [ADJ]
-- lexeme: ~A~tense
- classes: [ADJ]
-- lexeme: ~A~tetchy
- classes: [ADJ]
-- lexeme: ~A~thick
- classes: [ADJ]
-- lexeme: ~A~thin
- classes: [ADJ]
***The diff for this file has been truncated for email.***
=======================================
--- /xdg/languages/en_am.py Sun May 2 00:45:20 2010 UTC
+++ /dev/null
@@ -1,19 +0,0 @@
-# English - Amharic
-#
-# 2009.11.15
-#
-# 2010.01.02
-# Modified to suit new constructor for Multiling (English, Amharic replaced
-# by 'en','am')
-#
-# 2010.03.27
-# Added arc labels for cross-dimension Semantics
-#
-# 2010.04.29
-# Working with simplified English and Amharic (en2, am2)
-
-from .language import *
-
-EN_AM = Multiling('EnglishAmharic', 'en_am', langs=['en2', 'am2'],
- interlingua='sem',
- labels={'sem': ['arg1', 'arg2', 'arg3', 'del', 'rel', 'vmod', 'nmod', 'loc', 'coref', None]})
=======================================
--- /xdg/languages/en_misc.yaml Sun May 2 00:45:20 2010 UTC
+++ /dev/null
@@ -1,6299 +0,0 @@
-# 2010.04.30
-# commented out "that" as ADV
-# (need DEGREE ADVERB category)
-- word: à_la_carte
- classes: [ADV]
-- word: à_la_mode
- classes: [ADV]
-- word: aback
- classes: [ADV]
-- word: abed
- classes: [ADV]
-- word: abjectly
- classes: [ADV]
-- word: ablaze
- classes: [ADV]
-- word: ably
- classes: [ADV]
-- word: abnormally
- classes: [ADV]
-- word: aboard
- classes: [ADV]
-- word: abominably
- classes: [ADV]
-- word: abortively
- classes: [ADV]
-- word: about
- classes: [ADV]
-- word: above
- classes: [ADV]
-- word: above_board
- classes: [ADV]
-- word: abreast
- classes: [ADV]
-- word: abroad
- classes: [ADV]
-- word: abruptly
- classes: [ADV]
-- word: absent-mindedly
- classes: [ADV]
-- word: absently
- classes: [ADV]
-- word: absolutely
- classes: [ADV]
-- word: abstemiously
- classes: [ADV]
-- word: abstractedly
- classes: [ADV]
-- word: abstrusely
- classes: [ADV]
-- word: absurdly
- classes: [ADV]
-- word: abundantly
- classes: [ADV]
-- word: abusively
- classes: [ADV]
-- word: abysmally
- classes: [ADV]
-- word: academically
- classes: [ADV]
-- word: acceptably
- classes: [ADV]
-- word: accidentally
- classes: [ADV]
-- word: accordingly
- classes: [ADV]
-- word: accurately
- classes: [ADV]
-- word: accusingly
- classes: [ADV]
-- word: across
- classes: [ADV]
-- word: actively
- classes: [ADV]
-- word: actually
- classes: [ADV]
-- word: acutely
- classes: [ADV]
-- word: ad_hoc
- classes: [ADV]
-- word: ad_infinitum
- classes: [ADV]
-- word: ad_lib
- classes: [ADV]
-- word: ad_nauseam
- classes: [ADV]
-- word: adagio
- classes: [ADV]
-- word: additionally
- classes: [ADV]
-- word: adequately
- classes: [ADV]
-- word: administratively
- classes: [ADV]
-- word: admirably
- classes: [ADV]
-- word: admiringly
- classes: [ADV]
-- word: admittedly
- classes: [ADV]
-- word: adorably
- classes: [ADV]
-- word: adoringly
- classes: [ADV]
-- word: adrift
- classes: [ADV]
-- word: adroitly
- classes: [ADV]
-- word: advantageously
- classes: [ADV]
-- word: adverbially
- classes: [ADV]
-- word: adversely
- classes: [ADV]
-- word: advisedly
- classes: [ADV]
-- word: aesthetically
- classes: [ADV]
-- word: afar
- classes: [ADV]
-- word: affably
- classes: [ADV]
-- word: affectingly
- classes: [ADV]
-- word: affectionately
- classes: [ADV]
-- word: afield
- classes: [ADV]
-- word: afore
- classes: [ADV]
-- word: aforethought
- classes: [ADV]
-- word: afresh
- classes: [ADV]
-- word: aft
- classes: [ADV]
-- word: after
- classes: [ADV]
-- word: afterwards
- classes: [ADV]
-- word: again
- classes: [ADV]
-- word: aggressively
- classes: [ADV]
-- word: agilely
- classes: [ADV]
-- word: ago
- classes: [ADV]
-- word: agonizingly
- classes: [ADV]
-- word: agreeably
- classes: [ADV]
-- word: aground
- classes: [ADV]
-- word: ahead
- classes: [ADV]
-- word: aimlessly
- classes: [ADV]
-- word: airily
- classes: [ADV]
-- word: akimbo
- classes: [ADV]
-- word: alarmingly
- classes: [ADV]
-- word: alertly
- classes: [ADV]
-- word: alfresco
- classes: [ADV]
-- word: algebraically
- classes: [ADV]
-- word: alias
- classes: [ADV]
-- word: alike
- classes: [ADV]
-- word: all
- classes: [ADV]
-- word: allegedly
- classes: [ADV]
-- word: allegretto
- classes: [ADV]
-- word: allegro
- classes: [ADV]
-- word: alliteratively
- classes: [ADV]
-- word: almost
- classes: [ADV]
-- word: aloft
- classes: [ADV]
-- word: alone
- classes: [ADV]
-- word: along
- classes: [ADV]
-- word: alongside
- classes: [ADV]
-- word: aloof
- classes: [ADV]
-- word: aloud
- classes: [ADV]
-- word: alphabetically
- classes: [ADV]
-- word: already
- classes: [ADV]
-- word: alright
- classes: [ADV]
-- word: also
- classes: [ADV]
-- word: alternately
- classes: [ADV]
-- word: alternatively
- classes: [ADV]
-- word: altogether
- classes: [ADV]
-- word: altruistically
- classes: [ADV]
-- word: always
- classes: [ADV]
-- word: amazingly
- classes: [ADV]
-- word: ambiguously
- classes: [ADV]
-- word: ambitiously
- classes: [ADV]
-- word: amiably
- classes: [ADV]
-- word: amicably
- classes: [ADV]
-- word: amidships
- classes: [ADV]
-- word: amiss
- classes: [ADV]
-- word: amok
- classes: [ADV]
-- word: amorously
- classes: [ADV]
-- word: amply
- classes: [ADV]
-- word: amuck
- classes: [ADV]
-- word: amusingly
- classes: [ADV]
-- word: analogously
- classes: [ADV]
-- word: analytically
- classes: [ADV]
-- word: anarchically
- classes: [ADV]
-- word: anatomically
- classes: [ADV]
-- word: andante
- classes: [ADV]
-- word: anew
- classes: [ADV]
-- word: angelically
- classes: [ADV]
-- word: angrily
- classes: [ADV]
-- word: annually
- classes: [ADV]
-- word: anomalously
- classes: [ADV]
-- word: anon
- classes: [ADV]
-- word: anonymously
- classes: [ADV]
-- word: antagonistically
- classes: [ADV]
-- word: ante_meridiem
- classes: [ADV]
-- word: anticlockwise
- classes: [ADV]
-- word: anxiously
- classes: [ADV]
-- word: any
- classes: [ADV]
-- word: anyhow
- classes: [ADV]
-- word: anyplace
- classes: [ADV]
-- word: anyway
- classes: [ADV]
-- word: anywhere
- classes: [ADV]
-- word: apace
- classes: [ADV]
-- word: apart
- classes: [ADV]
-- word: apathetically
- classes: [ADV]
-- word: apiece
- classes: [ADV]
-- word: apologetically
- classes: [ADV]
-- word: appallingly
- classes: [ADV]
-- word: apparently
- classes: [ADV]
-- word: appealingly
- classes: [ADV]
-- word: appositely
- classes: [ADV]
-- word: appreciably
- classes: [ADV]
-- word: appreciatively
- classes: [ADV]
-- word: appropriately
- classes: [ADV]
-- word: approvingly
- classes: [ADV]
-- word: approximately
- classes: [ADV]
-- word: apropos
- classes: [ADV]
-- word: aptly
- classes: [ADV]
-- word: arbitrarily
- classes: [ADV]
-- word: architecturally
- classes: [ADV]
-- word: archly
- classes: [ADV]
-- word: ardently
- classes: [ADV]
-- word: arduously
- classes: [ADV]
-- word: arguably
- classes: [ADV]
-- word: aright
- classes: [ADV]
-- word: aristocratically
- classes: [ADV]
-- word: arithmetically
- classes: [ADV]
-- word: around
- classes: [ADV]
-- word: arrogantly
- classes: [ADV]
-- word: artfully
- classes: [ADV]
-- word: articulately
- classes: [ADV]
-- word: artificially
- classes: [ADV]
-- word: artistically
- classes: [ADV]
-- word: artlessly
- classes: [ADV]
-- word: ascetically
- classes: [ADV]
-- word: ashamedly
- classes: [ADV]
-- word: ashore
- classes: [ADV]
-- word: aside
- classes: [ADV]
-- word: askance
- classes: [ADV]
-- word: askew
- classes: [ADV]
-- word: aslant
- classes: [ADV]
-- word: asleep
- classes: [ADV]
-- word: assertively
- classes: [ADV]
-- word: assiduously
- classes: [ADV]
-- word: assuredly
- classes: [ADV]
-- word: astern
- classes: [ADV]
-- word: astonishingly
- classes: [ADV]
-- word: astray
- classes: [ADV]
-- word: astride
- classes: [ADV]
-- word: astronomically
- classes: [ADV]
-- word: astutely
- classes: [ADV]
-- word: asunder
- classes: [ADV]
-- word: asymmetrically
- classes: [ADV]
-- word: asymptotically
- classes: [ADV]
-- word: athwart
- classes: [ADV]
-- word: atop
- classes: [ADV]
-- word: atrociously
- classes: [ADV]
-- word: attentively
- classes: [ADV]
-- word: attractively
- classes: [ADV]
-- word: attributively
- classes: [ADV]
-- word: atypically
- classes: [ADV]
-- word: audaciously
- classes: [ADV]
-- word: audibly
- classes: [ADV]
-- word: auspiciously
- classes: [ADV]
-- word: austerely
- classes: [ADV]
-- word: authentically
- classes: [ADV]
-- word: authoritatively
- classes: [ADV]
-- word: autocratically
- classes: [ADV]
-- word: automatically
- classes: [ADV]
-- word: avariciously
- classes: [ADV]
-- word: avidly
- classes: [ADV]
-- word: avowedly
- classes: [ADV]
-- word: away
- classes: [ADV]
-- word: awfully
- classes: [ADV]
-- word: awhile
- classes: [ADV]
-- word: awkwardly
- classes: [ADV]
-- word: awry
- classes: [ADV]
-- word: axiomatically
- classes: [ADV]
-- word: back
- classes: [ADV]
-- word: backstage
- classes: [ADV]
-- word: backward
- classes: [ADV]
-- word: backwards
- classes: [ADV]
-- word: badly
- classes: [ADV]
-- word: baldly
- classes: [ADV]
-- word: balefully
- classes: [ADV]
-- word: bang
- classes: [ADV]
-- word: banteringly
- classes: [ADV]
-- word: barbarously
- classes: [ADV]
-- word: bareback
- classes: [ADV]
-- word: barefacedly
- classes: [ADV]
-- word: barefoot
- classes: [ADV]
-- word: barefooted
- classes: [ADV]
-- word: barely
- classes: [ADV]
-- word: bashfully
- classes: [ADV]
-- word: basically
- classes: [ADV]
-- word: bawdily
- classes: [ADV]
-- word: beastly
- classes: [ADV]
-- word: beautifully
- classes: [ADV]
-- word: becomingly
- classes: [ADV]
-- word: befittingly
- classes: [ADV]
-- word: before
- classes: [ADV]
-- word: beforehand
- classes: [ADV]
-- word: behind
- classes: [ADV]
-- word: belatedly
- classes: [ADV]
-- word: belike
- classes: [ADV]
-- word: belligerently
- classes: [ADV]
-- word: below
- classes: [ADV]
-- word: beneath
- classes: [ADV]
-- word: beneficially
- classes: [ADV]
-- word: benevolently
- classes: [ADV]
-- word: benignly
- classes: [ADV]
-- word: beseechingly
- classes: [ADV]
-- word: besides
- classes: [ADV]
-- word: best
- classes: [ADV]
-- word: bestially
- classes: [ADV]
-- word: betimes
- classes: [ADV]
-- word: better
- classes: [ADV]
-- word: between
- classes: [ADV]
-- word: betwixt
- classes: [ADV]
-- word: bewitchingly
- classes: [ADV]
-- word: beyond
- classes: [ADV]
-- word: biennially
- classes: [ADV]
-- word: bilaterally
- classes: [ADV]
-- word: biologically
- classes: [ADV]
-- word: bitingly
- classes: [ADV]
-- word: bitterly
- classes: [ADV]
-- word: blamelessly
- classes: [ADV]
-- word: blandly
- classes: [ADV]
-- word: blankly
- classes: [ADV]
-- word: blasphemously
- classes: [ADV]
-- word: blatantly
- classes: [ADV]
-- word: bleakly
- classes: [ADV]
-- word: blindly
- classes: [ADV]
-- word: blissfully
- classes: [ADV]
-- word: blithely
- classes: [ADV]
-- word: bloodlessly
- classes: [ADV]
-- word: bloody
- classes: [ADV]
-- word: bluffly
- classes: [ADV]
-- word: bluntly
- classes: [ADV]
-- word: blushingly
- classes: [ADV]
-- word: boastfully
- classes: [ADV]
-- word: bodily
- classes: [ADV]
-- word: boisterously
- classes: [ADV]
-- word: boldly
- classes: [ADV]
-- word: bolt
- classes: [ADV]
-- word: bombastically
- classes: [ADV]
-- word: bonnily
- classes: [ADV]
-- word: boorishly
- classes: [ADV]
-- word: both
- classes: [ADV]
-- word: boundlessly
- classes: [ADV]
-- word: bounteously
- classes: [ADV]
-- word: bountifully
- classes: [ADV]
-- word: boyishly
- classes: [ADV]
-- word: bravely
- classes: [ADV]
-- word: breadthways
- classes: [ADV]
-- word: breadthwise
- classes: [ADV]
-- word: breast-deep
- classes: [ADV]
-- word: breast-high
- classes: [ADV]
-- word: breathlessly
- classes: [ADV]
-- word: breezily
- classes: [ADV]
-- word: briefly
- classes: [ADV]
-- word: bright
- classes: [ADV]
-- word: brightly
- classes: [ADV]
-- word: brilliantly
- classes: [ADV]
-- word: briskly
- classes: [ADV]
-- word: broadcast
- classes: [ADV]
-- word: broadly
- classes: [ADV]
-- word: brusquely
- classes: [ADV]
-- word: brutally
- classes: [ADV]
-- word: brutishly
- classes: [ADV]
-- word: bump
- classes: [ADV]
-- word: bumptiously
- classes: [ADV]
-- word: buoyantly
- classes: [ADV]
-- word: bureaucratically
- classes: [ADV]
-- word: busily
- classes: [ADV]
-- word: but
- classes: [ADV]
-- word: by
- classes: [ADV]
-- word: cagily
- classes: [ADV]
-- word: calmly
- classes: [ADV]
-- word: candidly
- classes: [ADV]
-- word: cannily
- classes: [ADV]
-- word: cantankerously
- classes: [ADV]
-- word: capably
- classes: [ADV]
-- word: capriciously
- classes: [ADV]
-- word: carefully
- classes: [ADV]
-- word: carelessly
- classes: [ADV]
-- word: caressingly
- classes: [ADV]
-- word: carnally
- classes: [ADV]
-- word: casually
- classes: [ADV]
-- word: catastrophically
- classes: [ADV]
-- word: categorically
- classes: [ADV]
-- word: caustically
- classes: [ADV]
-- word: cautiously
- classes: [ADV]
-- word: ceaselessly
- classes: [ADV]
-- word: centrally
- classes: [ADV]
-- word: ceremonially
- classes: [ADV]
-- word: ceremoniously
- classes: [ADV]
-- word: certainly
- classes: [ADV]
-- word: champion
- classes: [ADV]
-- word: chaotically
- classes: [ADV]
-- word: characteristically
- classes: [ADV]
-- word: charily
- classes: [ADV]
-- word: charitably
- classes: [ADV]
-- word: charmingly
- classes: [ADV]
-- word: chastely
- classes: [ADV]
-- word: chattily
- classes: [ADV]
-- word: cheaply
- classes: [ADV]
-- word: cheekily
- classes: [ADV]
-- word: cheerfully
- classes: [ADV]
-- word: cheerily
- classes: [ADV]
-- word: cheerlessly
- classes: [ADV]
-- word: chemically
- classes: [ADV]
-- word: chiefly
- classes: [ADV]
-- word: childishly
- classes: [ADV]
-- word: chirpily
- classes: [ADV]
-- word: chivalrously
- classes: [ADV]
-- word: chock-a-block
- classes: [ADV]
-- word: chop-chop
- classes: [ADV]
-- word: chronically
- classes: [ADV]
-- word: chronologically
- classes: [ADV]
-- word: churlishly
- classes: [ADV]
-- word: circumspectly
- classes: [ADV]
-- word: circumstantially
- classes: [ADV]
-- word: civilly
- classes: [ADV]
-- word: clammily
- classes: [ADV]
-- word: clannishly
- classes: [ADV]
-- word: classically
- classes: [ADV]
-- word: clean
- classes: [ADV]
-- word: cleanly
- classes: [ADV]
-- word: clear
- classes: [ADV]
-- word: clear-cut
- classes: [ADV]
-- word: clearly
- classes: [ADV]
-- word: cleverly
- classes: [ADV]
-- word: climatically
- classes: [ADV]
-- word: clinically
- classes: [ADV]
-- word: clockwise
- classes: [ADV]
-- word: close
- classes: [ADV]
-- word: closely
- classes: [ADV]
-- word: clumsily
- classes: [ADV]
-- word: coarsely
- classes: [ADV]
-- word: coaxingly
- classes: [ADV]
-- word: cock-a-hoop
- classes: [ADV]
-- word: coherently
- classes: [ADV]
-- word: coldly
- classes: [ADV]
-- word: collect
- classes: [ADV]
-- word: collectedly
- classes: [ADV]
-- word: colloquially
- classes: [ADV]
-- word: comfortably
- classes: [ADV]
-- word: comfortingly
- classes: [ADV]
-- word: comically
- classes: [ADV]
-- word: commercially
- classes: [ADV]
-- word: commonly
- classes: [ADV]
-- word: communally
- classes: [ADV]
-- word: compactly
- classes: [ADV]
-- word: comparatively
- classes: [ADV]
-- word: compassionately
- classes: [ADV]
-- word: compatibly
- classes: [ADV]
-- word: competently
- classes: [ADV]
-- word: complacently
- classes: [ADV]
-- word: complainingly
- classes: [ADV]
-- word: completely
- classes: [ADV]
-- word: composedly
- classes: [ADV]
-- word: comprehensively
- classes: [ADV]
-- word: compulsively
- classes: [ADV]
-- word: compulsorily
- classes: [ADV]
-- word: computationally
- classes: [ADV]
-- word: comradely
- classes: [ADV]
-- word: con
- classes: [ADV]
-- word: conceitedly
- classes: [ADV]
-- word: conceivably
- classes: [ADV]
-- word: conceptually
- classes: [ADV]
-- word: concernedly
- classes: [ADV]
-- word: concisely
- classes: [ADV]
-- word: conclusively
- classes: [ADV]
-- word: concretely
- classes: [ADV]
-- word: concurrently
- classes: [ADV]
-- word: condescendingly
- classes: [ADV]
-- word: conditionally
- classes: [ADV]
-- word: confessedly
- classes: [ADV]
-- word: confidentially
- classes: [ADV]
-- word: confidently
- classes: [ADV]
-- word: confidingly
- classes: [ADV]
-- word: confusedly
- classes: [ADV]
-- word: congenially
- classes: [ADV]
-- word: conjointly
- classes: [ADV]
-- word: conjugally
- classes: [ADV]
-- word: conscientiously
- classes: [ADV]
-- word: consciously
- classes: [ADV]
-- word: consecutively
- classes: [ADV]
-- word: consequentially
- classes: [ADV]
-- word: consequently
- classes: [ADV]
-- word: conservatively
- classes: [ADV]
-- word: considerably
- classes: [ADV]
-- word: considerately
- classes: [ADV]
-- word: consistently
- classes: [ADV]
-- word: conspicuously
- classes: [ADV]
-- word: constantly
- classes: [ADV]
-- word: constitutionally
- classes: [ADV]
-- word: constrainedly
- classes: [ADV]
-- word: constructively
- classes: [ADV]
-- word: contagiously
- classes: [ADV]
-- word: contemporaneously
- classes: [ADV]
-- word: contemptuously
- classes: [ADV]
-- word: contentedly
- classes: [ADV]
-- word: contiguously
- classes: [ADV]
-- word: continually
- classes: [ADV]
-- word: continuously
- classes: [ADV]
-- word: contrarily
- classes: [ADV]
-- word: contrariwise
- classes: [ADV]
-- word: contrastingly
- classes: [ADV]
-- word: contritely
- classes: [ADV]
-- word: controversially
- classes: [ADV]
-- word: conveniently
- classes: [ADV]
-- word: conventionally
- classes: [ADV]
-- word: conversationally
- classes: [ADV]
-- word: conversely
- classes: [ADV]
-- word: convincingly
- classes: [ADV]
-- word: convivially
- classes: [ADV]
-- word: convulsively
- classes: [ADV]
-- word: coolly
- classes: [ADV]
-- word: copiously
- classes: [ADV]
-- word: coquettishly
- classes: [ADV]
-- word: cordially
- classes: [ADV]
-- word: correctly
- classes: [ADV]
-- word: correspondingly
- classes: [ADV]
-- word: corruptly
- classes: [ADV]
-- word: cosily
- classes: [ADV]
-- word: counter
- classes: [ADV]
-- word: courageously
- classes: [ADV]
-- word: courteously
- classes: [ADV]
-- word: covertly
- classes: [ADV]
-- word: covetously
- classes: [ADV]
-- word: coyly
- classes: [ADV]
-- word: craftily
- classes: [ADV]
-- word: crash
- classes: [ADV]
-- word: crazily
- classes: [ADV]
-- word: creakily
- classes: [ADV]
-- word: creatively
- classes: [ADV]
-- word: credibly
- classes: [ADV]
-- word: creditably
- classes: [ADV]
-- word: credulously
- classes: [ADV]
-- word: criminally
- classes: [ADV]
-- word: crisply
- classes: [ADV]
-- word: crisscross
- classes: [ADV]
-- word: critically
- classes: [ADV]
-- word: crookedly
- classes: [ADV]
-- word: cross-legged
- classes: [ADV]
-- word: crosscountry
- classes: [ADV]
-- word: crossly
- classes: [ADV]
-- word: crosswise
- classes: [ADV]
-- word: crucially
- classes: [ADV]
-- word: crudely
- classes: [ADV]
-- word: cruelly
- classes: [ADV]
-- word: crushingly
- classes: [ADV]
-- word: cryptically
- classes: [ADV]
-- word: culpably
- classes: [ADV]
-- word: cumulatively
- classes: [ADV]
-- word: cunningly
- classes: [ADV]
-- word: curiously
- classes: [ADV]
-- word: currently
- classes: [ADV]
-- word: cursedly
***The diff for this file has been truncated for email.***
=======================================
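The deleted lexicon files above use a simple YAML shape: each entry is a block-sequence item carrying a `word` (or, in the adjective file, a `lexeme` such as `~A~firm`) plus a flow-sequence of class tags like `[ADV]`. (The doubled dashes in the diff are the unified-diff deletion marker followed by the YAML list dash.) As a minimal sketch — this hand-rolled parser is an illustration, not the project's actual loader — entries of that shape can be grouped by class:

```python
# Hypothetical sketch: group lexicon entries of the form shown above by class.
# Assumes the two-line entry shape "- word: X" / "  classes: [TAG]".
sample = """\
- word: aback
  classes: [ADV]
- word: abed
  classes: [ADV]
- lexeme: ~A~firm
  classes: [ADJ]
"""

by_class = {}
item = None
for line in sample.splitlines():
    line = line.strip()
    if line.startswith('- '):
        # New entry: "- word: aback" or "- lexeme: ~A~firm"
        _, value = line[2:].split(': ', 1)
        item = value
    elif line.startswith('classes:'):
        # "classes: [ADV]" -> ['ADV']
        tags = line.split('[', 1)[1].rstrip(']').split(', ')
        for tag in tags:
            by_class.setdefault(tag, []).append(item)

print(by_class)
```

A real loader would more likely call `yaml.safe_load` on the whole file, which yields the same list-of-dicts structure directly.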
***Additional files exist in this changeset.***
==============================================================================
Revision: 5902fb7637ec
Branch: default
Author: Michael Gasser <gas...@cs.indiana.edu>
Date: Sun May 11 05:20:29 2014 UTC
Log: LGLP paper first draft mostly done.
http://code.google.com/p/hltdi-l3/source/detail?r=5902fb7637ec
Modified:
/hiiktuu.py
/hiiktuu/sentence.py
/paperdrafts/lglp/lglp14.pdf
/paperdrafts/lglp/lglp14.tex
=======================================
--- /hiiktuu.py Fri May 9 22:41:32 2014 UTC
+++ /hiiktuu.py Sun May 11 05:20:29 2014 UTC
@@ -45,10 +45,10 @@
verbosity=verbosity)
# s.do(verbosity=verbosity)
s.initialize()
- s.solve()
- sol = s.solutions[0]
- sol.translate()
- return sol
+# s.solve()
+# sol = s.solutions[0]
+# sol.translate()
+ return s
def end_of_world(verbosity=0):
"""
=======================================
--- /hiiktuu/sentence.py Fri May 9 22:41:32 2014 UTC
+++ /hiiktuu/sentence.py Sun May 11 05:20:29 2014 UTC
@@ -149,7 +149,8 @@
"""Run constraints and create a single solution."""
if verbosity:
print("Attempting to find solutions for {}".format(self))
- if self.run(verbosity=verbosity):
+ self.run(verbosity=verbosity)
+ if self.solver.status == Solver.succeeded:
self.create_solution(verbosity=verbosity)
if verbosity:
print("Found solution {}".format(self.solutions[0]))
@@ -460,6 +461,14 @@
self.solver.run(verbosity=verbosity)
if verbosity:
print("Solver status after run: {}".format(self.solver.status))
+# if self.solver.status == Solver.succeeded:
+# # All essential variables are determined; one solution found
+# return True
+# else:
+# # Try to see if the really essential variables are in fact determined
+# groups = self.variables['groups'].get_value(dstore=dstore)
+# ginsts = [self.groups[g] for g in groups]
+# trees = [list(g.variables['tree'].get_value(dstore=dstore)) for g in ginsts]
return self.solver.status
def create_solution(self, dstore=None, verbosity=0):
@@ -702,7 +711,7 @@
self.variables['cgnodes'] = self.variables['gnodes']
# SNode positions of GNodes for this GInst
self.svar('gnodes_pos', 'g{}->gnodes_pos'.format(self.index),
- set(), set(cand_snodes), self.ngnodes, self.ngnodes, ess=True)
+ set(), set(cand_snodes), self.ngnodes, self.ngnodes, ess=False)
# set(), set(range(nsnodes)), self.ngnodes, self.ngnodes)
# SNode positions of abstract GNodes for this GInst
if self.nanodes == 0:
@@ -741,7 +750,7 @@
else:
self.svar('tree', 'g{}->tree'.format(self.index),
# at least as long as the number of self's nodes
- set(), set(cand_snodes), self.ngnodes, len(cand_snodes), ess=True)
+ set(), set(cand_snodes), self.ngnodes, len(cand_snodes), ess=False)
# set(), set(range(nsnodes)), self.ngnodes, nsnodes, ess=True)
# Determined variable for within-source agreement constraints, gen: 0}
agr = self.get_agr()
=======================================
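The /hiiktuu/sentence.py hunks above replace a boolean return from `run()` with an explicit comparison of the solver's status attribute against the class constant `Solver.succeeded`. A minimal sketch of that control-flow pattern; `Solver` here is a toy stand-in with the same shape (a `status` attribute and a `succeeded` constant), not the real Hiiktuu solver:

```python
# Toy illustration of the status-based control flow introduced in this
# commit: run() propagates self.solver.status instead of True/False,
# and solve() tests the status explicitly before creating a solution.

class Solver:
    succeeded = "succeeded"
    failed = "failed"

    def __init__(self, satisfiable):
        self.satisfiable = satisfiable
        self.status = None

    def run(self):
        # stand-in for constraint propagation/search
        self.status = Solver.succeeded if self.satisfiable else Solver.failed

class Sentence:
    def __init__(self, satisfiable):
        self.solver = Solver(satisfiable)
        self.solutions = []

    def run(self):
        self.solver.run()
        return self.solver.status      # status object, not a boolean

    def solve(self):
        self.run()
        # the new pattern from the diff: compare against Solver.succeeded
        if self.solver.status == Solver.succeeded:
            self.solutions.append("solution")
```

Exposing the status rather than collapsing it to a boolean lets callers distinguish outcomes beyond success/failure, which the newly commented-out block (checking "really essential" variables) hints is the motivation.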
--- /paperdrafts/lglp/lglp14.pdf Sun May 11 00:25:21 2014 UTC
+++ /paperdrafts/lglp/lglp14.pdf Sun May 11 05:20:29 2014 UTC
Binary file, no diff available.
=======================================
--- /paperdrafts/lglp/lglp14.tex Sun May 11 00:25:21 2014 UTC
+++ /paperdrafts/lglp/lglp14.tex Sun May 11 05:20:29 2014 UTC
@@ -45,7 +45,6 @@
% smaller than 5cm (the original size); we will check this
% in the camera-ready version and ask you to change it back.
-
\title{Minimal Dependency Grammar for Machine Translation}
\author{
@@ -87,22 +86,6 @@
Both analysis and realization are implemented through constraint
satisfaction.
\end{abstract}
-
-%\section{Credits}
-
-%This document has been adapted from the instructions for the ACL-2014
-%proceedings compiled by Alexander Koller and Yusuke Miyao,
-%which are, in turn, based on the instructions for earlier ACL proceedings,
-%including those for ACL-2012 by Maggie Li and Michael
-%White, those from ACL-2010 by Jing-Shing Chang and Philipp Koehn,
-%those for ACL-2008 by Johanna D. Moore, Simone Teufel, James Allan,
-%and Sadaoki Furui, those for ACL-2005 by Hwee Tou Ng and Kemal
-%Oflazer, those for ACL-2002 by Eugene Charniak and Dekang Lin, and
-%earlier ACL and EACL formats. Those versions were written by several
-%people, including John Chen, Henry S. Thompson and Donald
-%Walker. Additional elements were taken from the formatting
-%instructions of the {\em International Joint Conference on Artificial
-% Intelligence}.
\section{Introduction}
\label{intro}
@@ -170,8 +153,10 @@
to the extent these are available.
Our long-term goals are most similar to those of the Apertium
\cite{apertium} project.
-In this paper we describe the initial steps in developing Hiiktuu, a lexical-grammatical
-framework for MT and CAT.
+In this paper we describe the initial steps in developing Hiiktuu,\footnote{\textit{Hiiktuu} is the Oromo
+word for a (female) translator.} a lexical-grammatical framework for MT and CAT.
+Although our focus is on the language pairs Spanish-Guarani and Amharic-Oromo, we illustrate
+Hiiktuu with examples from English-Spanish in this paper.
\section{Lexica and grammars}
@@ -388,7 +373,7 @@
For example, to parse the sentence \textit{I gave the mayor a piece of my mind} requires
that positions 2 and 6 in the group \textit{give\_v\_$sbd\_a\_piece\_of\_$sbds\_mind} be
filled by the heads of the groups \textit{the\_mayor} and \textit{my}.
-This merging process is illustrated in Figure~\ref{fig:mind}.
+This \textbf{node merging} process is illustrated in Figure~\ref{fig:mind}.
\begin{figure}[ht]
\begin{center}
@@ -410,19 +395,53 @@
\section{Constraint satisfaction and translation}
\label{sect:cs}
-Translation in Hiiktuu takes place in two phases.
-First the sentence is parsed.
-A parse consists of an assignment of groups from the lexicon to words in the sentence
-in such a way that several constraints are satisfied.
+Translation in Hiiktuu takes place in three phases: analysis, transfer, and realization.
+Analysis begins with a lexical lookup of the wordforms in the source-language sentence.
+The forms dictionary includes roots and grammatical features for some words.\footnote{In future versions
+of the system, it will be possible to call a morphological analyzer on the input forms at
+this stage.}
+The forms resulting from this first pass are then used to look up candidate groups in the
+group dictionary.
+Next the system assigns a set of groups to the input sentence, effectively chunking the sentence.
+A successful group assignment satisfies several constraints: (1)~each word in the input sentence
+must be assigned to zero, one, or, in the case of node merging, two group elements.
+(2)~Each element in a selected group must be assigned to one word in the sentence.
+(3)~For each selected group, within-group agreement restrictions must be satisfied.
+(4)~Each category element in a selected group must be merged with a non-category element in another
+selected group.
+Analysis is a robust process; some words in the input sentence may end up unassigned to any group.
+
+Analysis is implemented in the form of constraint satisfaction, making use of a number of the insights
+from the Extensible Dependency Grammar framework (XDG) \cite{debusmann}.
+Although considerable source-sentence ambiguity is eliminated because groups (unless they consist of single words)
+incorporate context, ambiguity is still possible, particularly in the case of figurative expressions
+that also have a literal interpretation.
+In this case, the constraint satisfaction process undertakes a search through the space of possible group
+assignments, creating an analysis for each successful assignment.
+Again this process relies on notions from XDG.
+
+During the transfer phase, a source-language group assignment is converted to an assignment of target-language
+groups to the sentence.
+In this process some target-language items are assigned grammatical features on the basis of cross-language
+agreement constraints or within-group agreement constraints in the target language.
+For example, it is during this stage in the translation of the English sentence \textit{the mayor passes the buck}
+to Spanish that the Spanish verb assigned to the head of the group \textit{escurrir el bulto} would be
+assigned the tense (\textit{tiempo}), person and number features \texttt{tmp=prs, prs=3, num=1}.
+
+During the realization phase, target-language surface forms are generated.
+In the current version of the system, this is accomplished through a dictionary that maps
+roots and feature sets to surface forms.
+In a future version, it will be possible to call a morphological generator at this stage.
+Finally, target-language words are sequenced in a way that satisfies conditions in target-language groups.
\section{Related work}
\label{sect:related}
... Apertium, MOLTO?, other grammatical formalisms? ...
-\section{Evaluation}
-\label{sect:eval}
-???
+\section{Status of project}
+\label{sect:status}
+... tiiiiny lexicons
\section{Ongoing and future work}
\label{sect:future}
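The group-assignment constraints added to the paper draft above (conditions (1)–(4) in the new Section on constraint satisfaction) can be sanity-checked on toy data. The sketch below verifies conditions (1) and (2) for the paper's \textit{I gave the mayor a piece of my mind} example; the data structures (dicts mapping element names to sentence word indices) are illustrative assumptions, not Hiiktuu's internal representation:

```python
# Toy check of conditions (1) and (2) from the draft: each input word is
# covered by at most two group elements (two only under node merging),
# and every element of each selected group maps to exactly one distinct
# word within its group.

def check_assignment(n_words, groups):
    """groups: list of dicts mapping element name -> 0-based word index."""
    coverage = [0] * n_words
    for group in groups:
        seen = set()
        for element, word in group.items():
            if not (0 <= word < n_words):
                return False   # condition (2): element must land on a word
            if word in seen:
                return False   # two elements of one group on the same word
            seen.add(word)
            coverage[word] += 1
    # condition (1): 0, 1, or (node merging) 2 group elements per word
    return all(c <= 2 for c in coverage)

# "I gave the mayor a piece of my mind" (words 0..8): the heads of
# the_mayor and my merge with the idiom group's $sbd/$sbds slots, so
# words 3 ("mayor") and 7 ("my") are each covered twice.
idiom = {"give": 1, "$sbd": 3, "a": 4, "piece": 5, "of": 6, "$sbds": 7, "mind": 8}
the_mayor = {"the": 2, "mayor": 3}
my = {"my": 7}
assert check_assignment(9, [idiom, the_mayor, my])
```

Conditions (3) and (4), agreement and category/non-category merging, would need feature structures and element types and are omitted here.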