Automated glossary of Namibian legislation using AKN

Greg Kempe

Sep 30, 2019, 2:28:51 AM
to akomant...@googlegroups.com
Hello Akoma Ntoso fans,

We recently launched a tool that we think is an interesting use case for legislation structured with AKN: an automated glossary of all the defined terms in our Namibian legislation collection.

You can explore the glossary at https://edit.laws.africa/places/na/labs/glossary

Our blog post talks more about how we built the glossary and the basic machine learning we use to cluster similar term definitions. 

Our Namibian legislation collection comprises 429 Acts in Akoma Ntoso and uses def and term tags to mark up defined terms, which we identify automatically when marking up each document. For example:

     “charge” means an indictment, charge sheet, summons or written notice;
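
To give a feel for what extracting a defined term from this kind of markup might look like, here is a minimal sketch using only Python's standard library. The XML fragment is a simplified, hypothetical illustration (not schema-valid and not copied from our corpus), the namespace URI is assumed to be the AKN 3.0 one, and the helper names local_name and extract_definitions are made up for this example. It treats the text of the element that directly contains a def tag as the definition, which is a simplification.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment: a <def> element wrapping the defined term.
# The namespace URI is an assumption; the code below ignores namespaces anyway.
SAMPLE = """<fragment xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0">
  <p refersTo="#term-charge">“<def refersTo="#term-charge">charge</def>” means an indictment, charge sheet, summons or written notice;</p>
</fragment>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_definitions(xml_text):
    """Return (term, definition text) pairs, one per <def> element.

    Assumption: the element directly containing <def> holds the whole definition.
    """
    root = ET.fromstring(xml_text)
    results = []
    for parent in root.iter():
        for child in parent:
            if local_name(child.tag) == "def":
                term = "".join(child.itertext()).strip()
                # Collapse runs of whitespace in the surrounding text.
                definition = " ".join("".join(parent.itertext()).split())
                results.append((term, definition))
    return results


if __name__ == "__main__":
    for term, definition in extract_definitions(SAMPLE):
        print(term, "=>", definition)
```

Matching on the local tag name keeps the sketch independent of which AKN namespace version a document declares.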

The glossary extracts all the terms and their definitions from the corpus. It then clusters the terms based on similarity. For example, the definition of the term "Minister" varies greatly across the corpus, but there are similar (sometimes identical) definitions in works that are related, such as those to do with Home Affairs or Education.

glossary.png

It's an interesting way of exploring the legislation. We can see that definitions evolve over time ("Minister of X" changes to "Minister responsible for X" in the late 1990s), and some definitions should probably be standardised, such as "youth":

youth.png

The clustering is done using the scikit-learn Python package. It's naive but works quite well: for each term, we extract the definition text, vectorise it using tf-idf, compute the cosine similarity between the definition vectors, and use agglomerative clustering to group similar definitions. More details in our blog post.
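
For anyone who wants to experiment, the steps above can be sketched roughly as follows. This is an illustration, not our production code: the sample definitions, the cluster_definitions helper, the average linkage and the 0.5 distance threshold are all assumptions chosen for the example. The try/except is there only because scikit-learn renamed the affinity parameter to metric in newer releases.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical definitions of one term, for illustration only.
DEFINITIONS = [
    "the Minister responsible for education",
    "the Minister responsible for education",
    "the Minister responsible for home affairs",
    "an indictment, charge sheet, summons or written notice",
]


def cluster_definitions(definitions, distance_threshold=0.5):
    """Group similar definition texts; returns one cluster label per text.

    The threshold and linkage are illustrative assumptions, not tuned values.
    """
    # 1. Vectorise each definition with tf-idf.
    vectors = TfidfVectorizer().fit_transform(definitions)
    # 2. Cosine similarity between every pair, turned into a distance matrix.
    distances = np.clip(1.0 - cosine_similarity(vectors), 0.0, None)
    # 3. Agglomerative clustering on the precomputed distances.
    kwargs = dict(
        n_clusters=None,
        linkage="average",
        distance_threshold=distance_threshold,
    )
    try:
        model = AgglomerativeClustering(metric="precomputed", **kwargs)
    except TypeError:
        # Older scikit-learn releases call this parameter "affinity".
        model = AgglomerativeClustering(affinity="precomputed", **kwargs)
    return model.fit(distances).labels_.tolist()


if __name__ == "__main__":
    for label, text in zip(cluster_definitions(DEFINITIONS), DEFINITIONS):
        print(label, text)
```

Using a distance threshold rather than a fixed number of clusters means the number of groups can vary from term to term, which seems a natural fit when some terms have one definition and others have dozens.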

Best,
Greg

--
Greg Kempe
CTO and Co-founder, Laws.Africa
https://laws.africa · gr...@laws.africa · +27 78 246 1116
