Re: [TheyWorkForYou-Wales] theyworkforyou-wales Digest, Vol 5, Issue 2

Siôn Jones

Jun 27, 2010, 4:18:46 PM
to theyworkfo...@lists.mysociety.org
I don't think you are right. I think the decision was not to translate contributions in English into Welsh, but instead to publish transcriptions of the concurrent translation, which will continue.

When this project first came up, I asked my AM if he would assist in ensuring that the proceedings were published in a sensible, electronic form, and he was very supportive of the suggestion.

Should I follow this up?

On 26 June 2010 12:00, <theyworkforyou...@lists.mysociety.org> wrote:
Send theyworkforyou-wales mailing list submissions to
       theyworkfo...@lists.mysociety.org

To subscribe or unsubscribe via the World Wide Web, visit
       https://secure.mysociety.org/admin/lists/mailman/listinfo/theyworkforyou-wales

or, via email, send a message with subject or body 'help' to
       theyworkforyou...@lists.mysociety.org

You can reach the person managing the list at
       theyworkforyo...@lists.mysociety.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of theyworkforyou-wales digest..."


Today's Topics:

  1. Re: theyworkforyou-wales Digest, Vol 5,   Issue 1 (K THOMPSON)


----------------------------------------------------------------------

Message: 1
Date: Fri, 25 Jun 2010 23:01:40 +0000 (GMT)
From: K THOMPSON <kthomp...@btinternet.com>
Subject: Re: [TheyWorkForYou-Wales] theyworkforyou-wales Digest, Vol
       5,      Issue 1
To: theyworkfo...@lists.mysociety.org
Message-ID: <509069....@web87111.mail.ird.yahoo.com>
Content-Type: text/plain; charset="iso-8859-1"

Thanks to all who responded; it sounds as though the computer skills necessary to assist in this are a little beyond me! It seems the bilingual nature of proceedings in the Assembly makes things difficult; a recent review ( http://news.bbc.co.uk/1/hi/wales/wales_politics/8692117.stm?) suggests that soon all spoken business will be in English. If there are any simple tasks to assist in 'theyworkforyou' Welsh Assembly - please let me know!
Caebrwyn



________________________________
From: "theyworkforyou...@lists.mysociety.org" <theyworkforyou...@lists.mysociety.org>
To: theyworkfo...@lists.mysociety.org
Sent: Friday, 25 June, 2010 12:00:03
Subject: theyworkforyou-wales Digest, Vol 5, Issue 1



Today's Topics:

  1. welsh assembly (K THOMPSON)
  2. Re: welsh assembly (Duncan Parkes)
  3. Re: welsh assembly (Sam Knight)
  4. Re: welsh assembly (Duncan Parkes)


----------------------------------------------------------------------

Message: 1
Date: Thu, 24 Jun 2010 19:44:41 +0000 (GMT)
From: K THOMPSON <kthomp...@btinternet.com>
Subject: [TheyWorkForYou-Wales] welsh assembly
To: theyworkfo...@lists.mysociety.org
Message-ID: <988224....@web87101.mail.ird.yahoo.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi
has anyone made any progress for Theyworkforyou on the Welsh Assembly Government or National Assembly for Wales? How can contributions be made?
Caebrwyn

------------------------------

Message: 2
Date: Thu, 24 Jun 2010 20:47:30 +0100
From: Duncan Parkes <duncan...@gmail.com>
Subject: Re: [TheyWorkForYou-Wales] welsh assembly
To: TheyWorkForYou for the Welsh Assembly
       <theyworkfo...@lists.mysociety.org>
Message-ID:
       <AANLkTilcR_F3erlOOZyBo...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Hi Caebrwyn

> has anyone made any progress for Theyworkforyou on the Welsh Assembly
> Government or National Assembly for Wales? How can contributions be made?
> Caebrwyn

I don't think so, I'm afraid. I started working on this over a year
ago when I was just a volunteer for mySociety rather than an employee,
but I just ran out of time. I will try to go through the code I have
written, tidy it up, and make it available so that anyone else who
wants to can help with it.

Sorry for the lack of progress!

Duncan



------------------------------

Message: 3
Date: Fri, 25 Jun 2010 00:08:37 +0100
From: Sam Knight <samkn...@gmail.com>
Subject: Re: [TheyWorkForYou-Wales] welsh assembly
To: TheyWorkForYou for the Welsh Assembly
       <theyworkfo...@lists.mysociety.org>
Message-ID:
       <AANLkTimVoxahfiu8qclc-...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

I've managed to make some progress, but the way the speeches are laid out on
the site makes it difficult to scrape, i.e. it's bilingual and not always
displayed in the same order. If anyone has found a simple dataset in one
language I could manage it.



------------------------------

Message: 4
Date: Fri, 25 Jun 2010 00:15:17 +0100
From: Duncan Parkes <duncan...@gmail.com>
Subject: Re: [TheyWorkForYou-Wales] welsh assembly
To: TheyWorkForYou for the Welsh Assembly
       <theyworkfo...@lists.mysociety.org>
Message-ID:
       <AANLkTil3jvaJUGUwQebjE...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

The order is what was actually spoken on the left and the translation on
the right. Sometimes this changes mid-speech. I've not yet found an example
mid-paragraph.

My half-built scraper works out which language by a technique roughly as
sophisticated as counting the els...

Cheers

Duncan


------------------------------

_______________________________________________
theyworkforyou-wales mailing list
theyworkfo...@lists.mysociety.org
https://secure.mysociety.org/admin/lists/mailman/listinfo/theyworkforyou-wales


End of theyworkforyou-wales Digest, Vol 5, Issue 1
**************************************************

------------------------------



End of theyworkforyou-wales Digest, Vol 5, Issue 2
**************************************************



--
Pob Hwyl - Best Regards

Siôn

Carl Morris

Jul 3, 2010, 10:52:09 AM
to TheyWorkForYou for the Welsh Assembly
Siôn, if nothing else Y Cofnod should be published in appropriate
format(s). Definitely follow this up because Dafydd Elis-Thomas in
particular is putting a lot of emphasis on "imaginative use of modern
technology".

Sam, why is language a challenge here? Can you not just scrape the blob
of text (ignoring language) - can the resulting interface just display
both side-by-side?

There are language detection libraries out there, e.g.
http://googlesystem.blogspot.com/2008/03/google-launched-another-ajax-api-this.html
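
For anyone who would rather not depend on an external service, a rough sketch of a do-it-yourself approach follows. It is only an illustration: the word lists are a small hand-picked sample, the function name `guess_language` is made up for this example, and it is not the method used by any scraper mentioned in this thread. It simply counts very common Welsh and English function words and picks whichever language scores higher.

```python
import re

# Small, hand-picked samples of common function words (an assumption for
# this sketch; a real detector would want much longer lists).
WELSH_WORDS = frozenset({
    "y", "yr", "yn", "ac", "mae", "ar", "bod", "ei", "eu",
    "sydd", "wedi", "gan", "hefyd", "ond", "nid", "fod",
})
ENGLISH_WORDS = frozenset({
    "the", "of", "and", "to", "in", "is", "that", "it",
    "for", "was", "on", "with", "as", "be", "this", "have",
})


def guess_language(text):
    """Return "cy", "en" or "unknown" for a block of text."""
    # Split into runs of letters so that "Mae'r" yields "mae" and "r".
    words = re.findall(r"[^\W\d_]+", text.lower())
    welsh = sum(word in WELSH_WORDS for word in words)
    english = sum(word in ENGLISH_WORDS for word in words)
    if welsh == english:
        return "unknown"
    return "cy" if welsh > english else "en"
```

A heuristic like this is likely to be unreliable on very short passages, so it is probably best applied per paragraph or per speech rather than per sentence.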

About the provision of translation, Siôn is correct. The original speech
will always be transcribed and published. But a translation will only be
published for Welsh speeches into English (not vice versa).

This decision will of course underprivilege people who wish to use
Welsh. Here's more background for the curious:
http://hedyn.net/y_cofnod_llawn

Carl

Duncan Parkes

Jul 3, 2010, 2:28:14 PM
to TheyWorkForYou for the Welsh Assembly
> Sam, why is language a challenge here? Can you not just scrape the blob of
> text (ignoring language) - can the resulting interface just display both
> side-by-side?

I don't think language is a big problem. It was solvable before, and
now the translations only appear when the original was in Welsh, it's
obvious which is the English and which is the Welsh (though not
translating the English into Welsh makes producing a Welsh version of
the site impossible).

>> Should I follow this up?

Absolutely! I'm going to have another go at parsing it - I got quite a
long way before, but just ran out of spare time.

The problems I had before were mostly due to inconsistencies in the
markup rather than the welsh/english thing. I've just had a quick look
and I /think/ it's got better. We'll see!

Cheers,

Duncan

Duncan Parkes

Jul 3, 2010, 2:29:42 PM
to TheyWorkForYou for the Welsh Assembly
> Should I follow this up?

I meant to say that the most important thing is that they understand
what makes a good format for publishing it. It needs to be published
in such a way that it's easy for a computer to read and understand,
not just a human. I'm sure we have some guidelines somewhere on how
to do this well.

Cheers,

Duncan

Carl Morris

Jul 3, 2010, 4:38:11 PM
to TheyWorkForYou for the Welsh Assembly
Duncan Parkes wrote:
> I don't think language is a big problem. It was solvable before, and
> now the translations only appear when the original was in Welsh, it's
> obvious which is the English and which is the Welsh (though not
> translating the English into Welsh makes producing a Welsh version of
> the site impossible).

I do see a distinction between the interface and the content here.

There would be a warm welcome for a Welsh language interface, what do
you think? Could it be run in the same fashion as the Pledgebank interface?
http://www.cy.pledgebank.com/
(ah, this translation also needs updating...)

I'd like to volunteer to translate the interface for TWFY Welsh Assembly
into Welsh - if you'll take it. I'm sure there will be other people
interested in contributing too. I'm happy to coordinate it.

If it's anything like Pledgebank it would be done with gettext/po files
- correct? It's the same method I've used before for WordPress localisation.

I wanted to lodge this offer early. On the content side the recent
change in translation provision *may* turn out to be a momentary glitch
in service FWIW.

> The problems I had before were mostly due to inconsistencies in the
> markup rather than the welsh/english thing. I've just had a quick look
> and I /think/ it's got better. We'll see!

This is tremendously exciting, thanks Duncan!

Carl

Sam Knight

Jul 3, 2010, 6:25:37 PM
to TheyWorkForYou for the Welsh Assembly
I am fairly new to this mailing list. Are there any standards that are required for the project? So far I'm using Ruby, outputting to an XML feed.

Matthew Somerville

Jul 3, 2010, 6:33:38 PM
to TheyWorkForYou for the Welsh Assembly
On Sat, Jul 03, 2010 at 11:25:37PM +0100, Sam Knight wrote:
> As I am fairly new to this mailing list. Is there any standards that are
> required for the project. So far I'm using ruby outputting to an xml feed.

The archive can be read here:
https://secure.mysociety.org/admin/lists/pipermail/theyworkforyou-wales/

A couple of old posts by me:
https://secure.mysociety.org/admin/lists/pipermail/theyworkforyou-wales/2009-July/000002.html
https://secure.mysociety.org/admin/lists/pipermail/theyworkforyou-wales/2009-December/000031.html

I would just use the existing XML data for the bodies TheyWorkForYou covers
as guidance. And ask Duncan to put the code he's already written somewhere :)

ATB,
Matthew

Matthew Somerville

Jul 3, 2010, 6:42:06 PM
to TheyWorkForYou for the Welsh Assembly
Hi,

I was in contact with someone at the Assembly last year about a possible
project they were considering about outputting the Record of Proceedings
in some form of machine readable format. As far as I know, the project
was approved but there were lots of other unrelated things that also had
to be done, so I don't know where it's currently at. I'll write again
and ask next week.

For interest, here's basically what I wrote to them in July 2009 or so:
| Your project sounds like it would be of great value. Coincidentally,
| we've recently set up a mailing list for volunteers to discuss being
| able to get TheyWorkForYou for the Welsh Assembly up and running, and
| if the Assembly could provide machine readable data (such as XML, but
| the format isn't that important) in the first place, that would make
| it much easier than having to parse HTML or PDF and convert it into
| machine-readable data.

| Whatever information you have to be made available in a
| machine-readable format, I'm sure people would be glad if you provided
| it (you'd be better than the UK Parliament, the Scottish Parliament,
| or the Northern Ireland Assembly if you provided machine-readable data
| of your proceedings :) ). All I'd say at this stage is that wherever
| possible, you link things together with IDs. For example, give each
| Assembly Member an ID, and in the machine-readable data for a day's
| proceedings, mark each speech with that ID so anyone can pick out who said what.
| Also, give each speech its own ID so it can be referred to by other
| people (and perhaps by you when someone refers back to a previous
| speech or answer they gave, for example). I'm afraid I'm not an expert
| on Welsh procedure, but say you're discussing a Bill and someone
| proposes an Amendment, having each bit of the Bill marked up with some
| form of ID means the Amendment can "know" which bit of the Bill it is
| amending by referring to those IDs. Just having that sort of mindset
| when you approach any data will help.
|
| From TheyWorkForYou's point of view, having machine-readable data on
| Assembly Members (not just current, but historical including
| ministerial positions, party changes, etc.), proceedings, and things
| like that would be most useful, but I'm sure someone can do something
| with whatever you can produce :)
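
To make the point about IDs concrete, here is a minimal sketch of what such data might look like and how little code is needed to work out who said what. Everything in it is hypothetical: the element and attribute names (`proceedings`, `member`, `speech`, `speaker`, `refers-to`), the ID scheme, the date, and the placeholder members are invented for illustration, and this is neither the Assembly's format nor TheyWorkForYou's.

```python
import xml.etree.ElementTree as ET

# Hypothetical machine-readable record for one day. Each member and each
# speech carries its own ID, and a speech can point back at an earlier one.
SAMPLE = """\
<proceedings date="2010-06-23">
  <members>
    <member id="am/1" name="Example Member One"/>
    <member id="am/2" name="Example Member Two"/>
  </members>
  <speech id="2010-06-23.1" speaker="am/1" lang="cy">Placeholder text.</speech>
  <speech id="2010-06-23.2" speaker="am/2" lang="en"
          refers-to="2010-06-23.1">Placeholder reply.</speech>
  <speech id="2010-06-23.3" speaker="am/1" lang="en">Placeholder text.</speech>
</proceedings>
"""


def speeches_by_member(xml_text):
    """Map each member's name to the list of speech IDs they delivered."""
    root = ET.fromstring(xml_text)
    # Build the ID -> name lookup once, then resolve each speech through it.
    names = {m.get("id"): m.get("name") for m in root.iter("member")}
    result = {}
    for speech in root.iter("speech"):
        name = names.get(speech.get("speaker"), "unknown")
        result.setdefault(name, []).append(speech.get("id"))
    return result
```

Because the speaker is an ID rather than a free-text name, spelling variations and title changes in the printed record would not affect the lookup.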

ATB,
Matthew

Matthew Somerville

Jul 3, 2010, 6:46:04 PM
to TheyWorkForYou for the Welsh Assembly
On Sat, Jul 03, 2010 at 09:38:11PM +0100, Carl Morris wrote:
> There would be a warm welcome for a Welsh language interface, what do
> you think? Could it be run in the same fashion as the Pledgebank
> interface? http://www.cy.pledgebank.com/ (ah, this translation also
> needs updating...)

Not sure it was ever fully finished. If you wanted to help there too...
:-)

> I'd like to volunteer to translate the interface for TWFY Welsh Assembly
> into Welsh - if you'll take it. I'm sure there will be other people
> interested in contributing too. I'm happy to coordinate it.

Of course we'd take it. :) I would have to say that the codebase has
probably next to no i18n support, so it would be quite some effort just
to get to the point of having a .po file to translate - but it would
certainly be doable.

> If it's anything like Pledgebank it would be done with gettext/po files
> - correct? It's the same method I've used before for WordPress localisation.

Yes.
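
For readers who haven't used gettext, the sketch below shows the general pattern: interface strings are wrapped in a lookup function, and a per-language catalogue supplies the translation, falling back to the original string when none exists. It is written in Python purely for illustration and says nothing about how the TheyWorkForYou codebase is structured. The `DictTranslations` class and `render_menu` function are made up for this example; a real setup would load a compiled catalogue with something like `gettext.translation(domain, localedir, languages=["cy"])` rather than a hard-coded dict.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Stand-in for a compiled .mo catalogue, backed by a plain dict."""

    def __init__(self, catalogue):
        super().__init__()
        self._catalogue = catalogue

    def gettext(self, message):
        # Fall back to the untranslated string, as gettext itself does.
        return self._catalogue.get(message, message)


# A tiny sample catalogue; "Help" is deliberately left untranslated.
WELSH = DictTranslations({
    "Search": "Chwilio",
    "Debates": "Dadleuon",
    "Assembly Members": "Aelodau'r Cynulliad",
})


def render_menu(translations):
    """Return the menu labels in whatever language the catalogue provides."""
    _ = translations.gettext
    return [_("Search"), _("Debates"), _("Assembly Members"), _("Help")]
```

The effort Matthew describes is mostly the first half of this: finding every hard-coded string in the interface and wrapping it, before there is anything for translators to work on.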

ATB,
Matthew

Duncan Parkes

Jul 4, 2010, 6:03:32 AM
to TheyWorkForYou for the Welsh Assembly
Hi Sam

On 3 July 2010 23:25, Sam Knight <samkn...@gmail.com> wrote:
> As I am fairly new to this mailing list. Is there any standards that are
> required for the project. So far I'm using ruby outputting to an xml feed.

No standard exactly, but it would probably be best to do things in
Python as that's what all the other scrapers and infrastructure for
running them are in. Have you got much written already? I have some
code already written in Python for parsing Welsh Assembly pages,
though it's unfinished and needs work.

Are you happy to work in Python? If so, I suggest we work together on
this and host it temporarily on github. I'll try to tidy my stuff up a
bit and put it up on my github account. What I've got at the moment is
a start on parsing a page of The Record. I've not really done anything
on scraping the pages. Ideally we should try to do all this roughly
like the scrapers for the other parliaments.

Can we chat on XMPP? I'm duncan...@gmail.com

Cheers,

Duncan

Sam Knight

Jul 4, 2010, 10:35:55 AM
to TheyWorkForYou for the Welsh Assembly
I've not used Python before but could give it a go. I prefer Ruby at the moment because of RubyGems, which is currently running my parser and language detector.
