CA not picking up committees when there is more than one


Mike Hedblom

Jun 16, 2016, 12:08:31 AM
to Open State Project
When a Calif bill is referred to multiple committees, the related entity has no 'id', and the 'details' only picks up the first committee in the list. For example:

      {  
         "related_entities":[  
            {  
               "type":"committee",
               "name":"Coms. on ED. and HUMAN S.",
               "id":null
            }
         ],
         "actor":"upper",
         "action":"Referred to Coms. on ED. and HUMAN S.",
         "+actor_info":{  
            "details":"Senate (Committee CS44)"
         },
         "date":"2016-06-09 00:00:00",
         "type":[  
            "committee:referred"
         ]
      }

There should be two +actor_info details entries:

Senate (Committee CS44) -- Education

Senate (Committee CS74) -- Human Services


Note: where there are multiple committees, they always appear to be separated by ' and '.
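A minimal sketch of how a combined name could be split on that separator. The function name and the "Com. on " / "Coms. on " prefix handling are assumptions based only on the single example above, not on the actual scraper code:

```python
import re


def split_committee_names(name):
    """Split a combined referral name into individual committee abbreviations.

    Hypothetical helper: assumes an optional "Com. on " / "Coms. on " prefix
    and that multiple committees are joined by ' and '.
    """
    # Drop the leading prefix if it is present.
    stripped = re.sub(r"^Coms?\. on ", "", name)
    # Split on the separator observed in the bill history text.
    return [part.strip() for part in stripped.split(" and ")]


print(split_committee_names("Coms. on ED. and HUMAN S."))
# → ['ED.', 'HUMAN S.']
```

One caveat: a committee whose own abbreviation contains ' and ' would be split incorrectly, so the result would still need to be checked against the list of known abbreviations.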

Mike Hedblom

Jun 17, 2016, 1:11:26 AM
to Open State Project
I did a little more research today and found out that, while the bill is often assigned to multiple committees, it is actually in only one at a time. In the example above, the bill was actually in the Education committee (CS44).

I see we have those committee codes hard-coded in ca\bills.py, but the lookup does not appear to be hooked up to anything.

    def committee_code_to_name(self, code,
            committee_code_to_name=get_committee_code_data()):
        '''Need to map committee codes to names.
        '''
        return committee_code_to_name[code]...

I am using Open States to identify the committee members to contact when certain bills are in committee.  Any suggestions on how we can fix this?  I still don't understand the system well enough to make a change myself.

- - Mike

Andy Lo

Jun 20, 2016, 11:07:17 AM
to Open State Project
Thanks for this Mike. Working on getting all the CA changes delegated, but it may be a little bit as we're currently pretty saturated.

Mike Hedblom

Jun 24, 2016, 1:04:18 AM
to Open State Project
Thanks Andy.  I'm actually finding lots of problems with the data.  I'm still struggling to understand how the system works, especially the fact that Calif pulls from an SQL database.  There is a lot of useful information in the DB that doesn't need to be scraped, and it is more accurate than the scraped text.

Is there some kind of overview that explains how OpenStates works (at the programming level)?

Andy Lo

Jul 23, 2016, 11:57:05 AM
to Open State Project
Hi Mike,

I've addressed the issue with California not detecting multiple committee abbreviations in the bill histories, but I'm not sure I fully understand why the +actor_info field was introduced, or that it's properly labeled and working as intended - it may need to be reconsidered.  Typically, fields prefixed by '+' are non-standard fields that are specific to a state's scraper.  Can you point to a couple of instances of where there's data that the database and Open States both explicitly provide, but the Open States data does not match?  It'll help me pinpoint where to target my efforts.

There is some documentation about billy, which is the framework underlying the Open States scrapers (written specifically for Open States).  You can find the documentation at: http://billy.readthedocs.io/en/latest/.  I'm going to aim at expanding and rewriting the documentation in the near future, however.

Mike Hedblom

Aug 3, 2016, 11:22:25 PM
to Open State Project
Hi Andy,

Thank you for addressing the multiple committee abbreviations issue.  We did a lot of digging and found California is wacky in how it does some of its reporting. A bill is never actually in more than one committee at a time, but Calif seems to report them that way sometimes. A good example is Calif bill AB 2368.

We found we could not rely on OpenStates to identify the correct committee_id. We then thought we could rely on committee_code instead, but we found cases where that element is not reported by OpenStates (this happens when the bill is in the "suspense file").  The only thing we found that consistently has the correct committee information is the details element of +actor_info. As a workaround, we parse the details field and use that to look up the ID. We started to follow this process:

1. Only look for the committee when the action is "committee:referred".
2. Retrieve the code from the committee_code field.
2a. If the field does not exist, scrape the details field to retrieve the code. (or you might just always do this)
3. Use a hand-made map to find the committee_id.  [I wish there was a way to easily map Calif committee code to OpenStates committee ID. ]
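The steps above can be sketched roughly as follows. This is only an illustration of our workaround, not Open States code: the function name, the regular expression, and the map are assumptions, and the single map entry pairs CX25 with CAC000797 as they appear together in one of the examples in this message.

```python
import re

# Hand-made map from Calif committee code to Open States committee ID.
# Illustrative only: a real map needs one entry per committee.
COMMITTEE_CODE_TO_ID = {"CX25": "CAC000797"}

# Assumed format of the details field, e.g. "Assembly (Committee CX25)".
DETAILS_RE = re.compile(r"\(Committee (\w+)\)")


def find_committee_id(action, code_to_id=COMMITTEE_CODE_TO_ID):
    """Return the committee ID for a bill action, or None if not found."""
    # Step 1: only look at referral actions.
    if "committee:referred" not in action.get("type", []):
        return None

    info = action.get("+actor_info", {})

    # Step 2: prefer the committee_code field when it is present.
    code = info.get("committee_code")

    # Step 2a: otherwise parse the code out of the details field.
    if code is None:
        match = DETAILS_RE.search(info.get("details", ""))
        if match is None:
            return None
        code = match.group(1)

    # Step 3: look the code up in the hand-made map.
    return code_to_id.get(code)


action = {
    "type": ["committee:referred"],
    "+actor_info": {"details": "Assembly (Committee CX25)"},
}
print(find_committee_id(action))
# → CAC000797
```

Returning None for anything that cannot be resolved keeps unmapped codes visible to the caller instead of guessing at an ID.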

Here are some of the examples we found:
  • Committee code is not defined for every action while the bill is in committee, even though the committee ID is given, as in the following snippet:
{
            "related_entities": [
                {
                    "type": "committee",
                    "name": "Standing Committee on Human Services",
                    "id": "CAC000690"
                }
            ],
            "actor": "lower",
            "action": "From committee chair, with author's amendments: Amend, and re-refer to Standing Committee on Human Services. Read second time and amended.",
            "+actor_info": {
                "details": "Assembly (E&E Engrossing)"
            },
            "date": "2016-04-05 00:00:00",
            "type": [
                "committee:passed",
                "bill:reading:1",
                "bill:reading:2"
            ]
        }


And
  • Committee code is not always defined when the committee ID is null.

{
            "related_entities": [
                {
                    "type": "committee",
                    "name": "APPR. suspense file.",
                    "id": null
                },
                {
                    "type": "committee",
                    "name": "APPR",
                    "id": null
                }
            ],
            "actor": "lower",
            "action": "In committee: Set, first hearing. Referred to APPR. suspense file.",
            "+actor_info": {
                "details": "Assembly (Committee CX25)"
            },
            "date": "2016-04-27 00:00:00",
            "type": [
                "committee:referred"
            ]
        },

And

  • When there are multiple committees, Open States gives the same details for both of them.

{
            "related_entities": [
                {
                    "type": "committee",
                    "name": "Standing Committee on Appropriations",
                    "id": "CAC000797"
                },
                {
                    "type": "committee",
                    "name": "Standing Committee on Appropriations",
                    "id": "CAC000797"
                }
            ],
            "+no_votes": [
                "0"
            ],
            "actor": "lower",
            "+yes_votes": [
                "7"
            ],
            "action": "From committee: Do pass and re-refer to Standing Committee on Appropriations. with recommendation: To Consent Calendar. (Ayes 7. Noes 0.) (April 12). Re-referred to Standing Committee on Appropriations.",
            "+actor_info": {
                "details": "Assembly (Committee CX25)",
                "committee_code": "CX25"
            },
            "date": "2016-04-13 00:00:00",
            "type": [
                "committee:passed",
                "committee:referred",
                "committee:passed:favorable"
            ],
            "related_votes": [
                "CAV00129529"
            ]
        }

When looking at the database provided by Calif, we noted that the 'tertiary' field always shows the correct committee_code when the bill is in committee.

Every bill in the Calif bill history table has an 'action_code'. I have been trying to get the codes from the Calif Leg IT department. I manually mapped them to bill states, but I am not confident the mapping is 100% correct, and I don't trust that the codes won't change or be added to.  That, however, would be the ideal method for OpenStates, instead of scraping the text.







