Import css3.0 flat files - general questions

hage...@bc.edu

Dec 27, 2017, 1:25:47 PM
to pisces-db
Hello.
I'm just starting out with pisces.

I was given css3.0 flat files to import.
The site file ('1882.site') looks like:

AML     1994242       -1   42.1311   73.6941    3.4000 Almayashu, Kyrgyzstan                              ss   AML       0.0000    0.0000 20020807 13:52:27

Following the doc examples, I tried to read it via:

import sqlalchemy as sa
import pisces.schema.css3 as css

class Site(css.Site):
    __tablename__ = 'site'

with open('1882.site') as ffsite:
    for line in ffsite:
        isite = Site.from_string(line)

The problem is that the line has a load time (hh:mm:ss) as well as a load date (lddate), so the from_string conversion bombs.
I can hand edit the file, but I think this might be a recurring issue.

My questions are:

1. Where do I find the fixed-length field descriptions for each table?
    i.e., for the site table, I can look up the fields in Oracle by doing SQL> desc static.site;
    So I can see that LDDATE is of type 'DATE', but I imagine this can take many different formats.

2. Is there a way to override the flat file field descriptions, say to give it the actual DATE format for this case?

3. More fundamental: I'm unclear on how the schemas are being used.
    For instance, why is it necessary to define the Site class as above?
    I would've thought that the css3.0 schema is predefined and there would be a method already available to
    generate an instance of a class (table), along with a method to populate the attributes via the flat file line?

e.g., if instead of using flat files, I connect to an Oracle db:
import pisces as ps
session = ps.db_connect(or_string)
Site, Affil = ps.get_tables(session.bind, ['static.site', 'static.affiliation'])

It returns a Site object without me having to subclass it.

Sorry for the questions, just want to understand how the API is meant to work.

Thanks!
-Mike


Jonathan MacCarthy

Dec 27, 2017, 11:26:34 PM
to hage...@bc.edu, pisces-db
Mike,

These are great questions.

1. When you create a Site class the way you did, the class has a ._format_string attribute that is used to parse the argument to .from_string.  I think the lddate format is shown there, and you may be able to manipulate that string to get what you want.  Alternatively, you can manipulate each line in your loop so that it matches the expected date format.

2. This isn’t ideal, and you’re right to be looking for a better way, but there isn’t one at the moment.  It wouldn’t be too hard to make this configurable for users, though.  Would you mind filing a GitHub issue for this?  I’ll try to make it possible when I return from break.  I should mention, though, that Pisces master on GitHub is currently Python 3 only.  Is this OK for you?

3.  You’re right that the intended API isn’t always clear.  Some of these functions are experiments to see what is possible or feels natural, so I apologize for any confusion.  Defining classes is the cleanest and safest way to get a table class with attributes like .from_string that work as expected.  get_tables reflects arbitrary database tables and guesses at the schema (css vs kbcore vs antelope), whereas “from pisces.schema.css3 import Site” is explicit.  The css3.0 schema is defined in pisces.schema.css3, but it doesn’t know what your actual SQL tables are named.  Subclassing gives you the table structure and convenience methods defined in the parent class, and lets you name the specific target SQL table.  Ready-made classes with vanilla table names are available in pisces.tables.css3 (e.g. a subclass of Site with a SQL table name of “site”).
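
To make the subclassing idea in point 3 concrete, here is a toy sketch in plain Python. It is not Pisces code: FlatFileTable, its _fields list, and the column positions are hypothetical names and assumptions made up for illustration (the positions are inferred from the first three columns of the sample line). The only point is that the parent carries the layout and the parser, while the subclass just names the table.

```python
# Toy illustration only: this is NOT Pisces code. The class name, the
# _fields attribute, and the column positions are assumptions.

class FlatFileTable:
    """Hypothetical parent: holds the column layout and the shared parser."""
    __tablename__ = None  # the parent does not name a concrete SQL table

    # (attribute name, start, stop, converter) for three columns only
    _fields = [
        ("sta", 0, 6, str.strip),
        ("ondate", 7, 15, int),
        ("offdate", 16, 24, int),
    ]

    @classmethod
    def from_string(cls, line):
        """Build an instance by slicing fixed-width columns out of line."""
        row = cls()
        for name, start, stop, convert in cls._fields:
            setattr(row, name, convert(line[start:stop]))
        return row


class Site(FlatFileTable):
    """Concrete subclass: inherits layout and parser, adds only a name."""
    __tablename__ = "site"


# First three columns of the sample row, rebuilt with explicit widths.
line = "%-6s %8d %8d" % ("AML", 1994242, -1)
site = Site.from_string(line)
print(Site.__tablename__, site.sta, site.ondate, site.offdate)
```

A second subclass with a different __tablename__ would reuse the same parser unchanged, which is roughly the reason the schema classes are left abstract.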

I hope these responses are helpful.  I’ll try to make the documentation clearer on these points as well.
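
For the in-loop workaround from point 1, a minimal sketch might look like the following. strip_load_time is a hypothetical helper, not part of Pisces, and it only drops a trailing hh:mm:ss token. Whether the remaining date then matches what .from_string expects is an assumption you would need to check against Site._format_string.

```python
import re

# Matches a trailing " hh:mm:ss" token plus any whitespace after it.
_TRAILING_TIME = re.compile(r"\s+\d{2}:\d{2}:\d{2}\s*$")


def strip_load_time(line):
    """Return line without its newline and without a trailing hh:mm:ss token.

    Hypothetical helper (not a Pisces function). Lines that do not end in
    a time token are returned unchanged apart from the stripped newline.
    """
    return _TRAILING_TIME.sub("", line.rstrip("\n"))


# Tail of the sample row from the original post.
tail = "    0.0000    0.0000 20020807 13:52:27\n"
cleaned = strip_load_time(tail)
print(repr(cleaned))

# Intended use in the loop (assumes Site is defined as in the post):
# with open('1882.site') as ffsite:
#     for line in ffsite:
#         isite = Site.from_string(strip_load_time(line))
```

If the date itself also needs reformatting, that could be done in the same helper once the expected format is known.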

Best,
Jonathan

