Fixed Width Text Custom Tabular Output

11 views
Skip to first unread message

James Grubb

unread,
Jan 29, 2018, 9:57:57 AM1/29/18
to OpenRefine
Hi,

I'm new to OpenRefine and have a question.

I'm an amateur astronomer learning how to help the Minor Planet Center take follow-up measurements of newly discovered asteroids.  This is citizen science that helps professional astronomers refine orbits for these objects.

This requires that I download data in the form of a fixed width flat file or JSON file, refine the file to a select group of targets and then output it back to a flat file(s) for several different astronomy applications.  I then take a series of images with my telescope, perform astrometry measurements, and then upload the results back to the Minor Planet Center.  The astronomy applications I use can import fixed width data, however each application requires a different fixed with format.  Today I do this as a manual process and am trying to figure out how to automate it.

By increasing memory I was able to open the flat file, (and JSON file), in OpenRefine (2.8.1) on my Mac and import it using the fixed with text parser.  When refining the data using facets I have to transform text to number and then back to text before outputting.  When I do this the length (padding) changes and it looses it's fixed width.  This is a problem for decimal numbers as they have to be lined up in the fixed width output file.  I'm trying to figure out how to do this with a custom tabular output and am having a hard time understanding GREL and RegEx and how they work together on custom output.  I'm new to Regex as well so this isn't helping.

So far, I have figured out how to output to a text file using:

{{cells["column1"].value}}{{cells["column2"].value}}   {{cells["column3"].value}}

This works and I can pad by using spaces i.e. between column2 and column3 above, however this does not allow me to account for different lengths of decimal numbers and ends up creating an output without the correct fixed width.

Here is an example of my output with just a few rows and only three columns:

00001   3.34   309.49412
00002   4.13   291.65136
00003   5.33   259.23491
00004   3.2   292.71034
00005   16.85   139.06218
00006   5.71   289.56753
00007   5.51   339.84951

I need it to output like this:

00001   3.34   309.49412
00002   4.13   291.65136
00003   5.33   259.23491
00004   3.2    292.71034
00005  16.85   139.06218
00006   5.71   289.56753
00007   5.51   339.84951

Can I do this with GREL or with a combination of GREL and RegEx?

Here is an example of the original flat file with two records:

00002    4.13  0.11 K172G 248.96907  309.99722  173.08842   34.84039  0.2307044  0.21344212   2.7729963  0 MPO394986  7871 107 1821-2016 0.58 M-v 28h MPC        0000              (2) Pallas
00003    5.33  0.32 K172G 213.98291  248.22306  169.85927   12.98994  0.2567897  0.22610725   2.6684540  0 MPO389571  6845 103 1821-2016 0.60 M-v 38h MPC        0000              (3) Juno

This is an example of the transformation for one of my astronomy applications, (Software Bisque, TheSky):

     2 Pallas        |2017 02 16.000|0.230704  |2.772996| 34.8404|173.0884 |309.9972 | 2000|248.9691  | 4.13| 0.11|   0.00
     3 Juno          |2017 02 16.000|0.256790  |2.668454| 12.9899|169.8593 |248.2231 | 2000|213.9829  | 5.33| 0.32|   0.00
     
This is an example of transformation for a second astronomy app, (Cartes du Ciel):

      2) Pallas                                2017 02 16.0 248.9691  2.772996 0.230704 309.9972 173.0884  34.8404   4.13  0.11
      3) Juno                                  2017 02 16.0 213.9829  2.668454 0.256790 248.2231 169.8593  12.9899   5.33  0.32

If anyone can point me in the right direction, I'd appreciate it.  I have the "Using OpenRefine" book and have tried to figure it out from that and the online documentation with out any luck so far.

Thank you,
Jim 
Reply all
Reply to author
Forward
0 new messages