Feedback on "Camera Report Interchange Format Specification"

Robert Shaver

Mar 19, 2014, 8:25:05 PM
to ves-tech-ca...@googlegroups.com
Hi All,

First let me point out my background. I'm a software/hardware engineer with an interest in filmmaking. I have considerable experience with creating specifications and writing the software to implement them.

With this in mind, I have a few suggestions that may improve the clarity of this specification.

EXCHANGE FORMAT
The choice of CSV is fine as long as you don't complicate it much. You have already run into that problem and solved it by prefacing field names with tk.  and makes extensions to the specification easier. XML makes the the meanings of things explicit

GENERAL COMMENTS
  1. BE EXPLICIT ABOUT THE INTERCHANGE FILE FORMAT: I think the CSV file format is a good choice. However, to ensure that all implementations of the interchange format are compatible, the spec needs to be quite explicit so that there is little room for incompatible interpretation of the field values.
    1. Numeric fields come in three possibilities: signed integers, unsigned integers and real numbers. Integers are exact but not all real numbers get interpreted in the same way on all platforms. This is a place where incompatibility can creep in.
    2. When an optional field is left blank, how will that appear in the CSV file? A string can be shown as two commas in a row, but what about a number or a timestamp? I would suggest that, when exporting, any NULL value in the database be written to the CSV as two commas in a row. When importing, any value that appears as either ,, or ,"", be entered into the database as NULL. (NULL is a special value used in the database to explicitly indicate that the field contains no value.)
    3. Specify the maximum field size for all strings. This could be a blanket statement that applies to all strings in the CSV, but keep in mind that very long strings can cause implementation problems on some platforms.
    4. Specify both import behavior and export behavior. This is especially important for handling exceptions and creating a robust specification that tolerates variations in a predictable way. For example, if a database is being exported to the CSV interchange format and the required field "Camera letter" is NULL, specify that it be exported as the letter 'A'. If importing a CSV and the "Camera letter" is missing then specify that it be imported as the letter 'A'. The other option is to fail the import or at least skip that record. (A log of these exceptions should be generated so that someone can review the exceptions.)
    5. Date/time values vary from platform to platform. When exporting a timestamp with no time part, export it with a time of 00:00:00. When importing a timestamp with no time value, add a time value of 00:00:00.
  2. SEPARATE THE INTERCHANGE FILE FORMAT FROM INTERFACE: It is useful to keep the file format specification separate from user interface issues. You might even consider having two specs or two sections in this specification; one for the CSV file format and one for the interface requirements.
  3. USER INTERFACE SPECIFICATION: The fields in the spec with types "Multi-select" and "Enum" are really type string in the CSV file. "Multi-select" and "Enum" really only apply to how the user will specify the values to be entered in these fields. Required vs Optional fields are also principally about the user interface. If a CSV file is being imported into a database and does not have one or more of the required fields, then an error would be reported.
    1. FIELD RANGE CHECK: It may be useful to specify the range of values that are correct in a field. This could apply to numeric fields such as shutter angle, tilt angle, etc. Perhaps even camera letter should be constrained to be a single capital letter. Some range checks depend on data in other fields, such as all that have the attribute "Required for stereo".
    2. DEFAULT VALUES: Should some fields have default values? For example the camera letter could default to 'A'.
    3. Where will the values be obtained for the various "Multi-select" and "Enum" field types? It is possible to put them into the CSV file using the same technique as you used for the takes (prefacing field names with tk) but that complicates things. This is where XML is much more flexible.
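To make the import suggestions above (blank fields as NULL, a default camera letter, a midnight time part, and an exception log) more concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not part of the spec: the "Date" and "Notes" column names, the timestamp layout, and the helper names are hypothetical; only "Camera letter" and its default of 'A' come from the discussion above.

```python
import csv
import io
from datetime import datetime

# Hypothetical table of defaults for required fields (only this entry is from the discussion).
REQUIRED_DEFAULTS = {"Camera letter": "A"}


def normalize_timestamp(value):
    """Return 'YYYY-MM-DD HH:MM:SS', adding 00:00:00 when the time part is missing."""
    # The two accepted layouts are an assumption made for this sketch.
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {value!r}")


def import_rows(text, timestamp_fields=("Date",)):
    """Parse CSV text into a list of dicts plus a log of applied exceptions."""
    rows, log = [], []
    # The header is line 1, so the first data record is line 2.
    for line_no, raw in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        # With default settings the csv module reads both ,, and ,"", as "",
        # so both forms end up as NULL (None) here.
        row = {name: (value if value != "" else None) for name, value in raw.items()}
        for name, default in REQUIRED_DEFAULTS.items():
            if row.get(name) is None:
                row[name] = default
                log.append(f"line {line_no}: {name} missing, defaulted to {default!r}")
        for name in timestamp_fields:
            if row.get(name) is not None:
                row[name] = normalize_timestamp(row[name])
        rows.append(row)
    return rows, log


sample = 'Camera letter,Date,Notes\n,2014-03-19,""\nB,2014-03-20 08:15:00,ok\n'
rows, log = import_rows(sample)
print(rows[0])
print(log)
```

The alternative policy mentioned above (fail the import or skip the record) would replace the defaulting branch with a `continue` or a raised error; either way the log gives someone a place to review what happened.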
That's all that comes to mind right now. Please feel free to ask me any questions or let me know how I can be of assistance.

Best regards,

Rob:-]
---
Robert (Rob) Shaver

Robert Shaver

Mar 20, 2014, 3:02:46 PM
to ves-tech-ca...@googlegroups.com
I see that I left my thoughts about the exchange format incomplete in the above post.

EXCHANGE FORMAT
The choice of CSV is fine as long as you don't complicate the requirements much. You have already run into that problem and solved it by prefacing field names with tk. XML makes the meanings of things more explicit and makes extensions to the specification easier. XML gives the spec much more flexibility to adapt to changing requirements, and there are lots of tools on every platform that make it easy to deal with. CSV is great for capturing data in a table but begins to unravel once you start adding other features like Enum and Multi-select.
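
As a rough illustration of the point about explicitness, here is a small Python sketch of how a multi-select field on a take might look in XML. The element and attribute names ("take", "field", "value", "type") and the "Filters" field are hypothetical, invented for this example; they are not taken from the spec.

```python
import xml.etree.ElementTree as ET


def take_to_xml(take_number, filters):
    """Serialize one take with a multi-select field; each choice gets its own element."""
    take = ET.Element("take", number=str(take_number))
    field = ET.SubElement(take, "field", name="Filters", type="multi-select")
    for choice in filters:
        ET.SubElement(field, "value").text = choice
    return ET.tostring(take, encoding="unicode")


def filters_from_xml(xml_text):
    """Read the choices back without any delimiter or column-prefix convention."""
    take = ET.fromstring(xml_text)
    return [v.text for v in take.findall("./field[@name='Filters']/value")]


xml_text = take_to_xml(1, ["ND 0.6", "Polarizer"])
print(xml_text)
print(filters_from_xml(xml_text))
```

The field type and the individual choices are carried in the markup itself, whereas in CSV the same information has to be encoded by convention (a prefix on the column name or a delimiter inside the cell).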

Sam Richards

Mar 24, 2014, 2:17:12 PM
to ves-tech-ca...@googlegroups.com
Thanks for your detailed email, sorry it took me a few days to reply. 

I will answer your points below, but I do want to stress that my long-term view of the camera reports is that they will add to the information recorded in the plate metadata, and as such are really just a set of notes rather than a formal set of fields. So you will notice that almost all the fields are strings, even when the values will typically be numeric. I know this goes against most software developers' best practices, but when the poor data-wrangler is in the line of fire, they want quick ways of entering the values any way they can, and forcing them into a numeric field is a big problem.

For more accurate reliable data, we ultimately will be relying on the machine captured data from the plate, and then augmenting it with information that cannot be captured automatically.


On Wed, Mar 19, 2014 at 5:25 PM, Robert Shaver <robert....@gmail.com> wrote:

GENERAL COMMENTS
  1. BE EXPLICIT ABOUT THE INTERCHANGE FILE FORMAT: I think the CSV file format is a good choice. However, to ensure that all implementations of the interchange format are compatible, the spec needs to be quite explicit so that there is little room for incompatible interpretation of the field values.
    1. Numeric fields come in three possibilities: signed integers, unsigned integers and real numbers. Integers are exact but not all real numbers get interpreted in the same way on all platforms. This is a place where incompatibility can creep in.
Good point. 
    2. When an optional field is left blank, how will that appear in the CSV file? A string can be shown as two commas in a row, but what about a number or a timestamp? I would suggest that, when exporting, any NULL value in the database be written to the CSV as two commas in a row. When importing, any value that appears as either ,, or ,"", be entered into the database as NULL. (NULL is a special value used in the database to explicitly indicate that the field contains no value.)
    3. Specify the maximum field size for all strings. This could be a blanket statement that applies to all strings in the CSV, but keep in mind that very long strings can cause implementation problems on some platforms.
Good idea. There will be a couple of fields where I don't want to specify a max size, but for most cases this should be doable. 
    4. Specify both import behavior and export behavior. This is especially important for handling exceptions and creating a robust specification that tolerates variations in a predictable way. For example, if a database is being exported to the CSV interchange format and the required field "Camera letter" is NULL, specify that it be exported as the letter 'A'. If importing a CSV and the "Camera letter" is missing then specify that it be imported as the letter 'A'. The other option is to fail the import or at least skip that record. (A log of these exceptions should be generated so that someone can review the exceptions.)
I will need to think about that, or at least about which fields this is critical for. 
    5. Date/time values vary from platform to platform. When exporting a timestamp with no time part, export it with a time of 00:00:00. When importing a timestamp with no time value, add a time value of 00:00:00.
All the timestamp fields are required to have a time component. Perhaps I should do a separate field-type breakdown for any known requirements. 
  2. SEPARATE THE INTERCHANGE FILE FORMAT FROM INTERFACE: It is useful to keep the file format specification separate from user interface issues. You might even consider having two specs or two sections in this specification; one for the CSV file format and one for the interface requirements.
Even though I do mention "filemaker" at the top, I have already tried to separate some of the filemaker specifics from the main spec, but you are right about the multi-selects and enums. The "enum" isn't really an enum in the database sense, so I might change the description a little for that. Is there somewhere else that you are seeing me talk about interface requirements? I have tried to avoid that.
  3. USER INTERFACE SPECIFICATION: The fields in the spec with types "Multi-select" and "Enum" are really type string in the CSV file. "Multi-select" and "Enum" really only apply to how the user will specify the values to be entered in these fields. Required vs Optional fields are also principally about the user interface. If a CSV file is being imported into a database and does not have one or more of the required fields, then an error would be reported.
    1. FIELD RANGE CHECK: It may be useful to specify the range of values that are correct in a field. This could apply to numeric fields such as shutter angle, tilt angle, etc. Perhaps even camera letter should be constrained to be a single capital letter. Some range checks depend on data in other fields, such as all that have the attribute "Required for stereo".
This is *really* difficult, since we do need to allow for free-form text in these fields. I have considered doing at least some regex pattern matching and/or conversion, so that people could enter imperial values and have them converted to metric, but that's quite a challenge in filemaker.  
    2. DEFAULT VALUES: Should some fields have default values? For example the camera letter could default to 'A'.
Good idea. I am already doing that in the reference database, but it does seem reasonable to mention it in the spec. 
    3. Where will the values be obtained for the various "Multi-select" and "Enum" field types? It is possible to put them into the CSV file using the same technique as you used for the takes (prefacing field names with tk) but that complicates things. This is where XML is much more flexible.
 See comment above.
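
For what it's worth, the regex matching and imperial-to-metric conversion mentioned above might look roughly like the following outside of filemaker. This is only a Python sketch under assumptions: the accepted unit spellings, the choice of metres as the target unit, and the function name are all hypothetical, and anything that does not match is left as free-form text (returned as None here).

```python
import re

# Metric entries such as "3.5m", "350 cm", "35mm".
METRIC = re.compile(r"^\s*(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mm|cm|m)\s*$", re.IGNORECASE)
METRIC_FACTORS = {"mm": 0.001, "cm": 0.01, "m": 1.0}

# Imperial entries such as "12ft 6in", "12' 6\"", "5 feet", "9 inches".
FEET_INCHES = re.compile(
    r"""
    ^\s*
    (?:(?P<feet>\d+(?:\.\d+)?)\s*(?:ft|feet|'))?      # optional feet part
    \s*
    (?:(?P<inches>\d+(?:\.\d+)?)\s*(?:in|inches|"))?  # optional inches part
    \s*$
    """,
    re.IGNORECASE | re.VERBOSE,
)


def to_metres(text):
    """Return the length in metres, or None if the text is not a recognized length."""
    m = METRIC.match(text)
    if m:
        return float(m.group("value")) * METRIC_FACTORS[m.group("unit").lower()]
    m = FEET_INCHES.match(text)
    # Both parts are optional in the pattern, so require at least one of them.
    if m and (m.group("feet") or m.group("inches")):
        total_inches = float(m.group("feet") or 0) * 12 + float(m.group("inches") or 0)
        return total_inches * 0.0254  # one inch is exactly 0.0254 m
    return None


for entry in ["12ft 6in", "350 cm", "close focus"]:
    print(entry, "->", to_metres(entry))
```

A value with no unit at all (for example "12") is deliberately not converted, since there is no way to tell which system the data-wrangler meant.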

Thanks a lot for these comments, I'll see if I can work on them in the next few weeks. If you are up for it, I might do this as a new version of the document, and I can share it directly with you (and anybody else who is interested) so you can annotate directly on the document. Do let me know if that is something you would be interested in (and anybody else, please shoot me a mail directly).

Thanks again...

Sam.


