A Geosoft database stores 3D spatial data in a form that allows for both very-large volume data storage and very efficient processing. Geosoft databases are stored on a file system in a file that has extension .gdb.
The Geosoft database was first designed in 1992 and has evolved to become a de facto standard for airborne and ground geophysical data, geochemical data, and drillhole data of all kinds. All Geosoft programs and third-party applications that use the Geosoft API to read databases are able to work with databases created by any version of Geosoft since 1992, though database features and capabilities that are newer than the reading program will not be available to older programs. This cross-version stability of data has been key to the emergence of Geosoft databases as an industry standard, especially for airborne geophysical surveys.
Fundamentally, Geosoft databases store located data which has either an (x, y) location for 2D data, or an (x, y, z) location for 3D data such that each location may have any number of fields of associated information. Information fields are referred to as 'channels' in the context of a Geosoft Database, and channels may contain numeric data, strings or arrays of numeric data to store things like time-series. For example an airborne magnetic survey may have (x, y, elevation, altimeter, mag) data, and a ground geochemical survey may contain (x, y, elevation, Cu, Au, Hg).
Data in a database is grouped into the concept of 'lines' of data such that each 'line' contains the data from a single airborne survey line, a single ground survey line, a marine ship track, a set of data from a single drill hole, or from any other grouping construct that is suitable for the stored data. Geosoft's desktop application, Oasis montaj, displays a single line of data in a spreadsheet-like form to the user, allowing a user to change lines and change which channels are displayed and the order channels are displayed. Simple databases might place all data into a single line. All lines in a Geosoft database share the channel definitions, of course with data different for each line.
In this example, the first line is simply a comment line that identified the spatial coordinate system for the located data.The second line identified the names of the data columns, and data follows starting in line 3. Note that columns of information after the first line are comma-delimited, and each line has 4 columns of information.
In the database line plot above we see that the data in Line 0 (L0E) would be better organized as separate survey lines rather than a single contuous lines. In this case the lines are horizontal, 200 m apart with sampled every 50 m along each line:
Lets modify the script by adding some processing that will use a Geosoft method to split a line into separate lines based on a change in line direction. To split a line into sections we will use a processing function from the GXAPI, which exposes the full Geosoft API.
We will use the geosoft.gxapi.GXDU.split_line_xy2 function (GXDU contains many Database Utility functions). This is part of the low-level geosoft.gxapi so a bit more care is required. Database functions often require locks on database symbols (handles to lines, channels and other database objects), which is the case for the Line, _xch and _ych arguments of split_line_xy2. Because Geosoft databases allow for concurrent access by multiple applications, symbols must be locked and unlocked as they are used. The geosoft.gxpy pythonic api will do this for you, but the geosoft.gxapi requires that you directly manage locks. Terminating a script removes any locks that may have remained on any symbols.
Geosoft databases and data models are designed around the concept of data vector arrays, which support high performance data processing. Each channel in a line holds a single 1-dimensional or 2-dimentional array of data. Python's standard library for working with data arrays is numpy, and this is a standard part of any scientifically-oriented Python environment. The Geosoft_gdb class provides for reading and writing data directly to/from numpy arrays as well as Geosoft VV vector arrays.
This example shows a simple use of numpy and the next section shows the same exercise using a Geosoft VV. In the open database above we see that the mag data has values around 5000 nT, which we will subtract from the data.
This example does the same thing as the previous example, but uses the Geosoft vector processing functions from the Geosoft VV library. Geosoft VV functions will deliver even higher performance than numpy, and while you may be limited by the capabilities of the Geosoft API, there are many functions that perform geoscience-specific operations on data (see gxapi.GXVVU). Note that data in a Geosoft VV can also be exposed as a numpy array to provide numpy access when needed.
In this exercise we calculate the distance along each line from the starting point of each line. The Pythagorean expression is simple and can be implemented in a single line of numpy code. This example reads multiple channels from the database and uses numpy slicing to work with individual channel columns from a 2D numpy array.
b1e95dc632