Dbt Json Extract


Julia Dodoo

Aug 4, 2024, 10:12:52 PM
to gualaroucons
You may have source data containing JSON-encoded strings that you do not necessarily want to deserialize into a table in Athena. In this case, you can still run SQL operations on this data, using the JSON functions available in Presto.

To extract the name and projects properties from the JSON string, use the json_extract function as in the following example. The json_extract function takes the column containing the JSON string, and searches it using a JSONPath-like expression with the dot . notation.
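The dot-notation lookup described above can be imitated outside the database. The following is a minimal Python sketch, not Presto's implementation: the function name extract and the sample row are hypothetical, and only plain "$.key.key" paths are handled.

```python
import json


def extract(json_text, path):
    """Follow a '$.key.key' path through a JSON string.

    Returns None when a key along the path is missing. Array subscripts
    and wildcards are deliberately left out of this sketch.
    """
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    value = json.loads(json_text)
    for key in (part for part in path[1:].split(".") if part):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


# Hypothetical row holding a JSON-encoded string column.
row = '{"name": "Ada", "org": "eng", "projects": [{"name": "p1", "done": false}]}'
name = extract(row, "$.name")
projects = extract(row, "$.projects")
```

Here name holds the string stored under the name key and projects holds the decoded list, which mirrors pulling two properties out of one JSON column.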


The functions in this section perform search or comparison operations on JSON values to extract data from them, report whether data exists at a location within them, or report the path to data within them. The MEMBER OF() operator is also documented herein.


A candidate scalar is contained in a target scalar if and only if they are comparable and are equal. Two scalar values are comparable if they have the same JSON_TYPE() types, with the exception that values of types INTEGER and DECIMAL are also comparable to each other.


A candidate object is contained in a target object if and only if for each key in the candidate there is a key with the same name in the target and the value associated with the candidate key is contained in the value associated with the target key.
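The scalar and object rules above can be written out as a short recursive check. This is a Python sketch under stated assumptions, not MySQL's code: the names json_type, comparable, and contains are hypothetical, Decimal stands in for the DECIMAL type, and arrays are rejected because their rules are not quoted in this post.

```python
from decimal import Decimal


def json_type(value):
    """Rough stand-in for JSON_TYPE() on already-decoded Python values."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, Decimal):
        return "DECIMAL"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, dict):
        return "OBJECT"
    if isinstance(value, list):
        return "ARRAY"
    raise TypeError(f"not a JSON value: {value!r}")


def comparable(a, b):
    """Same type, except that INTEGER and DECIMAL may be compared."""
    ta, tb = json_type(a), json_type(b)
    return ta == tb or {ta, tb} == {"INTEGER", "DECIMAL"}


def contains(target, candidate):
    """Apply the scalar and object containment rules quoted above."""
    if isinstance(target, list) or isinstance(candidate, list):
        raise NotImplementedError("array rules are outside this sketch")
    if isinstance(target, dict) or isinstance(candidate, dict):
        if not (isinstance(target, dict) and isinstance(candidate, dict)):
            return False  # assumption: an object and a scalar never contain each other
        return all(
            key in target and contains(target[key], value)
            for key, value in candidate.items()
        )
    return comparable(target, candidate) and target == candidate
```

For example, contains({"a": 1, "b": {"c": 2}}, {"b": {"c": 2}}) is true because every candidate key exists in the target with a contained value, while contains(1, True) is false because the two types are not comparable.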


Returns 0 or 1 to indicate whether a JSON document contains data at a given path or paths. Returns NULL if any argument is NULL. An error occurs if the json_doc argument is not a valid JSON document, any path argument is not a valid path expression, or one_or_all is not 'one' or 'all'.
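The 'one' versus 'all' behaviour described above amounts to any() versus all() over the individual path checks. A minimal Python sketch follows; path_exists and contains_path are hypothetical names, None stands in for SQL NULL, and only simple "$.key" and "[n]" steps are parsed.

```python
import re

# One path step: either .key or [index]. Wildcards are not supported here.
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def path_exists(doc, path):
    """Report whether a simple path such as '$.a.b[0]' locates a value."""
    if not path.startswith("$"):
        raise ValueError(f"invalid path: {path!r}")
    current, rest, pos = doc, path[1:], 0
    while pos < len(rest):
        step = _STEP.match(rest, pos)
        if step is None:
            raise ValueError(f"invalid path: {path!r}")
        key, index = step.groups()
        if key is not None:
            if not isinstance(current, dict) or key not in current:
                return False
            current = current[key]
        else:
            if not isinstance(current, list) or int(index) >= len(current):
                return False
            current = current[int(index)]
        pos = step.end()
    return True


def contains_path(doc, one_or_all, *paths):
    """Return 1 or 0, or None when any argument is None."""
    if doc is None or one_or_all is None or any(p is None for p in paths):
        return None
    if one_or_all not in ("one", "all"):
        raise ValueError("one_or_all must be 'one' or 'all'")
    hits = [path_exists(doc, p) for p in paths]
    return int(any(hits) if one_or_all == "one" else all(hits))
```

With the document {"a": 1, "b": {"c": [10, 20]}}, asking for '$.a' and '$.e' gives 1 under 'one' and 0 under 'all', since only the first path exists.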


Returns data from a JSON document, selected from the parts of the document matched by the path arguments. Returns NULL if any argument is NULL or no paths locate a value in the document. An error occurs if the json_doc argument is not a valid JSON document or any path argument is not a valid path expression.


The return value consists of all values matched by the path arguments. If it is possible that those arguments could return multiple values, the matched values are autowrapped as an array, in the order corresponding to the paths that produced them. Otherwise, the return value is the single matched value.
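The autowrapping rule can be sketched in Python as follows. This is an illustration, not the server's logic: resolve and json_extract are hypothetical names, wildcards are not supported, and the sketch treats "more than one path argument" as the only case in which multiple values are possible.

```python
import re

# One path step: either .key or [index].
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\d+)\]")


def resolve(doc, path):
    """Return (found, value) for a simple path such as '$.a[1]'."""
    if not path.startswith("$"):
        raise ValueError(f"invalid path: {path!r}")
    current, rest, pos = doc, path[1:], 0
    while pos < len(rest):
        step = _STEP.match(rest, pos)
        if step is None:
            raise ValueError(f"invalid path: {path!r}")
        key, index = step.groups()
        if key is not None:
            if not isinstance(current, dict) or key not in current:
                return False, None
            current = current[key]
        else:
            if not isinstance(current, list) or int(index) >= len(current):
                return False, None
            current = current[int(index)]
        pos = step.end()
    return True, current


def json_extract(doc, *paths):
    """Collect matches in path order; wrap them when several paths are given."""
    matches = [value for found, value in (resolve(doc, p) for p in paths) if found]
    if not matches:
        return None  # no path located a value
    if len(paths) > 1:
        return matches  # autowrapped, ordered by the paths that produced them
    return matches[0]
```

Called on [10, 20, [30, 40]], a single path '$[1]' yields the bare value 20, while the two paths '$[1]' and '$[0]' yield the list [20, 10] in path order.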


MySQL supports the -> operator as shorthand for this function as used with 2 arguments where the left hand side is a JSON column identifier (not an expression) and the right hand side is the JSON path to be matched within the column.


The -> operator serves as an alias for the JSON_EXTRACT() function when used with two arguments, a column identifier on the left and a JSON path (a string literal) on the right that is evaluated against the JSON document (the column value). You can use such expressions in place of column references wherever they occur in SQL statements.


This is an improved, unquoting extraction operator. Whereas the -> operator simply extracts a value, the ->> operator in addition unquotes the extracted result. In other words, given a JSON column value column and a path expression path (a string literal), the following three expressions return the same value:
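The difference between plain extraction and unquoting extraction can be illustrated in Python. The helper names arrow, unquote, and double_arrow are hypothetical, and the sketch only looks up a single top-level key rather than a full path.

```python
import json


def arrow(column_value, key):
    """Mimic plain extraction: the result is still JSON text, so strings keep quotes."""
    document = json.loads(column_value)
    if key not in document:
        return None
    return json.dumps(document[key])


def unquote(json_text):
    """Strip the JSON quoting from a string value; leave other values unchanged."""
    value = json.loads(json_text)
    return value if isinstance(value, str) else json_text


def double_arrow(column_value, key):
    """Mimic unquoting extraction: extract first, then unquote the result."""
    extracted = arrow(column_value, key)
    return None if extracted is None else unquote(extracted)


column = '{"name": "Ada", "age": 36}'
quoted = arrow(column, "name")           # JSON text, including the double quotes
unquoted = double_arrow(column, "name")  # the bare string
```

The quoted result is the four-character-plus-quotes JSON text "Ada" with its double quotes, and the unquoted result is the bare string Ada. Numbers come back identically from both helpers because there is nothing to unquote.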


Returns the keys from the top-level value of a JSON object as a JSON array, or, if a path argument is given, the top-level keys from the selected path. Returns NULL if any argument is NULL, the json_doc argument is not an object, or path, if given, does not locate an object. An error occurs if the json_doc argument is not a valid JSON document or the path argument is not a valid path expression or contains a * or ** wildcard.


Compares two JSON documents. Returns true (1) if the two documents have any key-value pairs or array elements in common. If both arguments are scalars, the function performs a simple equality test. If either argument is NULL, the function returns NULL.


This function serves as counterpart to JSON_CONTAINS(), which requires all elements of the array searched for to be present in the array searched in. Thus, JSON_CONTAINS() performs an AND operation on search keys, while JSON_OVERLAPS() performs an OR operation.
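For arrays of scalars, the AND versus OR contrast reduces to all() versus any(). The sketch below uses the hypothetical names array_contains and array_overlaps and ignores objects and nested arrays, so it illustrates the contrast without reproducing either function in full.

```python
def array_contains(target, candidate):
    """AND semantics: every candidate element must appear in the target."""
    return all(item in target for item in candidate)


def array_overlaps(left, right):
    """OR semantics: a single shared element is enough."""
    return any(item in right for item in left)


searched_in = [1, 3, 5, 7]
both_present = array_contains(searched_in, [1, 3])   # True: 1 and 3 are both present
one_missing = array_contains(searched_in, [1, 2])    # False: 2 is missing
shares_one = array_overlaps(searched_in, [1, 2])     # True: 1 is shared
shares_none = array_overlaps(searched_in, [2, 4])    # False: nothing is shared
```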


Returns the path to the given string within a JSON document. Returns NULL if any of the json_doc, search_str, or path arguments are NULL; no path exists within the document; or search_str is not found. An error occurs if the json_doc argument is not a valid JSON document, any path argument is not a valid path expression, one_or_all is not 'one' or 'all', or escape_char is not a constant expression.


'all': The search returns all matching path strings such that no duplicate paths are included. If there are multiple strings, they are autowrapped as an array. The order of the array elements is undefined.


Within the search_str search string argument, the % and _ characters work as for the LIKE operator: % matches any number of characters (including zero characters), and _ matches exactly one character.


To specify a literal % or _ character in the search string, precede it by the escape character. The default is \ if the escape_char argument is missing or NULL. Otherwise, escape_char must be a constant that is empty or one character.


If not specified by a RETURNING clause, the JSON_VALUE() function's return type is VARCHAR(512). When no character set is specified for the return type, JSON_VALUE() uses utf8mb4 with the binary collation, which is case-sensitive; if utf8mb4 is specified as the character set for the result, the server uses the default collation for this character set, which is not case-sensitive.


JSON_VALUE() simplifies creating indexes on JSON columns by making it unnecessary in many cases to create a generated column and then an index on the generated column. You can do this when creating a table t1 that has a JSON column by creating an index on an expression that uses JSON_VALUE() operating on that column (with a path that matches a value in that column), as shown here:


Returns true (1) if value is an element of json_array, otherwise returns false (0). value must be a scalar or a JSON document; if it is a scalar, the operator attempts to treat it as an element of a JSON array. If value or json_array is NULL, the function returns NULL.


Any JSON objects used as values to be tested or which appear in the target array must be coerced to the correct type using CAST(... AS JSON) or JSON_OBJECT(). In addition, a target array containing JSON objects must itself be cast using JSON_ARRAY. This is demonstrated in the following sequence of statements:


I've not done anything like this before. I would like to extract this information from SharePoint and potentially put it into another SharePoint list or send it via email, as I've had a request for a summary of a team's holiday.


Unfortunately, that screenshot doesn't really tell us anything other than that the Parse JSON action failed. For Troubleshooting purposes please add a Compose step just above the Parse JSON inside the loop. Add the content that you are feeding into the Parse Json. That way we can see what the input of the parse JSON is. That should help us figure out why that particular loop failed.


Your screenshot shows a successful loop. Could you click on the next failed link and show us the output from a failed loop? I suspect you are getting some loops where the Content itself is a null record. You may have to add a Filter array action before starting the loop to remove those records.


By the screen captures you shared, it seems that the "approval JSON" column has different fields and values, so the structure is not always the same. For example, in a row there's a field called "comment", but in another one, this field is not provided.
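The two problems raised in this thread (null records reaching the parser, and fields such as "comment" that are present in some rows only) have the same shape in any language. Purely as an illustration, and not as a Power Automate flow, here is how that could look in Python: parse_approvals is a hypothetical name, and "approver" is an invented field used alongside the "comment" field mentioned above.

```python
import json


def parse_approvals(raw_items):
    """Skip empty records, then read each field without assuming it exists."""
    rows = []
    for raw in raw_items:
        if not raw:  # None or empty string: filter out before parsing
            continue
        record = json.loads(raw)
        rows.append(
            {
                "approver": record.get("approver"),  # hypothetical field
                "comment": record.get("comment"),    # None when the row omits it
            }
        )
    return rows


sample = [
    '{"approver": "A", "comment": "ok"}',
    None,
    '{"approver": "B"}',
]
parsed = parse_approvals(sample)
```

The null entry is dropped before parsing, and the row without a comment yields None for that field instead of failing.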


Please, apply the solution by @Pstork1 to all the fields in the schema. It seems that some fields might have no value, and the Parse JSON action expects all of them to be informed. It should be something like this:


I'm trying to extract the value of the version field in package.json from the build.sh file. Is there a way to do this? It looks like I can use the snippet below to do the extraction, but node may not be available globally where I execute build.sh, so I'm looking for a generic way to extract the value from package.json into build.sh.
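One option that avoids node is a few lines of Python, assuming a Python 3 interpreter is available on the build machine, which the question does not confirm. The function name read_package_version is hypothetical.

```python
import json


def read_package_version(package_json_path="package.json"):
    """Return the top-level "version" field of a package.json file."""
    with open(package_json_path, encoding="utf-8") as handle:
        return json.load(handle)["version"]
```

A build script could call this helper and capture whatever it prints into a shell variable. A missing file or a missing version key raises an exception, so the build fails loudly instead of continuing with an empty version.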


If I understand where you are heading to, you are trying to version your artifact (Docker image in your case). Are you planning to manually bump the version (in package.json) every time when you have a checkin?


Instead, I would suggest using an automated solution like the Nebula Release Plugin (in the Java world); I don't know whether anything similar exists in the Node.js world. If none of this makes any sense to you, let me know and I will delete this answer :)


Now I want to get access to the fields in the incoming field so that I can search the data later with R. For this reason, I need something like: extract pairdelim="," kvdelim=":", but I have absolutely no idea how I can do that.


So, due to double quotes in the value of the incoming field, the default field extraction is not capturing the whole string. In this case, you'd have to setup a custom field extraction to do that. Give this a try


Does the field incoming in your event contain the full JSON string that we see in the example? If yes, then use the spath option as suggested by @sundareshr below. If not, that needs to be fixed (the field extraction needs to be set to capture the full JSON string) before using spath.


Hi, I have a nested JSON file imported into Excel with Power Query. The values were successfully transformed into a table, but the third column, for example, shows a [Record] value. I want each row to show the content of the JSON instead of [Record]. I don't want to transform it into tables again; I want the value of the record. Can anyone help me accomplish that?


wide_number_mode: Optional mandatory-named argument, which defines what happens with a number that cannot be represented as a FLOAT64 without loss of precision. This argument accepts one of the two case-sensitive values:
