Typically, visualizations expect data in some specific form. For example,a pie chart may expect data as two columns: a text label and a numericvalue. The data within the data source may not exactly match this structure.For example the data source may havemore than two columns, or the order of the columns may not match theorder expected by the pie chart.
The syntax of the query language is similar to SQL. Developers familiarwith SQL should be able to quickly learn and use this query language.There are manySQL tutorialsavailable on the Web.There are some differences between this query language andSQL which are described in the syntax section.
Note that data sources are not required to implement the query language, or if they do, to implement all features of the language. Unless you have reason to believe otherwise, you should not depend on a data source to implement all features of this language.
You can attach a query string to a data source request in two ways: by setting the query string from within JavaScript code, or by setting the query string as a parameter in the data source URL. If your request does not include a query string, the default behavior for a data source is to return all rows and columns using its default row/column order and formatting. You can change that by including a query string in your request to a data source.
The query string can be added to the data source URL using the tq parameter.Setting the query in the URL parameter instead of in JavaScript allows you toeasily use visualizations written by other developers, and still be able tocustomize the query.
The query string must be properly encoded as a URL parameter.You can encode a URL using the JavaScript encodeURIComponent function,or you can encode it by hand, using the encoding tool at the end of this section.
Consider the following query string for a Google Spreadsheet. (Note that column IDs in spreadsheets are always letters; the column heading text shown in the published spreadsheet are labels, not IDs. You must use the ID, not the label, in your query string.)
The Google Visualization API Query Language syntax is designed to be similarto SQL syntax. However, it is a subset of SQL, with a few features ofits own that you'll need to learn. If you're familiar with SQL, it shouldn't betoo hard to learn.
The syntax of the query language is composed of the following clauses. Each clause starts with one or two keywords. All clauses are optional. Clauses are separated by spaces. The order of the clauses must be as follows:
The select clause is used to specify the columns to return andtheir order.If this clause is not specified, or if select * is used,all of the columns of the data source table are returned, in their original order.Columns are referenced by the identifiers (not by labels). For example, in a GoogleSpreadsheet, column identifiers are the one or two character column letter (A, B,C, ...).
The where clause also supports some more complex string comparison operators. These operators take two strings as arguments; any non-string arguments (for example, dates or numbers) will be converted to strings before comparison. String matching is case sensitive (you can use upper() or lower() scalar functions to work around that).
The group by clause is used to aggregate values across rows.A single row is created for each distinct combination of values in thegroup-by clause.The data is automatically sorted by the groupingcolumns, unless otherwise specified by an order by clause.
The pivot clause is used to transform distinct values in columns into new columns. For example, a pivot by a column 'year' would produce a table with a column for each distinct year that appears in the original table. This could be useful if, for instance, a line chart visualization draws each column as a separate line. If you want to draw a separate line for each year, and 'year' is one of the columns of the original table, then a good option would be to use a pivot operation to do the necessary data transformation.
Since multiple rows may contain the same values for the pivot columns, pivot implies aggregation. Note that when using pivot without using group by, the result table will contain exactly one row. For instance, running the following query on the example table:
Using pivot together with group by can be even more useful, since it creates a table where each cell contains the result of the aggregation for the relevant row and the relevant column. For example, running the following query on the example table:
You can also use more than one column in the pivot clause. In such a case the columns of the response table are composed of all the unique combinations of values in the columns that exist in the original table. For instance, running the following query on the example table:
You can combine multiple aggregations in the select clause, multiple columns in the group by clause and multiple columns in the pivot clause. Internally, aggregation is performed by the concatenation of the columns in the group by and pivot clauses.
Columns specified in the pivot clause may not appear in the select, group by or order by clauses. When pivot is used, the order by clause cannot contain any aggregation columns. The reason for that is that for each aggregation specified in the select clause, many columns are generated in the result table. However, you can format aggregation columns when pivot is used. The result of such a format is that all of the new columns relevant to the specific aggregation, that are generated by the pivot operation, are formatted by the specified pattern. In the example above, adding format sum(salary) "some_format_string" will affect the following columns: Eng sum-salary, Marketing sum-salary and Sales sum-salary.
You can label aggregation columns. If no label is specified in the label clause, the label of a column that is produced as a result of pivoting is composed of the list of values in the pivot columns, the aggregation type (min, max, sum, ...) and the aggregated column's label. For example "Eng,12:00:00 sum Salary". If only one aggregation was specified in the select clause then the aggregation part is removed from the label, and only the list of values in the pivot columns is kept. For example "Eng,12:00:00". When a label clause specifies a label for an aggregation column, then the label requested is appended to the list of values, both when there is only one aggregation in the select clause, and when there is more than one. For example, label sum(salary) "sumsal" will result in the column labels "Eng,12:00:00 sumsal", "Eng,13:00:00 sumsal", etc.
The format clause is used to specify a formatted value for cells inone or more columns. The returned data should include both an actual value and aformatted value for each cell in a formatted column. Many visualizations use theunformatted value for calculations, but the formatted value for display. The patternsthat you specify in this clause are usually returned in thepattern propertyof the corresponding columns.
There are several kinds of operators and functions that let you manipulate or aggregate data in a single column, or compare or combine data across columns. Examples include sum() (to add all values in a column), max (to find the largest value in a column), and + (to add the values of two columns together in the same row).
Aggregation functions are passed a single column identifier, and perform an action across all values in each group (groups are specified by group by or pivot clauses, or all rows if those clauses are not used).
Scalar functions operate over zero or more parameters to produce another value. Scalar functions can be passed any expression that evaluates to the parameter of the appropriate type. Note that these types are the types defined in the Literals section of this document, which might be slightly different than the similarly named JavaScript objects.
Otherwise, your identifier does not need to be quoted. (Note that not all keywords defined by the syntax are reserved words; so, for example, you can use "max" as an identifier, without having to back-quote it.)
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Did you know that Google Sheets QUERY function is considered to be the most powerful one in spreadsheets? Its peculiar syntax favours tens of different operations. Let's try and break its parts down to learn them once and for all, shall we?
Note. Only one small reminder here established by Google: each column should contain one type of data: textual, or numeric, or boolean. If there are different types, QUERY will work with the one that occurs the most. Other types will be considered as empty cells. Strange, but keep that in mind.
Now let's dig deeper into the clauses and whatever they do.
Clauses used in Google Sheets QUERY formulasQuery language consists of 10 clauses. They may frighten at first glance, especially if you're not familiar with SQL. But I promise, once you get to know them, you will get a powerful spreadsheet weapon at your disposal.
Tip. If you're upset or worried about having to deal with such a huge number of operators again, we feel you. Our Filter and Extract Data will find all matches and build QUERY formulas in Google Sheets for you if necessary.
When I do a query with the following formula =query(Data!A2:F235,"Select D where A contains 'A8'")
it lists everything in one cell. If I take out the where portion it lists all of column D in a column, but when I add the where statement it just puts all matches into one cell with spaces in between.
A8 is a dropdown that I'm referencing. If I put ,0 before the ) it give #N/A, if I put a number there, it will give the number of matches equal to the number I put in, but they're still in one cell.