Updates to loading CSV data containing newlines within strings; Improved loading performance

Michael Manoochehri

Dec 13, 2012, 2:37:52 PM
to bigquery...@googlegroups.com
Hello BigQuery developers:

In line with our documentation, BigQuery now enforces by default that source CSV files must not contain newlines within string values, such as this:

"This string contains
a newline"

This default setting lets us support much larger source files and improves data loading performance tremendously. If your source CSV files do contain newlines within string values, you can still load them into BigQuery. We recently added a load configuration parameter, allowQuotedNewlines, for loading CSV data that contains newlines in strings:
https://developers.google.com/bigquery/docs/reference/v2/jobs#importing
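
If you are not sure whether a file is affected, one way to check before loading is to parse it locally and look for line breaks inside fields. The following is only a rough sketch using Python's standard csv module; it is not part of BigQuery, and the helper name has_quoted_newlines and the sample data are made up for illustration.

```python
import csv
import io


def has_quoted_newlines(csv_text: str) -> bool:
    """Return True if any parsed CSV field contains a line break."""
    # newline="" keeps embedded line breaks untouched for the csv parser.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return any("\n" in field or "\r" in field for row in reader for field in row)


# Hypothetical sample: a header, one record with a quoted newline, one plain record.
sample = 'id,comment\n1,"This string contains\na newline"\n2,plain\n'

physical_lines = len(sample.splitlines())
logical_records = len(list(csv.reader(io.StringIO(sample, newline=""))))

# When the two counts differ, at least one record spans several lines.
print(physical_lines, logical_records, has_quoted_newlines(sample))
```

A file for which this returns True would need the parameter described above; a file for which it returns False should load under the default setting.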

To load CSV source files that contain newlines in strings, set allowQuotedNewlines to true in the load configuration. Example:

{
"configuration": {
 "load": {
  "allowQuotedNewlines": true,
  "schema": {
   "fields": [
    {
     "name": "string_with_newlines",
     "type": "STRING"
    }
   ]
  },
  "sourceUris": [
   "gs://mybucket/my_source.csv"
  ],
  "destinationTable": {
   "datasetId": "my_dataset",
   "tableId": "my_table"
  }
 }
}
}
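
If you build the job body in code rather than by hand, the same configuration can be assembled as a plain dictionary and serialized to JSON. This is a minimal sketch, assuming you submit the resulting body through whatever client you already use; the function name build_load_job is hypothetical, and the bucket, dataset, and table names are the placeholders from the example above.

```python
import json


def build_load_job(source_uri, dataset_id, table_id, fields,
                   allow_quoted_newlines=False):
    """Build a load job body shaped like the JSON example above."""
    return {
        "configuration": {
            "load": {
                # Set to True only when string values may contain newlines.
                "allowQuotedNewlines": allow_quoted_newlines,
                "schema": {"fields": fields},
                "sourceUris": [source_uri],
                "destinationTable": {
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
            }
        }
    }


job = build_load_job(
    "gs://mybucket/my_source.csv",
    "my_dataset",
    "my_table",
    [{"name": "string_with_newlines", "type": "STRING"}],
    allow_quoted_newlines=True,
)
print(json.dumps(job, indent=1))
```

Keeping the flag as an explicit argument that defaults to False mirrors the service default, so the slower path is only requested for files that need it.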

If you are using the BigQuery "bq" command line tool, you can specify that your CSV files contain newlines within strings using the --allow_quoted_newlines flag:
bq load --allow_quoted_newlines <destination_table> <source> <schema>
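
For scripted loads, the command line above can be assembled programmatically. The sketch below only builds and prints the argument list; it does not run bq. The helper name bq_load_argv is hypothetical, the table and bucket names are the placeholders from the JSON example, and the inline name:type schema string is an assumption about how you pass your schema.

```python
import shlex


def bq_load_argv(destination_table, source, schema, allow_quoted_newlines=False):
    """Return the argument list for a bq load invocation."""
    argv = ["bq", "load"]
    if allow_quoted_newlines:
        # Only needed when string values in the CSV contain newlines.
        argv.append("--allow_quoted_newlines")
    argv += [destination_table, source, schema]
    return argv


argv = bq_load_argv(
    "my_dataset.my_table",
    "gs://mybucket/my_source.csv",
    "string_with_newlines:STRING",
    allow_quoted_newlines=True,
)
print(shlex.join(argv))
```

Passing the list to subprocess.run instead of a shell string avoids quoting problems if a path or schema ever contains spaces.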

For additional information about the new size limits, please see our quota policy:
https://developers.google.com/bigquery/docs/quota-policy#import


Thank you!
The BigQuery Team
