I used the default TPCx-BB data generation tool to generate 1 GB of raw data, and found that the product_reviews table contains dirty data.
TPCx-BB uses "|" as the field delimiter, but in the product_reviews table some records also contain the "|" symbol inside the pr_review_content column.
For example, one record from the product_reviews table in the 1 GB dataset is as follows:
22659|2005-05-09|22:38:43|3|15236|26449|29296|This product does the job if you like hard bed don't buy it.But if you do) Once you download path for it. It is totally disgusting. Besides the fact that this one is defective too, but over all don't think this is such great bed for ten days while on the market.||Exodus||
In this record, the value in the pr_review_content column itself contains the "|" symbol, so a parser that splits on "|" sees more fields than the table has columns.
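
A quick way to check how many rows are affected is to count the delimiters on each line. The following is only a rough Python sketch. It assumes product_reviews has 8 columns with pr_review_content as the last one (please verify this against your schema). The function names and the shortened sample line are made up for illustration and are not part of TPCx-BB.

```python
# Assumption: product_reviews has 8 columns and pr_review_content is the last.
EXPECTED_FIELDS = 8
DELIM = "|"


def has_extra_delimiters(line, expected=EXPECTED_FIELDS, delim=DELIM):
    """Return True if the line has more delimiters than the schema allows."""
    return line.rstrip("\n").count(delim) > expected - 1


def split_review(line, expected=EXPECTED_FIELDS, delim=DELIM):
    """Split on the first (expected - 1) delimiters only, so any further
    "|" characters stay inside the last field (the review text)."""
    return line.rstrip("\n").split(delim, expected - 1)


# Shortened version of the record shown above (review text abbreviated).
sample = "22659|2005-05-09|22:38:43|3|15236|26449|29296|It is totally disgusting.||Exodus||"
clean = "1|2005-05-09|22:38:43|3|1|2|3|Fine product."

fields = split_review(sample)
```

Running has_extra_delimiters over every line of the generated file gives the number of affected records. Limiting the number of splits only works as a workaround if the free-text column really is the last one. If it is not, the extra "|" characters cannot be assigned to a column reliably and the generator output itself needs to be fixed.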