I’ve seen that we can use “case classes” to set the schema instead of inferring it, but since we need to read from a few collections, that might get a bit messy.
Hi Isart,
It’s been a while since you posted this question; have you found a solution yet?
Specifying the schema explicitly would be preferable if you know the collections’ schemas, particularly since you said the documents look the same.
For example, to specify the schema explicitly with a case class:

    case class MyDocument(name: String, age: Int)

    MongoSpark.load[MyDocument](sparkSession).printSchema()
See also the Spark SQL examples in the MongoDB Spark Connector documentation.
Alternatively, if you stick with schema inference, you can specify a different sample size via the read config; see the ReadConfig sampleSize option.
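As a rough sketch (assuming the MongoDB Spark Connector 2.x Scala API; the database, collection name, and sample size below are placeholders), overriding sampleSize looks something like this:

```scala
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.ReadConfig

// Build a ReadConfig from the session defaults, overriding sampleSize
// (the number of documents sampled when inferring the schema).
// "myCollection" and "10000" are placeholder values.
val readConfig = ReadConfig(
  Map("collection" -> "myCollection", "sampleSize" -> "10000"),
  Some(ReadConfig(sparkSession))
)

MongoSpark.load(sparkSession, readConfig).printSchema()
```

A larger sample size makes the inferred schema more likely to cover rarely-populated fields, at the cost of a slower initial scan.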
Regards,
Wan.
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/