I am trying to save my DataFrame as a Parquet file with one partition per day, so I am partitioning on the date column. However, I also want to write one file per partition, so I am using repartition($"date"), but I keep getting errors:
I get the errors "cannot resolve symbol repartition" and "value $ is not a member of StringContext" when I use:
DF.repartition($"date")
.write
.mode("append")
.partitionBy("date")
.parquet("s3://file-path/")
I get the error "Type mismatch, expected: Column, actual: String" when I use:
DF.repartition("date")
.write
.mode("append")
.partitionBy("date")
.parquet("s3://file-path/")
However, this works fine without any error:
DF.write.mode("append").partitionBy("date").parquet("s3://file-path/")
Can't we use a column in repartition? What's wrong here?
To use the $ syntax in place of col(), you first need to import spark.implicits._. Here spark is an instance of a SparkSession, so the import must come after the SparkSession has been created. A simple example:
val spark = SparkSession.builder.getOrCreate()
import spark.implicits._
This import also enables other functionality, such as converting RDDs to DataFrames or Datasets with toDF() and toDS() respectively.
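Putting it together, the snippet from the question compiles once the import is in place; col() from org.apache.spark.sql.functions is an equivalent alternative if you prefer to avoid the implicits. A minimal sketch (DF and the S3 path are placeholders from the question):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._ // enables the $"colName" column syntax

// One shuffle partition per distinct date, so each output
// directory date=YYYY-MM-DD contains a single Parquet file.
DF.repartition($"date")
  .write
  .mode("append")
  .partitionBy("date")
  .parquet("s3://file-path/")

// Equivalent without the implicits import:
// DF.repartition(col("date"))
//   .write.mode("append").partitionBy("date").parquet("s3://file-path/")
```

Note that repartition("date") fails because the String overload of repartition expects a number of partitions first; to partition by an expression you must pass a Column.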