Save javaRDD as XML file

Soheila S.

Is there any way in Apache Spark to save a JavaRDD of text as an XML file?

Currently I save the RDD as a plain text file using the saveAsTextFile method and then convert it to XML. I would like to find a way to create the XML file directly from the RDD.

Any tips, ideas, or pointers would be appreciated.

FaigB

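You can use the Databricks spark-xml library to read and write XML data. It works on DataFrames, so an RDD has to be converted to a DataFrame before it can be written this way. The example below reads an XML file, inferring the schema from the data, and writes selected columns back out as XML: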

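import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// sc is an existing SparkContext or JavaSparkContext.
// DataFrame is the Spark 1.x type; in Spark 2.x it is Dataset<Row>.
SQLContext sqlContext = new SQLContext(sc);

// Read books.xml, treating each <book> element as one row.
DataFrame df = sqlContext.read()
    .format("com.databricks.spark.xml")
    .option("rowTag", "book")
    .load("books.xml");

// Write two columns back out as <book> rows inside a <books> root element.
// "_id" is the id attribute of <book>; spark-xml prefixes attributes with "_" by default.
df.select("author", "_id").write()
    .format("com.databricks.spark.xml")
    .option("rootTag", "books")
    .option("rowTag", "book")
    .save("newbooks.xml");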
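To make the rootTag and rowTag options more concrete, here is a minimal plain-Java sketch of the document shape they describe: one row element per record, wrapped in a single root element. XmlRows and its methods are hypothetical names made up for this illustration; they are not part of Spark or spark-xml, and the escaping only covers &, < and > in element text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Spark or spark-xml.
final class XmlRows {

    // Escape the characters that cannot appear literally in XML element text.
    // "&" must be replaced first so the other entities are not double-escaped.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Wrap one text record in a row element, e.g. <book>...</book>.
    static String toRow(String rowTag, String text) {
        return "<" + rowTag + ">" + escape(text) + "</" + rowTag + ">";
    }

    // Wrap all row elements in a single root element, one row per line.
    static String toDocument(String rootTag, String rowTag, List<String> records) {
        StringBuilder sb = new StringBuilder();
        sb.append("<").append(rootTag).append(">\n");
        for (String record : records) {
            sb.append("  ").append(toRow(rowTag, record)).append("\n");
        }
        sb.append("</").append(rootTag).append(">");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints a <books> element containing two <book> rows.
        System.out.println(toDocument("books", "book", Arrays.asList("Tom & Jerry", "a < b")));
    }
}
```

If you want to stay with saveAsTextFile, something like rdd.map(line -> XmlRows.toRow("book", line)).saveAsTextFile(path) should write one row element per line, assuming rdd is a JavaRDD<String>. The part files would still lack the enclosing root element, so a merge step remains; the spark-xml writer above avoids that.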
