Subtract two columns containing nulls in a Spark DataFrame

warner

I am new to Spark. I have a DataFrame df:

+----------+------------+-----------+
| Column1  | Column2    | Sub       |                          
+----------+------------+-----------+
| 1        | 2          | 1         |                                         
+----------+------------+-----------+
| 4        | null       | null      |                          
+----------+------------+-----------+
| 5        | null       | null      |                          
+----------+------------+-----------+
| 6        | 8          | 2         |                          
+----------+------------+-----------+

When I subtract the two columns, the result is null wherever one of the columns is null.

df.withColumn("Sub", col("Column1") - col("Column2"))

Expected output should be:

+----------+------------+-----------+
|  Column1 | Column2    | Sub       |                          
+----------+------------+-----------+
| 1        | 2          | 1         |                                           
+----------+------------+-----------+
| 4        | null       | 4         |                          
+----------+------------+-----------+
| 5        | null       | 5         |                          
+----------+------------+-----------+
| 6        | 8          | 2         |                          
+----------+------------+-----------+

I don't want to replace the nulls in Column2 with 0; Column2 itself should stay null. Can someone help me with this?

Ramesh Maharjan

You can use the when function to treat a null as 0 inside the subtraction only:

import org.apache.spark.sql.functions._
// Substitute 0 for null only while computing Sub; Column1 and Column2 are left unchanged.
df.withColumn("Sub", when(col("Column1").isNull, lit(0)).otherwise(col("Column1")) - when(col("Column2").isNull, lit(0)).otherwise(col("Column2")))

You should get the following result (Sub is Column1 - Column2, and Column2 keeps its nulls):

+-------+-------+----+
|Column1|Column2| Sub|
+-------+-------+----+
|      1|      2|-1.0|
|      4|   null| 4.0|
|      5|   null| 5.0|
|      6|      8|-2.0|
+-------+-------+----+
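
The same idea can probably be written more compactly with coalesce from org.apache.spark.sql.functions, which returns its first non-null argument. The following is only a sketch: the object name SubtractWithNulls, the helper subtractNullAsZero, the local SparkSession settings, and the integer sample data are assumptions made for illustration, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit}

object SubtractWithNulls {

  // Hypothetical helper: computes Column1 - Column2, treating a null in
  // either column as 0 for the arithmetic only. The source columns are
  // not modified, so Column2 still shows null in the output.
  def subtractNullAsZero(df: DataFrame): DataFrame =
    df.withColumn(
      "Sub",
      coalesce(col("Column1"), lit(0)) - coalesce(col("Column2"), lit(0))
    )

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("subtract-with-nulls")
      .getOrCreate()
    import spark.implicits._

    // Sample data mirroring the question; Option[Int] lets Column2 hold nulls.
    val df = Seq[(Option[Int], Option[Int])](
      (Some(1), Some(2)),
      (Some(4), None),
      (Some(5), None),
      (Some(6), Some(8))
    ).toDF("Column1", "Column2")

    subtractNullAsZero(df).show()

    spark.stop()
  }
}
```

With integer columns, Sub comes out as an integer column rather than the doubles shown above. If a positive difference is wanted for the non-null rows, swapping the two operands (Column2 - Column1) would change the sign, but it would also make the rows with a null Column2 negative, so the choice depends on which rows matter.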
