Consider a retail scenario where the (K, V) input array contains (product name, price) pairs as shown below. A discount of 500 needs to be subtracted from the value of each key.

Implement the above requirement using Spark.

Input
{(Jeans,2000),(Smart phone,10000),(Watch,3000)}

Expected output
{(Jeans,1500),(Smart phone,9500),(Watch,2500)}
I have tried the code below, but I am getting errors. Please help me fix them.

import java.util.Arrays;
import java.util.Iterator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;
public class PairRDDAgg {
public static void main(String[] args) {
SparkConf conf = new SparkConf().setAppName("Line_Count").setMaster("local");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> input = sc.textFile("C:/Users/xxxx/Documents/retail.txt");
JavaPairRDD<String, Integer> counts = input.mapValues(new Function() {
private static final long serialVersionUID = 1L;
public Integer call(Integer i) {
return (i-500);
}
});
System.out.println(counts.collect());
sc.close();
}
}
You can try this:
scala> val dataset = spark.createDataset(Seq(("Jeans",2000),("Smart phone",10000),("Watch",3000)))
dataset: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]
scala> dataset.map ( x => (x._1, x._2 - 500) ).show
+-----------+----+
| _1| _2|
+-----------+----+
| Jeans|1500|
|Smart phone|9500|
| Watch|2500|
+-----------+----+
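If you want to stay with the Java RDD API, note that the original code calls `mapValues` on a `JavaRDD<String>` (plain text lines), but `mapValues` is only available on a `JavaPairRDD`, so you must first parse each line into a key-value pair with `mapToPair`. A minimal sketch of that approach, assuming the file would contain comma-separated `name,price` lines (here `parallelize` with the sample data stands in for `textFile`, and the class name `PairRDDDiscount` is just illustrative):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class PairRDDDiscount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("Discount").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Sample data in place of sc.textFile("..."); each line is "name,price".
        JavaRDD<String> input = sc.parallelize(
                Arrays.asList("Jeans,2000", "Smart phone,10000", "Watch,3000"));

        // mapToPair converts the plain lines into a JavaPairRDD;
        // only then does mapValues become available.
        JavaPairRDD<String, Integer> prices = input.mapToPair(line -> {
            String[] parts = line.split(",");
            return new Tuple2<>(parts[0], Integer.parseInt(parts[1]));
        });

        // Subtract the 500 discount from every value, keys untouched.
        JavaPairRDD<String, Integer> discounted = prices.mapValues(p -> p - 500);

        System.out.println(discounted.collect());
        sc.close();
    }
}
```

Using lambdas also avoids the raw anonymous `Function` class (and its `serialVersionUID` boilerplate) from the original attempt.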