I have a list of strings:
List<String> terms = Arrays.asList("Coding is great", "Search Engines are great", "Google is a nice search engine");
How can I get the frequency of each word in the list, for example {Coding:1, Search:2, Engines:1, engine:1, ....}?
Here is my code:
Map<String, Integer> wordFreqMap = new HashMap<>();
for (String contextTerm : term.getContexTerms())
{
    String[] wordsArr = contextTerm.split(" ");
    for (String word : wordsArr)
    {
        // this value seems to be reset every time I move on to a new context term
        Integer freq = wordFreqMap.get(word);
        freq = (freq == null) ? 1 : freq + 1;
        wordFreqMap.put(word, freq);
    }
}
An idiomatic solution using Java 8 streams:
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SplitWordCount
{
    public static void main(String[] args)
    {
        List<String> terms = Arrays.asList(
            "Coding is great",
            "Search Engines are great",
            "Google is a nice search engine");

        Map<String, Integer> result = terms.parallelStream()
            // split every term into words and flatten them into one stream
            .flatMap(s -> Arrays.stream(s.split(" ")))
            // key: lowercased word, value: 1 per occurrence, summed on collision
            .collect(Collectors.toConcurrentMap(
                String::toLowerCase, w -> 1, Integer::sum));

        System.out.println(result);
    }
}
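Note that you may need to decide whether upper and lower case should count as different words. The code above converts each word to lowercase and uses that as the key of the resulting map, so "Search" and "search" are counted together. The result is: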
{coding=1, a=1, search=2, are=1, engine=1, engines=1,
is=2, google=1, great=2, nice=1}
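For comparison, here is a minimal loop-based sketch that stays close to the code in the question. It is only one possible variant, not the method from the answer above: the class name WordFrequency and the helper countWords are hypothetical names chosen for illustration, and the choices to split on runs of whitespace and to use a TreeMap (so the printed order is predictable) are assumptions rather than requirements from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class WordFrequency {

    // Hypothetical helper (not from the original post): counts words case-insensitively.
    static Map<String, Integer> countWords(List<String> terms) {
        // The map is created once, outside both loops, so counts accumulate across terms.
        // TreeMap keeps the keys sorted, which makes the printed order predictable.
        Map<String, Integer> freq = new TreeMap<>();
        for (String term : terms) {
            // Split on runs of whitespace instead of a single space.
            for (String word : term.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank term yields a single empty token
                }
                // merge stores 1 for a new key, otherwise adds 1 to the existing count.
                freq.merge(word.toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
                "Coding is great",
                "Search Engines are great",
                "Google is a nice search engine");
        // prints {a=1, are=1, coding=1, engine=1, engines=1, google=1, great=2, is=2, nice=1, search=2}
        System.out.println(countWords(terms));
    }
}
```

Map.merge replaces the get / null check / put sequence from the question with a single call. If the counts still appear to reset in the original code, it is worth checking that the map is declared outside any enclosing loop over term objects.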