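Java Lambda Stream Distinct() on arbitrary key?
This question already has answers here:
- Java 8 Distinct by property (23 answers)
I frequently ran into a problem with Java lambda expressions: I wanted to distinct() a stream on an arbitrary property or method of an object, but keep the object itself rather than map it to that property or method. I started creating containers as discussed here, but I did it often enough that it became annoying and produced a lot of boilerplate classes.
I threw together this Pairing class, which holds two objects of two types and lets you specify keying off the left, the right, or both objects. My question is: is there really no built-in lambda stream function to distinct() on a key supplier of some sort? That would really surprise me. If not, will this class fulfill that function reliably?
Here is how it would be called: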
BigDecimal totalShare = orders.stream().map(c -> Pairing.keyLeft(c.getCompany().getId(), c.getShare())).distinct().map(Pairing::getRightItem).reduce(BigDecimal.ZERO, (x,y) -> x.add(y));
Here is the Pairing class:
public final class Pairing<X,Y> {
    private final X item1;
    private final Y item2;
    private final KeySetup keySetup;

    private static enum KeySetup {LEFT,RIGHT,BOTH};

    private Pairing(X item1, Y item2, KeySetup keySetup) {
        this.item1 = item1;
        this.item2 = item2;
        this.keySetup = keySetup;
    }

    public X getLeftItem() {
        return item1;
    }

    public Y getRightItem() {
        return item2;
    }

    public static <X,Y> Pairing<X,Y> keyLeft(X item1, Y item2) {
        return new Pairing<X,Y>(item1, item2, KeySetup.LEFT);
    }

    public static <X,Y> Pairing<X,Y> keyRight(X item1, Y item2) {
        return new Pairing<X,Y>(item1, item2, KeySetup.RIGHT);
    }

    public static <X,Y> Pairing<X,Y> keyBoth(X item1, Y item2) {
        return new Pairing<X,Y>(item1, item2, KeySetup.BOTH);
    }

    public static <X,Y> Pairing<X,Y> forItems(X item1, Y item2) {
        return keyBoth(item1, item2);
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        if (keySetup.equals(KeySetup.LEFT) || keySetup.equals(KeySetup.BOTH)) {
            result = prime * result + ((item1 == null) ? 0 : item1.hashCode());
        }
        if (keySetup.equals(KeySetup.RIGHT) || keySetup.equals(KeySetup.BOTH)) {
            result = prime * result + ((item2 == null) ? 0 : item2.hashCode());
        }
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Pairing<?,?> other = (Pairing<?,?>) obj;
        if (keySetup.equals(KeySetup.LEFT) || keySetup.equals(KeySetup.BOTH)) {
            if (item1 == null) {
                if (other.item1 != null)
                    return false;
            } else if (!item1.equals(other.item1))
                return false;
        }
        if (keySetup.equals(KeySetup.RIGHT) || keySetup.equals(KeySetup.BOTH)) {
            if (item2 == null) {
                if (other.item2 != null)
                    return false;
            } else if (!item2.equals(other.item2))
                return false;
        }
        return true;
    }
}
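UPDATE:
I tested Stuart's function below and it seems to work great. The operation below distincts on the first letter of each string. The only part I'm trying to figure out is how the ConcurrentHashMap maintains only one instance for the entire stream.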
import com.google.common.collect.ImmutableList; // Guava
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

public class DistinctByKey {
    public static <T> Predicate<T> distinctByKey(Function<? super T,Object> keyExtractor) {
        Map<Object,Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
    public static void main(String[] args) {
        final ImmutableList<String> arpts = ImmutableList.of("ABQ","ALB","CHI","CUN","PHX","PUJ","BWI");
        arpts.stream().filter(distinctByKey(f -> f.substring(0,1))).forEach(s -> System.out.println(s));
    }
}
The output is...
ABQ
CHI
PHX
BWI
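The distinct operation is a stateful pipeline operation; in this case it's a stateful filter. It's a bit inconvenient to create these yourself, as there's nothing built in, but a small helper class should do the trick.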
/**
 * Stateful filter. T is type of stream element, K is type of extracted key.
 */
static class DistinctByKey<T,K> {
    Map<K,Boolean> seen = new ConcurrentHashMap<>();
    Function<T,K> keyExtractor;
    public DistinctByKey(Function<T,K> ke) {
        this.keyExtractor = ke;
    }
    public boolean filter(T t) {
        return seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
}
I don't know your domain classes, but I think that, with this helper class, you could do what you want like this:
BigDecimal totalShare = orders.stream()
    .filter(new DistinctByKey<Order,CompanyId>(o -> o.getCompany().getId())::filter)
    .map(Order::getShare)
    .reduce(BigDecimal.ZERO, BigDecimal::add);
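Unfortunately the type inference couldn't get far enough inside the expression, so I had to specify the type arguments for the DistinctByKey class explicitly.
This involves more setup than the collectors approach described by Louis Wasserman, but it has the advantage that distinct items pass through immediately instead of being buffered until the collection completes. Space should be the same, since both approaches end up accumulating all distinct keys extracted from the stream elements.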
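To make the "pass through immediately" point concrete, here is a minimal sketch, assuming an infinite stream of integers keyed by their remainder modulo 3. The class name LazyDistinctDemo and the method firstThreeDistinctRemainders are made up for illustration. Because the filter forwards each new key as soon as it is seen, limit(3) can end the pipeline; a collector over the same infinite stream would never complete.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyDistinctDemo {
    /** Same stateful filter as the helper class above. */
    static class DistinctByKey<T, K> {
        private final Map<K, Boolean> seen = new ConcurrentHashMap<>();
        private final Function<T, K> keyExtractor;

        DistinctByKey(Function<T, K> ke) {
            this.keyExtractor = ke;
        }

        boolean filter(T t) {
            // putIfAbsent returns null only the first time a key is stored.
            return seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
        }
    }

    /** Hypothetical demo: first element for each of the three remainders mod 3. */
    static List<Integer> firstThreeDistinctRemainders() {
        return Stream.iterate(1, i -> i + 1)   // infinite stream 1, 2, 3, ...
                .filter(new DistinctByKey<Integer, Integer>(i -> i % 3)::filter)
                .limit(3)                      // stops once three distinct keys have passed
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstThreeDistinctRemainders()); // [1, 2, 3]
    }
}
```

The limit must not exceed the number of distinct keys that actually occur, otherwise the pipeline keeps pulling from the infinite source forever.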
UPDATE
It's possible to get rid of the K type parameter, since it's not actually used for anything other than being stored in a map. So Object is sufficient.
/**
 * Stateful filter. T is type of stream element.
 */
static class DistinctByKey<T> {
    Map<Object,Boolean> seen = new ConcurrentHashMap<>();
    Function<T,Object> keyExtractor;
    public DistinctByKey(Function<T,Object> ke) {
        this.keyExtractor = ke;
    }
    public boolean filter(T t) {
        return seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
}

BigDecimal totalShare = orders.stream()
    .filter(new DistinctByKey<Order>(o -> o.getCompany().getId())::filter)
    .map(Order::getShare)
    .reduce(BigDecimal.ZERO, BigDecimal::add);
This simplifies things a bit, but I still had to specify the type argument to the constructor. Trying to use diamond or a static factory method doesn't seem to improve things. I think the difficulty is that the compiler can't infer generic type parameters, for a constructor or a static method call, when either is in the instance expression of a method reference. Oh well.
(Another variation on this that would probably simplify it is to make DistinctByKey<T> implement Predicate<T> and rename the method to test, which is the method Predicate declares. This would remove the need to use a method reference and would probably improve type inference. However, it's unlikely to be as nice as the solution below.)
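A minimal sketch of that Predicate variation, assuming the filter method is renamed to test (the method that java.util.function.Predicate declares). The class name DistinctByKeyPredicate and the helper firstPerInitial are hypothetical names used only for this example. The type argument is still written out explicitly here rather than relying on inference.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Stateful filter that is itself a Predicate, so no method reference is needed. */
final class DistinctByKeyPredicate<T> implements Predicate<T> {
    private final Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    private final Function<? super T, ?> keyExtractor;

    DistinctByKeyPredicate(Function<? super T, ?> keyExtractor) {
        this.keyExtractor = keyExtractor;
    }

    @Override
    public boolean test(T t) {
        // True only the first time this key is seen (keys must be non-null).
        return seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    /** Hypothetical helper for the demo: keeps the first string per initial letter. */
    static List<String> firstPerInitial(List<String> codes) {
        return codes.stream()
                .filter(new DistinctByKeyPredicate<String>(s -> s.substring(0, 1)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> arpts = Arrays.asList("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
        System.out.println(firstPerInitial(arpts)); // [ABQ, CHI, PHX, BWI]
    }
}
```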
UPDATE 2
Can't stop thinking about this. Instead of a helper class, use a higher-order function. We can use captured locals to maintain state, so we don't even need a separate class! Bonus, things are simplified so type inference works!
public static <T> Predicate<T> distinctByKey(Function<? super T,Object> keyExtractor) {
    Map<Object,Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}

BigDecimal totalShare = orders.stream()
    .filter(distinctByKey(o -> o.getCompany().getId()))
    .map(Order::getShare)
    .reduce(BigDecimal.ZERO, BigDecimal::add);
You more or less have to do something like
elements.stream()
    .collect(Collectors.toMap(
        obj -> extractKey(obj),
        obj -> obj,
        (first, second) -> first
        // pick the first if multiple values have the same key
    )).values().stream();
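One thing to be aware of with this approach: the three-argument Collectors.toMap makes no guarantee about the type of map it returns, so the original encounter order of the elements may be lost. A possible sketch, assuming you want to keep first-seen order, passes LinkedHashMap::new to the four-argument overload. The class name DistinctViaToMap and the method firstPerInitial are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class DistinctViaToMap {
    /** Hypothetical helper: first string per initial letter, in first-seen order. */
    static List<String> firstPerInitial(List<String> codes) {
        Map<String, String> firstByKey = codes.stream()
                .collect(Collectors.toMap(
                        s -> s.substring(0, 1),      // key to be distinct on
                        Function.identity(),         // keep the element itself
                        (first, second) -> first,    // on a key collision keep the earlier one
                        LinkedHashMap::new));        // insertion-ordered map
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        List<String> arpts = Arrays.asList("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
        System.out.println(firstPerInitial(arpts)); // [ABQ, CHI, PHX, BWI]
    }
}
```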
A variation on Stuart Marks's second update, using a Set.
public static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
    Set<Object> seen = Collections.newSetFromMap(new ConcurrentHashMap<>());
    return t -> seen.add(keyExtractor.apply(t));
}
We can also use RxJava (a very powerful reactive extensions library):
Observable.from(persons).distinct(Person::getName)
or
Observable.from(persons).distinct(p -> p.getName())
To answer your question in your second update:
The only part I'm trying to figure out is how the ConcurrentHashMap maintains only one instance for the entire stream:
public static <T> Predicate<T> distinctByKey(Function<? super T,Object> keyExtractor) {
    Map<Object,Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
In your code sample, distinctByKey is only invoked one time, so the ConcurrentHashMap is created just once. Here's an explanation:
The distinctByKey function is just a plain-old function that returns an object, and that object happens to be a Predicate. Keep in mind that a predicate is basically a piece of code that can be evaluated later. To manually evaluate a predicate, you must call a method in the Predicate interface such as test. So, the predicate
t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null
is merely a declaration that is not actually evaluated inside distinctByKey.
The predicate is passed around just like any other object. It is returned and passed into the filter operation, which basically evaluates the predicate repeatedly against each element of the stream by calling test.
I'm sure filter is more complicated than I made it out to be, but the point is, the predicate is evaluated many times outside of distinctByKey. There's nothing special* about distinctByKey; it's just a function that you've called one time, so the ConcurrentHashMap is only created one time.
*Apart from being well made, @stuart-marks :)
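The same point can be checked by evaluating the predicate by hand. This is only a small sketch; the class name PredicateStateDemo is invented for the example. One call to distinctByKey creates one map, and every later call to test on the returned predicate shares it. A second call to distinctByKey starts over with a fresh map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;

final class PredicateStateDemo {
    static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        // This line runs once per call to distinctByKey, not once per element.
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        // The lambda captures 'seen'; its body runs each time test() is called.
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        Predicate<String> byInitial = distinctByKey(s -> s.substring(0, 1));
        System.out.println(byInitial.test("ABQ")); // true:  "A" not seen yet
        System.out.println(byInitial.test("ALB")); // false: "A" already in the captured map
        System.out.println(byInitial.test("CHI")); // true:  "C" not seen yet

        // A second call creates a second, independent map.
        Predicate<String> fresh = distinctByKey(s -> s.substring(0, 1));
        System.out.println(fresh.test("ALB"));     // true
    }
}
```

A consequence is that a predicate obtained this way should not be reused across streams unless carrying the seen keys over is what you want.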
Another way of finding distinct elements
List<String> uniqueObjects = ImmutableList.of("ABQ","ALB","CHI","CUN","PHX","PUJ","BWI")
    .stream()
    .collect(Collectors.groupingBy((p)->p.substring(0,1))) //expression
    .values()
    .stream()
    .flatMap(e->e.stream().limit(1))
    .collect(Collectors.toList());
You can use the distinct(HashingStrategy) method in Eclipse Collections.
List<String> list = Lists.mutable.with("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
ListIterate.distinct(list, HashingStrategies.fromFunction(s -> s.substring(0, 1)))
    .each(System.out::println);
If you can refactor list to implement an Eclipse Collections interface, you can call the method directly on the list.
MutableList<String> list = Lists.mutable.with("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
list.distinct(HashingStrategies.fromFunction(s -> s.substring(0, 1)))
    .each(System.out::println);
HashingStrategy is simply a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
    int computeHashCode(E object);
    boolean equals(E object1, E object2);
}
Note: I am a committer for Eclipse Collections.
Set.add(element) returns true if the set did not already contain element, otherwise false. So you can do it like this.
Set<String> set = new HashSet<>();
BigDecimal totalShare = orders.stream()
    .filter(c -> set.add(c.getCompany().getId()))
    .map(c -> c.getShare())
    .reduce(BigDecimal.ZERO, BigDecimal::add);
If you want to run this in parallel, you must use a concurrent set instead (for example, one backed by a ConcurrentHashMap), because HashSet is not thread-safe.
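A minimal sketch of the parallel case, assuming the keys are non-null and using ConcurrentHashMap.newKeySet() as the thread-safe set. The class name ParallelDistinctDemo and the method countDistinctInitials are hypothetical. With a parallel stream, which element survives for a given key is not deterministic, so this example only counts the distinct keys, which is stable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ParallelDistinctDemo {
    /** Hypothetical helper: number of distinct initial letters, computed in parallel. */
    static long countDistinctInitials(List<String> codes) {
        // Thread-safe set view backed by a ConcurrentHashMap.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        return codes.parallelStream()
                .filter(s -> seen.add(s.substring(0, 1))) // add() is atomic: exactly one thread wins per key
                .count();
    }

    public static void main(String[] args) {
        List<String> arpts = Arrays.asList("ABQ", "ALB", "CHI", "CUN", "PHX", "PUJ", "BWI");
        System.out.println(countDistinctInitials(arpts)); // 4
    }
}
```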
It can be done with something like:
Set<String> distinctCompany = orders.stream()
    .map(Order::getCompany)
    .collect(Collectors.toSet());
Reference URL: https://stackoverflow.com/questions/27870136/java-lambda-stream-distinct-on-arbitrary-key