JPA IdClass 중복저장 시 기존 엔티티 업데이트 이슈

1. Issue Description

현재 LicenseCategory 는 licenseType 과 analyzeType 을 CombinedKey 로 사용합니다. 그리고 이 두 개의 키는 LicenseCategoryId 라는 IdClass 로 선언되어 있어요.

그래서 같은 licenseType 과 analyzeType 을 가지게 된다면, 중복키 에러를 표시해야만합니다.

하지만 에러가 표시되지 않고 그대로 진행되버렸어요!

1.1 Screenshots

1.2 Error Code

테스트 코드

@Test
    @DisplayName("라이센스 카테고리 중복 저장 방지")
    void function2() {
        // given
        LocalDateTime now = LocalDateTime.now();
        LicenseCategoryId lcId = LicenseCategoryId.builder().licenseType("basic").analyzeType("악성코드").build();
        LicenseCategoryId lcIdDuplicate = LicenseCategoryId.builder().licenseType("basic").analyzeType("악성코드").build();
        LicenseCategory lc = LicenseCategory.builder().licenseType(lcId.getLicenseType()).analyzeType(lcId.getAnalyzeType()).createdAt(now).build();
        LicenseCategory lcDuplicate = LicenseCategory.builder().licenseType(lcIdDuplicate.getLicenseType()).analyzeType(lcIdDuplicate.getAnalyzeType()).createdAt(now).build();

        // when
        licenseCategoryRepository.save(lc);
        licenseCategoryRepository.save(lcDuplicate);
        // then
        assertThatThrownBy(() -> {
            licenseCategoryRepository.save(lcDuplicate);
        }).isInstanceOf(DuplicateKeyException.class);
    }

테스트 결과 오류

Expecting code to raise a throwable.
java.lang.AssertionError: 
Expecting code to raise a throwable.
	at foxee.product.mainservice.domain.repository.LicenseCategoryRepositoryTest.function2(LicenseCategoryRepositoryTest.java:63)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
	...

2. Problem

제가 예상했던 플로우는 JPA 에 첫 save() 호출 시 insert 되며, 두 번째 같은 ID 로 다른 객체를 삽입 시 마찬가지로 insert 되는 플로우에요.

하지만 JPA save() 중복 호출 시 기존 엔티티를 업데이트하게 됩니다.

save() 시 db에 같은 id 가 있으면 그대로 엔티티를 들고오고, 없다면 insert 해주고 있었어요.

왜 그럴까요? JPA 의 save() 메소드 동작과정을 보면 알 수 있어요!

3. JPA 에서 save() 동작 순서

3.1 save() 동작 코드 확인

save() 는 merge() 와 persist() 둘 중 하나로 동작하게 됩니다.

JPA save() 내부 코드

        @Transactional
	@Override
	public <S extends T> S save(S entity) {

		Assert.notNull(entity, "Entity must not be null");

		if (entityInformation.isNew(entity)) {
			em.persist(entity);
			return entity;
		} else {
			return em.merge(entity);
		}
	}

위의 코드를 보면 아시겠지만, isNew() 의 반환조건에 따라서, persist() 와 merge() 로 분기가 나뉘는것을 볼 수 있습니다.
persist : 즉시 DB 에 insert 쿼리를 전송합니다.
merge : detached 되어있는 엔티티를 manage 테이블(1차 캐시)로 가져오는데, 만약 detached 에 없을 경우에는 DB에 select 쿼리가 추가적으로 나가게 됩니다.
그렇다면 isNew() 는 무엇일까요?

isNew() 는 해당 Entity가 새롭게 만들어진 Entity인지, 혹은 기존에 사용되던 Entity 인지를 구분합니다.

그렇다면 어떤식으로 Entity가 새로운 Entity인지 아닌지를 구분할까요? 코드로 확인해보겠습니다.

       public boolean isNew(T entity) {

		ID id = getId(entity);
		Class<ID> idType = getIdType();

		if (!idType.isPrimitive()) {
			return id == null;
		}

		if (id instanceof Number) {
			return ((Number) id).longValue() == 0L;
		}

		throw new IllegalArgumentException(String.format("Unsupported primitive id type %s", idType));
	}

Primitive vs Wrapper 참고
primitive 자료형이란? : int, float, long, double, boolean 과 같은 원시적 자료형
wrapper 자료형이란? : Integer, Float, Long, Double, Boolean, UUID, 사용자 정의 클래스

getId는 엔티티를 정의할 때 사용했던 @Id 필드를 가져옵니다.
현재 우리 프로젝트는 LicenseCategoryId 를 PK 로 사용하기 위해 @id 어노테이션을 붙였습니다. 그렇다면 idType은 LicenseCategoryId 가 되겠죠?
isPrimitive 는 int, float, long, double 와 같은 하나의 primitive 자료형인지 판단하는 내장함수입니다.
현재 우리는 LicenseCategoryId 를 PK 로 사용하고 있기때문에 primitive 자료형이 아닌 wrapper 자료형입니다. 그렇다면 id==null 인지 확인하게 되겠죠? 이 때, 이미 id 값을 정해서 삽입했기때문에 false 를 반환하게 됩니다.
자! 그럼 다시 올라가서 isNew가 false 라면 어떤 과정을 가지게 될까요?

JPA save() 내부 코드-다시

  public <S extends T> S save(S entity) {
      if (entityInformation.isNew(entity)) {
          ...
      } else {
          return em.merge(entity);
      }
  }

네. 바로 merge 를 수행하게 됩니다. 즉, detached 되어있는 엔티티를 가져오게 되겠죠!
즉, save 호출시 새로운 엔티티임에도 불구하고 UUID 자료형으로 PK 를 설정했기에 항상 persist 가 아닌 merge를 호출하게 됩니다
여기서 의문이 생기죠. detached 에는 어떤애들이 존재할까?라는 의문말입니다.

An entity becomes detached (unmanaged) on following actions:
after transaction commit/rollback
by calling EntityManager.detach(entity)
by clearing the persistence context with EntityManager.clear()
by closing an entity manager with EntityManager.close()
serializing or sending an entity remotely (pass by value).
reference : https://www.logicbig.com/tutorials/java-ee-tutorial/jpa/detaching.html

위의 자료에서 눈여겨 볼 것은 after transaction commit/rollback 입니다. detach 영역에는 Transaction 내 쿼리가 끝나면 사용된 엔티티들을 detached 에 보관한다는 말이죠! 그리고 이 detached 는 준영속상태 라고 합니다.

종합하기 이전에 JPA manage 상태에 대해서 잠깐 설명해볼까해요.
이전 앞서서 merge 는 detached 상태에서 manage 상태로 변경하는 역할을 수행한다고 말씀드렸습니다. 조금 더 자세히 말하면, id 값을 가지고 영속 컨텍스트의 1차 캐시 내 id 값이 일치하는 엔티티가 존재한다면 이를 그대로 업데이트합니다. 반면 1차 캐시 내 없다면, DB에 select 쿼리하고 결과를 1차 캐시에 집어넣습니다.
이렇게 id 값을 가지고 영속 컨텍스트 내 1차 캐시에 엔티티를 넣어주는 과정이 바로 영속화 과정이며, 1차 캐시에 삽입된 상태를 manage 상태라고 부릅니다.

3.2 트랜젝션에 따른 실제 쿼리 결과 테스트

동일 트랜젝션 내 동일 Id 를 가지는 엔티티 저장

코드

       transactionTemplate.execute((status)->{
            System.out.println("첫 번째 LC 저장 시작");
            licenseCategoryRepository.save(lc);
            System.out.println("첫 번째 LC 저장 완료");
            System.out.println("두 번째 LC 저장 시작");
            licenseCategoryRepository.save(lcDuplicate);
            System.out.println("두 번째 LC 저장 완료");
            return null;
        });

결과

첫 번째 LC 저장 시작

Hibernate: select l1_0.analyze_type,l1_0.license_type,l1_0.created_at from license_category l1_0 where (l1_0.analyze_type,l1_0.license_type) in ((?,?)) 
//  이 엔티티는 1차 캐시에 삽입됩니다

첫 번째 LC 저장 완료

두 번째 LC 저장 시작
// 그리고 detached 영역과 1차 캐시 영역을 확인 하고, 1차 캐시에 동일 ID가 존재하는 것을 확인하였으니 스킵합니다

두 번째 LC 저장 완료
Hibernate: insert into license_category (created_at,analyze_type,license_type) values (?,?,?)

다른 트랜젝션 내 동일 Id 를 가지는 엔티티 저장

코드

        transactionTemplate.execute((status)->{
            System.out.println("첫 번째 LC 저장 시작");
            licenseCategoryRepository.save(lc);
            System.out.println("첫 번째 LC 저장 완료");
            return null;
        });

        transactionTemplate.execute((status)->{
            System.out.println("두 번째 LC 저장 시작");
            licenseCategoryRepository.save(lcDuplicate);
            System.out.println("두 번째 LC 저장 완료");
            return null;
        });

결과

첫 번째 LC 저장 시작

Hibernate: select l1_0.analyze_type,l1_0.license_type,l1_0.created_at from license_category l1_0 where (l1_0.analyze_type,l1_0.license_type) in ((?,?)) 
//  이 엔티티는 1차 캐시에 삽입됩니다

첫 번째 LC 저장 완료 

Hibernate: insert into license_category (created_at,analyze_type,license_type) values (?,?,?) 
//  트랜젝션이 끝나고 flush와 commit 이 실행됩니다. 그리고 이 엔티티는 1차 캐시에서 제거됩니다

두 번째 LC 저장 시작

Hibernate: select l1_0.analyze_type,l1_0.license_type,l1_0.created_at from license_category l1_0 where (l1_0.analyze_type,l1_0.license_type) in ((?,?))
// 준영속에 존재하는 애가 1차 캐시에 있는지 먼저 확인하고 없으면 다시 select 쿼리를 전송합니다

두 번째 LC 저장 완료

이 select 쿼리의 뜻은 다음과 같습니다. db 에 저장된 row 를 가져와서 변경점이 있는지 없는지 확인하고 없다면 no 쿼리, 있다면 update 쿼리를 날릴것이다!

아래는 동일 ID 의 다른 createdAt값을 넣었을 때 쿼리되는 구문들을 확인할 수 있어요.

    @Test
    @DisplayName("라이센스 카테고리 중복 저장 방지 - JPA Seperated Transaction")
    void function2() {
        // given
        LocalDateTime now = LocalDateTime.now();
        LocalDateTime now2 = LocalDateTime.now().plusDays(5);
        LicenseCategoryId lcId = LicenseCategoryId.builder().licenseType("basic0").analyzeType("악성코드").build();
        LicenseCategoryId lcIdDuplicate = LicenseCategoryId.builder().licenseType("basic0").analyzeType("악성코드").build();
        LicenseCategory lc = LicenseCategory.builder().licenseType(lcId.getLicenseType()).analyzeType(lcId.getAnalyzeType()).createdAt(now).build();
        LicenseCategory lcDuplicate = LicenseCategory.builder().licenseType(lcIdDuplicate.getLicenseType()).analyzeType(lcIdDuplicate.getAnalyzeType()).createdAt(now2).build();

        // when
        transactionTemplate.execute((status)->{
            System.out.println("첫 번째 LC 저장 시작");
            licenseCategoryRepository.save(lc);
            System.out.println("첫 번째 LC 저장 완료");
            return null;
        });

        transactionTemplate.execute((status)->{
            System.out.println("두 번째 LC 저장 시작");
            licenseCategoryRepository.save(lcDuplicate);
            System.out.println("두 번째 LC 저장 완료");
            return null;
        });
    }

Hibernate: select l1_0.analyze_type,l1_0.license_type,l1_0.created_at from license_category l1_0
첫 번째 LC 저장 시작
Hibernate: select l1_0.analyze_type,l1_0.license_type,l1_0.created_at from license_category l1_0 where (l1_0.analyze_type,l1_0.license_type) in ((?,?))
첫 번째 LC 저장 완료
Hibernate: insert into license_category (created_at,analyze_type,license_type) values (?,?,?)
두 번째 LC 저장 시작
Hibernate: select l1_0.analyze_type,l1_0.license_type,l1_0.created_at from license_category l1_0 where (l1_0.analyze_type,l1_0.license_type) in ((?,?))
두 번째 LC 저장 완료
Hibernate: update license_category set created_at=? where analyze_type=? and license_type=?

4. 위의 결과를 다시 종합해보겠습니다

첫 번째 save()는 엔티티의 @id 가 UUID 로 wrapper 자료형이고 id값을 같이 입력해주었기때문에 존재하기 때문에 em.merge(entity) 가 실행됩니다.
하지만 detached 에 존재하지 않고, 1차 캐시또한 존재하지 않기 때문에 db 에 SELECT 쿼리를 전송하여 결과를 가져옵니다.
em.persist() 의 경우 1차 캐시에 존재하는 entity로 인지하고, 일반적으로 바로 INSERT 쿼리가 나가게 됩니다.
동일 ID 를 가지는 엔티티가 db에 존재한다면, UPDATE 문을 배치하고 최신엔티티를 1차 캐시에 삽입 및 합니다.
동일 ID 가 db에 없다면 INSERT 문을 배치합니다
트랜젝션이 끝난 뒤 배치된 SQL 문을 flush, commit, detach 합니다.
두 번째 save() 또한 마찬가지로 em.merge() 가 수행됩니다.
그리고 detached 영역에 존재하는 동일 ID 엔티티를 가져옵니다.
그리고 1차 캐시에 존재하지 않기 때문에 다시 SELECT 쿼리를 전송합니다.
이 ID 는 db에 존재하는 ID 이기 때문에, UPDATE 쿼리를 전송합니다.

5. 결론은?

save() 는 isNew() 를 통해서 새로운 객체인 경우 persist(), 아니면 merge()로 작동합니다.
isNew() 는 primitive + Number type 이면 0, IdClass와 같이 un-primitive 인 경우에는 null 인지를 확인함으로써 새로운 객체인지를 확인합니다.
LicenseCategory는 un-primitive 인 IdClass 를 사용하고 값이 존재합니다.
그러므로 기존 객체를 가져와서 이를 변경하게 됩니다.

그렇다면 결론은 Persistable 인터페이스를 구현하여 isNew() 판단 조건을 변경하면 됩니다!

isNew()를 통해 Entity가 새롭게 만들어진 entity로 인지, 혹은 기존에 사용되던 Entity 인지를 구분합니다. persist() 의 경우 기존에 존재하던 entity로 인지하고, 일반적으로 바로 insert 쿼리가 나가게 됩니다. 하지만 merge() 의 경우, 밀어 넣으려는 값의 id가 테이블에 있는지를 있는지를 확인해보기 위해서 select 쿼리가 추가적으로 1회 나갈 가능성이 있습니다. (1차 캐시에 없는 경우) 이 때문에 merge 사용시 saveAll() 과 같이 N개의 엔티티를 save 하게 되는경우, 불필요한 쿼리(select) N번이 추가적으로 발생하게 되어 성능에 이슈가 될 수 있습니다. 또한 merge는 pk가 같은 기존에 entity를 대체해버리기 때문에, entity내의 필드값들이 의도치 않게 사라지거나 변경되는 사이드 이펙트가 발생할 수 있습니다.
reference : ID UUID 저장시 고려