When inserting a lot of data at once, using bulk inserts can significantly improve performance.

JDBC batch execute

When using JDBC, you can use Statement#addBatch(String sql) or PreparedStatement#addBatch method to add SQL statements to the batch list, and then execute them in batch with the executeBatch method. When inserting a lot of data at once, using bulk inserts can significantly improve performance.

reWriteBatchedInserts=true

The PostgreSQL JDBC driver supports the reWriteBatchedInserts=true join parameter to modify multiple insert/update statements into a single statement, for example: insert into test(name) values ('n'); insert into test(name) values ('m'); is modified to insert into test(name) values ('n'), ('m'); . This can provide a 2 to 3 times performance improvement.

Note: executeBatch return value

With the reWriteBatchedInserts=true parameter, the value of the int[] element returned by executeBatch execution will be -2. This is because the return value of executeBatch will be overridden to Statement#SUCCESS_NO_INFO, which indicates that the JDBC batch statement executed successfully, but the count of the number of rows affected by it is not available.

1
2
3
4
5
6
7
8
9
@Test
public void batchInsert() {
    int[] rets = jdbcTemplate.batchUpdate("insert into test(id, name) values (?, ?)", Arrays.asList(
            new Object[]{1, "Yangbajing"},
            new Object[]{2, "Yangjing"},
            new Object[]{3, "yangbajing"}
    ));
    System.out.println(Arrays.toString(rets));
}

Mybatis

Use

1
2
3
4
5
6
<insert id="batchInsert">
    INSERT INTO test (name, content) VALUES
    <foreach collection="list" item="item" separator=",">
        (#{item.name}, ${item.content})
    </foreach>
</insert>

IService with mybatis-plus

The saveBatch method of the IService enables bulk insertion, which by default will be committed every 1000 records (non-transaction commit, e.g. 3700 records will be executed in 4 executeBatchs, but still in a transaction)

Customize insertBatch to get the number of rows affected by the batch

mybatis-plus’s IService#saveBatch returns boolean by default, and you can customize an insertBatch function to return the number of rows affected by the batch execution (note: in fact, because the `saveBatch function uses things, depending on whether the execution of the argument succeeds, the batch data In fact, you don’t need a method that returns the number of rows affected. (This is a demonstration of how to customize the batch execution function).

DataIService

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
import com.baomidou.mybatisplus.extension.service.IService;

import java.util.List;

public interface DataIService<T> extends IService<T> {
    int insertBatch(List<T> entityList, int batchSize);

    default boolean insert(T entity) {
        return save(entity);
    }
}

DataIServiceImpl

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
import com.baomidou.mybatisplus.core.enums.SqlMethod;
import com.baomidou.mybatisplus.core.mapper.BaseMapper;
import com.baomidou.mybatisplus.core.toolkit.Assert;
import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.ibatis.executor.BatchResult;
import org.apache.ibatis.session.SqlSession;

import java.sql.Statement;
import java.util.*;
import java.util.function.BiConsumer;

public class DataIServiceImpl<M extends BaseMapper<T>, T> 
        extends ServiceImpl<M, T>
        implements DataIService<T> {

    @Override
    public int insertBatch(List<T> entityList, int batchSize) {
        if (CollectionUtils.isEmpty(entityList)) {
            return 0;
        }
        String sqlStatement = sqlStatement(SqlMethod.INSERT_ONE);
        List<BatchResult> rets = 
            batchExecute(entityList,
                         batchSize, 
                         (sqlSession, entity) -> sqlSession.insert(sqlStatement, entity));
        return rets.stream()
                .mapToInt(result -> Arrays.stream(result.getUpdateCounts())
                .map(n -> n == Statement.SUCCESS_NO_INFO ? 1 : n).sum())
                .sum();
    }

    protected <E> List<BatchResult> batchExecute(Collection<E> list,
                                                 int batchSize,
                                                 BiConsumer<SqlSession, E> consumer) {
        Assert.isFalse(batchSize < 1, "batchSize must not be less than one");
        if (list.isEmpty()) {
            return Collections.emptyList();
        }

        final List<BatchResult> results = new LinkedList<>();
        executeBatch(sqlSession -> {
            int size = list.size();
            int i = 1;
            for (E element : list) {
                consumer.accept(sqlSession, element);
                if ((i % batchSize == 0) || i == size) {
                    List<BatchResult> rets = sqlSession.flushStatements();
                    results.addAll(rets);
                }
                i++;
            }
        });
        return results;
    }
}

When aggregating counts for List<BatchResult> rets to get the number of affected rows, be careful to determine if the int[] element returned by BatchResult#getUpdateCounts is Statement.SUCCESS_NO_INFO.


Reference https://yangbajing.me/2020/06/27/jdbc-%E6%89%B9%E9%87%8F%E6%8F%92%E5%85%A5%EF%BC%9Amybatis%E3%80%81postgresql/