Wednesday, 26 September 2012

Spring batch with Hibernate Part 1


Hibernate is a great ORM for web applications (its traditional usage pattern), but it can also be very good at batch processing. A word of warning, though: you need to be very careful, as a single HQL query can result in multiple SQL calls, which would be a disaster for batch.

This post describes how Hibernate can be configured to operate in a batch mode that minimizes the number of SQL calls to the underlying database.

Enabling batch mode

Configure Hibernate with the following properties (see improving hibernate performance):

hibernate.jdbc.batch_size=1000
hibernate.default_batch_fetch_size=1000

For example:

<bean id="sessionFactory" class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
  <property name="dataSource" ref="dataSource"/>
  <property name="packagesToScan">
    <list>
      <value>coder36.sbent.sample.domain</value>
      <value>coder36.sbent.sample.domain.stage</value>
    </list>
  </property>
  <property name="hibernateProperties">
    <props>
      <prop key="hibernate.dialect">org.hibernate.dialect.MySQLDialect</prop>
      <prop key="hibernate.jdbc.batch_size">1000</prop>
      <prop key="hibernate.default_batch_fetch_size">1000</prop>   
    </props>
  </property>
</bean>

A good tip is to also enable Hibernate SQL logging, so you can see exactly what Hibernate is doing behind the scenes (see here for the types of logging output):

<prop key="hibernate.show_sql">true</prop>
<prop key="hibernate.format_sql">true</prop>
<prop key="hibernate.use_sql_comments">true</prop>
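If the session factory is wired up in Java rather than XML, the same settings can be collected in a java.util.Properties object. The following is only a sketch: the class name HibernateBatchProperties and its build() method are made up for illustration, and the values simply mirror the XML above.

```java
import java.util.Properties;

// Hypothetical helper (not part of sbent): gathers the Hibernate
// settings from this post into a single Properties object.
class HibernateBatchProperties {

    static Properties build() {
        Properties props = new Properties();
        props.setProperty("hibernate.dialect", "org.hibernate.dialect.MySQLDialect");

        // Batch settings described above.
        props.setProperty("hibernate.jdbc.batch_size", "1000");
        props.setProperty("hibernate.default_batch_fetch_size", "1000");

        // SQL logging: handy while testing, usually too noisy for production.
        props.setProperty("hibernate.show_sql", "true");
        props.setProperty("hibernate.format_sql", "true");
        props.setProperty("hibernate.use_sql_comments", "true");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings so they can be checked by eye.
        build().list(System.out);
    }
}
```

A Properties object like this can be handed to LocalSessionFactoryBean.setHibernateProperties(...) in place of the <props> element.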


How does Hibernate batch mode work?

The best way to describe it is via an example taken from sbent. Consider the following entities:

@Entity
public class Transaction {
  ...

  @ManyToOne(fetch=FetchType.LAZY) 
  Customer customer;

  public Customer getCustomer() {
    return customer;
  }
}

and

@Entity
public class Customer {
  ...

  @Column( nullable=false, length=60 )
  private String name;

  public String getName() {
    return name;
  }
}

To demonstrate what happens under the covers, load some data, then retrieve it with the following code:

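System.out.println( "getting list of Transaction's" );
List<Transaction> trans = 
    session.createQuery( "select t from Transaction t" ).list();
for ( Transaction t: trans ) {
  System.out.println( "getting customer");
  Customer c = t.getCustomer();
  // getName() is the first call that needs customer data, so it triggers the lazy load
  System.out.println( "name: " + c.getName() );
}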

With Hibernate SQL logging turned on, the following is sent to stdout:

getting list of Transaction's
Hibernate: 
    /* select t from Transaction t */ 
    select
        transactio0_.id as id6_,
        transactio0_.amount as amount6_,
        transactio0_.bank_id as bank4_6_,
        transactio0_.customer_id as customer5_6_,
        transactio0_.version as version6_ 
    from
        Transaction transactio0_
getting customer
Hibernate: 
    /* load coder36.sbent.sample.domain.Customer */ 
    select
        customer0_.id as id1_0_,
        customer0_.name as name1_0_,
        customer0_.nino as nino1_0_,
        customer0_.version as version1_0_ 
    from
        Customer customer0_ 
    where
        customer0_.id in ( ?, ?, ? )
name: Mark Middleton
getting customer
name: John Smith
getting customer
name: Fred Dibnah

What's happening here?

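The Transaction entity is configured to load its Customer lazily, so the first query returns each Transaction with an uninitialized Customer proxy. The first call that needs customer data (c.getName()) triggers the lazy load. Because hibernate.default_batch_fetch_size is set, Hibernate does not load just that one Customer: in the same query it also loads the other uninitialized Customer proxies referenced by Transaction entities in the session, up to the batch fetch size. Hence the customer0_.id in ( ?, ?, ? ) clause, and hence no further SQL is issued for the second and third customers.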
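To make the effect concrete, here is a small plain-Java simulation of the idea. It does not use Hibernate at all: FakeSession, registerProxy and countQueries are made-up names, and the way the batch is picked is a deliberate simplification of whatever Hibernate really does internally. It only illustrates why one "in ( ?, ?, ? )" query can replace three single-row selects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified simulation of batch fetching; no Hibernate classes are involved.
class BatchFetchSketch {

    // Stand-in for the Customer table: id -> name.
    static final Map<Long, String> CUSTOMER_TABLE = new LinkedHashMap<>();
    static {
        CUSTOMER_TABLE.put(1L, "Mark Middleton");
        CUSTOMER_TABLE.put(2L, "John Smith");
        CUSTOMER_TABLE.put(3L, "Fred Dibnah");
    }

    // Stand-in for the session: tracks uninitialized proxies and loaded rows.
    static class FakeSession {
        final int batchFetchSize;
        final List<Long> pendingIds = new ArrayList<>();
        final Map<Long, String> loaded = new HashMap<>();
        int queryCount = 0;

        FakeSession(int batchFetchSize) {
            this.batchFetchSize = batchFetchSize;
        }

        // Called when a Transaction is loaded with a lazy Customer reference.
        void registerProxy(long id) {
            if (!loaded.containsKey(id) && !pendingIds.contains(id)) {
                pendingIds.add(id);
            }
        }

        // Called when something first needs the customer's data.
        String getName(long id) {
            if (!loaded.containsKey(id)) {
                // Requested id first, then other pending ids up to the batch size.
                List<Long> batch = new ArrayList<>();
                batch.add(id);
                for (Long other : pendingIds) {
                    if (batch.size() >= batchFetchSize) {
                        break;
                    }
                    if (!other.equals(id)) {
                        batch.add(other);
                    }
                }
                queryCount++; // one "select ... where id in ( ... )"
                for (Long b : batch) {
                    loaded.put(b, CUSTOMER_TABLE.get(b));
                }
                pendingIds.removeAll(batch);
            }
            return loaded.get(id);
        }
    }

    // Registers every customer as a proxy, reads every name, returns the query count.
    static int countQueries(int batchFetchSize) {
        FakeSession session = new FakeSession(batchFetchSize);
        for (long id : CUSTOMER_TABLE.keySet()) {
            session.registerProxy(id);
        }
        for (long id : CUSTOMER_TABLE.keySet()) {
            session.getName(id);
        }
        return session.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("batch fetch size 1:    " + countQueries(1) + " queries");
        System.out.println("batch fetch size 1000: " + countQueries(1000) + " queries");
    }
}
```

With a batch fetch size of 1 the simulation issues one query per customer (three in total). With a size of 1000 the first getName() call loads all three customers in a single query, which matches the log output above.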

Summary

  • Enable Hibernate batch mode by setting hibernate.jdbc.batch_size and hibernate.default_batch_fetch_size
  • Use lazy loading - ensure Hibernate entities are correctly configured with fetch=FetchType.LAZY
  • Enable SQL logging during unit testing, to prove Hibernate is behaving as expected.
