Skip to main content

8 things every developer should know about the Apache Ignite caching

Any technology, no matter how advanced it is, will not be able to solve your problems if you implement it improperly. Caching, precisely when it comes to the use of a distributed caching, can only accelerate your application with the proper use and configurations of it. From this point of view, Apache Ignite is no different, and there are a few steps to consider before using it in the production environment. In this article, we describe various technics that can help you to plan and adequately use of Apache Ignite as cutting-edge caching technology.

  • Do proper capacity planning before using Ignite cluster. Do paperwork for understanding the size of the cache, number of CPUs or how many JVMs will be required. Let’s assume that you are using Hibernate as an ORM in 10 application servers and wish to use Ignite as an L2 cache. Calculate the total memory usages and the number of Ignite nodes you have to need for maintaining your SLA. An incorrect number of the Ignite nodes can become a bottleneck for your entire application. Please use the Apache Ignite official documentation for preparing a system capacity planning.
  • Select the best deployment option. You can use Ignite as an embedded or a real cluster topology. All of them contains a few pros and cons. When Ignite is running in the same JVM (in embedded mode) with the application, the network roundtrip for getting data from the cache is minimum. However, in this case, Ignite uses the same JVM resources along with the application which can impact on the application performance. Moreover, in the embedded mode, if the application dies, the Ignite node also fails. On the other hand, when Ignite node is running on a separate JVM, there is a minimal network overhead for fetching the data from the cluster. So, if you have a web application with a small memory footprint, you can consider using Ignite node in the same JVM.
  • Use on-heap caching for getting maximum performance. By default, Ignite uses Java off-heap for storing cache entries. When using off-heap to store data, there is always some overhead of de/serialization of data. To mitigate the latency and get the maximum performance you can use on-heap caching. You should also take into account that, Java heap size is almost limited and there is a GC (Garbage collection) overhead whenever using on-heap caching. Therefore, consider using on-heap caching whenever you are using a small limited size of a cache, and the cache entries are almost constants.
  • Use Atomic cache mode whenever possible. If you do not need strong data consistency, consider using the atomic mode. In an atomic mode, each DML operation will either succeed or fail and, neither Read nor Write operation will lock the data. This mode gives a better performance than the transactional mode. An example of using an atomic cache configuration is shown below.

    <property name="cacheConfiguration">
        <list>
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="testCache" />
                <property name="atomicityMode" value="ATOMIC" />
            </bean>
        </list>
    </property>
    

  • Disable unnecessary internal event's notification. Ignite has a rich event system to notify users/nodes about various events, including cache modification, eviction, compaction, topology changes, and a lot more. Since thousands of events per second are generated, it creates an additional load on the system. This can lead to significant performance degradation. Therefore, it is highly recommended to enable only those events that your application logic requires.

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Enable events that you need and leave others disabled -->
        <property name="includeEventTypes">
            <list>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_STARTED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED"/>
            </list>
        </property>
    </bean>
    

  • Turn off backups copy. If you are using PARTITIONED cache and the data loss is not critical for you, consider disabling backups for the PARTITIONED cache. When backups are enabled, Ignite cache engine maintains a remote copy of each entry, which requires network exchanges. To turn off the backups copy, use the following cache configuration:

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="cacheConfiguration">
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <!-- Set cache mode. -->
                <property name="cacheMode" value="PARTITIONED"/>
                <!-- Set number of backups to 0-->
                <property name="backups" value="0"/>
            </bean>
        </property>
    </bean>
    
  • Synchronizing the requests for the same key. Let's explain by an example. Assume, your application has to handle 5000 requests per second. Most of them requested by one key. All the threads follow the following logic: If there is no value for the key in the cache, I query to the database. At the ends, each of the thread goes to the database and updates the value for the key into the cache. As a result, the application spends more times than if the cache was not at all. This is one of the common reasons when your application slows down whenever you are using a cache.

    However, the solution to this problem is simple: synchronizing the requests for the same keys. From version 2.1, Apache Ignite support @Cacheable annotation with sync attributes which ensure that a single thread is forming the cache value. To achieve this, you have to add the sync attribute as follows:

    @Cacheable(value = "exchangerate", sync = true)
    public String getExchangerate(String region) {
    }
    
  • Turn off or tune durable memory. Since version 2.1, Apache Ignite has its own persistence implementation. Unfortunately, persistence slows down the system. The WAL slows down the system even more. If you do not need the data durability, you can disable or turn off the WAL archiving. In Apache Ignite, starting from version 2.4, it is possible to disable WAL without restarting the entire cluster as shown below:

    ALTER TABLE tableName NOLOGGING
    ALTER TABLE tableName LOGGING
    

    By the way, you can also tune the WAL logging level according to your requirements. By default, the WAL log level is enabled on DEFAULT mode, which guaranty the highest level of data durability. You can change the log to one of the following levels:

    1. LOG_ONLY.
    2. BACKGROUND.
    3. NONE.

Caching gives enormous performance benefits, saves unnecessary network roundtrips and reduce CPU costs. Many believe that caching is such an easy way to make everything faster and cooler. However, as practice shows, most often incorrect use of caching makes thing worse. Caching is the mechanism that only gives performance boosts when you use it correctly. So, remember this before implementing it in your project, take measurements before and after on all related cases.

Don't hesitate to leave your comments or ideas if you have any. Portions of this article were taken from The Apache Ignite Book. If it got you interested, check out the rest of the book for more helpful information.

Comments

Popular posts from this blog

Tip: SQL client for Apache Ignite cache

A new SQL client configuration described in  The Apache Ignite book . If it got you interested, check out the rest of the book for more helpful information. Apache Ignite provides SQL queries execution on the caches, SQL syntax is an ANSI-99 compliant. Therefore, you can execute SQL queries against any caches from any SQL client which supports JDBC thin client. This section is for those, who feels comfortable with SQL rather than execute a bunch of code to retrieve data from the cache. Apache Ignite out of the box shipped with JDBC driver that allows you to connect to Ignite caches and retrieve distributed data from the cache using standard SQL queries. Rest of the section of this chapter will describe how to connect SQL IDE (Integrated Development Environment) to Ignite cache and executes some SQL queries to play with the data. SQL IDE or SQL editor can simplify the development process and allow you to get productive much quicker. Most database vendors have their own fron...

Load balancing and fail over with scheduler

Every programmer at least develop one Scheduler or Job in their life time of programming. Nowadays writing or developing scheduler to get you job done is very simple, but when you are thinking about high availability or load balancing your scheduler or job it getting some tricky. Even more when you have a few instance of your scheduler but only one can be run at a time also need some tricks to done. A long time ago i used some data base table lock to achieved such a functionality as leader election. Around 2010 when Zookeeper comes into play, i always preferred to use Zookeeper to bring high availability and scalability. For using Zookeeper you have to need Zookeeper cluster with minimum 3 nodes and maintain the cluster. Our new customer denied to use such a open source product in their environment and i was definitely need to find something alternative. Definitely Quartz was the next choose. Quartz makes developing scheduler easy and simple. Quartz clustering feature brings the HA and...