News

This week, The Apache Ignite Book became one of the top books on Leanpub

Sunday

8 things every developer should know about Apache Ignite caching

No technology, however advanced, will solve your problems if you implement it improperly. Caching, and distributed caching in particular, accelerates your application only when it is used and configured properly. Apache Ignite is no different, and there are a few steps to consider before using it in a production environment. In this article, we describe various techniques that can help you plan for and properly use Apache Ignite as a cutting-edge caching technology.

  • Do proper capacity planning before using an Ignite cluster. Work out on paper the size of the cache, the number of CPUs, and how many JVMs will be required. Let’s assume that you are using Hibernate as an ORM on 10 application servers and wish to use Ignite as an L2 cache. Calculate the total memory usage and the number of Ignite nodes you need to maintain your SLA. An incorrect number of Ignite nodes can become a bottleneck for your entire application. Use the official Apache Ignite documentation when preparing a capacity plan.
  • Select the best deployment option. You can run Ignite embedded in your application or as a separate cluster. Each option has pros and cons. When Ignite runs in the same JVM as the application (embedded mode), the network round trip for getting data from the cache is minimal. However, in this case Ignite shares the JVM's resources with the application, which can hurt application performance. Moreover, in embedded mode, if the application dies, the Ignite node fails with it. On the other hand, when the Ignite node runs in a separate JVM, fetching data from the cluster adds a small network overhead. So, if you have a web application with a small memory footprint, you can consider running the Ignite node in the same JVM.
  • Use on-heap caching to get maximum performance. By default, Ignite stores cache entries in Java off-heap memory. Storing data off-heap always carries some serialization and deserialization overhead. To reduce latency and get maximum performance, you can use on-heap caching. Keep in mind, however, that the Java heap is limited in size and that on-heap caching adds GC (garbage collection) overhead. Therefore, consider on-heap caching when the cache is small and bounded and its entries rarely change.
  • Use atomic cache mode whenever possible. If you do not need strong data consistency, consider using atomic mode. In atomic mode, each DML operation either succeeds or fails, and neither read nor write operations lock the data. This mode gives better performance than transactional mode. An example of an atomic cache configuration is shown below.

    <property name="cacheConfiguration">
        <list>
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="testCache" />
                <property name="atomicityMode" value="ATOMIC" />
            </bean>
        </list>
    </property>
    

  • Disable unnecessary internal event notifications. Ignite has a rich event system that notifies users and nodes about various events, including cache modification, eviction, compaction, topology changes, and a lot more. Because thousands of events can be generated per second, they put an additional load on the system, which can lead to significant performance degradation. Therefore, it is highly recommended to enable only those events that your application logic requires.

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Enable events that you need and leave others disabled -->
        <property name="includeEventTypes">
            <list>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_STARTED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED"/>
                <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED"/>
            </list>
        </property>
    </bean>
    

  • Turn off backup copies. If you are using a PARTITIONED cache and data loss is not critical for you, consider disabling backups for the cache. When backups are enabled, the Ignite cache engine maintains a remote copy of each entry, which requires network exchanges. To turn off backup copies, use the following cache configuration:

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="cacheConfiguration">
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
                <!-- Set cache mode. -->
                <property name="cacheMode" value="PARTITIONED"/>
                <!-- Set number of backups to 0-->
                <property name="backups" value="0"/>
            </bean>
        </property>
    </bean>
    
  • Synchronize requests for the same key. Let's explain with an example. Assume your application has to handle 5000 requests per second, most of them for the same key. All threads follow the same logic: if there is no value for the key in the cache, query the database. As a result, every thread goes to the database and then writes the value for the key into the cache. The application ends up spending more time than it would with no cache at all. This is one of the common reasons why an application slows down after a cache is introduced.

    However, the solution to this problem is simple: synchronize the requests for the same key. Starting with version 2.1, Apache Ignite supports Spring's @Cacheable annotation with the sync attribute, which ensures that a single thread computes the cache value. To achieve this, add the sync attribute as follows:

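    @Cacheable(value = "exchangerate", sync = true)
    public String getExchangerate(String region) {
        // Runs only on a cache miss; loadRateFromDatabase is a hypothetical helper.
        return loadRateFromDatabase(region);
    }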
    
  • Turn off or tune durable memory. Since version 2.1, Apache Ignite has its own persistence implementation. Unfortunately, persistence slows down the system, and the write-ahead log (WAL) slows it down even more. If you do not need data durability, you can disable the WAL or turn off WAL archiving. Starting from version 2.4, Apache Ignite can disable the WAL without restarting the entire cluster, as shown below:

    ALTER TABLE tableName NOLOGGING
    ALTER TABLE tableName LOGGING
    

    By the way, you can also tune the WAL mode according to your requirements. The DEFAULT mode (called FSYNC in newer releases) guarantees the highest level of data durability. You can relax it to one of the following modes:

    1. LOG_ONLY.
    2. BACKGROUND.
    3. NONE.
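
As a rough sketch of how one of these WAL modes could be selected in the Spring XML configuration, the fragment below sets LOG_ONLY. It assumes Ignite 2.3 or later, where persistence settings live in DataStorageConfiguration, and the choice of LOG_ONLY is purely illustrative.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <!-- Assumption: LOG_ONLY is only an example value;
                 pick the mode that matches your durability needs. -->
            <property name="walMode" value="LOG_ONLY"/>
        </bean>
    </property>
</bean>
```

Weaker modes trade durability for speed, so it is worth measuring the effect before switching.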

Caching gives enormous performance benefits, saves unnecessary network round trips, and reduces CPU costs. Many believe that caching is an easy way to make everything faster and cooler. However, as practice shows, incorrect use of caching most often makes things worse. Caching boosts performance only when you use it correctly. So remember this before implementing it in your project, and take measurements before and after in all related cases.

Don't hesitate to leave your comments or ideas if you have any. Portions of this article were taken from The Apache Ignite Book. If it got you interested, check out the rest of the book for more helpful information.

Friday

A Simple Checklist for Apache Ignite Beginners

If you're just starting with this great open source framework, don't worry, we're here to help. Check out this great resource to help get you going.


If you are running Apache Ignite for the first time, you might face some difficulties. You have just downloaded Apache Ignite, run it a few times, and hit some issues. Most of these problems are solved in a similar fashion. Therefore, I decided to create a checklist with recommendations to help you avoid issues in development environments.

1. Configuration Files
When Ignite starts in standalone mode through the ignite.sh|bat script, it uses the $IGNITE_HOME/config/default-config.xml configuration file. In this situation, to connect to that node from the Visor command line console, choose the default-config.xml file from the list of configuration files. Most of the time, default-config.xml is the first file in the list.

To start an Ignite node with your own Spring configuration file, run the following command:

{IGNITE_HOME}/bin/ignite.{bat|sh} FILE_PATH/my-ignite-example.xml

or copy the my-ignite-example.xml file into the $IGNITE_HOME/examples/config directory and execute the ignite.{bat|sh} command as follows:

{IGNITE_HOME}/bin/ignite.{bat|sh} examples/config/my-ignite-example.xml
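
For orientation, a minimal my-ignite-example.xml might look like the sketch below. It is only an assumed starting point: the bean id and the instance name are illustrative placeholders, and a real configuration would add caches, discovery, and other settings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <!-- Single Ignite node configuration; everything not set here keeps its default. -->
    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Assumption: "my-ignite-node" is a placeholder name. -->
        <property name="igniteInstanceName" value="my-ignite-node"/>
    </bean>
</beans>
```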

2. Ports

By default, Ignite uses the following local ports:


Protocol | Port | Description
TCP | 10800 | Default port for thin client connection
TCP | 11211 | Default JDBC port
TCP | 47100 | Default local communication port
UDP | 47400 |
TCP | 47500 | Default local discovery port
TCP | 8080 | Default port for REST API
TCP | 49128 | Default port for JMX connection
TCP | 31100~31200 | Default time server port
TCP | 48100~48200 | Default shared memory port


If you are running your Ignite node in Docker or a virtual machine, you should open the ports above so that you can communicate with the node from your host machine.
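
When publishing ports from a container, it can help to pin them explicitly so that the node does not move to another port in its range. The fragment below is a sketch under that assumption: it fixes the discovery, communication, and thin client ports to the defaults listed above, and setting the port range to 0 is an illustrative choice that makes the node fail rather than pick a neighboring port. It assumes Ignite 2.3 or later for ClientConnectorConfiguration.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Discovery: bind to 47500 only. -->
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="localPort" value="47500"/>
            <property name="localPortRange" value="0"/>
        </bean>
    </property>
    <!-- Node-to-node communication: bind to 47100 only. -->
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <property name="localPort" value="47100"/>
            <property name="localPortRange" value="0"/>
        </bean>
    </property>
    <!-- Thin client connections on 10800. -->
    <property name="clientConnectorConfiguration">
        <bean class="org.apache.ignite.configuration.ClientConnectorConfiguration">
            <property name="port" value="10800"/>
        </bean>
    </property>
</bean>
```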

Portions of this article were taken from The Apache Ignite Book. If it got you interested, check out the rest of the book for more helpful information.

3. Logs

Log files track events that happen while Ignite is running. A log file is very useful for finding out what happened in an Ignite application. If you encounter a problem and ask a question in the Ignite forums, you will first of all be asked for the log file. Ignite logging is enabled by default, but there are a few drawbacks. In default mode, Ignite writes little logging information to the console (stdout). In the console you see only the errors; everything else goes to the log file. Ignite log files are located in the $IGNITE_HOME/work/log directory by default. Do not erase log files; keep them as long as possible, as they will be handy for debugging any serious errors.

However, if you want to find problems quickly without digging through separate log files, you can run Ignite in verbose mode.

$ ignite.sh -v

In verbose mode, Ignite writes all logging information both to the console and to the files. Note that Ignite runs slowly in verbose mode, so it is not recommended in a production environment.

4. Network

If you encounter strange network errors, for instance, a connection that cannot be established or a message that cannot be sent, most often you have been hit by the IPv6 network problem. It can’t be said that Ignite doesn’t support the IPv6 protocol, but at the moment there are a few specific problems. The easiest solution is to disable IPv6. To do so, pass the following Java option to the JVM:

-Djava.net.preferIPv4Stack=true

The above JVM option forces Ignite to use the IPv4 protocol and solves a significant part of the network-related problems.

5. Ghost Nodes

One of the most common problems, which many people encounter whenever they launch a new Ignite node, is ghost nodes. You have just started a single node and discover that you already have two Ignite server nodes in your topology. Most often, this happens when you are working on a home office network and one of your colleagues is running an Ignite server node at the same time. The fact is that, by default, Ignite uses the multicast protocol to discover and communicate with other nodes. During startup, Ignite searches for all other nodes that are in the same multicast group and located in the same subnetwork. If it finds any, it tries to connect to them.

The easiest way to avoid this situation is to configure static IP addresses instead of using TcpDiscoveryMulticastIpFinder. To do so, use TcpDiscoveryVmIpFinder and list all the IP addresses and ports to which you are going to connect. This configuration protects you from ghost nodes in a development environment.
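
A static discovery configuration along these lines could look like the following sketch. The address and port range are placeholders for a single developer machine, not values taken from the article; replace them with the hosts you actually want the node to join.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="ipFinder">
                <!-- Static list of addresses instead of multicast discovery. -->
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                    <property name="addresses">
                        <list>
                            <!-- Assumption: local-only development setup. -->
                            <value>127.0.0.1:47500..47509</value>
                        </list>
                    </property>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```

With only the local address listed, nodes started by colleagues on the same subnet are no longer discovered.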

6. Baseline Topology

Ignite baseline topology was introduced in version 2.4.0 as a convenient way to protect the durability of data stored with native persistence. However, what’s wrong with the Ignite baseline topology? To answer that question, let’s imagine the following scenario:

  • We launch a single Ignite node with native persistence enabled (data will be written to disk).
  • We activate the cluster, because native persistence is enabled for the node.
  • We create a REPLICATED cache and load some data into it.
  • Next, we launch two more nodes and start manipulating the data, inserting and deleting some entries.

At this moment, each node contains a full copy of the data and works well. After a while, we decide to restart one of the nodes. If we stop the very first node we started, everything breaks and the data is lost. The reason for this strange behavior is the Ignite baseline topology: the set of server nodes that store persistent data. On the remaining nodes, data is not persisted.

The set of server nodes in the baseline topology is determined at the moment the cluster is first activated. So, server nodes that you add later are not included in the baseline topology. Thus, in our case, the baseline topology consists of only one server node, and only this node persists data on disk. Whenever you stop this server node, everything breaks. Therefore, to prevent this surprise, start all the cluster nodes first, and only then activate the cluster.
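
The scenario above depends on native persistence being switched on, which is also what makes cluster activation necessary. As a hedged sketch, assuming Ignite 2.3 or later and the default data region, enabling it in the Spring XML configuration might look like this:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <!-- Caches in the default region are written to disk. -->
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```

Every node that should store data needs this setting, and all of those nodes should be running before the cluster is activated for the first time.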

To sum up, here is a short checklist for beginners:


N | Check it out
1 | Use proper configuration files to connect through Ignite Visor.
2 | Open ports that you need to work with the Ignite node.
3 | Configure and read logs.
4 | Avoid IPv6.
5 | Use TcpDiscoveryVmIpFinder on the home office network.
6 | Keep track of the baseline topology.