
Posts

Book: High performance in-memory computing with Apache Ignite

Being an author is fun; my first book is going to be published at the end of this year. For now, a sample chapter is available for download.

Pitfalls of the MyBatis Caches with Apache Ignite

UPD1: This blog post has been published on DZone: https://dzone.com/articles/pitfalls-of-the-mybatis-caches-with-apache-ignite
UPD2: This post has also been published on Habrahabr for Russian readers: https://habrahabr.ru/company/at_consulting/blog/280452
UPD3: See also the sample chapter of the book "High performance in-memory computing with Apache Ignite" here.

A week ago, MyBatis and Apache Ignite announced support for Apache Ignite as a MyBatis cache (L2 cache). Technically, MyBatis supports two levels of cache: a local cache, which is always enabled by default, and an optional L2 cache. As the Apache Ignite project is growing fast and adding functionality, in this blog post we are going to examine the MyBatis support in some detail. The second-level cache stores the entity data, but NOT the entities or objects themselves. The data is stored in a 'serialised' format which looks like a hash map where the key is the entity Id, and the value is a list of primitive value...
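To make the storage shape described above concrete, here is a conceptual sketch in plain Java. It is not MyBatis or Ignite code: the `Person` entity and the `SecondLevelCacheShape` class are hypothetical names made up for this illustration. It only shows the idea of keeping field values keyed by entity id instead of keeping the entity objects themselves.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual illustration only; not how MyBatis or Ignite implement their caches.
class SecondLevelCacheShape {

    // Hypothetical entity used for the example.
    static final class Person {
        final long id;
        final String name;
        final int age;

        Person(long id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }
    }

    // entity id -> flat list of field values (the entity object is not stored)
    private final Map<Long, List<Object>> region = new HashMap<>();

    // Stores only the field values of the entity under its id.
    void put(Person p) {
        region.put(p.id, Arrays.<Object>asList(p.name, p.age));
    }

    // Rebuilds a new entity from the cached values; returns null on a cache miss.
    Person get(long id) {
        List<Object> values = region.get(id);
        if (values == null) {
            return null;
        }
        return new Person(id, (String) values.get(0), (Integer) values.get(1));
    }
}
```

A hit therefore hands back a freshly built object with the same field values, never the instance that was originally cached.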

Quick start with In memory Data Grid, Apache Ignite

UPD1: For a complete quick start guide, see also the sample chapter of the book "High performance in-memory computing with Apache Ignite" here. You can also find the sample examples in the GitHub repository. An IMDG, or in-memory data grid, is not an in-memory relational database, a NoSQL database or a relational database. It is a different breed of software datastore. The data model is distributed across many servers in a single location or across multiple locations. This distribution is known as a data fabric. This distributed model is known as a 'shared nothing' architecture. An IMDG has the following characteristics: All servers can be active in each site. All data is stored in the RAM of the servers. Servers can be added or removed non-disruptively, to increase the amount of RAM available. The data model is non-relational and is object-based. Distributed applications can be written in a platform-independent language. The data fabric is resilient, allowing non-disruptive au...

Benchmarking high performance Java collection frameworks

I am a big fan of high-performance Java frameworks and libraries. The native Java collections framework always works with primitive wrapper classes such as Integer, Float, etc. Boxing and unboxing between wrapper classes and primitive data types decreases Java execution performance. Most of us are always looking for a library or framework that works with primitive data types in collections to increase the performance of a Java application. Most of the time I use the Javolution framework to get better performance; however, this holiday I read about a few new Java collections frameworks and decided to do some homework benchmarking to find out how much better they could be than the native Java collections framework. I examined two new Java collection frameworks: one is fastutil and the other is HPPC. For benchmarking I used Java JMH in Throughput mode. I took the counterparts of the Java ArrayList, HashSet and HashMap from the two frameworks described above. Col...
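As a small illustration of where boxing and unboxing happen, the sketch below contrasts a `List<Integer>` with a plain `int[]`. It uses only the JDK; the class name `BoxingSketch` is made up for this example. It is not the JMH benchmark from the post and does not measure anything, it only marks the implicit conversions in comments.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingSketch {

    // Sums values stored as Integer objects; every element is unboxed on read.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v; // implicit unboxing via v.intValue()
        }
        return total;
    }

    // Sums values stored in a primitive array; no wrapper objects are involved.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000;
        List<Integer> boxed = new ArrayList<>(n);
        int[] primitive = new int[n];
        for (int i = 0; i < n; i++) {
            boxed.add(i);     // implicit boxing via Integer.valueOf(i)
            primitive[i] = i; // stored directly as an int
        }
        // Both containers hold the same numbers, so the sums are equal.
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitive));
    }
}
```

Primitive collection libraries such as the ones compared in the post aim to offer the list, set and map APIs while storing values the way the `int[]` variant does.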

Load balancing and fail over with scheduler

Every programmer develops at least one scheduler or job in their programming lifetime. Nowadays writing a scheduler to get your job done is very simple, but when you think about high availability or load balancing for your scheduler or job, it gets somewhat tricky. It gets even trickier when you have a few instances of your scheduler but only one can run at a time. A long time ago I used a database table lock to achieve this kind of functionality, as a form of leader election. From around 2010, when ZooKeeper came into play, I always preferred to use ZooKeeper to bring high availability and scalability. To use ZooKeeper you need a ZooKeeper cluster with a minimum of 3 nodes, and you have to maintain that cluster. Our new customer refused to use such an open source product in their environment, and I definitely needed to find an alternative. Quartz was the obvious next choice. Quartz makes developing a scheduler easy and simple. The Quartz clustering feature brings the HA and...
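The "only one instance may run at a time" idea can be sketched as follows. This is a simplified, in-memory stand-in, assuming a single lock that every scheduler instance tries to take before running: the `AtomicReference` plays the role of the lock row in a database table, and the class and method names (`LeaderLockSketch`, `tryAcquire`, `runIfLeader`) are hypothetical. It is neither the original table-lock code nor how Quartz clustering works.

```java
import java.util.concurrent.atomic.AtomicReference;

class LeaderLockSketch {

    // Stands in for a single lock row in a database table; null means the lock is free.
    private final AtomicReference<String> owner = new AtomicReference<>();

    // Returns true if nodeId took the lock now or was already holding it.
    boolean tryAcquire(String nodeId) {
        return owner.compareAndSet(null, nodeId) || nodeId.equals(owner.get());
    }

    // Frees the lock, but only if nodeId is the current holder.
    void release(String nodeId) {
        owner.getAndUpdate(current -> nodeId.equals(current) ? null : current);
    }

    // Runs the job only on the instance that holds the lock; reports whether it ran.
    boolean runIfLeader(String nodeId, Runnable job) {
        if (!tryAcquire(nodeId)) {
            return false; // another instance is the leader, so this one skips the run
        }
        job.run();
        return true;
    }
}
```

A real deployment also has to decide what happens when the leader dies without releasing the lock, for example with a lease that expires; that part is left out here.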

HighLoad++ conference 2015

UPD1: Much more about high performance computing can be found in this book. This year I was invited to the HighLoad++ conference in Moscow as a speaker. My session was on 2nd November in hall number 1; you can check the summary of my presentation here. I enjoyed my session very much: a lot of specialists came from different sectors and I was pleased to answer their questions. I even continued talking with participants after my session. Here you can find my full presentation on SlideShare. Of course, I also listened to a few talks and I have to mention some of them. The sessions from the companies 2 Sigma and Alibaba were very interesting. 2 Sigma described how they use and manage their cluster with Apache Mesos. There were also 2 days of non-stop sessions on PostgreSQL, with a lot of information for developers and DBAs. I also learned a few new things, such as "competition"-based machine learning, which Avito is using. The company Hawq also introduced their new SQL engine for Hadoop...

Ingest data from Oracle DataBase to ElasticSearch

In one of my earlier blog posts I described Oracle Database change notification and its use cases. When you need context search or need to count facts over your data, you certainly need a modern Lucene-based search engine. Elasticsearch is one such search engine that provides this functionality. However, since earlier versions of ES, the Elasticsearch river has been deprecated as a way of ingesting data into ES. Here you can get the whole history of the river deprecation in ES. Anyway, for now we have two options to ingest or import data from any source into ES: 1) Implement or modify your DAO services so that they update data in ES and the database at the same time. 2) Polling: implement a job that polls the data at some interval and updates it in ES. The first approach is the best option to implement; however, if you have legacy DAO services or a 3rd-party application that you cannot change, it is not for you. Polling the database frequentl...
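The polling option (2) can be sketched roughly as below. This is a minimal outline under stated assumptions, not the implementation from the post: `Row`, `Source` and `Sink` are hypothetical types standing in for a database query ("rows changed since version X") and for an indexing call to ES, and the `version` column used as a watermark is also an assumption. Only JDK classes are used.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PollingIngestSketch {

    // Hypothetical row: id, payload and a monotonically increasing change version.
    static final class Row {
        final long id;
        final String body;
        final long version;

        Row(long id, String body, long version) {
            this.id = id;
            this.body = body;
            this.version = version;
        }
    }

    // Hypothetical stand-in for a database query returning rows with version > the argument.
    interface Source {
        List<Row> changedSince(long version);
    }

    // Hypothetical stand-in for the call that writes one document to the search index.
    interface Sink {
        void index(Row row);
    }

    private final Source source;
    private final Sink sink;
    private long watermark = 0L; // highest version already pushed to the sink

    PollingIngestSketch(Source source, Sink sink) {
        this.source = source;
        this.sink = sink;
    }

    // One polling cycle: fetch changes past the watermark, index them, advance the watermark.
    synchronized int pollOnce() {
        List<Row> rows = source.changedSince(watermark);
        for (Row row : rows) {
            sink.index(row);
            watermark = Math.max(watermark, row.version);
        }
        return rows.size();
    }

    // Schedules pollOnce() with a fixed delay; the caller shuts the executor down when done.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(this::pollOnce, 0, periodSeconds, TimeUnit.SECONDS);
        return executor;
    }
}
```

Keeping a watermark means each cycle only asks for rows changed since the previous one, which limits how much work every poll puts on the database.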