Posted on
December 18th, 2011
Nowadays almost every application deals with a decently large amount of data. Whether the application belongs to the e-commerce, travel, learning, finance or social networking domain, it holds a good amount of customer data, feedback data, financial records and so on. The rise of frameworks in the NoSQL and distributed computing/storage field has also boosted applications' urge to store all the data they can. Servers running on commodity hardware can now easily support storage of billions of records, so it's no big deal for an application to store whatever it can.
The result of all this is a huge heap of records persisted on hard disks in various clouds hosted all around the world. And the world is happy keeping it there, adding more and more data with every passing day. Apache Mahout can help increase that happiness.
Apache Mahout is a machine learning framework. Don't get scared by the words "machine learning". It's simple. You don't need to go through all the statistical, algorithmic and mathematical formulas to use Apache Mahout. It implements the algorithms for you, tells you which algorithm can help in which business scenario, and gives you Java classes and CLIs (command-line interfaces) to run the algorithms.
Now, let's see what it can do.
Recommendation:
Suppose you have an application that sells books. The application stores data on who bought which book. This data is available anyway, since customer and purchase records have to be kept.
Now, readers generally have similar preferences. A person buying Flex books might be interested in an HTML 5 book. A person buying a book on Hadoop might be interested in a book on HBase. But how would you know that? Read more »
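One simple way to approach that question, sketched here in plain Java rather than with Mahout itself, is to count which books show up together in the same customer's purchase history. The class name `CoPurchaseSketch`, the method `boughtTogether` and the sample customers are all made up for illustration; this is not Mahout's API, only a rough picture of the kind of co-occurrence counting a recommender can build on.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class CoPurchaseSketch {

    // Hypothetical helper: for every customer who bought `book`,
    // count each other book found in that customer's purchases.
    static Map<String, Integer> boughtTogether(Map<String, Set<String>> purchases, String book) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> basket : purchases.values()) {
            if (!basket.contains(book)) {
                continue; // this customer never bought the book we are asking about
            }
            for (String other : basket) {
                if (!other.equals(book)) {
                    counts.merge(other, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up purchase records: customer -> books bought.
        Map<String, Set<String>> purchases = new HashMap<>();
        purchases.put("alice", new HashSet<>(Arrays.asList("Hadoop", "HBase")));
        purchases.put("bob", new HashSet<>(Arrays.asList("Hadoop", "HBase", "Flex")));
        purchases.put("carol", new HashSet<>(Arrays.asList("Flex", "HTML 5")));

        // Books most often bought alongside "Hadoop" get the highest counts.
        System.out.println(boughtTogether(purchases, "Hadoop"));
    }
}
```

With this sample data, HBase is counted more often than Flex for Hadoop buyers, so it would be the stronger suggestion. A real recommender also has to deal with scale, popularity bias and sparse data, which is where a framework helps.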
Posted on
July 31st, 2011
In this post I will explain the concept of MapReduce and how Hadoop uses it. I will mostly talk about what Hadoop is, what it can do, when to use it and so on. So, let's try to associate Hadoop with the technical problem it solves.
Suppose the application you are working on has ever-growing data. Say it is a search engine collecting data from crawlers. After a certain time, the data will become so large that the database servers won’t be able to handle it. You will then buy more powerful database servers (machines) that can handle such data. But the data will keep growing, so you will again have to buy a more powerful and sophisticated machine, and this will go on until no more powerful machine exists. These ultra-muscular machines are pretty costly: they use special hardware that is not very common, which accounts for the high price.
Before looking at what Hadoop is and how it solves this problem, it's necessary to know what MapReduce is. So, let's begin.
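As a rough first picture of the idea, here is a word count written in plain Java in the MapReduce style. It is only a single-machine sketch and does not use the Hadoop API; the class `MapReduceSketch` and its methods `map`, `reduce` and `run` are names assumed for illustration. The map step turns each input line into (word, 1) pairs, the pairs are grouped by word, and the reduce step adds up the values for each word.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: combine all values that were emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: map every line, group the pairs by key, then reduce each group.
    // In a real cluster the grouping step moves data between machines.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two made-up lines standing in for crawler output.
        System.out.println(run(Arrays.asList("crawl the web", "index the web")));
    }
}
```

Because each call to `map` looks at one line only and each call to `reduce` looks at one key only, the work can in principle be split across many ordinary machines instead of one very powerful one.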
Read more »
Posted on
June 10th, 2011
The application I am working on these days makes extensive use of JBoss Drools. The application deals with lots of objects on which complex logic is applied in sequence. This part has been developed using JBoss Drools. The keywords here are "lots of objects" and "complex logic in sequence".
These two keywords, when found together, often spell trouble for developers.
However, a few days back, when I profiled the application, I was a bit amazed to see that "complex logic" on "lots of objects" was not taking that much time. And trust me, this is not what I was expecting. Before profiling, I was almost sure that this "logic on lots of objects" part was the bottleneck in the application's performance. So, discovering that it was efficient was a bit of a shock.
OK, I accept it. It's fast. The profiler cannot be wrong. But how is it that fast?
To answer this question, I explored the internals of JBoss Drools. How does JBoss Drools actually work? What brings in this efficiency? I got a few answers when I explored how it works. I will share my understanding of its architecture, which, I think, makes it efficient in both memory and performance.
JBoss Drools uses the Rete algorithm to execute rules. The Rete algorithm is an efficient pattern-matching algorithm.
JBoss Drools has its own implementation of the Rete algorithm. A rule in Drools is represented by a Rete tree. Read more »
Posted on
April 26th, 2011
This Sunday (24th April) Xebia organized a session on Test First Development by Dr. Venkat. He is a well-known personality in the Agile and Extreme Programming field. He has authored several books and conducted numerous workshops on several areas of Agile.
The session consisted of three parts: first a question-and-answer round, then an interactive TDD workshop, and finally an open discussion.
Several topics were discussed throughout the day, and Dr. Venkat as well as the audience shared their thoughts on them. The discussion started with the importance of creating an environment that makes people do the right thing, rather than forcing them to do it. Dr. Venkat put great emphasis on creating a healthy environment that can motivate people. He said that forcing people to do things without letting them know why we are doing them never works in the long run. People will start doing it, then slowly lose enthusiasm because they do not know the reason behind it, and after some time they will stop doing it altogether. However, creating an environment where it feels natural to do certain things really helps. He cited examples from his own life where he used this approach and it worked. The right environment can be created through leading by example and assertive communication, to name a few ways. Read more »
Posted on
April 6th, 2011
Nowadays, when Agile is at its peak and JUnit tests hijack technical discussions, it's very easy to find people obsessed with JUnit tests. You can really sense the excitement in people when they talk about them.
I consider this one of the major achievements of the Agile promoters. They have been really successful in transfusing JUnit into the blood of developers, managers and even clients. I often see questions about code coverage in demos. And then I see the team writing JUnit tests, one after the other, to push the coverage above the client's expectations.
However, do most of the people who are into this highly geeky practice of writing JUnit tests really write tests that are worth even a penny?
Read more »