The Rule or the Criteria Object Pattern

Enterprise software systems are an interesting case study for object-oriented analysis and design. The domain is complicated enough to warrant thinking in terms of objects, and enterprise systems tend to evolve over time. Requirements change, and keeping the entire system consistent is difficult because multiple teams have usually worked on it. Thus most of the enterprise systems that I have seen tend to be object-oriented.

Enterprise systems tend to be very rule-oriented. For example, “raise an alert if today is m days from an event” is a very common kind of requirement in a business system. These rules are also prone to change: rules for a desired result can be added or removed as requirements change. Modelling such rules is a challenge.

The easiest way to express a rule is the if-then-else statement that most programming languages support. That is the easy step in the process. Finding a natural home, that is, an object, for the if statement can be quite tricky.
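
As a minimal sketch, the alert rule from the example could be written as a single comparison. The class name AlertRule and its method are hypothetical, chosen only for illustration, and the sketch assumes that “m days from an event” means exactly m days before it.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical rule object; not taken from any real system.
final class AlertRule {

    // The "m" in "m days from an event".
    private final long daysBeforeEvent;

    AlertRule(long daysBeforeEvent) {
        this.daysBeforeEvent = daysBeforeEvent;
    }

    // Assumption: the alert fires only when the event is exactly m days ahead.
    boolean shouldRaiseAlert(LocalDate today, LocalDate eventDate) {
        return ChronoUnit.DAYS.between(today, eventDate) == daysBeforeEvent;
    }
}
```

Writing the condition is the easy part; deciding where a class like this should live is the harder question.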

Business objects are never the home for rules. It is best to keep business objects as simple and straightforward as possible and to add only logic that makes sense in all the contexts in which the object is used. For example:

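class Message {

   // A priority of -1 marks a message that has not been prioritized.
   private final int priority;

   public Message(int priority){
        this.priority = priority;
   }

   // Derived directly from the representation of priority.
   public boolean isPrioritized(){
        return priority != -1;
   }

   // Depends only on the behavior of Message, not on its representation.
   public boolean shouldRaiseDeliveryStatus(){
        return isPrioritized();
   }

}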

The method isPrioritized deals only with data from this class and is derived directly from the representation of priority (the int priority variable). Hence it is a candidate for a member function. The other method, shouldRaiseDeliveryStatus, does not depend on the representation of priority but on the behavior of the Message (namely isPrioritized). This means that shouldRaiseDeliveryStatus need not be a member of this class and can be a member of some other class.

Let's not be architecture astronauts here, and instead try to figure out some properties of the class to which shouldRaiseDeliveryStatus belongs. For one, the class should obviously have access to a Message. Which other members belong to the new class is a question of context. If the rule behind raising the delivery notification is complex enough, it warrants a new kind of class, which I would like to call the Criteria Pattern.

The Criteria Pattern, as the name implies, models a business rule. When thinking of any new pattern, I always try to analyze how consistent it is with known OO techniques. The Criteria Pattern forms a part of the stable intermediate forms discussed by Grady Booch, and it is also a candidate for the ubiquitous language discussed by Eric Evans. As these are the only two major texts on OOP that I have read, this is the limit of my knowledge.

Back to the example. Let us say that the business rule for raising a delivery status message changes, and a delivery status must now be raised only when a message is prioritized and the system message queue is not full. Whatever class makes the decision must then hold a reference to the system queue, and the Message class is not a very good place for that. Usually, such code is moved to some service class which checks the criteria and raises the message. This means that the service class has domain logic embedded in it, which does not bode well for the rich domain model put forth by Eric Evans in DDD. A criteria class that can be traced directly back to the domain can be the new home of such a rule.

class RaiseDeliveryStatusCriteria {

        private MessageQueue messageQueue;

        @Autowired
        public RaiseDeliveryStatusCriteria(MessageQueue systemMessageQueue){
                this.messageQueue = systemMessageQueue;
        }

        public boolean shouldRaiseDeliveryStatus(Message message){
                return messageQueue.isNotFull() && message.isPrioritized();
        }

}
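
To show how a caller might use such a criteria object, here is a minimal, self-contained sketch. Message and MessageQueue are repeated as simplified stand-ins so that the example compiles on its own, and DeliveryStatusService with its counter is a hypothetical caller invented for illustration.

```java
// Simplified stand-ins so the sketch compiles on its own.
final class Message {
    private final int priority;

    Message(int priority) {
        this.priority = priority;
    }

    boolean isPrioritized() {
        return priority != -1;
    }
}

interface MessageQueue {
    boolean isNotFull();
}

final class RaiseDeliveryStatusCriteria {
    private final MessageQueue messageQueue;

    RaiseDeliveryStatusCriteria(MessageQueue systemMessageQueue) {
        this.messageQueue = systemMessageQueue;
    }

    boolean shouldRaiseDeliveryStatus(Message message) {
        return messageQueue.isNotFull() && message.isPrioritized();
    }
}

// Hypothetical service: it asks the criteria and acts, but holds no rule itself.
final class DeliveryStatusService {
    private final RaiseDeliveryStatusCriteria criteria;
    private int raisedCount = 0;

    DeliveryStatusService(RaiseDeliveryStatusCriteria criteria) {
        this.criteria = criteria;
    }

    void process(Message message) {
        if (criteria.shouldRaiseDeliveryStatus(message)) {
            raisedCount++; // stands in for actually raising the delivery status
        }
    }

    int raisedCount() {
        return raisedCount;
    }
}
```

If the rule changes again, only RaiseDeliveryStatusCriteria has to change; the service and the Message stay as they are.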

The importance of not Seeing

For a long time now, I have been pondering how to write crisp unit tests. On one hand, it seemed that having everything you wanted to know about the test in one method was good, but experience in reading code (that's right, reading code) has taught me that test methods with a lot of variables are hard to understand.

Honestly, I simply could not see how I could present (to myself) the importance of keeping a test, or for that matter any piece of code, simple. But recently a series of “spot the” type of images that cropped up on Facebook gave me the breakthrough idea.

The following is a map of Silicon Valley. Quick question: how long does it take you to spot the Sun Microsystems logo?

I would expect that it took a scan of at least three fourths of the picture to spot it, unless you were already familiar with the map. Another question: surely you noticed the logo of some company you know, and once you had spotted that logo, some facts about the company flashed across your mind. How many times did this happen? Every time it happens, it takes you longer to find what you are looking for.

This might seem to have nothing to do with programming. As it turns out, it is alarmingly similar. Here is a piece of code that is pretty common in unit tests. It is written with JUnit and uses Mockito for mocking dependencies, but I guess the code is self-explanatory enough.

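import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.when;

import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Mock;
import org.mockito.junit.MockitoJUnitRunner;

@RunWith(MockitoJUnitRunner.class)
public class IncomeTaxCalculatorTest {

        @Mock
        IncomeTaxTable incomeTaxTable;

        IncomeTaxCalculator incomeTaxCalculator;

        @Before
        public void setUp(){
                // Build the calculator only after Mockito has created the mock.
                incomeTaxCalculator = new IncomeTaxCalculator(incomeTaxTable);
        }

        @Test
        public void shouldCalculateIncomeTax(){
                Employee employee = new Employee();
                employee.setStatus("married");

                PayPackage payPackage = new PayPackage();
                payPackage.setBasicPay(20000);
                payPackage.setFlexiblePay(30000);
                payPackage.setHRAPercentage(0.10);

                employee.setPayPackage(payPackage);

                int manuallyCalculatedTaxableIncome = 27000;

                when(incomeTaxTable.getTaxPercentageFor(manuallyCalculatedTaxableIncome)).thenReturn(20);

                double result = incomeTaxCalculator.compute(employee);

                double manuallyComputedIncomeTax = 2000;

                // Doubles are compared with a tolerance.
                assertEquals(manuallyComputedIncomeTax, result, 0.001);
        }

}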

Reading this test, it seems to me that there is no visible relationship between the net taxable income and the basic and flexible pays. The argument is that if it is not necessary to know that relationship in the test, the basic and the flexible pays should not be present in the test.

As I explained in my previous post, the primary value that a unit test adds to code is that it represents an understanding of the class being coded. Tests such as the one above fail miserably in that respect.

For argument's sake, let us say that the primary reason for a unit test is to verify the class under test. In that respect too, the above test fails. I can assure you that if the above test passes, the class will work for the given example inputs. But there is no way of determining whether it will work for real-world inputs. There are plenty of techniques to make such assertions, but back to the topic.

Now how would it read if the test were written something like this?

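import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.when;

import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Mock;
import org.mockito.junit.MockitoJUnitRunner;

@RunWith(MockitoJUnitRunner.class)
public class IncomeTaxCalculatorTest {

        @Mock
        IncomeTaxTable incomeTaxTable;

        IncomeTaxCalculator incomeTaxCalculator;

        @Before
        public void setUp(){
                // Build the calculator only after Mockito has created the mock.
                incomeTaxCalculator = new IncomeTaxCalculator(incomeTaxTable);
        }

        @Test
        public void shouldCalculateIncomeTax(){
                // Group 1 employee is married and has taxable income > 40,000 & HRA is 10 percent
                Employee employee = new Group1Employee(40000); // The net taxable income

                when(incomeTaxTable.getTaxPercentageFor(employee)).thenReturn(20);

                double result = incomeTaxCalculator.compute(employee);

                double manuallyComputedIncomeTax = 2000;
                assertEquals(manuallyComputedIncomeTax, result, 0.001);
        }

}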

The IncomeTaxCalculator does not care about the actual values of the basic and flexible pays. All it cares about is the net taxable income. Likewise, the IncomeTaxTable does not care about the actual value of the net taxable income. It just wants to know whether the employee is in Group 1 in order to give out the tax percentage.

The primary difference between the two pieces of code is that, although both test a class that behaves in the same way, the second test contains a lot less detail.
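
One way to hide that detail is a named fixture class. The sketch below is an assumption about how such a fixture could look, not the code behind the test above: every definition here is a hypothetical stand-in, the tax formula (net taxable income times percentage) is invented for illustration, and a lambda replaces the Mockito mock so that the example runs without any library.

```java
// Hypothetical stand-ins; none of these definitions come from the original project.
class Employee {
    private final String maritalStatus;
    private final int netTaxableIncome;

    Employee(String maritalStatus, int netTaxableIncome) {
        this.maritalStatus = maritalStatus;
        this.netTaxableIncome = netTaxableIncome;
    }

    String maritalStatus() {
        return maritalStatus;
    }

    int netTaxableIncome() {
        return netTaxableIncome;
    }
}

// Named fixture: the test states the group, and the set-up details live here.
final class Group1Employee extends Employee {
    Group1Employee(int netTaxableIncome) {
        super("married", netTaxableIncome);
    }
}

interface IncomeTaxTable {
    int getTaxPercentageFor(Employee employee);
}

final class IncomeTaxCalculator {
    private final IncomeTaxTable incomeTaxTable;

    IncomeTaxCalculator(IncomeTaxTable incomeTaxTable) {
        this.incomeTaxTable = incomeTaxTable;
    }

    // Assumed formula, for illustration only.
    double compute(Employee employee) {
        return employee.netTaxableIncome() * incomeTaxTable.getTaxPercentageFor(employee) / 100.0;
    }
}
```

A test can then be two lines long: stub the table with `employee -> 20`, create a `Group1Employee`, and compare the result of `compute` against a hand-computed figure.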

Aha. As with everything else, in programming too the “devil is in the details”. The less you bombard yourself with unnecessary details, and the fewer unnecessary details you “see”, the better you understand something. This is as true of unit tests as of everything else. What if the map had only Sun's logo on it? We could have spotted it immediately.

Writing correct software is hard. As Booch points out in his excellent book, “Complexity is an essential property of software. We can but only master it”. And mastering complexity is what we do when we consider only the details that we immediately need, forgetting for the time being those that we don't require right away.

Quote for the day,

“Every closed eye is not sleeping, and every open eye is not seeing.” - Bill Cosby

The Craft of Unit Testing

Context

This post presents the value that unit testing adds, describes the state of practice in the project that I am working on, and lays the foundation for a future post on a new way of thinking about and writing unit tests.

At ThoughtWorks, it is customary to follow Test Driven Development (TDD). It is believed that TDD increases the probability of creating correct, readable and maintainable code, which are valued in that order.

Minimal upfront design

XP, the process that we follow, argues for design decisions to be delayed for as long as is sensible. Minimal upfront design leaves the pair working on the task to come up with as much design as they deem necessary. This involves deciding the structure and behavior of all the units necessary to implement the feature, plus refactoring other units whose design does not fit the current context.

Overload

The need to come up with a design, together with the pressure to complete the task at hand, leads to a lot of conversations and some frustration when a pair begins to play a task. Unit testing is suggested as a way to avoid vague and metaphysical questions like “should this belong here” that tend to crop up in “design discussions”. So the primary value derived from unit testing is, ironically, not correctness. Unit testing provides a formal way of thinking about the unit and its responsibilities, which increases the probability that the unit is correct.

A Heuristic for design quality

What unit testing does to help a pair in discussions about design is provide a ready heuristic for reasoning about the design.

The rule of thumb is that if a pair finds it difficult to write unit tests for a class, then there is something wrong with the class.

The rationale behind the rule goes like this. A unit test represents an “understanding” of the way a class behaves in the given context. This implies that if it is difficult to write a unit test, it is difficult to understand the unit, and thereby there is a greater chance that the unit is poorly designed.

Excellent said I, Elementary said He

Difficulty in writing a test is not a very good heuristic, for what is easy for one pair may well be difficult for another. There is another, more serious problem with difficulty as a heuristic.

Ignorance is bliss

The heuristic does not capture the coverage provided by the test. I have to make it clear at this point that I don't consider code coverage an indicator of, well, anything. The coverage that I'm talking about here is state coverage. Can you assure me that a class is correct for all possible states that come under its purview? This seems to be a very big ask, but I believe that it is possible using proper abstractions. That is a topic for another post.

This shortcoming implies that a pair can decide not to test the unit rigorously and still pass the unit off as easily testable. Difficulty is so subjective that using it as a heuristic is difficult, to say the least.

Mind your language

Since a unit test is a language for reasoning about the unit under test, it is best kept as simple as possible. The major complexities in a unit test arise around data setup and assertions. Good frameworks like factory-girl and Hamcrest enable data creation and assertions at the level of abstraction needed to keep the language of the test simple.

Wrapping up

The use of frameworks notwithstanding, the rigor and complexity involved in creating a rigorous test are major indicators of the complexity of the unit. As the complexity of the test increases, so does the complexity of the unit, which suggests that the unit is poorly designed. Since complexity in itself is not a good heuristic, a better one is needed to make good decisions on the design of the unit.

These are some of the parameters that I have tried to reconcile for some time, which resulted in a new format for writing unit tests. More about it in subsequent posts.

Quote for the day

“Faith is mortal. Reason is Divine”. – Unknown source

Bounded Context

Computer programming is one of my most precise endeavors. Contrary to popular belief, programming is not an art that can be acquired easily by mere mortals like me. I firmly believe that it will take years of practice on my part to write good programs.

The primary difficulty I have with programming is not so much writing code as coming up with the steps to solve a problem. I find that during such problem-solving sessions, I concentrate so much on “trying” to solve it that I completely neglect to reflect upon “what am I trying” to do.

In all the problems that I have attempted to solve and failed, I have found one very common thing. In 7 out of 10 cases, the basic approach I took to solve the problem was correct, but upon proceeding further, I lost track of the goal and ended up not solving the problem at all. In 2 out of 10 cases, the basic approach was so obviously wrong that I could have gone back to the correct path just by realizing that fact. In the remaining 1 out of 10 cases, I was unable to even comprehend the problem.

I'm currently exploring a way to address this issue. I intend to do so by observing how the authors of good books explain a concept. I find that explaining a concept needs a lot of articulation, which can be applied to problem solving. I have observed that the primary technique they use is a “basis”. A basis is a set of very simple concepts that can be used to explain a complex concept. Then they provide examples for any abstract concept or rule. Then they explore the characteristics of the concept in question in relation to the basis. This set of basis, examples and comparison with the basis is what I would like to call a “bounded context”.

The creation of a bounded context is a difficult but stimulating process. When an author presents something in a book, he or she has already developed a bounded context. It is highly probable that the author had a good understanding of the subject under study. This means that the author already knows the set of basics needed to describe the concept. But when you are trying to understand a concept, it is not always clear which set of principles is needed to describe it.

Currently these are the problems that I'm trying to address. It is my firm belief that creating a bounded context is an essential skill for problem solving. A bounded context allows us to represent what is unknown in terms of what is known. This greatly reduces the amount of information that needs to be processed when trying to reason about the new concept. This aids understanding, which is essential for problem solving, which in turn is essential for programming. After all, as Stroustrup points out, “Programming is understanding”.

Quote for the day

“Complexity is an essential property of a software system. We can but only master it” – Grady Booch

Implementation of Adherence Module Part – I

I am working on the Motech Project. We are creating a platform upon which people can build software to support operations in the health care sector. Adherence to dosage, clinic visits and so on is important data in the health care domain. This post is a technical one that elaborates on the creation of a generic module for storing adherence information.

Adherence is, conceptually, (number of actual occurrences)/(expected number of occurrences) of any event. The event can be visiting the clinic on a specific date, taking a dose on time and so on. Though conceptually very simple, adherence proved to be quite complex to model.

Adherence is continuous in nature. The adherence to a dosage for example depends upon data recorded from the time a patient was enrolled into the dosage. Let us assume that the patient has to take a dose every day according to the dosage, then adherence on fifth day is (number of doses patient has actually taken over five days) / 5.
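
As a worked example of the ratio above, here is a minimal sketch. The class and method names are hypothetical and are not part of the Motech code base.

```java
// Hypothetical helper, for illustration only.
final class Adherence {

    private Adherence() {
    }

    // adherence = (actual occurrences) / (expected occurrences)
    static double ratio(int actualOccurrences, int expectedOccurrences) {
        if (expectedOccurrences <= 0) {
            throw new IllegalArgumentException("expected occurrences must be positive");
        }
        return (double) actualOccurrences / expectedOccurrences;
    }
}
```

For a patient who has taken 4 doses by the fifth day of a daily dosage, `Adherence.ratio(4, 5)` gives 4/5, that is, 80 percent.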

A simple implementation is to record the total number of doses taken and persist that information. Then, when adherence is needed, read the number of doses taken from the db, figure out the number of doses the patient is supposed to have taken from the dosage information, and calculate the adherence. There were some very serious implications when this model was followed.

For one, let us assume that the system is down for some time. The patient then cannot record the adherence, but the system would still assume that the patient has not taken the doses, and adherence would thereby decrease through no fault of the patient. Combined with the fact that adherence on specific dates in the past was a point of interest, this meant we could not go ahead with this model.

Then we decided to record, for each day, whether the dose was taken. Though this did not ensure that adherence would not decrease when the system is down, we could calculate the adherence on any given date. This was a major plus.

We still had to solve the problem that the adherence percentage should not decrease when the system is down. To overcome this, we decided to create a log recording whether the dose was taken or not taken. Adherence would then be (number of taken logs)/(total number of logs). When the system is down, logs are not created, and thus the adherence percentage is maintained.
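
The log-based model can be sketched as follows. This is a minimal illustration under stated assumptions, not the Motech implementation: DoseLog and DoseLogBook are hypothetical names, the logs are kept in memory rather than in a db, and an empty result is returned when there are no logs to count.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical names throughout; one log records one dose.
final class DoseLog {
    final LocalDate date;
    final boolean taken;

    DoseLog(LocalDate date, boolean taken) {
        this.date = date;
        this.taken = taken;
    }
}

final class DoseLogBook {
    private final List<DoseLog> logs = new ArrayList<>();

    // Called once per dose, whether it was taken or not.
    void record(LocalDate date, boolean taken) {
        logs.add(new DoseLog(date, taken));
    }

    // Adherence = (taken logs) / (total logs), counting logs dated on or before asOf.
    OptionalDouble adherenceAsOf(LocalDate asOf) {
        long total = 0;
        long taken = 0;
        for (DoseLog log : logs) {
            if (log.date.isAfter(asOf)) {
                continue;
            }
            total++;
            if (log.taken) {
                taken++;
            }
        }
        return total == 0
                ? OptionalDouble.empty()
                : OptionalDouble.of((double) taken / total);
    }
}
```

A day on which the system was down simply has no log, so it affects neither the numerator nor the denominator, and adherence for any past date can still be computed from the dated logs.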

Following the above model severely restricted the system in a different way. For the model to work, each log must record the adherence information of one and only one dose. If a log recorded adherence for more than one dose, the date when each dose was taken would have to be stored in a clumsy way. This does not bode well, since the queries on the model are date-based.

This restriction, combined with the fact that adherence had to be captured on a daily as well as a weekly basis, and with the requirement that the switch from daily to weekly be consistent, forced us to look at a completely different way to model adherence.

Feedback generators

When programming, I have observed that the time I spend figuring out the skeleton and structure of a program is much longer than the time I need to write the program. Some people argue that this is a good thing, that people should think about what they do before they actually do it. But for an aspiring craftsman like me, code is what gives the greatest pleasure. There are a lot of cases where I tried to think things through before actually coding and failed miserably.

For one, I decided to write a Twitter client using Adobe AIR just for the fun of it. I spent so much time figuring out what the application should do that, although I created a repository on GitHub, I eventually did not manage to check in a single line of code.

There are certain myths about thinking through what you have to do that need to be broken. For one, you can never think or reason about something that you have no idea about. Of course, you have some idea about programming, and I presume that is the reason why you are reading this post. But back to the point: even with a sound understanding of programming, it is very difficult to come up with a structure for the program you have to write NOW. This is because each program is unique, though there might be some commonality.

    To overcome traps such as these, I find it useful to create some concrete artifacts that give you an idea of how to proceed. The following are some artifacts that I find to be really useful. I call them feedback generators.

Code: Nothing provides as much clarity as the first line of code that you write.

Unit test: Feel like the code is in the wrong place? Write a unit test. If you need to set up a lot of variables before you can get to the assertion, you have written the code in the wrong class.

Comment: Though code is the most rigorous and most helpful artifact, I sometimes find that enumerating a set of tasks as comments is quite helpful.

Hypothesis: I have found a hypothesis to be really useful in pinpointing the source of a defect. If it is correct, gotcha! If it is incorrect, you will probably have a better understanding of the defect by the time you find hypothesis 1 to be wrong.

The Dumbest solution: I have in some cases found it useful, to think about something that is so obviously wrong, but it eventually leads to a better solution.

Similar solution:  Solutions to similar problems are tremendously helpful resources.

Wish list: When writing a program, we create data structures. When writing algorithms around those data-structures, it is tremendously useful to create a wish list of information that you would have liked the data structure to have to make your life easier.

Some artifacts that I have found to be tremendously wasteful:

Writing down and mind maps: These just reflect what you think, and have never helped me to figure out the solution to a problem or to write a program.

The counter example: When I am writing a program or solving a problem, I try to find a failure case as soon as possible. The disadvantage of this is that I don't see the few things that I am doing correctly, and I end up back at square one.

Metaphysical questions: For example, should the Person know about a test? Nothing is as wasteful as this. If you end up in this discussion, write some tests that consume the information and see whether it should or should not have a reference. I have also found normalization to be a good technique for answering these kinds of questions.

The important thing to remember when using feedback generators is that one should start reasoning based on them once they are created. If one goes off on a tangential line of reasoning, the feedback generator will be ineffective and will have eaten up precious time.

I am trying to apply these feedback generators in earnest whenever I write code. I will post back with updates on how well I fare.

Quote for the Day,

“If you want to survive here, have an opinion” – House

Parallels between Essay Writing and Software Writing

Good essay writing is very similar to good software writing. An essay and code are both very structured. An essay has words, sentences, paragraphs and themes. Code has expressions, methods, objects and modules. An essay and code, when written correctly, are both a pleasure to read. The properties of a good essay apply to code in equal measure.

Each sentence in a good essay makes sense in its own right, but at the same time it maintains an uninterrupted flow with the sentences before and after it. The same is true for a paragraph. Each expression in code must make sense in its own right, and at the same time it should maintain an uninterrupted flow with the expressions before and after it.

Each paragraph in an essay describes one and only one concept. Each method in a good code does the same. A paragraph, that adds a snide comment on another concept described elsewhere is a pain to read. The same is true for a method. Conversely, each concept is described in a set of consecutive paragraphs. Similarly, each concept in code is described in an object.

Good essay and good code, they both flow. Neither throws stuff at the reader which he/she does not understand. If they are trying to describe a concept that the reader does not understand yet, they do so gradually, showing the reader every link in the chain from the understood to the less understood.

Quote for the day,

“Programming is understanding” - Bjarne Stroustrup.

The dogmas of formal education

It is a topic that has been debated heatedly over the last 10 years. Formal education: is it of any use? There are many examples that exemplify both ends. There have been people like Dennis Ritchie, a Harvard graduate, and Bill Gates, who dropped out but went on to become the richest man in the world.

I, for one, have observed that formal education, in 9 cases out of 10, transforms the lives of people. It is my belief, therefore, that formal education is necessary. But the fact is that, initially, I found it very hard to apply whatever I learnt at college to the job at hand. Slowly, I began to realize that it was what I took out of my schooling that made it impossible to use what I had learnt.

In my high school, education was a very simple equation. X number of lines you remember, your marks increase by a factor of X. X number of formulas you remember, X number of problems you can solve and again your marks increase by a factor of X. X number of problems on linear equations you practice, your mark increases by a factor of X.

This approach made college very difficult for me. I still remember the first C programming exam that I took. The problem was a simple one: something to do with finding the minimum of three letters, which were the keys to three columns in a 2-d array, and printing out the columns in the corresponding order. I sat there mortified. It did not make sense. I knew the algorithm to find the minimum of a list perfectly well. It was there in my observation notebook. Eh?

It just did not strike me, that the problem at hand was the same problem back in my observation notebook. This is the greatest shortcoming of my formal education. I just did not pay attention to the one thing that will be of help to me. I did not pay attention to mathematics.

Lest I give the reader a very wrong idea: given a problem from the mathematics text book, I knew how to arrive at the solution by applying a bunch of formulas. My entire attention was on the procedure to arrive at a solution. It made the most sense to me because 1) this was what the teacher focused on and 2) the procedure and the solution were what was going to fetch me marks. But a memorized procedure and solution are of very little use in the real world when you are trying to solve a problem, simply because you do not know the procedure.

What I failed to notice was the order and method through which the procedure was created. It is all too easy to remember the derivation of E=mc^2, but this does not make anyone Einstein. There is analysis that goes into coming up with an algorithm. No one can just sit, think about how to solve a problem and solve it. One must analyse the input, figure out its nature, its properties, its relation to other known concepts and its relation to the output. Only then, it is my belief, can one see the way.

But this is what a mathematics text book always specifies. It mentions the input, its relations and so on, and you just apply the formulas. In the real world, no one is going to write down the properties of the input. No one is going to specify the fact that the input is a Fibonacci sequence. No one is going to write down the input as a nice and clean linear equation. No one is going to give the input in a nice and neat linked list and ask me to find the middle element. I have to figure that out for myself. That is where analysis comes in, and there should be order and method when I analyse a problem.

Time and again, I have caught myself jumping to the next step in the solution without taking time to analyze what I have in hand. It is just the way I have trained myself to look at problems, and it is my first goal to unlearn this and start analyzing the variables at hand. Towards this end, I have started watching the MIT OpenCourseWare lectures on algorithms, this time paying attention to the techniques that the professor uses to analyse the problem instead of the end solution. Formal education is not a waste. I just have to pay attention to the more worthwhile subjects.

Quote for the day,

“Projects don’t succeed by jeemboomba magic” – Dr.Nadarajan

Building systems

This is what I do for a living. I build systems. I don’t just build systems for a living; building systems has given me meaning and reason. Building systems has forced me to think. Too much is at stake when building a system, and money is just a small part of it. Just watching a system work flawlessly, with precision, gives me a sort of a high. There have been systems like Unix, Swype and Kinect which make one wonder if they were created by God Himself. I hope that one day the systems I build will give me the same sort of high. But for now, there is a great deal to unlearn and learn. I have to unlearn several dogmas that formal education has instilled in me and then learn the qualities that make a great systems engineer. I begin.
