Recently, I got an opportunity to drive the implementation of the reporting part of an enterprise application at Thoughtworks. This being my first such gig, and being morally responsible for the successful delivery and performance of the application, I had to observe and apply the methods that senior folks use to ensure that the application they are building is of the highest quality. What I was really curious to know was how my seniors "decide" to look into a particular aspect of the application.
A clear pattern emerged: most of them have what I like to call "Alarm Bells" that go off in specific situations. I have been trying to compile a catalog of such situations, and I document a few of them below.
Data Movement across systems
Data movement has the potential to cause a lot of problems in enterprise applications. When moving huge amounts of data across the wire, one needs to think about problems such as performance, consistency, duplication of data, and transactions.
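One common defence against duplicated data in transit is to make the receiving side idempotent. A minimal sketch, assuming each message carries a unique ID (the `IdempotentReceiver` name and the message shape are illustrative, not from any particular framework):

```python
class IdempotentReceiver:
    """Processes each message at most once, keyed by its unique ID."""

    def __init__(self):
        self._seen = set()   # IDs of messages already processed
        self.processed = []  # payloads accepted, in arrival order

    def receive(self, message):
        """`message` is assumed to be a dict with 'id' and 'payload' keys."""
        if message["id"] in self._seen:
            return False  # duplicate delivery; silently drop it
        self._seen.add(message["id"])
        self.processed.append(message["payload"])
        return True
```

A real system would persist the seen-ID set, or lean on database uniqueness constraints, so that the guarantee survives restarts.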
Data Read and Write
In all but the most trivial systems, there will be a need to read or write data over a slow resource like a disk drive or a network. Such large data reads and writes can quickly become a performance bottleneck in a large system. Software systems take advantage of a phenomenon in computer programs called locality of reference to enhance the performance of data I/O. For instance, in the case of disk drives, systems store data in such a way that chunks of bits that are most likely to be accessed together are stored in sequence. Disk drives being fundamentally sequential-access devices, storing related data together enhances I/O performance significantly.
Another part of programs that has the potential to keep people up at night debugging is resource handling. Resources such as connections, ports, and memory are always finite. In huge applications where several people are involved in development, it is very easy to forget to release a resource that is no longer in use. Over time, the system takes up all the resources and invariably goes down. In recent times, programming languages and runtimes have become more effective at handling resources, but this still remains largely the programmer's responsibility.
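One discipline that helps is to tie acquisition and release together in one place, so release happens even when the code in between fails. A sketch using Python's `contextlib` (the pool's `acquire()`/`release()` interface is a hypothetical one):

```python
from contextlib import contextmanager

@contextmanager
def acquired(resource_pool):
    """Hand out a resource and guarantee it is returned, even on error.
    `resource_pool` is assumed to expose acquire() and release()."""
    resource = resource_pool.acquire()
    try:
        yield resource
    finally:
        resource_pool.release(resource)  # runs even if the body raises
```

Callers then write `with acquired(pool) as conn: ...` and cannot leak the resource by taking an early return or hitting an exception.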
Another interesting aspect of resource handling is the creation of logical resources like connections to external systems such as databases. Creating these connections always involves an overhead, which can be greater than the time taken to get the data out of the database. Hence, in huge systems, such connections are routinely created in advance and pooled so as to avoid this overhead.
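The essence of pooling fits in a few lines. A minimal sketch, assuming the caller supplies a `connect` factory for the expensive connection objects (a fixed-size pool; production pools add health checks, timeouts, and growth policies):

```python
import queue

class ConnectionPool:
    """Pre-creates expensive connections and hands them out for reuse."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # pay the setup cost once, up front

    def acquire(self):
        return self._idle.get()  # blocks if the pool is exhausted

    def release(self, conn):
        self._idle.put(conn)  # return the connection for the next caller
```

The setup cost is paid once at startup; after that, `acquire()` is just a queue operation.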
Good system designers become very skeptical when they see a piece of code that requires operations to be performed in a specific order. This becomes a huge problem especially at higher levels of abstraction like services. A service that requires its consumers to invoke operations in a specific order is a poorly designed one, as there is a high probability that some consumer will forget to invoke the operations in the required order.
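One way out is to make the invalid call sequence unrepresentable, for example by moving the required setup into construction. A contrived sketch (the `ReportService` names are hypothetical, chosen only to illustrate the before and after):

```python
class ReportService:
    """Order-dependent: callers must remember to call connect() first."""
    def __init__(self):
        self._connected = False
    def connect(self):
        self._connected = True
    def fetch(self):
        if not self._connected:
            raise RuntimeError("connect() must be called before fetch()")
        return "report data"

class SafeReportService:
    """Order-independent: construction performs the setup, so no call
    sequence can ever observe a half-initialised service."""
    def __init__(self):
        self._connected = True  # setup happens exactly once, here
    def fetch(self):
        return "report data"
```

The second design gives consumers nothing to forget: every fully constructed instance is already in a usable state.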
As Kent Beck mentions, "Duplication is the source of all evil in programs". When there is duplication of artifacts, version control them; when there is duplication of logic, abstract it; and when there is duplication of process, automate it. Good systems developers abhor duplication. While there is certainly a case for eliminating duplication from systems, there are cases where a little duplication is useful. One such case is avoiding the use of singletons: duplicating an object can be preferable to sharing a single global instance, which could potentially lead to a lot of problems of its own.
As Fred Brooks points out, software is mostly invisible. Good systems make the operations of software visible by gathering and publishing metrics. The importance of making the operations of the system visible is brought out well in this post.
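Even a crude instrument beats flying blind. A minimal sketch of gathering metrics with a decorator (the dictionary names are illustrative; a real system would publish these to something like statsd or Prometheus rather than keep them in process memory):

```python
import time
from collections import defaultdict

CALL_COUNTS = defaultdict(int)      # how often each function ran
TOTAL_SECONDS = defaultdict(float)  # cumulative time spent in each

def measured(fn):
    """Record call counts and durations, making the otherwise-invisible
    runtime behaviour of the function observable."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            CALL_COUNTS[fn.__name__] += 1
            TOTAL_SECONDS[fn.__name__] += time.perf_counter() - start
    return wrapper
```

Decorating the hot paths of a system this way costs little and answers the first question every operator asks: what is this thing actually doing?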
It is very easy as programmers to come up with complex solutions to a lot of problems. For example, to find the anagrams in a given set of strings, you could generate all possible permutations of each string and compare them with the other strings. But the simpler solution is to sort the characters of each string and compare the sorted results. If you find yourself writing a complex computation, chances are there is a simpler way to do it.
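The sorted-characters trick in code, grouping anagrams by a canonical key (roughly O(n·k log k) for n words of length k, versus k! permutations per word for the brute-force approach):

```python
from collections import defaultdict

def group_anagrams(words):
    """Group words that are anagrams of each other by using each word's
    sorted characters as a canonical key."""
    groups = defaultdict(list)
    for word in words:
        key = "".join(sorted(word))  # "listen" and "silent" -> "eilnst"
        groups[key].append(word)
    return list(groups.values())
```

The complexity disappears once the right representation is chosen: two strings are anagrams exactly when their sorted forms are equal.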
Parallel programs synchronise when a set of tasks needs to be done in a sequential fashion. Though synchronisation is not a problem in itself, the techniques used to implement it come with their own problems. For example, mutual exclusion, one of the techniques used to synchronise parallel programs, comes with hard problems like deadlocks, livelocks, etc.
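Mutual exclusion in miniature: each increment below is a read-modify-write, and without the lock, concurrent threads could interleave and lose updates. A sketch using Python's `threading.Lock`:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    """The lock makes each read-modify-write atomic, so concurrent
    threads cannot interleave inside the critical section."""
    global counter
    for _ in range(times):
        with counter_lock:  # mutual exclusion around the shared counter
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This is also where the hard problems start: once code holds more than one lock, a common discipline for avoiding deadlock is to always acquire the locks in one fixed global order.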
Complex systems are full of such problems. Some of the problems are artificial, imposed on the programmers by themselves, but most of them are inherent. As Grady Booch points out, "We may master this complexity, but we can never make it go away".