Business rules are on the up and up - IBM planted a flag early on of course, and last month, Oracle made a catch-up play. An increase in tools, deployments and algorithms is providing the technological supply, and the intuitive appeal (to senior management) and increasing articulation of business logic inside large corporations are driving demand.

The JBoss / Drools manual has a very nice description of when to use rules - basically, it boils down to having complex logic (you know that queasy feeling you get when you’re not sure whether you’ve just added a close brace to the fourth nested else block or the fifth nested then block? - that’s complex logic) and frequent modifications to that logic. But then the bit that comes next struck me:

To quote a Drools mailing list regular (Dave Hamu): “It seems to me that in the excitement of working with rules engines, that people forget that a rules engine is only one piece of a complex application or solution. Rules engines are not really intended to handle workflow or process executions nor are workflow engines or process management tools designed to do rules. Use the right tool for the job. Sure, a pair of pliers can be used as a hammering tool in a pinch, but that’s not what it’s designed for.”

I certainly agree that it’s all too easy to see some new technonugget as a panacea (programming in XML, anyone?), but I believe workflow is exactly the sort of thing that can be modelled well with rules. Of course, you need both backchaining and forwardchaining, and you need good ways of tying rule firing with state changes in the world, but there is a strong correspondence, particularly if you use some of the neat BPM tools - like JBoss’ own jBPM. Just because the Web Services and Ontology folks foundered trying to build WS-Events doesn’t mean to say that approaches that are unencumbered by the Semantic Web baggage can’t make good and rapid progress.

There is a problem, however, and it is a problem that hits every rule engine of any weight (i.e. one that is seriously designed for deployment). The problem is conflict. Let’s say you’ve got one rule that says a customer is to receive preferential treatment if they have an account. Then, there’s another rule that says a customer should be dispreferred if their order value is below some threshold. So what do you do if you have a customer who has an account and a low order value? The standard BRE algorithms solve conflicts by attaching an integer value to rules; if two rules conflict, you pick the higher ranked rule. Account-holders are to be preferred (rank 5/5); low-order values to be dispreferred (rank 3/5). This is a bit like listening to Mozart on your laptop’s built-in speaker - you can make out the melody but you lose the subtlety. And more importantly, it fails to have the intended effect.
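To make the ranking scheme concrete, here is a minimal sketch in Python. It is not how any particular engine implements priorities; the `Rule` class, the `resolve` function and the order threshold of 100 are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Rule:
    name: str
    rank: int                          # higher rank wins a conflict
    applies: Callable[[dict], bool]    # condition tested against the customer
    verdict: str

# Hypothetical rules; the threshold of 100 is an assumption, not from any real system.
RULES = [
    Rule("account-holder", 5, lambda c: c["has_account"], "preferred"),
    Rule("low-order-value", 3, lambda c: c["order_value"] < 100, "dispreferred"),
]

def resolve(customer: dict, rules=RULES) -> Optional[str]:
    """Collect every matching rule, then let the highest rank decide."""
    matching = [r for r in rules if r.applies(customer)]
    if not matching:
        return None
    return max(matching, key=lambda r: r.rank).verdict

# An account holder with a tiny order: both rules match, rank 5 beats rank 3.
print(resolve({"has_account": True, "order_value": 10}))
```

For an account holder, the low-order rule never gets a say, however small the order. The integer settles who wins, but it cannot express anything in between.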

Perhaps we shouldn’t be surprised: the RETE algorithm that lies at the core of most efficient rule engines is, after all, 30 years old (to put it in context, that makes it a contemporary of the Apple ][, the CN Tower and Bod). LEAPS, the cutting edge heart of the new version of Drools, is only a couple of years more recent. Both techniques are rooted in GOFAI - good old fashioned AI - with all the brittleness and limitations associated with it.

The problem has long been recognised by the builders of business rule sets, who have forced BRE vendors to provide ways of changing priorities dynamically. But the current solutions are the equivalent of turning the volume up on the laptop speaker. What is required is a way to avoid the twin gotchas of logical spaghetti and caveats ad nauseam. Both of these bite when a rule base is extended and modified (one of the motivations for the BR approach in the first place, remember).

Anyone who has a passing acquaintance with statistics will have an intuition for the idea of independence - two things are independent iff the probability of both is the product of the probabilities of each. There is an analog in rule systems: two rules are independent if the firing of one does not affect (either positively or negatively) the firing of the other. If rules are not independent, you can get conflict that needs to be managed explicitly (e.g. by introducing a new rule that combines the two dependent rules). The problem of logical spaghetti is keeping track of the ever greater links of dependence between different rules, and dealing with the instabilities they introduce.

A particularly common form of interaction between rules is specialisation, and specifically, the capturing of exceptions. You might have a rule that says that parcels should normally be shipped overland. Then you want to capture the special case that urgent parcels should be shipped air freight. Except for those that are over a certain weight. Unless they’ve got special authorisation. And so on. Every time you add a new caveat you need to duplicate the old rule and develop two new versions, one with the exception and one without. So there is a risk that your ruleset grows exponentially with ever more caveats. The two gotchas together pose serious maintenance problems that restrict the application of rule based systems.
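Here is a rough sketch of what the parcel rules look like once every caveat has to be written out by hand. The field names and the 30 kg limit are invented, and I am assuming that a heavy urgent parcel without authorisation falls back to overland.

```python
# Hypothetical parcel fields and a made-up weight limit, for illustration only.
HEAVY_KG = 30

def urgent(p): return p["urgent"]
def heavy(p): return p["weight_kg"] > HEAVY_KG
def authorised(p): return p["authorised"]

# To stay conflict-free, each guard restates every caveat added so far,
# so that exactly one rule matches any given parcel.
RULES = [
    ("overland by default",
     lambda p: not urgent(p), "overland"),
    ("urgent goes by air",
     lambda p: urgent(p) and not heavy(p), "air"),
    ("heavy urgent stays on the ground",
     lambda p: urgent(p) and heavy(p) and not authorised(p), "overland"),
    ("authorised heavy urgent flies",
     lambda p: urgent(p) and heavy(p) and authorised(p), "air"),
]

def shipping_method(parcel: dict) -> str:
    """Return the method of the single matching rule; fail loudly otherwise."""
    matches = [(name, method) for name, guard, method in RULES if guard(parcel)]
    if len(matches) != 1:
        raise ValueError(f"rules overlap or leave a gap: {matches}")
    return matches[0][1]

print(shipping_method({"urgent": True, "weight_kg": 50, "authorised": False}))
```

Each new caveat rewrites the guard of the rule it qualifies and adds a sibling rule, so the conditions get longer every time the rule base is touched.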

Tackling the problem of conflicts head-on is deeply hard. The solution can be captured in a single word: “usually”. If only our rules could be built around the fact that conclusions usually follow (rather than always), and if only our rule engine would process those usuallys making sure that exceptions don’t hold, we’d be saved from both gotchas. There are ways it can be done; IBM have made a start and so have we, and in part II of this post, I’ll explore some of the technical details and practical ramifications.

Because promised second parts get posted within a week or so. Usually.