Compliance for loan originations in minority communities using Apache Camel and Complykit

This was my weekend proof-of-concept project while I had sick kids: let’s say you’re responsible for compliance at the mortgage lending arm of a bank. Even though your bank has tightened lending standards significantly in the wake of the Great Recession, you still sell a series of relatively high-interest mortgage products for riskier borrowers. Like most other banks, you originate these loans directly as well as through a network of third-party mortgage brokers, and the most successful brokers for these products operate in areas with high minority concentrations.

You’ve seen regulators fine other lenders for targeting minority communities numerous times. For example, regulators famously fined Countrywide for targeting black communities with risky mortgage products during the height of the foreclosure crisis, when it appeared those borrowers may have qualified for lower-interest products. Wells Fargo faced similar allegations and settled with the Justice Department a few years ago. And recently, Cook County in Illinois filed a lawsuit against HSBC and BofA for similar conduct in Chicago, citing the Fair Housing Act.

While you feel you have adequate controls in place for the bank’s employees, your network of mortgage brokers consists of independent operators. So even though you’ve written compliance policies into your contracts to deter them from steering minority borrowers into these high-priced products, you don’t have full control over their actions. You don’t want to get blindsided, so you want to start monitoring how many originations the independent brokers are making in minority communities for these products.

You could just set up some more Excel spreadsheets, but spreadsheets are for suckers. You want automation.

Automation > Spreadsheets

So let’s put together a Camel route that feeds observation data into the open source Complykit hub. In a nutshell, the route will read a roll-up of new origination summaries, grouped by ZIP code with origination counts, compare those counts against thresholds set by the compliance team, and then feed the results into the Complykit compliance repository for use by upstream systems. Easy, right? Let’s get started.

Prerequisites

To make this work, we’re going to need some data. First, we’ll work with your data warehouse team to set up a job that periodically rolls the origination data up by ZIP code. You could run this daily, weekly, monthly… whatever. The resulting table could look like this:

[origination_summaries table]
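
Since the route consumes these rows as JPA entities, here’s a minimal sketch of what the OriginationSummary entity could look like. The columns are inferred from the route code below (only the ZIP code and origination count are actually used); the real class lives in the repository.

[code lang="java"]
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "origination_summaries")
public class OriginationSummary {

    @Id
    @GeneratedValue
    private Long id;

    private String zipCode;           // the market the roll-up covers
    private Integer originationCount; // originations in the roll-up period

    public String getZipCode() { return zipCode; }
    public void setZipCode(String zipCode) { this.zipCode = zipCode; }

    public Integer getOriginationCount() { return originationCount; }
    public void setOriginationCount(Integer originationCount) { this.originationCount = originationCount; }
}
[/code]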

Next we need some data about minority counts in areas you want to monitor, which you’ll likely need to assemble manually or buy commercially. The table could look like this:

[market_thresholds table]
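
As a companion sketch, the MarketThresholds entity could look like this; the fields are inferred from the MarketAnalyzer processor further down, so treat the exact types as assumptions.

[code lang="java"]
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "market_thresholds")
public class MarketThresholds {

    @Id
    private String zipCode;             // one row per monitored ZIP code

    private Integer minorityPopulation; // minority population in that ZIP
    private double thresholdWarn;       // originations/population ratio that triggers a warning
    private double thresholdNoncomply;  // ratio the compliance team flags as noncompliant

    public String getZipCode() { return zipCode; }
    public Integer getMinorityPopulation() { return minorityPopulation; }
    public double getThresholdWarn() { return thresholdWarn; }
    public double getThresholdNoncomply() { return thresholdNoncomply; }

    // setters omitted; the warehouse job or compliance team populates this table
}
[/code]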

Next, let’s talk about how we’re going to analyze this data and move it using an Apache Camel route.

Apache Camel Integration

Camel is a great integration technology for the enterprise. While it has far too many advantages to document here, some are worth mentioning. First, Camel can easily move messages from one system and technology to another without a lot of custom code. In our case, we’re moving information from a database to the REST-based Complykit compliance server; in other words, we’re turning database records into REST-based XML messages without much boilerplate.

Second, Camel nicely follows the enterprise integration patterns propounded by Gregor Hohpe, which makes our code easy to understand and maintain for technologists who aren’t well versed in compliance requirements.

Third, Apache Camel is open source and it runs on many open source platforms (such as Red Hat’s JBoss Fuse and JBoss Fuse Service Works; note: Red Hat is my employer). I’ve been successful working with open source enterprise technologies for the majority of my career, so I’ll keep advocating for them in these posts.

The Camel Route

There are two ways to write Camel code. You can author your routes in XML (or through a GUI editor in Eclipse), or you can write Java code using Camel’s domain-specific language (“DSL”). I opted for the DSL because, as a coder, I find it easier to work with. Here’s the route as written in the DSL:

[code lang="java"]
JaxbDataFormat jaxbDataFormat = new JaxbDataFormat();
jaxbDataFormat.setContextPath("com.michaelrice.biascheck.model");

// Poll OriginationSummary rows from the database, analyze each one,
// and PUT the resulting report to the right Complykit observation URL.
from("jpa:com.michaelrice.biascheck.model.OriginationSummary")
    .process(new ZipCodeExtractor())      // copy the ZIP code into a header
    .process(new MarketAnalyzer())        // compare counts against thresholds
    .process(new DestinationProcessor())  // pick the Complykit observation URL
    .marshal(jaxbDataFormat)              // ComplykitReport -> XML
    .setHeader(Exchange.HTTP_METHOD, constant("PUT"))
    .setHeader(Exchange.CONTENT_TYPE, constant("text/xml"))
    .recipientList(header("ck-destination"));
[/code]

I admit it’s not that easy to read if you’re not a Java developer, but it’s a lot easier than the hundreds of lines of transformation, JDBC, marshalling, and HTTP code we’d otherwise have to write. Here’s a quick explanation of what’s going on:

  1. The “from” part simply says that we’re going to read OriginationSummary objects from the database I documented above. It’s basically like a poor man’s queue in that it pulls records off the database and deletes them after the route completes.
  2. We extract the ZIP code from the summary using a custom processor and place it in a Camel message header. This step isn’t strictly necessary, but I thought it read well and it makes life a little easier downstream.
  3. Next we run another custom processor to actually compare the origination data to the thresholds that the compliance team defined; more on this later.
  4. Next we determine what Complykit bucket to route the messages to.
  5. Finally we set a few headers and send the data via REST to Complykit using the recipientList instruction.
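
One practical note: the DSL snippet above has to live inside a RouteBuilder and be started from a CamelContext. Here’s a minimal, hypothetical bootstrap (the class name and the one-minute run window are my own inventions; the persistence unit name "camel" comes from the MarketAnalyzer below):

[code lang="java"]
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

import org.apache.camel.CamelContext;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.jpa.JpaComponent;
import org.apache.camel.impl.DefaultCamelContext;

public class BiasCheckMain {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();

        // Point the "jpa:" endpoints at the same persistence unit
        // the processors use.
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("camel");
        JpaComponent jpa = new JpaComponent();
        jpa.setEntityManagerFactory(emf);
        context.addComponent("jpa", jpa);

        context.addRoutes(new RouteBuilder() {
            public void configure() throws Exception {
                // ... the DSL route shown above goes here ...
            }
        });

        context.start();
        Thread.sleep(60000); // let the route poll for a minute, then shut down
        context.stop();
    }
}
[/code]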

The Custom Processors

In my version, I opted to add a number of custom processor beans, mostly because my instinct is to write Java code. I know I need to be more disciplined about keeping more of the route in the Camel DSL.
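
The ZipCodeExtractor from step 2 is trivial, so I won’t dwell on it; the real version is in the repository, but a sketch of what it needs to do looks roughly like this:

[code lang="java"]
public class ZipCodeExtractor implements Processor {

    public void process(Exchange exchange) throws Exception {
        // Copy the ZIP code from the entity into a message header so
        // later steps can use it without re-reading the body.
        OriginationSummary summary = exchange.getIn().getBody(OriginationSummary.class);
        exchange.getIn().setHeader("zip", summary.getZipCode());
    }
}
[/code]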

The most important processor in the route is the market analyzer. Here’s the code:

[code lang="java"]
public class MarketAnalyzer implements Processor {

    public void process(Exchange exchange) throws Exception {

        OriginationSummary summary = exchange.getIn().getBody(OriginationSummary.class);
        String zipCode = summary.getZipCode();

        // In a production system we'd inject the EntityManager rather
        // than building a factory per message (see the note below).
        EntityManager em = Persistence.createEntityManagerFactory("camel").createEntityManager();
        TypedQuery<MarketThresholds> q = em.createQuery(
            "from MarketThresholds where zipCode = ?1", MarketThresholds.class);
        q.setParameter(1, zipCode);

        // getSingleResult() would throw if there's no row, so use the
        // result list and treat "no thresholds" as "nothing to analyze."
        List<MarketThresholds> matches = q.getResultList();
        if (matches.isEmpty()) {
            exchange.setProperty("analysis-complete", Boolean.FALSE);
        } else {
            MarketThresholds thresholds = matches.get(0);

            double pctSold = summary.getOriginationCount().doubleValue()
                / thresholds.getMinorityPopulation().doubleValue();

            ComplykitReport report = new ComplykitReport();

            if (pctSold > thresholds.getThresholdNoncomply()) {
                report.setObservationType("noncompliant");
                exchange.setProperty("compliance-status", "noncompliant");
            } else if (pctSold > thresholds.getThresholdWarn()) {
                report.setObservationType("warn");
                exchange.setProperty("compliance-status", "warn");
            } else {
                report.setObservationType("comply");
                exchange.setProperty("compliance-status", "comply");
            }

            String note = String.format(
                "%s Population: %s; originations: %s; pct: %,.4f; warn-threshold: %,.4f; noncomply-threshold: %,.4f",
                thresholds.getZipCode(), thresholds.getMinorityPopulation(),
                summary.getOriginationCount(), pctSold,
                thresholds.getThresholdWarn(), thresholds.getThresholdNoncomply());

            report.setNotes(note);

            exchange.getIn().setBody(report);
            exchange.setProperty("analysis-complete", Boolean.TRUE);
        }
    }
}
[/code]

There’s a lot going on there. The quick explanation is that I run a second query against the market thresholds table (via JPA and Hibernate, the object-relational mapping technology). I didn’t like having to do this in a custom processor; I thought the aggregator pattern would handle it, so if you can think of a way to make that work, let me know. (Also, in a production system we’d inject the EntityManager or get access to it through the message context instead of building a factory per message.)

All the system does then is compare the percentage against the thresholds set by the compliance team and assemble an observation (the ComplykitReport object) to be stored in Complykit. For example, in the sample data below, 130 originations against a minority population of 11,704 gives a ratio of 0.0111, which sits below the 0.0205 warn threshold, so the observation type is “comply”.

The next important processor is the one that figures out where to send the Complykit observations. Here’s that code:

[code lang="java"]
public class DestinationProcessor implements Processor {

    private static String rootPath = "http://www.complykit.org/demo/api/obligation/16/observation/";
    private static Map<String, String> observationIds = new HashMap<String, String>();

    static {
        // Hard-coded for the demo: each monitored ZIP maps to a
        // prepopulated Complykit observation id.
        observationIds.put("91105", "40");
        observationIds.put("91106", "41");
    }

    public void process(Exchange exchange) throws Exception {
        // The ZipCodeExtractor put the ZIP code in a message header, and
        // the recipientList reads the destination from a header as well.
        String zip = exchange.getIn().getHeader("zip", String.class);
        if (zip != null) {
            String observationId = observationIds.get(zip);
            if (observationId != null)
                exchange.getIn().setHeader("ck-destination", rootPath + observationId);
        }
    }
}
[/code]

Right now it derives just one destination, but I set it up as a processor so it could eventually fan out to multiple destinations (a little overengineering indulgence on my part, I know). And we obviously wouldn’t hard-code the values in a real system.

This is a good point to pause and talk about Complykit.

Complykit – open source legal compliance hub

Complykit is an open source project that I’ve been working on from time to time over the past few years. It’s really very simple in its current incarnation: all it does is give you a place to store your legal (or ethical, contractual, regulatory, policy-based, or whatever) obligations and then record observations about your company’s performance against those obligations. To make this concrete for our example, the obligation you want to live up to is avoiding excessive originations of your bank’s high-interest product in certain markets. The observations you record against that obligation say whether a market is complying, should trigger a warning, or should be flagged as out of compliance with your policy.

Complykit stores this information through a REST-based application programming interface (“API”), which means it can be loosely coupled to any system in your enterprise (and Camel makes the routing easy).

First, we record an obligation in the Complykit server (or hub, as I like to call it) by issuing an HTTP POST request:

[code lang=”xml”]
<obligation>
<category>legal</category>
<source>caselaw</source>
<directive>Minimize actual effects of high interest,
variable products in certain minority communities</directive>
</obligation>
[/code]
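
You can send that with curl or any HTTP client. As a sketch, here’s how you might do it from Java with a Camel ProducerTemplate; the obligation endpoint URL is my assumption, extrapolated from the observation URLs used above:

[code lang="java"]
import java.util.HashMap;
import java.util.Map;

import org.apache.camel.CamelContext;
import org.apache.camel.Exchange;
import org.apache.camel.ProducerTemplate;
import org.apache.camel.impl.DefaultCamelContext;

public class ObligationLoader {
    public static void main(String[] args) throws Exception {
        String obligationXml =
              "<obligation>"
            + "<category>legal</category>"
            + "<source>caselaw</source>"
            + "<directive>Minimize actual effects of high interest, "
            + "variable products in certain minority communities</directive>"
            + "</obligation>";

        CamelContext context = new DefaultCamelContext();
        context.start();
        ProducerTemplate template = context.createProducerTemplate();

        // POST the obligation to the hub (endpoint URL assumed here).
        Map<String, Object> headers = new HashMap<String, Object>();
        headers.put(Exchange.HTTP_METHOD, "POST");
        headers.put(Exchange.CONTENT_TYPE, "text/xml");
        template.sendBodyAndHeaders(
            "http://www.complykit.org/demo/api/obligation", obligationXml, headers);

        context.stop();
    }
}
[/code]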

We’ll also prepopulate some observations with POSTs. That way, when our Camel route starts running, it just updates the observation for each ZIP code by sending a PUT to a predefined URL (remember the custom destination processor defined above?). The PUT data will look something like this (it’s marshalled from the ComplykitReport object above to match):

[code lang=”xml”]
<observation>
<notes>91105 Population: 11704; originations: 130; pct: 0.0111; warn-threshold: 0.0205; noncomply-threshold: 0.0500</notes>
<obligationId>16</obligationId>
<observationType>comply</observationType>
</observation>
[/code]
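
For reference, the JAXB class behind that payload only needs a handful of annotations. Here’s a sketch of what ComplykitReport could look like; the property names are inferred from the XML and the MarketAnalyzer, and the real class is in the repository:

[code lang="java"]
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "observation")
public class ComplykitReport {

    private String notes;
    private Integer obligationId;
    private String observationType;

    // JAXB binds these public getter/setter pairs to the
    // <notes>, <obligationId>, and <observationType> elements.
    public String getNotes() { return notes; }
    public void setNotes(String notes) { this.notes = notes; }

    public Integer getObligationId() { return obligationId; }
    public void setObligationId(Integer obligationId) { this.obligationId = obligationId; }

    public String getObservationType() { return observationType; }
    public void setObservationType(String observationType) { this.observationType = observationType; }
}
[/code]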

Finally, after the Camel route runs, the Complykit server/hub automatically builds a summary report of compliance with your obligation. To get it, just issue a GET against your obligation resource via the REST API and you’ll see something like this:

[code lang="xml"]
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<obligation>
    <category>legal</category>
    <directive>Minimize actual effects of high interest,
        variable products in certain minority communities</directive>
    <id>16</id>
    <observationCount>2</observationCount>
    <observationSummary>
        <id>16</id>
        <obligationId>16</obligationId>
        <observationType>warn</observationType>
        <observations>1.0</observations>
        <pctShare>0.5</pctShare>
    </observationSummary>
    <observationSummary>
        <id>17</id>
        <obligationId>16</obligationId>
        <observationType>comply</observationType>
        <observations>1.0</observations>
        <pctShare>0.5</pctShare>
    </observationSummary>
    <source>caselaw</source>
</obligation>
[/code]

Whew! Lots to think about.

We obviously covered a lot of ground. But now we’ve got a system that can reduce the spreadsheet deluge and gives you machine-readable compliance data you could wire into your lending platforms.

The code

I forgot to mention earlier that all the code from this post is available on my GitHub page in the bias-check repository.
