Tuesday, April 29, 2008

We are hiring again

Even though you may be reading about layoffs on Wall Street, we still have openings in Equities IT for developers and technologists who are very smart and are passionate about technology. Business experience preferred (Risk, Trading, Analytics, CEP). The positions can be in New York City, in Jersey City, or in Warren, NJ. Java or .NET.

I still have openings for a great UI developer (we are moving to WPF) who has experience with real-time systems, and a more "analytical" person who can analyze flowing equities trading and risk data in real-time and come up with interesting trading decisions.

Email me if you are interested.

©2008 Marc Adler - All Rights Reserved

I will be at the Accelerating Wall Street conference on May 8

I just got invited to participate on a panel about CEP. It's going to be on Thursday, May 8th at the Grand Hyatt in NYC. The conference is titled "Accelerating Wall Street", and is being sponsored by Wall Street and Technology Magazine.

Here is a link to the conference:

http://wallstreetandtech.com/accelerate/agenda.jhtml;jsessionid=RR5ALTPMXDVHYQSNDLOSKH0CJUNN2JVN

My session is "CEP on Wall Street". Here is the blurb:

Complex Event Processing (CEP) holds the promise of major benefits for Wall Street firms, as it can speed up the interpretation of volumes of data in real time. Where is CEP being used today? Where will it be used in the future? Is it more hype than reality? In this special Get to the Point session, panelists will have 60 seconds to answer questions for the moderator and the audience.

Moderator:
Greg MacSweeney, Editor-in-Chief, Wall Street & Technology

Panelists:
Malcolm West, Chief Software Architect, HSBC's Corporate, Investment Banking and Markets division
Adam Honoré, Senior Analyst, Aite Group, LLC


Friday, April 18, 2008

Abstracting the CEP Engine

Here are two comments that I received yesterday, and my answers:

1) You bought a CEP engine that doesn't support event clouds?

We feel that Coral8 does support event clouds, but we are looking for the best pattern to implement it. Mark, who is the CTO of Coral8, doesn't quite agree with the term "event cloud". His posting here highlights his argument. According to Mark, an "event cloud" can be represented as multiple event streams, something that Coral8 supports.


2) How and why did you abstract the CEP engine in your system?


First, the why. We want to insulate ourselves from any uncertainties concerning the CEP engine, both in terms of the product itself and of the company. In this economic environment, we are concerned that some of these smallish CEP companies might be strained. Those that are backed by venture capital might find their VCs getting worried and thinking that we are reliving those inglorious times from 2002 to 2003. Those that are privately financed might find that the backers want to move into other areas. It is no secret that most firms are cutting back or delaying their software purchases, and the ones that get impacted first are the smaller niche companies.

We also want to have some flexibility in case the CEP engine itself does not function as advertised. Coral8 has given us great support, but we have not stressed it yet. We know other companies that have evaluated Coral8 and found some shortcomings, things that the Coral8 staff have addressed. However, it is perfectly within the realm of possibility that we may need to consider another CEP engine should Coral8 fall on its face.

Now, the how ....

We are not using Coral8's native input and output adapters. We are not even reading databases using Coral8's PollFromDatabase and ReadFromDatabase adapters. We have an input server that is used to read static and real-time data and marshal that data into Coral8 tuples. On the other side, we have an output server that takes the derived event tuples from Coral8, marshals them into a common format, and does various kinds of alerting and visualizations.

From the days that we did evaluations of other CEP vendors, we have layers in our input and output servers that deal with Aleri and Streambase. In other words, we have our own adapters! Changing from Coral8 to Streambase or Aleri involves a simple edit to our Spring-like configuration files.

In addition, we have the ability to farm out work to other engines, such as KDB+. We can then read the derived events that are generated by other systems (as long as they are in our common format) and put them into the CEP engine's "event cloud".

In our architecture, we have introduced extra hops in order to abstract the CEP engine. But, we are not that concerned, since we are dealing with analysis and alerting rather than trading.
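The adapter idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual input/output servers: the names CepEngineAdapter, InMemoryAdapter, ADAPTERS, and create_engine are hypothetical, the "common format" is assumed to be a plain dict, and the in-memory class stands in for a wrapper around a vendor SDK.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]  # assumed common event format: a plain dict


class CepEngineAdapter(ABC):
    """Engine-neutral interface that input and output servers code against."""

    @abstractmethod
    def publish(self, stream: str, event: Event) -> None:
        """Marshal a common-format event into the engine's native tuple."""

    @abstractmethod
    def subscribe(self, stream: str, callback: Callable[[Event], None]) -> None:
        """Deliver derived events from the engine in the common format."""


class InMemoryAdapter(CepEngineAdapter):
    """Stand-in engine for illustration; a real adapter would wrap a vendor SDK."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Event], None]]] = {}

    def publish(self, stream: str, event: Event) -> None:
        # Hand each subscriber its own copy of the event.
        for callback in self._subscribers.get(stream, []):
            callback(dict(event))

    def subscribe(self, stream: str, callback: Callable[[Event], None]) -> None:
        self._subscribers.setdefault(stream, []).append(callback)


# Registry keyed by the engine name that would appear in a configuration file.
ADAPTERS = {"in_memory": InMemoryAdapter}


def create_engine(config: Dict[str, str]) -> CepEngineAdapter:
    """Pick the adapter named in the configuration; callers never see the vendor."""
    return ADAPTERS[config["engine"]]()
```

With this shape, swapping engines amounts to registering another adapter class and changing the "engine" value in the configuration, which is the kind of edit the post describes.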


Wednesday, April 16, 2008

Aleri and STAC

Congrats to Aleri for being the first CEP vendor to volunteer their product to undergo scrutiny from STAC Research.

STAC first announced this program at last September's Gartner CEP Summit, and I am surprised that none of the other CEP vendors have joined it. Aleri's participation means one of two things: they are very confident in the power of their CEP engine, and/or they are willing to pony up the participation fee.

I am interested to see the test cases that STAC comes up with. Hopefully, these test cases will be created by an impartial party.



Sunday, April 13, 2008

More Cloudy Thoughts

It seems that my inelegance in describing my idea of the event cloud has spurred some debate between Greg and Hans. I am going to try to clarify my thoughts around the event cloud, and see if it makes more sense.

I don’t want to give any of our “secret sauce” away in my postings, so I will try to map my thinking from the Equities Trading domain to another domain that I am not familiar with: the domain of building security. (Apologies in advance to all of the sleuths who are reading this. My use cases will probably malign your esteemed field of study.)

Let’s say that we are writing an enterprise-wide CEP application for our MegaBank that will monitor our security system so that those nosey parkers from StanLehGoldBar Inc don’t break into our building and steal our secrets.

We are going to make the distinction between Atomic Events and Derived Events. A Derived Event is generated when something interesting is detected from the monitoring of one or more atomic events.

An atomic event may be saved in our CEP engine, depending on the use case. A derived event will definitely be saved by our CEP system. Since a derived event represents an “interesting” condition, we may want to do reporting and analysis on derived events. We may also want to do some sophisticated searching of our derived events. In my mind, the persisted atomic and derived events form our event cloud.

Two or more derived events can be combined to form a new derived event. A derived event can be combined with an atomic event to form a new derived event. Let me illustrate this.

We have a lot of atomic events that come through our CEP system. Every person that passes through MegaBank’s doors generates a PersonEnteredBuilding atomic event. (This is analogous to a market data event.) We have a static database of all of MegaBank’s employees, and when the person in a PersonEnteredBuilding event does not match an entry in our employee database, the CEP system generates a NonEmployeeEntered derived event. This derived event will be persisted in our CEP engine.
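As a rough sketch of this first derivation step, the Python below checks an atomic PersonEnteredBuilding event against a static employee set and emits a NonEmployeeEntered derived event on a miss. The Event fields, the EMPLOYEES set, and the function name derive_non_employee are all assumptions made for the illustration; a real system would express this as a join inside the CEP engine.

```python
from typing import FrozenSet, NamedTuple, Optional


class Event(NamedTuple):
    kind: str
    person_id: str
    timestamp: int  # assumed unit: seconds since midnight


# Hypothetical static employee table, loaded once from a database.
EMPLOYEES: FrozenSet[str] = frozenset({"alice", "bob"})


def derive_non_employee(
    event: Event, employees: FrozenSet[str] = EMPLOYEES
) -> Optional[Event]:
    """Return a NonEmployeeEntered event when the entrant is not an employee."""
    if event.kind != "PersonEnteredBuilding":
        return None
    if event.person_id in employees:
        return None
    return Event("NonEmployeeEntered", event.person_id, event.timestamp)
```

The derived event carries the same person and timestamp as the atomic event that triggered it, so later rules can correlate on either field.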

We also have a derived PersonLeftBuilding event generated whenever a person leaves the building. This event is generated

(Let’s fantasize a bit, and assume we do a retina scan of everyone entering the building. Let’s stretch our imaginations a bit and say that we have a retina scan of all StanLehGoldBar Inc employees)

Let’s say that we do a join between a NonEmployeeEntered event and the static source of StanLehGoldBar employees. We can generate a CompetitorInBuilding derived event. This derived event will be saved in the CEP engine too.

We monitor all doors to our building. Whenever someone goes through the doors of the Equities trading floor, we generate a TradingFloorEntered atomic event. If someone goes through the trading floor doors for the first time of the day after 6:00 PM, we might want to consider that action to be a suspicious activity, and we want to generate a PossibleIntruderOnTradingFloor derived event.

Now, some smart security guard who is a user of our CEP system wants to create a new derived event. He wants to see if we have a competitor roaming around our Trading Floor at night. So, the security guard goes into our CEP GUI and creates a new, custom derived event called CompetitorOnTradingFloorAfterHours that gets generated 1) if we have a CompetitorInBuilding derived event 2) that does not have a “cancelling” PersonLeftBuilding event, 3) combined with a PossibleIntruderOnTradingFloor event. The fact that various events are “floating” around our event cloud makes it possible to easily create new derived events on the fly.
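The guard's three-part rule can be sketched as a function over the persisted events. This is a minimal Python illustration, assuming the event cloud is a simple list and that events carry a person id and a timestamp; the Event type and the function name are hypothetical, and a CEP engine would evaluate the same conditions with joins over windows.

```python
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    kind: str
    person_id: str
    timestamp: int  # assumed unit: minutes since midnight


def competitor_on_floor_after_hours(
    cloud: List[Event], intruder: Event
) -> Optional[Event]:
    """Evaluate the rule when a PossibleIntruderOnTradingFloor event arrives."""
    if intruder.kind != "PossibleIntruderOnTradingFloor":
        return None
    who, now = intruder.person_id, intruder.timestamp

    # 1) The same person must have a CompetitorInBuilding event on record.
    entries = [
        e.timestamp
        for e in cloud
        if e.kind == "CompetitorInBuilding"
        and e.person_id == who
        and e.timestamp <= now
    ]
    if not entries:
        return None

    # 2) No "cancelling" PersonLeftBuilding event since the latest entry.
    last_entry = max(entries)
    for e in cloud:
        if (
            e.kind == "PersonLeftBuilding"
            and e.person_id == who
            and last_entry <= e.timestamp <= now
        ):
            return None

    # 3) Both conditions hold together with the intruder event itself.
    return Event("CompetitorOnTradingFloorAfterHours", who, now)
```

Because the rule only reads events that are already in the cloud, it can be added after the fact without touching the rules that produced those events.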

The security guard might also want to do a query of our event cloud to see if the competitor was “casing the joint” prior to the intrusion into the trading floor at night. The security guard might want to see how many times that particular competitor visited MegaBank over the past month. He might want to see how many times the competitor visited the trading floor during normal business hours. This sounds like the kind of query that you would do with a Data Warehouse, not a CEP engine.
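The two historical questions above boil down to filtered counts over the event cloud. Here is a hedged Python sketch, assuming events are kept in a list with a datetime attached; the Event type, count_events, and the 9-to-18 business-hours window are illustrative assumptions (the post only says that 6:00 PM marks "after hours").

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Event(NamedTuple):
    kind: str
    person_id: str
    at: datetime


def count_events(
    cloud: List[Event],
    kind: str,
    person_id: str,
    since: datetime,
    until: datetime,
    start_hour: int = 0,
    end_hour: int = 24,
) -> int:
    """Count matching events in [since, until) whose hour is in [start_hour, end_hour)."""
    return sum(
        1
        for e in cloud
        if e.kind == kind
        and e.person_id == person_id
        and since <= e.at < until
        and start_hour <= e.at.hour < end_hour
    )


def visits_last_month(cloud: List[Event], person_id: str, now: datetime) -> int:
    """How many times did this competitor enter the building in the past 30 days?"""
    return count_events(
        cloud, "CompetitorInBuilding", person_id, now - timedelta(days=30), now
    )


def floor_visits_in_business_hours(
    cloud: List[Event], person_id: str, now: datetime
) -> int:
    """How many trading-floor entries fell within assumed business hours (09:00-18:00)?"""
    return count_events(
        cloud, "TradingFloorEntered", person_id,
        now - timedelta(days=30), now, start_hour=9, end_hour=18,
    )
```

Queries like these scan history rather than react to a stream, which is why they feel closer to data-warehouse work than to continuous queries.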

So, we need to be able to represent the event cloud in an efficient way in the CEP engine. We need to be able to create new derived events dynamically, while the CEP engine is running. We need to be able to do ad-hoc analysis of the event cloud to improve our situational awareness. We need to be able to create new and interesting visualizations that let the user peek into what is going on inside the cloud.

Interesting stuff, right?

Tim Bass gives a list of techniques that can be used to perform analysis of events. These techniques include:


  • Rule-Based Inference
  • Bayesian Belief Networks (Bayes Nets)
  • Dempster-Shafer’s Method
  • Adaptive Neural Networks
  • Cluster Analysis
  • State-Vector Estimation

I have to admit that I have not delved into these areas before. However, we just hired someone with a BioInformatics background who we hope could do this stuff in his sleep. Who would have thought that, working for MegaBank, I would be exposed to such interesting areas of study?



How to do an UPSERT in Coral8

Assume that the LastTrades window has the retention policy of KEEP LAST PER Symbol.

Here is some code (provided by Mark of Coral8) to do an upsert.

INSERT INTO
    LastTrades
SELECT
    SPC.symbol,
    SPC.price,
    If LT.volume Is Null Then 0 Else LT.volume End If
FROM
    StreamPriceCorrections SPC
    Left Outer Join LastTrades LT ON SPC.Symbol = LT.Symbol;


If we need to do an update rather than an upsert, this piece of code works:


INSERT INTO
    LastTrades
SELECT
    SPC.symbol,
    SPC.price,
    LT.volume
FROM
    StreamPriceCorrections SPC, LastTrades LT
    ON SPC.Symbol = LT.Symbol;
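To make the difference between the two statements concrete, here is a small Python model of the semantics as I read them from the queries above. It is not Coral8 code: a plain dict keyed by symbol stands in for a window with KEEP LAST PER Symbol, and the names Trade, Correction, upsert, and update are made up for the illustration.

```python
from typing import Dict, NamedTuple


class Trade(NamedTuple):
    symbol: str
    price: float
    volume: int


class Correction(NamedTuple):
    symbol: str
    price: float


def upsert(window: Dict[str, Trade], correction: Correction) -> None:
    """Model of the left outer join: a row is always written.

    When the symbol has no prior row, volume defaults to 0, which mirrors
    the "If LT.volume Is Null Then 0" expression in the first query.
    """
    prior = window.get(correction.symbol)
    volume = prior.volume if prior is not None else 0
    window[correction.symbol] = Trade(correction.symbol, correction.price, volume)


def update(window: Dict[str, Trade], correction: Correction) -> None:
    """Model of the inner join: a row is written only if the symbol exists."""
    prior = window.get(correction.symbol)
    if prior is not None:
        window[correction.symbol] = Trade(
            correction.symbol, correction.price, prior.volume
        )
```

In both cases the dict assignment replaces the previous row for that symbol, which is the role the retention policy plays in the real window.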




Addendum - Cloudy Thinking

In my previous post, I don't think that I stated my question properly.

I was asking how to actually implement the Event Cloud in a Streaming SQL-like CEP engine, so that we can build up hierarchies of events. I was looking for schemas, SQL statements, best practices, etc.

I am well aware of POSETS, but what might be easy to implement in C# or Java-based data structures is not that easy in a quasi-relational system such as Coral8.

So, I was hunting for advice from people who might have implemented event clouds in Coral8, Streambase, and Aleri, all three of which are based on SQL.

I will follow up this post with a more concrete example.


Saturday, April 12, 2008

Cloudy Thinking

One of the things that has been on my mind recently is how to represent the "event cloud".

One of the buzzphrases that comes out of the CEP movement is the term "event cloud". This is the huge amalgamation of all of the events that flow through your event processing system.

This week, I had the opportunity to talk with Mary Knox of the Gartner Group. Mary follows the world of CEP. Just like many of the executives at the various CEP companies have told me, Mary confirms that many companies in finance are using CEP for simple streams of processing .... a small algo trading system here, a pricing engine there, a portfolio analysis system in another place. Most shops seem to be "under-utilizing" their CEP engines. Our effort is probably one of the few out there that will really attempt to poke holes in the hype surrounding CEP.... either our application will validate the hype, or all of the CEP engines will flame out in a blaze of glory.

Mary has not heard of too many financial companies who are implementing the event cloud, so I wonder if we are breaking new ground in that area. We know what an event cloud is and what it is supposed to do, but how do we actually implement the event cloud in a system like Coral8 or Streambase? Are relational tables/windows the best way to represent the cloud? And, what about hierarchies of events that are derived from other events? How can we best represent that complex graph of inter-relationships with SQL-ish windows?
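One possible shape for the hierarchy question, offered only as a sketch and not as an answer any vendor has endorsed: keep events in one flat table and record each derivation step as a (child, parent) row in a second table. The Python below models both tables with plain collections; the table contents, ids, and the ancestors function are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical event table: event_id -> event kind.
events: Dict[int, str] = {
    1: "PersonEnteredBuilding",
    2: "NonEmployeeEntered",
    3: "CompetitorInBuilding",
}

# Hypothetical edge table: one (child_id, parent_id) row per derivation step.
derivations: List[Tuple[int, int]] = [(2, 1), (3, 2)]


def ancestors(event_id: int, edges: List[Tuple[int, int]]) -> Set[int]:
    """Return the ids of every event the given event was derived from."""
    found: Set[int] = set()
    frontier = [event_id]
    while frontier:
        current = frontier.pop()
        for child, parent in edges:
            if child == current and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found
```

The loop in ancestors is the awkward part to express relationally: each level of the hierarchy is another self-join on the edge table, which is easy in C# or Java data structures and much less obvious in SQL-ish windows.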

Is anyone out there doing similar work with the cloud?



Sunday, April 06, 2008

Coral8 is Our Choice or “How the hell did we get here?”

When we went down to Orlando last fall to attend the Gartner Summit on Complex Event Processing, we went with eyes wide open. We were new to the domain of CEP, and one of our missions was to try to pick a vendor for the CEP engine that would drive our efforts to produce a major CEP system for our Equities business.

There were a bunch of event-processing systems that were not under consideration because it seemed that they had moved into the strictly vertical area of Algo Trading. These CEP systems included Truviso and Skylar. We needed a general-purpose CEP system, and we wanted to only consider systems that still had a generalist product. An exception to this rule was Aleri, a company that had just come out with a Liquidity Management System as a separate product. We thought that Aleri would still keep its focus on the core CEP engine, so it warranted inclusion in our evaluation.

Apama fell into the Algo trading vertical, but Apama still has a general purpose CEP engine. However, when we tried to evaluate Apama, we were told that we had to go through the dog-and-pony marketing show, something that we did not want to do. I am not sure if this requirement was brought on by the purchase of Apama by Progress Software, a company that I think of as being Computer Associates Lite. I was also told about some interesting experiences between Apama and a major bank by a former colleague of mine whose opinion I trust, and this also influenced my decision not to evaluate Apama. This was unfortunate, as I happen to side more with Apama on the whole EPL vs Stream SQL debate.

This left four systems: Streambase, Coral8, Esper, and Aleri.

Coral8 had always been the front-runner, mostly due to recommendations by some former colleagues who were at Merrill Lynch. I had always liked the “vibe” surrounding Coral8, and their openness at giving out eval copies of their software.

Readers of my blog know that I had strong negative opinions about Streambase because of the aggressiveness of their marketing department, an opinion which was shared by a lot of people out there. Nevertheless, their new CEO, Chris Risley, contacted me personally and told me that he had addressed my concerns. After Chris and Richard Tibbetts came down to NYC to meet with me, we decided to include Streambase in the evaluation.

I really wanted to try Esper, but there were a few things that worked against them. The primary factor was that the .NET version, NEsper, was something that was developed by the hard-working Aaron Crackajaxx for his business needs, and did not seem to be part of the mainline Esper product line. We are a .NET shop here, and we needed a product that supported .NET as a first-class citizen. If Aaron lost interest in NEsper, or if he moved on to another company, then where would we be? We also preferred a product that had an entire ecosystem built around it. So, we passed on Esper. However, Esper is still very much on my radar screen, and I am interested to see how Thomas continues to develop the company and the product.

We spent a good deal of time evaluating Aleri, and most of these experiences were detailed in past entries in this blog. We really wanted to see Aleri succeed, as they were a local company, staffed with a lot of very smart and genteel ex-Bell Lab-ers. However, we felt that their product was not ready for us, mostly because of what I called the “spit and polish” issues. I won’t rehash the details now, but if you are interested, please go back and read the old entries in this blog. The areas that needed improvement in the Aleri product were the Aleri Studio, the documentation, and the integration of external data sources.

I met a good deal with Don DeLoach, the CEO of Aleri, and the one positive that will come from my rejection of Aleri is a renewed focus by Aleri on the aesthetics of their product. You can already see these efforts by reading the new Aleri Blog. From what Don had told me a few months ago, their 3.0 product will start to focus on easier integration of external data sources, and will have much improved documentation. I look forward to seeing their efforts come to fruition.

Coral8 was always the front-runner in our evaluation. Their engine is written in C++. They had a decent .NET SDK that let you build out-of-process adapters in C#, and also let you interact with the internals of the Coral8 engine. Their documentation was good, although a bit obtuse at times, and the documentation was backed up by a ton of whitepapers that are available on their website. Their Coral8 Studio gives you a source code view of development, and the GUI part of the Studio is updated after every compile of the source. The CEO of the company was the person who created Crystal Reports, and knows what it takes to build a software company. But, most of all, their CTO and President interacted with us all of the time, and was extremely receptive to our ideas on improving his product. I like when a CTO and the pre-sales engineer send me mail on a Sunday morning!

Streambase was a strong contender. They had some great features in the product that Coral8 has only just come out with (i.e., windows that are bucketed by column value). Their GUI is very strong, and their documentation and tutorials are first class. However, I have to say that the interest that Coral8 has shown in our success, and the availability of their CTO, were what tipped the odds in Coral8’s favor.

By now, you must be saying to yourself “Where’s the meat?” Didn’t we try to soak and stress the various engines? Didn’t we have an OPRA feed running into the engines in an effort to break them? Didn’t we monitor the use of the CPU and other computing resources? Well … no. To tell you the truth, we were relying on STAC Research to try to do that job for us. STAC is only now just starting to get up to speed in the CEP world, and we will be monitoring their efforts in this space. The general feeling is that most of these CEP engines perform in roughly the same manner, and if one of the CEP engines is 5% faster than Coral8, it is not going to sway our decision, since we are not that concerned right now with super low latency. We are more concerned with the intangibles: responsiveness of the support organization, evolution of the product (and our input into the roadmap), support for .NET as a first-class citizen, stability of the company, etc.

Despite choosing Coral8, we have been careful in our architecture to abstract the specific CEP engine, and no external system will know that Coral8 is driving our CEP system. In the same way that CEP engines have abstracted datasources by using pluggable adapters, we have abstracted the CEP engine. Yes, we have chosen Coral8, and so far, we are satisfied by our choice. But, we are also keeping our eyes open for how the other products evolve in the world of Complex Event Processing.




Of Webcasts, Vendors, and Perceptions

Some people have pinged me to find out why I never appeared on the webcast with Coral8 a few weeks ago. Basically, it was because of a mess-up by the folks at Incisive Media.

You had to call a certain phone number to participate in the webcast. The operator dutifully asked me for my name and my company. I was then transferred to what turned out to be a listen-only line which enabled me to hear the speakers, but prevented them from hearing me. After the presenter from The Aite Group gave his spiel, the presenters asked if I was on the line. From my perch on the trading floor, I screamed into the phone “I’m here! I’m here!” … but since I was on a listen-only line, nobody on the other side could hear me. I furiously sent emails to the guys at Coral8 and Incisive Media, but all were oblivious. I finally hung up, called back the operators, and explained that I was one of the presenters, and the operator put me on the presenters-only line, but by that time, John Morrell had started his 30-minute pitch for Coral8. I had another meeting scheduled for noon, so I had to leave the webcast.

I have to admit that I had no idea that I was going to be part of a Coral8 infomercial, and if I had, I would not have taken part in the webcast. As a good member of MegaBank, I need to be extremely careful about the PERCEPTION of a relationship between a specific vendor and myself. In fact, recently, I felt that I had to cancel the recording of a CEP podcast when I found out that the people who set up the podcast were bringing the Chief Architect of BEA Systems to the interview to ask questions. Not only do I not want to be perceived as a shill for a specific vendor, but I do not want my name on a banner next to the name of a company whose products I am not using. Even if the BEA architect’s questions to me were vendor-neutral, having my name on a podcast that has the BEA name on it as the sponsor would imperceptibly tie me into BEA.

The SOA/Web Services conference that I spoke at a few weeks ago had no vendor tie-in. I was on a panel with people from Bank of America and Google. There were no SOA/WebServices vendors on the panel. I am also scheduled to speak about CEP at a big conference in a few months, and I will be careful not to mention the specific technology stack that we are using. I don’t mind being mentioned as a reference customer for a vendor, under the conditions that

- I am actually using the product
- I am completely satisfied with the product
- My current employer gives me permission

Being a member of the management layer at MegaBank and blogging at the same time is a tricky proposition. If you read something here, you need to be thoroughly convinced that I am not shilling for any vendor. I disparage all vendors equally. I am not like certain consulting firms, who have relationships with every vendor and will never render a negative opinion about any of those vendors IN PUBLIC.


Friday, April 04, 2008

KDB is Free for Personal Use

I haven't blogged in about a month ... I have a ton to say, but I have just been too busy with the new employees who have joined my Complex Event Processing team (welcome Hanno, Scott and Feng). But this great news from Tenerife Joel is just too good to pass up.

On this blog a few months ago, we got on the case of KX Systems for making KDB virtually impossible to download, evaluate, and learn if you were not a member of a large financial institution. Several people commented here that they really wanted to check out KDB, but they did not know how to get access to it.

I guess that Simon and Niall have seen the light, and are now making KDB available for personal use.

I would like to take all of the credit for this revelation, but I suspect that one of the motivating factors for KX Systems was the free, unrestricted availability of Coral8 for personal use. Any random person can download a personal-use, single-CPU version of the Coral8 server and development studio for no charge .... this version does not have any restrictions nor any time limits (are you listening, Aleri?).

During our evaluation of CEP systems, we found that Coral8 was the only company that did not put up any barriers to evaluation. Aleri had an annoying time limit. Apama wouldn't even let us download anything without having to run through their marketing gauntlet. Streambase directly gave us a copy of their product, so I am not sure if Streambase has any barriers to evaluation.

All of the CEP vendors face the conundrum of giving out unrestricted copies of their software vs fully qualifying prospective customers. I can understand this from the standpoint of support ... your technical support staff has a limited amount of time, and it usually has to be spent supporting the larger institutions, the ones who will most likely be dishing out several hundred thousand dollars for a license. But, today's independent hacker might be tomorrow's corporate developer .... and goodwill goes a long way.

So, congrats to Simon and Niall from KX. Here's hoping that a new generation of KDB developers will come out of this effort, and that, as a result, a wider set of tools will become available for KDB.
