Sunday, February 24, 2008

CEP Engines and Object Caches

I am curious whether any of the CEP vendors have support for getting data from distributed object caches. It is easy to feed data into a CEP engine through JMS (Tibco EMS), sockets, flat files, and standard databases (SQL Server, Sybase, Oracle, DB2). But has anyone integrated an object cache like GemFire, Tangosol, or GigaSpaces with a CEP engine?

I also notice that GemFire RTE claims to have CEP capabilities.


©2008 Marc Adler - All Rights Reserved

7 comments:

Unknown said...

Most CEP engines have an abstraction on top of how they receive their data, and most of them work with JMS. With GigaSpaces, one can write a simple adapter, modeled on the CEP engine's JMS adapter, that feeds the engine from notifications on the POJOs in the cache.
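
A minimal sketch of that idea, with Esper used as the illustrative CEP engine; the CacheEntryListener interface is a hypothetical stand-in for the cache vendor's real notification callback (for GigaSpaces, its notification mechanism on your POJOs):

import com.espertech.esper.client.Configuration;
import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;

// Hypothetical cache-side callback; substitute the cache vendor's real
// notification/listener interface here.
interface CacheEntryListener {
    void onPut(Object value);
    void onUpdate(Object value);
}

// Bridges cache notifications into the CEP engine, much as a JMS adapter
// bridges messages from a queue or topic.
public class CacheToCepAdapter implements CacheEntryListener {

    private final EPServiceProvider epService;

    public CacheToCepAdapter() {
        Configuration config = new Configuration();
        // Register here the event classes that the EPL statements refer to.
        this.epService = EPServiceProviderManager.getDefaultProvider(config);
    }

    public void onPut(Object pojo) {
        // Each cache write becomes an event pushed into the engine.
        epService.getEPRuntime().sendEvent(pojo);
    }

    public void onUpdate(Object pojo) {
        epService.getEPRuntime().sendEvent(pojo);
    }
}

The reverse direction is symmetric: a listener attached to a CEP statement can write the derived events back into the cache.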

Anonymous said...

Yes - TIBCO BusinessEvents includes such a cache (e.g. Tangosol).

Alex said...

Integration of a CEP engine with distributed caches can actually go way beyond pushing data from one to the other (either as input or output).

Let's try to list a few integration use cases:
1/ consume from a cache, emit to a cache. Quite trivial.
2/ deal with a very large stream window: the CEP engine must allow you to swap the cache in directly as the underlying backing store for the stream window, possibly with overflow-to-disk capabilities as well, to trade off latency against capacity.
3/ same as 2/, but distributed, so that the entire stream window is shared across several CEP engines, for n+1 HA purposes for example.
4/ integration of streaming data with reference data that sits in the cache, for a continuous join between the stream and the cached reference data (see the sketch after this list). The CEP engine needs to provide an abstraction for that and support it properly in the event processing language (join keys etc.).
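
To make 4/ concrete, here is a small sketch of a continuous stream-to-reference-data join in Esper's EPL, using a method invocation in the from clause to pull reference rows out of the cache; OrderEvent, RefDataCache and its lookup method are hypothetical placeholders for whatever the cache actually exposes:

import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;
import com.espertech.esper.client.EPStatement;

public class StreamReferenceJoin {
    public static void main(String[] args) {
        EPServiceProvider epService = EPServiceProviderManager.getDefaultProvider();

        // Continuous join of the order stream against reference data fetched
        // through a static lookup method (an Esper "method:" join).
        // OrderEvent must be registered as an event type (or referred to by its
        // fully qualified class name), and RefDataCache.lookup(...) is a
        // hypothetical static wrapper around the distributed cache.
        EPStatement stmt = epService.getEPAdministrator().createEPL(
            "select o.symbol, o.quantity, r.riskLimit " +
            "from OrderEvent as o, " +
            "method:RefDataCache.lookup(o.symbol) as r");

        // Attach an UpdateListener to stmt to receive the joined rows.
    }
}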

Worth noting that most distributed caches are adding some basic CEP capabilities. For now, though, I consider this to really be cache listeners (on put/update/delete), with no event processing language to make them easy to work with and to abstract things away (read the ObjectGrid, Coherence, GemFire or GigaSpaces docs).

Regarding Esper or NEsper: 1/ and 4/ are straightforward, and 4/ is even tightly integrated through the event processing language's extension capabilities. 2/ and 3/ require the HA add-on.

Jags Ramnarayan said...

Follow this link to get a feel for the BEA event server and GemFire integration.
- Complex events can now be pushed to distributed applications at a very high rate.
- Complex event generation that depends on historical data can draw on data partitioned in memory across many nodes, enabling linear scaling of the CEP engine. Without this, imagine being throttled by a relational database engine.
- Of course, the events, once published, are highly available (redundant in-memory copies) and optionally persisted to disk (shared-nothing persistence, with the disk used in append-only mode to give very high throughput).
- Finally, one could even source the events from many systems through a single integrated data source in GemFire.

Cheers!
-- Jags Ramnarayan (jramnara@gemstone.com)
GemStone Systems

Jags Ramnarayan said...

Oops! here is the link ...
http://dev2dev.bea.com/pub/a/2007/09/event-server-caching.html

Alex said...

The use case in this paper is just the 1/ I described, with the classic distributed-cache value proposition and little combined value proposition. The Event Server in this case is even a SPOF, since its internal state is not in the distributed cache, and neither are its stream window backends (they exist only in a single JVM's heap RAM, hence with strict size boundaries).

Unknown said...

In the years since this post, has there been any change in the capabilities of CEP engines that gives administrators or application developers granular control over streaming processed events to other remote engines, or over the events stored for a particular query?

For example: if I want to decompose a complex query into a series of simple queries spread over many engines, could any engine handle it?