User talk:Kumarlav

The constituents of an event-based system:

1) Means of communication: events and notifications

2) Interacting components: producers and consumers (software components)

3) Event notification service

Events: The interactions between the producer and the consumer are events. They can take place directly or through the broker network.

Notification: A notification describes an event. It is created by the observer of the event and contains information about the event.

Producer: Producers are the components that publish notifications.

Consumer: Consumers react to the notifications delivered to them by the notification service.

Event notification service: It acts as the mediator between producers and consumers. It is responsible for conveying each notification to all interested clients. It provides the following interfaces to achieve this:

1) subscribe: Issued by a consumer to subscribe to a particular event.

2) unsubscribe: Issued by a consumer to unsubscribe from a particular event.

3) notify: Issued by a publisher to publish a new event.

4) advertise: Issued by a publisher to announce the events it will publish in the future.
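
The four interfaces can be sketched as a small in-process mediator. This is only an illustration under simplifying assumptions: the class name `NotificationService`, the use of plain callbacks as consumers and the event name strings are hypothetical and do not come from any particular system.

```python
from collections import defaultdict


class NotificationService:
    """Toy in-process mediator between producers and consumers (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # event name -> consumer callbacks
        self._advertised = set()               # event names announced by producers

    def advertise(self, event_name):
        # A producer announces that it may publish this event in the future.
        self._advertised.add(event_name)

    def is_advertised(self, event_name):
        return event_name in self._advertised

    def subscribe(self, event_name, callback):
        # A consumer registers interest in an event.
        self._subscribers[event_name].append(callback)

    def unsubscribe(self, event_name, callback):
        # A consumer withdraws its interest; unknown callbacks are ignored.
        if callback in self._subscribers[event_name]:
            self._subscribers[event_name].remove(callback)

    def notify(self, event_name, notification):
        # A producer publishes; the service forwards to every interested consumer.
        callbacks = list(self._subscribers[event_name])
        for callback in callbacks:
            callback(notification)
        return len(callbacks)  # number of consumers reached
```

The producer never sees the callbacks and the consumer never sees the producer, which is the indirection the notification service is meant to provide.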

The three dimensions of decoupling provided by the notification service are:

a) Space Decoupling: The interaction between the producer and the consumer is indirect. A consumer need not know the publisher of an event to which it has subscribed, and a publisher does not know its subscribers. Managing this association is left to the notification service.

b) Time Decoupling: The peers need not be active at the same time. A publisher can publish at any moment, and a consumer that was inactive at that moment receives the notification when it becomes active.

c) Synchronization Decoupling: Neither the publisher nor the consumer is blocked while performing other activities. Publication and notification proceed asynchronously.
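
Time decoupling in particular can be illustrated by a service that stores notifications for consumers that are not active yet. The sketch below is a simplified assumption of how this might look; `BufferingService`, `collect` and the consumer identifiers are hypothetical names.

```python
from collections import defaultdict, deque


class BufferingService:
    """Stores notifications until the consumer becomes active (hypothetical sketch)."""

    def __init__(self):
        self._subscriptions = defaultdict(set)  # event name -> consumer ids
        self._pending = defaultdict(deque)      # consumer id -> undelivered notifications

    def subscribe(self, consumer_id, event_name):
        self._subscriptions[event_name].add(consumer_id)

    def publish(self, event_name, notification):
        # Returns immediately: the publisher does not wait for any consumer.
        for consumer_id in self._subscriptions[event_name]:
            self._pending[consumer_id].append(notification)

    def collect(self, consumer_id):
        # Called by a consumer when it becomes active; drains its buffer in order.
        queue = self._pending[consumer_id]
        delivered = list(queue)
        queue.clear()
        return delivered
```

A consumer that subscribes, goes offline and later calls `collect` still receives everything published in the meantime, in publication order.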

Models of Interaction:

1) Direct Addressing: The consumer and the producer are in direct contact, i.e. the consumer knows the provider of the notification and the producer knows its recipient. This model offers neither space nor time decoupling. Space decoupling can be provided by maintaining groups of consumers and producers with the same subscription. The request/reply and callback models fall into this category.

2) Indirect Addressing: Neither the consumer knows the producer nor the producer the consumer. Both are connected to a central network that acts as the mediator between them.

Communication Paradigms:

1) Message Passing: This is the most basic form of interaction. The consumer sends its request directly to the producer. It implements Direct Addressing.

2) RPC: It allows a program on one computer to invoke a procedure on another computer as if the procedure were local. It introduces space, synchronization and time coupling, because the client waits for the remote procedure to complete and return. Variants introduce asynchronous invocation for remote methods with no return value, but this weakens reliability because the sender cannot be sure whether the receiver has got the call. In another variant the client continues processing and requests the return value later.

3) Notifications: Here the remote invocation is split into two asynchronous invocations: one sent by the client and the other sent by the server to return the reply. The consumer subscribes directly to the producer, and the producer issues a notification whenever the subscribed condition becomes true. This is called the callback model. It is space and time coupled. Space decoupling can be introduced by forming sets of consumers and producers.

4) Shared Spaces: It provides a mechanism for clients to communicate through shared data. The shared space is accessible to both producers and consumers. It introduces space and time decoupling, but does not provide full synchronization decoupling, because a consumer is blocked while it removes a message from the space.
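
The shared-space paradigm can be sketched with a small thread-safe store. The class `SharedSpace` and its `put`/`take` methods are hypothetical names chosen for this illustration. Producers insert without waiting, while `take` blocks until a matching item exists, which shows one way the consumer side stays synchronized with the space.

```python
import threading


class SharedSpace:
    """Minimal shared space: non-blocking put, blocking take (hypothetical sketch)."""

    def __init__(self):
        self._items = []
        self._cond = threading.Condition()

    def put(self, item):
        # Producer side: insert and return immediately.
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()

    def take(self, predicate, timeout=None):
        # Consumer side: wait until some item satisfies the predicate, then remove it.
        with self._cond:
            found = self._cond.wait_for(
                lambda: any(predicate(item) for item in self._items), timeout
            )
            if not found:
                return None  # timed out without a match
            for index, item in enumerate(self._items):
                if predicate(item):
                    return self._items.pop(index)
            return None
```

Neither side names the other (space decoupling) and an item can be taken long after it was put (time decoupling), but the consumer waits inside `take`.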

Notification Filtering Mechanism:

1) Topic Based: It uses string matching to filter notifications. Every topic is identified by a unique name, against which notifications are compared during filtering. Some implementations use hierarchical addressing, which also allows wildcards that cover the subtree of a given node.

2) Content Based: It is finer-grained than topic-based filtering. Here filters define constraints in the form of tuples, where a tuple is a set of attributes. A template specifies matching notifications by a partial tuple, and incoming notifications are matched against these templates. Constraints are built from logical operators, and simple constraints can be combined to form complex, compound ones. Multiple constraints on the same attribute are merged into a single constraint.

3) Type Based: It uses path expressions and subtype inclusion tests for filtering.
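
The first two mechanisms can be sketched as follows. The `/` separator, the trailing `*` wildcard, the `(attribute, operator, value)` constraint format and both function names are assumptions made for this example, not the syntax of any particular system.

```python
import operator


def topic_matches(pattern, topic):
    """Hierarchical topic match; a trailing '*' covers the node and its whole subtree."""
    pattern_parts = pattern.split("/")
    topic_parts = topic.split("/")
    if pattern_parts[-1] == "*":
        prefix = pattern_parts[:-1]
        return topic_parts[:len(prefix)] == prefix
    return pattern_parts == topic_parts


OPERATORS = {
    "=": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def content_matches(constraints, notification):
    """Content-based match: every (attribute, op, value) constraint must hold (AND)."""
    for attribute, op, value in constraints:
        if attribute not in notification:
            return False
        if not OPERATORS[op](notification[attribute], value):
            return False
    return True
```

For instance, `topic_matches("sports/*", "sports/football/scores")` accepts the whole `sports` subtree, while `content_matches([("symbol", "=", "XYZ"), ("price", ">", 100)], n)` accepts only notifications `n` whose attributes satisfy both constraints.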

Middleware medium:

Its task is to transmit data between producers and consumers. The architectures can be classified as follows:

1) There is one central node to which all clients are connected. Publishers send messages to it; it stores them and forwards them to the interested clients.

2) Instead of a single central node, nodes can be distributed on the basis of the filters, so that messages matching a certain filter go to a particular node, called a rendezvous node.

3) A hierarchical tree is formed such that every node holds the entries for all notifications of its subtree. This results in a heavily loaded root node. The problem is solved by using more than one root node.

Quality of service:

Persistence: The system should guarantee that messages that have been sent are not lost and reach the end users, even when a failure occurs. This is done by saving a copy of each message at the producer.

Priorities: The system should allow priorities to be assigned to messages, so that certain messages are processed before others as needed.

Transaction: Transactions prevent the system from entering an inconsistent state. For example, if the system crashes in the middle of a transaction, it should return to its previous state rather than remain in an inconsistent one (with only half of the instructions executed). This is achieved by ensuring the atomicity of transactions.

Reliability: The system should be reliable, i.e. once a message has been sent it should reach the end user.
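
The priority requirement can be sketched with a heap-based mailbox. The name `PriorityMailbox` and the convention that a lower number means a more urgent message are assumptions of this example.

```python
import heapq
import itertools


class PriorityMailbox:
    """Delivers messages by priority, FIFO within one priority (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves arrival order

    def put(self, message, priority):
        # Assumption: lower number = more urgent.
        heapq.heappush(self._heap, (priority, next(self._counter), message))

    def get(self):
        # Remove and return the most urgent message.
        _priority, _order, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)
```

The counter is there so that two messages with equal priority are never compared directly and leave the mailbox in the order they arrived.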

QoS can be affected by the following factors:

1)Dropped packets

2)Delay

3)Jitter

4)Out-of-order delivery

5)Error

References:

1) The Many Faces of Publish/Subscribe, Eugster, Felber, Guerraoui, Kermarrec

2) Distributed Event-Based Systems, Mühl, Fiege, Pietzuch

[added on 1st March,2009]

QoS based routing.
Problems with QoS based routing:

Complexity:

The selection of QoS paths is subject to multiple constraints and therefore has high computational complexity. Various heuristics have been proposed to reduce the complexity with little loss of optimality.

1) Bandwidth Restricted Paths: The constraints are ordered, and the best path is found for the highest-priority constraint first and then for the lower-priority constraints. This is used in the widest-shortest path, shortest-widest path and all-hop optimal path algorithms.

2) Restricted Shortest Path
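
The idea of ordering the constraints can be illustrated with a widest-shortest path search: hop count is the primary criterion and bottleneck bandwidth breaks ties. This is a minimal sketch using a Dijkstra-style search with a lexicographic label `(hops, -width)`; the graph format (a dict mapping each node to its neighbours and link bandwidths) and the function name are assumptions of this example.

```python
import heapq


def widest_shortest_path(graph, source, target):
    """Among minimum-hop paths, return the one with the largest bottleneck bandwidth.

    graph: dict mapping node -> dict of neighbour -> link bandwidth.
    Returns (path, hops, width) or None if target is unreachable.
    """
    infinity = float("inf")
    best = {source: (0, -infinity)}  # node -> (hops, -bottleneck width)
    parent = {source: None}
    heap = [(0, -infinity, source)]
    while heap:
        hops, neg_width, node = heapq.heappop(heap)
        if (hops, neg_width) != best[node]:
            continue  # stale heap entry
        if node == target:
            break
        for neighbour, bandwidth in graph.get(node, {}).items():
            # Fewer hops first; for equal hops, the wider bottleneck wins.
            candidate = (hops + 1, -min(-neg_width, bandwidth))
            if neighbour not in best or candidate < best[neighbour]:
                best[neighbour] = candidate
                parent[neighbour] = node
                heapq.heappush(heap, (candidate[0], candidate[1], neighbour))
    if target not in best:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    hops, neg_width = best[target]
    return path, hops, -neg_width
```

Swapping the two components of the label, so that width is compared before hop count, would give the shortest-widest variant instead.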

Updating Parameters:

QoS-based routing requires the QoS parameters to be updated frequently. Updating this information too often may flood the network. One solution is to update periodically, but that may result in inaccurate routing decisions. To overcome these problems, IndiQoS allows QoS parameters to be treated in the same way as other event attributes.

In order to update the QoS parameters without flooding the network, the architecture adds the QoS parameters to the event attributes. Subscriptions and advertisements thus contain a new field that holds the QoS parameters.

The QoS information at routers can be kept up to date by:

1) Flooding the network with QoS messages.

2) Using QoSPF, a protocol that updates QoS information at routers. It adds messages containing information about available and reserved resources.

Thus, it triggers a new message for every change in the QoS parameters. As the cost of updates is not negligible, this clogs the network.

3) Pruned flooding: In this approach QoS information is kept local to the links. The network is flooded only when a reservation request is made, and paths known to be non-optimal from previous floodings are excluded. This method serves to update the QoS parameters.

In contrast to the complex methods above, IndiQoS uses a DHT, which requires each node to know only O(log n) neighbors and keeps QoS information local. The DHT used in IndiQoS is Bamboo, which is based on Pastry.

IndiQoS uses the concept of rendezvous nodes to avoid bottlenecks in the network. These nodes keep the information about particular event types. The rendezvous nodes are assigned automatically by the DHT, so the load is distributed among the nodes of the network, which reduces the possibility of a bottleneck. It also saves resources at the nodes, as they can be shared among different events. This results in branching near the nodes, with only one link between the rendezvous node and the publisher.

An important feature of rendezvous nodes is their replication, i.e. the DHT can assign more than one rendezvous node to a particular type. This helps to improve end-to-end latency: the subscriber sends messages to all rendezvous nodes for the type, receives replies from all of them, and can then choose among them based on latency. Replication also increases the number of subscriptions that can be supported for a type, and it helps in case a node goes down.
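
Replicated rendezvous nodes and the latency-based choice made by the subscriber can be sketched as below. The hash ranking used here is only a stand-in for a DHT lookup and is not how Bamboo or Pastry assign keys; the function names, the node identifiers and the latency table are hypothetical.

```python
import hashlib


def rendezvous_nodes(event_type, nodes, replicas=2):
    """Pick `replicas` rendezvous nodes for an event type.

    Stand-in for a DHT lookup: nodes are ranked by a hash of (event type, node),
    so every participant computes the same answer without central coordination.
    """
    def score(node):
        return hashlib.sha256(f"{event_type}|{node}".encode("utf-8")).hexdigest()

    return sorted(nodes, key=score)[:replicas]


def choose_rendezvous(candidates, latency_ms):
    """Subscriber-side choice: the replica with the lowest measured latency."""
    return min(candidates, key=lambda node: latency_ms[node])
```

A subscriber would first call `rendezvous_nodes` for the type it is interested in, measure the latency to each returned replica, and then pass those measurements to `choose_rendezvous`. If the chosen replica goes down, the remaining candidates are still available.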

Resources:

1) Scalable QoS-Based Event Routing in Publish-Subscribe Systems [Nuno Carvalho, Filipe Araujo, Luis Rodrigues]

2) A Survey of QoS Routing Algorithms [Marilia Curado, Edmundo Monteiro]

Reducing Latency in Rendezvous-based systems:
GeoRendezvous, an architecture based on HERMES, SCRIBE and IndiQoS, focuses on minimizing the latency experienced by clients. It runs on top of Cell Hash Routing and uses a position-based DHT to map rendezvous nodes to space on the basis of their geographical locations. This allows subscribers to take advantage of the positional information of the nodes, which reduces signalling and latency. It is a topic-based publish-subscribe middleware.

GeoRendezvous uses Cell Hash Routing (CHR), a cluster-based DHT. Space is divided into cells and nodes are mapped into these cells. Each cell has a virtual node which is responsible for the events related to the cell. Routing in CHR uses the Greedy Perimeter Stateless Routing algorithm. In this algorithm the sender regards itself as the virtual node and passes the message to a randomly picked node in a populated neighbouring cell that reduces the distance to the destination cell. If a cell turns out to be a local minimum, the message is passed along the perimeter until it reaches a cell that is not a local minimum, and greedy forwarding then resumes. If the message returns to a cell twice, the destination cell is empty and lies inside the perimeter formed by the trace of the message, called the "home perimeter". As in other rendezvous systems, resources can be shared to reduce traffic. Clients have to renew their advertisements and subscriptions so that the system can react to topological changes. Latency can be reduced further by replicating the rendezvous nodes, since a client can then select from the group of nodes the one reachable in the fewest hops.
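
The greedy part of the routing can be sketched on a grid of cells. This is a simplified illustration only: it assumes square cells, an 8-cell neighbourhood and Euclidean distance between cell indices, and it stops at a local minimum instead of switching to perimeter mode. All function names are hypothetical.

```python
import math


def cell_of(position, cell_size):
    """Map an (x, y) position to the index of the square cell containing it."""
    x, y = position
    return (int(x // cell_size), int(y // cell_size))


def greedy_next_cell(current, destination, populated):
    """Populated neighbouring cell closest to the destination, or None at a local minimum."""
    def distance(cell):
        return math.hypot(cell[0] - destination[0], cell[1] - destination[1])

    neighbours = [
        (current[0] + dx, current[1] + dy)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    ]
    candidates = [cell for cell in neighbours if cell in populated]
    if not candidates:
        return None
    best = min(candidates, key=distance)
    if distance(best) >= distance(current):
        return None  # local minimum: a full implementation would enter perimeter mode
    return best


def greedy_route(source, destination, populated, max_hops=100):
    """Follow greedy forwarding from source; the path ends early at a local minimum."""
    path = [source]
    current = source
    while current != destination and len(path) <= max_hops:
        next_cell = greedy_next_cell(current, destination, populated)
        if next_cell is None:
            break
        path.append(next_cell)
        current = next_cell
    return path
```

If the returned path does not end at the destination cell, the message has reached a local minimum, which is the situation the perimeter traversal described above is meant to handle.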

Another important feature of a position-based DHT is that a client only has to send messages to the subset of nodes closest to it rather than to all nodes.

References: Reducing Latency in Rendezvous-Based Publish-Subscribe Systems for Wireless Ad Hoc Networks [Nuno Carvalho, Filipe Araujo, Luis Rodrigues]

Abstract of the routing systems.
Issues other than complexity:

1) Storage

2) Resource utilization

3) Informed failure handling

4) Uninformed failure handling

5) Scalability

Geo-Rendezvous systems: They make use of the geographical positions of the clients and thereby help to reduce latency. Replication of rendezvous nodes provides fault tolerance and reduces latency further. The main challenge is to determine the size of the cells into which nodes are hashed: a cell must not grow so big that it cannot listen to all nodes in the neighbouring cells, nor be so small that clustering brings no benefit.

Hermes: It is built on top of Pan, a Pastry-like peer-to-peer routing substrate. Each node maintains a routing table that contains only entries for its neighbouring nodes. It provides fault tolerance by taking a longer route when a failure occurs. It is scalable, as the cost of adding a node is small. The number of entries in the routing table is small. It is more efficient in terms of hop count.

CovAdv (Siena-like event dissemination): It uses advertisements and subscriptions and computes covering relations between them. It uses an acyclic peer-to-peer topology. As there is no redundancy, there is no inherent fault tolerance. Latency is higher when the system is sparsely populated with subscribers and decreases as their number increases.

Kyra: It reduces the implementation cost of filter-based routing while still maintaining network efficiency. It balances the system load across the network. It has a two-level topology: all nodes are partitioned into non-overlapping zones, and separate routing trees are built in each zone. This results in more concise, locality-based routing tables and fewer hops. The main challenges are balancing the load effectively over the network, matching an event against a large number of subscriptions, and decoupling the matching and routing steps.

Multicast protocol: It uses a parallel matching algorithm to match events against subscriptions and a link-matching algorithm to deliver events efficiently to the clients. It allows events to be distributed to a large number of information consumers across the network and exploits the locality of subscriptions. It replicates all subscriptions at all brokers in the system and matches the event at each broker. This approach accepts the complexity of maintaining a replicated subscription set and the CPU time consumed at each broker in exchange for reduced message traffic.

Tapestry: It provides efficient and scalable routing of messages directly to nodes and objects in a large, sparse address space. Each node maintains routing state logarithmically proportional to the network size, so latency scales similarly as nodes are added or removed. It is highly resilient under dynamic conditions and recovers quickly from changes in the topology.

References:

1) Reducing Latency in Rendezvous-Based Publish-Subscribe Systems for Wireless Ad-Hoc Networks

2) Peer-to-Peer Overlay Broker Networks in an Event-Based Middleware

3) Efficient Event Routing in Content-based Publish-Subscribe Service Networks

4) An Efficient Multicast Protocol for Content-Based Publish-Subscribe Systems

5) Tapestry: A Resilient Global-Scale Overlay for Service Deployment