User:Nisarg64/Apache Ambari

Apache Ambari, a software project of the Apache Software Foundation, aims to make Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an easy-to-use Hadoop management web UI backed by its RESTful APIs. Ambari was originally a sub-project of Hadoop but is now a top-level project in its own right.

Ambari is used by companies including Cardinal Health, eBay, Expedia, Kayak, Lending Club, Neustar, Pandora, Priceline.com, Samsung, Shutterfly, and Spotify.

Overview
Ambari offers a collection of tools and APIs that simplify cluster operation and conceal the complexity of Hadoop. It enables system administrators to provision, manage, and monitor a Hadoop cluster, and to integrate Hadoop with existing enterprise infrastructure. Ambari simplifies the deployment and management of hosts regardless of the size of the cluster. Its main functions are:


 * Provision a Hadoop Cluster
 * Manage a Hadoop cluster
 * Monitor a Hadoop cluster
 * Integrate Hadoop with the Enterprise

Hadoop cluster provisioning and ongoing management can be a complicated task, especially when there are hundreds or thousands of hosts involved. Ambari provides a single control point for viewing, updating and managing Hadoop service life cycles.

Provision a Hadoop Cluster
Ambari includes a web interface with a step-by-step wizard for installing Hadoop services across multiple hosts. The wizard supports provisioning, configuring, and testing all Hadoop services and core components. Ambari also manages the configuration of Hadoop services for the cluster.

Manage a Hadoop Cluster
Ambari acts as a central point of management for starting, stopping, and reconfiguring Hadoop services across the entire cluster. Cluster management is further simplified by the tools that Ambari provides.

Monitor a Hadoop Cluster
Ambari provides a dashboard for viewing the health and status of the Hadoop cluster at a glance. It uses the Ambari Metrics System to collect cluster metrics and visualizes the cluster's operational data in a web interface. In addition, Ambari has a pre-configured alert system that notifies a user when attention is needed (e.g., a node goes down or remaining disk space is low).

Integrate Hadoop with the Enterprise
Organizations can integrate Hadoop's provisioning, management, and monitoring capabilities into their own enterprise applications using the Ambari REST APIs.
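
As a minimal sketch of such an integration, the Python below builds (without sending) two HTTP requests against the Ambari REST API: one that lists clusters and one that asks Ambari to stop a service. The server address, credentials, cluster name, and service name are placeholders assumed for illustration. The `/api/v1` prefix, HTTP basic authentication, and the `X-Requested-By` header reflect common Ambari usage, but they should be checked against the API documentation for the deployed version.

```python
import base64
import json
import urllib.request

# Hypothetical server address and credentials; replace with real values.
AMBARI_URL = "http://ambari.example.com:8080/api/v1"
USER, PASSWORD = "admin", "admin"


def _auth_header(user, password):
    """Return an HTTP basic-auth header value."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


def build_request(method, path, body=None):
    """Build (but do not send) a request to the Ambari REST API."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(AMBARI_URL + path, data=data, method=method)
    req.add_header("Authorization", _auth_header(USER, PASSWORD))
    # Ambari expects this header on requests that modify state.
    req.add_header("X-Requested-By", "ambari")
    return req


def list_clusters_request():
    """Request that lists the clusters known to the server."""
    return build_request("GET", "/clusters")


def stop_service_request(cluster, service):
    """Request that asks Ambari to stop a service.

    Setting the desired state to INSTALLED stops the service;
    STARTED would start it.
    """
    body = {
        "RequestInfo": {"context": f"Stop {service}"},
        "Body": {"ServiceInfo": {"state": "INSTALLED"}},
    }
    return build_request("PUT", f"/clusters/{cluster}/services/{service}", body)


req = stop_service_request("mycluster", "HDFS")
print(req.get_method(), req.full_url)
# To actually send the request: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the sketch runnable without a live server; a real client would also need to handle HTTP errors and the asynchronous request resource that Ambari returns for long-running operations.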

Ambari Server
The Ambari server is built around an API handler, also called the coordinator. The server receives a request, generates a request ID, and attaches it to the request. The corresponding API handler is then invoked to carry out the steps needed to fulfill the request.

The coordinator communicates with the Dependency Tracker to check which dependencies must be handled for the request. The Dependency Tracker returns the prerequisite components, and the states they must be in, for the request to complete. The coordinator saves these details in the database and then passes them to the Stage Planner. The Stage Planner produces the staged sequence of operations to be performed on each node of the affected components. It uses the Manifest Generator to define the task roles for each node in each stage.
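
The staging idea above can be pictured as a layered topological sort: every component whose prerequisites are already handled goes into the same stage. The Python below is a simplified illustrative model, not Ambari's actual Stage Planner; the input format and the component names are assumptions made for the example.

```python
def plan_stages(dependencies):
    """Group components into stages so that each component appears
    after all of its prerequisites.

    `dependencies` maps a component name to the set of components
    that must be handled before it (a hypothetical input format).
    """
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    # Include prerequisites that were only mentioned as dependencies.
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    stages = []
    done = set()
    while remaining:
        # Everything whose prerequisites are already done can run now.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("circular dependency among: "
                             + ", ".join(sorted(remaining)))
        stages.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return stages


deps = {
    "DATANODE": {"NAMENODE"},
    "HBASE_MASTER": {"NAMENODE", "ZOOKEEPER_SERVER"},
    "NAMENODE": set(),
}
print(plan_stages(deps))
# [['NAMENODE', 'ZOOKEEPER_SERVER'], ['DATANODE', 'HBASE_MASTER']]
```

Components within one stage have no ordering constraints between them, so their operations can be dispatched to nodes in parallel.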

The coordinator passes this ordered list of tasks to the Action Manager along with the request ID. The Action Manager updates the state of each node component in the finite-state machine (FSM), which reflects the progress of the operation. The FSM is also responsible for detecting invalid event flows and generating failure messages.
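
The FSM's two roles, tracking state and rejecting invalid event flows, can be sketched with a small transition table. The states and events below are chosen for illustration and are not claimed to match Ambari's actual state set.

```python
class InvalidTransition(Exception):
    """Raised when an event is not allowed in the current state."""


class ComponentFSM:
    # (current state, event) -> next state; an illustrative table only.
    TRANSITIONS = {
        ("INIT", "install"): "INSTALLING",
        ("INSTALLING", "op_succeeded"): "INSTALLED",
        ("INSTALLING", "op_failed"): "INSTALL_FAILED",
        ("INSTALLED", "start"): "STARTING",
        ("STARTING", "op_succeeded"): "STARTED",
        ("STARTING", "op_failed"): "INSTALLED",
        ("STARTED", "stop"): "STOPPING",
        ("STOPPING", "op_succeeded"): "INSTALLED",
    }

    def __init__(self, host, component):
        self.host = host
        self.component = component
        self.state = "INIT"

    def handle(self, event):
        """Apply an event, or raise if it is invalid in this state."""
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            raise InvalidTransition(
                f"{self.component} on {self.host}: "
                f"event '{event}' is not valid in state {self.state}")
        self.state = self.TRANSITIONS[key]
        return self.state


fsm = ComponentFSM("host1", "DATANODE")
fsm.handle("install")
fsm.handle("op_succeeded")
print(fsm.state)  # INSTALLED
```

Sending `stop` to this component now would raise `InvalidTransition`, because a component that has not been started cannot be stopped; this is the kind of invalid event flow the FSM is meant to catch.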

The Action Manager generates an action ID for each operation and adds it to the plan. It then picks actions from the plan and adds them to the queue of each affected node, stage by stage. When a stage is complete, it picks actions from the next stage. It also starts a timer for scheduled actions.

The heartbeat handler receives the responses to actions and passes them to the Action Manager, which in turn informs the FSM about the change of state. Once all nodes have completed their assigned tasks, the action is considered complete. Once all actions in a stage are complete, the stage is considered complete and the next stage begins. The completion of each action is also recorded in the database.
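
The stage-by-stage dispatch described above can be sketched as follows. This is a deliberately reduced model under stated assumptions: each action targets a single host, timers, failures, and database writes are omitted, and the class name, method names, and action ID format are hypothetical.

```python
from collections import defaultdict, deque
import itertools


class ActionManager:
    def __init__(self, request_id, stages):
        # `stages` is a list of stages; each stage is a list of
        # (host, command) pairs (an assumed input format).
        counter = itertools.count(1)
        self.plan = [
            [{"action_id": f"{request_id}-{next(counter)}",
              "host": host, "command": command}
             for host, command in stage]
            for stage in stages
        ]
        self.current = -1              # index of the active stage
        self.pending = set()           # action IDs not yet reported done
        self.queues = defaultdict(deque)  # per-host action queues
        self._advance()

    def _advance(self):
        """Queue the next stage that has work once nothing is pending."""
        while not self.pending and self.current < len(self.plan):
            self.current += 1
            if self.current < len(self.plan):
                for action in self.plan[self.current]:
                    self.queues[action["host"]].append(action)
                    self.pending.add(action["action_id"])

    def take_actions(self, host):
        """Drain the actions queued for `host` (sent with a heartbeat reply)."""
        actions = list(self.queues[host])
        self.queues[host].clear()
        return actions

    def report_completed(self, action_id):
        """Record a completed action; move on when the stage is done."""
        self.pending.discard(action_id)
        self._advance()

    @property
    def finished(self):
        return self.current >= len(self.plan)


manager = ActionManager(7, [[("host1", "INSTALL"), ("host2", "INSTALL")],
                            [("host1", "START")]])
for host in ("host1", "host2"):
    for action in manager.take_actions(host):
        manager.report_completed(action["action_id"])
print(manager.take_actions("host1")[0]["command"])  # START
```

The second stage's `START` action only reaches the host's queue after both `INSTALL` actions of the first stage have been reported complete.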

Ambari Agent
The Ambari Agent communicates with the Ambari server through heartbeat messages only. Every command received from the server is appended to the action queue. The action executor picks an action from the queue and selects the appropriate component to perform it. The resulting action responses are placed in a message queue, which is sent to the server with the next heartbeat.
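
The agent-side flow above (commands in, queued execution, responses out with the next heartbeat) can be sketched as below. The message fields and the handler mapping are hypothetical and do not represent Ambari's actual wire format.

```python
from collections import deque


class Agent:
    def __init__(self, hostname, handlers):
        # `handlers` maps a component name to a callable that executes
        # a command for that component (an assumed structure).
        self.hostname = hostname
        self.handlers = handlers
        self.action_queue = deque()
        self.response_queue = deque()

    def build_heartbeat(self):
        """Drain queued responses into the next heartbeat message."""
        reports = list(self.response_queue)
        self.response_queue.clear()
        return {"hostname": self.hostname, "reports": reports}

    def receive(self, heartbeat_response):
        """Append commands from the server's reply to the action queue."""
        self.action_queue.extend(heartbeat_response.get("commands", []))

    def run_pending(self):
        """Execute queued actions and queue their results."""
        while self.action_queue:
            action = self.action_queue.popleft()
            handler = self.handlers.get(action["component"])
            if handler is None:
                status, output = "FAILED", "unknown component"
            else:
                try:
                    output = handler(action["command"])
                    status = "COMPLETED"
                except Exception as exc:
                    status, output = "FAILED", str(exc)
            self.response_queue.append(
                {"action_id": action["action_id"],
                 "status": status, "output": output})


agent = Agent("host1", {"DATANODE": lambda command: f"ran {command}"})
agent.receive({"commands": [
    {"action_id": "7-1", "component": "DATANODE", "command": "START"}]})
agent.run_pending()
print(agent.build_heartbeat()["reports"][0]["status"])  # COMPLETED
```

Because the agent only ever initiates heartbeats and reads their replies, the server never needs to open a connection to a cluster host.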

Features of Apache Ambari

 * Wizard-Driven Web User Interface : Assists installation of Hadoop across multiple hosts
 * Granular Service Control : Accurate management of Hadoop services and component lifecycles
 * Configuration change history : Continuing management of Hadoop service configurations
 * RESTful APIs : Enables integration with enterprise systems

Source Code
The source code for Apache Ambari is available on GitHub.