public class AppState extends Object
| Modifier and Type | Class and Description |
|---|---|
| static class | AppState.NodeCompletionResult: A very small class used to send multiple results back from the completion operation. |
| static class | AppState.NodeUpdatedOutcome: Return value of the onNodesUpdated(List) call. |
| Modifier and Type | Field and Description |
|---|---|
| protected static org.slf4j.Logger | log |
| Constructor and Description |
|---|
| AppState(AbstractClusterServices recordFactory, MetricsAndMonitoring metricsAndMonitoring): Create an instance. |
| Modifier and Type | Method and Description |
|---|---|
| void | buildAppMasterNode(org.apache.hadoop.yarn.api.records.ContainerId containerId, String host, int amPort, String nodeHttpAddress): Build up the special master node, which lives in the live node set but has a lifecycle bonded to the AM. |
| void | buildInstance(AppStateBindingInfo binding) |
| Map<Integer,String> | buildNamingMap(): Build a map of role ID -> name. |
| org.apache.hadoop.yarn.api.records.Resource | buildResourceRequirements(RoleStatus role, org.apache.hadoop.yarn.api.records.Resource capability): Build up the resource requirements for this role from the cluster specification, including substituting max allowed values if the specification asked for it. |
| RoleStatus | buildRole(ProviderRole providerRole): Add knowledge of a role. |
| List<AbstractRMOperation> | cancelOutstandingAARequests(): Cancel any outstanding AA requests, building up the list of ops to cancel, removing them from RoleHistory structures and the RoleStatus entries. |
| List<RoleInstance> | cloneLiveContainerInfoList(): Clone the live container list. |
| List<RoleInstance> | cloneOwnedContainerList(): Clone the list of active (==owned) containers. |
| List<RoleStatus> | cloneRoleStatusList(): Get a deep clone of the role status list. |
| void | containerReleaseSubmitted(org.apache.hadoop.yarn.api.records.Container container): Note that a container has been submitted for release; update internal state and mark the associated ContainerInfo released field to indicate that while it is still in the active list, it has been queued for release. |
| void | containerStartSubmitted(org.apache.hadoop.yarn.api.records.Container container, RoleInstance instance): Notification called just before the NM is asked to start a container. |
| ProviderRole | createDynamicProviderRole(String name, MapOperations component): Build a dynamic provider role. |
| Map<String,Map<String,ClusterNode>> | createRoleToClusterNodeMap(): Build a map of role -> nodename -> node-info. |
| List<RoleInstance> | enumLiveNodesInRole(String role): Enumerate all nodes by role. |
| List<RoleInstance> | enumNodesWithRoleId(int roleId, boolean owned): Enumerate nodes by role ID, from either the owned or live node list. |
| List<AbstractRMOperation> | escalateOutstandingRequests(): Escalate operation as triggered by external timer. |
| ConfTreeOperations | getAppConfSnapshot() |
| ApplicationLivenessInformation | getApplicationLivenessInformation(): Get application liveness information. |
| float | getApplicationProgressPercentage(): Return the percentage done that Slider is to have YARN display in its Web UI. |
| ClusterDescription | getClusterStatus(): Get the current view of the cluster status. |
| AtomicInteger | getCompletionOfNodeNotInLiveListEvent() |
| AtomicInteger | getCompletionOfUnknownContainerEvent() |
| Map<String,ComponentInformation> | getComponentInfoSnapshot(): Get a snapshot of component information. |
| String | getContainerDiagnosticInfo(): Get diagnostics info about containers. |
| Map<org.apache.hadoop.yarn.api.records.ContainerId,RoleInstance> | getFailedContainers() |
| long | getFailedCountainerCount() |
| org.apache.hadoop.fs.Path | getHistoryPath(): Get the path used for history files. |
| AggregateConf | getInstanceDefinition() |
| AggregateConf | getInstanceDefinitionSnapshot() |
| ConfTreeOperations | getInternalsSnapshot() |
| Map<org.apache.hadoop.yarn.api.records.ContainerId,RoleInstance> | getLiveContainers() |
| RoleInstance | getLiveInstanceByContainerID(String containerId): Look up a live instance by the string value of its container ID. |
| List<RoleInstance> | getLiveInstancesByContainerIDs(Collection<String> containerIDs) |
| protected Map<String,Integer> | getLiveStatistics(): Get the live statistics map. |
| protected String | getLogsURLForContainer(org.apache.hadoop.yarn.api.records.Container c): Get the log URL for a container. |
| int | getNumOwnedContainers(): Get the number of active (==owned) containers. |
| RoleInstance | getOwnedContainer(org.apache.hadoop.yarn.api.records.ContainerId id): Look up an active container: any container that the AM has, even if it is not currently running/live. |
| RoleInstance | getOwnedInstanceByContainerID(String containerId): Look up an owned instance by the string value of its container ID. |
| ConfTreeOperations | getResourcesSnapshot() |
| RoleHistory | getRoleHistory(): Get the role history of the application. |
| protected Map<String,ProviderRole> | getRoleMap() |
| Map<Integer,ProviderRole> | getRolePriorityMap() |
| RoleStatistics | getRoleStatistics(): Get the aggregate statistics across all roles. |
| Map<Integer,RoleStatus> | getRoleStatusMap() |
| long | getSnapshotTime() |
| long | getStartedCountainerCount() |
| long | getStartFailedCountainerCount() |
| AggregateConf | getUnresolvedInstanceDefinition() |
| void | incFailedCountainerCount(): Increment the count. |
| protected void | incrementRequestCount(RoleStatus role): Increment the request count of a role. |
| void | incStartedCountainerCount(): Increment the count. |
| void | incStartFailedCountainerCount(): Increment the count. |
| void | initClusterStatus() |
| RoleInstance | innerOnNodeManagerContainerStarted(org.apache.hadoop.yarn.api.records.ContainerId containerId): Container start event handler, throwing an exception on problems. |
| boolean | isApplicationLive() |
| boolean | isShortLived(RoleInstance instance): Whether a role is short-lived by the threshold set for this application. |
| RoleStatus | lookupRoleStatus(org.apache.hadoop.yarn.api.records.Container c): Look up the status entry of a container or raise an exception. |
| RoleStatus | lookupRoleStatus(int key): Look up the status entry of a role or raise an exception. |
| RoleStatus | lookupRoleStatus(String name): Look up a role in the map. |
| void | noteAMLaunched(): Note that the master node has been launched, though it isn't considered live until any forked processes are running. |
| void | noteAMLive(): The AM declares itself live in the cluster description. |
| protected long | now(): Current time in milliseconds. |
| AppState.NodeCompletionResult | onCompletedNode(org.apache.hadoop.yarn.api.records.ContainerStatus status): Handle a completed node in the CD: move something from the live server list to the completed server list. |
| void | onContainersAllocated(List<org.apache.hadoop.yarn.api.records.Container> allocatedContainers, List<ContainerAssignment> assignments, List<AbstractRMOperation> operations): Event handler for allocated containers: builds up the lists of assignment actions (what to run where), and possibly a list of operations to perform. |
| RoleInstance | onNodeManagerContainerStarted(org.apache.hadoop.yarn.api.records.ContainerId containerId): Container start event. |
| void | onNodeManagerContainerStartFailed(org.apache.hadoop.yarn.api.records.ContainerId containerId, Throwable thrown): Update the application state after a failure to start a container. |
| AppState.NodeUpdatedOutcome | onNodesUpdated(List<org.apache.hadoop.yarn.api.records.NodeReport> updatedNodes): Handle node update from the RM. |
| ClusterDescription | refreshClusterStatus(): Update the cluster description with the current application state. |
| ClusterDescription | refreshClusterStatus(Map<String,String> providerStatus): Update the cluster description with the current application state. |
| List<AbstractRMOperation> | releaseAllContainers(): Release all containers. |
| List<AbstractRMOperation> | releaseContainer(org.apache.hadoop.yarn.api.records.ContainerId containerId): Release a container based on its container ID. |
| void | resetFailureCounts(): Reset the "recent" failure counts of all roles. |
| List<AbstractRMOperation> | reviewRequestAndReleaseNodes(): Look at where the current node state is, and whether it should be changed. |
| protected void | setClusterStatus(ClusterDescription clusterDesc) |
| void | setContainerLimits(int minMemory, int maxMemory, int minCores, int maxCores): Set the container limits: the min and max values for resource requests. |
| void | setInitialInstanceDefinition(AggregateConf definition): Set the instance definition; this also builds the (now obsolete) cluster specification from it. |
| String | toString() |
| List<ProviderRole> | updateResourceDefinitions(ConfTree resources): The resource configuration is updated: review and update state. |
public AppState(AbstractClusterServices recordFactory, MetricsAndMonitoring metricsAndMonitoring)
Parameters:
recordFactory - factory for YARN records
metricsAndMonitoring - metrics and monitoring services
public long getFailedCountainerCount()
public void incFailedCountainerCount()
public long getStartFailedCountainerCount()
public void incStartedCountainerCount()
public long getStartedCountainerCount()
public void incStartFailedCountainerCount()
public AtomicInteger getCompletionOfNodeNotInLiveListEvent()
public AtomicInteger getCompletionOfUnknownContainerEvent()
public Map<Integer,RoleStatus> getRoleStatusMap()
protected Map<String,ProviderRole> getRoleMap()
public Map<Integer,ProviderRole> getRolePriorityMap()
public Map<org.apache.hadoop.yarn.api.records.ContainerId,RoleInstance> getFailedContainers()
public Map<org.apache.hadoop.yarn.api.records.ContainerId,RoleInstance> getLiveContainers()
public ClusterDescription getClusterStatus()
Calls to refreshClusterStatus() trigger a
refresh of this field.
This is read-only to the extent that changes here do not trigger updates in the application state.
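The relationship between getClusterStatus() and refreshClusterStatus() can be illustrated with a minimal sketch. It is not the Slider implementation: StatusRefreshSketch, its maps, and the containerStarted and liveCount helpers are hypothetical stand-ins, and the only behaviour taken from the text above is that a refresh rebuilds the view and that edits to the view do not flow back into the application state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for AppState; illustrates the refresh/read-only contract only. */
final class StatusRefreshSketch {
    // Authoritative state: live container count per role (illustrative).
    private final Map<String, Integer> liveCounts = new HashMap<>();
    // Derived view, rebuilt on refresh; plays the part of the cluster description.
    private Map<String, Integer> clusterStatus = new HashMap<>();

    void containerStarted(String role) {
        liveCounts.merge(role, 1, Integer::sum);
    }

    /** Rebuild the derived view from the authoritative state and return it. */
    Map<String, Integer> refreshClusterStatus() {
        clusterStatus = new HashMap<>(liveCounts);
        return clusterStatus;
    }

    /** Return the view as of the last refresh; it may be stale. */
    Map<String, Integer> getClusterStatus() {
        return clusterStatus;
    }

    int liveCount(String role) {
        return liveCounts.getOrDefault(role, 0);
    }

    public static void main(String[] args) {
        StatusRefreshSketch state = new StatusRefreshSketch();
        state.containerStarted("worker");
        // Editing the returned view does not touch the authoritative state.
        state.refreshClusterStatus().put("worker", 99);
        System.out.println("state count: " + state.liveCount("worker"));
        // The next refresh discards the edit and reflects the state again.
        System.out.println("refreshed view: " + state.refreshClusterStatus());
    }
}
```

In this sketch a caller that wants current values calls refreshClusterStatus() first; getClusterStatus() only returns whatever the last refresh produced.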
protected void setClusterStatus(ClusterDescription clusterDesc)
public void setInitialInstanceDefinition(AggregateConf definition) throws BadConfigException, IOException
Parameters:
definition - initial definition
Throws:
BadConfigException
IOException
public AggregateConf getInstanceDefinition()
public RoleHistory getRoleHistory()
public org.apache.hadoop.fs.Path getHistoryPath()
public void setContainerLimits(int minMemory,
int maxMemory,
int minCores,
int maxCores)
Parameters:
minMemory - minimum memory in MB
maxMemory - maximum memory
minCores - minimum vcore count
maxCores - maximum cores
public ConfTreeOperations getResourcesSnapshot()
public ConfTreeOperations getAppConfSnapshot()
public ConfTreeOperations getInternalsSnapshot()
public boolean isApplicationLive()
public long getSnapshotTime()
public AggregateConf getInstanceDefinitionSnapshot()
public AggregateConf getUnresolvedInstanceDefinition()
public void buildInstance(AppStateBindingInfo binding) throws BadClusterStateException, BadConfigException, IOException
public void initClusterStatus()
public ProviderRole createDynamicProviderRole(String name, MapOperations component) throws BadConfigException
Parameters:
name - name of role
Throws:
BadConfigException - bad configuration
public List<ProviderRole> updateResourceDefinitions(ConfTree resources) throws BadConfigException, IOException
Parameters:
resources - updated resources specification
Throws:
BadConfigException
IOException
public RoleStatus buildRole(ProviderRole providerRole) throws BadConfigException
Parameters:
providerRole - role to add
Throws:
BadConfigException - if a role of that priority already exists
public void buildAppMasterNode(org.apache.hadoop.yarn.api.records.ContainerId containerId,
String host,
int amPort,
String nodeHttpAddress)
Parameters:
containerId - the AM master
host - hostname
amPort - port
nodeHttpAddress - HTTP address; may be null
public void noteAMLaunched()
public void noteAMLive()
public RoleStatus lookupRoleStatus(int key)
Parameters:
key - role ID
Throws:
RuntimeException - if the role cannot be found
public RoleStatus lookupRoleStatus(org.apache.hadoop.yarn.api.records.Container c)
Parameters:
c - container
Throws:
RuntimeException - if the role cannot be found
public List<RoleStatus> cloneRoleStatusList()
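The summary describes cloneRoleStatusList() as returning a deep clone. A minimal sketch of what that means in practice follows; DeepCloneSketch, the Status class, and its desired and actual fields are hypothetical stand-ins for RoleStatus, not the Slider types, and the copy-constructor approach is an assumption about one way to do it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a deep clone of a status list. */
final class DeepCloneSketch {
    /** Stand-in for RoleStatus with two mutable counters. */
    static final class Status {
        final String name;
        int desired;
        int actual;

        Status(String name, int desired, int actual) {
            this.name = name;
            this.desired = desired;
            this.actual = actual;
        }

        /** Copy constructor used to duplicate each element. */
        Status(Status other) {
            this(other.name, other.desired, other.actual);
        }
    }

    private final List<Status> roleStatusList = new ArrayList<>();

    void addRole(String name, int desired, int actual) {
        roleStatusList.add(new Status(name, desired, actual));
    }

    /** Copy every element, not just the list, so callers cannot alter internal entries. */
    List<Status> cloneRoleStatusList() {
        List<Status> copy = new ArrayList<>(roleStatusList.size());
        for (Status s : roleStatusList) {
            copy.add(new Status(s));
        }
        return copy;
    }

    int desiredOf(int index) {
        return roleStatusList.get(index).desired;
    }

    public static void main(String[] args) {
        DeepCloneSketch sketch = new DeepCloneSketch();
        sketch.addRole("worker", 3, 1);
        // Mutating an element of the clone leaves the original entry alone.
        sketch.cloneRoleStatusList().get(0).desired = 10;
        System.out.println("internal desired: " + sketch.desiredOf(0));
    }
}
```

A shallow copy (new ArrayList<>(roleStatusList)) would share the Status objects, so the assignment in main would change the internal entry as well.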
public RoleStatus lookupRoleStatus(String name) throws org.apache.hadoop.yarn.exceptions.YarnRuntimeException
Parameters:
name - role name
Throws:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException - if not found
public List<RoleInstance> cloneOwnedContainerList()
public int getNumOwnedContainers()
public RoleInstance getOwnedContainer(org.apache.hadoop.yarn.api.records.ContainerId id)
public List<RoleInstance> cloneLiveContainerInfoList()
public RoleInstance getLiveInstanceByContainerID(String containerId) throws NoSuchNodeException
Parameters:
containerId - container ID as a string
Throws:
NoSuchNodeException - if it does not exist
public RoleInstance getOwnedInstanceByContainerID(String containerId) throws NoSuchNodeException
Parameters:
containerId - container ID as a string
Throws:
NoSuchNodeException - if it does not exist
public List<RoleInstance> getLiveInstancesByContainerIDs(Collection<String> containerIDs)
public List<RoleInstance> enumLiveNodesInRole(String role)
Parameters:
role - role, or "" for all roles
public List<RoleInstance> enumNodesWithRoleId(int roleId, boolean owned)
Parameters:
roleId - role the container must be in
owned - flag to indicate "use owned list" rather than the smaller "live" list
public Map<String,Map<String,ClusterNode>> createRoleToClusterNodeMap()
public void containerStartSubmitted(org.apache.hadoop.yarn.api.records.Container container,
RoleInstance instance)
Parameters:
container - container to start
instance - clusterNode structure
public void containerReleaseSubmitted(org.apache.hadoop.yarn.api.records.Container container)
throws SliderInternalStateException
Parameters:
container - container
Throws:
SliderInternalStateException - if there is no container of that ID on the active list
protected void incrementRequestCount(RoleStatus role)
Also updates application state counters
Parameters:
role - role being requested.
public org.apache.hadoop.yarn.api.records.Resource buildResourceRequirements(RoleStatus role, org.apache.hadoop.yarn.api.records.Resource capability)
Parameters:
role - role
capability - capability to set up. A new one may be created during normalization
public RoleInstance onNodeManagerContainerStarted(org.apache.hadoop.yarn.api.records.ContainerId containerId)
Parameters:
containerId - container that is to be started
public RoleInstance innerOnNodeManagerContainerStarted(org.apache.hadoop.yarn.api.records.ContainerId containerId)
Parameters:
containerId - container that is to be started
Throws:
RuntimeException - on problems
public void onNodeManagerContainerStartFailed(org.apache.hadoop.yarn.api.records.ContainerId containerId,
Throwable thrown)
Parameters:
containerId - failing container
thrown - what was thrown
public AppState.NodeUpdatedOutcome onNodesUpdated(List<org.apache.hadoop.yarn.api.records.NodeReport> updatedNodes)
Parameters:
updatedNodes - updated nodes
public boolean isShortLived(RoleInstance instance)
Parameters:
instance - instance
protected long now()
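Because now() is protected, a subclass can presumably substitute a fixed clock, which makes a threshold check such as isShortLived(RoleInstance) deterministic in tests. The sketch below is an assumption about how the two could fit together, not the Slider code: ShortLivedSketch, Clocked, FixedClock, the millisecond threshold, and the "ran for less than the threshold" rule are all hypothetical.

```java
/** Hypothetical illustration of an overridable clock driving a short-lived check. */
final class ShortLivedSketch {
    static class Clocked {
        private final long thresholdMillis;

        Clocked(long thresholdMillis) {
            this.thresholdMillis = thresholdMillis;
        }

        /** Current time in milliseconds; protected so a test can override it. */
        protected long now() {
            return System.currentTimeMillis();
        }

        /** Assumed rule: an instance is short-lived if it has run for less than the threshold. */
        boolean isShortLived(long startTimeMillis) {
            return now() - startTimeMillis < thresholdMillis;
        }
    }

    /** Test double that pins the clock to a fixed instant. */
    static final class FixedClock extends Clocked {
        private final long fixedNow;

        FixedClock(long thresholdMillis, long fixedNow) {
            super(thresholdMillis);
            this.fixedNow = fixedNow;
        }

        @Override
        protected long now() {
            return fixedNow;
        }
    }

    public static void main(String[] args) {
        // Threshold of 60 s, clock pinned at t = 100 s.
        FixedClock state = new FixedClock(60_000L, 100_000L);
        System.out.println("started at 90 s: " + state.isShortLived(90_000L));
        System.out.println("started at 10 s: " + state.isShortLived(10_000L));
    }
}
```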
public AppState.NodeCompletionResult onCompletedNode(org.apache.hadoop.yarn.api.records.ContainerStatus status)
Parameters:
status - the node that has just completed
protected String getLogsURLForContainer(org.apache.hadoop.yarn.api.records.Container c)
Parameters:
c - container
public float getApplicationProgressPercentage()
public ClusterDescription refreshClusterStatus()
public ClusterDescription refreshClusterStatus(Map<String,String> providerStatus)
Parameters:
providerStatus - status from the provider for the cluster info section
public ApplicationLivenessInformation getApplicationLivenessInformation()
protected Map<String,Integer> getLiveStatistics()
The map keys are defined in the StatusKeys keylist.
public RoleStatistics getRoleStatistics()
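The summary describes getRoleStatistics() as returning the aggregate statistics across all roles. A minimal sketch of that kind of aggregation follows; RoleStatisticsSketch, the Stats class, and its desired, actual, and failed counters are hypothetical, and the real RoleStatistics fields may differ.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of summing per-role counters into one aggregate. */
final class RoleStatisticsSketch {
    /** Stand-in for per-role statistics. */
    static final class Stats {
        long desired;
        long actual;
        long failed;

        Stats(long desired, long actual, long failed) {
            this.desired = desired;
            this.actual = actual;
            this.failed = failed;
        }

        /** Add another role's counters into this aggregate. */
        void add(Stats other) {
            desired += other.desired;
            actual += other.actual;
            failed += other.failed;
        }
    }

    /** Sum the counters of every role into a fresh Stats object. */
    static Stats aggregate(List<Stats> perRole) {
        Stats total = new Stats(0, 0, 0);
        for (Stats s : perRole) {
            total.add(s);
        }
        return total;
    }

    public static void main(String[] args) {
        Stats total = aggregate(Arrays.asList(new Stats(3, 2, 1), new Stats(2, 2, 0)));
        System.out.println(total.desired + " desired, " + total.actual
                + " actual, " + total.failed + " failed");
    }
}
```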
public Map<String,ComponentInformation> getComponentInfoSnapshot()
This does not include any container list, which is more expensive to create.
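The trade-off noted above, a snapshot that leaves out the per-container list because that list is more expensive to build, can be sketched as follows. ComponentSnapshotSketch, Info, and the containersByRole map are hypothetical names for illustration and do not reflect the actual ComponentInformation fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a summary-only snapshot. */
final class ComponentSnapshotSketch {
    /** Summary record: a count only, no per-container detail. */
    static final class Info {
        final String name;
        final int actual;

        Info(String name, int actual) {
            this.name = name;
            this.actual = actual;
        }
    }

    // Role name -> IDs of that role's containers (illustrative internal state).
    private final Map<String, List<String>> containersByRole = new HashMap<>();

    void addContainer(String role, String containerId) {
        containersByRole.computeIfAbsent(role, r -> new ArrayList<>()).add(containerId);
    }

    /** One small object per role; the container IDs themselves are never copied. */
    Map<String, Info> getComponentInfoSnapshot() {
        Map<String, Info> snapshot = new HashMap<>();
        for (Map.Entry<String, List<String>> e : containersByRole.entrySet()) {
            snapshot.put(e.getKey(), new Info(e.getKey(), e.getValue().size()));
        }
        return snapshot;
    }

    public static void main(String[] args) {
        ComponentSnapshotSketch state = new ComponentSnapshotSketch();
        state.addContainer("worker", "container_01");
        state.addContainer("worker", "container_02");
        state.addContainer("master", "container_03");
        for (Info info : state.getComponentInfoSnapshot().values()) {
            System.out.println(info.name + ": " + info.actual);
        }
    }
}
```

In this sketch the cost of a snapshot grows with the number of roles rather than the number of containers, which is the point of omitting the container list.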
public List<AbstractRMOperation> reviewRequestAndReleaseNodes() throws SliderInternalStateException, TriggerClusterTeardownException
public void resetFailureCounts()
public List<AbstractRMOperation> escalateOutstandingRequests()
public List<AbstractRMOperation> cancelOutstandingAARequests()
public List<AbstractRMOperation> releaseContainer(org.apache.hadoop.yarn.api.records.ContainerId containerId) throws SliderInternalStateException
Parameters:
containerId - ID of the container to release
Throws:
SliderInternalStateException
public List<AbstractRMOperation> releaseAllContainers()
public void onContainersAllocated(List<org.apache.hadoop.yarn.api.records.Container> allocatedContainers, List<ContainerAssignment> assignments, List<AbstractRMOperation> operations)
Parameters:
allocatedContainers - the containers allocated
assignments - the assignments of roles to containers
operations - any allocation or release operations
public String getContainerDiagnosticInfo()
Copyright © 2014–2015 The Apache Software Foundation. All rights reserved.