
YARN Key Concepts


Please read the HDFS notes first.

Contents

  • 1. YARN Resource Scheduler
  • 2. Common YARN Commands

1. YARN Resource Scheduler

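YARN is a resource scheduling platform that supplies server compute resources to computation programs. It plays the role of a distributed operating system, and computation programs such as MapReduce are like applications running on top of that operating system. YARN consists of four core components: ResourceManager, NodeManager, ApplicationMaster, and Container, which together handle dynamic resource allocation and coordination.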


How It Works


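(1) The MR program is submitted on the node where the client runs.
(2) YarnRunner requests an Application from the ResourceManager (RM).
(3) The RM returns the application's resource path to YarnRunner.
(4) The program uploads the resources it needs to run to HDFS.
(5) Once the resources are uploaded, it requests that MRAppMaster be started.
(6) The RM initializes the user's request as a Task.
(7) One of the NodeManagers picks up the Task.
(8) That NodeManager creates a Container and starts MRAppMaster in it.
(9) The Container copies the resources from HDFS to the local node.
(10) MRAppMaster asks the RM for resources to run the MapTasks.
(11) The RM assigns the MapTasks to two other NodeManagers; each of them picks up its task and creates a container.
(12) MRAppMaster sends the program start script to those two NodeManagers; each starts a MapTask, which partitions and sorts its data.
(13) After all MapTasks have finished, MRAppMaster requests containers from the RM to run the ReduceTasks.
(14) Each ReduceTask fetches the data of its partition from the MapTasks.
(15) When the program has finished, MRAppMaster asks the RM to deregister it.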
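From the client's side, most of the steps above are invisible; what can be observed is the application's state until it reaches a terminal one. The following is a minimal sketch, not part of the original notes, of a helper that polls for that. The function name `wait_for_app` and the default 5-second interval are hypothetical, and the sketch assumes that `yarn application -status` prints a `State : <STATE>` line in the same "key : value" layout as the attempt report shown later in this article.

```shell
# Poll an application until it reaches a terminal state, then print that state.
# Usage: wait_for_app <ApplicationId> [poll-interval-seconds]
# Assumption: the status report contains a line of the form "State : RUNNING".
wait_for_app() {
    app_id=$1
    interval=${2:-5}
    while :; do
        # Keep only the first "State : X" line; "Final-State : X" does not match.
        state=$(yarn application -status "$app_id" 2>/dev/null |
            awk '$1 == "State" && $2 == ":" && !found { print $3; found = 1 }')
        case "$state" in
            FINISHED|FAILED|KILLED)
                printf '%s\n' "$state"
                return 0 ;;
            '')
                echo "could not read the state of $app_id" >&2
                return 1 ;;
        esac
        sleep "$interval"
    done
}
```

For example, `wait_for_app application_1612577921195_0001 10` would check every 10 seconds and print the terminal state once step (15) has happened.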

2. Common YARN Commands

(1) List all applications: yarn application -list

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn application -list
    2021-02-06 10:21:19,238 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0
     Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL

(2) Filter by application state: yarn application -list -appStates <States> (valid states: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED)

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn application -list -appStates FINISHED
    2021-02-06 10:22:20,029 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of applications (application-types: [], states: [FINISHED] and tags: []):1
     Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
     application_1612577921195_0001 word count MAPREDUCE atguigu default FINISHED SUCCEEDED 100% http://hadoop102:19888/jobhistory/job/job_1612577921195_0001
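Most of the commands that follow take an application ID, so it can help to pull the Application-Id column out of a listing like the one above. A minimal sketch, assuming data rows are the only lines whose first field looks like `application_<digits>_<digits>`; the function name `app_ids` is hypothetical:

```shell
# Print the Application-Id column of `yarn application -list` output (stdin).
# Header, summary, and log lines are skipped because their first field
# does not have the application_<clusterTimestamp>_<sequence> shape.
app_ids() {
    awk '$1 ~ /^application_[0-9]+_[0-9]+$/ { print $1 }'
}

# Usage (requires a reachable ResourceManager):
#   yarn application -list -appStates FINISHED 2>/dev/null | app_ids
```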

(3) Kill an application: yarn application -kill <ApplicationId>
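When several applications have to be stopped, it is safer to generate the kill commands first and review them before running anything. The sketch below only prints `yarn application -kill` command lines for the IDs it is given; the function name `kill_cmds` is hypothetical, and the input is assumed to be one application ID per line.

```shell
# Turn application IDs (one per line on stdin) into kill commands for review.
# Nothing is executed here; pipe the output to `sh` only after checking it.
kill_cmds() {
    while IFS= read -r app_id; do
        case "$app_id" in
            application_*)
                printf 'yarn application -kill %s\n' "$app_id" ;;
            *)
                ;;  # ignore anything that is not an application ID
        esac
    done
}

# Usage:
#   printf '%s\n' application_1612577921195_0001 | kill_cmds
```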

(4) Query application logs: yarn logs -applicationId <ApplicationId>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn logs -applicationId application_1612577921195_0001

(5) Query container logs: yarn logs -applicationId <ApplicationId> -containerId <ContainerId>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn logs -applicationId application_1612577921195_0001 -containerId container_1612577921195_0001_01_000001
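The container ID above embeds the application ID: both share the cluster timestamp (1612577921195) and the application sequence number (0001). A minimal sketch of deriving one from the other, so that only the container ID has to be typed; the function name `container_to_app` is hypothetical, and the optional `e<epoch>_` prefix handled here is an assumption about IDs issued after a ResourceManager restart.

```shell
# Derive the application ID from a container ID.
# container_<timestamp>_<appSeq>_<attempt>_<containerSeq>
#   -> application_<timestamp>_<appSeq>
# Prints nothing if the argument does not look like a container ID.
container_to_app() {
    printf '%s\n' "$1" |
        sed -n 's/^container_\(e[0-9][0-9]*_\)\{0,1\}\([0-9][0-9]*\)_\([0-9][0-9]*\)_[0-9][0-9]*_[0-9][0-9]*$/application_\2_\3/p'
}

# Usage:
#   cid=container_1612577921195_0001_01_000001
#   yarn logs -applicationId "$(container_to_app "$cid")" -containerId "$cid"
```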

(6) List all attempts of an application: yarn applicationattempt -list <ApplicationId>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn applicationattempt -list application_1612577921195_0001
    2021-02-06 10:26:54,195 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of application attempts :1
     ApplicationAttempt-Id State AM-Container-Id Tracking-URL
    appattempt_1612577921195_0001_000001 FINISHED container_1612577921195_0001_01_000001 http://hadoop103:8088/proxy/application_1612577921195_0001/

(7) Print the status of an ApplicationAttempt: yarn applicationattempt -status <ApplicationAttemptId>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn applicationattempt -status appattempt_1612577921195_0001_000001
    2021-02-06 10:27:55,896 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Application Attempt Report : 
    ApplicationAttempt-Id : appattempt_1612577921195_0001_000001
    State : FINISHED
    AMContainer : container_1612577921195_0001_01_000001
    Tracking-URL : http://hadoop103:8088/proxy/application_1612577921195_0001/
    RPC Port : 34756
    AM Host : hadoop104
    Diagnostics :
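Reports like this one are printed as "key : value" lines, which makes single fields easy to extract in scripts. The following sketch reads a report on stdin and prints the value of one key; the function name `report_field` is hypothetical, and it assumes each field sits on one line with " : " as the separator.

```shell
# Print the value of one "key : value" field from a YARN report read on stdin.
# Usage: report_field <key>
report_field() {
    awk -v key="$1" '
        {
            sep = index($0, " : ")            # first " : " splits key from value
            if (sep == 0) next
            name = substr($0, 1, sep - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name) # trim indentation around the key
            if (name == key && !found) {
                print substr($0, sep + 3)
                found = 1
            }
        }'
}

# Usage:
#   yarn applicationattempt -status appattempt_1612577921195_0001_000001 2>/dev/null |
#       report_field "AM Host"
```

Because only the first " : " on a line is treated as the separator, values that contain colons themselves, such as the Tracking-URL, come through intact.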

(8) List all containers: yarn container -list <ApplicationAttemptId>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn container -list appattempt_1612577921195_0001_000001
    2021-02-06 10:28:41,396 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of containers :0
     Container-Id Start Time Finish Time

(9) Print container status: yarn container -status <ContainerId>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn container -status container_1612577921195_0001_01_000001
    2021-02-06 10:29:58,554 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Container with id 'container_1612577921195_0001_01_000001' doesn't exist in RM or Timeline Server.

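Note: container information is only available while the job is running, which is why the finished job above shows no containers.

List all nodes: yarn node -list -all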

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn node -list -all
    2021-02-06 10:31:36,962 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total Nodes:3
     Node-Id Node-State Node-Http-Address Number-of-RunningContainers
    hadoop103:38168 RUNNING hadoop103:8042 0
    hadoop102:42012 RUNNING hadoop102:8042 0
    hadoop104:39702 RUNNING hadoop104:8042
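A node listing like this is a convenient input for a quick health check, for example counting how many NodeManagers are in a given state. A minimal sketch, assuming node rows are the only lines whose first field has the form `<host>:<port>`; the function name `count_nodes` is hypothetical:

```shell
# Count the nodes in a given state from `yarn node -list -all` output (stdin).
# Usage: count_nodes <Node-State>
count_nodes() {
    awk -v state="$1" '
        $1 ~ /^[^ ]+:[0-9]+$/ && $2 == state { n++ }
        END { print n + 0 }   # n + 0 prints 0 when no row matched
    '
}

# Usage:
#   yarn node -list -all 2>/dev/null | count_nodes RUNNING
```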

Reload the queue configuration: yarn rmadmin -refreshQueues

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn rmadmin -refreshQueues
    2021-02-06 10:32:03,331 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8033

Print queue information: yarn queue -status <QueueName>

    [atguigu@hadoop102 hadoop-3.1.3]$ yarn queue -status default
    2021-02-06 10:32:33,403 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Queue Information : 
    Queue Name : default
    State : RUNNING
    Capacity : 100.0%
    Current Capacity : .0%
    Maximum Capacity : 100.0%
    Default Node Label expression : <DEFAULT_PARTITION>
    Accessible Node Labels : *
    Preemption : disabled
    Intra-queue Preemption : disabled

As a next step, study how MapReduce is implemented in more depth; another direction to explore is HA (high availability) mode.
