
Installing CDH 5.6.0 offline on Ubuntu 14.04


Official installation guide: http://www.cloudera.com/documentation/enterprise/5-6-x/topics/installation.html
Where to get the packages:
Cloudera Manager: http://archive.cloudera.com/cm5/cm/5/
CDH parcels: http://archive.cloudera.com/cdh5/parcels/5.6.0/

Since our operating system is Ubuntu 14.04, download the following files:

    CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel
    CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1
    manifest.json

The entire installation is performed as root.

Machine configuration

1. The IPs and hostnames of the three machines are

  • 192.168.10.236 hadoop-1 (16 GB RAM)
  • 192.168.10.237 hadoop-2 (8 GB RAM)
  • 192.168.10.238 hadoop-3 (8 GB RAM)

hadoop-1 will serve as the master node.

2. Configure /etc/hosts so that the nodes can reach one another by hostname (hadoop-X).
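
Based on the addresses above, the entries added to /etc/hosts on every node would look roughly like this (a sketch; keep the existing localhost lines):

    192.168.10.236  hadoop-1
    192.168.10.237  hadoop-2
    192.168.10.238  hadoop-3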


3. Set up passwordless root SSH from the master node to the other nodes (slave-to-master is not needed)

3.1 On hadoop-1, run ssh-keygen -t rsa -P '' to generate a passphrase-less key pair

3.2 Append the public key to the authorized keys file: cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

To let the master connect to the remote nodes (hadoop-2 and hadoop-3) without a password, copy the authorized_keys file into /root/.ssh/ on each of them.
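
A sketch of getting the key onto the slaves (assuming /root/.ssh already exists there; you will be prompted for each node's root password this one time, and ssh-copy-id root@hadoop-2 is an equivalent shortcut):

    scp /root/.ssh/authorized_keys root@hadoop-2:/root/.ssh/
    scp /root/.ssh/authorized_keys root@hadoop-3:/root/.ssh/
    # verify: these should now log in without a password prompt
    ssh root@hadoop-2 hostname
    ssh root@hadoop-3 hostname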


4. Configure the JDK

4.1 Install oracle-j2sdk1.7 (on every node; choose the JDK that matches your CDH version)

    $ apt-get install oracle-j2sdk1.7
    $ update-alternatives --install /usr/bin/java java /usr/lib/jvm/java-7-oracle-cloudera/bin/java 300
    $ update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java-7-oracle-cloudera/bin/javac 300

4.2 $ vim /etc/profile

Append at the end:

    export JAVA_HOME=/usr/lib/jvm/java-7-oracle-cloudera
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH

4.3 $ vim /root/.bashrc

Append at the end:

    source /etc/profile

5. Install MariaDB 5.5. See the official documentation for compatibility notes; for detailed steps you can also refer to the linked guide "ubuntu14.04 安装MariaDB10.0并允许远程访问" (installing MariaDB 10.0 on Ubuntu 14.04 and allowing remote access).

5.1 Run $ apt-get install mariadb-server-5.5

5.2 Database settings (Cloudera's recommended settings)

    $ vim /etc/mysql/my.cnf

Below is the recommended configuration:

    [mysqld]
    transaction-isolation = READ-COMMITTED
    # Disabling symbolic-links is recommended to prevent assorted security risks;
    # to do so, uncomment this line:
    # symbolic-links = 0
    
    key_buffer = 16M
    key_buffer_size = 32M
    max_allowed_packet = 32M
    thread_stack = 256K
    thread_cache_size = 64
    query_cache_limit = 8M
    query_cache_size = 64M
    query_cache_type = 1
    
    max_connections = 550
    #expire_logs_days = 10
    #max_binlog_size = 100M
    
    #log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
    #and chown the specified folder to the mysql user.
    log_bin=/var/lib/mysql/mysql_binary_log
    
    binlog_format = mixed
    
    read_buffer_size = 2M
    read_rnd_buffer_size = 16M
    sort_buffer_size = 8M
    join_buffer_size = 8M
    
    # InnoDB settings
    innodb_file_per_table = 1
    innodb_flush_log_at_trx_commit  = 2
    innodb_log_buffer_size = 64M
    innodb_buffer_pool_size = 4G
    innodb_thread_concurrency = 8
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 512M
    
    [mysqld_safe]
    log-error=/var/log/mysqld.log
    pid-file=/var/run/mysqld/mysqld.pid

Restart the service: service mysql restart
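
Note: if InnoDB was already initialized with the default log file size, the restart may fail with an "ib_logfile0 is of different size" error because innodb_log_file_size was changed. A hedged workaround sketch, assuming the default datadir /var/lib/mysql:

    service mysql stop
    # move the old InnoDB log files aside so they are recreated at the new size
    mv /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile0.bak
    mv /var/lib/mysql/ib_logfile1 /var/lib/mysql/ib_logfile1.bak
    service mysql start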

5.3 Create the required databases

    Enter the MySQL shell: $ mysql -u root -p
    Then paste the following statements as one block:
    create database amon DEFAULT CHARACTER SET utf8;
    grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password';
    grant all on amon.* TO 'amon'@'CDH' IDENTIFIED BY 'amon_password';
    create database smon DEFAULT CHARACTER SET utf8;
    grant all on smon.* TO 'smon'@'%' IDENTIFIED BY 'smon_password';
    grant all on smon.* TO 'smon'@'CDH' IDENTIFIED BY 'smon_password';
    create database rman DEFAULT CHARACTER SET utf8;
    grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman_password';
    grant all on rman.* TO 'rman'@'CDH' IDENTIFIED BY 'rman_password';
    create database hmon DEFAULT CHARACTER SET utf8;
    grant all on hmon.* TO 'hmon'@'%' IDENTIFIED BY 'hmon_password';
    grant all on hmon.* TO 'hmon'@'CDH' IDENTIFIED BY 'hmon_password';
    create database hive DEFAULT CHARACTER SET utf8;
    grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
    grant all on hive.* TO 'hive'@'CDH' IDENTIFIED BY 'hive_password';
    create database oozie DEFAULT CHARACTER SET utf8;
    grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie_password';
    grant all on oozie.* TO 'oozie'@'CDH' IDENTIFIED BY 'oozie_password';
    create database metastore DEFAULT CHARACTER SET utf8;
    grant all on metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
    grant all on metastore.* TO 'hive'@'CDH' IDENTIFIED BY 'hive_password';
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'gaoying' WITH GRANT OPTION;
    flush privileges;

5.4 Install the JDBC driver for MariaDB

    $ apt-get install libmysql-java

5.5 Use the Cloudera script to set up the SCM database in MySQL (complete step 6.1 first, then run this):

    $ /opt/cloudera-manager/cm-5.6.0/share/cmf/schema/scm_prepare_database.sh mysql -uroot -p --scm-host localhost scm scm scm_password

6. Install the Cloudera Manager Server and Agents

6.1 Unpack the installation package on the master node

Create a directory /opt/cloudera-manager, copy the downloaded cloudera-manager-trusty-cm5.6.0_amd64.tar.gz into it, and extract it there.
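
A minimal sketch of this step (assuming the tarball was downloaded to /root; the archive unpacks into cm-5.6.0/, which matches the paths used in the rest of this guide):

    mkdir -p /opt/cloudera-manager
    cp /root/cloudera-manager-trusty-cm5.6.0_amd64.tar.gz /opt/cloudera-manager/
    cd /opt/cloudera-manager
    tar xzf cloudera-manager-trusty-cm5.6.0_amd64.tar.gz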

6.2 Create the cloudera-scm user

    $ sudo useradd --system --home=/opt/cloudera-manager/cm-5.6.0/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
    # --home should point to your cloudera-scm-server run directory

6.3 Create the local storage directory for the Cloudera Manager Server (master only)

    $ sudo mkdir /var/log/cloudera-scm-server
    $ sudo chown cloudera-scm:cloudera-scm /var/log/cloudera-scm-server

6.4 Configure server_host on every Cloudera Manager Agent node (master and slaves)

    $ vim /opt/cloudera-manager/cm-5.6.0/etc/cloudera-scm-agent/config.ini
    # change server_host to the master node's hostname
    server_host=hadoop-1

6.5 Copy cloudera-manager to the same directory (/opt) on each slave node

    $ scp -r /opt/cloudera-manager root@hadoop-2:/opt
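
The same copy is needed for the other slave node:

    $ scp -r /opt/cloudera-manager root@hadoop-3:/opt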

6.6 Create the parcel directories

6.6.1 On the master node:

Create the parcel repository directory: mkdir -p /opt/cloudera/parcel-repo

Put the CDH 5 parcel files into /opt/cloudera/parcel-repo/ on the master node:

    CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel
    CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1
    manifest.json

Finally, rename CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1 to CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha. Pay close attention to this step: if the .sha file is missing, the system will ignore the local copy and re-download the parcel.
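
A sketch of the rename, with an optional integrity check (the .sha1/.sha file contains only the parcel's SHA-1 hash, so the two outputs below should match):

    cd /opt/cloudera/parcel-repo
    mv CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha1 CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha
    # optional: compare the parcel's actual hash with the one shipped in the .sha file
    sha1sum CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel
    cat CDH-5.6.0-1.cdh5.6.0.p0.45-trusty.parcel.sha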

6.6.2 On the slave nodes: mkdir -p /opt/cloudera/parcels
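
If Cloudera Manager later reports permission problems on these directories, handing them to the cloudera-scm user (as the official tarball-install guide suggests) is a reasonable fix; a sketch:

    # on the master node (parcel repository)
    chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
    # on the slave nodes (where parcels get unpacked)
    chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcels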

6.7 Install dependencies on every node

Install the following dependencies with apt-get install (a one-line command is sketched after the list):

    lsb-base
    psmisc
    bash
    libsasl2-modules
    libsasl2-modules-gssapi-mit
    zlib1g
    libxslt1.1
    libsqlite3-0
    libfuse2
    fuse-utils or fuse
    rpcbind
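
A sketch of installing the whole list in one command (the list allows "fuse-utils or fuse"; on Ubuntu 14.04 the package is fuse):

    apt-get install -y lsb-base psmisc bash libsasl2-modules \
        libsasl2-modules-gssapi-mit zlib1g libxslt1.1 libsqlite3-0 \
        libfuse2 fuse rpcbind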

6.8 Start the server and agents

On the master node:

    /opt/cloudera-manager/cm-5.6.0/etc/init.d/cloudera-scm-server start
    /opt/cloudera-manager/cm-5.6.0/etc/init.d/cloudera-scm-agent start

On the slave nodes:

    /opt/cloudera-manager/cm-5.6.0/etc/init.d/cloudera-scm-agent start

If startup fails, check the logs under /opt/cloudera-manager/cm-5.6.0/log.
If everything went well, wait a moment and then open the Cloudera Manager Admin Console in a browser.
My master node's IP is 192.168.10.236,
so the console is reached at port 7180 of that address; the default username and password are both admin.
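
A quick sanity check before opening the browser (the server can take a minute or two to start listening; assumes curl is installed):

    curl -I http://192.168.10.236:7180
    # or check that the port is open:
    netstat -lnpt | grep 7180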


7. Install and configure CDH 5

Choose the free edition.

On the new hosts tab, search for the hosts that make up the cluster using the pattern hadoop-[1-3].


Check the nodes you want to install on and click Continue.


If the CDH parcel name appears, the local parcel configuration is correct; just click Continue.


Install all services.


Accept the defaults.


Database configuration.


Accept the defaults here as well.


Wait for the installation to complete.


Installation succeeded.


8. A simple test

8.1 Run a MapReduce job on Hadoop

Run the following in a terminal on the master node:

    sudo -u hdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    pi 10 100

The terminal prints the job's progress:

    root@hadoop-1:~# sudo -u hdfs hadoop jar  /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
    Number of Maps  = 10
    Samples per Map = 100
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Wrote input for Map #5
    Wrote input for Map #6
    Wrote input for Map #7
    Wrote input for Map #8
    Wrote input for Map #9
    Starting Job
    16/05/18 21:26:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop-1/192.168.10.236:8032
    16/05/18 21:26:58 INFO input.FileInputFormat: Total input paths to process : 10
    16/05/18 21:26:58 INFO mapreduce.JobSubmitter: number of splits:10
    16/05/18 21:26:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1463558073107_0001
    16/05/18 21:26:58 INFO impl.YarnClientImpl: Submitted application application_1463558073107_0001
    16/05/18 21:26:59 INFO mapreduce.Job: The url to track the job: http://hadoop-1:8088/proxy/application_1463558073107_0001/
    16/05/18 21:26:59 INFO mapreduce.Job: Running job: job_1463558073107_0001
    16/05/18 21:27:05 INFO mapreduce.Job: Job job_1463558073107_0001 running in uber mode : false
    16/05/18 21:27:05 INFO mapreduce.Job:  map 0% reduce 0%
    16/05/18 21:27:10 INFO mapreduce.Job:  map 10% reduce 0%
    16/05/18 21:27:14 INFO mapreduce.Job:  map 20% reduce 0%
    16/05/18 21:27:15 INFO mapreduce.Job:  map 40% reduce 0%
    16/05/18 21:27:18 INFO mapreduce.Job:  map 50% reduce 0%
    16/05/18 21:27:20 INFO mapreduce.Job:  map 70% reduce 0%
    16/05/18 21:27:22 INFO mapreduce.Job:  map 80% reduce 0%
    16/05/18 21:27:24 INFO mapreduce.Job:  map 100% reduce 0%
    16/05/18 21:27:27 INFO mapreduce.Job:  map 100% reduce 100%
    16/05/18 21:27:27 INFO mapreduce.Job: Job job_1463558073107_0001 completed successfully
    16/05/18 21:27:27 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=96
        FILE: Number of bytes written=1272025
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2630
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=43
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters 
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=37617
        Total time spent by all reduces in occupied slots (ms)=2866
        Total time spent by all map tasks (ms)=37617
        Total time spent by all reduce tasks (ms)=2866
        Total vcore-seconds taken by all map tasks=37617
        Total vcore-seconds taken by all reduce tasks=2866
        Total megabyte-seconds taken by all map tasks=38519808
        Total megabyte-seconds taken by all reduce tasks=2934784
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=340
        Input split bytes=1450
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=340
        Reduce input records=20
        Reduce output records=0
        Spilled Records=40
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=602
        CPU time spent (ms)=12210
        Physical memory (bytes) snapshot=4803805184
        Virtual memory (bytes) snapshot=15372648448
        Total committed heap usage (bytes)=4912578560
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=1180
    File Output Format Counters 
        Bytes Written=97
    Job Finished in 29.482 seconds
    Estimated value of Pi is 3.14800000000000000000

You can also view the job in the web UI under Clusters > Cluster 1 > Activities > YARN Applications.


References:
http://itindex.net/detail/51928-cloudera-manager-cdh5
