
Deploying an HBase cluster with docker-compose (筑梦之路)

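1. Overview

HBase is an open-source, distributed, column-oriented NoSQL database that runs on top of the Hadoop Distributed File System (HDFS). It is modeled on Google's Bigtable, was originally developed at Powerset, and is now an Apache Software Foundation project. It combines horizontal scalability and high availability with storage for very large tables.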

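The main characteristics of HBase are:

  • Column-oriented storage: HBase groups columns into column families and uses HDFS as its underlying file system. A table is split by row-key range into Regions, each Region holds many rows, and the data files are kept in HDFS. This layout lets HBase handle very large data volumes and gives it good write performance.

  • Distributed architecture: data is spread across multiple machines, and storage and compute capacity grow by scaling out, which suits large-scale storage and processing. When a RegionServer process crashes, its Regions are automatically reassigned to other RegionServers, which keeps the data available.

  • High reliability: Regions are served by multiple RegionServers and the data itself is persisted in HDFS, so the crash or failure of a single RegionServer does not make all data lost or inaccessible.

  • Linear scalability: adding nodes increases storage and compute capacity roughly in proportion, so the cluster can grow with the data.

In short, HBase is well suited to storing massive volumes of unstructured data. It offers high availability, high reliability and high performance for large-scale storage and processing workloads.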

2. Create the Docker network

docker network create hadoop-network
# List networks to confirm it exists
docker network ls
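
`docker network create` reports an error when the network already exists, which gets in the way when the steps are re-run. A minimal sketch of an idempotent variant follows; the function name `ensure_network` is an assumption made for illustration and is not part of the original steps.

```shell
#!/usr/bin/env sh
# Create the Docker network only if it is missing.
# ensure_network is a hypothetical helper name.
ensure_network() {
  name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create "$name" >/dev/null && echo "network $name created"
  fi
}
```

It would be called as `ensure_network hadoop-network`.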

3. Download the binary release

wget https://dlcdn.apache.org/hbase/2.5.4/hbase-2.5.4-bin.tar.gz --no-check-certificate
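
The dlcdn.apache.org mirror generally carries only current releases, and older versions are moved to archive.apache.org. A minimal sketch of a download with that fallback is shown below. The helper names `hbase_tarball_urls` and `download_hbase` are illustrative assumptions, and the archive path is assumed to follow the same version/file layout as the URL above.

```shell
#!/usr/bin/env sh
# Print candidate download URLs for an HBase release, preferred mirror first.
hbase_tarball_urls() {
  v="$1"
  echo "https://dlcdn.apache.org/hbase/${v}/hbase-${v}-bin.tar.gz"
  echo "https://archive.apache.org/dist/hbase/${v}/hbase-${v}-bin.tar.gz"
}

# Try each URL in turn and stop at the first successful download.
download_hbase() {
  v="$1"
  out="hbase-${v}-bin.tar.gz"
  for url in $(hbase_tarball_urls "$v"); do
    if wget -q -O "$out" "$url"; then
      echo "downloaded $out from $url"
      return 0
    fi
    rm -f "$out"   # wget -O may leave an empty file behind on failure
  done
  echo "could not download $out" >&2
  return 1
}
```

It would be called as `download_hbase 2.5.4` in the directory that later serves as the Docker build context.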

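4. Prepare the configuration files

# Create the directories that the files below are written into
mkdir -p conf/hadoop

cat > conf/hbase-env.sh << 'EOF'
export JAVA_HOME=/opt/apache/jdk
export HBASE_CLASSPATH=/opt/apache/hbase/conf
# ZooKeeper runs as a separate cluster, so HBase must not manage its own
export HBASE_MANAGES_ZK=false
EOF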

cat > conf/hbase-site.xml << 'EOF'
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop-hdfs-nn:9000/hbase</value>
    <!-- With HA HDFS use hdfs://ns1/hbase, where ns1 is the dfs.nameservices value from hdfs-site.xml -->
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zookeeper-node1,zookeeper-node2,zookeeper-node3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <!-- Legacy setting from older HBase releases; HBase 2.x takes the master RPC port from hbase.master.port below -->
  <property>
    <name>hbase.master</name>
    <value>60000</value>
    <description>Standalone mode needs host name/IP and port here; HA mode only needs the port</description>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <!-- Master RPC port -->
  <property>
    <name>hbase.master.port</name>
    <value>16000</value>
  </property>
  <!-- Master web UI port -->
  <property>
    <name>hbase.master.info.port</name>
    <value>16010</value>
  </property>
  <!-- RegionServer RPC port -->
  <property>
    <name>hbase.regionserver.port</name>
    <value>16020</value>
  </property>
  <!-- RegionServer web UI port -->
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>16030</value>
  </property>
  <property>
    <name>hbase.wal.provider</name>
    <value>filesystem</value> <!-- multiwal is also an option -->
  </property>
</configuration>
EOF
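
The hbase-site.xml settings assume that an HDFS NameNode (hadoop-hdfs-nn:9000) and a three-node ZooKeeper ensemble (port 2181) are already running and reachable on hadoop-network. A minimal sketch of a pre-flight check, using the same `nc -z` probe that bootstrap.sh relies on, might look like this. The function name `check_endpoints` is hypothetical, and it is meant to be run from a container attached to that network.

```shell
#!/usr/bin/env sh
# Probe each host:port argument with nc and report the ones that do not answer.
check_endpoints() {
  failed=0
  for endpoint in "$@"; do
    host="${endpoint%:*}"    # text before the last colon
    port="${endpoint##*:}"   # text after the last colon
    if nc -z "$host" "$port" >/dev/null 2>&1; then
      echo "ok      $endpoint"
    else
      echo "FAILED  $endpoint"
      failed=1
    fi
  done
  return "$failed"
}
```

For this setup the call would be `check_endpoints hadoop-hdfs-nn:9000 zookeeper-node1:2181 zookeeper-node2:2181 zookeeper-node3:2181`; a non-zero exit status means at least one dependency is unreachable.
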

cat > conf/backup-masters << 'EOF'
hbase-master-2
EOF

cat > conf/regionservers << 'EOF'
hbase-regionserver-1
hbase-regionserver-2
hbase-regionserver-3
EOF

cat > conf/hadoop/core-site.xml << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- NameNode address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-hdfs-nn:9000</value>
  </property>
  <!-- File buffer size (128 KB); the default is 4 KB -->
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <!-- How long deleted files stay in the trash, in minutes -->
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
  <!-- Hadoop temporary directory, used for metadata. Make sure /opt/apache/hadoop/data/hdfs/ has been created by hand; the tmp directory is created automatically -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/apache/hadoop/data/hdfs/tmp</value>
  </property>
  <!-- Static user for the HDFS web UI: root -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
  </property>
  <!-- Hosts from which root (the superuser) may act as a proxy -->
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <!-- Groups whose members root may impersonate -->
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
  <!-- Users that root may impersonate (the property name ends in "users") -->
  <property>
    <name>hadoop.proxyuser.root.users</name>
    <value>*</value>
  </property>
  <!-- Hosts from which the hive user may act as a proxy -->
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>*</value>
  </property>
  <!-- Groups whose members the hive user may impersonate -->
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>*</value>
  </property>
  <!-- Hosts from which the hadoop user may act as a proxy -->
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <!-- Groups whose members the hadoop user may impersonate -->
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
EOF

cat > conf/hadoop/hdfs-site.xml << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- NameNode web UI address -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>0.0.0.0:9870</value>
  </property>
  <!-- dfs.webhdfs.enabled must be true; otherwise WebHDFS operations that list file and directory status (LISTSTATUS, LISTFILESTATUS, etc.) cannot be used, because that information is held by the NameNode. -->
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/apache/hadoop/data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/apache/hadoop/data/hdfs/datanode/data1,/opt/apache/hadoop/data/hdfs/datanode/data2,/opt/apache/hadoop/data/hdfs/datanode/data3</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Host and port where the Secondary NameNode (SNN) runs -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-hdfs-nn2:9868</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
  <!-- Allow list of DataNodes -->
  <property>
    <name>dfs.hosts</name>
    <value>/opt/apache/hadoop/etc/hadoop/dfs.hosts</value>
  </property>
  <!-- Deny list of DataNodes -->
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/opt/apache/hadoop/etc/hadoop/dfs.hosts.exclude</value>
  </property>
</configuration>
EOF
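
At this point conf/ should contain the six files that docker-compose.yml bind-mounts into every container. When a bind-mount source is missing, Docker typically creates an empty directory in its place, so a quick check can save some debugging. The following is a minimal sketch; `check_conf_files` is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Verify that every expected configuration file exists and is non-empty.
check_conf_files() {
  dir="${1:-conf}"
  missing=0
  for f in hbase-env.sh hbase-site.xml backup-masters regionservers \
           hadoop/core-site.xml hadoop/hdfs-site.xml; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_conf_files conf` prints nothing and returns 0 when all six files are in place.
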

# The delimiter is quoted ('EOF') so that $1, $2 and ${HBASE_HOME} are written literally
cat > bootstrap.sh << 'EOF'
#!/usr/bin/env sh

# Block until host $1 accepts TCP connections on port $2
wait_for() {
  echo "Waiting for $1 to listen on $2..."
  while ! nc -z "$1" "$2"; do echo waiting...; sleep 1; done
}

# Start a master; if a host and port are given, wait for that endpoint first
start_hbase_master() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    wait_for "$1" "$2"
  fi
  "${HBASE_HOME}/bin/hbase-daemon.sh" start master
  # Follow the output so the container keeps a foreground process
  tail -f "${HBASE_HOME}"/logs/*master*.out
}

# Start a RegionServer once the master at $1:$2 is reachable
start_hbase_regionserver() {
  wait_for "$1" "$2"
  "${HBASE_HOME}/bin/hbase-daemon.sh" start regionserver
  tail -f "${HBASE_HOME}"/logs/*regionserver*.log
}

case $1 in
  hbase-master)
    start_hbase_master "$2" "$3"
    ;;
  hbase-regionserver)
    start_hbase_regionserver "$2" "$3"
    ;;
  *)
    echo "Please pass a valid service name: hbase-master or hbase-regionserver"
    exit 1
    ;;
esac
EOF
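
bootstrap.sh contains `$1`, `$2` and `${HBASE_HOME}`, which need to stay literal in the generated file. With an unquoted heredoc delimiter the shell expands them while writing the file; quoting the delimiter (`<< 'EOF'`) prevents that. The sketch below shows the difference on a throwaway variable; the names `DEMO_HOME`, `unquoted` and `quoted` are made up for the demonstration.

```shell
#!/usr/bin/env sh
DEMO_HOME=/opt/demo

# Unquoted delimiter: ${DEMO_HOME} is expanded when the heredoc is read.
unquoted=$(cat << EOF
${DEMO_HOME}/bin/start.sh
EOF
)

# Quoted delimiter: the text is kept exactly as written.
quoted=$(cat << 'EOF'
${DEMO_HOME}/bin/start.sh
EOF
)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The first line printed shows the expanded path /opt/demo/bin/start.sh, while the second still contains the literal text ${DEMO_HOME}.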

5. Build the image from a Dockerfile

# Docker Hub publishes CentOS 7.9 under the tag 7.9.2009
FROM centos:7.9.2009
RUN rm -f /etc/localtime && ln -sv /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone
# ENV persists in the image; a variable exported in a RUN step is gone once that step ends
ENV LANG=zh_CN.UTF-8
# Create the user and group (UID/GID 10000) that the compose file runs the containers as
RUN groupadd --system --gid=10000 hadoop && useradd --system --home-dir /home/hadoop --uid=10000 --gid=hadoop hadoop -m
# Install sudo and basic tools (nc and netstat are used by bootstrap.sh and the health checks)
RUN yum -y install sudo net-tools telnet wget nc less tree && chmod 640 /etc/sudoers
# Give the hadoop user passwordless sudo
RUN echo "hadoop ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
RUN mkdir /opt/apache/
# Add and configure the JDK (ADD unpacks a local tar archive automatically)
ADD jdk-8u212-linux-x64.tar.gz /opt/apache/
ENV JAVA_HOME=/opt/apache/jdk
ENV PATH=$JAVA_HOME/bin:$PATH
RUN ln -s /opt/apache/jdk1.8.0_212 $JAVA_HOME
# HBase
ENV HBASE_VERSION=2.5.4
ADD hbase-${HBASE_VERSION}-bin.tar.gz /opt/apache/
ENV HBASE_HOME=/opt/apache/hbase
ENV PATH=$HBASE_HOME/bin:$PATH
RUN ln -s /opt/apache/hbase-${HBASE_VERSION} $HBASE_HOME
# Copy bootstrap.sh
COPY bootstrap.sh /opt/apache/
RUN chmod +x /opt/apache/bootstrap.sh
RUN chown -R hadoop:hadoop /opt/apache
WORKDIR $HBASE_HOME
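
docker build -t hbase:2.5.4 . --no-cache

# Options:
#   -t          name and tag of the image
#   .           build context (the current directory, which contains the Dockerfile)
#   -f          path to the Dockerfile, if it is not ./Dockerfile
#   --no-cache  do not use the build cache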
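
After the build it can be worth confirming that the JDK and HBase symlinks inside the image resolve before wiring up the cluster. A minimal sketch, assuming the image tag built above; `smoke_test_image` is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Run java and hbase once in a throwaway container to confirm both are on PATH.
smoke_test_image() {
  image="${1:-hbase:2.5.4}"
  docker run --rm "$image" sh -c 'java -version && hbase version'
}
```

Calling `smoke_test_image` should print the Java and HBase version banners; a non-zero exit status suggests a missing tarball or a broken symlink in the image.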

6. docker-compose.yml

version: '3'
services:
  hbase-master-1:
    image: hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-master-1
    hostname: hbase-master-1
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36010:${HBASE_MASTER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-master"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_MASTER_PORT} || exit 1"]
      interval: 10s
      timeout: 20s
      retries: 3
  hbase-master-2:
    image: hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-master-2
    hostname: hbase-master-2
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36011:${HBASE_MASTER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-master hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_MASTER_PORT} || exit 1"]
      interval: 10s
      timeout: 20s
      retries: 3
  hbase-regionserver-1:
    image: hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-1
    hostname: hbase-regionserver-1
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36030:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
  hbase-regionserver-2:
    image: hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-2
    hostname: hbase-regionserver-2
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36031:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
  hbase-regionserver-3:
    image: hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-3
    hostname: hbase-regionserver-3
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36032:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3

# Attach to the existing external network
networks:
  hadoop-network:
    external: true
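
The compose file interpolates `${HBASE_HOME}`, `${HBASE_MASTER_PORT}` and `${HBASE_REGIONSERVER_PORT}` and loads a `.env` file that the article does not show. The sketch below writes one. `HBASE_HOME` follows the Dockerfile; the two port values are assumptions, chosen so that host port 36010 reaches the master web UI (`hbase.master.info.port`, 16010) and host ports 36030 to 36032 reach the RegionServer web UIs (16030). Adjust them if your setup differs. The function name `write_env_file` is hypothetical.

```shell
#!/usr/bin/env sh
# Write the .env file read by docker-compose.
# Only HBASE_HOME comes from the Dockerfile; both port values are assumed.
write_env_file() {
  cat > "${1:-.env}" << 'EOF'
HBASE_HOME=/opt/apache/hbase
HBASE_MASTER_PORT=16010
HBASE_REGIONSERVER_PORT=16030
EOF
}
```

Run `write_env_file` in the directory that holds docker-compose.yml, check the rendered result with `docker-compose config`, then start the cluster with `docker-compose up -d` and watch `docker-compose ps` until the containers report healthy.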

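7. Verify the deployment

Open the web UI at http://ip:36010/ (replace ip with the address of the Docker host).

docker exec -it hbase-master-1 bash
hbase shell
# Check the cluster status
status
# Create a simple table
create 'user', 'info', 'data'
# user is the table name
# info is the name of column family 1
# data is the name of column family 2
# Show the table definition
desc 'user'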
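
Because every service in the compose file defines a healthcheck, container health can also be checked from the host before opening the shell. A minimal sketch using `docker inspect` follows; the helper name `unhealthy_containers` is hypothetical.

```shell
#!/usr/bin/env sh
# Print each given container whose health status is not "healthy".
unhealthy_containers() {
  for name in "$@"; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)" || status="unknown"
    if [ "$status" != "healthy" ]; then
      echo "$name: $status"
    fi
  done
}
```

For this cluster the call would be `unhealthy_containers hbase-master-1 hbase-master-2 hbase-regionserver-1 hbase-regionserver-2 hbase-regionserver-3`; no output means all five containers passed their checks.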

8. Common commands

# Open the shell
hbase shell
# Create a table
create 'table_name', 'column_family1', 'column_family2', ...
# List existing tables
list
# Show a table's structure
describe 'table_name'
# Insert data
put 'table_name', 'row_key', 'column_family:column', 'value'
# Read a row
get 'table_name', 'row_key'
# Scan a table
scan 'table_name'
# Delete a cell (timestamp is a number, not a quoted string)
delete 'table_name', 'row_key', 'column_family:column', timestamp
# Disable a table
disable 'table_name'
# Enable a table
enable 'table_name'
# Drop a table (it must be disabled first)
disable 'table_name'
drop 'table_name'
# Alter a table, for example to keep 3 versions per cell (VERSIONS takes an integer)
alter 'table_name', {NAME => 'column_family', VERSIONS => 3}
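
The same commands can also be run without an interactive session by piping them into `hbase shell -n`. The sketch below only generates such a command script for a throwaway table; the function name `hbase_smoke_script`, the default table name and the cell values are illustrative assumptions.

```shell
#!/usr/bin/env sh
# Emit HBase shell commands that create a table, write and read one cell, then drop it.
hbase_smoke_script() {
  table="${1:-smoke_test}"
  cat << EOF
create '${table}', 'info'
put '${table}', 'row1', 'info:name', 'hbase'
get '${table}', 'row1'
scan '${table}'
disable '${table}'
drop '${table}'
EOF
}
```

One way to use it against the running cluster is `hbase_smoke_script | docker exec -i hbase-master-1 hbase shell -n`.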
