I. Overview
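HBase is an open-source, distributed NoSQL database that stores data by column family and runs primarily on top of the Hadoop Distributed File System (HDFS). It started as a project at Powerset, later became an Apache project, and is modeled on Google's Bigtable. On top of strong horizontal scalability and high availability, it provides storage for tables with very large numbers of rows and columns.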
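The main features of HBase are:
- Column-oriented storage: HBase stores data by column family and uses HDFS as the underlying file system. A table is split into multiple Regions, each Region holds many rows, and the Region data is persisted as files in HDFS. This layout lets HBase handle very large data volumes with good write performance.
- Distributed architecture: HBase is a distributed system that spreads data across multiple machines and adds storage and compute capacity by scaling horizontally, which meets the needs of large-scale data storage and processing. When a RegionServer process crashes, its Regions are automatically reassigned to other RegionServers, which provides high availability.
- High reliability: data is served by multiple RegionServers and persisted to HDFS, so the crash or failure of a single RegionServer does not cause all data to be lost or become inaccessible.
- Linear scalability: storage and compute capacity grow by adding new nodes, so the cluster can keep pace with large-scale storage and processing needs.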
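In short, HBase is a NoSQL database well suited to massive volumes of unstructured data. It offers high availability, high reliability, and high performance, and fits a wide range of large-scale data storage and processing scenarios.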
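This article covers only the quick containerized deployment. For more on HBase, see my other articles:
- HBase, a column-oriented distributed database (environment deployment)
- HBase, a column-oriented distributed database: HBase Shell and SQL in practice (HBase Master high availability)
- [Cloud Native] HBase on k8s: orchestration and deployment explained, with hands-on practice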
II. Prerequisites
1) Deploy Docker
# Install the yum-config-manager tool
yum -y install yum-utils
# The Alibaba Cloud yum mirror is recommended; the official repo is left commented out
#yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install docker-ce
yum install -y docker-ce
# Start Docker now and enable it at boot
systemctl enable --now docker
docker --version
2) Deploy docker-compose
curl -SL https://github.com/docker/compose/releases/download/v2.16.0/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
docker-compose --version
III. Create the network
# Create the network. Do not name it hadoop_network, otherwise starting the hs2 service will run into problems!
docker network create hadoop-network
# Check
docker network ls
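Re-running `docker network create` on an existing network fails, and the warning above rules out names containing an underscore. The following is a minimal sketch of an idempotent variant; `valid_network_name` and `ensure_network` are hypothetical helper names, not part of Docker.

```shell
#!/usr/bin/env sh
# Hypothetical helper: accept only non-empty names without "_".
valid_network_name() {
  case "$1" in
    ""|*_*) return 1 ;;
    *)      return 0 ;;
  esac
}

# Hypothetical helper: create the network only if it does not exist yet.
ensure_network() {
  name=$1
  if ! valid_network_name "$name"; then
    echo "Invalid network name: '$name' (empty or contains '_')" >&2
    return 1
  fi
  # `docker network inspect` exits non-zero when the network is missing.
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "Network $name already exists"
  else
    docker network create "$name"
  fi
}
```

Calling `ensure_network hadoop-network` should then be safe to repeat across deployments.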
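IV. HBase orchestration and deployment
1) Install the ZooKeeper environment
For a quick ZooKeeper deployment, see my article: [Middleware] A step-by-step guide to quickly deploying ZooKeeper with docker-compose
2) Install the Hadoop environment
For a quick Hadoop deployment, see my article: A detailed guide to quickly deploying Hive with docker-compose
3) Download the JDK
Official download: www.oracle.com/java/techno…
Baidu Cloud download
Link: pan.baidu.com/s/1-rgW-Z-s… Extraction code: 8888
4) Download HBase
Download page: hbase.apache.org/downloads.h…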
wget https://dlcdn.apache.org/hbase/2.5.4/hbase-2.5.4-bin.tar.gz --no-check-certificate
4) Configuration
conf/hbase-env.sh
export JAVA_HOME=/opt/apache/jdk
export HBASE_CLASSPATH=/opt/apache/hbase/conf
export HBASE_MANAGES_ZK=false
conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop-hdfs-nn:9000/hbase</value>
    <!-- Alternatively hdfs://ns1/hbase, where ns1 is the dfs.nameservices value from hdfs-site.xml -->
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zookeeper-node1,zookeeper-node2,zookeeper-node3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>60000</value>
    <description>Standalone mode needs the hostname/IP and port; HA mode only needs the port</description>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>16000</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>16010</value>
  </property>
  <property>
    <name>hbase.regionserver.port</name>
    <value>16020</value>
  </property>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>16030</value>
  </property>
  <property>
    <name>hbase.wal.provider</name>
    <value>filesystem</value> <!-- multiwal can also be used -->
  </property>
</configuration>
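A typo in hbase-site.xml usually only surfaces once the containers start, so it can help to read key values back before mounting the file. The sketch below is an assumption-laden convenience, not a real XML parser: `get_prop` is a hypothetical helper, and it assumes each `<name>` and `<value>` element sits on a single line, as in the file above.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style configuration file.
# $1 = path to the XML file, $2 = property name
get_prop() {
  awk -v key="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n)
      sub(/<\/name>.*/, "", n)
    }
    /<value>/ {
      if (n == key) {
        v = $0
        sub(/.*<value>/, "", v)
        sub(/<\/value>.*/, "", v)
        print v
        exit
      }
    }
  ' "$1"
}
```

For example, `get_prop conf/hbase-site.xml hbase.zookeeper.quorum` should print the ZooKeeper host list configured above; an empty result means the property was not found.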
conf/backup-masters
hbase-master-2
conf/regionservers
hbase-regionserver-1
hbase-regionserver-2
hbase-regionserver-3
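The names in conf/backup-masters and conf/regionservers have to match the container hostnames declared in docker-compose.yaml. A minimal sketch of a consistency check follows; `missing_hosts` is a hypothetical helper, and it assumes the compose file declares each host with a plain `hostname: <name>` line.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print every host from a hosts file (one per line)
# that has no matching "hostname:" entry in the compose file.
# $1 = hosts file, $2 = compose file
missing_hosts() {
  hosts_file=$1
  compose_file=$2
  # Skip blank lines and comment lines in the hosts file.
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$hosts_file" |
  while read -r host; do
    if ! grep -Eq "^[[:space:]]*hostname:[[:space:]]*${host}[[:space:]]*\$" "$compose_file"; then
      echo "$host"
    fi
  done
}
```

Running `missing_hosts conf/regionservers docker-compose.yaml` and the same for conf/backup-masters should print nothing when every listed host has a matching service.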
conf/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- NameNode address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-hdfs-nn:9000</value>
  </property>
  <!-- File buffer size (128 KB); the default is 4 KB -->
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <!-- Retention time of the file system trash, in minutes -->
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
  <!-- Hadoop temporary directory, used to store metadata. Make sure /opt/apache/hadoop/data/hdfs/ has been created manually; the tmp directory is created automatically -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/apache/hadoop/data/hdfs/tmp</value>
  </property>
  <!-- Static user for the HDFS web UI: root -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
  </property>
  <!-- Hosts from which root (superuser) may act as a proxy -->
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <!-- Groups whose users root (superuser) may impersonate -->
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
  <!-- Users that root (superuser) may impersonate -->
  <property>
    <name>hadoop.proxyuser.root.users</name>
    <value>*</value>
  </property>
  <!-- Hosts from which hive may act as a proxy -->
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>*</value>
  </property>
  <!-- Groups whose users hive may impersonate -->
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>*</value>
  </property>
  <!-- Hosts from which hadoop may act as a proxy -->
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <!-- Groups whose users hadoop may impersonate -->
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
conf/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- NameNode web UI address -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>0.0.0.0:9870</value>
  </property>
  <!-- dfs.webhdfs.enabled must be true; otherwise WebHDFS operations that list file and directory status, such as LISTSTATUS and GETFILESTATUS, cannot be used, because that information is held by the NameNode -->
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/apache/hadoop/data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/apache/hadoop/data/hdfs/datanode/data1,/opt/apache/hadoop/data/hdfs/datanode/data2,/opt/apache/hadoop/data/hdfs/datanode/data3</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Host on which the SecondaryNameNode (SNN) process runs -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-hdfs-nn2:9868</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
  <!-- Whitelist -->
  <property>
    <name>dfs.hosts</name>
    <value>/opt/apache/hadoop/etc/hadoop/dfs.hosts</value>
  </property>
  <!-- Blacklist -->
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/opt/apache/hadoop/etc/hadoop/dfs.hosts.exclude</value>
  </property>
</configuration>
5) Startup script bootstrap.sh
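#!/usr/bin/env sh

# Block until host $1 accepts TCP connections on port $2.
wait_for() {
  echo "Waiting for $1 to listen on $2..."
  while ! nc -z "$1" "$2"; do echo waiting...; sleep 1; done
}

# Start an HBase Master. If a host and port are given, wait for them first
# (the backup master uses this to wait for the primary).
start_hbase_master() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    wait_for "$1" "$2"
  fi
  "${HBASE_HOME}/bin/hbase-daemon.sh" start master
  tail -f "${HBASE_HOME}"/logs/*master*.out
}

# Start a RegionServer once the master at $1:$2 is reachable.
start_hbase_regionserver() {
  wait_for "$1" "$2"
  "${HBASE_HOME}/bin/hbase-daemon.sh" start regionserver
  tail -f "${HBASE_HOME}"/logs/*regionserver*.log
}

case $1 in
  hbase-master)
    start_hbase_master "$2" "$3"
    ;;
  hbase-regionserver)
    start_hbase_regionserver "$2" "$3"
    ;;
  *)
    echo "Please enter a valid service start command"
    ;;
esac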
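The `wait_for` loop in the script above retries forever, so a dependency that never comes up leaves the container waiting silently. One possible variant with an upper bound is sketched below; `wait_for_port` and its default of 120 seconds are assumptions for illustration and are not part of the original script.

```shell
#!/usr/bin/env sh
# Hypothetical variant of wait_for with a timeout.
# $1 = host, $2 = port, $3 = timeout in seconds (assumed default: 120)
wait_for_port() {
  host=$1
  port=$2
  timeout=${3:-120}
  elapsed=0
  echo "Waiting up to ${timeout}s for ${host}:${port}..."
  # nc -z only probes whether the port accepts connections.
  until nc -z "$host" "$port" 2>/dev/null; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out waiting for ${host}:${port}" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

A caller could then write `wait_for_port "$1" "$2" 300 || exit 1`, which lets the `restart: always` policy retry the container instead of leaving it hanging.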
6) Build the image: Dockerfile
FROM registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/centos:7.7.1908
RUN rm -f /etc/localtime && ln -sv /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone
# Set the locale with ENV; a variable exported in a RUN step does not persist
ENV LANG zh_CN.UTF-8
# Create the user and group used by the "user:" setting in the compose file
RUN groupadd --system --gid=10000 hadoop && useradd --system --home-dir /home/hadoop --uid=10000 --gid=hadoop hadoop -m
# Install sudo
RUN yum -y install sudo && chmod 640 /etc/sudoers
# Grant sudo privileges to the hadoop user
RUN echo "hadoop ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
RUN yum -y install net-tools telnet wget nc less tree
RUN mkdir /opt/apache/
# Add and configure the JDK
ADD jdk-8u212-linux-x64.tar.gz /opt/apache/
ENV JAVA_HOME /opt/apache/jdk
ENV PATH $JAVA_HOME/bin:$PATH
RUN ln -s /opt/apache/jdk1.8.0_212 $JAVA_HOME
# HBase
ENV HBASE_VERSION 2.5.4
ADD hbase-${HBASE_VERSION}-bin.tar.gz /opt/apache/
ENV HBASE_HOME /opt/apache/hbase
ENV PATH $HBASE_HOME/bin:$PATH
RUN ln -s /opt/apache/hbase-${HBASE_VERSION} $HBASE_HOME
# Copy bootstrap.sh
COPY bootstrap.sh /opt/apache/
RUN chmod +x /opt/apache/bootstrap.sh
RUN chown -R hadoop:hadoop /opt/apache
WORKDIR $HBASE_HOME
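Build the image
docker build -t registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4 . --no-cache
# For convenience, the image has been pushed to the Alibaba Cloud registry so it can be pulled and used directly
docker push registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
### Parameter notes
# -t: name (and tag) of the image
# . : build context; the Dockerfile in the current directory is used
# -f: path to the Dockerfile
# --no-cache: build without using the layer cache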
7) Orchestration: docker-compose.yaml
version: '3'
services:
  hbase-master-1:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-master-1
    hostname: hbase-master-1
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36010:${HBASE_MASTER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-master"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_MASTER_PORT} || exit 1"]
      interval: 10s
      timeout: 20s
      retries: 3
  hbase-master-2:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-master-2
    hostname: hbase-master-2
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36011:${HBASE_MASTER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-master hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_MASTER_PORT} || exit 1"]
      interval: 10s
      timeout: 20s
      retries: 3
  hbase-regionserver-1:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-1
    hostname: hbase-regionserver-1
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36030:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
  hbase-regionserver-2:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-2
    hostname: hbase-regionserver-2
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36031:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
  hbase-regionserver-3:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-3
    hostname: hbase-regionserver-3
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36032:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
# Attach to the external network created earlier
networks:
  hadoop-network:
    external: true
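The compose file reads `${HBASE_HOME}`, `${HBASE_MASTER_PORT}` and `${HBASE_REGIONSERVER_PORT}` from a `.env` file that is not listed in this article. The following is only a sketch of what it might contain. `HBASE_HOME` matches the Dockerfile; the two port values are assumptions inferred from the web UI mappings (36010 and 36030) and the info ports set in hbase-site.xml, so adjust them if your setup differs.

```shell
# .env (sketch; the port values are assumptions, not taken from the original project)
# Install path inside the image, as set in the Dockerfile
HBASE_HOME=/opt/apache/hbase
# Assumed: master web UI port (hbase.master.info.port)
HBASE_MASTER_PORT=16010
# Assumed: RegionServer web UI port (hbase.regionserver.info.port)
HBASE_REGIONSERVER_PORT=16030
```

The file goes next to docker-compose.yaml, where docker-compose picks it up both for variable substitution and for the `env_file` entries.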
8) Start the deployment
docker-compose -f docker-compose.yaml up -d
# Check
docker-compose -f docker-compose.yaml ps
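Since every service defines a healthcheck, a script can wait for the containers to report healthy instead of re-running `ps` by hand. The following is a minimal sketch; `wait_healthy`, its 5-second polling interval and its default timeout of 180 seconds are assumptions for illustration.

```shell
#!/usr/bin/env sh
# Hypothetical helper: poll a container's health status until it is "healthy".
# $1 = container name, $2 = timeout in seconds (assumed default: 180)
wait_healthy() {
  name=$1
  timeout=${2:-180}
  elapsed=0
  while :; do
    # Reads the status produced by the healthcheck in the compose file.
    status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      echo "$name is healthy"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "$name not healthy after ${timeout}s (last status: ${status:-unknown})" >&2
      return 1
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done
}

# Example usage:
#   for c in hbase-master-1 hbase-master-2 hbase-regionserver-1; do
#     wait_healthy "$c" || break
#   done
```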
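V. Simple test and verification
Web UI: http://ip:36010/
docker exec -it hbase-master-1 bash
hbase shell
### Check the cluster status
status
### Create a simple table
create 'user', 'info', 'data'
# user is the table name
# info is the name of column family 1
# data is the name of column family 2
### Show the table description
desc 'user'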
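VI. Common HBase client commands
HBase is an open-source, distributed, column-oriented database for storing and processing large-scale structured data on Apache Hadoop. You can manage and operate it through the HBase shell or a client. Some commonly used HBase shell commands:
1) Connect to the HBase shell
hbase shell
2) Create a table
create 'table_name', 'column_family1', 'column_family2', ...
3) List existing tables
list
4) Show the table structure
describe 'table_name'
5) Insert data
put 'table_name', 'row_key', 'column_family:column', 'value'
6) Get data
get 'table_name', 'row_key'
7) Scan table data
scan 'table_name'
8) Delete data
# timestamp is optional and must be a number, not a quoted string
delete 'table_name', 'row_key', 'column_family:column', timestamp
9) Disable a table
disable 'table_name'
10) Enable a table
enable 'table_name'
11) Drop a table
disable 'table_name'
drop 'table_name'
12) Alter a table
# new_version is the number of versions to keep, given as a number
alter 'table_name', {NAME => 'column_family', VERSIONS => new_version}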
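These commands let you create tables, insert and read data, scan tables, and manage tables in HBase. In practice, replace placeholders such as 'table_name', 'column_family' and 'row_key' with your actual table name, column family name and row key.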
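To try the commands above end to end without typing them one by one, they can be generated and piped into a non-interactive HBase shell. This is only a sketch: `crud_script` is a hypothetical helper, and the table, row and value names are made up for illustration.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print a short HBase shell session that creates a
# scratch table, writes and reads one cell, then removes the table again.
# $1 = table name
crud_script() {
  table=$1
  cat <<EOF
create '${table}', 'info'
put '${table}', 'row1', 'info:name', 'alice'
get '${table}', 'row1'
scan '${table}'
delete '${table}', 'row1', 'info:name'
disable '${table}'
drop '${table}'
EOF
}

# Example usage (-n runs the HBase shell in non-interactive mode):
#   crud_script demo_table | docker exec -i hbase-master-1 hbase shell -n
```

Printing the script first with `crud_script demo_table` is a cheap way to review the commands before they touch the cluster.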
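That concludes this step-by-step guide to quickly deploying HBase with docker-compose. More technical articles will follow. If you have questions, send me a private message or follow the WeChat official account 【大数据与云原生技术分享】 to join the discussion group.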