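Hive is a data warehouse built on top of Hadoop. It suits workloads that can tolerate high query latency; if you need low-latency access, HBase is the better choice.
Prerequisite: Hadoop must already be installed and configured. See: Hadoop 2.7.3 pseudo-distributed environment setup, detailed installation walkthrough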
Install MySQL
- Download and install MySQL
yum install mysql-server
- Set the default character set and storage engine
vim /etc/my.cnf
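Add the following under [mysqld] (on MySQL 5.5 and later the first setting is spelled character-set-server=utf8; default-character-set is only accepted under [mysqld] by older servers such as 5.1):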
default-character-set=utf8
default-storage-engine=INNODB
- Start MySQL
cd /etc/init.d
./mysqld start
- Open the MySQL client
mysql
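Before moving on, it can be worth confirming that the two settings really ended up inside the [mysqld] section and not under another section such as [client]. The following is a minimal sketch, not part of the original procedure: the helper name mysqld_setting is made up for illustration, and the demo runs against a throwaway file rather than the real /etc/my.cnf.

```shell
#!/bin/sh
# Print the value of KEY inside the [mysqld] section of an ini-style file.
# Usage: mysqld_setting FILE KEY   (hypothetical helper name)
mysqld_setting() {
    awk -F= -v key="$2" '
        # A section header: remember whether it is exactly [mysqld].
        /^[ \t]*\[/ { in_section = ($0 ~ /^[ \t]*\[mysqld\][ \t]*$/); next }
        in_section {
            k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
            }
        }
    ' "$1"
}

# Demo against a temporary file; the real file would be /etc/my.cnf.
demo_cnf=$(mktemp)
cat > "$demo_cnf" <<'EOF'
[mysqld]
datadir=/var/lib/mysql
default-character-set=utf8
default-storage-engine=INNODB

[mysqld_safe]
log-error=/var/log/mysqld.log
EOF

mysqld_setting "$demo_cnf" default-character-set
mysqld_setting "$demo_cnf" default-storage-engine
rm -f "$demo_cnf"
```

Against the real file, the call would be mysqld_setting /etc/my.cnf default-storage-engine; an empty result means the key is missing from [mysqld].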
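Create and configure the hive database
Create a database named hive for the Hive user and set its character set to latin1 (with utf8, the metastore schema can run into MySQL's index key length limit on older versions)
mysql> create database hive default character set latin1;
Check that the hive database was created: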
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hive |
| mysql |
| test |
+--------------------+
4 rows in set (0.00 sec)
- Create the hive user and grant privileges
//Grant the hive user all privileges on the hive database
mysql> grant all privileges on hive.* to hive@'%' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
//Reload the privilege tables
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
- Test whether the hive user can connect to MySQL
[root@cognos init.d]# mysql -u hive -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
...
mysql> use hive;
Database changed
mysql> show tables;
Empty set (0.00 sec)
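The database and account setup above can be kept in a small script so that the names and password are not retyped at the mysql prompt. This is only a sketch: metastore_sql is a hypothetical helper, it assumes the arguments contain no quote characters, and it emits the same MySQL 5.x style GRANT ... IDENTIFIED BY statement used above (MySQL 8.0 no longer accepts IDENTIFIED BY inside GRANT, so there the user would be created with CREATE USER first).

```shell
#!/bin/sh
# Emit the SQL used above for a given database, user and password.
# Usage: metastore_sql DB USER PASSWORD   (hypothetical helper name)
# Assumption: none of the arguments contains a single quote.
metastore_sql() {
    cat <<EOF
create database if not exists $1 default character set latin1;
grant all privileges on $1.* to '$2'@'%' identified by '$3';
flush privileges;
EOF
}

# Print the statements for the values used in this post.
metastore_sql hive hive 123456
```

The output can be reviewed first and then piped into the client, for example metastore_sql hive hive 123456 | mysql -u root -p.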
Install Hive
- Download
hive-2.0.1 download
- Extract
tar -xzvf apache-hive-2.0.1-bin.tar.gz
- Rename the extracted directory and move it under the hadoop directory
mv apache-hive-2.0.1-bin hive
mv hive /opt/hadoop/
- Download the MySQL driver jar and put it in the lib directory of the Hive installation
I used mysql-connector-java-5.1.36-bin.jar here.
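The extract, rename and move steps can be combined into one function that refuses to overwrite an existing installation. This is a sketch under two assumptions: install_hive is a made-up name, and the tarball's top-level directory matches the file name without .tar.gz, as it does for apache-hive-2.0.1-bin.tar.gz above.

```shell
#!/bin/sh
# Extract a Hive binary tarball and place it at DEST/hive.
# Usage: install_hive TARBALL DEST   (hypothetical helper name)
install_hive() {
    tarball=$1
    dest=$2
    # apache-hive-2.0.1-bin.tar.gz -> apache-hive-2.0.1-bin
    top=$(basename "$tarball" .tar.gz)
    if [ -e "$dest/hive" ]; then
        echo "refusing to overwrite $dest/hive" >&2
        return 1
    fi
    # Unpack into a scratch directory first so a failed extraction
    # leaves nothing half-installed under DEST.
    workdir=$(mktemp -d) || return 1
    tar -xzf "$tarball" -C "$workdir" || { rm -rf "$workdir"; return 1; }
    mkdir -p "$dest" && mv "$workdir/$top" "$dest/hive"
    status=$?
    rm -rf "$workdir"
    return $status
}
```

With the paths from this post the call would be install_hive apache-hive-2.0.1-bin.tar.gz /opt/hadoop, followed by cp mysql-connector-java-5.1.36-bin.jar /opt/hadoop/hive/lib/ for the driver.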
Configuration
1. Edit the environment variables
vi /etc/profile
Add the following:
#HIVE
export HIVE_HOME=/opt/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin
Run source /etc/profile so the change takes effect.
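The PATH printed in the hive login output later in this post lists the Hadoop directories three times, which is what happens when a profile that appends unconditionally is sourced repeatedly. A guarded variant of the two export lines might look like the sketch below; path_contains is a hypothetical helper, and /opt/hadoop/hive is the install location used in this post.

```shell
#!/bin/sh
# Succeed if DIR is already one of the colon-separated entries in PATH.
# Usage: path_contains DIR   (hypothetical helper name)
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Install location used in this post; an existing value is kept.
HIVE_HOME=${HIVE_HOME:-/opt/hadoop/hive}
export HIVE_HOME

# Append only when missing, so sourcing the profile again
# does not keep lengthening PATH.
if ! path_contains "$HIVE_HOME/bin"; then
    PATH=$PATH:$HIVE_HOME/bin
    export PATH
fi
```

Sourcing this fragment any number of times leaves exactly one $HIVE_HOME/bin entry in PATH.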
2. Edit the Hive configuration files
- Copy the configuration templates (in /opt/hadoop/hive/conf)
cp hive-default.xml.template hive-default.xml
cp hive-env.sh.template hive-env.sh
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
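- Edit hive-default.xml
vim hive-default.xml
Use vim's search command to find the entries containing javax and set the four connection properties below to match the MySQL setup from earlier. (Hive's own documentation names hive-site.xml as the file for user overrides, so if these values are not picked up, put them in conf/hive-site.xml instead.)
#JDBC driver class
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
#MySQL connection URL
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true</value>
#MySQL user name
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
#Password for that user
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>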
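In the file itself each name/value pair sits inside a <property> element, as the full entries near the end of this post show. If the four entries are written out by hand, for example into a fresh file, a small generator avoids typos and escapes characters that are special in XML. This is an illustrative sketch; hive_property and xml_escape are hypothetical names.

```shell
#!/bin/sh
# Escape the characters that are special inside XML text.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print one <property> element in the layout used by Hive's XML config files.
# Usage: hive_property NAME VALUE   (hypothetical helper name)
hive_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' \
        "$1" "$(xml_escape "$2")"
}

# The four connection settings from this post.
hive_property javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver
hive_property javax.jdo.option.ConnectionURL 'jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true'
hive_property javax.jdo.option.ConnectionUserName hive
hive_property javax.jdo.option.ConnectionPassword 123456
```

The escaping matters once the JDBC URL carries more than one parameter, because a bare & inside <value> makes the XML file unparseable.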
Search commands in vim on Red Hat
:set hls //turn on search highlighting
/XXX //search forward (down)
?XXX //search backward (up)
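As an alternative to searching interactively, grep can list the line numbers of the four connection properties up front. The sketch below uses a hypothetical helper named find_jdo_connection_options; the pattern is restricted to the four names shown above, so other javax.jdo.option entries in the file are not matched.

```shell
#!/bin/sh
# List "LINE:text" for the four JDO connection properties in a config file.
# Usage: find_jdo_connection_options FILE   (hypothetical helper name)
find_jdo_connection_options() {
    grep -n -E '<name>javax\.jdo\.option\.Connection(DriverName|URL|UserName|Password)</name>' "$1"
}

# Demo on a temporary file standing in for the real config.
demo_xml=$(mktemp)
cat > "$demo_xml" <<'EOF'
<configuration>
  <name>javax.jdo.option.ConnectionURL</name>
  <name>javax.jdo.option.ConnectionUserName</name>
</configuration>
EOF
find_jdo_connection_options "$demo_xml"
rm -f "$demo_xml"
```

Each reported line number can be passed to vim as vim +NUMBER hive-default.xml to open the file at that entry.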
>####Startup
1. Start the Hive Metastore Server process
hive --service metastore &
2. Initialize the metastore schema before the first Hive login
schematool -dbType mysql -initSchema
3. Log in to Hive
[root@cognos conf]# hive
which: no hbase in (/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/root/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hive/bin)
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
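Roughly: Hive-on-MR (running Hive on the MapReduce engine) is deprecated in Hive 2 and may not be available in future versions, so the message suggests a different execution engine such as Tez or Spark. I don't yet know what practical impact this has.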
hive> show databases;
OK
default
Time taken: 0.728 seconds, Fetched: 1 row(s)
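4. Verify
Once Hive is configured successfully, the hive database in MySQL contains the metastore tables, which can be inspected from the mysql client.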
mysql> use hive
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
+---------------------------+
55 rows in set (0.01 sec)
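The same check can be scripted. The metastore's VERSION table is the one Hive complains about when the schema was never initialized, so its presence is a reasonable quick indicator. The function below is a sketch with a made-up name; it only reads table names from standard input, and the mysql invocation in the comment is an assumed way to feed it.

```shell
#!/bin/sh
# Read table names (one per line) on stdin and succeed if the
# metastore's VERSION table is among them.
# Usage: ... | has_version_table   (hypothetical helper name)
has_version_table() {
    grep -qx 'VERSION'
}

# Assumed usage against the database from this post:
#   mysql -u hive -p -N -e 'show tables' hive | has_version_table \
#       && echo "metastore schema is initialized"
```

The -N option suppresses the column header so that only table names reach the function.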
>####Errors and fixes
1. Multiple SLF4J bindings
which: no hbase in (/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/root/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
**Solution**
The two jars listed above both provide an SLF4J logger binding. Removing the older one clears the warning.
rm -rf /opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar
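A more cautious variant of this fix is to list the binding jars first and then move the unwanted one into a backup directory, so the change can be undone if Hadoop's own logging turns out to need it. The sketch below matches only the two jar name patterns that appear in the warning; both function names and the backup directory are assumptions, not part of the original fix.

```shell
#!/bin/sh
# List SLF4J binding jars under the given directories.
# Usage: find_slf4j_bindings DIR...   (hypothetical helper name)
find_slf4j_bindings() {
    find "$@" -type f \
        \( -name 'slf4j-log4j12-*.jar' -o -name 'log4j-slf4j-impl-*.jar' \) \
        2>/dev/null | sort
}

# Move a jar into a backup directory instead of deleting it.
# Usage: set_aside_jar JAR BACKUP_DIR   (hypothetical helper name)
set_aside_jar() {
    mkdir -p "$2" && mv "$1" "$2/"
}
```

With the paths from this post, find_slf4j_bindings /opt/hadoop/hive/lib /opt/hadoop/hadoop-2.7.3/share would print the two jars, and set_aside_jar could then be called on the Hadoop one with a backup directory of your choice.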
2. The Hive Metastore Server process was not started.
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Exception in thread "main" java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1550)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3080)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3108)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:543)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:516)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:648)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.reflect.InvocationTargetException
**Solution:**
Start the Hive Metastore Server process with the following command:
hive --service metastore &
3. MySQL permission problem
```
javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true, username = hive. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Access denied for user 'hive'@'cognos' (using password: YES)
```
Solution:
In hive-default.xml, replace 172.16.7.191:3306 in the JDBC URL with localhost:3306:
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
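MySQL matches an account on both the user name and the host the client connects from, which is why changing only the host part of the URL can change the outcome. The host swap itself can be done mechanically, as in this sketch. The helper name jdbc_set_host is hypothetical, and the new host is assumed to be a plain host name or IP address without characters that are special to sed.

```shell
#!/bin/sh
# Replace the host part of a jdbc:mysql:// URL, keeping the port,
# database name and query string unchanged.
# Usage: jdbc_set_host URL NEW_HOST   (hypothetical helper name)
jdbc_set_host() {
    printf '%s\n' "$1" | sed "s|^\(jdbc:mysql://\)[^:/]*|\1$2|"
}

# The URL from this post, pointed at localhost instead of the IP address.
jdbc_set_host 'jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true' localhost
```

The printed URL is the value to place in javax.jdo.option.ConnectionURL.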
4. Hive was not initialized before the first login
javax.jdo.JDODataStoreException: Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.schema.autoCreateTables"
Solution:
Before the first login, the metastore schema has to be initialized with schematool. Run the following command:
schematool -dbType mysql -initSchema
5. Unresolved path reference system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
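The cause is that the system:java.io.tmpdir variable in the configuration file cannot be resolved to an actual value, so the path cannot be found. Hive normally creates temporary files and log files at startup; because they cannot be created, the process fails to start.
Solution:
In the configuration file (hive-default.xml in this walkthrough), find every "system:java.io.tmpdir" and replace it with an absolute path such as /opt/hadoop/hive/iotmp/
Also define a system:user.name: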
<property>
<name>system:user.name</name>
<value>user_name</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/hadoop/hive/iotmp/${system:user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/hadoop/hive/iotmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
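In the error message, %7B and %7D are the URL-encoded forms of { and }, so the unresolved text is ${system:java.io.tmpdir}/${system:user.name}. Replacing the first placeholder throughout the file can be scripted, as in the sketch below. The helper name pin_tmpdir is made up, the target directory is assumed to contain no characters that are special to sed, and the original file is kept as a .bak copy.

```shell
#!/bin/sh
# Replace every ${system:java.io.tmpdir} placeholder in a Hive config
# file with an absolute directory and create that directory.
# Usage: pin_tmpdir CONFIG_FILE ABSOLUTE_DIR   (hypothetical helper name)
pin_tmpdir() {
    case $2 in
        /*) ;;
        *) echo "directory must be absolute: $2" >&2; return 1 ;;
    esac
    mkdir -p "$2" || return 1
    # Keep the original next to the edited file.
    cp "$1" "$1.bak" || return 1
    sed 's|\${system:java\.io\.tmpdir}|'"$2"'|g' "$1.bak" > "$1"
}
```

For the layout used in this post the call would be pin_tmpdir /opt/hadoop/hive/conf/hive-default.xml /opt/hadoop/hive/iotmp, which turns ${system:java.io.tmpdir}/${system:user.name} into /opt/hadoop/hive/iotmp/${system:user.name} as in the entries above.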