Hive 설치
1. Hive 다운
# /opt 이동
cd /opt
# Hive 설치 (23.3.2 기준 3.1.3 최신)
sudo wget https://dlcdn.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
# tar 압축 해제
sudo tar -zxvf apache-hive-3.1.3-bin.tar.gz
# 폴더명 변경
sudo my apache-hive-3.1.3-bin hive-3.1.3
2. 환경변수
# .bashrc 수정
vi ~/.bashrc
JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
HADOOP_HOME=/opt/hadoop-3.3.4
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
HIVE_HOME=/opt/hive-3.1.3
export JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HIVE_HOME
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
source ~/.bashrc
3. 설정 파일 생성 수정
# hive-site.xml 을 만들어준다.
sudo vi $HIVE_HOME/conf/hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&serverTimezone=Asia/Seoul</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>user name for connecting to mysql server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>1234</value>
<description>hivepassword for connecting to mysql server</description>
</property>
</configuration>
4. JDBC 설치
curl -o $HIVE_HOME/lib/mysql-connector-java-8.0.22.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.22/mysql-connector-java-8.0.22.jar
5. Hadoop 에 맞는 Guava 교체
# hive guava 버전 확인
ll $HIVE_HOME/lib| grep guav
-rw-r--r-- 1 root staff 2308517 10월 19 2019 guava-19.0.jar
-rw-r--r-- 1 root staff 971309 10월 19 2019 jersey-guava-2.25.1.jar
ll $HADOOP_HOME/share/hadoop/common/lib | grep guav
-rw-r--r-- 1 hdfs hdfs 2747878 7월 29 2022 guava-27.0-jre.jar
-rw-r--r-- 1 hdfs hdfs 3362359 7월 29 2022 hadoop-shaded-guava-1.1.1.jar
-rw-r--r-- 1 hdfs hdfs 2199 7월 29 2022 listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
# guava 교체
sudo rm -rf $HIVE_HOME/lib/guava-19.0.jar
sudo cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/
- hdfs 경로 만들기
hdfs dfs -mkdir /tmp
hdfs dfs -mkdir /hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /hive/warehouse
- hivemetastore 초기화
hive --service schemaTool -dbType mysql -initSchema
- hive 실행 오류
아래와 같은 에러가 나면 JDK 버전 오류이다. Hive는 기본적으로 JDK 8 버전에서 동작 되지만 Hadoop 3.x.x 는 JDK 11 버전이 권장이다. 그렇기 때문에 hadoop-env.sh 안의 JAVA_HOME을 Hadoop 동작상태에서 8로만 바꿔주면 동작하게 된다. (물론 JDK 8을 받아야 한다.)
Exception in thread "main" java.lang.ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are in module java.base of loader 'bootstrap')
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:413)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:389)
at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
- hive
hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive-3.1.3/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = ca3213bf-f1b2-41a5-8835-e8ccb485bffb
Logging initialized using configuration in jar:file:/opt/hive-3.1.3/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true
Hive Session ID = d9e58f1e-00b8-4670-9450-815802350d5c
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>