Hive 설치





1. Hive 다운


# /opt 이동
cd /opt

# Hive 설치 (23.3.2 기준 3.1.3 최신)
sudo wget https://dlcdn.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz

# tar 압축 해제
sudo tar -zxvf apache-hive-3.1.3-bin.tar.gz

# 폴더명 변경
sudo my apache-hive-3.1.3-bin hive-3.1.3



2. 환경변수


# .bashrc  수정

vi ~/.bashrc

JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
HADOOP_HOME=/opt/hadoop-3.3.4
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
HIVE_HOME=/opt/hive-3.1.3

export JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HIVE_HOME
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin

source ~/.bashrc



3. 설정 파일 생성 수정


# hive-site.xml 을 만들어준다.
sudo vi $HIVE_HOME/conf/hive-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;serverTimezone=Asia/Seoul</value>
        <description>metadata is stored in a MySQL server</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
        <description>MySQL JDBC driver class</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>user name for connecting to mysql server</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>1234</value>
        <description>hivepassword for connecting to mysql server</description>
    </property>
</configuration>



4. JDBC 설치


curl -o $HIVE_HOME/lib/mysql-connector-java-8.0.22.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.22/mysql-connector-java-8.0.22.jar



5. Hadoop 에 맞는 Guava 교체


# hive guava 버전 확인
ll $HIVE_HOME/lib| grep guav
-rw-r--r--  1 root staff  2308517 10월 19  2019 guava-19.0.jar
-rw-r--r--  1 root staff   971309 10월 19  2019 jersey-guava-2.25.1.jar


ll $HADOOP_HOME/share/hadoop/common/lib | grep guav
-rw-r--r-- 1 hdfs hdfs 2747878  7월 29  2022 guava-27.0-jre.jar
-rw-r--r-- 1 hdfs hdfs 3362359  7월 29  2022 hadoop-shaded-guava-1.1.1.jar
-rw-r--r-- 1 hdfs hdfs    2199  7월 29  2022 listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar

# guava 교체
sudo rm -rf $HIVE_HOME/lib/guava-19.0.jar
sudo cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/



  1. hdfs 경로 만들기


hdfs dfs -mkdir /tmp
hdfs dfs -mkdir /hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /hive/warehouse



  1. hivemetastore 초기화


hive --service schemaTool -dbType mysql -initSchema



  1. hive 실행 오류


아래와 같은 에러가 나면 JDK 버전 오류이다. Hive는 기본적으로 JDK 8 버전에서 동작 되지만 Hadoop 3.x.x 는 JDK 11 버전이 권장이다. 그렇기 때문에 hadoop-env.sh 안의 JAVA_HOME을 Hadoop 동작상태에서 8로만 바꿔주면 동작하게 된다. (물론 JDK 8을 받아야 한다.)

Exception in thread "main" java.lang.ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are in module java.base of loader 'bootstrap')
	at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:413)
	at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:389)
	at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:236)



  1. hive


hive

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive-3.1.3/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = ca3213bf-f1b2-41a5-8835-e8ccb485bffb

Logging initialized using configuration in jar:file:/opt/hive-3.1.3/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true
Hive Session ID = d9e58f1e-00b8-4670-9450-815802350d5c
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>