[Hive] Hive Architecture
Hive Architecture
Driver
- receives the query and provides the execute and fetch APIs, which are modeled on the JDBC & ODBC interfaces
- converts the Hive query into an MR (MapReduce) program with the help of the Compiler
Compiler
- converts the Hive query into MR programs
- performs semantic analysis of the query
- generates an execution plan with the help of the metastore
Metastore
- database storing all the structural information (metadata) about the tables: partitions, number of columns in each table, column data types, serializers, deserializers, etc.
- by default Hive uses an embedded Apache Derby database as the metastore; it is not used in production because it supports only a single process at a time (two instances of the Hive CLI can't run simultaneously)
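To make the kind of information kept in the metastore concrete, here is a minimal Python sketch of a table record and an in-memory catalog. It is an illustration only: `Column`, `TableMetadata` and `ToyMetastore` are hypothetical names, not part of Hive's API, and the real metastore keeps this data in a relational database rather than in a dict.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    data_type: str  # e.g. "string", "int"


@dataclass
class TableMetadata:
    name: str
    columns: list[Column]
    partition_keys: list[str] = field(default_factory=list)
    serde: str = "LazySimpleSerDe"  # illustrative label for the (de)serializer


class ToyMetastore:
    """In-memory stand-in for the metastore (hypothetical, not Hive's API)."""

    def __init__(self) -> None:
        self._tables: dict[str, TableMetadata] = {}

    def register(self, table: TableMetadata) -> None:
        # Store the table's metadata under its name.
        self._tables[table.name] = table

    def get_table(self, name: str) -> TableMetadata:
        if name not in self._tables:
            raise KeyError(f"unknown table: {name}")
        return self._tables[name]

    def column_types(self, name: str) -> dict[str, str]:
        # Map each column name to its declared data type.
        return {c.name: c.data_type for c in self.get_table(name).columns}
```

A compiler could consult such a catalog during semantic analysis, for example to check that every column referenced in a query exists and has the expected type.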
Execution Engine
- connected with Hadoop
- executes the plan created by the compiler
- interacts with the ResourceManager & NameNode to fetch the desired output data from HDFS
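The hand-off between the components described above can be sketched as a toy pipeline in Python. Everything here is an assumption made for illustration: the class names mirror the component names but are not Hive classes, the "query" is simplified to a table name plus a column list instead of parsed HiveQL, a dict stands in for the metastore, and another dict stands in for data on HDFS. No MapReduce job is actually generated.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    table: str
    columns: list[str]


class Compiler:
    """Checks the query against metastore metadata and emits a plan."""

    def __init__(self, metastore: dict[str, list[str]]) -> None:
        self.metastore = metastore  # table name -> known column names

    def compile(self, table: str, columns: list[str]) -> Plan:
        # Semantic analysis: the table and every column must be known.
        known = self.metastore.get(table)
        if known is None:
            raise ValueError(f"unknown table: {table}")
        missing = [c for c in columns if c not in known]
        if missing:
            raise ValueError(f"unknown columns: {missing}")
        return Plan(table, columns)


class ExecutionEngine:
    """Runs a plan against stored rows (a dict stands in for HDFS)."""

    def __init__(self, storage: dict[str, list[dict]]) -> None:
        self.storage = storage

    def execute(self, plan: Plan) -> list[tuple]:
        rows = self.storage[plan.table]
        return [tuple(row[c] for c in plan.columns) for row in rows]


class Driver:
    """Receives the query, asks the compiler for a plan, then has it executed."""

    def __init__(self, compiler: Compiler, engine: ExecutionEngine) -> None:
        self.compiler = compiler
        self.engine = engine

    def run(self, table: str, columns: list[str]) -> list[tuple]:
        plan = self.compiler.compile(table, columns)
        return self.engine.execute(plan)
```

For example, with a metastore entry `{"logs": ["ts", "status"]}` and two stored rows, `Driver(...).run("logs", ["status"])` returns one single-element tuple per row, while asking for a column that is not in the metastore fails in the compiler before anything is executed.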