[Hive] Hive Architecture

less than 1 minute read

Hive Architecture

Driver

  • fechtes the required APIs needed for the query that are modeled on JDBC & ODBC interfaces
  • convert the Hive query into MR program from the help for the Compiliier

Compiler

  • convert the Hive query into MR programs
  • does semantic analysis of the program
  • generates a execution plan with the help of a metastore

Metastore

  • database storing all the structured information of the tables, partitions, number of columns in the tables, column data types, serializers, deserializers, etc
  • by default Hive uses built-in Derby SQL server as metastore, not used in production since it provides single process storage(can’t run two simultaneous instances of Hive CLI)

Execution Engine

  • connected with Hadoop
  • executes a plan created by the compiler
  • interact with the resourcemanager & namenode to fetch desired output data from the HDFS

Tags:

Categories:

Updated: