
Sep 22 2017

Using Avro in MapReduce jobs with Hadoop, Pig, Hive – Michael G

# Using Avro in MapReduce Jobs With Hadoop, Pig, Hive

Apache Avro is a very popular data serialization format in the Hadoop technology stack. In this article I show code examples of MapReduce jobs in Java, Hadoop Streaming, Pig and Hive that read and/or write data in Avro format. We will use a small, Twitter-like data set as input for our example MapReduce jobs.

The latest version of this article and the corresponding code examples are available at avro-hadoop-starter on GitHub.

## Requirements

The examples require the following software versions:

- Gradle 1.3+ (only for the Java examples)
- Java JDK 7 (only …