Amazon EC2 Category

Running local mrjob streaming hadoop jobs

Follow the steps below to run a local mrjob job. In this example I run an mrjob job to calculate word frequency.

Prereq: Python 2.6 or 2.7 must be installed for this to work.

Step 1: Download mrjob: https://github.com/Yelp/mrjob
Step 2: Navigate to Yelp/mrjob/examples in your terminal.
Step 3: Create a dataset: download a dataset from http://www.infochimps.com.
Step 4: […]
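The word-frequency job mentioned above comes down to a map step that emits (word, 1) pairs and a reduce step that sums them. The sketch below shows that logic in plain Python, without using the mrjob API; the names mapper, reducer, run_local and the sample lines are my own illustrations, not taken from the post, and the in-memory grouping only stands in for the shuffle that Hadoop would do.

```python
import re
from collections import defaultdict

# Treat runs of word characters and apostrophes as words (an assumption).
WORD_RE = re.compile(r"[\w']+")


def mapper(line):
    """Emit a (word, 1) pair for every word in one line of text."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def reducer(word, counts):
    """Sum all the counts collected for a single word."""
    return word, sum(counts)


def run_local(lines):
    """Group mapper output by word in memory, then reduce each group."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    sample = ["the quick brown fox", "the lazy dog and the fox"]
    for word, freq in sorted(run_local(sample).items()):
        print("%s\t%d" % (word, freq))
```

With mrjob, the same two functions would become methods on a job class and the grouping would be handled by the framework.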

Map and Reduce Python Script Example

Below is an example of your first mapper, reducer, and data sample. Let’s look at the Mapper.py file:

    import sys
    from numpy import mat, mean, power

    # read the input line by line
    def read_input(file):
        for line in file:
            # return each line with trailing whitespace removed (same as Trim())
            yield line.rstrip()

    # creates a list of input […]
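The excerpt is cut off before the mapper does any work, so what the original script computes is not shown here. Its imports (mat, mean, power) suggest it emits summary statistics for numeric input, and the sketch below assumes exactly that: one number per line, with the mapper printing a count, a mean, and a mean of squares. It uses only the standard library instead of numpy, and the summarize function and output format are hypothetical.

```python
import sys


def read_input(stream):
    """Yield each input line with trailing whitespace removed."""
    for line in stream:
        yield line.rstrip()


def summarize(lines):
    """Return (count, mean, mean of squares) for numeric lines.

    Blank lines are skipped. Assumes every other line parses as a float.
    """
    values = [float(line) for line in lines if line]
    n = len(values)
    if n == 0:
        return 0, 0.0, 0.0
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    return n, mean, mean_sq


if __name__ == "__main__":
    # Hadoop streaming feeds the mapper its input split on stdin.
    n, mean, mean_sq = summarize(read_input(sys.stdin))
    # Tab-separated output, which a reducer can parse field by field.
    print("%d\t%f\t%f" % (n, mean, mean_sq))
```

Emitting the count alongside the two means lets a reducer combine the partial results from several mappers into a global mean and variance.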

Running Map Reduce on Amazon Elastic MapReduce

Below are the steps to write your first Map Reduce job on Amazon EMR.

Step 1: Register on Amazon: aws.amazon.com/free

After you log in, we will complete the following two parts:

Part 1: Create the buckets for our input file and our map and reduce files
Part 2: Create our map reduce job

Part 1: Create the buckets for […]
