Retrieving and downloading data

get_data – Get data from MAST or hard disk

join_quarters – Stitch multiple quarters of data together

A more advanced Reducer that uses Python iterators and generators. Source: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/

join_quarters.main(instream=sys.stdin, outstream=None)
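
The reducer pattern described above (a generator that reads records lazily, combined with itertools.groupby) can be sketched roughly as follows. This is not the source of join_quarters.main: the function names read_records and reduce_stream, the tab-separated key/value record format, and the comma-joined output are all assumptions made for illustration. Like any Hadoop streaming reducer, the sketch assumes its input is already sorted by key.

```python
import sys
from itertools import groupby
from operator import itemgetter


def read_records(stream, separator="\t"):
    """Lazily yield (key, value) pairs from lines of the form key<sep>value.

    Hypothetical record format; the real input layout is not documented here.
    """
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(separator)
        yield key, value


def reduce_stream(instream=None, outstream=None, separator="\t"):
    """Group consecutive records that share a key; write one line per key.

    Hypothetical stand-in for a reducer entry point. Assumes the input is
    sorted by key, so that groupby sees each key as one contiguous run.
    """
    instream = sys.stdin if instream is None else instream
    outstream = sys.stdout if outstream is None else outstream

    records = read_records(instream, separator)  # generator, nothing read yet
    for key, group in groupby(records, key=itemgetter(0)):
        values = [value for _, value in group]
        outstream.write(key + separator + ",".join(values) + "\n")


if __name__ == "__main__":
    reduce_stream()
```

Because both streams are parameters, the same function can be driven from a shell pipeline (reading stdin, writing stdout) or called from a test with in-memory file objects such as io.StringIO.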