A simple and tasty explanation of the MapReduce process:

Start with a bowl of jelly beans in 4 colors (Red, Green, Blue, and Yellow). You don’t know how many jelly beans are in the bowl in total, nor how many there are of each color. But naturally you want to know. Because why would you not want to know? 🙂

MapReduce would process the bowl in roughly this way:

  • Split the jelly beans into smaller piles of roughly equal weight (say, 1 kg per pile, for 4 piles in total)
  • You and 3 friends each take one pile and separate it into 4 smaller piles, one per color (this is the mapping phase)
  • Once all four of you are done separating, each person is assigned a different color to count. The piles are passed around the table until each person holds every pile of their assigned color and nothing else (this is the sorting/shuffling phase)
  • Now that each person has all the beans of one color, each of you counts your own beans (this is the reducing phase)
  • At the end of the reduce phase, you have 4 piles, each counted separately. You can now choose to reduce further and sum all four counts to get the total count.
  • You started with an unordered bowl holding an unknown quantity of jelly beans, but now they are ordered and counted!
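
For the curious, here is a minimal Python sketch of the steps above. It is only an illustration, not how a real MapReduce framework is written: the function names (`split_bowl`, `map_pile`, `shuffle`, `reduce_color`) and the sample bowl are made up for this example, and everything runs one step at a time in a single process, whereas a real cluster would hand the piles to different machines working in parallel.

```python
from collections import defaultdict

def split_bowl(bowl, n_piles):
    """Split the bowl into n_piles roughly equal piles."""
    return [bowl[i::n_piles] for i in range(n_piles)]

def map_pile(pile):
    """Map phase: one person tags every bean in their pile as (color, 1)."""
    return [(color, 1) for color in pile]

def shuffle(mapped_piles):
    """Shuffle phase: pass the tagged beans around so each color ends up together."""
    grouped = defaultdict(list)
    for pairs in mapped_piles:
        for color, one in pairs:
            grouped[color].append(one)
    return grouped

def reduce_color(color, ones):
    """Reduce phase: one person counts all the beans of their assigned color."""
    return color, sum(ones)

def count_jelly_beans(bowl, n_workers=4):
    piles = split_bowl(bowl, n_workers)
    mapped = [map_pile(pile) for pile in piles]
    grouped = shuffle(mapped)
    return dict(reduce_color(color, ones) for color, ones in grouped.items())

# A made-up bowl, small enough to check by hand.
bowl = ["red"] * 5 + ["green"] * 3 + ["blue"] * 4 + ["yellow"] * 2

counts = count_jelly_beans(bowl)
total = sum(counts.values())  # the optional "reduce further" step

print(counts)
print(total)
```

The design mirrors the table of friends: the map step never needs to see anyone else's pile, and the reduce step never needs to see any color but its own, which is what lets the work be spread across many workers.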

    Now time to celebrate by eating your hard work. Num num num

