Part I: Data Logistics
  2. Moving Data in and out of Hadoop
  3. Data Serialization: Working with Text and Beyond
Part II: Big Data Patterns
  4. Applying MapReduce Patterns to Big Data
  5. Streamlining HDFS for Big Data
  6. Measuring and Optimizing Performance
Part III: Data Science
  7. Utilizing Data Structures and Algorithms
  8. Applying Statistics
  9. Machine Learning
Part IV: Taming the Elephant
  10. Hive
  11. Pig
  12. Crunch and Other Technologies
  13. Testing and Debugging
  14. Job Coordination
  15. Proficient Administration
Appendixes
  A. Related Technologies
  B. Hadoop Built-in Ingress and Egress Tools
  C. HDFS Dissected
  D. Optimized MapReduce Join Frameworks
If you are new to Hadoop, or are a manager, and want to learn how Hadoop can help solve your big data challenges, then this book is for you.