As a follow-up to my initial post about Apache Pig's inability to quickly load many small files in Pig 0.10 and newer, I wanted to share a simple fix that worked for me, courtesy of in-depth research by the Amazon Support team and engineers.
Starting around Pig 0.10.0, PigStorage looks for a hidden schema file (.pig_schema) in an attempt to determine your files' schema, and with many small files that lookup gets expensive. Passing the '-noschema' flag to PigStorage skips it, and load performance improves dramatically.
a = LOAD '/files/*' USING PigStorage('\t','-noschema') AS (field1:int, field2:chararray);
Much better.
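For comparison, here is a rough sketch of the two forms side by side. The path and field names are the same placeholders as in the example above, and the alias names are just for illustration. I have not included timings since they will depend on your cluster and file count.

```pig
-- Default load: PigStorage may look for a hidden schema file before reading.
slow = LOAD '/files/*' USING PigStorage('\t') AS (field1:int, field2:chararray);

-- Same load with the schema lookup disabled via '-noschema'.
fast = LOAD '/files/*' USING PigStorage('\t', '-noschema') AS (field1:int, field2:chararray);
```

Since the AS clause already declares the schema explicitly, nothing is lost by telling PigStorage to ignore any stored schema.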