Spark SQL tables often contain many small files (files much smaller than the HDFS block size). In this case, Spark starts more tasks to process these small files. When the SQL logic includes a shuffle operation, the number of hash buckets increases greatly, which seriously affects performance.

Posted by: Pdfprep Category: H13-711-ENU

A. True
B. False

Answer: A
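
The arithmetic behind this answer can be sketched in a few lines of Python. This is a simplified model, not Spark's actual scheduler: it assumes one map task per small file when files are not combined, a hash-style shuffle in which each map task writes one bucket per reduce partition, and illustrative values of 10,000 files of 1 MB each, a 128 MB split size, and 200 reduce partitions. All function names are hypothetical.

```python
# Simplified model (assumption): without combining, each small file becomes one map task.
def count_tasks_one_per_file(file_sizes_mb):
    return len(file_sizes_mb)


# Simplified model (assumption): files are packed greedily, in order, into splits
# of at most max_split_mb, and each split becomes one map task.
def count_tasks_when_packed(file_sizes_mb, max_split_mb=128):
    tasks = 0
    current_mb = 0
    for size in file_sizes_mb:
        # Close the current split if adding this file would overflow it.
        if current_mb > 0 and current_mb + size > max_split_mb:
            tasks += 1
            current_mb = 0
        current_mb += size
    if current_mb > 0:
        tasks += 1  # count the last, partially filled split
    return tasks


# Hash-style shuffle (assumption): every map task writes one bucket per reduce partition.
def shuffle_buckets(map_tasks, reduce_partitions):
    return map_tasks * reduce_partitions


small_files_mb = [1] * 10_000   # illustrative: 10,000 files of 1 MB each
reduce_partitions = 200         # illustrative reduce-side partition count

unpacked_tasks = count_tasks_one_per_file(small_files_mb)
packed_tasks = count_tasks_when_packed(small_files_mb)

print("tasks without combining:", unpacked_tasks)
print("buckets without combining:", shuffle_buckets(unpacked_tasks, reduce_partitions))
print("tasks with combining:", packed_tasks)
print("buckets with combining:", shuffle_buckets(packed_tasks, reduce_partitions))
```

Under these assumptions, the bucket count scales linearly with the number of map tasks, so combining small files into block-sized splits before the shuffle reduces it by roughly the same factor as the task count.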
