Fixing org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow in Spark

2018-01-13 11:13:47 · Source: oschina · Author: 张欢 · 19,933 views


Error


The following error is thrown when processing data with Spark SQL:


Exception in thread "main" java.sql.SQLException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3107 in stage 308.0 failed 4 times, most recent failure: Lost task 3107.3 in stage 308.0 (TID 620318, XXX): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 1572864, required: 3236381
Serialization trace:
values (org.apache.spark.sql.catalyst.expressions.GenericInternalRow). To avoid this, increase spark.kryoserializer.buffer.max value.
at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:355)
at com.peopleyuqing.tool.SparkJDBC.excuteQuery(SparkJDBC.java:64)
at com.peopleyuqing.main.ContentSubThree.main(ContentSubThree.java:24)

Fix


// Raise the Kryo buffer ceiling when building the SparkConf
// (the original snippet was missing the space in `new SparkConf()`).
val sparkConf = new SparkConf().setAppName(Constants.SPARK_NAME_APP)
  .set("spark.kryoserializer.buffer.max", "128m")
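If the job is launched with spark-submit rather than configured in code, the same setting can be supplied at submit time. A minimal sketch; the main class is taken from the stack trace above, while `app.jar` and `128m` are placeholder values you must adapt:

```shell
# Submit-time alternative: raise the Kryo buffer ceiling without a code change.
# The value must be larger than the "required" byte count in the error message.
spark-submit \
  --conf spark.kryoserializer.buffer.max=128m \
  --class com.peopleyuqing.main.ContentSubThree \
  app.jar
```

The same key can also be set once for all jobs in `conf/spark-defaults.conf`.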

Cause


Analysis: RDD extends scala.AnyRef with scala.Serializable, so when you call textFile, read table data, or otherwise create many new RDD/DataFrame/Dataset instances, the records flowing through them are serialized with Kryo. The failure occurs when a single serialized object is larger than the Kryo buffer is allowed to grow (here 3,236,381 bytes were required but only 1,572,864 were available), so spark.kryoserializer.buffer.max must be raised above the "required" size reported in the exception.
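Because the exception reports the exact byte count it needed, the new ceiling can be sized from that number instead of guessed. A minimal sketch, assuming a hypothetical helper (not part of Spark) that rounds the required size, with 2x headroom, up to a power-of-two megabyte value; Spark caps this setting at 2048m:

```scala
// Sketch: derive a spark.kryoserializer.buffer.max value from the
// "required: N" bytes in the exception. KryoBufferSizing is a
// hypothetical helper for illustration, not a Spark API.
object KryoBufferSizing {
  def suggestedBufferMaxMb(requiredBytes: Long): Int = {
    // Required bytes plus 2x headroom, expressed in whole megabytes.
    val withHeadroomMb = math.ceil(requiredBytes * 2.0 / (1 << 20)).toInt
    var mb = 64 // Spark's default for spark.kryoserializer.buffer.max
    while (mb < withHeadroomMb) mb *= 2
    math.min(mb, 2048) // Spark requires this setting to be < 2048m
  }

  def main(args: Array[String]): Unit = {
    // The exception above reported "required: 3236381" (~3.1 MB).
    println(suggestedBufferMaxMb(3236381L) + "m")
  }
}
```

The result can then be fed back into the configuration, e.g. `.set("spark.kryoserializer.buffer.max", KryoBufferSizing.suggestedBufferMaxMb(n) + "m")`.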
