
Parameter description

Last Updated: Jan 02, 2018

The following parameters can be configured in Spark code:

| Property Name | Default Value | Description |
| --- | --- | --- |
| spark.hadoop.fs.oss.accessKeyId | None | AccessKey ID needed to access OSS (optional). |
| spark.hadoop.fs.oss.accessKeySecret | None | AccessKey Secret needed to access OSS (optional). |
| spark.hadoop.fs.oss.securityToken | None | STS token needed to access OSS (optional). |
| spark.hadoop.fs.oss.endpoint | None | Endpoint used to access OSS (optional). |
| spark.hadoop.fs.oss.multipart.thread.number | 5 | Concurrency of OSS upload part copy. |
| spark.hadoop.fs.oss.copy.simple.max.byte | 134217728 | Maximum file size (in bytes) for an intra-OSS copy using the general interface. |
| spark.hadoop.fs.oss.multipart.split.max.byte | 67108864 | Maximum file slice size (in bytes) for an intra-OSS copy using the general interface. |
| spark.hadoop.fs.oss.multipart.split.number | 5 | Number of file slices for an intra-OSS copy using the general interface. By default, it equals the copy concurrency. |
| spark.hadoop.fs.oss.impl | com.aliyun.fs.oss.nat.NativeOssFileSystem | OSS file system implementation class. |
| spark.hadoop.fs.oss.buffer.dirs | /mnt/disk1,/mnt/disk2,… | Local directories for OSS temporary files. By default, the cluster's data disks are used. |
| spark.hadoop.fs.oss.buffer.dirs.exists | false | Whether the OSS temporary file directories are guaranteed to exist. |
| spark.hadoop.fs.oss.client.connection.timeout | 50000 | Connection timeout of the OSS client (unit: milliseconds). |
| spark.hadoop.fs.oss.client.socket.timeout | 50000 | Socket timeout of the OSS client (unit: milliseconds). |
| spark.hadoop.fs.oss.client.connection.ttl | -1 | Connection time to live (unit: milliseconds). |
| spark.hadoop.fs.oss.connection.max | 1024 | Maximum number of connections. |
| spark.hadoop.job.runlocal | false | When OSS serves as the data source, set this to "true" to debug and run Spark code locally; otherwise, leave it as "false". |
| spark.logservice.fetch.interval.millis | 200 | Interval (in milliseconds) at which receivers fetch data from LogHub. |
| spark.logservice.fetch.inOrder | true | Whether to consume shard data in order. |
| spark.logservice.heartbeat.interval.millis | 30000 | Heartbeat interval (in milliseconds) of the consumption process. |
| spark.mns.batchMsg.size | 16 | Number of MNS messages pulled in a batch; the maximum value is 16. |
| spark.mns.pollingWait.seconds | 30 | Wait time (in seconds) between pulls when the MNS queue is empty. |
| spark.hadoop.io.compression.codec.snappy.native | false | Whether a Snappy file is treated as a standard Snappy file. By default, Hadoop recognizes Snappy files in Hadoop's modified format. |
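As a minimal sketch of how the OSS-related properties above might be supplied to a job, the snippet below collects a few of them into `--conf` options for `spark-submit`. The AccessKey values and endpoint here are placeholders, not real credentials, and the property selection is illustrative only:

```python
# Sketch: assemble OSS-related Spark properties into spark-submit --conf flags.
# The key and endpoint values are placeholders; substitute your own.
oss_conf = {
    "spark.hadoop.fs.oss.accessKeyId": "<your-access-key-id>",
    "spark.hadoop.fs.oss.accessKeySecret": "<your-access-key-secret>",
    "spark.hadoop.fs.oss.endpoint": "<your-oss-endpoint>",
    "spark.hadoop.fs.oss.impl": "com.aliyun.fs.oss.nat.NativeOssFileSystem",
    # Enable local debugging when OSS is the data source:
    "spark.hadoop.job.runlocal": "true",
}

# Render each property as a "--conf key=value" option.
conf_flags = " ".join(f"--conf {k}={v}" for k, v in sorted(oss_conf.items()))
print(conf_flags)
```

The same key/value pairs could equally be set programmatically (for example, on a `SparkConf` before creating the context) rather than on the command line.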