Apache Parquet Java Vulnerability CVE-2025-46762 RCE Risk
A vulnerability has been identified in Apache Parquet Java that could leave systems exposed to remote code execution (RCE) attacks. Apache Parquet contributor Gang Wu discovered this flaw, tracked as CVE-2025-46762, in the parquet-avro module and publicly disclosed it on May 2.
This security issue impacts all versions of Apache Parquet Java up to and including version 1.15.1, allowing malicious actors to execute arbitrary code on vulnerable systems.
Technical Breakdown of CVE-2025-46762
At the core of this vulnerability is the insecure schema parsing process within the parquet-avro module. The flaw enables attackers to inject malicious code into the metadata of a Parquet file, specifically within the Avro schema. When a vulnerable system reads the file, this malicious code is automatically executed, paving the way for Remote Code Execution (RCE).
The risk applies to systems that read Parquet files using the “specific” or “reflect” data models; the “generic” model is not affected by this vulnerability. For the affected models, the default configuration of trusted packages still leaves certain code execution paths open, potentially allowing the exploit to be triggered through classes in pre-approved Java packages, such as java.util.
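To illustrate why a broad list of trusted packages is a weak boundary, here is a minimal sketch of a package allow-list check. It is not parquet-avro's actual code: the class name `TrustedPackageSketch`, the method `isTrusted`, and the example package list are all hypothetical and exist only for illustration.

```java
import java.util.List;

// Hypothetical illustration only; this is NOT how parquet-avro is implemented.
final class TrustedPackageSketch {

    // Returns true if the class's package equals a trusted package
    // or is a sub-package of one.
    static boolean isTrusted(String className, List<String> trustedPackages) {
        int lastDot = className.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);
        for (String trusted : trustedPackages) {
            if (pkg.equals(trusted) || pkg.startsWith(trusted + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed example list, not the library's real default.
        List<String> broad = List.of("java.lang", "java.util");
        List<String> empty = List.of();

        // A class name taken from untrusted file metadata passes the broad list...
        System.out.println(isTrusted("java.util.SomeClass", broad));
        // ...but nothing passes when the trusted list is empty.
        System.out.println(isTrusted("java.util.SomeClass", empty));
    }
}
```

The point of the sketch is that any class living under a trusted package is accepted regardless of what it does, so shrinking the trusted list shrinks the set of classes an attacker-controlled schema can reach.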
Affected Systems and Scope of the Issue
The impact of CVE-2025-46762 extends to all Apache Parquet Java versions up to 1.15.1. A wide range of applications, especially those leveraging the parquet-avro module in big data frameworks like Apache Spark, Hadoop, and Flink, are vulnerable to this threat. These platforms rely on the module for deserialization and schema parsing, which opens a potential attack surface if the system is reading Parquet files with malicious Avro schema data.
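As a rough aid for triaging dependencies, the version range above can be expressed as a small check. This is a sketch under stated assumptions: `ParquetVersionCheck` and `isAffected` are hypothetical names, the helper is not part of Apache Parquet, it expects plain `major.minor.patch` strings, and it ignores qualifiers such as `-SNAPSHOT`. It requires Java 9 or later for `Arrays.compare`.

```java
import java.util.Arrays;

// Hypothetical helper for dependency triage; not part of Apache Parquet.
final class ParquetVersionCheck {

    // First release containing the fix, per the advisory described in this article.
    private static final int[] FIXED = {1, 15, 2};

    // Parses "major.minor.patch" into three integers; missing parts default to 0.
    // Any qualifier after a '-' (for example "-SNAPSHOT") is dropped.
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // True when the version sorts strictly before 1.15.2.
    static boolean isAffected(String version) {
        return Arrays.compare(parse(version), FIXED) < 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.13.1", "1.15.1", "1.15.2"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

A check like this only tells you which parquet-avro version is on the classpath; whether the application is actually exploitable also depends on which Avro data model it uses to read files.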
For organizations managing data pipelines, especially those processing Parquet files in big data ecosystems, the threat is considerable. If systems remain unpatched, an attacker could introduce malicious Parquet files into the data stream and trigger code execution on the backend systems that read them.
Mitigation Strategies
The Apache Software Foundation has urged all users to address this issue promptly. There are two primary mitigation strategies available:
- Upgrade to Apache Parquet Java 1.15.2: This release fully resolves the issue by tightening the boundaries on trusted packages, ensuring that malicious code cannot execute through the existing configuration.
- Workaround for Users on Version 1.15.1: For those unable to upgrade immediately, it is recommended to set the JVM system property org.apache.parquet.avro.SERIALIZABLE_PACKAGES to an empty string, for example by passing -Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES="" on the command line. This mitigates the risk by leaving no packages trusted, which blocks the execution of code from potentially malicious packages.
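Where JVM launch flags are hard to change, the same system property can in principle be set from application code. The sketch below assumes that parquet-avro reads the property when its classes are first loaded, so the call has to run before any parquet-avro class is touched; that timing is an assumption, and the command-line flag remains the more dependable option. The class and method names are hypothetical.

```java
// Hypothetical startup hook; only the property name comes from the advisory.
final class ParquetAvroHardening {

    static final String SERIALIZABLE_PACKAGES =
            "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";

    // Sets the trusted-package list to an empty string.
    // Assumption: this must run before parquet-avro classes are loaded.
    static void clearTrustedPackages() {
        System.setProperty(SERIALIZABLE_PACKAGES, "");
    }

    public static void main(String[] args) {
        clearTrustedPackages();
        System.out.println(SERIALIZABLE_PACKAGES + "=["
                + System.getProperty(SERIALIZABLE_PACKAGES) + "]");
        // Application code that reads Parquet files would start after this point.
    }
}
```

Calling `clearTrustedPackages()` as the first statement of the application's entry point keeps the setting in one visible place, but it does not help if another component has already initialized parquet-avro in the same JVM.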
Moreover, organizations are advised to audit their data pipelines and prioritize the use of the generic Avro model, which is not affected by this vulnerability. Implementing this model wherever feasible can reduce the risk of RCE attacks via CVE-2025-46762.
Unpatched systems vulnerable to CVE-2025-46762 face not only direct attacks but also the risk of supply chain exploits, where compromised Parquet files could trigger backend execution of malicious code, leading to widespread system failures.
Security experts have highlighted the severe threat of Remote Code Execution (RCE), which can result in data breaches, unauthorized access, and other malicious activities. Given the nature of this vulnerability and its impact on large-scale data environments, quick action is essential.
Users of Apache Parquet Java versions up to 1.15.1 are strongly advised to upgrade to version 1.15.2 or, on version 1.15.1, to set the org.apache.parquet.avro.SERIALIZABLE_PACKAGES system property to an empty string, protecting their systems against exploitation.