Skip to content

github.com/ulikunitz/xz leaks memory when decoding a corrupted multiple LZMA archives

Moderate severity GitHub Reviewed Published Aug 28, 2025 in ulikunitz/xz • Updated Aug 29, 2025

Package

gomod github.com/ulikunitz/xz (Go)

Affected versions

<= 0.5.13

Patched versions

0.5.15

Description

Summary

It is possible to put data in front of an LZMA-encoded byte stream without detecting the situation while reading the header. This can lead to increased memory consumption because the current implementation allocates the full decoding buffer directly after reading the header. The LZMA header doesn't include a magic number or has a checksum to detect such an issue according to the specification.

Note that the code recognizes the issue later while reading the stream, but at this time the memory allocation has already been done.

Mitigations

The release v0.5.15 includes following mitigations:

  • The ReaderConfig DictCap field is now interpreted as a limit for the dictionary size.
  • The default is 2 Gigabytes - 1 byte (2^31-1 bytes).
  • Users can check with the [Reader.Header] method what the actual values are in their LZMA files and set a smaller limit using ReaderConfig.
  • The dictionary size will not exceed the larger of the file size and the minimum dictionary size. This is another measure to prevent huge memory allocations for the dictionary.
  • The code supports stream sizes only up to a pebibyte (1024^5).

Note that the original v0.5.14 version had a compiler error for 32 bit platforms, which has been fixed by v0.5.15.

Methods affected

Only software that uses lzma.NewReader or lzma.ReaderConfig.NewReader is affected. There is no issue for software using the xz functionality.

I thank @GregoryBuligin for his report, which is provided below.

Summary

When unpacking a large number of LZMA archives, even in a single goroutine, if the first byte of the archive file is 0 (a zero byte added to the beginning), an error writeMatch: distance out of range occurs. Memory consumption spikes sharply, and the GC clearly cannot handle this situation.

Details

Judging by the error writeMatch: distance out of range, the problems occur in the code around this function.
https://github.com/ulikunitz/xz/blob/c8314b8f21e9c5e25b52da07544cac14db277e89/lzma/decoderdict.go#L81

PoC

Run a function similar to this one in 1 or several goroutines on a multitude of LZMA archives that have a 0 (a zero byte) added to the beginning.

const ProjectLocalPath = "some/path"
const TmpDir = "tmp"

func UnpackLZMA(lzmaFile string) error {
	file, err := os.Open(lzmaFile)
	if err != nil {
		return err
	}
	defer file.Close()

	reader, err := lzma.NewReader(bufio.NewReader(file))
	if err != nil {
		return err
	}

	tmpFile, err := os.CreateTemp(TmpDir, TmpLZMAPrefix)
	if err != nil {
		return err
	}
	defer func() {
		tmpFile.Close()
		_ = os.Remove(tmpFile.Name())
	}()

	sha256Hasher := sha256.New()
	multiWriter := io.MultiWriter(tmpFile, sha256Hasher)

	if _, err = io.Copy(multiWriter, reader); err != nil {
		return err
	}

	unpackHash := hex.EncodeToString(sha256Hasher.Sum(nil))
	unpackDir := filepath.Join(
		ProjectLocalPath, unpackHash[:2],
	)
	_ = os.MkdirAll(unpackDir, DirPerm)

	unpackPath := filepath.Join(unpackDir, unpackHash)

	return os.Rename(tmpFile.Name(), unpackPath)
}

Impact

Servers with a small amount of RAM that download and unpack a large number of unverified LZMA archives

References

@ulikunitz ulikunitz published to ulikunitz/xz Aug 28, 2025
Published to the GitHub Advisory Database Aug 28, 2025
Reviewed Aug 28, 2025
Published by the National Vulnerability Database Aug 28, 2025
Last updated Aug 29, 2025

Severity

Moderate

CVSS overall score

This score calculates overall vulnerability severity from 0 to 10 and is based on the Common Vulnerability Scoring System (CVSS).
/ 10

CVSS v3 base metrics

Attack vector
Network
Attack complexity
Low
Privileges required
None
User interaction
None
Scope
Unchanged
Confidentiality
None
Integrity
None
Availability
Low

CVSS v3 base metrics

Attack vector: More severe the more the remote (logically and physically) an attacker can be in order to exploit the vulnerability.
Attack complexity: More severe for the least complex attacks.
Privileges required: More severe if no privileges are required.
User interaction: More severe when no user interaction is required.
Scope: More severe when a scope change occurs, e.g. one vulnerable component impacts resources in components beyond its security scope.
Confidentiality: More severe when loss of data confidentiality is highest, measuring the level of data access available to an unauthorized user.
Integrity: More severe when loss of data integrity is the highest, measuring the consequence of data modification possible by an unauthorized user.
Availability: More severe when the loss of impacted component availability is highest.
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L

EPSS score

Exploit Prediction Scoring System (EPSS)

This score estimates the probability of this vulnerability being exploited within the next 30 days. Data provided by FIRST.
(16th percentile)

Weaknesses

Allocation of Resources Without Limits or Throttling

The product allocates a reusable resource or group of resources on behalf of an actor without imposing any restrictions on the size or number of resources that can be allocated, in violation of the intended security policy for that actor. Learn more on MITRE.

CVE ID

CVE-2025-58058

GHSA ID

GHSA-jc7w-c686-c4v9

Source code

Loading Checking history
See something to contribute? Suggest improvements for this vulnerability.