Skip to content

Conversation

desruisseaux
Copy link
Contributor

(Copy of #242, which was accidentally merged. The merge has been immediately cancelled)
Remove the dependency to Plexus, replaced by more reliance on java.nio.

Work items

  • Replacement of Plexus includes/excludes filters by java.nio.file.PathMatcher. A benefit is the support of different syntax, at least "glob" and "regex". If no syntax is specified, default to the "glob" syntax with modifications for reproducing the behavior of Plexus filters (see below).
  • Use java.nio.file.FileVisitor for walking over files and directory trees. Consequently, symbolic links are now followed by FileVisitor instead of by maven-clean-plugin itself. An advantage is that FileVisitor is safe against infinite loops when there is cycles. Also, file attributes (whether the file is regular, a directory or a link) are queried only once per file.
  • Changes in some logging messages. The "Deleting XYZ" message is replaced by either "Deleted XYZ" if the deletion has been successful, or "Failed to delete XYZ" in case of failure (i.e., "XYZ" is not logged twice in case of failure).
  • The IOException throws by Java is no longer wrapped in another IOException, so that the callers can catch an exception of the specific sub-type if desired. The exception is also thrown earlier, before it causes another exception. The difference can be seen in the tests: a deletion fails because unauthorized, but the error that was reported to the user was not the AccessDeniedException. Instead, it was a DirectoryNotEmptyException thrown when the plugin tried to delete the directory that contains the file that the plugin failed to delete. After this commit, the exception thrown is the original AccessDeniedException.

Matcher syntax

The "glob" syntax of java.nio.file.PathMatcher seems to have slightly different rules than Plexus. In particular, the tests suggest that the ** pattern means "0 or more directories" in Maven 3 whereas the "glob" syntax of PathMatcher seems to understand ** as "1 or more directories". This commit applies the following rules:

  • If the "glob" syntax or any other syntax was explicitly specified, use the pattern verbatim. No modification applied.
  • Otherwise, modify the pattern as below:
    • Replace all occurrences of the OS-specific separator by / (the standardized separator expected by PathMatcher).
    • For every occurrence of **, generate new patterns without the ** in order to simulate the case of 0 directory.
    • Prepend glob: to the resulting pattern.

Risk

This commit is a significant changes. While the JUnir and integration tests pass, there is still a risk for some unforeseen behavioural changes. Differences compared to the previous version will be addressed as they are reported. There is also non-trivial optimizations for deciding whether to skip and entire directory (those optimizations were already present in the previous version, but behind Plexus).

Future work

If experience shows that the new version is working well, maybe we should move the Selector class and an abstract FileVisitor base class to Maven core for allowing other plugins, such as maven-compiler-plugin, to use it.

…io`.

This commit contains the following work items:

* Replacement of Plexus includes/excludes filters by `java.nio.file.PathMatcher`.
  One benefit is the support of different syntax, at least "glob" and "regex".
  If no syntax is specified, default to the "glob" syntax with modifications
  for reproducing the behavior of Plexus filters when it differs from "glob".

* Use `java.nio.file.FileVisitor` for walking over files and directory trees.
  Consequently, the following of symbolic links is now handled by `FileVisitor`
  instead of by `maven-clean-plugin` itself. An advantage is that `FileVisitor`
  is safe against infinite loops when there is cycles in the symbolic links.
  Also, file attributes (whether the file is regular, a directory or a link)
  are queried only once per file.

* Changes in some logging messages and exceptions. "Deleting XYZ" is replaced
  by either "Deleted XYZ" if the deletion has been successful, or replaced by
  "Failed to delete XYZ" in case of failure (i.e., "XYZ" is not logged twice
  in case of failure).

* The `IOException` throws by Java is no longer wrapped in another `IOException`,
  so that the callers can catch an exception of the specific sub-type if desired.
  The exception is also thrown earlier, before it causes another exception.
  The difference can be seen in the tests: a deletion fails because unauthorized,
  but the error that was reported to the user was not the `AccessDeniedException`.
  Instead, it was a `DirectoryNotEmptyException` thrown when the plugin tried to
  delete the directory that contains the file that the plugin failed to delete.
  After this commit, the exception throws is the original `AccessDeniedException`.
@desruisseaux desruisseaux self-assigned this Mar 31, 2025
@slachiewicz slachiewicz added the enhancement New feature or request label Mar 31, 2025
@slachiewicz slachiewicz added this to the 4.0.0-beta-3 milestone Mar 31, 2025
@desruisseaux
Copy link
Contributor Author

Now all tests pass, Windows included.

@gnodet
Copy link
Contributor

gnodet commented Apr 1, 2025

Future work

If experience shows that the new version is working well, maybe we should move the Selector class and an abstract FileVisitor base class to Maven core for allowing other plugins, such as maven-compiler-plugin, to use it.

The scanner is used in maven-filtering which is used in maven-resources-plugin. The fact that the Maven 4 API exposes PathMatcher while the plexus scanner uses String[] is a problematic. So yes, definitely interested in having this piece of code reusable.

@desruisseaux
Copy link
Contributor Author

The fact that the Maven 4 API exposes PathMatcher while the plexus scanner uses String[] is a problematic.

We can also revert this API to plain strings, and complete them by a getMatcher() method which return a single, global Optional<PathMatcher>. The Selector class implements PathMatcher, so it can be seen as a single matcher doing all the includes / excludes work.

@gnodet
Copy link
Contributor

gnodet commented Apr 3, 2025

The fact that the Maven 4 API exposes PathMatcher while the plexus scanner uses String[] is a problematic.

We can also revert this API to plain strings, and complete them by a getMatcher() method which return a single, global Optional<PathMatcher>. The Selector class implements PathMatcher, so it can be seen as a single matcher doing all the includes / excludes work.

I think that would make sense. The point is that the resource plugin can also be configured from the plugin, so with Path or String, but definitely not PathMatcher.

@desruisseaux
Copy link
Contributor Author

I think that would make sense. The point is that the resource plugin can also be configured from the plugin, so with Path or String, but definitely not PathMatcher.

Okay, will prepare a pull request maybe tonight.

@gnodet
Copy link
Contributor

gnodet commented Apr 3, 2025

I think that would make sense. The point is that the resource plugin can also be configured from the plugin, so with Path or String, but definitely not PathMatcher.

Okay, will prepare a pull request maybe tonight.

Actually, maybe the problem was just in the implementation of the PathMatcher provided by Maven when building the SourceRoot, as the implementation in Maven does not deal with glob syntax slight difference. I did not investigate too much and went back to using strings when working on the maven-filtering / maven-resource-plugin.
But in order to make sure everything use a coherent syntax, it may make sense that Maven provides the PathMatcher for all plugins to leverage.
In which case, moving the Selector class from this pattern inside maven-core would make sense, and have SourceRoot return a single PathMatcher instead of getIncludes() and getExcludes()...

@gnodet
Copy link
Contributor

gnodet commented Apr 4, 2025

I think that would make sense. The point is that the resource plugin can also be configured from the plugin, so with Path or String, but definitely not PathMatcher.

Okay, will prepare a pull request maybe tonight.

Actually, maybe the problem was just in the implementation of the PathMatcher provided by Maven when building the SourceRoot, as the implementation in Maven does not deal with glob syntax slight difference. I did not investigate too much and went back to using strings when working on the maven-filtering / maven-resource-plugin. But in order to make sure everything use a coherent syntax, it may make sense that Maven provides the PathMatcher for all plugins to leverage. In which case, moving the Selector class from this pattern inside maven-core would make sense, and have SourceRoot return a single PathMatcher instead of getIncludes() and getExcludes()...

Forget the above. Even just for the sake of being able to obtain Resource objects from the project to support Maven 3 plugins, we'll need the includes and excludes as a collection of strings. So yes, we definitely need to revert {{SourceRoot} to using Strings.

@gnodet
Copy link
Contributor

gnodet commented Apr 4, 2025

I think that would make sense. The point is that the resource plugin can also be configured from the plugin, so with Path or String, but definitely not PathMatcher.

Okay, will prepare a pull request maybe tonight.

Actually, maybe the problem was just in the implementation of the PathMatcher provided by Maven when building the SourceRoot, as the implementation in Maven does not deal with glob syntax slight difference. I did not investigate too much and went back to using strings when working on the maven-filtering / maven-resource-plugin. But in order to make sure everything use a coherent syntax, it may make sense that Maven provides the PathMatcher for all plugins to leverage. In which case, moving the Selector class from this pattern inside maven-core would make sense, and have SourceRoot return a single PathMatcher instead of getIncludes() and getExcludes()...

Forget the above. Even just for the sake of being able to obtain Resource objects from the project to support Maven 3 plugins, we'll need the includes and excludes as a collection of strings. So yes, we definitely need to revert {{SourceRoot} to using Strings.

Created apache/maven#2232

@desruisseaux
Copy link
Contributor Author

We have to reviewers. Any objection that I perform the merge? If okay, I would turn the "add a force option" branch into a new pull request after this merge.

@desruisseaux desruisseaux merged commit 1620e1e into apache:master May 3, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants