datalad_next.gitpathspec

Handling of Git's pathspecs with subdirectory mangling support

This functionality can be used to add support for pathspecs to implementations that rely on Git commands that do not support submodule recursion directly.

class datalad_next.gitpathspec.GitPathSpec(spectypes: tuple[str, ...], dirprefix: str, pattern: str | None)

Bases: object

Support class for patterns used to limit paths in Git commands

From the Git documentation:

Pathspecs are used on the command line of "git ls-files", "git ls-tree", "git add", "git grep", "git diff", "git checkout", and many other commands to limit the scope of operations to some subset of the tree or working tree.

Apart from providing a dedicated type for a pathspec, the main purpose of this functionality is to take a pathspec that is valid in the context of one (top-level) repository and translate it, so that the resulting set of pathspecs, given to the same command running in a submodule or subdirectory, yields the same results as if the initial top-level invocation had reported them (if it even could). See the for_subdir() method for more.

>>> # simple stripping of leading directory
>>> ps = GitPathSpec.from_pathspec_str('dir/*.jpg')
>>> [str(i) for i in ps.for_subdir('dir')]
['*.jpg']
>>> # match against magic pathspecs
>>> ps = GitPathSpec.from_pathspec_str(':(glob)**r/*.jpg')
>>> # longest and shortest match are produced
>>> [str(i) for i in ps.for_subdir('dir')]
[':(glob)**r/*.jpg', ':(glob)*.jpg']
>>> [str(i) for i in ps.for_subdir('root/some/dir')]
[':(glob)**r/*.jpg', ':(glob)*.jpg']
>>> # support for special 'no-pathspec' pathspec
>>> ps = GitPathSpec.from_pathspec_str(':')
>>> ps.is_nopathspecs
True

__str__() -> str

Generate normalized (long-form) pathspec

dirprefix: str

Directory prefix (pathspec up to the last slash) limiting the scope

for_subdir(subdir: str) -> list[GitPathSpec]

Translate a pathspec into the scope of a subdirectory.

The processing implemented here is purely lexical: it works without matching against actual file system (or Git tree) content. Consequently, results can be broader than strictly necessary, but use cases are supported where there is nothing (yet) to match against (e.g., a not-yet-cloned submodule).

A pathspec with a top magic is returned unmodified, as such pathspecs are defined relative to the root of a repository, not relative to a base directory. As a consequence, they will automatically refer to a submodule root when the target directory is contained in one.

Parameters:

subdir (str) -- Relative path in POSIX notation

Returns:

When an empty list is returned, this indicates that the pathspec cannot be translated to the given subdir, because it does not match the subdir itself. If a pathspec translates to "no pathspecs" (':'), a list with a dedicated ':' pathspec is returned.

Return type:

list
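
The simplest case of this lexical translation, stripping a literal leading directory, can be sketched in plain Python. This is not the library's implementation: `strip_leading_dir` is a hypothetical helper that ignores magic signatures and wildcards in the directory prefix. It only borrows the convention that an empty list means "cannot be translated".

```python
from __future__ import annotations


def strip_leading_dir(pathspec: str, subdir: str) -> list[str]:
    """Hypothetical helper: purely lexical, never touches the file system."""
    prefix = subdir.rstrip('/') + '/'
    if pathspec.startswith(prefix):
        # the pathspec points into `subdir`: drop the leading directory
        return [pathspec[len(prefix):]]
    # this narrow sketch gives up on everything else, including patterns
    # such as '*.jpg' that a full implementation would keep
    return []


print(strip_leading_dir('dir/*.jpg', 'dir'))
print(strip_leading_dir('other/*.jpg', 'dir'))
```

Because no file system access is involved, the same call works for a directory that does not exist yet.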

classmethod from_pathspec_str(pathspec: str) -> GitPathSpec

Parse a string-form pathspec into types, prefix, and pattern
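
To illustrate what such a decomposition might look like, here is a minimal sketch for long-form pathspecs only. The function name `split_long_form` is hypothetical, short-form magic (such as `:/` or `:!`) is not handled, and the exact values the library stores (for example, whether the prefix keeps its trailing slash) are an assumption.

```python
from __future__ import annotations


def split_long_form(pathspec: str) -> tuple[tuple[str, ...], str, str | None]:
    """Hypothetical helper: split into (spectypes, dirprefix, pattern)."""
    spectypes: tuple[str, ...] = ()
    rest = pathspec
    if pathspec.startswith(':('):
        # long-form magic looks like ':(glob,icase)<pattern>'
        magic, closing, rest = pathspec[2:].partition(')')
        if not closing:
            raise ValueError(f'unterminated magic signature: {pathspec!r}')
        spectypes = tuple(m for m in magic.split(',') if m)
    # everything up to the last slash is treated as the directory prefix
    dirprefix, _, pattern = rest.rpartition('/')
    return spectypes, dirprefix, pattern or None


print(split_long_form(':(glob)**r/*.jpg'))
print(split_long_form('dir/*.jpg'))
```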

property is_nopathspecs: bool

Whether this pathspec is the "no pathspecs" pathspec, AKA ':'

pattern: str | None

Pattern to match paths against using fnmatch
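
As a rough illustration of fnmatch-style matching, Python's standard `fnmatch` module can serve as a stand-in. That it follows exactly the same rules as the matching used by Git or by this class is an assumption. The point of the sketch is that `*` is not stopped by a directory separator here.

```python
from fnmatch import fnmatchcase

# '*' in Python's fnmatch also matches across '/'
print(fnmatchcase('a.jpg', '*.jpg'))
print(fnmatchcase('sub/a.jpg', '*.jpg'))
print(fnmatchcase('sub/a.png', '*.jpg'))
```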

spectypes: tuple[str, ...]

Long-form pathspec type identifiers

class datalad_next.gitpathspec.GitPathSpecs(pathspecs: Iterable[str | GitPathSpec] | GitPathSpecs | None)

Bases: object

Convenience container for any number of pathspecs (or none)

This class can facilitate implementing support for pathspec-constraints, including scenarios involving submodule recursion.

>>> # can accept a "default" argument for no pathspecs
>>> ps = GitPathSpecs(None)
>>> not ps
True
>>> ps.arglist()
[]
>>> # deal with any number of pathspecs
>>> ps = GitPathSpecs(['*.jpg', 'dir/*.png'])
>>> ps.any_match_subdir(PurePosixPath('dummy'))
True
>>> ps.for_subdir(PurePosixPath('dir'))
GitPathSpecs(['*.jpg', '*.png'])

any_match_subdir(path: PurePosixPath) -> bool

Returns whether any pathspec could match subdirectory content

In other words, False is returned whenever .for_subdir() would raise ValueError.

Parameters:

path (PurePosixPath) -- Relative path of the subdirectory to run the test for.

arglist() -> list[str]

Convert pathspecs to a CLI argument list

This list is suitable for use with any Git command that supports pathspecs, after a -- (that disables the interpretation of further arguments as options).

When no pathspecs are present an empty list is returned.
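
A minimal sketch of how such a list could be placed on a command line follows. The helper name `git_ls_files_cmd` is hypothetical, and a plain list of strings stands in for the value returned by arglist().

```python
from __future__ import annotations


def git_ls_files_cmd(pathspec_args: list[str]) -> list[str]:
    """Hypothetical helper: build an argv list for `git ls-files`."""
    # '--' ends option parsing, so a pathspec that starts with '-'
    # is not mistaken for an option
    return ['git', 'ls-files', '--', *pathspec_args]


print(git_ls_files_cmd(['*.jpg', 'dir/*.png']))
print(git_ls_files_cmd([]))
```

With an empty list the command simply ends in `--`, which leaves the operation unconstrained.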

for_subdir(path: PurePosixPath) -> GitPathSpecs

Translate pathspecs into the scope of a subdirectory

Raises:

ValueError -- Whenever no pathspec can be translated into the scope of the target directory.
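
When recursing into subdirectories or submodules, the ValueError can be avoided by testing with any_match_subdir() first. The following sketch shows that pattern. `pathspecs_for_submodule` is a hypothetical helper, and its `pathspecs` argument is only assumed to offer the two methods documented here.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def pathspecs_for_submodule(pathspecs, subdir: PurePosixPath):
    """Hypothetical helper: translated pathspecs, or None to skip `subdir`."""
    if not pathspecs.any_match_subdir(subdir):
        # nothing could match inside `subdir`; for_subdir() would raise
        return None
    return pathspecs.for_subdir(subdir)
```

A caller can then skip any subdirectory for which None is returned, rather than wrapping each for_subdir() call in a try/except block.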