| Title: | File System Utility Functions for 'NeuroAnatomy Toolbox' |
|---|---|
| Description: | Utility functions that may be of general interest but are specifically required by the 'NeuroAnatomy Toolbox' ('nat'). Includes functions to provide a basic make style system to update files based on timestamp information, file locking and 'touch' utility. Convenience functions for working with file paths include 'abs2rel', 'split_path' and 'common_path'. Finally there are utility functions for working with 'zip' and 'gzip' files including integrity tests. |
| Authors: | Gregory Jefferis [aut, cre] (ORCID: <https://orcid.org/0000-0002-0587-9355>) |
| Maintainer: | Gregory Jefferis <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.6.1 |
| Built: | 2026-05-11 06:25:31 UTC |
| Source: | https://github.com/natverse/nat.utils |
Remove common part of two paths, leaving relative path
abs2rel(path, stempath = getwd(), StopIfNoCommonPath = FALSE)
| path | Paths to make relative |
|---|---|
| stempath | Root to which path is made relative |
| StopIfNoCommonPath | Error if no path in common |
Character vector containing relative path
jefferis
Other path_utils:
common_path(),
split_path()
path = "/Volumes/JData/JPeople/Sebastian/images"
abs2rel(path,'/Volumes/JData')
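To illustrate what making a path relative to a stem involves, here is a minimal base R sketch. The helper name strip_stem is hypothetical and this is not the package's implementation. It assumes forward-slash separators and leaves any path that does not start with the stem unchanged.

```r
# Hypothetical helper (not part of nat.utils): drop a leading stem from paths
strip_stem <- function(path, stempath) {
  # make sure the stem ends in exactly one separator
  stem <- paste0(sub("/+$", "", stempath), "/")
  has_stem <- startsWith(path, stem)
  # remove the stem where present; leave other paths untouched
  ifelse(has_stem, substring(path, nchar(stem) + 1L), path)
}

rel <- strip_stem("/Volumes/JData/JPeople/Sebastian/images", "/Volumes/JData")
print(rel)
```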
Find common prefix of two or more (normalised) file paths
common_path(paths, normalise = FALSE, fsep = .Platform$file.sep)
| paths | Character vector of file paths |
|---|---|
| normalise | Whether to normalise paths with normalizePath (default FALSE) |
| fsep | Optional path separator (defaults to .Platform$file.sep) |
Note that for absolute paths, the common prefix will be returned,
e.g. common_path(c("/a","/b")) is "/".
Note that normalizePath 1) operates according to the
conventions of the current runtime platform and 2) is called with
winslash=.Platform$file.sep, which means that normalised paths will
end up separated by "/" on Windows rather than by
a backslash ("\\" in R string syntax), which is normalizePath's standard behaviour.
Character vector of common prefix, "" when there is no common
prefix, or the original value of paths when fewer than 2 paths were
supplied.
Other path_utils:
abs2rel(),
split_path()
common_path(c("/a","/b"))
common_path(c("/a/b/","/a/b"))
common_path(c("/a/b/d","/a/b/c/d"))
common_path(c("/a/b/d","/b/c/d"))
common_path(c("a","b"))
common_path(c("","/a"))
common_path(c("~","~/"))
common_path(c("~/a/b/d","~/a/b/c/d"), normalise = FALSE)
common_path(c("~","~/"), normalise = FALSE)
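The idea of a common prefix can be sketched in base R by comparing paths component by component. The function common_prefix below is a hypothetical stand-in, not the package's code. In particular, it returns the prefix without a trailing separator (except for the root), which may differ from what common_path returns.

```r
# Hypothetical sketch (not the nat.utils implementation)
common_prefix <- function(paths, fsep = "/") {
  if (length(paths) < 2) return(paths)
  parts <- strsplit(paths, fsep, fixed = TRUE)
  n <- min(lengths(parts))
  if (n == 0) return("")
  k <- 0L
  for (i in seq_len(n)) {
    # i-th component of every path
    ith <- vapply(parts, `[`, character(1), i)
    if (length(unique(ith)) > 1) break
    k <- i
  }
  if (k == 0L) return("")
  prefix <- paste(parts[[1]][seq_len(k)], collapse = fsep)
  # an empty first component means the paths are absolute, so the root is shared
  if (prefix == "") fsep else prefix
}

print(common_prefix(c("/a/b/d", "/a/b/c/d")))
print(common_prefix(c("/a", "/b")))
```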
Swap names of two files (by renaming first to a temporary file)
file.swap(f1, f2)
| f1, f2 | Paths to files |
|---|---|
logical indicating success
jefferis
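Swapping by way of a temporary name can be sketched with three calls to file.rename. The helper swap_files is hypothetical and only illustrates the approach described above; the package's own error handling may differ.

```r
# Hypothetical sketch of swapping two files via a temporary name
swap_files <- function(f1, f2) {
  # temporary name in the same directory as f1, so the rename stays on one filesystem
  tmp <- tempfile(tmpdir = dirname(f1))
  # f1 -> tmp, f2 -> f1, tmp -> f2; stops at the first failure
  file.rename(f1, tmp) && file.rename(f2, f1) && file.rename(tmp, f2)
}

a <- tempfile()
b <- tempfile()
writeLines("A", a)
writeLines("B", b)
ok <- swap_files(a, b)
```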
Construct paths to files in the extdata folder of a package
find_extdata(..., package = NULL, firstpath = NULL, Verbose = FALSE)
| ... | Components of the path (eventually appended to the location of the extdata folder) |
|---|---|
| package | The package to search |
| firstpath | An additional location to check before looking anywhere else |
| Verbose | Whether to print messages about failed paths while looking for extdata |
inst/extdata is the conventional place to store data that is
not managed directly by the standard R package mechanisms. Unfortunately
its location changes at different stages of the package build/load process,
since in the final package all folders underneath inst are moved
directly to the package root.
A character vector containing the constructed path
Other extdata:
read_nl_from_parts(),
save_nl_in_parts()
find_extdata(package='nat.utils')
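Because the folder can live at either inst/extdata (source tree) or extdata (installed package), one way to cope is to try several candidate locations in order. The sketch below uses a hypothetical helper, first_existing, and a mock directory layout in a temporary folder; it does not reproduce the search order used by find_extdata.

```r
# Hypothetical helper: return the first candidate path that exists on disk
first_existing <- function(candidates) {
  hit <- candidates[file.exists(candidates)]
  if (length(hit) > 0) hit[1] else NA_character_
}

# Mock source-tree layout in a temporary directory (illustrative only)
root <- tempfile("pkg")
dir.create(file.path(root, "inst", "extdata"), recursive = TRUE)

# Try the installed layout first, then the source layout
found <- first_existing(c(
  file.path(root, "extdata"),
  file.path(root, "inst", "extdata")
))
```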
Reads the CRC from a gzip file, assuming it sits in the trailer at the end of the file (in the gzip format, the 4-byte CRC32 is followed by a final 4-byte size field). First checks for a valid gzip magic number at the start of the file.
gzip.crc(f)
| f | Path to a gzip file |
|---|---|
CRC32 is not a strong hash like SHA1 or even MD5, but it does provide a basic hash of the uncompressed contents of the gzip file. NB CRCs are stored in little endian byte order regardless of platform.
hexadecimal formatted
rdsfile=system.file('help/aliases.rds')
gzip.crc(rdsfile)
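For readers curious about the file layout, the following base R sketch reads the stored CRC directly. It is not the package's implementation and the function name read_gzip_crc is hypothetical. It relies on the gzip format (RFC 1952), where the trailer consists of a little-endian CRC32 followed by a 4-byte size field, so the CRC bytes start 8 bytes before the end of the file. For simplicity it reads the whole file into memory.

```r
# Hypothetical sketch: read the CRC32 stored in a gzip trailer
read_gzip_crc <- function(f) {
  bytes <- readBin(f, what = "raw", n = file.size(f))
  n <- length(bytes)
  # a gzip file has at least a 10-byte header and an 8-byte trailer
  if (n < 18 || !identical(bytes[1:2], as.raw(c(0x1f, 0x8b)))) {
    return(NA_character_)
  }
  # CRC32 is the first 4 bytes of the 8-byte trailer, least significant byte first
  crc_le <- bytes[(n - 7):(n - 4)]
  # reverse to most-significant-first and format as hexadecimal
  paste(as.character(rev(crc_le)), collapse = "")
}

gz <- tempfile(fileext = ".gz")
con <- gzfile(gz, open = "wb")
writeBin(charToRaw("hello"), con)
close(con)
print(read_gzip_crc(gz))
```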
Check if a file is a gzip file
is.gzip(f)
| f | Path to file to test |
|---|---|
logical indicating whether f is in gzip format (or NA
if the file cannot be accessed)
notgzipfile=tempfile()
writeLines('not a gzip', notgzipfile)
is.gzip(notgzipfile)
con=gzfile(gzipfile<-tempfile(),open='wt')
writeLines('This one is gzipped', con)
close(con)
is.gzip(gzipfile)
unlink(c(notgzipfile,gzipfile))
Split inputs into a number of chunks
make_chunks(x, size = length(x), nchunks = NULL, chunksize = NULL)
| x | A vector of inputs e.g. ids, neurons etc (optional, see examples) |
|---|---|
| size | The number of inputs (defaults to length(x)) |
| nchunks | The desired number of chunks |
| chunksize | The desired number of items per chunk |
You must specify exactly one of nchunks and chunksize.
The elements of x split into a list of chunks or (when x is
missing) a vector of integer indices in the range 1:nchunks
specifying the chunk for each input element.
make_chunks(1:11, nchunks=2)
make_chunks(size=11, chunksize=2)
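One plausible way to assign inputs to chunks is shown in the base R sketch below. The helper chunk_index is hypothetical, and the way it distributes a remainder (full chunks first, a shorter last chunk) is an assumption that may not match make_chunks exactly.

```r
# Hypothetical sketch: compute a chunk index for each of `size` inputs
chunk_index <- function(size, nchunks = NULL, chunksize = NULL) {
  if (is.null(nchunks) == is.null(chunksize)) {
    stop("specify exactly one of nchunks and chunksize")
  }
  if (is.null(nchunks)) {
    nchunks <- ceiling(size / chunksize)
  } else {
    chunksize <- ceiling(size / nchunks)
  }
  # repeat each chunk id chunksize times, truncated to the number of inputs
  rep(seq_len(nchunks), each = chunksize, length.out = size)
}

idx <- chunk_index(11, nchunks = 2)
# split the inputs themselves into a list of chunks
chunks <- split(1:11, idx)
```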
Creates a lock file on disk containing a message that should identify the current R session. Will return FALSE if someone else has already made a lockfile. In order to avoid race conditions typical on NFS-mounted drives, makelock appends a unique message to the lock file and then reads the file back in. Only if the unique message is the first line in the file will makelock return TRUE.
removelock displays a warning and returns FALSE if lockfile cannot
be removed. No error message is given if the file does not exist.
makelock(lockfile, lockmsg, CreateDirectories = TRUE)
removelock(lockfile)
| lockfile | Path to lockfile |
|---|---|
| lockmsg | Character vector with message to be written to lockfile |
| CreateDirectories | Recursively create directories implied by lockfile path |
logical indicating success
jefferis
makelock(lock<-tempfile())
stopifnot(!makelock(lock))
removelock(lock)
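The append-then-read-back strategy described above can be sketched in a few lines of base R. The helper try_lock and the session labels are hypothetical, and the sketch omits directory creation and the default message that makelock would generate.

```r
# Hypothetical sketch of append-then-verify locking
try_lock <- function(lockfile, msg) {
  # append our (single-line) message; another process may already have written one
  cat(msg, "\n", sep = "", file = lockfile, append = TRUE)
  # we hold the lock only if our message is the first line of the file
  identical(readLines(lockfile, n = 1), msg)
}

lock <- tempfile()
first <- try_lock(lock, "session-A")
second <- try_lock(lock, "session-B")
unlink(lock)
```

Here the second attempt fails because the first line of the file still identifies the first session.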
Return number of cpus (or a default on failure)
ncpus(default = 1L)
| default | Number of cores to assume if detectCores fails |
|---|---|
Integer number of cores (always >= 1 for default values)
jefferis
ncpus()
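A fallback of this kind can be sketched around parallel::detectCores, which may return NA when the number of cores cannot be determined. The function ncpus_sketch is a hypothetical illustration rather than the package's code.

```r
# Hypothetical sketch: detect cores, falling back to a default on failure
ncpus_sketch <- function(default = 1L) {
  n <- tryCatch(parallel::detectCores(), error = function(e) NA_integer_)
  if (is.na(n) || n < 1L) as.integer(default) else as.integer(n)
}

n <- ncpus_sketch()
```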
Make a neuronlist object from two separate files
read_nl_from_parts(datapath, dfpath = NULL, package = NULL, ...)
| datapath | Path to the data object |
|---|---|
| dfpath | Path to the data.frame object (constructed from datapath when missing) |
| package | Character vector naming a package whose extdata directory will be sought (with |
| ... | Additional arguments passed to |
It is expected that you will use this in an R source file within the data folder of a package. See Examples for more information.
If dfpath is missing, it will be inferred from datapath
according to the following pattern:
myblob.rda: main data file
myblob.df.rda: metadata file
a neuronlist object
Other extdata:
find_extdata(),
save_nl_in_parts()
## Not run:
# you could use the following in a file
# data/make_data.R
delayedAssign('pns', read_nl_from_parts('pns.rds', package='testlazyneuronlist'))
# based on objects created by
save_nl_in_parts(pns)
# which would make:
# - inst/extdata/pns.rds
# - inst/extdata/pns.df.rds
## End(Not run)
Run a command if input files are newer than outputs
RunCmdForNewerInput(
  cmd,
  infiles,
  outfiles,
  Verbose = FALSE,
  UseLock = FALSE,
  Force = FALSE,
  ReturnInputTimes = FALSE,
  ...
)
| cmd | An R expression, a string, or NULL/NA |
|---|---|
| infiles | Character vector of path to one or more input files |
| outfiles | Character vector of path to one or more output files |
| Verbose | Write information to console (default FALSE) |
| UseLock | Stop other processes working on this task (default FALSE) |
| Force | Ignore file modification times and always produce output if input files exist. |
| ReturnInputTimes | Return mtimes of input files (default FALSE) |
| ... | additional parameters passed to |
cmd can be an R expression, which is
evaluated if necessary in the environment calling
RunCmdForNewerInput, a string to be passed to system,
or NULL/NA, in which case the files are checked and TRUE or
FALSE is returned depending on whether action is required.
When UseLock=TRUE, the lock file created is called outfiles[1].lock.
When ReturnInputTimes=TRUE, the input mtimes are returned as an
attribute of a logical value (if available).
logical indicating if cmd was run or for an R expression, eval(cmd)
## Not run:
RunCmdForNewerInput(expression(myfunc("somefile")))
## End(Not run)
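The make-style decision at the heart of this function can be sketched by comparing modification times in base R. The helper needs_update is hypothetical, and its rules (no action when inputs are missing, action when any output is missing or older than the newest input) are assumptions that may differ in detail from RunCmdForNewerInput.

```r
# Hypothetical sketch of a make-style "is any input newer than the outputs?" check
needs_update <- function(infiles, outfiles) {
  if (!all(file.exists(infiles))) return(FALSE)  # nothing to build from
  if (!all(file.exists(outfiles))) return(TRUE)  # at least one output is missing
  # rebuild when the newest input is more recent than the oldest output
  max(file.mtime(infiles)) > min(file.mtime(outfiles))
}

inp <- tempfile()
out <- tempfile()
writeLines("input", inp)
before <- needs_update(inp, out)  # output does not exist yet
writeLines("output", out)
# make the input clearly older than the output
Sys.setFileTime(inp, Sys.time() - 100)
after <- needs_update(inp, out)
```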
Save a neuronlist object into separate data and metadata parts
save_nl_in_parts(
  x,
  datapath = NULL,
  dfpath = NULL,
  extdata = TRUE,
  format = c("rds", "rda"),
  ...
)
| x | A neuronlist object to save in separate parts |
|---|---|
| datapath | Optional path to new data file (constructed from name of x when missing) |
| dfpath | Optional path to new metadata file (constructed from datapath when missing) |
| extdata | Logical indicating whether the files should be saved into extdata folder (default TRUE) |
| format | Either "rds" or "rda" |
| ... | Additional arguments passed to |
Saves a neuronlist into separate data and metadata parts. This can
significantly mitigate git repository bloat since only the metadata object
will change when any metadata is updated. By default the objects will be
saved into the package inst/extdata folder with sensible names based
on the incoming object. E.g. if x=mypns the files will be
mypns.rds
mypns.df.rds
character vector with path to the saved files (returned invisibly)
Other extdata:
find_extdata(),
read_nl_from_parts()
## Not run:
save_nl_in_parts(pns)
# which would make:
# - inst/extdata/pns.rds
# - inst/extdata/pns.df.rds
save_nl_in_parts(pns, format='rda')
# which would make:
# - inst/extdata/pns.rda
# - inst/extdata/pns.df.rda
save_nl_in_parts(pns, 'mypns.rda')
# which would make (NB format argument wins):
# - inst/extdata/mypns.rds
# - inst/extdata/mypns.df.rds
## End(Not run)
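The split into data and metadata files can be illustrated without the 'nat' package. The sketch below uses a plain named list with a data.frame stored in a "df" attribute as a stand-in for a neuronlist; that attribute name, the temporary file locations and the variable names are all assumptions made for illustration, not the package's implementation.

```r
# Stand-in for a neuronlist: a named list carrying metadata in a "df" attribute
x <- structure(
  list(n1 = 1:3, n2 = 4:6),
  df = data.frame(type = c("a", "b"), row.names = c("n1", "n2"))
)

datapath <- tempfile(fileext = ".rds")
# metadata path derived from the data path: foo.rds -> foo.df.rds
dfpath <- sub("\\.rds$", ".df.rds", datapath)

# save the metadata and the bare data separately
saveRDS(attr(x, "df"), dfpath)
data_only <- x
attr(data_only, "df") <- NULL
saveRDS(data_only, datapath)

# recombine the two parts
y <- readRDS(datapath)
attr(y, "df") <- readRDS(dfpath)
```

With this layout, editing only the metadata rewrites the small .df.rds file and leaves the larger data file untouched.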
Split file path into individual components (optionally including separators)
split_path(
  path,
  include.fseps = FALSE,
  omit.duplicate.fseps = FALSE,
  fsep = .Platform$file.sep
)
| path | A path with directories separated by fsep |
|---|---|
| include.fseps | Whether to include the separators in the returned character vector (default FALSE) |
| omit.duplicate.fseps | Whether to omit duplicate file separators if include.fseps=TRUE (default FALSE) |
| fsep | The path separator (defaults to .Platform$file.sep) |
A character vector with one element for each component in the path
(including path separators if include.fseps=TRUE).
Other path_utils:
abs2rel(),
common_path()
split_path("/a/b/c")
split_path("a/b/c")
parts=split_path("/a/b/c", include.fseps=TRUE)
# join parts back up again
paste(parts, collapse = "")
split_path("a/b//c", include.fseps=TRUE, omit.duplicate.fseps=TRUE)
# Windows style
split_path("C:\\a\\b\\c", fsep="\\")
If neither a time nor a reference file is provided then the current time is used. If the file does not already exist, it is created unless Create=FALSE.
touch(
  file,
  time,
  reference,
  timestoupdate = c("access", "modification"),
  Create = TRUE
)
| file | Path to file to modify |
|---|---|
| time | Absolute time in POSIXct format |
| reference | Path to a reference file |
| timestoupdate | "access" or "modification" (default both) |
| Create | Logical indicating whether to create file (default TRUE) |
TRUE or FALSE according to success
jefferis
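A simplified version of this behaviour can be sketched with base R's Sys.setFileTime, which sets both the access and modification times. The helper touch_sketch is hypothetical; it does not support a reference file or updating only one of the two timestamps.

```r
# Hypothetical sketch of a minimal 'touch' built on Sys.setFileTime
touch_sketch <- function(file, time = Sys.time(), create = TRUE) {
  if (!file.exists(file)) {
    # create an empty file unless asked not to
    if (!create || !file.create(file)) return(FALSE)
  }
  # sets both access and modification time
  isTRUE(Sys.setFileTime(file, time))
}

f <- tempfile()
t0 <- as.POSIXct("2020-01-01 12:00:00", tz = "UTC")
ok <- touch_sketch(f, t0)
```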
Return information about a zip archive using system unzip command
zipinfo(f)
| f | Path to one (or more) files |
|---|---|
Uses system unzip command.
dataframe of information
jefferis
Other ziputils:
zipok()
Verify integrity of one or more zip files
zipok(f, Verbose = FALSE)
| f | Path to one (or more) files |
|---|---|
| Verbose | Whether to be Verbose (default FALSE) |
Uses system unzip command.
TRUE when file OK, FALSE otherwise
jefferis
Other ziputils:
zipinfo()