{plyr} is long-superseded. The package itself only uses `plyr::count()`; a vignette also uses `round_any()`.

I was pursuing just dropping {plyr} and re-implementing in base, but gave up for three main reasons:
- `table(<data.frame>)` may fail if there are very few duplicates and each column is of high cardinality, meaning `table(x)` would have a very large number of 0 entries that need to be computed and dropped (`plyr::count()` skips them).
- `interaction(..., drop=TRUE)` + `tapply()` could imitate this, but it's hard to generically reconstruct the un-interacted levels needed to build an equivalent data.frame. Basically, for full generality we'd need a `sep=<str>` where `<str>` is not present in any of the unique values of any of the columns of `x`, in order for `strsplit(<level>, <sep>)` to uniquely map back.
- `vapply(split(x, x), nrow, integer(1L))` is also appealingly simple, but `split()` always drops missing levels (https://bugs.r-project.org/show_bug.cgi?id=18899), so we'd need an onerous/ugly loop over the columns to replace missing observations with a unique `NA`-equivalent, end-sorting sentinel.

Thus the move to {dplyr}, despite it being a non-lightweight choice.
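
The three obstacles above can be reproduced in base R on toy inputs. This is only a sketch: the data frame `x`, the vector `grp`, and the example values are made up for illustration and are not taken from the package.

```r
# Toy data (hypothetical): three rows, no duplicates, two columns.
x <- data.frame(a = c("p", "q", "r"), b = c("u", "v", "w"))

# table() builds one cell per combination of levels, so 3 x 3 = 9 cells
# are computed even though only 3 combinations occur in the data.
tab <- table(x)
n_cells <- length(tab)
n_nonzero <- sum(tab > 0)

# interaction() pastes levels together with `sep` (default "."), so two
# different combinations collide when a value contains the separator.
lvl1 <- levels(interaction("a.b", "c", drop = TRUE))
lvl2 <- levels(interaction("a", "b.c", drop = TRUE))
ambiguous <- identical(lvl1, lvl2)

# split() discards elements whose grouping value is NA, so the group
# sizes no longer add up to the number of observations.
grp <- c("a", NA, "a")
sizes <- vapply(split(seq_along(grp), grp), length, integer(1L))
```

With realistic column cardinalities the gap between `n_cells` and `n_nonzero` grows multiplicatively, which is the cost the first bullet describes.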
I also applied some code quality fixes to nearby lines:

- `T`/`F` --> `TRUE`/`FALSE`.
- `1:<n>` loops replaced by `seq_len()`/`seq_along()`, as appropriate.
- `x <- c(); for (i in seq_along(y)) x[i] <- foo(y[i])` should pre-initialize `x` to be `length(y)`.
- `return()`.
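
As a rough illustration of the loop-related fixes, the sketch below uses hypothetical stand-ins for `y` and `foo()` (neither is taken from the package) to show pre-allocation and why `seq_along()` is safer than `1:<n>` on empty input.

```r
# Hypothetical stand-ins for the `y` and `foo()` in the description.
y <- c(2, 4, 6)
foo <- function(v) v^2

# Pre-allocate the result to its final length, then fill it by index,
# instead of growing an empty vector one element at a time.
x <- numeric(length(y))
for (i in seq_along(y)) x[i] <- foo(y[i])

# With an empty input, 1:length(e) counts down from 1 to 0, so a loop
# over it runs twice; seq_along(e) is empty and the loop never runs.
e <- numeric(0)
bad_idx <- 1:length(e)
good_idx <- seq_along(e)
```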