A useful interactive tool when merging or binding objects together. It lists the names of elements that differ in presence or class across multiple data sets. Before running rbind, you may want to check that the data sets are compatible.

compareCols(
  ...,
  list.data,
  keep.names = TRUE,
  test.equal = FALSE,
  diff.only = TRUE,
  cols.wanted,
  fun.class = base::class,
  quiet,
  as.fun,
  keepNames,
  testEqual
)

Arguments

...

Objects whose element names are to be compared.

list.data

As an alternative to ..., you can supply the data sets in a list here.

keep.names

If TRUE, the original data set names are used in the reported table. If not, generic x1, x2,... are used. The latter may be preferred for readability.

test.equal

If TRUE, only a single TRUE/FALSE is returned, indicating whether the names of the objects are the same. The default is FALSE, which means an overview for interactive use is returned. TRUE may be useful in programming. Notice, however, that this check may be overly rigorous: many classes are compatible enough (say numeric and integer), and compareCols doesn't take this into account.

diff.only

If TRUE, columns where no difference is found are not reported. The default is TRUE if more than one data set is supplied. If only one data set is supplied, the full list of columns is shown by default.

cols.wanted

Columns of special interest. These will always be included in the overview and are indicated by a * prepended to the column names. This argument is often useful when you start by defining the set of columns that you want to end up with after combining a number of data sets.

fun.class

The function that will be run on each column to check for differences. base::class is the default. Notice that the alternative base::typeof is different in certain ways. For instance, typeof will not report a difference between numeric and difftime. You can submit basically any function that takes a vector and returns a single value.

quiet

By default, some information on what data is found is given along the way. Consider setting this to TRUE for non-interactive use. The default can be configured using NMdataConf.

as.fun

A function that will be run on the result before returning. If the first input data set is a data.table, the default is to return a data.table; if not, the default is to return a data.frame. Supply whichever function fits your workflow. The default can be configured using NMdataConf.

keepNames

Deprecated. Use keep.names instead.

testEqual

Deprecated. Use test.equal instead.

Value

A data.frame with an overview of the elements of the objects in ... and their classes. The class of the returned object is determined by as.fun.

Details

Technically, this function compares classes of elements in lists. However, in relation to NMdata, these will most of the time be columns in data.frames.
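As a rough base-R illustration of that idea (a sketch of the concept only, not how compareCols is implemented; the data sets `pk` and `pd` and their columns are made up for the example), the element names and classes of two data.frames could be tabulated like this:

```r
# Hypothetical example data; names and values are made up for illustration
pk <- data.frame(ID = 1:3, TIME = c(0, 1, 2), DV = c(0.1, 2.3, 1.8))
pd <- data.frame(ID = c(1, 2, 3), TIME = c(0, 1, 2), EFF = c("a", "b", "c"))

# First class of each column, one named character vector per data set
cl.pk <- vapply(pk, function(x) class(x)[1], character(1))
cl.pd <- vapply(pd, function(x) class(x)[1], character(1))

# Align on the union of column names; NA marks a column that is absent
cols <- union(names(cl.pk), names(cl.pd))
overview <- data.frame(
  column = cols,
  pk = unname(cl.pk[cols]),
  pd = unname(cl.pd[cols])
)

# Keep only columns that differ in presence or class
differs <- is.na(overview$pk) | is.na(overview$pd) | overview$pk != overview$pd
overview[differs, ]
```

Here ID differs in class (integer vs numeric), while DV and EFF each exist in only one data set. TIME is numeric in both and is therefore left out, similar in spirit to what diff.only = TRUE is described to do.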

Despite the name of the argument fun.class, it can be any function to be evaluated on each element in `...`. See examples for how to extract SAS labels on an object read with `read_sas` from the `haven` package.
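The difference between class and typeof, and the idea of a custom function returning one value per element, can be seen in base R without any package. This is a minimal sketch: `get.label` and the data set `d` are hypothetical names introduced for illustration, and returning NA for unlabelled columns is a choice made here, not something the package is known to require.

```r
x.num <- c(1.5, 2.5)
x.dt <- as.difftime(c(1.5, 2.5), units = "hours")

# class distinguishes the two vectors, typeof does not
class(x.num)
class(x.dt)
typeof(x.num)
typeof(x.dt)

# Hypothetical helper: return the "label" attribute of a vector,
# or NA if the vector has no such attribute
get.label <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) NA_character_ else as.character(lab)
}

# Made-up data with a label on one column only
d <- data.frame(ID = 1:2, WT = c(70, 82))
attr(d$ID, "label") <- "Subject ID"

# One value per column, the shape of result fun.class is described to expect
labs <- vapply(d, get.label, character(1))
labs
```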

See also

Other DataWrangling: dims(), listMissings()

Examples

## get SAS labels from objects read with haven::read_sas
if (FALSE) {
  compareCols(..., fun.class = function(x) attributes(x)$label)
}
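A further sketch of typical calls, using only arguments documented above. The data sets `x1` and `x2` are made up for illustration, the calls are skipped if NMdata is not installed, and passing a named list to list.data is an assumption about what is accepted.

```r
# Hypothetical data sets for illustration
x1 <- data.frame(ID = 1:2, DV = c(0.5, 1.2))
x2 <- data.frame(ID = c(1, 2), DV = c(0.7, 0.9), WT = c(70, 82))

if (requireNamespace("NMdata", quietly = TRUE)) {
  # Interactive overview of columns that differ in presence or class
  NMdata::compareCols(x1, x2)

  # Same comparison with the data sets supplied in a list
  # (a named list is assumed to be accepted here)
  NMdata::compareCols(list.data = list(x1 = x1, x2 = x2))

  # Single TRUE/FALSE for use in programming, with messages suppressed
  same <- NMdata::compareCols(x1, x2, test.equal = TRUE, quiet = TRUE)
}
```

Since ID is integer in `x1` and numeric in `x2`, a strict class comparison flags it even though rbind would combine the two without trouble. This is the kind of case the note on test.equal warns about.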