Retrieve IPEDS data by determining which data files are required for the filter (if one is included) and which are required for the selected variables.
Arguments
- ipedscall
Current list of parameters carried forward from prior functions in the chain (ignore)
- bind
Row bind all same name survey files (e.g., HD2022 and HD2023)
- join
Join different name survey files by UNITID and year. If bind = FALSE, then join will be set to FALSE and the function argument ignored.
Value
Depending on argument combination, the chain will return one of the following objects:
- bind = FALSE, join = FALSE: A list of files with no further processing (each unique complete data file required is returned as a list item).
- bind = TRUE, join = FALSE: A list of files in which like files (e.g., HD*, IC*) are row bound together but unjoined to unlike files.
- bind = TRUE, join = TRUE: A data frame in which like files are bound and all are joined.
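The three return shapes can be illustrated with a minimal base R sketch on toy data. This is not the package's implementation: the data frames, the list name files, and the prefix rule (strip trailing digits from the file name) are assumptions made for illustration, and real IPEDS file names follow more varied patterns.

```r
# Toy stand-ins for downloaded survey files (real files have many more columns)
files <- list(
  HD2022 = data.frame(UNITID = c(1, 2), year = 2022, INSTNM = c("A", "B")),
  HD2023 = data.frame(UNITID = c(1, 2), year = 2023, INSTNM = c("A", "B")),
  IC2022 = data.frame(UNITID = c(1, 2), year = 2022, ROOM = c(1, 2)),
  IC2023 = data.frame(UNITID = c(1, 2), year = 2023, ROOM = c(1, 2))
)

# bind = FALSE, join = FALSE: the list as is, one item per file
unbound <- files

# bind = TRUE, join = FALSE: row bind files that share a survey prefix
# (assumed rule: drop trailing digits, giving "HD" "HD" "IC" "IC")
prefix <- sub("[0-9]+$", "", names(files))
bound <- lapply(split(files, prefix), function(x) do.call(rbind, unname(x)))

# bind = TRUE, join = TRUE: join the bound files by UNITID and year
joined <- Reduce(function(x, y) merge(x, y, by = c("UNITID", "year")), bound)
```

Here unbound is a list of four data frames, bound is a list of two (HD and IC), and joined is a single data frame with one row per UNITID and year.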
Details
Notes on filters:
Filters are applied according to how the user chooses to have the data
returned. By default (join = TRUE), the complete filter is applied to the
final joined data set.
When the user chooses only to bind like files (join = FALSE) or to return all
files separately (bind = FALSE), the function attempts to apply the filter
to the files to which it applies. This may be impossible if the filter is
complex and depends on variables spread across multiple files that the
user chose not to join. In that situation, users receive a warning message
and the unfiltered data files are returned. If a filter applies to only one
type of data file and works, but also removes missing (NA) values, other data
files nominally unaffected by the filter may also have rows removed if that
institution had completely missing data in the filtered file.
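The missing-value side effect can be pictured with a small base R sketch. It only illustrates the outcome described above and assumes, hypothetically, that rows are carried across files by matching UNITID and year; the data frames and variable values are invented, and the package may reach the same result differently.

```r
# Toy files: institution 3 has a missing value in the filtered variable
hd <- data.frame(UNITID = c(1, 2, 3), year = 2023, STABBR = c("KY", "KY", NA))
ic <- data.frame(UNITID = c(1, 2, 3), year = 2023, ROOM = c(1, 2, 1))

# A filter that involves only a variable in hd; rows where the test is NA are dropped
keep <- !is.na(hd$STABBR) & hd$STABBR == "KY"
hd_f <- hd[keep, ]

# Assumed propagation: ic keeps only UNITID/year pairs that survived in hd
ic_f <- merge(ic, hd_f[, c("UNITID", "year")], by = c("UNITID", "year"))
```

Although the filter never mentions a variable in ic, institution 3 disappears from ic_f because its only row in hd was dropped for being NA.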
The more complicated the data call (many selected variables, many selected years, a more complex filter), the longer the data request may take, particularly if files must be downloaded, and the greater the likelihood of unexpected behavior with the join. Users may wish to break large, complex requests into several smaller requests or elect to return a list of unbound or unjoined data frames that they can manipulate directly.