| Title: | Data Science for Psychologists |
|---|---|
| Description: | All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2026, <doi:10.5281/zenodo.7229812>), freely available at <https://hneth-ds4psy.share.connect.posit.cloud/>. The book and corresponding courses introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant. |
| Authors: | Hansjoerg Neth [aut, cre] (ORCID: <https://orcid.org/0000-0001-5427-3141>) |
| Maintainer: | Hansjoerg Neth <[email protected]> |
| License: | CC BY-SA 4.0 |
| Version: | 1.3.0.9003 |
| Built: | 2026-05-07 11:18:18 UTC |
| Source: | https://github.com/hneth/ds4psy |
base_digits provides numeral symbols (digits)
for notational place-value systems with arbitrary bases
(as a named character vector).
base_digitsbase_digits
An object of class character of length 62.
Note that the elements (digits) are character symbols
(i.e., numeral digits "0"-"9", "A"-"F", etc.),
whereas their names correspond to their
numeric values (from 0 to length(base_digits) - 1).
Thus, the maximum base value in conversions by
base2dec or dec2base
is length(base_digits).
base2dec converts numerals in some base into decimal numbers;
dec2base converts decimal numbers into numerals in another base;
as.roman converts integers into Roman numerals.
Other numeric functions:
base2dec(),
dec2base(),
is_equal(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
Other utility functions:
base2dec(),
dec2base(),
is_equal(),
is_vect(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
base_digits # named character vector, zero-indexed names length(base_digits) # 62 (maximum base value) base_digits[10] # 10. element ("9" with name "9") base_digits["10"] # named element "10" ("A" with name "10") base_digits[["10"]] # element named "10" ("A")base_digits # named character vector, zero-indexed names length(base_digits) # 62 (maximum base value) base_digits[10] # 10. element ("9" with name "9") base_digits["10"] # named element "10" ("A" with name "10") base_digits[["10"]] # element named "10" ("A")
base2dec converts a sequence of numeral symbols (digits)
from its notation as positional numerals (with some base or radix)
into standard decimal notation (using the base or radix of 10).
base2dec(x, base = 2)base2dec(x, base = 2)
x |
A (required) sequence of numeric symbols (as a character sequence or vector of digits). |
base |
The base or radix of the symbols in |
The individual digits provided in x (e.g., from "0" to "9", "A" to "F")
must be defined in the specified base (i.e., every digit value must be lower
than the base or radix value).
See base_digits for the sequence of default digits.
base2dec is the complement of dec2base.
An integer number (in decimal notation).
dec2base converts decimal numbers into numerals in another base;
as.roman converts integers into Roman numerals.
Other numeric functions:
base_digits,
dec2base(),
is_equal(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
Other utility functions:
base_digits,
dec2base(),
is_equal(),
is_vect(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
# (a) single string input: base2dec("11") # default base = 2 base2dec("0101") base2dec("1010") base2dec("11", base = 3) base2dec("11", base = 5) base2dec("11", base = 10) base2dec("11", base = 12) base2dec("11", base = 14) base2dec("11", base = 16) # (b) numeric vectors as inputs: base2dec(c(0, 1, 0)) base2dec(c(0, 1, 0), base = 3) # (c) character vector as inputs: base2dec(c("0", "1", "0")) base2dec(c("0", "1", "0"), base = 3) # (d) multi-digit vectors: base2dec(c(1, 1)) base2dec(c(1, 1), base = 3) # Extreme values: base2dec(rep("1", 32)) # 32 x "1" base2dec(c("1", rep("0", 32))) # 2^32 base2dec(rep("1", 33)) # 33 x "1" base2dec(c("1", rep("0", 33))) # 2^33 # Non-standard inputs: base2dec(" ", 2) # no non-spaces: NA base2dec(" ?! ", 2) # no base digits: NA base2dec(" 100 ", 2) # remove leading and trailing spaces base2dec("- 100", 2) # handle negative inputs (value < 0) base2dec("- -100", 2) # handle double negations base2dec("---100", 2) # handle multiple negations # Special cases: base2dec(NA) base2dec(0) base2dec(c(3, 3), base = 3) # Note message! # Note: base2dec(dec2base(012340, base = 9), base = 9) dec2base(base2dec(043210, base = 11), base = 11)# (a) single string input: base2dec("11") # default base = 2 base2dec("0101") base2dec("1010") base2dec("11", base = 3) base2dec("11", base = 5) base2dec("11", base = 10) base2dec("11", base = 12) base2dec("11", base = 14) base2dec("11", base = 16) # (b) numeric vectors as inputs: base2dec(c(0, 1, 0)) base2dec(c(0, 1, 0), base = 3) # (c) character vector as inputs: base2dec(c("0", "1", "0")) base2dec(c("0", "1", "0"), base = 3) # (d) multi-digit vectors: base2dec(c(1, 1)) base2dec(c(1, 1), base = 3) # Extreme values: base2dec(rep("1", 32)) # 32 x "1" base2dec(c("1", rep("0", 32))) # 2^32 base2dec(rep("1", 33)) # 33 x "1" base2dec(c("1", rep("0", 33))) # 2^33 # Non-standard inputs: base2dec(" ", 2) # no non-spaces: NA base2dec(" ?! ", 2) # no base digits: NA base2dec(" 100 ", 2) # remove leading and trailing spaces base2dec("- 100", 2) # handle negative inputs (value < 0) base2dec("- -100", 2) # handle double negations base2dec("---100", 2) # handle multiple negations # Special cases: base2dec(NA) base2dec(0) base2dec(c(3, 3), base = 3) # Note message! # Note: base2dec(dec2base(012340, base = 9), base = 9) dec2base(base2dec(043210, base = 11), base = 11)
Bushisms contains some phrases uttered by
or attributed to U.S. president George W. Bush
(the 43rd president of the United States of America,
in office from January 2001 to January 2009).
BushismsBushisms
A vector of type character
with length(Bushisms) = 22.
Data based on https://en.wikipedia.org/wiki/Bushism.
Other datasets:
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
capitalize converts
the first n initial characters of
each element of a text string x (i.e., characters or words)
to upper- or lowercase.
capitalize(x, n = 1, upper = TRUE, as_text = FALSE)capitalize(x, n = 1, upper = TRUE, as_text = FALSE)
x |
A string of text (required). |
n |
Number of initial characters to convert.
Default: |
upper |
Convert to uppercase?
Default: |
as_text |
Treat and return |
If as_text = TRUE, the input x is merged into
one string of text and the arguments are applied to each word.
A character vector.
caseflip for converting the case of all letters;
words_to_text and text_to_words for converting character vectors and texts.
Other text objects and functions:
Umlaut,
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
x <- c("Hello world!", "this is a TEST sentence.", "the end.") capitalize(x) capitalize(tolower(x)) # Options: capitalize(x, n = 3) # leaves strings intact capitalize(x, n = 3, as_text = TRUE) # treats strings as text capitalize(x, n = 3, upper = FALSE) # first n in lowercasex <- c("Hello world!", "this is a TEST sentence.", "the end.") capitalize(x) capitalize(tolower(x)) # Options: capitalize(x, n = 3) # leaves strings intact capitalize(x, n = 3, as_text = TRUE) # treats strings as text capitalize(x, n = 3, upper = FALSE) # first n in lowercase
caseflip flips the case of all characters
in a string of text x.
caseflip(x)caseflip(x)
x |
A string of text (required). |
Internally, caseflip uses the letters and LETTERS
constants of base R and the chartr function
for replacing characters in strings of text.
A character vector.
capitalize for converting the case of initial letters;
chartr for replacing characters in strings of text.
Other text objects and functions:
Umlaut,
capitalize(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
x <- c("Hello world!", "This is a 1st sentence.", "This is the 2nd sentence.", "The end.") caseflip(x)x <- c("Hello world!", "This is a 1st sentence.", "This is the 2nd sentence.", "The end.") caseflip(x)
cclass provides different character classes
(as a named character vector).
cclasscclass
An object of class character of length 6.
cclass allows illustrating matching
character classes via regular expressions.
See ?base::regex for details on regular expressions
and ?"'" for a list of character constants/quotes in R.
metachar for a vector of metacharacters.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
cclass["hex"] # select by name writeLines(cclass["pun"]) grep("[[:alpha:]]", cclass, value = TRUE)cclass["hex"] # select by name writeLines(cclass["pun"]) grep("[[:alpha:]]", cclass, value = TRUE)
change_time changes the time and time zone
without changing the time display.
change_time(time, tz = "")change_time(time, tz = "")
time |
Time (as a scalar or vector).
If |
tz |
Time zone (as character string).
Default: |
change_time expects inputs to time
to be local time(s) (of the "POSIXlt" class)
and a valid time zone argument tz (as a string)
and returns the same time display (but different actual times)
as calendar time(s) (of the "POSIXct" class).
A calendar time of class "POSIXct".
change_tz function which preserves time but changes time display;
Sys.time() function of base R.
Other date and time functions:
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
change_time(as.POSIXlt(Sys.time()), tz = "UTC") # from "POSIXlt" time: t1 <- as.POSIXlt("2020-01-01 10:20:30", tz = "Europe/Berlin") change_time(t1, "Pacific/Auckland") change_time(t1, "America/Los_Angeles") # from "POSIXct" time: tc <- as.POSIXct("2020-07-01 12:00:00", tz = "UTC") change_time(tc, "Pacific/Auckland") # from "Date": dt <- as.Date("2020-12-31", tz = "Pacific/Honolulu") change_time(dt, tz = "Pacific/Auckland") # from time "string": ts <- "2020-12-31 20:30:45" change_time(ts, tz = "America/Los_Angeles") # from other "string" times: tx <- "7:30:45" change_time(tx, tz = "Asia/Calcutta") ty <- "1:30" change_time(ty, tz = "Europe/London") # convert into local times: (l1 <- as.POSIXlt("2020-06-01 10:11:12")) change_tz(change_time(l1, "Pacific/Auckland"), tz = "UTC") change_tz(change_time(l1, "Europe/Berlin"), tz = "UTC") change_tz(change_time(l1, "America/New_York"), tz = "UTC") # with vector of "POSIXlt" times: (l2 <- as.POSIXlt("2020-12-31 23:59:55", tz = "America/Los_Angeles")) (tv <- c(l1, l2)) # uses tz of l1 change_time(tv, "America/Los_Angeles") # change time and tzchange_time(as.POSIXlt(Sys.time()), tz = "UTC") # from "POSIXlt" time: t1 <- as.POSIXlt("2020-01-01 10:20:30", tz = "Europe/Berlin") change_time(t1, "Pacific/Auckland") change_time(t1, "America/Los_Angeles") # from "POSIXct" time: tc <- as.POSIXct("2020-07-01 12:00:00", tz = "UTC") change_time(tc, "Pacific/Auckland") # from "Date": dt <- as.Date("2020-12-31", tz = "Pacific/Honolulu") change_time(dt, tz = "Pacific/Auckland") # from time "string": ts <- "2020-12-31 20:30:45" change_time(ts, tz = "America/Los_Angeles") # from other "string" times: tx <- "7:30:45" change_time(tx, tz = "Asia/Calcutta") ty <- "1:30" change_time(ty, tz = "Europe/London") # convert into local times: (l1 <- as.POSIXlt("2020-06-01 10:11:12")) change_tz(change_time(l1, "Pacific/Auckland"), tz = "UTC") change_tz(change_time(l1, "Europe/Berlin"), tz = "UTC") change_tz(change_time(l1, "America/New_York"), tz = "UTC") # with vector of "POSIXlt" times: (l2 <- as.POSIXlt("2020-12-31 23:59:55", tz = "America/Los_Angeles")) (tv <- c(l1, l2)) # uses tz of l1 change_time(tv, "America/Los_Angeles") # change time and tz
change_tz changes the nominal time zone (i.e., the time display)
without changing the actual time.
change_tz(time, tz = "")change_tz(time, tz = "")
time |
Time (as a scalar or vector).
If |
tz |
Time zone (as character string).
Default: |
change_tz expects inputs to time
to be calendar time(s) (of the "POSIXct" class)
and a valid time zone argument tz (as a string)
and returns the same time(s) as local time(s)
(of the "POSIXlt" class).
A local time of class "POSIXlt".
change_time function which preserves time display but changes time;
Sys.time() function of base R.
Other date and time functions:
change_time(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
change_tz(Sys.time(), tz = "Pacific/Auckland") change_tz(Sys.time(), tz = "Pacific/Honolulu") # from "POSIXct" time: tc <- as.POSIXct("2020-07-01 12:00:00", tz = "UTC") change_tz(tc, "Australia/Melbourne") change_tz(tc, "Europe/Berlin") change_tz(tc, "America/Los_Angeles") # from "POSIXlt" time: tl <- as.POSIXlt("2020-07-01 12:00:00", tz = "UTC") change_tz(tl, "Australia/Melbourne") change_tz(tl, "Europe/Berlin") change_tz(tl, "America/Los_Angeles") # from "Date": dt <- as.Date("2020-12-31") change_tz(dt, "Pacific/Auckland") change_tz(dt, "Pacific/Honolulu") # Note different date! # with a vector of "POSIXct" times: t2 <- as.POSIXct("2020-12-31 23:59:55", tz = "America/Los_Angeles") tv <- c(tc, t2) tv # Note: Both times in tz of tc change_tz(tv, "America/Los_Angeles")change_tz(Sys.time(), tz = "Pacific/Auckland") change_tz(Sys.time(), tz = "Pacific/Honolulu") # from "POSIXct" time: tc <- as.POSIXct("2020-07-01 12:00:00", tz = "UTC") change_tz(tc, "Australia/Melbourne") change_tz(tc, "Europe/Berlin") change_tz(tc, "America/Los_Angeles") # from "POSIXlt" time: tl <- as.POSIXlt("2020-07-01 12:00:00", tz = "UTC") change_tz(tl, "Australia/Melbourne") change_tz(tl, "Europe/Berlin") change_tz(tl, "America/Los_Angeles") # from "Date": dt <- as.Date("2020-12-31") change_tz(dt, "Pacific/Auckland") change_tz(dt, "Pacific/Honolulu") # Note different date! # with a vector of "POSIXct" times: t2 <- as.POSIXct("2020-12-31 23:59:55", tz = "America/Los_Angeles") tv <- c(tc, t2) tv # Note: Both times in tz of tc change_tz(tv, "America/Los_Angeles")
x into a single string of textchars_to_text combines multi-element character inputs x
into a single string of text (i.e., a character object of length 1),
while preserving punctuation and spaces.
chars_to_text(x, sep = "")chars_to_text(x, sep = "")
x |
A vector (required), typically a character vector. |
sep |
Character to insert between the elements
of a multi-element character vector as input |
chars_to_text is an inverse function of text_to_chars.
Note that using paste(x, collapse = "") would remove spaces.
See collapse_chars for a simpler alternative.
A character vector (of length 1).
collapse_chars for collapsing character vectors;
text_to_chars for splitting text into a vector of characters;
text_to_words for splitting text into a vector of words;
strsplit for splitting strings.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
# (a) One string (with spaces and punctuation): t1 <- "Hello world! This is _A TEST_. Does this work?" (cv <- unlist(strsplit(t1, split = ""))) (t2 <- chars_to_text(cv)) t1 == t2 # (b) Multiple strings (nchar from 0 to >1): s <- c("Hi", " ", "", "there!", " ", "", "Does THIS work?") chars_to_text(s) # Note: Using sep argument: chars_to_text(c("Hi there!", "How are you today?"), sep = " ") chars_to_text(1:3, sep = " | ")# (a) One string (with spaces and punctuation): t1 <- "Hello world! This is _A TEST_. Does this work?" (cv <- unlist(strsplit(t1, split = ""))) (t2 <- chars_to_text(cv)) t1 == t2 # (b) Multiple strings (nchar from 0 to >1): s <- c("Hi", " ", "", "there!", " ", "", "Does THIS work?") chars_to_text(s) # Note: Using sep argument: chars_to_text(c("Hi there!", "How are you today?"), sep = " ") chars_to_text(1:3, sep = " | ")
coin generates a sequence of events that
represent the results of flipping a fair coin n times.
coin(n = 1, events = c("H", "T"))coin(n = 1, events = c("H", "T"))
n |
Number of coin flips.
Default: |
events |
Possible outcomes (as a vector).
Default: |
By default, the 2 possible events for each flip
are "H" (for "heads") and "T" (for "tails").
Other sampling functions:
dice(),
dice_2(),
sample_char(),
sample_date(),
sample_time()
# Basics: coin() table(coin(n = 100)) table(coin(n = 100, events = LETTERS[1:3])) # Note an oddity: coin(10, events = 8:9) # works as expected, but coin(10, events = 9:9) # odd: see sample() for an explanation. # Limits: coin(2:3) coin(NA) coin(0) coin(1/2) coin(3, events = "X") coin(3, events = NA) coin(NULL, NULL)# Basics: coin() table(coin(n = 100)) table(coin(n = 100, events = LETTERS[1:3])) # Note an oddity: coin(10, events = 8:9) # works as expected, but coin(10, events = 9:9) # odd: see sample() for an explanation. # Limits: coin(2:3) coin(NA) coin(0) coin(1/2) coin(3, events = "X") coin(3, events = NA) coin(NULL, NULL)
x into a single character stringcollapse_chars converts multi-element character inputs x
into a single string of text (i.e., a character object of length 1),
separating its elements by sep.
collapse_chars(x, sep = " ")collapse_chars(x, sep = " ")
x |
A vector (required), typically a character vector. |
sep |
A character inserted as separator/delimiter
between elements when collapsing multi-element strings of |
As collapse_chars is a wrapper around
paste(x, collapse = sep).
It preserves spaces within the elements of x.
The separator sep is only used when collapsing multi-element vectors
and inserted between elements.
See chars_to_text for combining character vectors into text.
A character vector (of length 1).
chars_to_text for combining character vectors into text;
text_to_chars for splitting text into a vector of characters;
text_to_words for splitting text into a vector of words;
strsplit for splitting strings.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
collapse_chars(c("Hello", "world", "!")) collapse_chars(c("_", " _ ", " _ "), sep = "|") # preserves spaces writeLines(collapse_chars(c("Hello", "world", "!"), sep = "\n")) collapse_chars(1:3, sep = "")collapse_chars(c("Hello", "world", "!")) collapse_chars(c("_", " _ ", " _ "), sep = "|") # preserves spaces writeLines(collapse_chars(c("Hello", "world", "!"), sep = "\n")) collapse_chars(1:3, sep = "")
count_chars provides frequency counts of the
characters in a string of text x
as a named numeric vector.
count_chars(x, case_sense = TRUE, rm_specials = TRUE, sort_freq = TRUE)count_chars(x, case_sense = TRUE, rm_specials = TRUE, sort_freq = TRUE)
x |
A string of text (required). |
case_sense |
Boolean: Distinguish lower- vs. uppercase characters?
Default: |
rm_specials |
Boolean: Remove special characters?
Default: |
sort_freq |
Boolean: Sort output by character frequency?
Default: |
If rm_specials = TRUE (as per default),
most special (or non-word) characters are
removed and not counted. (Note that this currently works
without using regular expressions.)
The quantification is case-sensitive and the resulting vector is sorted by name (alphabetically) or by frequency (per default).
A named numeric vector.
count_words for counting the frequency of words;
count_chars_words for counting both characters and words;
plot_chars for a corresponding plotting function.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
# Default: x <- c("Hello world!", "This is a 1st sentence.", "This is the 2nd sentence.", "THE END.") count_chars(x) # Options: count_chars(x, case_sense = FALSE) count_chars(x, rm_specials = FALSE) count_chars(x, sort_freq = FALSE)# Default: x <- c("Hello world!", "This is a 1st sentence.", "This is the 2nd sentence.", "THE END.") count_chars(x) # Options: count_chars(x, case_sense = FALSE) count_chars(x, rm_specials = FALSE) count_chars(x, sort_freq = FALSE)
count_chars_words provides frequency counts of the
characters and words of a string of text x
on a per character basis.
count_chars_words(x, case_sense = TRUE, sep = "|", rm_sep = TRUE)count_chars_words(x, case_sense = TRUE, sep = "|", rm_sep = TRUE)
x |
A string of text (required). |
case_sense |
Boolean: Distinguish lower- vs. uppercase characters?
Default: |
sep |
Dummy character(s) to insert between elements/lines
when parsing a multi-element character vector |
rm_sep |
Should |
count_chars_words calls both count_chars and
count_words and maps their results
to a data frame that contains a row for each
character of x.
The quantifications are case-sensitive. Special characters (e.g., parentheses, punctuation, and spaces) are counted as characters, but removed from word counts.
If input x consists of multiple text strings,
they are collapsed with an added " " (space) between them.
A data frame with 4 variables
(char, char_freq, word, word_freq).
count_chars for counting the frequency of characters;
count_words for counting the frequency of words;
plot_chars for a character plotting function.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
s1 <- ("This test is to test this function.") head(count_chars_words(s1)) head(count_chars_words(s1, case_sense = FALSE)) s3 <- c("A 1st sentence.", "The 2nd sentence.", "A 3rd --- and also THE FINAL --- SENTENCE.") tail(count_chars_words(s3)) tail(count_chars_words(s3, case_sense = FALSE))s1 <- ("This test is to test this function.") head(count_chars_words(s1)) head(count_chars_words(s1, case_sense = FALSE)) s3 <- c("A 1st sentence.", "The 2nd sentence.", "A 3rd --- and also THE FINAL --- SENTENCE.") tail(count_chars_words(s3)) tail(count_chars_words(s3, case_sense = FALSE))
count_words provides frequency counts of the
words in a string of text x
as a named numeric vector.
count_words(x, case_sense = TRUE, sort_freq = TRUE)count_words(x, case_sense = TRUE, sort_freq = TRUE)
x |
A string of text (required). |
case_sense |
Boolean: Distinguish lower- vs. uppercase characters?
Default: |
sort_freq |
Boolean: Sort output by word frequency?
Default: |
Special (or non-word) characters are removed and not counted.
The quantification is case-sensitive and the resulting vector is sorted by name (alphabetically) or by frequency (per default).
A named numeric vector.
count_chars for counting the frequency of characters;
count_chars_words for counting both characters and words;
plot_chars for a character plotting function.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
# Default: s3 <- c("A first sentence.", "The second sentence.", "A third --- and also THE FINAL --- SENTENCE.") count_words(s3) # case-sensitive, sorts by frequency # Options: count_words(s3, case_sense = FALSE) # case insensitive count_words(s3, sort_freq = FALSE) # sorts alphabetically# Default: s3 <- c("A first sentence.", "The second sentence.", "A third --- and also THE FINAL --- SENTENCE.") count_words(s3) # case-sensitive, sorts by frequency # Options: count_words(s3, case_sense = FALSE) # case insensitive count_words(s3, sort_freq = FALSE) # sorts alphabetically
countries is a dataset containing the names of
197 countries (as a vector of text strings).
countriescountries
A vector of type character
with length(countries) = 197.
Data from https://www.gapminder.org: Original data at https://www.gapminder.org/data/documentation/gd004/.
Other datasets:
Bushisms,
Trumpisms,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
cur_date provides a relaxed version of
Sys.time() that is sufficient for most purposes.
cur_date(rev = FALSE, as_string = TRUE, sep = "-")cur_date(rev = FALSE, as_string = TRUE, sep = "-")
rev |
Boolean: Reverse from "yyyy-mm-dd" to "dd-mm-yyyy" format?
Default: |
as_string |
Boolean: Return as character string?
Default: |
sep |
Character: Separator to use.
Default: |
By default, cur_date returns Sys.Date
as a character string (using current system settings and
sep for formatting).
If as_string = FALSE, a "Date" object is returned.
Alternatively, consider using Sys.Date
or Sys.time() to obtain the "
format according to the ISO 8601 standard.
For more options, see the documentations of the
date and Sys.Date functions of base R
and the formatting options for Sys.time().
A character string or object of class "Date".
what_date() function to print dates with more options;
date() and today() functions of the lubridate package;
date(), Sys.Date(), and Sys.time() functions of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
cur_date() cur_date(sep = "/") cur_date(rev = TRUE) cur_date(rev = TRUE, sep = ".") # return a "Date" object: from <- cur_date(as_string = FALSE) class(from)cur_date() cur_date(sep = "/") cur_date(rev = TRUE) cur_date(rev = TRUE, sep = ".") # return a "Date" object: from <- cur_date(as_string = FALSE) class(from)
cur_time provides a satisficing version of
Sys.time() that is sufficient for most purposes.
cur_time(seconds = FALSE, as_string = TRUE, sep = ":")cur_time(seconds = FALSE, as_string = TRUE, sep = ":")
seconds |
Boolean: Show time with seconds?
Default: |
as_string |
Boolean: Return as character string?
Default: |
sep |
Character: Separator to use.
Default: |
By default, cur_time returns a
Sys.time() as a character string
(in "
using current system settings.
If as_string = FALSE, a "POSIXct"
(calendar time) object is returned.
For a time zone argument,
see the what_time function,
or the now() function of
the lubridate package.
A character string or object of class "POSIXct".
what_time() function to print times with more options;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
cur_time() cur_time(seconds = TRUE) cur_time(sep = ".") # return a "POSIXct" object: t <- cur_time(as_string = FALSE) format(t, "%T %Z")cur_time() cur_time(seconds = TRUE) cur_time(sep = ".") # return a "POSIXct" object: t <- cur_time(as_string = FALSE) format(t, "%T %Z")
data_1 is a fictitious dataset to practice importing data
(from a DELIMITED file).
data_1data_1
A table with 100 cases (rows) and 4 variables (columns).
See DELIMITED data at http://rpository.com/ds4psy/data/data_1.dat.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_2 is a fictitious dataset to practice importing data
(from a FWF file).
data_2data_2
A table with 100 cases (rows) and 4 variables (columns).
See FWF data at http://rpository.com/ds4psy/data/data_2.dat.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_t1 is a fictitious dataset to practice importing and joining data
(from a CSV file).
data_t1data_t1
A table with 20 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/data_t1.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_t1_de is a fictitious dataset to practice importing data
(from a CSV file, de/European style).
data_t1_dedata_t1_de
A table with 20 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/data_t1_de.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_t1_tab is a fictitious dataset to practice importing data
(from a TAB file).
data_t1_tabdata_t1_tab
A table with 20 cases (rows) and 4 variables (columns).
See TAB-delimited data at http://rpository.com/ds4psy/data/data_t1_tab.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_t2 is a fictitious dataset to practice importing and joining data
(from a CSV file).
data_t2data_t2
A table with 20 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/data_t2.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_t3 is a fictitious dataset to practice importing and joining data
(from a CSV file).
data_t3data_t3
A table with 20 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/data_t3.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
data_t4 is a fictitious dataset to practice importing and joining data
(from a CSV file).
data_t4data_t4
A table with 20 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/data_t4.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
days_in_month computes the number of days in the months of
given dates (provided as a date or time dt,
or number/string denoting a 4-digit year).
days_in_month(dt = Sys.Date(), ...)days_in_month(dt = Sys.Date(), ...)
dt |
Date or time (scalar or vector).
Default: |
... |
Other parameters (passed to |
The function requires dt as "Dates",
rather than month names or numbers,
to check for leap years (in which February has 29 days).
A named (numeric) vector.
is_leap_year to check for leap years;
diff_tz for time zone-based time differences;
days_in_month function of the lubridate package.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
days_in_month() # Robustness: days_in_month(Sys.Date()) # Date days_in_month(Sys.time()) # POSIXct days_in_month("2020-07-01") # string days_in_month(20200901) # number days_in_month(c("2020-02-10 01:02:03", "2021-02-11", "2024-02-12")) # vectors of strings # For leap years: ds <- as.Date("2020-02-20") + (365 * 0:4) days_in_month(ds) # (2020/2024 are leap years)days_in_month() # Robustness: days_in_month(Sys.Date()) # Date days_in_month(Sys.time()) # POSIXct days_in_month("2020-07-01") # string days_in_month(20200901) # number days_in_month(c("2020-02-10 01:02:03", "2021-02-11", "2024-02-12")) # vectors of strings # For leap years: ds <- as.Date("2020-02-20") + (365 * 0:4) days_in_month(ds) # (2020/2024 are leap years)
dec2base converts an integer from its standard decimal notation
(i.e., using positional numerals with a base or radix of 10)
into a sequence of numeric symbols (digits) in some other base.
See base_digits for the sequence of default digits.
dec2base(x, base = 2)dec2base(x, base = 2)
x |
A (required) integer in decimal (base 10) notation or corresponding string of digits (i.e., digits 0-9). |
base |
The base or radix of the digits in the output.
Default: |
To prevent erroneous interpretations of numeric outputs,
dec2base returns a sequence of digits (as a character string).
dec2base is the complement of base2dec.
A character string of digits (in base notation).
base2dec converts numerals in some base into decimal numbers;
as.roman converts integers into Roman numerals.
Other numeric functions:
base2dec(),
base_digits,
is_equal(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
Other utility functions:
base2dec(),
base_digits,
is_equal(),
is_vect(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
# (a) single numeric input: dec2base(3) # base = 2 dec2base(8, base = 2) dec2base(8, base = 3) dec2base(8, base = 7) dec2base(100, base = 5) dec2base(100, base = 10) dec2base(100, base = 15) dec2base(14, base = 14) dec2base(15, base = 15) dec2base(16, base = 16) dec2base(15, base = 16) dec2base(31, base = 16) dec2base(47, base = 16) # (b) single string input: dec2base("7", base = 2) dec2base("8", base = 3) # Extreme values: dec2base(base2dec(rep("1", 32))) # 32 x "1" dec2base(base2dec(c("1", rep("0", 32)))) # 2^32 dec2base(base2dec(rep("1", 33))) # 33 x "1" dec2base(base2dec(c("1", rep("0", 33)))) # 2^33 # Non-standard inputs: dec2base(" ") # only spaces: NA dec2base("?") # no decimal digits: NA dec2base(" 10 ", 2) # remove leading and trailing spaces dec2base("-10", 2) # handle negative inputs (in character strings) dec2base(" -- 10", 2) # handle multiple negations dec2base("xy -10 ", 2) # ignore non-decimal digit prefixes # Note: base2dec(dec2base(012340, base = 9), base = 9) dec2base(base2dec(043210, base = 11), base = 11)# (a) single numeric input: dec2base(3) # base = 2 dec2base(8, base = 2) dec2base(8, base = 3) dec2base(8, base = 7) dec2base(100, base = 5) dec2base(100, base = 10) dec2base(100, base = 15) dec2base(14, base = 14) dec2base(15, base = 15) dec2base(16, base = 16) dec2base(15, base = 16) dec2base(31, base = 16) dec2base(47, base = 16) # (b) single string input: dec2base("7", base = 2) dec2base("8", base = 3) # Extreme values: dec2base(base2dec(rep("1", 32))) # 32 x "1" dec2base(base2dec(c("1", rep("0", 32)))) # 2^32 dec2base(base2dec(rep("1", 33))) # 33 x "1" dec2base(base2dec(c("1", rep("0", 33)))) # 2^33 # Non-standard inputs: dec2base(" ") # only spaces: NA dec2base("?") # no decimal digits: NA dec2base(" 10 ", 2) # remove leading and trailing spaces dec2base("-10", 2) # handle negative inputs (in character strings) dec2base(" -- 10", 2) # handle multiple negations dec2base("xy -10 ", 2) # ignore non-decimal digit prefixes # Note: base2dec(dec2base(012340, base = 9), base = 9) dec2base(base2dec(043210, base = 11), base = 11)
dice generates a sequence of events that
represent the results of throwing a fair dice
(with a given number of events or number of sides)
n times.
dice(n = 1, events = 1:6)dice(n = 1, events = 1:6)
n |
Number of dice throws.
Default: |
events |
Events to draw from (or number of sides).
Default: |
By default, the 6 possible events for each throw of the dice
are the numbers from 1 to 6.
Other sampling functions:
coin(),
dice_2(),
sample_char(),
sample_date(),
sample_time()
# Basics: dice() table(dice(10^4)) # 5-sided dice: dice(events = 1:5) table(dice(100, events = 5)) # Strange dice: dice(5, events = 8:9) table(dice(100, LETTERS[1:3])) # Note: dice(10, 1) table(dice(100, 2)) # Note an oddity: dice(10, events = 8:9) # works as expected, but dice(10, events = 9:9) # odd: see sample() for an explanation. # Limits: dice(NA) dice(0) dice(1/2) dice(2:3) dice(5, events = NA) dice(5, events = 1/2) dice(NULL, NULL)# Basics: dice() table(dice(10^4)) # 5-sided dice: dice(events = 1:5) table(dice(100, events = 5)) # Strange dice: dice(5, events = 8:9) table(dice(100, LETTERS[1:3])) # Note: dice(10, 1) table(dice(100, 2)) # Note an oddity: dice(10, events = 8:9) # works as expected, but dice(10, events = 9:9) # odd: see sample() for an explanation. # Limits: dice(NA) dice(0) dice(1/2) dice(2:3) dice(5, events = NA) dice(5, events = 1/2) dice(NULL, NULL)
dice_2 is a variant of dice that
generates a sequence of events that
represent the results of throwing a dice
(with a given number of sides) n times.
dice_2(n = 1, sides = 6)dice_2(n = 1, sides = 6)
n |
Number of dice throws.
Default: |
sides |
Number of sides.
Default: |
Something is wrong with this dice. Can you examine it and measure its problems in a quantitative fashion?
Other sampling functions:
coin(),
dice(),
sample_char(),
sample_date(),
sample_time()
# Basics: dice_2() table(dice_2(100)) # 10-sided dice: dice_2(sides = 10) table(dice_2(100, sides = 10)) # Note: dice_2(10, 1) table(dice_2(5000, sides = 5)) # Note an oddity: dice_2(n = 10, sides = 8:9) # works, but dice_2(n = 10, sides = 9:9) # odd: see sample() for an explanation.# Basics: dice_2() table(dice_2(100)) # 10-sided dice: dice_2(sides = 10) table(dice_2(100, sides = 10)) # Note: dice_2(10, 1) table(dice_2(5000, sides = 5)) # Note an oddity: dice_2(n = 10, sides = 8:9) # works, but dice_2(n = 10, sides = 9:9) # odd: see sample() for an explanation.
diff_dates computes the difference between two dates
(i.e., from some from_date to some to_date)
in human measurement units (periods).
diff_dates( from_date, to_date = Sys.Date(), unit = "years", as_character = TRUE )diff_dates( from_date, to_date = Sys.Date(), unit = "years", as_character = TRUE )
from_date |
From date (required, scalar or vector, as "Date"). Date of birth (DOB), assumed to be of class "Date", and coerced into "Date" when of class "POSIXt". |
to_date |
To date (optional, scalar or vector, as "Date").
Default: |
unit |
Largest measurement unit for representing results.
Units represent human time periods, rather than
chronological time differences.
Default:
Units may be abbreviated. |
as_character |
Boolean: Return output as character?
Default: |
diff_dates answers questions like
"How much time has elapsed between two dates?"
or "How old are you?" in human time periods
of (full) years, months, and days.
Key characteristics:
If to_date or from_date are not "Date" objects,
diff_dates aims to coerce them into "Date" objects.
If to_date is missing (i.e., NA),
to_date is set to today's date (i.e., Sys.Date()).
If to_date is specified, any intermittent missing values
(i.e., NA) are set to today's date (i.e., Sys.Date()).
Thus, dead people (with both birth dates and death dates specified)
do not age any further, but people still alive (with is.na(to_date),
are measured to today's date (i.e., Sys.Date()).
If to_date precedes from_date (i.e., from_date > to_date)
computations are performed on swapped days and
the result is marked as negative (by a character "-") in the output.
If the lengths of from_date and to_date differ,
the shorter vector is recycled to the length of the longer one.
By default, diff_dates provides output as (signed) character strings.
For numeric outputs, use as_character = FALSE.
A character vector or data frame (with dates, sign, and numeric columns for units).
Time spans (interval as.period) in the lubridate package.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
y_100 <- Sys.Date() - (100 * 365.25) + -1:1 diff_dates(y_100) # with "to_date" argument: y_050 <- Sys.Date() - (50 * 365.25) + -1:1 diff_dates(y_100, y_050) diff_dates(y_100, y_050, unit = "d") # days (with decimals) # Time unit and output format: ds_from <- as.Date("2010-01-01") + 0:2 ds_to <- as.Date("2020-03-01") # (2020 is leap year) diff_dates(ds_from, ds_to, unit = "y", as_character = FALSE) # years diff_dates(ds_from, ds_to, unit = "m", as_character = FALSE) # months diff_dates(ds_from, ds_to, unit = "d", as_character = FALSE) # days # Robustness: days_cur_year <- 365 + is_leap_year(Sys.Date()) diff_dates(Sys.time() - (1 * (60 * 60 * 24) * days_cur_year)) # for POSIXt times diff_dates("10-08-11", "20-08-10") # for strings diff_dates(20200228, 20200301) # for numbers (2020 is leap year) # Recycling "to_date" to length of "from_date": y_050_2 <- Sys.Date() - (50 * 365.25) diff_dates(y_100, y_050_2) # Note maxima and minima: diff_dates("0000-01-01", "9999-12-31") # max. d + m + y diff_dates("1000-06-01", "1000-06-01") # min. d + m + y # If from_date == to_date: diff_dates("2000-01-01", "2000-01-01") # If from_date > to_date: diff_dates("2000-01-02", "2000-01-01") # Note negation "-" diff_dates("2000-02-01", "2000-01-01", as_character = TRUE) diff_dates("2001-02-02", "2000-02-02", as_character = FALSE) # Test random date samples: f_d <- sample_date(size = 10) t_d <- sample_date(size = 10) diff_dates(f_d, t_d, as_character = TRUE) # Using 'fame' data: dob <- as.Date(fame$DOB, format = "%B %d, %Y") dod <- as.Date(fame$DOD, format = "%B %d, %Y") head(diff_dates(dob, dod)) # Note: Deceased people do not age further. head(diff_dates(dob, dod, as_character = FALSE)) # numeric outputsy_100 <- Sys.Date() - (100 * 365.25) + -1:1 diff_dates(y_100) # with "to_date" argument: y_050 <- Sys.Date() - (50 * 365.25) + -1:1 diff_dates(y_100, y_050) diff_dates(y_100, y_050, unit = "d") # days (with decimals) # Time unit and output format: ds_from <- as.Date("2010-01-01") + 0:2 ds_to <- as.Date("2020-03-01") # (2020 is leap year) diff_dates(ds_from, ds_to, unit = "y", as_character = FALSE) # years diff_dates(ds_from, ds_to, unit = "m", as_character = FALSE) # months diff_dates(ds_from, ds_to, unit = "d", as_character = FALSE) # days # Robustness: days_cur_year <- 365 + is_leap_year(Sys.Date()) diff_dates(Sys.time() - (1 * (60 * 60 * 24) * days_cur_year)) # for POSIXt times diff_dates("10-08-11", "20-08-10") # for strings diff_dates(20200228, 20200301) # for numbers (2020 is leap year) # Recycling "to_date" to length of "from_date": y_050_2 <- Sys.Date() - (50 * 365.25) diff_dates(y_100, y_050_2) # Note maxima and minima: diff_dates("0000-01-01", "9999-12-31") # max. d + m + y diff_dates("1000-06-01", "1000-06-01") # min. d + m + y # If from_date == to_date: diff_dates("2000-01-01", "2000-01-01") # If from_date > to_date: diff_dates("2000-01-02", "2000-01-01") # Note negation "-" diff_dates("2000-02-01", "2000-01-01", as_character = TRUE) diff_dates("2001-02-02", "2000-02-02", as_character = FALSE) # Test random date samples: f_d <- sample_date(size = 10) t_d <- sample_date(size = 10) diff_dates(f_d, t_d, as_character = TRUE) # Using 'fame' data: dob <- as.Date(fame$DOB, format = "%B %d, %Y") dod <- as.Date(fame$DOD, format = "%B %d, %Y") head(diff_dates(dob, dod)) # Note: Deceased people do not age further. head(diff_dates(dob, dod, as_character = FALSE)) # numeric outputs
diff_times computes the difference between two times
(i.e., from some from_time to some to_time)
in human measurement units (periods).
diff_times(from_time, to_time = Sys.time(), unit = "days", as_character = TRUE)diff_times(from_time, to_time = Sys.time(), unit = "days", as_character = TRUE)
from_time |
From time (required, scalar or vector, as "POSIXct"). Origin time, assumed to be of class "POSIXct", and coerced into "POSIXct" when of class "Date" or "POSIXlt. |
to_time |
To time (optional, scalar or vector, as "POSIXct").
Default: |
unit |
Largest measurement unit for representing results.
Units represent human time periods, rather than
chronological time differences.
Default:
Units may be abbreviated. |
as_character |
Boolean: Return output as character?
Default: |
diff_times answers questions like
"How much time has elapsed between two dates?"
or "How old are you?" in human time periods
of (full) years, months, and days.
Key characteristics:
If to_time or from_time are not "POSIXct" objects,
diff_times aims to coerce them into "POSIXct" objects.
If to_time is missing (i.e., NA),
to_time is set to the current time (i.e., Sys.time()).
If to_time is specified, any intermittent missing values
(i.e., NA) are set to the current time (i.e., Sys.time()).
If to_time precedes from_time (i.e., from_time > to_time)
computations are performed on swapped times and the result is marked
as negative (by a character "-") in the output.
If the lengths of from_time and to_time differ,
the shorter vector is recycled to the length of the longer one.
By default, diff_times provides output as (signed) character strings.
For numeric outputs, use as_character = FALSE.
A character vector or data frame (with times, sign, and numeric columns for units).
diff_dates for date differences;
time spans (an interval as.period) in the lubridate package.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
t1 <- as.POSIXct("1969-07-13 13:53 CET") # (before UNIX epoch) diff_times(t1, unit = "years", as_character = TRUE) diff_times(t1, unit = "secs", as_character = TRUE)t1 <- as.POSIXct("1969-07-13 13:53 CET") # (before UNIX epoch) diff_times(t1, unit = "years", as_character = TRUE) diff_times(t1, unit = "secs", as_character = TRUE)
diff_tz computes the time difference
between two times t1 and t2
that is exclusively due to both times being in
different time zones.
diff_tz(t1, t2, in_min = FALSE)diff_tz(t1, t2, in_min = FALSE)
t1 |
First time (required, as "POSIXt" time point/moment). |
t2 |
Second time (required, as "POSIXt" time point/moment). |
in_min |
Return time-zone based time
difference in minutes (Boolean)?
Default: |
diff_tz ignores all differences in nominal times,
but allows adjusting time-based computations
for time shifts that are due to time zone differences
(e.g., different locations, or
changes to/from daylight saving time, DST),
rather than differences in actual times.
Internally, diff_tz determines and contrasts the POSIX
conversion specifications "
(in numeric form).
If the lengths of t1 and t2 differ,
the shorter vector is recycled to the length of the longer one.
A character (in "HH:MM" format) or numeric vector (number of minutes).
days_in_month for the number of days in given months;
is_leap_year to check for leap years.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
# Time zones differences: tm <- "2020-01-01 01:00:00" # nominal time t1 <- as.POSIXct(tm, tz = "Pacific/Auckland") t2 <- as.POSIXct(tm, tz = "Europe/Berlin") t3 <- as.POSIXct(tm, tz = "Pacific/Honolulu") # as character (in "HH:MM"): diff_tz(t1, t2) diff_tz(t2, t3) diff_tz(t1, t3) # as numeric (in minutes): diff_tz(t1, t3, in_min = TRUE) # Compare local times (POSIXlt): t4 <- as.POSIXlt(Sys.time(), tz = "Pacific/Auckland") t5 <- as.POSIXlt(Sys.time(), tz = "Europe/Berlin") diff_tz(t4, t5) diff_tz(t4, t5, in_min = TRUE) # DSL shift: Spring ahead (on 2020-03-29: 02:00:00 > 03:00:00): s6 <- "2020-03-29 01:00:00 CET" # before DSL switch s7 <- "2020-03-29 03:00:00 CEST" # after DSL switch t6 <- as.POSIXct(s6, tz = "Europe/Berlin") # CET t7 <- as.POSIXct(s7, tz = "Europe/Berlin") # CEST diff_tz(t6, t7) # 1 hour forward diff_tz(t6, t7, in_min = TRUE)# Time zones differences: tm <- "2020-01-01 01:00:00" # nominal time t1 <- as.POSIXct(tm, tz = "Pacific/Auckland") t2 <- as.POSIXct(tm, tz = "Europe/Berlin") t3 <- as.POSIXct(tm, tz = "Pacific/Honolulu") # as character (in "HH:MM"): diff_tz(t1, t2) diff_tz(t2, t3) diff_tz(t1, t3) # as numeric (in minutes): diff_tz(t1, t3, in_min = TRUE) # Compare local times (POSIXlt): t4 <- as.POSIXlt(Sys.time(), tz = "Pacific/Auckland") t5 <- as.POSIXlt(Sys.time(), tz = "Europe/Berlin") diff_tz(t4, t5) diff_tz(t4, t5, in_min = TRUE) # DSL shift: Spring ahead (on 2020-03-29: 02:00:00 > 03:00:00): s6 <- "2020-03-29 01:00:00 CET" # before DSL switch s7 <- "2020-03-29 03:00:00 CEST" # after DSL switch t6 <- as.POSIXct(s6, tz = "Europe/Berlin") # CET t7 <- as.POSIXct(s7, tz = "Europe/Berlin") # CEST diff_tz(t6, t7) # 1 hour forward diff_tz(t6, t7, in_min = TRUE)
The ds4psy package currently only contains a default vignette that provides general information and links.
ds4psy.guide()ds4psy.guide()
dt_10 contains precise DOB information of
10 non-existent, but definitely Danish people.
dt_10dt_10
A table with 10 cases (rows) and 7 variables (columns).
See CSV data file at http://rpository.com/ds4psy/data/dt_10.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
exp_num_dt is a fictitious set of data describing
1000 non-existing, but surprisingly friendly people.
exp_num_dtexp_num_dt
A table with 1000 cases (rows) and 15 variables (columns).
Codebook The data characterize 1000 individuals (rows) in 15 variables (columns):
1. name: Participant initials.
2. gender: Self-identified gender (as a binary variable).
3. bday: Day (within month) of DOB.
4. bmonth: Month (within year) of DOB.
5. byear: Year of DOB.
6. height: Height (in cm).
7. blood_type: Blood type.
8. bnt_1 to 11. bnt_4: Correct response to corresponding BNT question? (1: correct, 0: incorrect).
12. g_iq and 13. s_iq: Scores from two IQ tests (general vs. social).
14. t_1 and 15. t_2: Study start and end time.
exp_num_dt was generated for practice purposes.
It allows
(1) converting data tables from wider into longer format,
(2) dealing with date- and time-related variables, and
(3) computing, analyzing, and visualizing test scores (e.g., numeracy, IQ).
The gender variable was converted into a binary variable
(i.e., using 2 categories "female" and "not female").
See CSV data files at http://rpository.com/ds4psy/data/numeracy.csv and http://rpository.com/ds4psy/data/dt.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
exp_wide is a fictitious dataset to practice tidying data
(here: converting from wide to long format).
exp_wideexp_wide
A table with 10 cases (rows) and 7 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/exp_wide.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
falsePosPsy_all is a dataset containing the data from 2 studies designed to
highlight problematic research practices within psychology.
falsePosPsy_allfalsePosPsy_all
A table with 78 cases (rows) and 19 variables (columns):
Simmons, Nelson and Simonsohn (2011) published a controversial article with a necessarily false finding. By conducting simulations and 2 simple behavioral experiments, the authors show that flexibility in data collection, analysis, and reporting dramatically increases the rate of false-positive findings.
Study ID.
Participant ID.
Days since participant was born (based on their self-reported birthday).
Age in years.
Is participant a woman? 1: yes, 2: no.
Father's age (in years).
Mother's age (in years).
Did the participant hear the song 'Hot Potato' by The Wiggles? 1: yes, 2: no.
Did the participant hear the song 'When I am 64' by The Beatles? 1: yes, 2: no.
Did the participant hear the song 'Kalimba' by Mr. Scrub? 1: yes, 2: no.
In which condition was the participant? control: Subject heard the song 'Kalimba' by Mr. Scrub; potato: Subject heard the song 'Hot Potato' by The Wiggles; 64: Subject heard the song 'When I am 64' by The Beatles.
Could participant report the square root of 100? 1: yes, 2: no.
Imagine a restaurant you really like offered a 30 percent discount for dining between 4pm and 6pm. How likely would you be to take advantage of that offer? Scale from 1: very unlikely, 7: very likely.
In the political spectrum, where would you place yourself? Scale: 1: very liberal, 2: liberal, 3: centrist, 4: conservative, 5: very conservative.
If you had to guess who was chosen the quarterback of the year in Canada last year, which of the following four options would you choose? 1: Dalton Bell, 2: Daryll Clark, 3: Jarious Jackson, 4: Frank Wilczynski.
How often have you referred to some past part of your life as “the good old days”? Scale: 11: never, 12: almost never, 13: sometimes, 14: often, 15: very often.
How old do you feel? Scale: 1: very young, 2: young, 3: neither young nor old, 4: old, 5: very old.
Computers are complicated machines. Scale from 1: strongly disagree, to 5: strongly agree.
Imagine you were going to a diner for dinner tonight, how much do you think you would like the food? Scale from 1: dislike extremely, to 9: like extremely.
See https://bookdown.org/hneth/ds4psy/B.2-datasets-false.html for codebook and more information.
Articles
Simmons, J.P., Nelson, L.D., & Simonsohn, U. (2011).
False-positive psychology: Undisclosed flexibility in data collection and analysis
allows presenting anything as significant.
Psychological Science, 22(11), 1359–1366.
doi: 10.1177/0956797611417632
Simmons, J.P., Nelson, L.D., & Simonsohn, U. (2014).
Data from paper "False-Positive Psychology:
Undisclosed Flexibility in Data Collection and Analysis
Allows Presenting Anything as Significant".
Journal of Open Psychology Data, 2(1), e1.
doi: 10.5334/jopd.aa
See files at https://openpsychologydata.metajnl.com/articles/10.5334/jopd.aa/ and the archive at https://zenodo.org/record/7664 for original dataset.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
fame is a dataset to practice working with dates.
fame contains the names, areas, dates of birth (DOB), and
— if applicable — the dates of death (DOD) of famous people.
famefame
A table with 67 cases (rows) and 4 variables (columns).
Student solutions to exercises, dates mostly from https://www.wikipedia.org/.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
flowery contains versions and variations
of Gertrude Stein's popular phrase
"A rose is a rose is a rose".
floweryflowery
A vector of type character
with length(flowery) = 60.
The phrase stems from Gertrude Stein's poem "Sacred Emily" (written in 1913 and published in 1922, in "Geography and Plays"). The verbatim line in the poem actually reads "Rose is a rose is a rose is a rose".
See https://en.wikipedia.org/wiki/Rose_is_a_rose_is_a_rose_is_a_rose for additional variations and sources.
Data based on https://en.wikipedia.org/wiki/Rose_is_a_rose_is_a_rose_is_a_rose.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
fruits is a dataset containing the names of
122 fruits (as a vector of text strings).
fruitsfruits
A vector of type character
with length(fruits) = 122.
Botanically, "fruits" are the seed-bearing structures of flowering plants (angiosperms) formed from the ovary after flowering.
In common usage, "fruits" refer to the fleshy seed-associated structures of a plant that taste sweet or sour, and are edible in their raw state.
Data based on https://simple.wikipedia.org/wiki/List_of_fruits.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
get_set obtains a set of x/y coordinates and returns it
(as a data frame).
get_set(n = 1)get_set(n = 1)
n |
Number of set (as an integer from 1 to 4)).
Default: |
Each set stems from Anscombe's Quartet
(see datasets::anscombe, hence
1 <= n <= 4) and is returned as an
11 x 2 data frame.
See ?datasets:anscombe for details and references.
Other data functions:
make_grid()
get_set(1) plot(get_set(2), col = "red")get_set(1) plot(get_set(2), col = "red")
i2ds_survey contains pre-processed data
from the i2ds online survey.
i2ds_surveyi2ds_survey
On 2026-05-04, this data contains 104 participants (rows) and 119 variables (columns).
Prefix codes
Many variable names have prefixes that indicate a particular type of variable:
rv: A random variable
c(#): A choice variable (with # alternatives)
t: A text variable (with any input)
tn: A text variable (with numeric input)
crs: A course-related variable
combined: A composite variable created by averaging either 4 or 5 individual Likert-scale items. Depending on the item set, the resulting score was normalized (i.e., divided by 4 or 5), and stored as a new variable.
List of variables
After pre-processing the raw data and re-arranging its variables (columns),
the variable names and their contents in the i2ds_survey tibble are as follows:
Key person-related variables:
c4_gender A categorical (character) variable indicating the participant’s gender identity,
with possible values including "female", "male", "non-binary" or "do not wish to respond".
This variable is used for demographic analysis.
tn_year A numeric (double) variable indicating the year of birth (e.g., 1999, 2000, 2001, etc.).
tn_month A numeric (double) variable indicating the participant’s birth month (1–12).
This variable also supports demographic profiling.
tn_day A numeric (double) variable indicating the day of birth provided by the participant (1–31).
Used for demographic purposes and potential exploratory analyses.
DOB-related variables can be used to calculate age and analyze age-related trends.
t_height A character variable indicating a participant's self-described height, using various formats and units
(e.g., "1.80", "180 cm", "1,80m", or "5'11"). This variable requires pre-processing for analysis.
t_pid An optional character variable capturing a participant ID, pseudonym, or other identifying entry.
This variable allows participants to recognize their own data without disclosing their identity.
Variables indicating informed consent and willingness to share data:
c2_informed_consent A logical variable indicating whether the participant provided informed consent before starting the study
(TRUE = consent provided, FALSE = no consent provided).
This variable is a pre-requisite for ethical compliance (i.e., should be TRUE for all participants).
c2_use_data_2 A logical variable indicating whether a participant still agrees to allow their data to be shared after having finished the survey
(TRUE = consent provided, FALSE = no consent provided).
This variable is a pre-requisite for data re-usability in research (and should be TRUE for all cases included here).
Variables indicating course membership:
crs_i2ds_1 A logical variable indicating whether a participant is currently enrolled in the course Introduction to Data Science 1: Basics (i2ds 1: TRUE = enrolled).
crs_i2ds_2 A logical variable indicating whether a participant is enrolled in the course Introduction to Data Science 2: Applications (i2ds 2: TRUE = enrolled).
crs_ds4psy A logical variable indicating whether a participant is enrolled in the course Data Science for Psychology (ds4psy: TRUE = enrolled).
crs_viz A logical variable indicating whether a participant is enrolled in the course Data visualization for psychology research (vis4psy: TRUE = enrolled).
crs_diff_kn A logical variable indicating whether a participant is enrolled in a different course at the University of Konstanz (TRUE = yes).
crs_diff_else A logical variable indicating whether a participant is enrolled in a course not at the University of Konstanz (TRUE = yes).
This variable helps identifying external learners.
crs_self_study A logical variable indicating whether a participant is engaging with course materials without formal enrollment (TRUE = yes).
This variable reflects informal learning engagement.
crs_only_study A logical variable indicating whether a participant is taking the survey only, without engaging with course materials (TRUE = yes).
This variable identifies participants not studying R or data science.
t_crs_other A character variable capturing free-text input describing any other course a participant is taking.
v_crs_other_dept A character variable indicating the department of the other course(s) mentioned in t_crs_other.
This variable may facilitate grouping participants by academic discipline.
Variables indicating (randomized) survey conditions:
rv_anchor_high_low A randomized (character) variable that indicates whether a person is to keep a relatively large or small number in memory (i.e., assignment to either 242 or 42, respectively). This manipulation is used to examine anchoring effects on later responses.
rv_scale_randomization A randomized (character) variable that indicates whether a person was asked to rate their personality (from "serious" to "humorous") on a 4-point or on a 5-point Likert scale.
The variable controls for the influence of scale granularity on ratings.
rv_barnum_pos_neg A randomized (character) variable that indicates whether the participant is to receive a positive or negative Barnum statement ("positive" vs. "negative"). This is used to measure sensitivity to vague or generic personality feedback.
rv_sc_false_dicho_3 A randomized (character) variable indicating which version of the scale is to be shown: a dichotomous comparison between admiration vs. respect, fear vs. love, admiration vs. love and fear, or a single undivided scale (values: "admir_resp" "fear_love", "admir_love" fear_resp", "single_scale"). Used to examine how scale format affects evaluative judgments.
rv_wait_time A randomized (character) variable that indicates whether the participant waited 10 seconds ("short") or 30 seconds ("long") before continuing. This manipulation aims to examine whether a longer waiting period increases the perceived credibility or value of a following personality feedback, in line with mechanisms underlying the Barnum effect.
rv_political_orientation A randomized (character) variable indicating the order in which the two political orientation scales ("left–right" and "liberal–conservative") were presented. Possible values include "left_right, lib_cons", "left_cons, lib_right", etc. This variable is used to control for potential order effects in political self-placement tasks.
rv_thinkingstyle A randomized (character) variable that indicates the order in which pairs of thinking styles are to be presented ("deliberative vs. intuitive"; "reflective vs. spontaneous";" deliberative vs. spontaneous";"reflective vs. Intuitive"). The order is counterbalanced to reduce presentation bias in self-assessment tasks.
Binary choices on art preference:
c2_img_sel_1 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 1.
The binary variable indicates the participant's image preference:
1 corresponds to the cubist painting Les Baigneurs (the bathers), by Roger de La Fresnaye, 1912
2 corresponds to the expressionist painting Badende Mädchen (bathing girls), by August Macke, 1913
c2_img_sel_2 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 2.
The binary variable indicates the participant's image preference:
1 corresponds to the cubist painting Le Gouter (the taster, aka. tea time), by Jean Metzinger, 1911
2 corresponds to the expressionist painting La petite Jeanne, by Amedeo Modigliani, 1909
c2_img_sel_3 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 3.
The binary variable indicates the participant's image preference:
1 corresponds to the cubist painting Edtaonisl Ecclesiastic (the 1st word being an acronym made by alternating the French words for 'star' and 'dance'), by Francis Picabia, 1913
2 corresponds to the impressionist painting Femme avec parasol dans un jardin (woman with parasol in a garden), by Pierre-Auguste Renoir, 1875
c2_img_sel_4 A numeric (double) variable that represents the participant's preferred choice between 2 images in choice Set 4.
The binary variable indicates the participant's image preference:
1 corresponds to the expressionist painting Solitude, by Alexej von Jawlensky, 1912
2 corresponds to the impressionist painting Pont dans le Jardin de Monet (bridge in Monet’s garden), by Claude Monet, 1895–96
Variables describing habits and preferences:
c7_eating_habits A categorical (character) variable that indicates which dietary lifestyle an individual assigns to itself
(1 = "vegetarian"; 2 = "omnivore"; 3 = "vegan"; 4 = "pescetarian"; 5 = "flexitarian"; 6 = "carnivore"; 7 = "other").
t_eating_habits_other A character variable intended to capture free-text input for other dietary descriptions;
usually NA unless "other" was selected. May appear as logical if no responses were entered.
c7_apple A numeric (double) variable indicating how much a participant likes apples
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
c7_cherry A numeric (double) variable indicating how much a participant likes cherries
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
c7_broccoli A numeric (double) variable indicating how much a participant likes broccoli
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
c7_asparagus A numeric (double) variable indicating how much a participant likes asparagus
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
c7_spinach A numeric (double) variable indicating how much a participant likes spinach
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
c7_mud A numeric (double) variable indicating how much a participant likes mud
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
c7_banana A numeric (double) variable indicating how much a participant likes bananas
on a 1-7 ranking scale (1 = highest preference, 7 = lowest preference, 0 if not ranked).
Note: Variables c7_apple to c7_banana were derived from a sorting/ranking task in which each participant sorted/ranked food items by preference.
Each item was subsequently coded as a numeric value between 1 and 7 (0 if not ranked).
Responses to binary choice items:
c2_decsleep_instant A categorical (character) variable indicating whether a participant prefers to sleep
before making important decisions ("sleep") or to make them instantly ("instant").
c2_shopperson_online A categorical (character) variable indicating whether a participant prefers shopping in person ("person") or online ("online").
c2_town_city A categorical (character) variable indicating whether a participant prefers living in a town ("town") or in a city ("city").
c2_club_house A categorical (character) variable indicating whether a participant prefers to party in a club ("club") or to attend an house party ("house").
c2_hotel_camping A categorical (character) variable capturing a participant's preference for staying in a hotel ("hotel") versus going camping ("camping").
c2_photo_being A categorical (character) variable indicating whether a participant prefers photographing ("photo") or being in a moment ("being").
c2_spring_fall A categorical (character) variable indicating whether a participant prefers the spring season ("spring") or the fall/autumn season ("fall").
c2_beach_mount A categorical (character) variable reflecting whether a participant prefers the beach ("beach") or the mountains ("mount").
c2_cats_dogs A categorical (character) variable indicating preference for cats ("cats") versus dogs ("dogs").
c2_indiv_team A categorical (character) variable indicating whether a participant prefers individual ("indiv") or team sports ("team").
c2_movies_books A categorical (character) variable indicating a participant's preference for movies ("movies") or books ("books").
c2_board_video A categorical (character) variable indicating whether a participant prefers board games ("board") or video games ("video").
c2_ios_android A categorical (character) variable indicating whether a participant prefers iOS ("ios") or Android ("android") as a mobile operating system.
c2_text_voice A categorical (character) variable indicating whether a participant prefers texting ("text") or sending voice messages ("voice").
c2_cook_bake A categorical (character) variable indicating whether a participant prefers cooking ("cook") or baking ("bake").
c2_pinapple_no A categorical (character) variable that records whether a participant likes pineapple on pizza ("yes") or not ("no").
c2_ketchup_mayo A categorical (character) variable indicating whether a participant prefers ketchup ("ketchup") or mayonnaise ("mayo").
c2_coffee_tea A categorical (character) variable indicating whether a participant prefers coffee ("coffee") or tea ("tea").
c2_math_lang A categorical (character) variable indicating whether a participant prefers mathematics ("math") or language-related subjects ("lang").
c2_odd_even A categorical (character) variable indicating whether a participant prefers odd numbers ("odd") or even numbers ("even").
c3_diff_bin A categorical (character) variable indicating how difficult it was for a participant to make their previous preference decisions (items 22–41) .
Response options include "yes", "a little", and "no".
This item captures perceived decisional difficulty and may serve as an indicator of response certainty, thinking style, or task engagement.
Variables on political opinions:
politics_left A numeric (double) variable representing the participant’s self-placement on a left–right political spectrum.
Values range from 1 (left) to 6 (right).
politics_liberal A numeric (double) variable representing self-placement on a liberal to conservative scale, ranging from 1 (liberal) to 6 (conservative).
Miscellaneous estimates, choices, opinions, and preferences:
tn_estimate_sun A numeric (double) variable capturing the participant’s estimate of how many times larger the sun’s diameter is compared to that of the earth.
This item serves as a manipulation check for the anchoring effect, based on previously presented numeric anchors (e.g., 42 or 242).
t_att_check_1 A character variable containing the participant’s open-text response to an attention check prompt ("Please type: 'I read the instructions'").
This attention check allows detecting inattentive or automated responses.
c2_fly_invisible A categorical (character) variable indicating whether the participant would prefer the superpower of flying ("fly") or becoming invisible ("invisible").
t_fly_invisible_explain A character variable where participants explain their choice between flying and invisibility.
This free text answer allows for qualitative analysis of a participant's justifications and motivations.
combined_c_ser_hum_self A numeric (double) variable reflecting a participant’s self-assessment on a "serious vs. humorous" scale.
The score is based on a 4-point or 5-point Likert scale, depending on random assignment.
This variable is used to test how perspective (self vs. others) and scale format (presence vs. absence of a middle option) influences self-ratings.
combined_c_ser_hum_others A combined numeric (double) variable reflecting how humorous or serious participants believe others to perceive them.
This score is derived from either a 4-point or 5-point scale and is used to examine the effect of perspective and scale design on perceived external ratings.
c4_chronotype A categorical (character) variable indicating whether the participant identifies as a
morning person ("morning"), evening person ("evening") mid-day person ("mid-day") or a never person ("never").
tn_sleep A numeric (double) variable indicating the typical number of hours the participant typically sleeps per night.
tn_bedtime A character variable representing the participant’s usual bedtime,
to be entered in 24-hour format (e.g., "22:30", "00:00").
tn_anchor_recall_1 A numeric (double) variable recording the number (either 42 or 242) that the participant was previously asked to memorize and later recall.
It is used to test memory for the anchor manipulation.
combined_admired A combined numeric (double) variable reflecting how much a participant wants to be admired by others,
rated on a 1–6 Likert scale (1 = not at all, 6 = very much).
combined_feared A combined numeric (double) variable reflecting how much a participant wants to be feared by others,
rated on a 1–6 Likert scale (1 = not at all, 6 = very much).
combined_loved A combined numeric (double) variable reflecting how much a participant wants to be loved by others,
rated on a 1–6 Likert scale (1 = not at all, 6 = very much).
combined_respected A combined numeric (double) variable reflecting how much a participant wants to be respected by others,
rated on a 1–6 Likert scale (1 = not at all, 6 = very much).
c7_pess_opti A numeric (double) variable capturing a participant’s self-rated tendency toward pessimism versus optimism,
on a 7-point scale (1 = very pessimistic, 7 = very optimistic).
c7_story_list A numeric (double) variable indicating how much a participant enjoys listening to or reading stories,
rated from 1 (not at all) to 7 (very much).
c7_stab_adv A numeric (double) variable indicating a participant’s self-assessed position on a stability versus adventurousness spectrum,
rated on a scale from 1 (very stable) to 7 (very adventurous).
This variable may indicate personality traits related to risk-taking.
think_reflect A numeric (double) variable representing a participant’s placement on a bipolar scale ranging from 1 ("reflective") to 6 (either "spontaneous" or " intuitive").
The specific version of the 2nd scale anchor is randomly assigned.
think_delib A numeric (double) variable representing a participant’s placement on a bipolar scale ranging from 1 ("deliberative") to 6 (either "intuitive" or " spontaneous".
The specific version of the 2nd scale anchor is randomly assigned.
c4_intro_extrovert A categorical (character) variable indicating a participant's self-rated social orientation:
"introverted", "extroverted", or mixed variants such as "extro-intro" or "intro-extro".
tn_favorit_number A numeric (double) variable capturing a participant’s favorite number,
in free answer format.
c3_cutlery A categorical (character) variable indicating which piece of cutlery a participant most identifies with.
The 3 possible values include "knife", "fork", and "spoon".
c3_rock_paper_scissors A categorical (character) variable capturing a participant's selection in a rock–paper–scissors scenario.
The 3 possible values are "rock", "paper", or "scissors".
c5_att_check_2 A numeric (double) variable used as an attention check.
Participants were asked to select the number that most resembles the shape of a circle.
The correct response is 0, which corresponds to scale option 5.
Responses deviating from this may indicate inattentiveness.
c6_barnum_accuracy A numeric (double) variable indicating how accurately a participant rated a generic personality description (i.e., a Barnum statement),
on a scale from 1 (poor) to 6 (perfect).
This variable is used to assess susceptibility to the so-called Barnum effect (i.e., the tendency to perceive vague and general statements as highly accurate).
t_anchor_recall_2 A numeric (double) variable recording whether a participant correctly remembered a previously presented number (either 42 or 242).
This assesses memory and anchoring manipulation success (for a 2nd time).
Other person-related variables:
c9_occupation A categorical (character) variable indicating a participant’s current occupational status
(e.g., "student", "employed", "other"). This variable may be used for demographic segmentation.
t_occupation_other A logical variable for free-text input if a participant selected "other" for occupation.
This variable captures detailed occupational descriptions not covered by the pre-defined options.
c7_education A categorical (character) variable indicating a participant’s highest completed education level
(e.g., "high school", "bachelor", "master"). This variable may be used for demographic segmentation.
t_education_other A logical variable to allow participants to enter their education level in free text (if "other" was selected).
This variable enables open-format responses for less common education paths.
c3_current_degree A categorical (character) variable indicating the type of academic degree a participant is currently pursuing (e.g, "bachelor", "master").
This variable provides educational context for other academic measures.
tn_semester A numeric (double) variable indicating the current semester of study reported by a participant (e.g., 1, 6, 10).
This variable helps contextualize course experience and academic progress.
c14_studyfield A categorical (character) variable indicating the participant’s field of study (e.g., "psychology", "data science").
This variable is used to examine field-specific attitudes and skills.
t_studyfield_other A character variable capturing free-text responses if the participant selected "other" as their study field.
This variable allows classification of less common disciplines.
Preferences for course contents:
c5_pref_stats A numeric (double) variable indicating a participant’s interest in preparing data for statistical analysis,
rated on a scale from 1 (no interest) to 5 (absolutely essential).
c5_pref_visualize A numeric (double) variable indicating a participant's interest in data visualization in R,
rated on a scale from 1 (no interest) to 5 (absolutely essential).
c5_pref_sims A numeric (double) variable indicating a participant’s interest in using R for simulations and modeling,
rated on a scale from 1 (no interest) to 5 (absolutely essential).
c5_pref_shiny A numeric (double) variable capturing how essential a participant considers learning to build interactive web applications using R Shiny.
Responses range from 1 (no interest) to 5 (absolutely essential).
c5_pref_scrape A numeric (double) variable capturing how essential a participant considers learning web scraping with R.
Responses range from 1 (no interest) to 5 (absolutely essential).
c5_pref_arts A numeric (double) variable capturing how essential a participant considers exploring artistic or creative aspects of data science (e.g., generative art in R).
Responses range from 1 (no interest) to 5 (absolutely essential).
Course-related expectations and worries:
t_crs_expect_i2ds_1 A character variable containing free-text input describing a participant’s expectations and hopes for the course Introduction to Data Science 1: Basics (i2ds 1).
t_crs_worry_i2ds_1 A character variable capturing free-text responses describing a participant’s worries and reservations related to the course Introduction to Data Science 1: Basics (i2ds 1).
t_crs_expect_i2ds_2 A character variable containing free-text input describing a participant’s expectations and hopes for the course Introduction to Data Science 2: Applications (i2ds 2).
t_crs_worry_i2ds_2 A character variable capturing free-text input describing a participant’s worries and reservations concerns related to the course Introduction to Data Science 2: Applications (i2ds 2).
t_crs_expect_ds4psy A logical variable containing free-text input describing a participant’s expectations and hopes for the course Data Science for Psychology (ds4psy).
t_crs_worry_ds4psy A logical variable describing a participant’s worries and reservations regarding the course Data Science for Psychology (ds4psy), in free text format.
t_crs_expect_viz A logical variable containing free-text input describing a participant’s expectations and hopes for the course Data visualization for psychology research (vis4psy).
t_crs_worry_viz A logical variable describing a participant’s worries and reservations regarding the course Data visualization for psychology research (vis4psy), in free text format.
Variables on prior experience:
c6_exp_math A numeric (double) variable indicating a participant’s self-assessed experience with mathematics,
rated on a scale from 1 (no experience) to 6 (extremely experienced).
c6_exp_statistics A numeric (double) variable measuring a participant’s self-assessed experience with statistics,
rated on a scale from 1 (no experience) to 6 (extremely experienced).
c6_exp_program A numeric (double) variable indicating a participant’s experience with programming (any programming language),
rated on a scale from 1 (no experience) to 6 (extremely experienced).
c6_exp_r A numeric (double) variable indicating a participant’s experience with R programming,
rated on a scale from 1 (no experience) to 6 (extremely experienced).
c6_exp_datavisual A numeric (double) variable capturing a participant’s prior experience with data visualization,
rated on a scale from 1 (no experience) to 6 (extremely experienced).
Survey feedback:
t_feedback An optional character variable containing general feedback provided by the participant regarding the survey or course.
This is an open-ended text field for final comments, impressions, or suggestions.
Session info:
referer URL of referring page.
datetime Date and time of initial survey access.
duration Session duration (in seconds).
date_of_last_access Date and time of final survey access.
See the codebook and print version for additional coding details.
Missing values are represented as NA values in the data.
These can be due to a participant not providing a response to an item or to an item not being applicable to this participant.
See online survey at https://ww3.unipark.de/uc/i2ds_survey/.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
invert_rules allows decoding messages that were
encoded by a set of rules x.
invert_rules(x)invert_rules(x)
x |
The rules used for encoding a message (as a named vector). |
x is assumed to be a named vector.
invert_rules replaces the elements of x
by the names of x, and vice versa.
A message is issued if the elements of x are repeated
(i.e., decoding is non-unique).
A character vector.
transl33t for encoding text (e.g., into leet slang);
l33t_rul35 for default rules used.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
invert_rules(l33t_rul35) # Note repeated elements # Encoding and decoding a message: (txt_0 <- "Hello world! How are you doing today?") # message (txt_1 <- transl33t(txt_0, rules = l33t_rul35)) # encoding (txt_2 <- transl33t(txt_1, rules = invert_rules(l33t_rul35))) # decodinginvert_rules(l33t_rul35) # Note repeated elements # Encoding and decoding a message: (txt_0 <- "Hello world! How are you doing today?") # message (txt_1 <- transl33t(txt_0, rules = l33t_rul35)) # encoding (txt_2 <- transl33t(txt_1, rules = invert_rules(l33t_rul35))) # decoding
is_equal tests if two vectors x and y are pairwise equal.
is_equal(x, y, ...)is_equal(x, y, ...)
x |
1st vector to compare (required). |
y |
2nd vector to compare (required). |
... |
Other parameters (passed to |
If both x and y are numeric,
is_equal calls num_equal(x, y, ...)
(allowing for a tolerance threshold tol).
Otherwise, x and y are compared by x == y.
is_equal provides a wrapper around num_equal
(for numeric objects x and y) and == (otherwise).
num_equal function for comparing numeric vectors;
all.equal function of the R base package;
near of the dplyr package.
Other numeric functions:
base2dec(),
base_digits,
dec2base(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
Other utility functions:
base2dec(),
base_digits,
dec2base(),
is_vect(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
# numeric data: is_equal(2, sqrt(2)^2) is_equal(2, sqrt(2)^2, tol = 0) is_equal(c(2, 3), c(sqrt(2)^2, sqrt(3)^2, 4/2, 9/3)) # other data types: is_equal((1:3 > 1), (1:3 > 2)) # logical is_equal(c("A", "B", "c"), toupper(c("a", "b", "c"))) # character is_equal(as.Date("2023-10-30"), Sys.Date()) # dates # factors: is_equal((1:3 > 1), as.factor((1:3 > 2))) is_equal(c(1, 2, 3), as.factor(c(1, 2, 3))) is_equal(c("A", "B", "C"), as.factor(c("A", "B", "C")))# numeric data: is_equal(2, sqrt(2)^2) is_equal(2, sqrt(2)^2, tol = 0) is_equal(c(2, 3), c(sqrt(2)^2, sqrt(3)^2, 4/2, 9/3)) # other data types: is_equal((1:3 > 1), (1:3 > 2)) # logical is_equal(c("A", "B", "c"), toupper(c("a", "b", "c"))) # character is_equal(as.Date("2023-10-30"), Sys.Date()) # dates # factors: is_equal((1:3 > 1), as.factor((1:3 > 2))) is_equal(c(1, 2, 3), as.factor(c(1, 2, 3))) is_equal(c("A", "B", "C"), as.factor(c("A", "B", "C")))
is_leap_year checks whether a given year
(provided as a date or time dt,
or number/string denoting a 4-digit year)
lies in a so-called leap year (i.e., a year containing a date of Feb-29).
is_leap_year(dt)is_leap_year(dt)
dt |
Date or time (scalar or vector). Numbers or strings with dates are parsed into 4-digit numbers denoting the year. |
When dt is not recognized as "Date" or "POSIXt" object(s),
is_leap_year aims to parse a string dt
as describing year(s) in a "dddd" (4-digit year) format,
as a valid "Date" string (to retrieve the 4-digit year "%Y"),
or a numeric dt as 4-digit integer(s).
is_leap_year then solves the task
by verifying the numeric definition of a "leap year"
(see https://en.wikipedia.org/wiki/Leap_year).
An alternative solution that tried using
as.Date() for defining a "Date" of Feb-29
in the corresponding year(s) was removed,
as it evaluated NA values as FALSE.
Boolean vector.
See https://en.wikipedia.org/wiki/Leap_year for definition.
days_in_month for the number of days in given months;
diff_tz for time zone-based time differences;
leap_year function of the lubridate package.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
is_leap_year(2020) (days_this_year <- 365 + is_leap_year(Sys.Date())) # from dates: is_leap_year(Sys.Date()) is_leap_year(as.Date("2022-02-28")) # from times: is_leap_year(Sys.time()) is_leap_year(as.POSIXct("2022-10-11 10:11:12")) is_leap_year(as.POSIXlt("2022-10-11 10:11:12")) # from non-integers: is_leap_year(2019.5) # For vectors: is_leap_year(2020:2028) # with dt as strings: is_leap_year(c("2020", "2021")) is_leap_year(c("2020-02-29 01:02:03", "2021-02-28 01:02")) # Note: Invalid date string yields error: # is_leap_year("2021-02-29")is_leap_year(2020) (days_this_year <- 365 + is_leap_year(Sys.Date())) # from dates: is_leap_year(Sys.Date()) is_leap_year(as.Date("2022-02-28")) # from times: is_leap_year(Sys.time()) is_leap_year(as.POSIXct("2022-10-11 10:11:12")) is_leap_year(as.POSIXlt("2022-10-11 10:11:12")) # from non-integers: is_leap_year(2019.5) # For vectors: is_leap_year(2020:2028) # with dt as strings: is_leap_year(c("2020", "2021")) is_leap_year(c("2020-02-29 01:02:03", "2021-02-28 01:02")) # Note: Invalid date string yields error: # is_leap_year("2021-02-29")
is_vect tests if x is a vector.
is_vect(x)is_vect(x)
x |
Vector(s) to test (required). |
is_vect does what the base R function is.vector is not designed to do:
is_vect() returns TRUE if x is an atomic vector or a list (irrespective of its attributes).
is.vector() returns TRUE if x is a vector of the specified mode having no attributes other than names, otherwise FALSE.
Internally, the function is a wrapper for is.atomic(x) | is.list(x).
Note that data frames are also vectors.
See the is_vector function of the purrr package
and the base R functions
is.atomic, is.list, and is.vector,
for details.
is_vect function of the purrr package;
is.atomic function of the R base package;
is.list function of the R base package;
is.vector function of the R base package.
Other utility functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_wholenumber(),
num_as_char(),
num_as_ordinal(),
num_equal()
# Define 3 types of vectors: v1 <- 1:3 # (a) atomic vector names(v1) <- LETTERS[v1] # with names v2 <- v1 # (b) copy vector attr(v2, "my_attr") <- "foo" # add an attribute ls <- list(1, 2, "C") # (c) list # Compare: is.vector(v1) is.list(v1) is_vect(v1) is.vector(v2) # FALSE is.list(v2) is_vect(v2) # TRUE is.vector(ls) is.list(ls) is_vect(ls) # Data frames are also vectors: df <- as.data.frame(1:3) is_vect(df) # is TRUE# Define 3 types of vectors: v1 <- 1:3 # (a) atomic vector names(v1) <- LETTERS[v1] # with names v2 <- v1 # (b) copy vector attr(v2, "my_attr") <- "foo" # add an attribute ls <- list(1, 2, "C") # (c) list # Compare: is.vector(v1) is.list(v1) is_vect(v1) is.vector(v2) # FALSE is.list(v2) is_vect(v2) # TRUE is.vector(ls) is.list(ls) is_vect(ls) # Data frames are also vectors: df <- as.data.frame(1:3) is_vect(df) # is TRUE
is_wholenumber tests if x contains only integer numbers.
is_wholenumber(x, tol = .Machine$double.eps^0.5)is_wholenumber(x, tol = .Machine$double.eps^0.5)
x |
Number(s) to test (required, accepts numeric vectors). |
tol |
Numeric tolerance value.
Default: |
is_wholenumber does what the base R function is.integer is not designed to do:
is_wholenumber() returns TRUE or FALSE depending on whether its numeric argument x is an integer value (i.e., a "whole" number).
is.integer() returns TRUE or FALSE depending on whether its argument is of integer type, and FALSE if its argument is a factor.
See the documentation of is.integer for definition and details.
is.integer function of the R base package.
Other numeric functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
num_as_char(),
num_as_ordinal(),
num_equal()
Other utility functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_vect(),
num_as_char(),
num_as_ordinal(),
num_equal()
is_wholenumber(1) # is TRUE is_wholenumber(1/2) # is FALSE x <- seq(1, 2, by = 0.5) is_wholenumber(x) # Compare: is.integer(1+2) is_wholenumber(1+2)is_wholenumber(1) # is TRUE is_wholenumber(1/2) # is FALSE x <- seq(1, 2, by = 0.5) is_wholenumber(x) # Compare: is.integer(1+2) is_wholenumber(1+2)
l33t_rul35 specifies rules for translating characters
into other characters (typically symbols) to mimic
leet/l33t slang (as a named character vector).
l33t_rul35l33t_rul35
An object of class character of length 13.
Old (i.e., to be replaced) characters are
paste(names(l33t_rul35), collapse = "").
New (i.e., replaced) characters are
paste(l33t_rul35, collapse = "").
See https://en.wikipedia.org/wiki/Leet for details.
transl33t for a corresponding function.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
make_grid generates a grid of x/y coordinates and returns it
(as a data frame).
make_grid(x_min = 0, x_max = 2, y_min = 0, y_max = 1)make_grid(x_min = 0, x_max = 2, y_min = 0, y_max = 1)
x_min |
Minimum x coordinate.
Default: |
x_max |
Maximum x coordinate.
Default: |
y_min |
Minimum y coordinate.
Default: |
y_max |
Maximum y coordinate.
Default: |
Other data functions:
get_set()
make_grid() make_grid(x_min = -3, x_max = 3, y_min = -2, y_max = 2)make_grid() make_grid(x_min = -3, x_max = 3, y_min = -2, y_max = 2)
map_text_chars parses text
(from a text string x)
into a table that contains a row for each character
and x/y-coordinates corresponding to the character positions in x.
map_text_chars(x, flip_y = FALSE)map_text_chars(x, flip_y = FALSE)
x |
The text string(s) to map (required).
If |
flip_y |
Boolean: Should y-coordinates be flipped,
so that the lowest line in the text file becomes |
map_text_chars creates a data frame with 3 variables:
Each character's x- and y-coordinates (from top to bottom)
and a variable char for the character at these coordinates.
Note that map_text_chars was originally a part of
read_ascii, but has been separated to
enable independent access to separate functionalities.
Note that map_text_chars is replaced by the simpler
map_text_coord function.
A data frame with 3 variables:
Each character's x- and y-coordinates (from top to bottom)
and a variable char for the character at this coordinate.
read_ascii for parsing text from file or user input;
plot_chars for a character plotting function.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
map_text_coord parses text (from a text string x)
into a table that contains a row for each character
and x/y-coordinates corresponding to the character positions in x.
map_text_coord(x, flip_y = FALSE, sep = "")map_text_coord(x, flip_y = FALSE, sep = "")
x |
The text string(s) to map (required).
If |
flip_y |
Boolean: Should y-coordinates be flipped,
so that the lowest line in the text file becomes |
sep |
Character to insert between the elements
of a multi-element character vector as input |
map_text_coord creates a data frame with 3 variables:
Each character's x- and y-coordinates (from top to bottom)
and a variable char for the character at these coordinates.
Note that map_text_coord was originally a part of
read_ascii, but has been separated to
enable independent access to separate functionalities.
A data frame with 3 variables:
Each character's x- and y-coordinates (from top to bottom)
and a variable char for the character at this coordinate.
map_text_regex for mapping text to a character table and matching patterns;
plot_charmap for plotting character maps;
plot_chars for creating and plotting character maps;
read_ascii for parsing text from file or user input.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
map_text_coord("Hello world!") # 1 line of text map_text_coord(c("Hello", "world!")) # 2 lines of text map_text_coord(c("Hello", " ", "world!")) # 3 lines of text ## Read text from file: ## Create a temporary file "test.txt": # cat("Hello world!", "This is a test.", # "Can you see this text?", "Good! Please carry on...", # file = "test.txt", sep = "\n") # txt <- read_ascii("test.txt") # map_text_coord(txt) # unlink("test.txt") # clean up (by deleting file).map_text_coord("Hello world!") # 1 line of text map_text_coord(c("Hello", "world!")) # 2 lines of text map_text_coord(c("Hello", " ", "world!")) # 3 lines of text ## Read text from file: ## Create a temporary file "test.txt": # cat("Hello world!", "This is a test.", # "Can you see this text?", "Good! Please carry on...", # file = "test.txt", sep = "\n") # txt <- read_ascii("test.txt") # map_text_coord(txt) # unlink("test.txt") # clean up (by deleting file).
map_text_regex parses text (from a file or user input)
into a data frame that contains a row for each
character of x.
map_text_regex( x = NA, file = "", lbl_hi = NA, lbl_lo = NA, bg_hi = NA, bg_lo = "[[:space:]]", lbl_rotate = NA, case_sense = TRUE, lbl_tiles = TRUE, col_lbl = "black", col_lbl_hi = pal_ds4psy[[1]], col_lbl_lo = pal_ds4psy[[9]], col_bg = pal_ds4psy[[7]], col_bg_hi = pal_ds4psy[[4]], col_bg_lo = "white", col_sample = FALSE, rseed = NA, angle_fg = c(-90, 90), angle_bg = 0 )map_text_regex( x = NA, file = "", lbl_hi = NA, lbl_lo = NA, bg_hi = NA, bg_lo = "[[:space:]]", lbl_rotate = NA, case_sense = TRUE, lbl_tiles = TRUE, col_lbl = "black", col_lbl_hi = pal_ds4psy[[1]], col_lbl_lo = pal_ds4psy[[9]], col_bg = pal_ds4psy[[7]], col_bg_hi = pal_ds4psy[[4]], col_bg_lo = "white", col_sample = FALSE, rseed = NA, angle_fg = c(-90, 90), angle_bg = 0 )
x |
The text to map or plot (as a character vector).
Different elements denote different lines of text.
If |
file |
A text file to read (or its path).
If |
lbl_hi |
Labels to highlight (as regex).
Default: |
lbl_lo |
Labels to de-emphasize (as regex).
Default: |
bg_hi |
Background tiles to highlight (as regex).
Default: |
bg_lo |
Background tiles to de-emphasize (as regex).
Default: |
lbl_rotate |
Labels to rotate (as regex).
Default: |
case_sense |
Boolean: Distinguish
lower- vs. uppercase characters in pattern matches?
Default: |
lbl_tiles |
Are character labels shown?
This enables pattern matching for (fg) color and
angle aesthetics.
Default: |
col_lbl |
Default color of text labels.
Default: |
col_lbl_hi |
Highlighting color of text labels.
Default: |
col_lbl_lo |
De-emphasizing color of text labels.
Default: |
col_bg |
Default color to fill background tiles.
Default: |
col_bg_hi |
Highlighting color to fill background tiles.
Default: |
col_bg_lo |
De-emphasizing color to fill background tiles.
Default: |
col_sample |
Boolean: Sample color vectors (within category)?
Default: |
rseed |
Random seed (number).
Default: |
angle_fg |
Angle(s) for rotating character labels
matching the pattern of the |
angle_bg |
Angle(s) of rotating character labels
not matching the pattern of the |
map_text_regex allows for regular expression (regex)
to match text patterns and create corresponding variables
(e.g., for color or orientation).
Five regular expressions and corresponding color and angle arguments allow identifying, marking (highlighting or de-emphasizing), and rotating those sets of characters (i.e, their text labels or fill colors). that match the provided patterns.
The plot generated by plot_chars is character-based:
Individual characters are plotted at equidistant x-y-positions
and the aesthetic settings provided for text labels and tile fill colors.
map_text_regex returns a plot description (as a data frame).
Using this output as an input to plot_charmap plots
text in a character-based fashion (i.e., individual characters are
plotted at equidistant x-y-positions).
Together, both functions replace the over-specialized
plot_chars and plot_text functions.
A data frame describing a plot.
map_text_coord for mapping text to a table of character coordinates;
plot_charmap for plotting character maps;
plot_chars for creating and plotting character maps;
plot_text for plotting characters and color tiles by frequency;
read_ascii for reading text inputs into a character string.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
## (1) From text string(s): ts <- c("Hello world!", "This is a test to test this splendid function", "Does this work?", "That's good.", "Please carry on.") sum(nchar(ts)) # (a) simple use: map_text_regex(ts) # (b) matching patterns (regex): map_text_regex(ts, lbl_hi = "\\b\\w{4}\\b", bg_hi = "[good|test]", lbl_rotate = "[^aeiou]", angle_fg = c(-45, +45)) ## (2) From user input: # map_text_regex() # (enter text in Console) ## (3) From text file: # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # # map_text_regex(file = "test.txt") # default # map_text_regex(file = "test.txt", lbl_hi = "[[:upper:]]", lbl_lo = "[[:punct:]]", # col_lbl_hi = "red", col_lbl_lo = "blue") # # map_text_regex(file = "test.txt", lbl_hi = "[aeiou]", col_lbl_hi = "red", # col_bg = "white", bg_hi = "see") # mark vowels and "see" (in bg) # map_text_regex(file = "test.txt", bg_hi = "[aeiou]", col_bg_hi = "gold") # mark (bg of) vowels # # # Label options: # map_text_regex(file = "test.txt", bg_hi = "see", lbl_tiles = FALSE) # map_text_regex(file = "test.txt", angle_bg = c(-20, 20)) # # unlink("test.txt") # clean up (by deleting file).## (1) From text string(s): ts <- c("Hello world!", "This is a test to test this splendid function", "Does this work?", "That's good.", "Please carry on.") sum(nchar(ts)) # (a) simple use: map_text_regex(ts) # (b) matching patterns (regex): map_text_regex(ts, lbl_hi = "\\b\\w{4}\\b", bg_hi = "[good|test]", lbl_rotate = "[^aeiou]", angle_fg = c(-45, +45)) ## (2) From user input: # map_text_regex() # (enter text in Console) ## (3) From text file: # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # # map_text_regex(file = "test.txt") # default # map_text_regex(file = "test.txt", lbl_hi = "[[:upper:]]", lbl_lo = "[[:punct:]]", # col_lbl_hi = "red", col_lbl_lo = "blue") # # map_text_regex(file = "test.txt", lbl_hi = "[aeiou]", col_lbl_hi = "red", # col_bg = "white", bg_hi = "see") # mark vowels and "see" (in bg) # map_text_regex(file = "test.txt", bg_hi = "[aeiou]", col_bg_hi = "gold") # mark (bg of) vowels # # # Label options: # map_text_regex(file = "test.txt", bg_hi = "see", lbl_tiles = FALSE) # map_text_regex(file = "test.txt", angle_bg = c(-20, 20)) # # unlink("test.txt") # clean up (by deleting file).
metachar provides the metacharacters of extended regular expressions
(as a character vector).
metacharmetachar
An object of class character of length 12.
metachar allows illustrating the notion of
meta-characters in regular expressions
(and provides corresponding exemplars).
See ?base::regex for details on regular expressions
and ?"'" for a list of character constants/quotes in R.
cclass for a vector of character classes.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
metachar length(metachar) # 12 nchar(paste0(metachar, collapse = "")) # 12metachar length(metachar) # 12 nchar(paste0(metachar, collapse = "")) # 12
num_as_char converts a number into a character sequence
(of a specific length).
num_as_char(x, n_pre_dec = 2, n_dec = 2, sym = "0", sep = ".")num_as_char(x, n_pre_dec = 2, n_dec = 2, sym = "0", sep = ".")
x |
Number(s) to convert (required, accepts numeric vectors). |
n_pre_dec |
Number of digits before the decimal separator.
Default: |
n_dec |
Number of digits after the decimal separator.
Default: |
sym |
Symbol to add to front or back.
Default: |
sep |
Decimal separator to use.
Default: |
The arguments n_pre_dec and n_dec set a number of desired digits
before and after the decimal separator sep.
num_as_char tries to meet these digit numbers by adding zeros to the front
and end of x. However, when n_pre_dec is lower than the
number of relevant (pre-decimal) digits, all relevant digits are shown.
n_pre_dec also works for negative numbers, but
the minus symbol is not counted as a (pre-decimal) digit.
Caveat: Note that this function illustrates how numbers,
characters, for loops, and paste() can be combined
when writing functions. It is not written efficiently or well.
Other numeric functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_wholenumber(),
num_as_ordinal(),
num_equal()
Other utility functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_vect(),
is_wholenumber(),
num_as_ordinal(),
num_equal()
num_as_char(1) num_as_char(10/3) num_as_char(1000/6) # rounding down: num_as_char((1.3333), n_pre_dec = 0, n_dec = 0) num_as_char((1.3333), n_pre_dec = 2, n_dec = 0) num_as_char((1.3333), n_pre_dec = 2, n_dec = 1) # rounding up: num_as_char(1.6666, n_pre_dec = 1, n_dec = 0) num_as_char(1.6666, n_pre_dec = 1, n_dec = 1) num_as_char(1.6666, n_pre_dec = 2, n_dec = 2) num_as_char(1.6666, n_pre_dec = 2, n_dec = 3) # Note: If n_pre_dec is too small, actual number is kept: num_as_char(11.33, n_pre_dec = 0, n_dec = 1) num_as_char(11.66, n_pre_dec = 1, n_dec = 1) # Note: num_as_char(1, sep = ",") num_as_char(2, sym = " ") num_as_char(3, sym = " ", n_dec = 0) # for vectors: num_as_char(1:10/1, n_pre_dec = 1, n_dec = 1) num_as_char(1:10/3, n_pre_dec = 2, n_dec = 2) # for negative numbers (adding relevant pre-decimals): mix <- c(10.33, -10.33, 10.66, -10.66) num_as_char(mix, n_pre_dec = 1, n_dec = 1) num_as_char(mix, n_pre_dec = 1, n_dec = 0) # Beware of bad inputs: num_as_char(4, sym = "8") num_as_char(5, sym = "99")num_as_char(1) num_as_char(10/3) num_as_char(1000/6) # rounding down: num_as_char((1.3333), n_pre_dec = 0, n_dec = 0) num_as_char((1.3333), n_pre_dec = 2, n_dec = 0) num_as_char((1.3333), n_pre_dec = 2, n_dec = 1) # rounding up: num_as_char(1.6666, n_pre_dec = 1, n_dec = 0) num_as_char(1.6666, n_pre_dec = 1, n_dec = 1) num_as_char(1.6666, n_pre_dec = 2, n_dec = 2) num_as_char(1.6666, n_pre_dec = 2, n_dec = 3) # Note: If n_pre_dec is too small, actual number is kept: num_as_char(11.33, n_pre_dec = 0, n_dec = 1) num_as_char(11.66, n_pre_dec = 1, n_dec = 1) # Note: num_as_char(1, sep = ",") num_as_char(2, sym = " ") num_as_char(3, sym = " ", n_dec = 0) # for vectors: num_as_char(1:10/1, n_pre_dec = 1, n_dec = 1) num_as_char(1:10/3, n_pre_dec = 2, n_dec = 2) # for negative numbers (adding relevant pre-decimals): mix <- c(10.33, -10.33, 10.66, -10.66) num_as_char(mix, n_pre_dec = 1, n_dec = 1) num_as_char(mix, n_pre_dec = 1, n_dec = 0) # Beware of bad inputs: num_as_char(4, sym = "8") num_as_char(5, sym = "99")
num_as_ordinal converts a given (cardinal) number
into an ordinal character sequence.
num_as_ordinal(x, sep = "")num_as_ordinal(x, sep = "")
x |
Number(s) to convert (required, scalar or vector). |
sep |
Decimal separator to use.
Default: |
The function currently only works for the English language and does not accepts inputs that are characters, dates, or times.
Note that the toOrdinal() function of the toOrdinal package works
for multiple languages and provides a toOrdinalDate() function.
Caveat: Note that this function illustrates how numbers,
characters, for loops, and paste() can be combined
when writing functions.
It is instructive, but not written efficiently or well
(see the function definition for an alternative solution
using vector indexing).
toOrdinal() function of the toOrdinal package.
Other numeric functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_wholenumber(),
num_as_char(),
num_equal()
Other utility functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_vect(),
is_wholenumber(),
num_as_char(),
num_equal()
num_as_ordinal(1:4) num_as_ordinal(10:14) # all with "th" num_as_ordinal(110:114) # all with "th" num_as_ordinal(120:124) # 4 different suffixes num_as_ordinal(1:15, sep = "-") # using sep # Note special cases: num_as_ordinal(NA) num_as_ordinal("1") num_as_ordinal(Sys.Date()) num_as_ordinal(Sys.time()) num_as_ordinal(seq(1.99, 2.14, by = .01))num_as_ordinal(1:4) num_as_ordinal(10:14) # all with "th" num_as_ordinal(110:114) # all with "th" num_as_ordinal(120:124) # 4 different suffixes num_as_ordinal(1:15, sep = "-") # using sep # Note special cases: num_as_ordinal(NA) num_as_ordinal("1") num_as_ordinal(Sys.Date()) num_as_ordinal(Sys.time()) num_as_ordinal(seq(1.99, 2.14, by = .01))
num_equal tests if two numeric vectors x and y are pairwise equal
(within a tolerance value 'tol').
num_equal(x, y, tol = .Machine$double.eps^0.5)num_equal(x, y, tol = .Machine$double.eps^0.5)
x |
1st numeric vector to compare (required, assumes a numeric vector). |
y |
2nd numeric vector to compare (required, assumes a numeric vector). |
tol |
Numeric tolerance value.
Default: |
num_equal verifies that x and y are numeric and
then evaluates abs(x - y) < tol.
Thus, num_equal provides a safer way to verify the (near) equality of numeric vectors than ==
(due to possible floating point effects).
is_equal function for generic vectors;
all.equal function of the R base package;
near function of the dplyr package.
Other numeric functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_wholenumber(),
num_as_char(),
num_as_ordinal()
Other utility functions:
base2dec(),
base_digits,
dec2base(),
is_equal(),
is_vect(),
is_wholenumber(),
num_as_char(),
num_as_ordinal()
num_equal(2, sqrt(2)^2) # Recycling: num_equal(c(2, 3), c(sqrt(2)^2, sqrt(3)^2, 4/2, 9/3)) # Contrast: .1 == .3/3 num_equal(.1, .3/3) # Contrast: v <- c(.9 - .8, .8 - .7, .7 - .6, .6 - .5, .5 - .4, .4 - .3, .3 - .2, .2 -.1, .1) unique(v) .1 == v num_equal(.1, v)num_equal(2, sqrt(2)^2) # Recycling: num_equal(c(2, 3), c(sqrt(2)^2, sqrt(3)^2, 4/2, 9/3)) # Contrast: .1 == .3/3 num_equal(.1, .3/3) # Contrast: v <- c(.9 - .8, .8 - .7, .7 - .6, .6 - .5, .5 - .4, .4 - .3, .3 - .2, .2 -.1, .1) unique(v) .1 == v num_equal(.1, v)
outliers is a fictitious dataset containing the id, sex, and height
of 1000 non-existing, but otherwise normal people.
outliersoutliers
A table with 100 cases (rows) and 3 variables (columns).
Codebook
Participant ID (as character code)
Gender (female vs. male)
Height (in cm)
See CSV data at http://rpository.com/ds4psy/data/out.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
pal_ds4psy provides a dedicated color palette.
pal_ds4psypal_ds4psy
An object of class data.frame with 1 rows and 11 columns.
By default, pal_ds4psy is based on
pal_unikn of the unikn package.
Other color objects and functions:
pal_n_sq()
pal_n_sq returns n^2 dedicated colors of a color palette pal
(up to a maximum of n = "all" colors).
pal_n_sq(n = "all", pal = pal_ds4psy)pal_n_sq(n = "all", pal = pal_ds4psy)
n |
The desired number colors of pal (as a number)
or the character string |
pal |
A color palette (as a data frame).
Default: |
Use the more specialized function unikn::usecol for choosing
n dedicated colors of a known color palette.
plot_tiles to plot tile plots.
Other color objects and functions:
pal_ds4psy
pal_n_sq(1) # 1 color: seeblau3 pal_n_sq(2) # 4 colors pal_n_sq(3) # 9 colors (5: white) pal_n_sq(4) # 11 colors (6: white)pal_n_sq(1) # 1 color: seeblau3 pal_n_sq(2) # 4 colors pal_n_sq(3) # 9 colors (5: white) pal_n_sq(4) # 11 colors (6: white)
pi_100k is a dataset containing the first 100k digits of pi.
pi_100kpi_100k
A character of nchar(pi_100k) = 100001.
See TXT data at http://rpository.com/ds4psy/data/pi_100k.txt.
Original data at http://www.geom.uiuc.edu/~huberty/math5337/groupe/digits.html.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
plot_charmap plots a character map and some aesthetics
as a tile plot with text labels (using ggplot2).
plot_charmap( x = NA, file = "", lbl_tiles = TRUE, col_lbl = "black", angle = 0, cex = 3, fontface = 1, family = "sans", col_bg = "grey80", borders = FALSE, border_col = "white", border_size = 0.5 )plot_charmap( x = NA, file = "", lbl_tiles = TRUE, col_lbl = "black", angle = 0, cex = 3, fontface = 1, family = "sans", col_bg = "grey80", borders = FALSE, border_col = "white", border_size = 0.5 )
x |
A character map, as generated by
|
file |
A text file to read (or its path).
If |
lbl_tiles |
Add character labels to tiles?
Default: |
col_lbl |
Default color of text labels
(unless specified as a column |
angle |
Default angle of text labels
(unless specified as a column of |
cex |
Character size (numeric).
Default: |
fontface |
Font face of text labels (numeric).
Default: |
family |
Font family of text labels (name).
Default: |
col_bg |
Default color to fill background tiles
(unless specified as a column |
borders |
Boolean: Add borders to tiles?
Default: |
border_col |
Color of tile borders.
Default: |
border_size |
Size of tile borders.
Default: |
plot_charmap is based on plot_chars.
As it only contains the plotting-related parts,
it assumes a character map generated by
map_text_regex as input.
The plot generated by plot_charmap is character-based:
Individual characters are plotted at equidistant x-y-positions
and aesthetic variables are used for text labels and tile fill colors.
A plot generated by ggplot2.
plot_chars for creating and plotting character maps;
plot_text for plotting characters and color tiles by frequency;
map_text_regex for mapping text to a character table and matching patterns;
map_text_coord for mapping text to a table of character coordinates;
read_ascii for reading text inputs into a character string;
pal_ds4psy for default color palette.
Other plot functions:
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# (0) Prepare: ts <- c("Hello world!", "This is a test to test this splendid function", "Does this work?", "That's good.", "Please carry on.") sum(nchar(ts)) # (1) From character map: # (a) simple: cm_1 <- map_text_coord(x = ts, flip_y = TRUE) plot_charmap(cm_1) # (b) pattern matching (regex): cm_2 <- map_text_regex(ts, lbl_hi = "\\b\\w{4}\\b", bg_hi = "[good|test]", lbl_rotate = "[^aeiou]", angle_fg = c(-45, +45)) plot_charmap(cm_2) # (2) Alternative inputs: # (a) From text string(s): plot_charmap(ts) # (b) From user input: # plot_charmap() # (enter text in Console) # (c) From text file: # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # plot_charmap(file = "test.txt") # unlink("test.txt") # clean up (by deleting file).# (0) Prepare: ts <- c("Hello world!", "This is a test to test this splendid function", "Does this work?", "That's good.", "Please carry on.") sum(nchar(ts)) # (1) From character map: # (a) simple: cm_1 <- map_text_coord(x = ts, flip_y = TRUE) plot_charmap(cm_1) # (b) pattern matching (regex): cm_2 <- map_text_regex(ts, lbl_hi = "\\b\\w{4}\\b", bg_hi = "[good|test]", lbl_rotate = "[^aeiou]", angle_fg = c(-45, +45)) plot_charmap(cm_2) # (2) Alternative inputs: # (a) From text string(s): plot_charmap(ts) # (b) From user input: # plot_charmap() # (enter text in Console) # (c) From text file: # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # plot_charmap(file = "test.txt") # unlink("test.txt") # clean up (by deleting file).
plot_chars parses text (from a file or user input)
into a table and then plots its individual characters
as a tile plot (using ggplot2).
plot_chars( x = NA, file = "", lbl_hi = NA, lbl_lo = NA, bg_hi = NA, bg_lo = "[[:space:]]", lbl_rotate = NA, case_sense = TRUE, lbl_tiles = TRUE, angle_fg = c(-90, 90), angle_bg = 0, col_lbl = "black", col_lbl_hi = pal_ds4psy[[1]], col_lbl_lo = pal_ds4psy[[9]], col_bg = pal_ds4psy[[7]], col_bg_hi = pal_ds4psy[[4]], col_bg_lo = "white", col_sample = FALSE, rseed = NA, cex = 3, fontface = 1, family = "sans", borders = FALSE, border_col = "white", border_size = 0.5 )plot_chars( x = NA, file = "", lbl_hi = NA, lbl_lo = NA, bg_hi = NA, bg_lo = "[[:space:]]", lbl_rotate = NA, case_sense = TRUE, lbl_tiles = TRUE, angle_fg = c(-90, 90), angle_bg = 0, col_lbl = "black", col_lbl_hi = pal_ds4psy[[1]], col_lbl_lo = pal_ds4psy[[9]], col_bg = pal_ds4psy[[7]], col_bg_hi = pal_ds4psy[[4]], col_bg_lo = "white", col_sample = FALSE, rseed = NA, cex = 3, fontface = 1, family = "sans", borders = FALSE, border_col = "white", border_size = 0.5 )
x |
The text to plot (as a character vector).
Different elements denote different lines of text.
If |
file |
A text file to read (or its path).
If |
lbl_hi |
Labels to highlight (as regex).
Default: |
lbl_lo |
Labels to de-emphasize (as regex).
Default: |
bg_hi |
Background tiles to highlight (as regex).
Default: |
bg_lo |
Background tiles to de-emphasize (as regex).
Default: |
lbl_rotate |
Labels to rotate (as regex).
Default: |
case_sense |
Boolean: Distinguish
lower- vs. uppercase characters in pattern matches?
Default: |
lbl_tiles |
Add character labels to tiles?
Default: |
angle_fg |
Angle(s) for rotating character labels
matching the pattern of the |
angle_bg |
Angle(s) of rotating character labels
not matching the pattern of the |
col_lbl |
Default color of text labels.
Default: |
col_lbl_hi |
Highlighting color of text labels.
Default: |
col_lbl_lo |
De-emphasizing color of text labels.
Default: |
col_bg |
Default color to fill background tiles.
Default: |
col_bg_hi |
Highlighting color to fill background tiles.
Default: |
col_bg_lo |
De-emphasizing color to fill background tiles.
Default: |
col_sample |
Boolean: Sample color vectors (within category)?
Default: |
rseed |
Random seed (number).
Default: |
cex |
Character size (numeric).
Default: |
fontface |
Font face of text labels (numeric).
Default: |
family |
Font family of text labels (name).
Default: |
borders |
Boolean: Add borders to tiles?
Default: |
border_col |
Color of tile borders.
Default: |
border_size |
Size of tile borders.
Default: |
plot_chars blurs the boundary between a text
and its graphical representation by combining options
for matching patterns of text with visual features
for displaying characters (e.g., their color or orientation).
plot_chars is based on plot_text,
but provides additional support for detecting and displaying characters
(i.e., text labels, their orientation, and color options)
based on matching regular expression (regex).
Internally, plot_chars is a wrapper that calls
(1) map_text_regex for creating a character map
(allowing for matching patterns for some aesthetics) and
(2) plot_charmap for plotting this character map.
However, in contrast to plot_charmap,
plot_chars invisibly returns a
description of the plot (as a data frame).
The plot generated by plot_chars is character-based:
Individual characters are plotted at equidistant x-y-positions
and the aesthetic settings provided for text labels and tile fill colors.
Five regular expressions and corresponding color and angle arguments allow identifying, marking (highlighting or de-emphasizing), and rotating those sets of characters (i.e, their text labels or fill colors). that match the provided patterns.
An invisible data frame describing the plot.
plot_charmap for plotting character maps;
plot_text for plotting characters and color tiles by frequency;
map_text_coord for mapping text to a table of character coordinates;
map_text_regex for mapping text to a character table and matching patterns;
read_ascii for reading text inputs into a character string;
pal_ds4psy for default color palette.
Other plot functions:
plot_charmap(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# (A) From text string(s): plot_chars(x = c("Hello world!", "Does this work?", "That's good.", "Please carry on...")) # (B) From user input: # plot_chars() # (enter text in Console) # (C) From text file: # Create and use a text file: # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # plot_chars(file = "test.txt") # default # plot_chars(file = "test.txt", lbl_hi = "[[:upper:]]", lbl_lo = "[[:punct:]]", # col_lbl_hi = "red", col_lbl_lo = "blue") # plot_chars(file = "test.txt", lbl_hi = "[aeiou]", col_lbl_hi = "red", # col_bg = "white", bg_hi = "see") # mark vowels and "see" (in bg) # plot_chars(file = "test.txt", bg_hi = "[aeiou]", col_bg_hi = "gold") # mark (bg of) vowels ## Label options: # plot_chars(file = "test.txt", bg_hi = "see", lbl_tiles = FALSE) # plot_chars(file = "test.txt", cex = 5, family = "mono", fontface = 4, lbl_angle = c(-20, 20)) ## Note: plot_chars() invisibly returns a description of the plot (as df): # tb <- plot_chars(file = "test.txt", lbl_hi = "[aeiou]", lbl_rotate = TRUE) # head(tb) # unlink("test.txt") # clean up (by deleting file). ## (B) From text file (in subdir): # plot_chars(file = "data-raw/txt/hello.txt") # requires txt file # plot_chars(file = "data-raw/txt/ascii.txt", lbl_hi = "[2468]", bg_lo = "[[:digit:]]", # col_lbl_hi = "red", cex = 10, fontface = 2) ## (C) User input: # plot_chars() # (enter text in Console)# (A) From text string(s): plot_chars(x = c("Hello world!", "Does this work?", "That's good.", "Please carry on...")) # (B) From user input: # plot_chars() # (enter text in Console) # (C) From text file: # Create and use a text file: # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # plot_chars(file = "test.txt") # default # plot_chars(file = "test.txt", lbl_hi = "[[:upper:]]", lbl_lo = "[[:punct:]]", # col_lbl_hi = "red", col_lbl_lo = "blue") # plot_chars(file = "test.txt", lbl_hi = "[aeiou]", col_lbl_hi = "red", # col_bg = "white", bg_hi = "see") # mark vowels and "see" (in bg) # plot_chars(file = "test.txt", bg_hi = "[aeiou]", col_bg_hi = "gold") # mark (bg of) vowels ## Label options: # plot_chars(file = "test.txt", bg_hi = "see", lbl_tiles = FALSE) # plot_chars(file = "test.txt", cex = 5, family = "mono", fontface = 4, lbl_angle = c(-20, 20)) ## Note: plot_chars() invisibly returns a description of the plot (as df): # tb <- plot_chars(file = "test.txt", lbl_hi = "[aeiou]", lbl_rotate = TRUE) # head(tb) # unlink("test.txt") # clean up (by deleting file). ## (B) From text file (in subdir): # plot_chars(file = "data-raw/txt/hello.txt") # requires txt file # plot_chars(file = "data-raw/txt/ascii.txt", lbl_hi = "[2468]", bg_lo = "[[:digit:]]", # col_lbl_hi = "red", cex = 10, fontface = 2) ## (C) User input: # plot_chars() # (enter text in Console)
plot_circ_points arranges a number of n
on a circle (defined by its origin coordinates and radius).
plot_circ_points( n = 4, x_org = 0, y_org = 0, radius = 1, show_axes = FALSE, show_label = FALSE, ... )plot_circ_points( n = 4, x_org = 0, y_org = 0, radius = 1, show_axes = FALSE, show_label = FALSE, ... )
n |
The number of points (or shapes defined by |
x_org |
The x-value of circle origin. |
y_org |
The y-value of circle origin. |
radius |
The circle radius. |
show_axes |
Show axes? Default: |
show_label |
Show a point label? Default: |
... |
Additional aesthetics (passed to |
The ... is passed to points of
the graphics package.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
plot_circ_points(8) # default # with aesthetics of points(): plot_circ_points(n = 8, r = 10, cex = 8, pch = sample(21:25, size = 8, replace = TRUE), bg = "deeppink") plot_circ_points(n = 12, r = 8, show_axes = TRUE, show_label = TRUE, cex = 6, pch = 21, lwd = 5, col = "deepskyblue", bg = "gold")plot_circ_points(8) # default # with aesthetics of points(): plot_circ_points(n = 8, r = 10, cex = 8, pch = sample(21:25, size = 8, replace = TRUE), bg = "deeppink") plot_circ_points(n = 12, r = 8, show_axes = TRUE, show_label = TRUE, cex = 6, pch = 21, lwd = 5, col = "deepskyblue", bg = "gold")
plot_fn is a function that uses parameters for plotting a plot.
plot_fn( x = NA, y = 1, A = TRUE, B = FALSE, C = TRUE, D = FALSE, E = FALSE, F = FALSE, f = c(rev(pal_seeblau), "white", pal_pinky), g = "white" )plot_fn( x = NA, y = 1, A = TRUE, B = FALSE, C = TRUE, D = FALSE, E = FALSE, F = FALSE, f = c(rev(pal_seeblau), "white", pal_pinky), g = "white" )
x |
Numeric (integer > 0).
Default: |
y |
Numeric (double).
Default: |
A |
Boolean.
Default: |
B |
Boolean.
Default: |
C |
Boolean.
Default: |
D |
Boolean.
Default: |
E |
Boolean.
Default: |
F |
Boolean.
Default: |
f |
A color palette (as a vector).
Default: |
g |
A color (e.g., a color name, as a character).
Default: |
plot_fn is deliberately kept cryptic and obscure to illustrate
how function parameters can be explored.
plot_fn also shows that brevity in argument names should not
come at the expense of clarity. In fact, transparent argument names
are absolutely essential for understanding and using a function.
plot_fn currently requires pal_seeblau and
pal_pinky (from the unikn package) for its default colors.
plot_fun for a related function;
pal_ds4psy for a color palette.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# Basics: plot_fn() # Exploring options: plot_fn(x = 2, A = TRUE) plot_fn(x = 3, A = FALSE, E = TRUE) plot_fn(x = 4, A = TRUE, B = TRUE, D = TRUE) plot_fn(x = 5, A = FALSE, B = TRUE, E = TRUE, f = c("black", "white", "gold")) plot_fn(x = 7, A = TRUE, B = TRUE, F = TRUE, f = c("steelblue", "white", "forestgreen"))# Basics: plot_fn() # Exploring options: plot_fn(x = 2, A = TRUE) plot_fn(x = 3, A = FALSE, E = TRUE) plot_fn(x = 4, A = TRUE, B = TRUE, D = TRUE) plot_fn(x = 5, A = FALSE, B = TRUE, E = TRUE, f = c("black", "white", "gold")) plot_fn(x = 7, A = TRUE, B = TRUE, F = TRUE, f = c("steelblue", "white", "forestgreen"))
plot_fun provides options for plotting a plot.
plot_fun( a = NA, b = TRUE, c = TRUE, d = 1, e = FALSE, f = FALSE, g = FALSE, c1 = c(rev(pal_seeblau), "white", pal_grau, "black", Bordeaux), c2 = "black" )plot_fun( a = NA, b = TRUE, c = TRUE, d = 1, e = FALSE, f = FALSE, g = FALSE, c1 = c(rev(pal_seeblau), "white", pal_grau, "black", Bordeaux), c2 = "black" )
a |
Numeric (integer > 0).
Default: |
b |
Boolean.
Default: |
c |
Boolean.
Default: |
d |
Numeric (double).
Default: |
e |
Boolean.
Default: |
f |
Boolean.
Default: |
g |
Boolean.
Default: |
c1 |
A color palette (as a vector).
Default: |
c2 |
A color (e.g., color name, as character).
Default: |
plot_fun is deliberately kept cryptic and obscure to illustrate
how function parameters can be explored.
plot_fun also shows that brevity in argument names should not
come at the expense of clarity. In fact, transparent argument names
are absolutely essential for understanding and using a function.
plot_fun currently requires pal_seeblau, pal_grau, and
Bordeaux (from the unikn package) for its default colors.
plot_fn for a related function;
pal_ds4psy for color palette.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# Basics: plot_fun() # Exploring options: plot_fun(a = 3, b = FALSE, e = TRUE) plot_fun(a = 4, f = TRUE, g = TRUE, c1 = c("steelblue", "white", "firebrick"))# Basics: plot_fun() # Exploring options: plot_fun(a = 3, b = FALSE, e = TRUE) plot_fun(a = 4, f = TRUE, g = TRUE, c1 = c("steelblue", "white", "firebrick"))
plot_n plots a row or column of n tiles
on fixed or polar coordinates.
plot_n( n = NA, row = TRUE, polar = FALSE, pal = pal_ds4psy, sort = TRUE, borders = TRUE, border_col = "black", border_size = 0, lbl_tiles = FALSE, lbl_title = FALSE, rseed = NA, save = FALSE, save_path = "images/tiles", prefix = "", suffix = "" )plot_n( n = NA, row = TRUE, polar = FALSE, pal = pal_ds4psy, sort = TRUE, borders = TRUE, border_col = "black", border_size = 0, lbl_tiles = FALSE, lbl_title = FALSE, rseed = NA, save = FALSE, save_path = "images/tiles", prefix = "", suffix = "" )
n |
Basic number of tiles (on either side). |
row |
Plot as a row?
Default: |
polar |
Plot on polar coordinates?
Default: |
pal |
A color palette (automatically extended to |
sort |
Sort tiles?
Default: |
borders |
Add borders to tiles?
Default: |
border_col |
Color of borders (if |
border_size |
Size of borders (if |
lbl_tiles |
Add numeric labels to tiles?
Default: |
lbl_title |
Add numeric label (of n) to plot?
Default: |
rseed |
Random seed (number).
Default: |
save |
Save plot as png file?
Default: |
save_path |
Path to save plot (if |
prefix |
Prefix to plot name (if |
suffix |
Suffix to plot name (if |
Note that a polar row makes a tasty pie, whereas a polar column makes a target plot.
pal_ds4psy for default color palette.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# (1) Basics (as ROW or COL): plot_n() # default plot (random n, row = TRUE, with borders, no labels) plot_n(row = FALSE) # default plot (random n, with borders, no labels) plot_n(n = 4, sort = FALSE) # random order plot_n(n = 6, borders = FALSE) # no borders plot_n(n = 8, lbl_tiles = TRUE, # with tile + lbl_title = TRUE) # title labels # Set colors: plot_n(n = 5, row = TRUE, lbl_tiles = TRUE, lbl_title = TRUE, pal = c("orange", "white", "firebrick"), border_col = "white", border_size = 2) # Fixed rseed: plot_n(n = 4, sort = FALSE, borders = FALSE, lbl_tiles = TRUE, lbl_title = TRUE, rseed = 101) # (2) polar plot (as PIE or TARGET): plot_n(polar = TRUE) # PIE plot (with borders, no labels) plot_n(polar = TRUE, row = FALSE) # TARGET plot (with borders, no labels) plot_n(n = 4, polar = TRUE, sort = FALSE) # PIE in random order plot_n(n = 5, polar = TRUE, row = FALSE, borders = FALSE) # TARGET no borders plot_n(n = 5, polar = TRUE, lbl_tiles = TRUE) # PIE with tile labels plot_n(n = 5, polar = TRUE, row = FALSE, lbl_title = TRUE) # TARGET with title label # plot_n(n = 4, row = TRUE, sort = FALSE, borders = TRUE, # border_col = "white", border_size = 2, # polar = TRUE, rseed = 132) # plot_n(n = 4, row = FALSE, sort = FALSE, borders = TRUE, # border_col = "white", border_size = 2, # polar = TRUE, rseed = 134)# (1) Basics (as ROW or COL): plot_n() # default plot (random n, row = TRUE, with borders, no labels) plot_n(row = FALSE) # default plot (random n, with borders, no labels) plot_n(n = 4, sort = FALSE) # random order plot_n(n = 6, borders = FALSE) # no borders plot_n(n = 8, lbl_tiles = TRUE, # with tile + lbl_title = TRUE) # title labels # Set colors: plot_n(n = 5, row = TRUE, lbl_tiles = TRUE, lbl_title = TRUE, pal = c("orange", "white", "firebrick"), border_col = "white", border_size = 2) # Fixed rseed: plot_n(n = 4, sort = FALSE, borders = FALSE, lbl_tiles = TRUE, lbl_title = TRUE, rseed = 101) # (2) polar plot (as PIE or TARGET): plot_n(polar = TRUE) # PIE plot (with borders, no labels) plot_n(polar = TRUE, row = FALSE) # TARGET plot (with borders, no labels) plot_n(n = 4, polar = TRUE, sort = FALSE) # PIE in random order plot_n(n = 5, polar = TRUE, row = FALSE, borders = FALSE) # TARGET no borders plot_n(n = 5, polar = TRUE, lbl_tiles = TRUE) # PIE with tile labels plot_n(n = 5, polar = TRUE, row = FALSE, lbl_title = TRUE) # TARGET with title label # plot_n(n = 4, row = TRUE, sort = FALSE, borders = TRUE, # border_col = "white", border_size = 2, # polar = TRUE, rseed = 132) # plot_n(n = 4, row = FALSE, sort = FALSE, borders = TRUE, # border_col = "white", border_size = 2, # polar = TRUE, rseed = 134)
plot_text parses text
(from a file or from user input)
and plots its individual characters
as a tile plot (using ggplot2).
plot_text( x = NA, file = "", char_bg = " ", lbl_tiles = TRUE, lbl_rotate = FALSE, cex = 3, fontface = 1, family = "sans", col_lbl = "black", col_bg = "white", pal = pal_ds4psy[1:5], pal_extend = TRUE, case_sense = FALSE, borders = TRUE, border_col = "white", border_size = 0.5 )plot_text( x = NA, file = "", char_bg = " ", lbl_tiles = TRUE, lbl_rotate = FALSE, cex = 3, fontface = 1, family = "sans", col_lbl = "black", col_bg = "white", pal = pal_ds4psy[1:5], pal_extend = TRUE, case_sense = FALSE, borders = TRUE, border_col = "white", border_size = 0.5 )
x |
The text to plot (as a character vector).
Different elements denote different lines of text.
If |
file |
A text file to read (or its path).
If |
char_bg |
Character used as background.
Default: |
lbl_tiles |
Add character labels to tiles?
Default: |
lbl_rotate |
Rotate character labels?
Default: |
cex |
Character size (numeric).
Default: |
fontface |
Font face of text labels (numeric).
Default: |
family |
Font family of text labels (name).
Default: |
col_lbl |
Color of text labels.
Default: |
col_bg |
Color of |
pal |
Color palette for filling tiles
of text (used in order of character frequency).
Default: |
pal_extend |
Boolean: Should |
case_sense |
Boolean: Distinguish
lower- vs. uppercase characters?
Default: |
borders |
Boolean: Add borders to tiles?
Default: |
border_col |
Color of borders (if |
border_size |
Size of borders (if |
plot_text blurs the boundary between a text
and its graphical representation by adding visual options
for coloring characters based on their frequency counts.
(Note that plot_chars provides additional
support for matching regular expressions.)
plot_text is character-based:
Individual characters are plotted at equidistant x-y-positions
with color settings for text labels and tile fill colors.
By default, the color palette pal
(used for tile fill colors) is scaled
to indicate character frequency.
plot_text invisibly returns a
description of the plot (as a data frame).
An invisible data frame describing the plot.
plot_charmap for plotting character maps;
plot_chars for creating and plotting character maps;
map_text_coord for mapping text to a table of character coordinates;
map_text_regex for mapping text to a character table and matching patterns;
read_ascii for parsing text from file or user input;
pal_ds4psy for default color palette.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_tiles(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# (A) From text string(s): plot_text(x = c("Hello", "world!")) plot_text(x = c("Hello world!", "How are you today?")) # (B) From user input: # plot_text() # (enter text in Console) # (C) From text file: ## Create a temporary file "test.txt": # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # plot_text(file = "test.txt") ## Set colors, pal_extend, and case_sense: # cols <- c("steelblue", "skyblue", "lightgrey") # cols <- c("firebrick", "olivedrab", "steelblue", "orange", "gold") # plot_text(file = "test.txt", pal = cols, pal_extend = TRUE) # plot_text(file = "test.txt", pal = cols, pal_extend = FALSE) # plot_text(file = "test.txt", pal = cols, pal_extend = FALSE, case_sense = TRUE) ## Customize text and grid options: # plot_text(file = "test.txt", col_lbl = "darkblue", cex = 4, family = "sans", fontface = 3, # pal = "gold1", pal_extend = TRUE, border_col = NA) # plot_text(file = "test.txt", family = "serif", cex = 6, lbl_rotate = TRUE, # pal = NA, borders = FALSE) # plot_text(file = "test.txt", col_lbl = "white", pal = c("green3", "black"), # border_col = "black", border_size = .2) ## Color ranges: # plot_text(file = "test.txt", pal = c("red2", "orange", "gold")) # plot_text(file = "test.txt", pal = c("olivedrab4", "gold")) # unlink("test.txt") # clean up. ## (B) From text file (in subdir): # plot_text(file = "data-raw/txt/hello.txt") # requires txt file # plot_text(file = "data-raw/txt/ascii.txt", cex = 5, # col_bg = "grey", char_bg = "-") ## (C) From user input: # plot_text() # (enter text in Console)# (A) From text string(s): plot_text(x = c("Hello", "world!")) plot_text(x = c("Hello world!", "How are you today?")) # (B) From user input: # plot_text() # (enter text in Console) # (C) From text file: ## Create a temporary file "test.txt": # cat("Hello world!", "This is a test file.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") # plot_text(file = "test.txt") ## Set colors, pal_extend, and case_sense: # cols <- c("steelblue", "skyblue", "lightgrey") # cols <- c("firebrick", "olivedrab", "steelblue", "orange", "gold") # plot_text(file = "test.txt", pal = cols, pal_extend = TRUE) # plot_text(file = "test.txt", pal = cols, pal_extend = FALSE) # plot_text(file = "test.txt", pal = cols, pal_extend = FALSE, case_sense = TRUE) ## Customize text and grid options: # plot_text(file = "test.txt", col_lbl = "darkblue", cex = 4, family = "sans", fontface = 3, # pal = "gold1", pal_extend = TRUE, border_col = NA) # plot_text(file = "test.txt", family = "serif", cex = 6, lbl_rotate = TRUE, # pal = NA, borders = FALSE) # plot_text(file = "test.txt", col_lbl = "white", pal = c("green3", "black"), # border_col = "black", border_size = .2) ## Color ranges: # plot_text(file = "test.txt", pal = c("red2", "orange", "gold")) # plot_text(file = "test.txt", pal = c("olivedrab4", "gold")) # unlink("test.txt") # clean up. ## (B) From text file (in subdir): # plot_text(file = "data-raw/txt/hello.txt") # requires txt file # plot_text(file = "data-raw/txt/ascii.txt", cex = 5, # col_bg = "grey", char_bg = "-") ## (C) From user input: # plot_text() # (enter text in Console)
plot_tiles plots an area of n-by-n tiles
on fixed or polar coordinates.
plot_tiles( n = NA, pal = pal_ds4psy, sort = TRUE, borders = TRUE, border_col = "black", border_size = 0.2, lbl_tiles = FALSE, lbl_title = FALSE, polar = FALSE, rseed = NA, save = FALSE, save_path = "images/tiles", prefix = "", suffix = "" )plot_tiles( n = NA, pal = pal_ds4psy, sort = TRUE, borders = TRUE, border_col = "black", border_size = 0.2, lbl_tiles = FALSE, lbl_title = FALSE, polar = FALSE, rseed = NA, save = FALSE, save_path = "images/tiles", prefix = "", suffix = "" )
n |
Basic number of tiles (on either side). |
pal |
Color palette (automatically extended to |
sort |
Boolean: Sort tiles?
Default: |
borders |
Boolean: Add borders to tiles?
Default: |
border_col |
Color of borders (if |
border_size |
Size of borders (if |
lbl_tiles |
Boolean: Add numeric labels to tiles?
Default: |
lbl_title |
Boolean: Add numeric label (of n) to plot?
Default: |
polar |
Boolean: Plot on polar coordinates?
Default: |
rseed |
Random seed (number).
Default: |
save |
Boolean: Save plot as png file?
Default: |
save_path |
Path to save plot (if |
prefix |
Prefix to plot name (if |
suffix |
Suffix to plot name (if |
pal_ds4psy for default color palette.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
theme_clean(),
theme_ds4psy(),
theme_empty()
# (1) Tile plot: plot_tiles() # default plot (random n, with borders, no labels) plot_tiles(n = 4, sort = FALSE) # random order plot_tiles(n = 6, borders = FALSE) # no borders plot_tiles(n = 8, lbl_tiles = TRUE, # with tile + lbl_title = TRUE) # title labels # Set colors: plot_tiles(n = 4, pal = c("orange", "white", "firebrick"), lbl_tiles = TRUE, lbl_title = TRUE, sort = TRUE) plot_tiles(n = 6, sort = FALSE, border_col = "white", border_size = 2) # Fixed rseed: plot_tiles(n = 4, sort = FALSE, borders = FALSE, lbl_tiles = TRUE, lbl_title = TRUE, rseed = 101) # (2) polar plot: plot_tiles(polar = TRUE) # default polar plot (with borders, no labels) plot_tiles(n = 4, polar = TRUE, sort = FALSE) # random order plot_tiles(n = 6, polar = TRUE, sort = TRUE, # sorted and with lbl_tiles = TRUE, lbl_title = TRUE) # tile + title labels plot_tiles(n = 4, sort = FALSE, borders = TRUE, border_col = "white", border_size = 2, polar = TRUE, rseed = 132) # fixed rseed# (1) Tile plot: plot_tiles() # default plot (random n, with borders, no labels) plot_tiles(n = 4, sort = FALSE) # random order plot_tiles(n = 6, borders = FALSE) # no borders plot_tiles(n = 8, lbl_tiles = TRUE, # with tile + lbl_title = TRUE) # title labels # Set colors: plot_tiles(n = 4, pal = c("orange", "white", "firebrick"), lbl_tiles = TRUE, lbl_title = TRUE, sort = TRUE) plot_tiles(n = 6, sort = FALSE, border_col = "white", border_size = 2) # Fixed rseed: plot_tiles(n = 4, sort = FALSE, borders = FALSE, lbl_tiles = TRUE, lbl_title = TRUE, rseed = 101) # (2) polar plot: plot_tiles(polar = TRUE) # default polar plot (with borders, no labels) plot_tiles(n = 4, polar = TRUE, sort = FALSE) # random order plot_tiles(n = 6, polar = TRUE, sort = TRUE, # sorted and with lbl_tiles = TRUE, lbl_title = TRUE) # tile + title labels plot_tiles(n = 4, sort = FALSE, borders = TRUE, border_col = "white", border_size = 2, polar = TRUE, rseed = 132) # fixed rseed
posPsy_AHI_CESD is a dataset containing answers to the 24 items of the
Authentic Happiness Inventory (AHI) and answers to the
20 items of the Center for Epidemiological Studies Depression (CES-D) scale
(Radloff, 1977) for multiple (1 to 6) measurement occasions.
posPsy_AHI_CESDposPsy_AHI_CESD
A table with 992 cases (rows) and 50 variables (columns).
Codebook
1. id: Participant ID.
2. occasion: Measurement occasion: 0: Pretest (i.e., at enrolment), 1: Posttest (i.e., 7 days after pretest), 2: 1-week follow-up, (i.e., 14 days after pretest, 7 days after posttest), 3: 1-month follow-up, (i.e., 38 days after pretest, 31 days after posttest), 4: 3-month follow-up, (i.e., 98 days after pretest, 91 days after posttest), 5: 6-month follow-up, (i.e., 189 days after pretest, 182 days after posttest).
3. elapsed.days: Time since enrolment measured in fractional days.
4. intervention: Type of intervention: 3 positive psychology interventions (PPIs), plus 1 control condition: 1: "Using signature strengths", 2: "Three good things", 3: "Gratitude visit", 4: "Recording early memories" (control condition).
5.-28. (from ahi01 to ahi24): Responses on 24 AHI items.
29.-48. (from cesd01 to cesd20): Responses on 20 CES-D items.
49. ahiTotal: Total AHI score.
50. cesdTotal: Total CES-D score.
See codebook and references at https://bookdown.org/hneth/ds4psy/B.1-datasets-pos.html.
Articles
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R., & Schüz, B. (2017).
Web-based positive psychology interventions: A reexamination of effectiveness.
Journal of Clinical Psychology, 73(3), 218–232.
doi: 10.1002/jclp.22328
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R. and Schüz, B. (2018).
Data from, ‘Web-based positive psychology interventions: A reexamination of effectiveness’.
Journal of Open Psychology Data, 6(1).
doi: 10.5334/jopd.35
See https://openpsychologydata.metajnl.com/articles/10.5334/jopd.35/ for details and doi:10.6084/m9.figshare.1577563.v1 for original dataset.
Additional references at https://bookdown.org/hneth/ds4psy/B.1-datasets-pos.html.
posPsy_long for a corrected version of this file (in long format).
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
posPsy_long is a dataset containing answers to the 24 items of the
Authentic Happiness Inventory (AHI) and answers to the
20 items of the Center for Epidemiological Studies Depression (CES-D) scale
(see Radloff, 1977) for multiple (1 to 6) measurement occasions.
posPsy_longposPsy_long
A table with 990 cases (rows) and 50 variables (columns).
This dataset is a corrected version of posPsy_AHI_CESD
and in long-format.
Articles
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R., & Schüz, B. (2017).
Web-based positive psychology interventions: A reexamination of effectiveness.
Journal of Clinical Psychology, 73(3), 218–232.
doi: 10.1002/jclp.22328
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R. and Schüz, B. (2018).
Data from, ‘Web-based positive psychology interventions: A reexamination of effectiveness’.
Journal of Open Psychology Data, 6(1).
doi: 10.5334/jopd.35
See https://openpsychologydata.metajnl.com/articles/10.5334/jopd.35/ for details and doi:10.6084/m9.figshare.1577563.v1 for original dataset.
Additional references at https://bookdown.org/hneth/ds4psy/B.1-datasets-pos.html.
posPsy_AHI_CESD for source of this file and codebook information;
posPsy_wide for a version of this file (in wide format).
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
posPsy_p_info is a dataset containing details of 295 participants.
posPsy_p_infoposPsy_p_info
A table with 295 cases (rows) and 6 variables (columns).
Participant ID.
Type of intervention: 3 positive psychology interventions (PPIs), plus 1 control condition: 1: "Using signature strengths", 2: "Three good things", 3: "Gratitude visit", 4: "Recording early memories" (control condition).
Sex: 1 = female, 2 = male.
Age (in years).
Education level: Scale from 1: less than 12 years, to 5: postgraduate degree.
Income: Scale from 1: below average, to 3: above average.
See codebook and references at https://bookdown.org/hneth/ds4psy/B.1-datasets-pos.html.
Articles
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R., & Schüz, B. (2017).
Web-based positive psychology interventions: A reexamination of effectiveness.
Journal of Clinical Psychology, 73(3), 218–232.
doi: 10.1002/jclp.22328
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R. and Schüz, B. (2018).
Data from, ‘Web-based positive psychology interventions: A reexamination of effectiveness’.
Journal of Open Psychology Data, 6(1).
doi: 10.5334/jopd.35
See https://openpsychologydata.metajnl.com/articles/10.5334/jopd.35/ for details and doi:10.6084/m9.figshare.1577563.v1 for original dataset.
Additional references at https://bookdown.org/hneth/ds4psy/B.1-datasets-pos.html.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
posPsy_wide is a dataset containing answers to the 24 items of the
Authentic Happiness Inventory (AHI) and answers to the
20 items of the Center for Epidemiological Studies Depression (CES-D) scale
(see Radloff, 1977) for multiple (1 to 6) measurement occasions.
posPsy_wideposPsy_wide
An object of class spec_tbl_df (inherits from tbl_df, tbl, data.frame) with 295 rows and 294 columns.
This dataset is based on posPsy_AHI_CESD and
posPsy_long, but is in wide format.
Articles
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R., & Schüz, B. (2017).
Web-based positive psychology interventions: A reexamination of effectiveness.
Journal of Clinical Psychology, 73(3), 218–232.
doi: 10.1002/jclp.22328
Woodworth, R. J., O’Brien-Malone, A., Diamond, M. R. and Schüz, B. (2018).
Data from, ‘Web-based positive psychology interventions: A reexamination of effectiveness’.
Journal of Open Psychology Data, 6(1).
doi: 10.5334/jopd.35
See https://openpsychologydata.metajnl.com/articles/10.5334/jopd.35/ for details and doi:10.6084/m9.figshare.1577563.v1 for original dataset.
Additional references at https://bookdown.org/hneth/ds4psy/B.1-datasets-pos.html.
posPsy_AHI_CESD for the source of this file,
posPsy_long for a version of this file (in long format).
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
read_ascii parses text inputs
(from a file or from user input in the Console)
into a character vector.
read_ascii(file = "", quiet = FALSE)read_ascii(file = "", quiet = FALSE)
file |
The text file to read (or its path).
If |
quiet |
Boolean: Provide user feedback?
Default: |
Different lines of text are represented by different elements of the character vector returned.
The getwd function is used to determine the current
working directory. This replaces the here package,
which was previously used to determine an (absolute) file path.
Note that read_ascii originally contained
map_text_coord, but has been separated to
enable independent access to separate functionalities.
A character vector, with its elements denoting different lines of text.
map_text_coord for mapping text to a table of character coordinates;
plot_chars for a character plotting function.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
## Create a temporary file "test.txt": # cat("Hello world!", "This is a test.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") ## (a) Read text (from file): # read_ascii("test.txt") # read_ascii("test.txt", quiet = TRUE) # y flipped # unlink("test.txt") # clean up (by deleting file). ## (b) Read text (from file in subdir): # read_ascii("data-raw/txt/ascii.txt") # requires txt file ## (c) Scan user input (from console): # read_ascii()## Create a temporary file "test.txt": # cat("Hello world!", "This is a test.", # "Can you see this text?", # "Good! Please carry on...", # file = "test.txt", sep = "\n") ## (a) Read text (from file): # read_ascii("test.txt") # read_ascii("test.txt", quiet = TRUE) # y flipped # unlink("test.txt") # clean up (by deleting file). ## (b) Read text (from file in subdir): # read_ascii("data-raw/txt/ascii.txt") # requires txt file ## (c) Scan user input (from console): # read_ascii()
sample_char draws a sample of
n random characters from a given range of characters.
sample_char(x_char = c(letters, LETTERS), n = 1, replace = FALSE, ...)sample_char(x_char = c(letters, LETTERS), n = 1, replace = FALSE, ...)
x_char |
Population of characters to sample from.
Default: |
n |
Number of characters to draw.
Default: |
replace |
Boolean: Sample with replacement?
Default: |
... |
Other arguments.
(Use for specifying |
By default, sample_char draws n = 1
a random alphabetic character from
x_char = c(letters, LETTERS).
As with sample(), the sample size n must not exceed
the number of available characters nchar(x_char),
unless replace = TRUE (i.e., sampling with replacement).
A text string (scalar character vector).
Other sampling functions:
coin(),
dice(),
dice_2(),
sample_date(),
sample_time()
sample_char() # default sample_char(n = 10) sample_char(x_char = "abc", n = 10, replace = TRUE) sample_char(x_char = c("x y", "6 9"), n = 6, replace = FALSE) sample_char(x_char = c("x y", "6 9"), n = 20, replace = TRUE) # Biased sampling: sample_char(x_char = "abc", n = 20, replace = TRUE, prob = c(3/6, 2/6, 1/6)) # Note: By default, n must not exceed nchar(x_char): sample_char(n = 52, replace = FALSE) # works, but # sample_char(n = 53, replace = FALSE) # would yield ERROR; sample_char(n = 53, replace = TRUE) # works again.sample_char() # default sample_char(n = 10) sample_char(x_char = "abc", n = 10, replace = TRUE) sample_char(x_char = c("x y", "6 9"), n = 6, replace = FALSE) sample_char(x_char = c("x y", "6 9"), n = 20, replace = TRUE) # Biased sampling: sample_char(x_char = "abc", n = 20, replace = TRUE, prob = c(3/6, 2/6, 1/6)) # Note: By default, n must not exceed nchar(x_char): sample_char(n = 52, replace = FALSE) # works, but # sample_char(n = 53, replace = FALSE) # would yield ERROR; sample_char(n = 53, replace = TRUE) # works again.
sample_date draws a sample of
n random dates from a given range.
sample_date(from = "1970-01-01", to = Sys.Date(), size = 1, ...)sample_date(from = "1970-01-01", to = Sys.Date(), size = 1, ...)
from |
Earliest date (as "Date" or string).
Default: |
to |
Latest date (as "Date" or string).
Default: |
size |
Size of date samples to draw.
Default: |
... |
Other arguments.
(Use for specifying |
By default, sample_date draws n = 1
random date (as a "Date" object) in the range
from = "1970-01-01"
to = Sys.Date() (current date).
Both from and to currently
need to be scalars (i.e., with a length of 1).
A vector of class "Date".
Other sampling functions:
coin(),
dice(),
dice_2(),
sample_char(),
sample_time()
sample_date() sort(sample_date(size = 10)) sort(sample_date(from = "2020-02-28", to = "2020-03-01", size = 10, replace = TRUE)) # 2020 is a leap year # Note: Oddity with sample(): sort(sample_date(from = "2020-01-01", to = "2020-01-01", size = 10, replace = TRUE)) # range of 0! # see sample(9:9, size = 10, replace = TRUE)sample_date() sort(sample_date(size = 10)) sort(sample_date(from = "2020-02-28", to = "2020-03-01", size = 10, replace = TRUE)) # 2020 is a leap year # Note: Oddity with sample(): sort(sample_date(from = "2020-01-01", to = "2020-01-01", size = 10, replace = TRUE)) # range of 0! # see sample(9:9, size = 10, replace = TRUE)
sample_time draws a sample of
n random times from a given range.
sample_time( from = "1970-01-01 00:00:00", to = Sys.time(), size = 1, as_POSIXct = TRUE, tz = "", ... )sample_time( from = "1970-01-01 00:00:00", to = Sys.time(), size = 1, as_POSIXct = TRUE, tz = "", ... )
from |
Earliest date-time (as string).
Default: |
to |
Latest date-time (as string).
Default: |
size |
Size of time samples to draw.
Default: |
as_POSIXct |
Boolean: Return calendar time ("POSIXct") object?
Default: |
tz |
Time zone.
Default: |
... |
Other arguments.
(Use for specifying |
By default, sample_time draws n = 1
random calendar time (as a "POSIXct" object) in the range
from = "1970-01-01 00:00:00"
to = Sys.time() (current time).
Both from and to currently
need to be scalars (i.e., with a length of 1).
If as_POSIXct = FALSE, a local time ("POSIXlt") object is returned
(as a list).
The tz argument allows specifying time zones
(see Sys.timezone() for current setting
and OlsonNames() for options.)
A vector of class "POSIXct" or "POSIXlt".
Other sampling functions:
coin(),
dice(),
dice_2(),
sample_char(),
sample_date()
# Basics: sample_time() sample_time(size = 10) # Specific ranges: sort(sample_time(from = (Sys.time() - 60), size = 10)) # within last minute sort(sample_time(from = (Sys.time() - 1 * 60 * 60), size = 10)) # within last hour sort(sample_time(from = Sys.time(), to = (Sys.time() + 1 * 60 * 60), size = 10, replace = FALSE)) # within next hour sort(sample_time(from = "2020-12-31 00:00:00 CET", to = "2020-12-31 00:00:01 CET", size = 10, replace = TRUE)) # within 1 sec range # Local time (POSIXlt) objects (as list): (lt_sample <- sample_time(as_POSIXct = FALSE)) unlist(lt_sample) # Time zones: sample_time(size = 3, tz = "UTC") sample_time(size = 3, tz = "America/Los_Angeles") # Note: Oddity with sample(): sort(sample_time(from = "2020-12-31 00:00:00 CET", to = "2020-12-31 00:00:00 CET", size = 10, replace = TRUE)) # range of 0! # see sample(9:9, size = 10, replace = TRUE)# Basics: sample_time() sample_time(size = 10) # Specific ranges: sort(sample_time(from = (Sys.time() - 60), size = 10)) # within last minute sort(sample_time(from = (Sys.time() - 1 * 60 * 60), size = 10)) # within last hour sort(sample_time(from = Sys.time(), to = (Sys.time() + 1 * 60 * 60), size = 10, replace = FALSE)) # within next hour sort(sample_time(from = "2020-12-31 00:00:00 CET", to = "2020-12-31 00:00:01 CET", size = 10, replace = TRUE)) # within 1 sec range # Local time (POSIXlt) objects (as list): (lt_sample <- sample_time(as_POSIXct = FALSE)) unlist(lt_sample) # Time zones: sample_time(size = 3, tz = "UTC") sample_time(size = 3, tz = "America/Los_Angeles") # Note: Oddity with sample(): sort(sample_time(from = "2020-12-31 00:00:00 CET", to = "2020-12-31 00:00:00 CET", size = 10, replace = TRUE)) # range of 0! # see sample(9:9, size = 10, replace = TRUE)
t_1 is a fictitious dataset to practice tidying data.
t_1t_1
A table with 8 cases (rows) and 9 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/t_1.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
t_2 is a fictitious dataset to practice tidying data.
t_2t_2
A table with 8 cases (rows) and 5 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/t_2.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
t_3 is a fictitious dataset to practice tidying data.
t_3t_3
A table with 16 cases (rows) and 6 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/t_3.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_4,
table6,
table7,
table8,
table9,
tb
t_4 is a fictitious dataset to practice tidying data.
t_4t_4
A table with 16 cases (rows) and 8 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/t_4.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
table6,
table7,
table8,
table9,
tb
t3 is a fictitious dataset to practice importing and joining data
(from a CSV file).
t3t3
A table with 10 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/t3.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
t4 is a fictitious dataset to practice importing and joining data
(from a CSV file).
t4t4
A table with 10 cases (rows) and 4 variables (columns).
See CSV data at http://rpository.com/ds4psy/data/t4.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
table6 is a fictitious dataset to practice reshaping and tidying data.
table6table6
A table with 6 cases (rows) and 2 variables (columns).
This dataset is a further variant of the table1 to table5 datasets
of the tidyr package.
See CSV data at http://rpository.com/ds4psy/data/table6.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table7,
table8,
table9,
tb
table7 is a fictitious dataset to practice reshaping and tidying data.
table7table7
A table with 6 cases (rows) and 1 (horrendous) variable (column).
This dataset is a further variant of the table1 to table5 datasets
of the tidyr package.
See CSV data at http://rpository.com/ds4psy/data/table7.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table8,
table9,
tb
table9 is a fictitious dataset to practice reshaping and tidying data.
table8table8
A table with 3 cases (rows) and 5 variables (columns).
This dataset is a further variant of the table1 to table5 datasets
of the tidyr package.
See CSV data at http://rpository.com/ds4psy/data/table8.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table9,
tb
table9 is a fictitious dataset to practice reshaping and tidying data.
table9table9
A 3 x 2 x 2 array (of type "xtabs") with 2940985206 elements (frequency counts).
This dataset is a further variant of the table1 to table5 datasets
of the tidyr package.
Generated by using stats::xtabs(formula = count ~., data = tidyr::table2).
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
tb
tb is a fictitious set of data describing
100 non-existing, but otherwise ordinary people.
tbtb
A table with 100 cases (rows) and 5 variables (columns).
Codebook
The table contains 5 columns/variables:
1. id: Participant ID.
2. age: Age (in years).
3. height: Height (in cm).
4. shoesize: Shoe size (EU standard).
5. IQ: IQ score (according Raven's Regressive Tables).
tb was originally created to practice loops and iterations
(as a CSV file).
See CSV data file at http://rpository.com/ds4psy/data/tb.csv.
Other datasets:
Bushisms,
Trumpisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9
x into its characterstext_to_chars splits a string of text x
(consisting of one or more character strings)
into a vector of its individual characters.
text_to_chars(x, rm_specials = FALSE, sep = "")text_to_chars(x, rm_specials = FALSE, sep = "")
x |
A string of text (required). |
rm_specials |
Boolean: Remove special characters?
Default: |
sep |
Character to insert between the elements
of a multi-element character vector as input |
If rm_specials = TRUE,
most special (or non-word) characters are
removed. (Note that this currently works
without using regular expressions.)
text_to_chars is an inverse function of chars_to_text.
A character vector (containing individual characters).
chars_to_text for combining character vectors into text;
text_to_sentences for splitting text into a vector of sentences;
text_to_words for splitting text into a vector of words;
count_chars for counting the frequency of characters;
count_words for counting the frequency of words;
strsplit for splitting strings.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
s3 <- c("A 1st sentence.", "The 2nd sentence.", "A 3rd --- and FINAL --- sentence.") text_to_chars(s3) text_to_chars(s3, sep = "\n") text_to_chars(s3, rm_specials = TRUE)s3 <- c("A 1st sentence.", "The 2nd sentence.", "A 3rd --- and FINAL --- sentence.") text_to_chars(s3) text_to_chars(s3, sep = "\n") text_to_chars(s3, rm_specials = TRUE)
x into sentencestext_to_sentences splits text x
(consisting of one or more character strings)
into a vector of its constituting sentences.
text_to_sentences( x, sep = " ", split_delim = "\\.|\\?|!", force_delim = FALSE )text_to_sentences( x, sep = " ", split_delim = "\\.|\\?|!", force_delim = FALSE )
x |
A string of text (required), typically a character vector. |
sep |
A character inserted as separator/delimiter
between elements when collapsing multi-element strings of |
split_delim |
Sentence delimiters (as regex)
used to split the collapsed string of |
force_delim |
Boolean: Enforce splitting at |
The splits of x will occur at given punctuation marks
(provided as a regular expression, default: split_delim = "\.|\?|!").
Empty leading and trailing spaces are removed before returning
a vector of the remaining character sequences (i.e., the sentences).
The Boolean argument force_delim distinguishes between
two splitting modes:
If force_delim = FALSE (as per default),
a standard sentence-splitting pattern is assumed:
A sentence delimiter in split_delim must be followed by
one or more blank spaces and a capital letter starting the next sentence.
Sentence delimiters in split_delim are not removed
from the output.
If force_delim = TRUE,
the function enforces splits at each delimiter in split_delim.
For instance, any dot (i.e., the metacharacter "\.") is
interpreted as a full stop, so that sentences containing dots
mid-sentence (e.g., for abbreviations, etc.) are split into parts.
Sentence delimiters in split_delim are removed
from the output.
Internally, text_to_sentences first uses paste
to collapse strings (adding sep between elements) and then
strsplit to split strings at split_delim.
A character vector (of sentences).
text_to_words for splitting text into a vector of words;
text_to_chars for splitting text into a vector of characters;
count_words for counting the frequency of words;
strsplit for splitting strings.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_words(),
transl33t(),
words_to_text()
x <- c("A first sentence. Exclamation sentence!", "Any questions? But etc. can be tricky. A fourth --- and final --- sentence.") text_to_sentences(x) text_to_sentences(x, force_delim = TRUE) # Changing split delimiters: text_to_sentences(x, split_delim = "\\.") # only split at "." text_to_sentences("Buy apples, berries, and coconuts.") text_to_sentences("Buy apples, berries; and coconuts.", split_delim = ",|;|\\.", force_delim = TRUE) text_to_sentences(c("123. 456? 789! 007 etc."), force_delim = TRUE) # Split multi-element strings (w/o punctuation): e3 <- c("12", "34", "56") text_to_sentences(e3, sep = " ") # Default: Collapse strings adding 1 space, but: text_to_sentences(e3, sep = ".", force_delim = TRUE) # insert sep and force split. # Punctuation within sentences: text_to_sentences("Dr. who is left intact.") text_to_sentences("Dr. Who is problematic.")x <- c("A first sentence. Exclamation sentence!", "Any questions? But etc. can be tricky. A fourth --- and final --- sentence.") text_to_sentences(x) text_to_sentences(x, force_delim = TRUE) # Changing split delimiters: text_to_sentences(x, split_delim = "\\.") # only split at "." text_to_sentences("Buy apples, berries, and coconuts.") text_to_sentences("Buy apples, berries; and coconuts.", split_delim = ",|;|\\.", force_delim = TRUE) text_to_sentences(c("123. 456? 789! 007 etc."), force_delim = TRUE) # Split multi-element strings (w/o punctuation): e3 <- c("12", "34", "56") text_to_sentences(e3, sep = " ") # Default: Collapse strings adding 1 space, but: text_to_sentences(e3, sep = ".", force_delim = TRUE) # insert sep and force split. # Punctuation within sentences: text_to_sentences("Dr. who is left intact.") text_to_sentences("Dr. Who is problematic.")
x into wordstext_to_words splits a string of text x
(consisting of one or more character strings)
into a vector of its constituting words.
text_to_words(x)text_to_words(x)
x |
A string of text (required), typically a character vector. |
text_to_words removes all (standard) punctuation marks
and empty spaces in the resulting text parts,
before returning a vector of the remaining character symbols
(as its words).
Internally, text_to_words uses strsplit to
split strings at punctuation marks (split = "[[:punct:]]")
and blank spaces (split = "( ){1,}").
A character vector (of words).
text_to_words for splitting a text into its words;
text_to_sentences for splitting text into a vector of sentences;
text_to_chars for splitting text into a vector of characters;
count_words for counting the frequency of words;
strsplit for splitting strings.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
transl33t(),
words_to_text()
# Default: x <- c("Hello!", "This is a 1st sentence.", "This is the 2nd sentence.", "The end.") text_to_words(x)# Default: x <- c("Hello!", "This is a 1st sentence.", "This is the 2nd sentence.", "The end.") text_to_words(x)
theme_clean provides an alternative ds4psy theme
to use in ggplot2 commands.
theme_clean( base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22, col_title = grey(0, 1), col_panel = grey(0.85, 1), col_gridx = grey(1, 1), col_gridy = grey(1, 1), col_ticks = grey(0.1, 1) )theme_clean( base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22, col_title = grey(0, 1), col_panel = grey(0.85, 1), col_gridx = grey(1, 1), col_gridy = grey(1, 1), col_ticks = grey(0.1, 1) )
base_size |
Base font size (optional, numeric).
Default: |
base_family |
Base font family (optional, character).
Default: |
base_line_size |
Base line size (optional, numeric).
Default: |
base_rect_size |
Base rectangle size (optional, numeric).
Default: |
col_title |
Color of plot title (and tag).
Default: |
col_panel |
Color of panel background(s).
Default: |
col_gridx |
Color of (major) panel lines (through x/vertical).
Default: |
col_gridy |
Color of (major) panel lines (through y/horizontal).
Default: |
col_ticks |
Color of axes text and ticks.
Default: |
theme_clean is more minimal than theme_ds4psy
and fills panel backgrounds with a color col_panel.
This theme works well for plots with multiple panels, strong colors and bright color accents, but is of limited use with transparent colors.
A ggplot2 theme.
theme_ds4psy for default theme.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_ds4psy(),
theme_empty()
# Plotting iris dataset (using ggplot2, theme_grau, and unikn colors): library('ggplot2') # theme_clean() requires ggplot2 library('unikn') # for colors and usecol() function ggplot(datasets::iris) + geom_jitter(aes(x = Sepal.Length, y = Sepal.Width, color = Species), size = 3, alpha = 3/4) + facet_wrap(~Species) + scale_color_manual(values = usecol(pal = c(Pinky, Karpfenblau, Seegruen))) + labs(tag = "B", title = "Iris sepals", caption = "Data from datasets::iris") + coord_fixed(ratio = 3/2) + theme_clean()# Plotting iris dataset (using ggplot2, theme_grau, and unikn colors): library('ggplot2') # theme_clean() requires ggplot2 library('unikn') # for colors and usecol() function ggplot(datasets::iris) + geom_jitter(aes(x = Sepal.Length, y = Sepal.Width, color = Species), size = 3, alpha = 3/4) + facet_wrap(~Species) + scale_color_manual(values = usecol(pal = c(Pinky, Karpfenblau, Seegruen))) + labs(tag = "B", title = "Iris sepals", caption = "Data from datasets::iris") + coord_fixed(ratio = 3/2) + theme_clean()
theme_ds4psy provides a generic ds4psy theme
to use in ggplot2 commands.
theme_ds4psy( base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22, col_title = grey(0, 1), col_txt_1 = grey(0.1, 1), col_txt_2 = grey(0.2, 1), col_txt_3 = grey(0.1, 1), col_bgrnd = "transparent", col_panel = grey(1, 1), col_strip = "transparent", col_axes = grey(0, 1), col_gridx = grey(0.75, 1), col_gridy = grey(0.75, 1), col_brdrs = "transparent" )theme_ds4psy( base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22, col_title = grey(0, 1), col_txt_1 = grey(0.1, 1), col_txt_2 = grey(0.2, 1), col_txt_3 = grey(0.1, 1), col_bgrnd = "transparent", col_panel = grey(1, 1), col_strip = "transparent", col_axes = grey(0, 1), col_gridx = grey(0.75, 1), col_gridy = grey(0.75, 1), col_brdrs = "transparent" )
base_size |
Base font size (optional, numeric).
Default: |
base_family |
Base font family (optional, character).
Default: |
base_line_size |
Base line size (optional, numeric).
Default: |
base_rect_size |
Base rectangle size (optional, numeric).
Default: |
col_title |
Color of plot title (and tag).
Default: |
col_txt_1 |
Color of primary text (headings and axis labels).
Default: |
col_txt_2 |
Color of secondary text (caption, legend, axes labels/ticks).
Default: |
col_txt_3 |
Color of other text (facet strip labels).
Default: |
col_bgrnd |
Color of plot background.
Default: |
col_panel |
Color of panel background(s).
Default: |
col_strip |
Color of facet strips.
Default: |
col_axes |
Color of (x and y) axes.
Default: |
col_gridx |
Color of (major and minor) panel lines (through x/vertical).
Default: |
col_gridy |
Color of (major and minor) panel lines (through y/horizontal).
Default: |
col_brdrs |
Color of (panel and strip) borders.
Default: |
The theme is lightweight and no-nonsense, but somewhat opinionated (e.g., in using transparency and grid lines, and relying on grey tones for emphasizing data with color accents).
Basic sizes and the colors of text elements, backgrounds, and lines can be specified. However, excessive customization rarely yields aesthetic improvements over the standard ggplot2 themes.
A ggplot2 theme.
unikn::theme_unikn inspired the current theme.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_empty()
# Plotting iris dataset (using ggplot2 and unikn): library('ggplot2') # theme_ds4psy() requires ggplot2 library('unikn') # for colors and usecol() function ggplot(datasets::iris) + geom_jitter(aes(x = Petal.Length, y = Petal.Width, color = Species), size = 3, alpha = 2/3) + scale_color_manual(values = usecol(pal = c(Pinky, Seeblau, Seegruen))) + labs(title = "Iris petals", subtitle = "The subtitle of this plot", caption = "Data from datasets::iris") + theme_ds4psy() ggplot(datasets::iris) + geom_jitter(aes(x = Sepal.Length, y = Sepal.Width, color = Species), size = 3, alpha = 2/3) + facet_wrap(~Species) + scale_color_manual(values = usecol(pal = c(Pinky, Seeblau, Seegruen))) + labs(tag = "A", title = "Iris sepals", subtitle = "Demo plot with facets and default colors", caption = "Data from datasets::iris") + coord_fixed(ratio = 3/2) + theme_ds4psy() # A unikn::Seeblau look: ggplot(datasets::iris) + geom_jitter(aes(x = Sepal.Length, y = Sepal.Width, color = Species), size = 3, alpha = 2/3) + facet_wrap(~Species) + scale_color_manual(values = usecol(pal = c(Pinky, Seeblau, Seegruen))) + labs(tag = "B", title = "Iris sepals", subtitle = "Demo plot in unikn::Seeblau colors", caption = "Data from datasets::iris") + coord_fixed(ratio = 3/2) + theme_ds4psy(col_title = pal_seeblau[[4]], col_strip = pal_seeblau[[1]], col_brdrs = Grau)# Plotting iris dataset (using ggplot2 and unikn): library('ggplot2') # theme_ds4psy() requires ggplot2 library('unikn') # for colors and usecol() function ggplot(datasets::iris) + geom_jitter(aes(x = Petal.Length, y = Petal.Width, color = Species), size = 3, alpha = 2/3) + scale_color_manual(values = usecol(pal = c(Pinky, Seeblau, Seegruen))) + labs(title = "Iris petals", subtitle = "The subtitle of this plot", caption = "Data from datasets::iris") + theme_ds4psy() ggplot(datasets::iris) + geom_jitter(aes(x = Sepal.Length, y = Sepal.Width, color = Species), size = 3, alpha = 2/3) + facet_wrap(~Species) + scale_color_manual(values = usecol(pal = c(Pinky, Seeblau, Seegruen))) + labs(tag = "A", title = "Iris sepals", subtitle = "Demo plot with facets and default colors", caption = "Data from datasets::iris") + coord_fixed(ratio = 3/2) + theme_ds4psy() # A unikn::Seeblau look: ggplot(datasets::iris) + geom_jitter(aes(x = Sepal.Length, y = Sepal.Width, color = Species), size = 3, alpha = 2/3) + facet_wrap(~Species) + scale_color_manual(values = usecol(pal = c(Pinky, Seeblau, Seegruen))) + labs(tag = "B", title = "Iris sepals", subtitle = "Demo plot in unikn::Seeblau colors", caption = "Data from datasets::iris") + coord_fixed(ratio = 3/2) + theme_ds4psy(col_title = pal_seeblau[[4]], col_strip = pal_seeblau[[1]], col_brdrs = Grau)
theme_empty provides an empty (blank) theme
to use in ggplot2 commands.
theme_empty( font_size = 12, font_family = "", rel_small = 12/14, plot_mar = c(0, 0, 0, 0) )theme_empty( font_size = 12, font_family = "", rel_small = 12/14, plot_mar = c(0, 0, 0, 0) )
font_size |
Overall font size.
Default: |
font_family |
Base font family.
Default: |
rel_small |
Relative size of smaller text.
Default: |
plot_mar |
Plot margin sizes (on top, right, bottom, left).
Default: |
theme_empty shows nothing but the plot panel.
theme_empty is based on
theme_nothing of the cowplot package
and uses
theme_void of the ggplot2 package.
A ggplot2 theme.
cowplot::theme_nothing is the inspiration and source of this theme.
Other plot functions:
plot_charmap(),
plot_chars(),
plot_circ_points(),
plot_fn(),
plot_fun(),
plot_n(),
plot_text(),
plot_tiles(),
theme_clean(),
theme_ds4psy()
# Plotting iris dataset (using ggplot2): library('ggplot2') # theme_empty() requires ggplot2 ggplot(datasets::iris) + geom_point(aes(x = Petal.Length, y = Petal.Width, color = Species), size = 4, alpha = 1/2) + scale_color_manual(values = c("firebrick3", "deepskyblue3", "olivedrab3")) + labs(title = "NOT SHOWN: Title", subtitle = "NOT SHOWN: Subtitle", caption = "NOT SHOWN: Data from datasets::iris") + theme_empty(plot_mar = c(2, 0, 1, 0)) # margin lines (top, right, bot, left)# Plotting iris dataset (using ggplot2): library('ggplot2') # theme_empty() requires ggplot2 ggplot(datasets::iris) + geom_point(aes(x = Petal.Length, y = Petal.Width, color = Species), size = 4, alpha = 1/2) + scale_color_manual(values = c("firebrick3", "deepskyblue3", "olivedrab3")) + labs(title = "NOT SHOWN: Title", subtitle = "NOT SHOWN: Subtitle", caption = "NOT SHOWN: Data from datasets::iris") + theme_empty(plot_mar = c(2, 0, 1, 0)) # margin lines (top, right, bot, left)
transl33t translates text into leet (or l33t) slang
given a set of rules.
transl33t(txt, rules = l33t_rul35, in_case = "no", out_case = "no")transl33t(txt, rules = l33t_rul35, in_case = "no", out_case = "no")
txt |
The text (character string) to translate. |
rules |
Rules which existing character in |
in_case |
Change case of input string |
out_case |
Change case of output string.
Default: |
The current version of transl33t only uses base R commands,
rather than the stringr package.
A character vector.
l33t_rul35 for default rules used;
invert_rules for inverting rules.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
words_to_text()
# Use defaults: transl33t(txt = "hello world") transl33t(txt = c(letters)) transl33t(txt = c(LETTERS)) # Specify rules: transl33t(txt = "hello world", rules = c("e" = "3", "l" = "1", "o" = "0")) # Set input and output case: transl33t(txt = "hello world", in_case = "up", rules = c("e" = "3", "l" = "1", "o" = "0")) # e only capitalized transl33t(txt = "hEllo world", in_case = "lo", out_case = "up", rules = c("e" = "3", "l" = "1", "o" = "0")) # e transl33ted# Use defaults: transl33t(txt = "hello world") transl33t(txt = c(letters)) transl33t(txt = c(LETTERS)) # Specify rules: transl33t(txt = "hello world", rules = c("e" = "3", "l" = "1", "o" = "0")) # Set input and output case: transl33t(txt = "hello world", in_case = "up", rules = c("e" = "3", "l" = "1", "o" = "0")) # e only capitalized transl33t(txt = "hEllo world", in_case = "lo", out_case = "up", rules = c("e" = "3", "l" = "1", "o" = "0")) # e transl33ted
Trumpisms contains characteristic words and phrases used
by U.S. president Donald J. Trump (the 45th and 47th president of the United States of America)
during his first presidency (ranging from January 20, 2017, to January 20, 2021).
TrumpismsTrumpisms
A vector of type character
with length(Trumpisms) = 168
(on 2021-01-28).
See https://en.wikiquote.org/wiki/Donald_Trump for a more recent collection of attributed and disputed quotes.
Data originally based on a collection of Donald Trump's 20 most frequently used words on https://www.yourdictionary.com
and expanded by interviews, public speeches, and Twitter tweets from https://twitter.com/realDonaldTrump.
Other datasets:
Bushisms,
countries,
data_1,
data_2,
data_t1,
data_t1_de,
data_t1_tab,
data_t2,
data_t3,
data_t4,
dt_10,
exp_num_dt,
exp_wide,
falsePosPsy_all,
fame,
flowery,
fruits,
i2ds_survey,
outliers,
pi_100k,
posPsy_AHI_CESD,
posPsy_long,
posPsy_p_info,
posPsy_wide,
t3,
t4,
t_1,
t_2,
t_3,
t_4,
table6,
table7,
table8,
table9,
tb
Umlaut provides the German Umlaut letters (aka. diaeresis/diacritic)
as a named character vector.
UmlautUmlaut
An object of class character of length 7.
For Unicode details, see https://home.unicode.org/,
For details on German Umlaut letters (aka. diaeresis/diacritic), see https://en.wikipedia.org/wiki/Diaeresis_(diacritic) and https://en.wikipedia.org/wiki/Germanic_umlaut.
Other text objects and functions:
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t(),
words_to_text()
Umlaut names(Umlaut) paste0("Hansj", Umlaut["o"], "rg i", Umlaut["s"], "t s", Umlaut["u"], "sse ", Umlaut["A"], "pfel.") paste0("Das d", Umlaut["u"], "nne M", Umlaut["a"], "dchen l", Umlaut["a"], "chelt.") paste0("Der b", Umlaut["o"], "se Mann macht ", Umlaut["u"], "blen ", Umlaut["A"], "rger.") paste0("Das ", Umlaut["U"], "ber-Ich ist ", Umlaut["a"], "rgerlich.")Umlaut names(Umlaut) paste0("Hansj", Umlaut["o"], "rg i", Umlaut["s"], "t s", Umlaut["u"], "sse ", Umlaut["A"], "pfel.") paste0("Das d", Umlaut["u"], "nne M", Umlaut["a"], "dchen l", Umlaut["a"], "chelt.") paste0("Der b", Umlaut["o"], "se Mann macht ", Umlaut["u"], "blen ", Umlaut["A"], "rger.") paste0("Das ", Umlaut["U"], "ber-Ich ist ", Umlaut["a"], "rgerlich.")
what_date provides a satisficing version of
Sys.Date() that is sufficient for most purposes.
what_date( when = NA, rev = FALSE, as_string = TRUE, sep = "-", month_form = "m", tz = "" )what_date( when = NA, rev = FALSE, as_string = TRUE, sep = "-", month_form = "m", tz = "" )
when |
Date(s) (as a scalar or vector).
Default: |
rev |
Boolean: Reverse date (to
Default: |
as_string |
Boolean: Return as character string?
Default: |
sep |
Character: Separator to use.
Default: |
month_form |
Character: Month format.
Default: |
tz |
Time zone.
Default: |
By default, what_date returns either
Sys.Date() or the dates provided by when
as a character string (using current system settings and
sep for formatting).
If as_string = FALSE, a "Date" object is returned.
The tz argument allows specifying time zones
(see Sys.timezone() for current setting
and OlsonNames() for options.)
However, tz is merely used to represent the
dates provided to the when argument.
Thus, there currently is no active conversion
of dates into other time zones
(see the today function of lubridate package).
A character string or object of class "Date".
what_wday() function to obtain (week)days;
what_time() function to obtain times;
cur_time() function to print the current time;
cur_date() function to print the current date;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
what_date() what_date(sep = "/") what_date(rev = TRUE) what_date(rev = TRUE, sep = ".") what_date(rev = TRUE, sep = " ", month_form = "B") # with "POSIXct" times: what_date(when = Sys.time()) # with time vector (of "POSIXct" objects): ts <- c("1969-07-13 13:53 CET", "2020-12-31 23:59:59") what_date(ts) what_date(ts, rev = TRUE, sep = ".") what_date(ts, rev = TRUE, month_form = "b") # return a "Date" object: dt <- what_date(as_string = FALSE) class(dt) # with time zone: ts <- ISOdate(2020, 12, 24, c(0, 12)) # midnight and midday UTC what_date(when = ts, tz = "Pacific/Honolulu", as_string = FALSE)what_date() what_date(sep = "/") what_date(rev = TRUE) what_date(rev = TRUE, sep = ".") what_date(rev = TRUE, sep = " ", month_form = "B") # with "POSIXct" times: what_date(when = Sys.time()) # with time vector (of "POSIXct" objects): ts <- c("1969-07-13 13:53 CET", "2020-12-31 23:59:59") what_date(ts) what_date(ts, rev = TRUE, sep = ".") what_date(ts, rev = TRUE, month_form = "b") # return a "Date" object: dt <- what_date(as_string = FALSE) class(dt) # with time zone: ts <- ISOdate(2020, 12, 24, c(0, 12)) # midnight and midday UTC what_date(when = ts, tz = "Pacific/Honolulu", as_string = FALSE)
what_month provides a satisficing version of
to determine the month corresponding to a given date.
what_month(when = Sys.Date(), abbr = FALSE, as_integer = FALSE)what_month(when = Sys.Date(), abbr = FALSE, as_integer = FALSE)
when |
Date (as a scalar or vector).
Default: |
abbr |
Boolean: Return abbreviated?
Default: |
as_integer |
Boolean: Return as integer?
Default: |
what_month returns the month
of when or Sys.Date()
(as a name or number).
what_week() function to obtain weeks;
what_date() function to obtain dates;
cur_time() function to print the current time;
cur_date() function to print the current date;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_time(),
what_wday(),
what_week(),
what_year(),
zodiac()
what_month() what_month(abbr = TRUE) what_month(as_integer = TRUE) # with date vector (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_month(when = ds) what_month(when = ds, abbr = TRUE, as_integer = FALSE) what_month(when = ds, abbr = TRUE, as_integer = TRUE) # with time vector (strings of POSIXct times): ts <- c("2020-02-29 10:11:12 CET", "2020-12-31 23:59:59") what_month(ts)what_month() what_month(abbr = TRUE) what_month(as_integer = TRUE) # with date vector (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_month(when = ds) what_month(when = ds, abbr = TRUE, as_integer = FALSE) what_month(when = ds, abbr = TRUE, as_integer = TRUE) # with time vector (strings of POSIXct times): ts <- c("2020-02-29 10:11:12 CET", "2020-12-31 23:59:59") what_month(ts)
what_time provides a satisficing version of
Sys.time() that is sufficient for most purposes.
what_time(when = NA, seconds = FALSE, as_string = TRUE, sep = ":", tz = "")what_time(when = NA, seconds = FALSE, as_string = TRUE, sep = ":", tz = "")
when |
Time (as a scalar or vector).
Default: |
seconds |
Boolean: Show time with seconds?
Default: |
as_string |
Boolean: Return as character string?
Default: |
sep |
Character: Separator to use.
Default: |
tz |
Time zone.
Default: |
By default, what_time prints a simple version of
when or Sys.time()
as a character string (in "
using current default system settings.
If as_string = FALSE, a "POSIXct"
(calendar time) object is returned.
The tz argument allows specifying time zones
(see Sys.timezone() for current setting
and OlsonNames() for options.)
However, tz is merely used to represent the
times provided to the when argument.
Thus, there currently is no active conversion
of times into other time zones
(see the now function of lubridate package).
A character string or object of class "POSIXct".
cur_time() function to print the current time;
cur_date() function to print the current date;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_wday(),
what_week(),
what_year(),
zodiac()
what_time() # with vector (of "POSIXct" objects): tm <- c("2020-02-29 01:02:03", "2020-12-31 14:15:16") what_time(tm) # with time zone: ts <- ISOdate(2020, 12, 24, c(0, 12)) # midnight and midday UTC t1 <- what_time(when = ts, tz = "Pacific/Honolulu") t1 # time display changed, due to tz # return "POSIXct" object(s): # Same time in differen tz: t2 <- what_time(as.POSIXct("2020-02-29 10:00:00"), as_string = FALSE, tz = "Pacific/Honolulu") format(t2, "%F %T %Z (UTF %z)") # from string: t3 <- what_time("2020-02-29 10:00:00", as_string = FALSE, tz = "Pacific/Honolulu") format(t3, "%F %T %Z (UTF %z)")what_time() # with vector (of "POSIXct" objects): tm <- c("2020-02-29 01:02:03", "2020-12-31 14:15:16") what_time(tm) # with time zone: ts <- ISOdate(2020, 12, 24, c(0, 12)) # midnight and midday UTC t1 <- what_time(when = ts, tz = "Pacific/Honolulu") t1 # time display changed, due to tz # return "POSIXct" object(s): # Same time in differen tz: t2 <- what_time(as.POSIXct("2020-02-29 10:00:00"), as_string = FALSE, tz = "Pacific/Honolulu") format(t2, "%F %T %Z (UTF %z)") # from string: t3 <- what_time("2020-02-29 10:00:00", as_string = FALSE, tz = "Pacific/Honolulu") format(t3, "%F %T %Z (UTF %z)")
what_wday provides a satisficing version of
to determine the day of the week
corresponding to a given date.
what_wday(when = Sys.Date(), abbr = FALSE)what_wday(when = Sys.Date(), abbr = FALSE)
when |
Date (as a scalar or vector).
Default: |
abbr |
Boolean: Return abbreviated?
Default: |
what_wday returns the name of the weekday
of when or of Sys.Date()
(as a character string).
what_date() function to obtain dates;
what_time() function to obtain times;
cur_time() function to print the current time;
cur_date() function to print the current date;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_week(),
what_year(),
zodiac()
what_wday() what_wday(abbr = TRUE) what_wday(Sys.Date() + -1:1) # Date (as vector) what_wday(Sys.time()) # POSIXct what_wday("2020-02-29") # string (of valid date) what_wday(20200229) # number (of valid date) # date vector (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_wday(when = ds) what_wday(when = ds, abbr = TRUE) # time vector (strings of POSIXct times): ts <- c("1969-07-13 13:53 CET", "2020-12-31 23:59:59") what_wday(ts) # fame data: greta_dob <- as.Date(fame[grep(fame$name, pattern = "Greta") , ]$DOB, "%B %d, %Y") what_wday(greta_dob) # Friday, of course.what_wday() what_wday(abbr = TRUE) what_wday(Sys.Date() + -1:1) # Date (as vector) what_wday(Sys.time()) # POSIXct what_wday("2020-02-29") # string (of valid date) what_wday(20200229) # number (of valid date) # date vector (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_wday(when = ds) what_wday(when = ds, abbr = TRUE) # time vector (strings of POSIXct times): ts <- c("1969-07-13 13:53 CET", "2020-12-31 23:59:59") what_wday(ts) # fame data: greta_dob <- as.Date(fame[grep(fame$name, pattern = "Greta") , ]$DOB, "%B %d, %Y") what_wday(greta_dob) # Friday, of course.
what_week provides a satisficing version of
to determine the week corresponding to a given date.
what_week(when = Sys.Date(), unit = "year", as_integer = FALSE)what_week(when = Sys.Date(), unit = "year", as_integer = FALSE)
when |
Date (as a scalar or vector).
Default: |
unit |
Character: Unit of week?
Possible values are |
as_integer |
Boolean: Return as integer?
Default: |
what_week returns the week
of when or Sys.Date()
(as a name or number).
what_wday() function to obtain (week)days;
what_date() function to obtain dates;
cur_time() function to print the current time;
cur_date() function to print the current date;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_year(),
zodiac()
what_week() what_week(as_integer = TRUE) # Other dates/times: d1 <- as.Date("2020-12-24") what_week(when = d1, unit = "year") what_week(when = d1, unit = "month") what_week(Sys.time()) # with POSIXct time # with date vector (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_week(when = ds) what_week(when = ds, unit = "month", as_integer = TRUE) what_week(when = ds, unit = "year", as_integer = TRUE) # with time vector (strings of POSIXct times): ts <- c("2020-12-25 10:11:12 CET", "2020-12-31 23:59:59") what_week(ts)what_week() what_week(as_integer = TRUE) # Other dates/times: d1 <- as.Date("2020-12-24") what_week(when = d1, unit = "year") what_week(when = d1, unit = "month") what_week(Sys.time()) # with POSIXct time # with date vector (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_week(when = ds) what_week(when = ds, unit = "month", as_integer = TRUE) what_week(when = ds, unit = "year", as_integer = TRUE) # with time vector (strings of POSIXct times): ts <- c("2020-12-25 10:11:12 CET", "2020-12-31 23:59:59") what_week(ts)
what_year provides a satisficing version of
to determine the year corresponding to a given date.
what_year(when = Sys.Date(), abbr = FALSE, as_integer = FALSE)what_year(when = Sys.Date(), abbr = FALSE, as_integer = FALSE)
when |
Date (as a scalar or vector).
Default: |
abbr |
Boolean: Return abbreviated?
Default: |
as_integer |
Boolean: Return as integer?
Default: |
what_year returns the year
of when or Sys.Date()
(as a name or number).
what_week() function to obtain weeks;
what_month() function to obtain months;
cur_time() function to print the current time;
cur_date() function to print the current date;
now() function of the lubridate package;
Sys.time() function of base R.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
zodiac()
what_year() what_year(abbr = TRUE) what_year(as_integer = TRUE) # with date vectors (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_year(when = ds) what_year(when = ds, abbr = TRUE, as_integer = FALSE) what_year(when = ds, abbr = TRUE, as_integer = TRUE) # with time vector (strings of POSIXct times): ts <- c("2020-02-29 10:11:12 CET", "2020-12-31 23:59:59") what_year(ts)what_year() what_year(abbr = TRUE) what_year(as_integer = TRUE) # with date vectors (as characters): ds <- c("2020-01-01", "2020-02-29", "2020-12-24", "2020-12-31") what_year(when = ds) what_year(when = ds, abbr = TRUE, as_integer = FALSE) what_year(when = ds, abbr = TRUE, as_integer = TRUE) # with time vector (strings of POSIXct times): ts <- c("2020-02-29 10:11:12 CET", "2020-12-31 23:59:59") what_year(ts)
x into a textwords_to_text pastes or collapses
a character string x into a single text string.
words_to_text(x, collapse = " ")words_to_text(x, collapse = " ")
x |
A string of text (required), typically a character vector. |
collapse |
A character string to separate the elements of |
words_to_text is essentially identical to
collapse_chars.
Internally, both functions are wrappers around
paste with a collapse argument.
A text (as a collapsed character vector).
text_to_words for splitting a text into its words;
text_to_sentences for splitting text into a vector of sentences;
text_to_chars for splitting text into a vector of characters;
count_words for counting the frequency of words;
collapse_chars for collapsing character vectors;
strsplit for splitting strings.
Other text objects and functions:
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars(),
count_chars_words(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_sentences(),
text_to_words(),
transl33t()
s <- c("Hello world!", "A 1st sentence.", "A 2nd sentence.", "The end.") words_to_text(s) cat(words_to_text(s, collapse = "\n"))s <- c("Hello world!", "A 1st sentence.", "A 2nd sentence.", "The end.") words_to_text(s) cat(words_to_text(s, collapse = "\n"))
zodiac provides the tropical zodiac sign or symbol
(aka. astrological sign) for given date(s) x.
zodiac( x, out = "en", zodiac_swap_mmdd = c(120, 219, 321, 421, 521, 621, 723, 823, 923, 1023, 1123, 1222) )zodiac( x, out = "en", zodiac_swap_mmdd = c(120, 219, 321, 421, 521, 621, 723, 823, 923, 1023, 1123, 1222) )
x |
Date (as a scalar or vector, required).
If |
out |
Output format (as character).
Available output formats are:
English/Latin ( |
zodiac_swap_mmdd |
Monthly dates on which
the 12 zodiac signs switch (in Default start dates are set to
Alternative start dates for Zodiac signs in some sources include
|
zodiac is flexible by providing different
output formats (in Latin/English, German, or Unicode/HTML,
see out) and allowing to adjust the calendar dates
on which a new zodiac is assigned (via zodiac_swap_mmdd).
Zodiac label or symbol (as a factor).
See https://en.wikipedia.org/wiki/Zodiac or https://de.wikipedia.org/wiki/Tierkreiszeichen for labels, dates, and alternative date ranges.
Zodiac() function of the DescTools package.
Other date and time functions:
change_time(),
change_tz(),
cur_date(),
cur_time(),
days_in_month(),
diff_dates(),
diff_times(),
diff_tz(),
is_leap_year(),
what_date(),
what_month(),
what_time(),
what_wday(),
what_week(),
what_year()
zodiac(Sys.Date()) # Works with vectors: dt <- sample_date(size = 10) zodiac(dt) # Factor levels: levels(zodiac(dt)) # Alternative outputs: zodiac(dt, out = "de") # German/deutsch zodiac(dt, out = "Unicode") # Unicode zodiac(dt, out = "HTML") # HTML # Alternative date breaks: zodiac("2000-08-23") # 0823 is "Virgo" by default zodiac("2000-08-23", # change to 0824 (i.e., August 24): zodiac_swap_mmdd = c(0120, 0219, 0321, 0421, 0521, 0621, 0723, 0824, 0923, 1023, 1123, 1222))zodiac(Sys.Date()) # Works with vectors: dt <- sample_date(size = 10) zodiac(dt) # Factor levels: levels(zodiac(dt)) # Alternative outputs: zodiac(dt, out = "de") # German/deutsch zodiac(dt, out = "Unicode") # Unicode zodiac(dt, out = "HTML") # HTML # Alternative date breaks: zodiac("2000-08-23") # 0823 is "Virgo" by default zodiac("2000-08-23", # change to 0824 (i.e., August 24): zodiac_swap_mmdd = c(0120, 0219, 0321, 0421, 0521, 0621, 0723, 0824, 0923, 1023, 1123, 1222))