Introduction

Dirty data problems: missing values, data manipulation, duplicates, date formats, outliers, spelling.

MISSING VALUES IN R:

1) is.na(): finds missing values. It returns a logical vector with TRUE at every position where the element is NA and FALSE elsewhere. It also returns TRUE for NaN.

example:
    x <- c(NA, 3, 4, NA, NA, NA)
    is.na(x)
output:
    [1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE

2) is.nan(): finds NaN (not a number) values. It returns a logical vector with TRUE at every position where the element is NaN and FALSE elsewhere. NA is not NaN, so NA elements give FALSE.

example:
    x <- c(NA, 3, 4, NA, NA, 0 / 0, 0 / 0)
    is.nan(x)
output:
    [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE

DUPLICATE VALUES IN R:

1) duplicated(): returns a logical vector in which TRUE marks each element of a vector (or row of a data frame) that repeats an earlier one. The first occurrence of a value is not marked.

example:
    x <- c(1, 1, 4, 5, 4, 6)
    duplicated(x)
output:
    [1] FALSE  TRUE FALSE FALSE  TRUE FALSE

Extract duplicate elements:

example:
    x <- c(1, 1, 4, 5, 4, 6)
    x[duplicated(x)]
output:
    [1] 1 4

REMOVING DUPLICATES IN R:

To remove duplicated elements, use !duplicated(), where ! is logical negation:

example:
    x <- c(1, 1, 4, 5, 4, 6)
    x[!duplicated(x)]
output:
    [1] 1 4 5 6

DATES AND TIMES IN R:

1) Sys.Date(): returns the current system date.

syntax:
    Sys.Date()
output:
    [1] "2022-04-20"

2) Sys.timezone(): returns the name of the timezone set on the system where the code is running.

syntax:
    Sys.timezone()
output:
    [1] "Asia/Calcutta"

3) Sys.time(): returns the current system date and time, printed with the timezone.

syntax:
    Sys.time()
output:
    [1] "2022-04-20 11:02:56 IST"

4) as.Date(): creates a date value (without time). Other input layouts can be read by describing them with the format = argument. R is case-sensitive, so the function must be written as.Date(), not as.date().

example:
    mydate <- as.Date("2014-04-30")
    mydate
output:
    [1] "2014-04-30"
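
The format = argument of as.Date() is mentioned above but not shown. The following is a minimal sketch of how it might be used; the input strings and the variable names d1, d2 and bad are made up for illustration and are not from the original notes.

```r
# Parse dates written in different layouts by describing each layout
# with format =. The input strings are made-up sample values.
d1 <- as.Date("30/04/2014", format = "%d/%m/%Y")  # day/month/4-digit year
d2 <- as.Date("04-30-14", format = "%m-%d-%y")    # month-day-2-digit year

print(d1)  # [1] "2014-04-30"
print(d2)  # [1] "2014-04-30"

# A string that does not match the given format becomes NA, not an error,
# so is.na() can be used to find dates that failed to parse.
bad <- as.Date("2014-04-30", format = "%d/%m/%Y")
print(is.na(bad))  # [1] TRUE

# format() converts a Date back to text in a chosen layout.
print(format(d1, "%Y/%m/%d"))  # [1] "2014/04/30"
```

Numeric codes such as %d, %m, %Y and %y are used here because month-name codes depend on the system locale.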