How to add and subtract Dates and Times in R

Introduction

Following the R recipe How to deal with Date and Time in R where we reviewed the base R capabilities to deal with dates and times, we continue exploring how to add and subtract dates and times, also in base R. I plan to cover lubridate and clock packages in the future.

Addition of a scalar to a date

It’s very common that we need to calculate an end date for a process or service knowing only the starting date and the duration. For this, we need to add the number of days (scalar) to a date. Let’s take a look:

start_point <- Sys.Date()
duration_days <- 365                # one year
start_point + duration_days         # [1] "2022-04-20", one year later

duration_days <- 15                 # 15 days
start_point + duration_days         # [1] "2021-05-05", 15 days later

duration_days <- 3 * 30             # 3 months (approx)
start_point + duration_days         # [1] "2021-07-19", 90 days later

duration_days <- 16 * 7             # 16 weeks
start_point + duration_days         # [1] "2021-08-10", 112 days later

class(start_point + duration_days)  # [1] "Date"

As you have seen above, adding a scalar to date produces another date pointing at the future. If you want to add weeks, months, etc. you need to convert it to days first. That’s a bit tricky because three months aren’t exactly 90 days. The exact number of days to add depends on the actual months involved. This has to be pre-computed separately, I’m afraid.

Using a sequence of dates to assist in the calculation

There is a powerful feature in R that I personally like a lot. It’s the capability to create sequence of dates that respect the oddities of the human calendar: not all the months have 30 days and there exists leap years.

Let’s assume we want to send a bill every 27th of every month during 12 months starting on 27th of Feb 2020. The pre-computed dates are:

seq(as.Date("2020-02-27"), by = "month", length.out = 12)
## [1] "2020-02-27" "2020-03-27" "2020-04-27" "2020-05-27" "2020-06-27" "2020-07-27"
## [7] "2020-08-27" "2020-09-27" "2020-10-27" "2020-11-27" "2020-12-27" "2021-01-27"

The same works for the 1st of every month (or any other day) but not for the last day of the month!. See below:

seq(as.Date("2020-01-01"), by = "month", length.out = 12)
## [1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01" "2020-06-01"
## [7] "2020-07-01" "2020-08-01" "2020-09-01" "2020-10-01" "2020-11-01" "2020-12-01"

seq(as.Date("2020-01-31"), by = "month", length.out = 12)
## [1] "2020-01-31" "2020-03-02" "2020-03-31" "2020-05-01" "2020-05-31" "2020-07-01"
## [7] "2020-07-31" "2020-08-31" "2020-10-01" "2020-10-31" "2020-12-01" "2020-12-31")

In the last example, February doesn’t appear while March appears twice. Same for April, June… For the last day of the month, we need a bit trickier solution. Check the next section: how to calculate the last day of the month.

To finish this chapter of sequencing dates, I want to add that we can also increment by days, weeks, quarters and years. We can even indicate the last date of the sequence instead of its length:

seq(as.Date("2020-01-18"), by = "days", length.out = 3)
# [1] "2020-01-18" "2020-01-19" "2020-01-20"

seq(to = as.Date("2020-01-22"), by = "weeks", to = as.Date("2020-02-29"))
# [1] "2020-01-22" "2020-01-29" "2020-02-05" "2020-02-12" "2020-02-19" "2020-02-26"

seq(as.Date("2020-01-22"), by = "quarters", to = as.Date("2020-10-29"))
# [1] "2020-01-22" "2020-04-22" "2020-07-22" "2020-10-22"

seq(as.Date("2020-01-22"), by = "years", to = as.Date("2024-10-29"))
# [1] "2020-01-22" "2021-01-22" "2022-01-22" "2023-01-22" "2024-01-22"

How to calculate the last day of the month

As I said at the beginning, I don’t want to make use of the ubiquitous lubridate package. So this is based on pure base R. I must say this is my own solution because I haven’t seen anything better/shorter so far. If you happen to know one, please let me know.

  1. We start generating a sequence of 365 days from 1st of Jan 2020. You can, of course, adjustn this to your own needs.
  2. Then, we create groups per month using format()
  3. Finally, we extract only the 2nd column.

Voila!!

dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 365)
c(head(dates), tail(dates))
## [1] "2020-01-01" "2020-01-02" "2020-01-03" "2020-01-04" "2020-01-05" "2020-01-06"
## [7] "2020-12-25" "2020-12-26" "2020-12-27" "2020-12-28" "2020-12-29" "2020-12-30"

last_days <- aggregate(dates, by = list(format(dates, "%Y-%m")), FUN = max)
last_days
##    Group.1          x
## 1  2020-01 2020-01-31
## 2  2020-02 2020-02-29
## 3  2020-03 2020-03-31
## 4  2020-04 2020-04-30
## 5  2020-05 2020-05-31
## 6  2020-06 2020-06-30
## 7  2020-07 2020-07-31
## 8  2020-08 2020-08-31
## 9  2020-09 2020-09-30
## 10 2020-10 2020-10-31
## 11 2020-11 2020-11-30
## 12 2020-12 2020-12-30

last_days[,2]
## [1] "2020-01-31" "2020-02-29" "2020-03-31" "2020-04-30" "2020-05-31" "2020-06-30"
## [7] "2020-07-31" "2020-08-31" "2020-09-30" "2020-10-31" "2020-11-30" "2020-12-30"

Notice that I’ve chosen a date with a leap year on purpose to test the solution

Subtracting a scalar from a date

Conversely, we can also subtract a number of days from a date.

as.Date("2020-02-27") - 4    # [1] "2020-02-23"
as.Date("2020-03-01") - 1    # [1] "2020-02-29"  (leap year)
as.Date("2020-03-01") - 90   # [1] "2019-12-02"  (leap year and change in year)

Nevertheless, you cannot generate sequence of dates in reverse (or backwards). And it doesn’t make any sense on second thought.

Subtracting two dates

Here we work with two dates, not with a date and a scalar as before. But the result of that subtraction results in a scalar: the number of days they fall apart. See below:

date1 <- as.Date("2021-02-21")
date2 <- as.Date("2021-07-28")

date1 - date2          # Time difference of -157 days
date2 - date1          # Time difference of 157 days

x <- date2 - date1
class(x)               # [1] "difftime"
attributes(x)
## $class
## [1] "difftime"
## 
## $units
## [1] "days"

Notice that the number of days can be positive or negative depending on the position of the dates. Also, there is a new class: difftime. This class is used to measure the time interval, and its structure contains the value and its unit. So the unit can be changed. The documentation ?difftime shows: “auto”, “secs”, “mins”, “hours”, “days” and “weeks”.

R will choose the most suitable unit for the value, but we can change or overule it:

x                       # Time difference of 157 days
units(x)                # [1] "days"
units(x) <- "hours"     
units(x)                # [1] "hours"
x                       # Time difference of 3768 hours

difftime(date2, date1, units = "weeks")   # Time difference of 22.42857 weeks
difftime(date2, date1, units = "hours")   # Time difference of 3768 hours

Finally, you may remove the complexity of the difftime class and convert it to a number via as.numeric, but it’ll be converted in the current units. If you want to change it you need to specify the units parameter. Check this out:

x                       # Time difference of 157 days
as.numeric(x)           # 157

units(x) <- "hours"
x                       # Time difference of 3768 hours
as.numeric(x)           # 3768

as.numeric(x, units = "days")   # 157
as.numeric(x, units = "weeks")  # 22.42857

And that’s all! If you’ve read up to here, you’ll feel much more comfortable dealing with dates and times in R as of now.

I’d be very happy to hear from your. You can get in touch with me on:

 Share!

 
comments powered by Disqus