How To Install R Packages, Manage CRAN Mirrors, and Masking [2021]

This guide covers everything you need to know to install R packages from CRAN or GitHub, manage mirrors and solve object conflicts in R.

R Package Management

An R package is a collection of R functions, datasets, support files, and compiled code, bundled in a compact and well-defined format. Packages are distributed as compressed files that must be unpacked and placed in the right location on your machine before you can use them in your R project.

You don’t have to do this manually because R can do it for you. Aside from the software itself, the R installation file includes about 30 default or recommended packages, of which about seven are loaded into memory as soon as you start R.

The remaining 23 or so are loaded only on request. All 30 packages are installed in a designated folder on your machine and cover a broad range of tasks, such as data management and statistical analysis.
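To check these numbers on your own installation, here is a minimal sketch. It assumes a standard R installation; the exact counts vary between R versions and platforms, so I am not showing expected output. The variable names are only illustrative.

```r
# List the packages that ship with R (priority "base" or "recommended").
shipped <- installed.packages(priority = c("base", "recommended"))
nrow(shipped)                    # how many shipped packages are installed
table(shipped[, "Priority"])     # split between base and recommended

# Packages that R attaches at startup by default (base itself is always attached).
getOption("defaultPackages")

# Shipped packages that are installed but not currently on the search path.
attached <- sub("^package:", "", search())
setdiff(rownames(shipped), attached)
```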

In addition, because R is open-source, people around the world develop their own packages and make them publicly available.

These packages can be downloaded from the CRAN website or installed directly from within R. The sections below explain how to download and install user-contributed R packages.

Launch R and type the following command in the R console, then press the [Enter] key. 

> search()

Note: You don’t have to type the “>” prompt sign in the command. The “>” is the default R prompt and indicates that R is waiting for your input.

The output of search() (the text that begins on the second line) displays the items on the search path. The number in square brackets is the position of the item immediately to its right. For example, [4] indicates that the grDevices package is the fourth element of the search path.

The other packages on each line don’t have a position number printed next to them, but you can work it out by counting from the bracketed number at the start of the line.

The output can look slightly different on your computer for two reasons. First, your R console window may be a different width, and R wraps the output to fit that width.

Second, you may have other packages loaded, either because you installed them yourself or because the R team has included (or removed) regular packages in newer versions of R. In this guide, I am using version R-4.0.4, which is the latest as of March 2021.

Here is the output from the search() command above, showing the packages automatically loaded when starting R.

[Screenshot: search() output listing the packages loaded at startup]

The global environment “.GlobalEnv” is always in the first position of the search path and is not an R package. The “.GlobalEnv” or Global Environment is where freshly created R objects are placed in memory.

The R base package is always in the last position, and unlike the other packages, it cannot be detached.
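You can confirm both positions yourself. The lines below are a small sketch that stores the search path in a variable (the name sp is just my choice) and looks at its first and last elements. The position of grDevices is 4 only in a default session, so treat that line as something to check rather than a guarantee.

```r
sp <- search()        # the search path as a character vector

sp[1]                 # ".GlobalEnv" is always first
sp[length(sp)]        # "package:base" is always last

# Position of a given package; 4 for grDevices in a default session.
which(sp == "package:grDevices")
```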

Note that if you are using RStudio, you will see an additional element named “tools:rstudio” in addition to the output above. If you’re looking to install RStudio on your system, check out this guide for Windows, macOS, Linux and FreeBSD:

To keep memory usage low, R does not load all 30 installed packages at startup.

If you want to use a package that is installed but not yet loaded into memory, you can load it with the library function. For instance, MASS is one of the 30 installed packages that is not loaded at startup. I will use the library command to load it:

> library(MASS)

As you can see, the package MASS is loaded in the position [2] now, and all the other packages are pushed one position lower. 

[Screenshot: search() output with the MASS package loaded at position 2]

The position of each package in the search path matters because it sets the priority when two packages contain a function with the same name: R uses the definition it finds first. When a newly loaded package hides an existing function in this way, you will receive a message. Once a package is loaded into memory, its functions remain available for the rest of the R session.
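The shift in positions can be checked directly. This is a minimal sketch that assumes a fresh session in which MASS is installed (it ships with standard R installations) but not yet loaded; the variable names before and after are only illustrative.

```r
before <- search()     # search path before loading MASS
library(MASS)
after <- search()      # search path after loading MASS

after[2]                       # "package:MASS" in a fresh session
setdiff(after, before)         # the only new entry on the path

# Dropping position 2 gives back the old path: everything else moved down by one.
identical(after[-2], before)   # TRUE
```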

Packages that are no longer needed can be removed from the search path. To unload a package from memory, use the detach command:

> detach(package:MASS)

You can also remove a package from memory by specifying its position number, as shown below:

> detach(pos=2)

If you run both of the previous commands, the second one removes the stats package from the R search path, because stats moves up to position 2 once MASS is detached. Once it is removed, calling a function from this package produces an error message.

For more details, check the help page of detach. You can load the package again anytime you want without any adverse effects on your system using the command below:

> library(stats)

As mentioned before, the base package and .GlobalEnv cannot be detached.

Packages that you load manually using the library command are automatically detached when you quit R and are not reloaded when you start another R session. Keep this in mind, as it’s essential.
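Calling detach on a package that is not on the search path raises an error, so in scripts it can help to check first. Below is a sketch of one way to do that. The helper name detach_if_attached is my own and is not part of R; it relies only on search() and on detach() with character.only = TRUE, and it assumes MASS is installed.

```r
# Hypothetical helper: detach a package only if it is on the search path.
# Returns TRUE (invisibly) if something was detached, FALSE otherwise.
detach_if_attached <- function(pkg) {
  entry <- paste0("package:", pkg)          # e.g. "package:MASS"
  if (entry %in% search()) {
    detach(entry, character.only = TRUE)    # entry is passed as a string
    return(invisible(TRUE))
  }
  invisible(FALSE)
}

library(MASS)
first  <- detach_if_attached("MASS")   # MASS was loaded, so it is detached
second <- detach_if_attached("MASS")   # nothing left to detach, no error

c(first, second)                       # TRUE FALSE
"package:MASS" %in% search()           # FALSE
```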

What are R User-contributed packages?

The R user-contributed packages are developed by R users worldwide and are entirely free to use. On the CRAN project website, you can find a comprehensive list of all packages developed by the community. Go check that out.

As of 2021, there are over 14,000 community packages ready to install and use in R. These packages help reduce the complexity of commands when aiming to accomplish specific tasks.

When you choose to use functions or datasets from a user-contributed package, you must go through two steps. Installing the package is the first step. You can use the install.packages function to do this, and an Internet connection is needed in this case. If the package has been downloaded previously, you can install it from a local zip or tar file instead. The second step is loading the installed package into memory with the library function.

> install.packages("RMySQL")

R will most likely print the following message in your console:

--- Please select a CRAN mirror for use in this session ---

This message means that no CRAN mirror has been chosen for the current session yet, so R is asking which mirror it should download the RMySQL package from.

How to install from CRAN mirrors?

When installing the RMySQL package, R asked you to select a CRAN mirror to use. Depending on your setup, you get either a window with a list of repositories or a text menu with a few choices. If neither appears, you can still choose the mirror to download packages from with the “repos” argument of install.packages. When you pass this argument, R does not ask you to select a mirror for that call.

Here is an example of using the US mirror to install the RMySQL package on my R system.

> install.packages('RMySQL', repos='http://cran.us.r-project.org')

Here you can find the list of all available R mirrors.
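If you don’t want to pass the mirror on every call, one option is to set it once per session through the “repos” option, which install.packages reads by default. The sketch below does that with the same US mirror URL used above and adds a small helper that skips packages you already have. The helper name install_if_missing is my own invention, not an R function, and the network calls are left commented out.

```r
# Set the CRAN mirror for the rest of the session.
options(repos = c(CRAN = "http://cran.us.r-project.org"))
getOption("repos")[["CRAN"]]

# Hypothetical helper: install a package only if it is not installed yet.
# Returns TRUE (invisibly) if an installation was started, FALSE otherwise.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))          # already installed, nothing to do
  }
  install.packages(pkg)               # uses the mirror set in options()
  invisible(TRUE)
}

skipped <- install_if_missing("stats")   # stats ships with R, so no download
skipped                                  # FALSE

# These two lines need an Internet connection:
# install_if_missing("RMySQL")
# head(getCRANmirrors()[, c("Name", "URL")])
```

Putting the options() line in your .Rprofile file makes the choice permanent, so R stops asking for a mirror in new sessions as well.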

It is essential to remember that a package that other loaded packages depend on cannot be detached, unless you force it with the force = TRUE argument of detach.

Where are R packages located in my system?

Depending on the operating system you are using or your user privileges, the location of the R packages can differ. 

To find out the path where R stores its packages, type the following command in R:

> .libPaths()

Typically, on a Windows machine, the R packages will be located under the “C:\Program Files\R” folder, in the “library” subfolder of your installed R version.

On my macOS machine, the R packages are installed in the “/Library/Frameworks/R.framework/Resources/library” folder. 

If you prefer a custom location to install the R packages, you will need to define it in the .Rprofile file. For instance, I want to install them in my macOS home directory:

> .libPaths( "/Users/tex/lib/R" )

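On Windows, the site-wide profile file, named Rprofile.site, is located in the C:\Program Files\R\R-n.n.n\etc folder, where “n.n.n” is your installed R version. A personal .Rprofile file can also be placed in your home directory.

With this line in your profile, R will use the new path in every session and install packages at this location from now on. The folder must already exist, otherwise R ignores it.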
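To try this without touching your profile, here is a sketch that adds a library folder for the current session only. It uses a temporary directory as a stand-in for a real location such as the one above; the names custom_lib and old_paths are illustrative. The folder is created first because .libPaths() silently drops paths that do not exist.

```r
# Stand-in for a custom library folder (a temporary directory for this demo).
custom_lib <- file.path(tempdir(), "my-r-library")
dir.create(custom_lib, recursive = TRUE, showWarnings = FALSE)

old_paths <- .libPaths()                # remember the current library paths
.libPaths(c(custom_lib, old_paths))     # put the custom folder first

.libPaths()[1]   # the custom folder: install.packages() now installs here
```

The first element of .libPaths() is where install.packages puts new packages by default, which is why the custom folder is placed at the front.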

How to install R packages from GitHub

Some R packages developed by the R community are hosted on GitHub rather than on CRAN.

To install them, you will first need the devtools package. To get it, type the following command in R:

> install.packages("devtools")

Secondly, load the devtools package into memory using the following command:

> library(devtools)

To install an R package from GitHub, head over to GitHub and take note of the package author and package name.

In this example, I will install Allison Horst’s “palmerpenguins” package by using the install_github function.

> install_github("allisonhorst/palmerpenguins")

As you can see, the “palmerpenguins” package is now listed in the Packages tab in RStudio.

And, as mentioned before, the “palmerpenguins” package is not loaded into memory until we call the library function:

> library(palmerpenguins)
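The GitHub steps above can be wrapped in a small function so that a script does not reinstall the package on every run. This is only a sketch: install_from_github is a name I made up, and it assumes the repository name is the same as the package name, which is true for palmerpenguins but not for every repository. The call that needs the network is left commented out.

```r
# Hypothetical helper: install "author/package" from GitHub unless the
# package is already installed. Assumes repository name == package name.
install_from_github <- function(repo) {
  pkg <- basename(repo)   # "allisonhorst/palmerpenguins" -> "palmerpenguins"
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))                # already installed
  }
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("Please run install.packages(\"devtools\") first")
  }
  devtools::install_github(repo)            # same call as above, without library()
  invisible(TRUE)
}

# Needs an Internet connection:
# install_from_github("allisonhorst/palmerpenguins")
# library(palmerpenguins)
```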

What is package masking in R?

Some user-contributed R packages may contain functions with the same name as functions in another package. When this situation occurs, a warning message will pop up in the R console. This situation is called masking.

Two functions in the same package cannot share a name, just as you cannot create two files with the same name in a directory on your machine. However, functions in different packages can have the same name and do completely different things.

For example, if you install the dplyr package and load it into memory with the library function, R reports that several objects in packages already loaded in memory, such as filter and lag from the stats package, are masked by dplyr objects with the same name.

If you want to use a function that a recently loaded package has masked, you have the following options:

  1. detach the package you don’t use, using the detach function, or
  2. give the package you want to use a higher priority by loading it last, after the other packages in your project.

To check which package has the highest priority, check the search path:

> search()

The package with the smallest position number, closest to the .GlobalEnv, has the highest priority.
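A third way to reach a masked function, without changing the search path at all, is to name the package explicitly with the :: operator. The sketch below does not need dplyr. As a stand-in for a masking package, it defines a function called sd in the workspace, which hides the sd function from the stats package in the same way.

```r
# Stand-in for a masking package: a definition in the workspace is found
# before package:stats on the search path, so it hides stats::sd.
sd <- function(x) "my own sd"

sd(c(1, 2, 3))          # "my own sd": the workspace definition wins
stats::sd(c(1, 2, 3))   # 1: the :: operator goes straight to the stats package

rm(sd)                  # remove the stand-in
sd(c(1, 2, 3))          # 1: the stats version is visible again
```

The same pattern works for real packages, for example stats::filter when dplyr is loaded.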

In conclusion, user-contributed packages should be used only when you really need them. If you no longer need a package that is loaded in memory, a good practice is to detach it to avoid further function conflicts.

Keep in mind that when you quit R, all the packages loaded in memory will be automatically detached. When starting a new session, R will load only the default packages.

Conclusion

Installing and loading packages in R is a simple, straightforward process and does not require a lot of tinkering to get things done. Anytime you feel stuck in R, use the help.start() command to get help. 

If you have any questions or suggestions to make this guide better, drop me a message in the comment section below. If you found this guide helpful, do me a favor and share it with your friends and family. 

Stay safe!

Leonard
