This lesson is in the early stages of development (Alpha version)

Introduction to R - theory sessions: Cheatsheets for Queuing System Quick Reference

Key Points

RStudio Layout, navigating the IDE, project setup
  • R and RStudio are not the same thing. R is the programming language; RStudio is an IDE that gives you a convenient way to write R code and manage R projects.

  • You will not learn everything about R or RStudio in a day; RStudio offers a huge number of tools. The more time you commit to exploring and practising, the more you will achieve.

  • Project management is a vital part of ensuring that you produce maintainable, shareable, and robust software.

Seeking help
  • You will get stuck at some point; needing help is a case of when, not if.

  • Help is available, but it is important that you do your due diligence first and then ask for help in the right places and in the right format.
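
One common way to ask "in the right format" is to share a small reproducible example together with your session details. The sketch below is illustrative only: `example_data` is a made-up data frame, while `dput()` and `sessionInfo()` are base R functions often used for this purpose.

```r
# Look up the documentation first (run these interactively in the console):
# ?mean
# help.search("standard deviation")

# A tiny, made-up data set that reproduces the problem you want to ask about.
example_data <- data.frame(x = 1:3, y = c("a", "b", "c"))

# dput() writes R code that recreates the object, so helpers can copy it.
dput_text <- capture.output(dput(example_data))
cat(dput_text, sep = "\n")

# sessionInfo() reports your R version, platform, and loaded packages.
info <- sessionInfo()
print(info$R.version$version.string)
```

Pasting the `dput()` output and the session details into your question lets others reproduce the problem without guessing at your setup.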

Calculating in RStudio
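  • R follows the standard order of operations (PEDMAS): parentheses, exponents, division and multiplication, addition and subtraction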

  • Use built-in functions; you don’t need to reinvent the wheel.
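
As a minimal sketch of both points, the example below shows how operator precedence changes a result and how a few built-in functions replace hand-written arithmetic. The variable names are invented for illustration.

```r
# Exponents are evaluated before multiplication, and multiplication before addition.
no_brackets <- 2 + 3 * 4^2      # 2 + (3 * 16) = 50
with_brackets <- (2 + 3) * 4^2  # 5 * 16 = 80

# Built-in functions save you from writing the arithmetic yourself.
values <- c(4, 9, 16)
total <- sum(values)     # 29
average <- mean(values)  # total divided by the number of values
roots <- sqrt(values)    # 2 3 4

print(c(no_brackets, with_brackets, total))
```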

Variables and assignment
  • When naming variables it’s important to be consistent and succinct

  • To keep the result of an operation you must explicitly assign it to a variable

  • Different data types can require different operations

  • R will make assumptions about data types unless you are explicit
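
The short sketch below illustrates these points with made-up variable names: a result is lost unless it is assigned, and R picks a data type for you unless you state one.

```r
sample_count <- 10

# This computes 20 and prints it, but sample_count itself is unchanged.
sample_count * 2

# Assigning the result back keeps it.
sample_count <- sample_count * 2

# R assumes a plain number is a double ("numeric") unless told otherwise.
default_type <- class(5)    # "numeric"
integer_type <- class(5L)   # "integer": the L suffix is explicit
text_type <- class("5")     # "character"

# Text cannot be used in arithmetic until it is converted explicitly;
# "5" + 1 would stop with an error.
converted <- as.numeric("5") + 1
print(converted)
```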

Vectors and vectorisation
  • Vectors can only hold a single data type

  • Vectorisation allows us to apply an operation to all the elements in a vector at once

  • Vectors can be indexed by multiple methods to retrieve/manipulate the specific information you desire
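
A minimal sketch of these three points, using an invented vector of heights: mixed types are coerced to one type, arithmetic applies to every element, and elements can be retrieved by position, by exclusion, by logical condition, or by name.

```r
heights_cm <- c(150, 162, 175, 181)

# Vectorisation: the division is applied to every element.
heights_m <- heights_cm / 100

# A vector holds one type, so mixing numbers and text coerces everything to character.
mixed <- c(1, "two", 3)

# Indexing by position, by exclusion, and by logical condition.
second <- heights_cm[2]               # 162
without_first <- heights_cm[-1]       # 162 175 181
tall <- heights_cm[heights_cm > 170]  # 175 181

# Indexing by name, once names have been set (names here are hypothetical).
names(heights_cm) <- c("ana", "ben", "cat", "dev")
cat_height <- heights_cm["cat"]
print(cat_height)
```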

Flow control
  • Flow control is an important technique you need to learn to create useful software

  • Whenever possible, choose the simplest solution

  • Plan your code before you write it, and refactor it once it works
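
The sketch below shows one possible reading of these points: a `for` loop with `if`/`else` branches, followed by a simpler vectorised refactoring that gives the same answer. The temperature thresholds and labels are invented for illustration.

```r
temperatures <- c(-5, 12, 30)

# First version: loop over the values and branch on each one.
temp_labels <- character(length(temperatures))
for (i in seq_along(temperatures)) {
  if (temperatures[i] < 0) {
    temp_labels[i] <- "freezing"
  } else if (temperatures[i] < 25) {
    temp_labels[i] <- "mild"
  } else {
    temp_labels[i] <- "hot"
  }
}

# Refactored version: ifelse() applies the same conditions to the whole vector.
temp_labels_simple <- ifelse(temperatures < 0, "freezing",
                             ifelse(temperatures < 25, "mild", "hot"))

print(temp_labels)
print(temp_labels_simple)
```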

Functions
  • Functions can help reduce redundancy and increase reusability in your code
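
As a small illustration, the hypothetical function below wraps a temperature conversion so the formula is written once and reused wherever it is needed.

```r
# Convert temperatures from degrees Celsius to degrees Fahrenheit.
# The function name and argument name are invented for this example.
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

boiling <- celsius_to_fahrenheit(100)  # 212
freezing <- celsius_to_fahrenheit(0)   # 32

# Because the arithmetic is vectorised, the function also accepts a vector.
several <- celsius_to_fahrenheit(c(0, 50, 100))
print(several)
```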

Lists and Data frames
  • Data frames are lists of equal-length vectors.

  • Data frames are the most common way of storing tabular data
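
The sketch below builds a small, made-up data frame and checks that it behaves like a list of column vectors, each with its own type.

```r
# A hypothetical table: one row per patient, one column per variable.
patients <- data.frame(
  id = c("p1", "p2", "p3"),
  age = c(34, 51, 29),
  smoker = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Under the hood a data frame is a list with one element per column.
is_a_list <- is.list(patients)   # TRUE
column_count <- length(patients) # 3

# Columns can be extracted with list syntax; both forms return the same vector.
ages_dollar <- patients$age
ages_brackets <- patients[["age"]]

# Each column keeps its own data type.
print(sapply(patients, class))
```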

Manipulating data frames
  • The complexity of the process is related to the complexity of the conditions for retrieving the data you want.
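
To illustrate, the sketch below filters a made-up data frame first with one condition and then with a combined condition; the code grows only as the conditions do. The column names are assumptions for this example.

```r
patients <- data.frame(
  id = c("p1", "p2", "p3"),
  age = c(34, 51, 29),
  smoker = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# One condition: rows where age is above 30, all columns.
over_30 <- patients[patients$age > 30, ]

# Two conditions combined with &, keeping only two columns.
over_30_nonsmokers <- patients[patients$age > 30 & !patients$smoker, c("id", "age")]

# subset() expresses the same selection with less repetition.
same_selection <- subset(patients, age > 30 & !smoker, select = c(id, age))

print(over_30_nonsmokers)
```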

Loading data into R
  • Loading in the data is just the first step
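
A minimal sketch of what might follow loading: the example writes a tiny, made-up CSV file to a temporary location, reads it back with `read.csv()`, and then inspects the result before any analysis. The file contents and column names are invented for illustration.

```r
# Create a small CSV file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,age,weight_kg",
             "p1,34,70.5",
             "p2,51,NA",
             "p3,29,82.0"), csv_path)

# Step one: load the data.
patients <- read.csv(csv_path, stringsAsFactors = FALSE)

# Next steps: check the structure, the types, and any missing values.
str(patients)
summary(patients)
missing_per_column <- colSums(is.na(patients))
print(missing_per_column)
```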

Packages in R
  • There are many packages with a wide range of features; becoming adept at using packages is key to getting the most out of R
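
The sketch below shows the usual pattern for working with packages: install once, check availability, then either attach the package or call a function through its namespace. The choice of `ggplot2` as the example package is an assumption based on the plotting session; the installation line is commented out because it needs an internet connection.

```r
# Install a package once per machine (uncomment to run):
# install.packages("ggplot2")

# Check whether a package is available without attaching it.
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)
print(has_ggplot2)

# Attach a package for the current session with library().
# stats ships with R, so this always works.
library(stats)

# Call a single function through its namespace with the :: operator.
middle <- stats::median(c(1, 5, 9))
print(middle)
```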

Plotting in R with ggplot2
  • Plotting is a useful tool for understanding our data; it is not just for visualising results
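
As an illustration of exploratory plotting, the sketch below draws a scatter plot of the built-in `mtcars` data set to look for a relationship between two variables. It assumes `ggplot2` is installed and is guarded so the script still runs if it is not; the choice of data set and variables is ours, not the lesson's.

```r
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)
scatter <- NULL

if (has_ggplot2) {
  # Map car weight to the x axis and fuel economy to the y axis.
  scatter <- ggplot2::ggplot(mtcars, ggplot2::aes(x = wt, y = mpg)) +
    ggplot2::geom_point() +
    ggplot2::labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

  # In an interactive session, print(scatter) draws the plot.
}
```

A quick plot like this can reveal trends, outliers, or data entry errors before any formal analysis begins.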

Practical session
  • Solving problems with R and RStudio will improve your skills and confidence

Practical session answers
  • NA

Glossary

The following list captures terms that need to be added to this glossary. This is a great way to contribute.

Accelerator
to be defined
Beowulf cluster
to be defined
Central processing unit
to be defined
Cloud computing
to be defined
Cluster
A collection of computers configured to enable collaboration on a common task by means of purposefully configured hardware (e.g., networking) and software (e.g., workload management).
Distributed memory
to be defined
Grid computing
to be defined
High availability computing
to be defined
High performance computing
to be defined
Interconnect
to be defined
Node
to be defined
Parallel
to be defined
Serial
to be defined
Server
to be defined
Shared memory
to be defined
Slurm
to be defined
Supercomputer
… “a major scientific instrument” …
Workstation
to be defined
Grid Engine
to be defined
Parallel File System
to be defined