# Make a simple object
a <- 2
# Check it out
a
[1] 2
This section covers some of the most fundamental operations of both R and Python. These include variable/object assignment, data type/class, arithmetic, and indexing. Working with external data is not covered on this page.
Note that any line in a code chunk preceded by a hashtag (#) is a “comment” and is not evaluated in either language. Including comments is generally good practice because it allows humans to read and understand code that may otherwise be unclear to them.
At its most basic, we want to store data in code in such a way that we can use and manipulate it via our scripts. This requires assigning data to a variable/object with the assignment operator.
In R, the assignment operator is <-. To use it, put the name of the new object-to-be on the left of the arrow and the information to assign on the right.
Once we’ve created a variable/object we can then use the information stored inside of it in downstream operations! For example, we could perform basic arithmetic on our variable/object and assign the result to a new variable/object.
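As a minimal sketch of the same idea on the Python side, where the assignment operator is `=`, the snippet below assigns a value and then reuses it. The variable names `a` and `b` are purely illustrative.

```python
# Assign the value 2 to a variable named 'a'
a = 2

# Use the stored value in a downstream operation and keep the result
b = a + 3

# Check out both variables
print(a)
print(b)
```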
Addition, subtraction, multiplication, and division share operators across both languages (+, -, *, and / respectively). However, exponents differ: R uses ^ while Python uses **.
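The following is a small illustrative sketch of these operators in Python. The values `x` and `y` are arbitrary, and the result names are chosen only for readability.

```python
x = 7
y = 2

total = x + y        # addition
difference = x - y   # subtraction
product = x * y      # multiplication
quotient = x / y     # division (returns a float in Python 3)
power = x ** y       # exponent: Python uses ** where R uses ^

print(total, difference, product, quotient, power)
```

Be aware that `^` is still a valid Python operator, but it performs a bitwise XOR rather than exponentiation, so `7 ^ 2` evaluates to `5` instead of `49`.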
Some operations are only possible on some categories of information. For instance, we can only perform arithmetic on numbers. In Python this category is known as the variable’s type, while in R it is the object’s class. In either case, it’s important to know, and be able to check, this information about the variables/objects with which we are working.
In R we use the class function to get this information. Note that the names of R classes sometimes differ from their equivalents in Python.
In Python, the type function returns the type of the data object. Note that the names of Python types sometimes differ from their equivalents in R.
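As a hedged illustration, the sketch below checks the type of a few example values and shows what can happen when arithmetic is attempted on incompatible types. All variable names here are made up for the example.

```python
number = 2
text = "two"
decimal = 2.5
flag = True

# Check the type of each variable
print(type(number))   # <class 'int'>
print(type(text))     # <class 'str'>
print(type(decimal))  # <class 'float'>
print(type(flag))     # <class 'bool'>

# Arithmetic is only possible on compatible types
addition_failed = False
try:
    number + text
except TypeError as err:
    addition_failed = True
    print("Cannot add an int and a str:", err)
```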
When our variables/objects have more than one item/element we may want to examine the piece of information at a specific position. This position is the “index position” and can be accessed in either language fairly easily.
In order to explore this more fully, let’s make some example multi-component variables/objects.
In R, one of the fundamental data structures is a “vector”. Vectors are assembled with the concatenation function (c) where each item is separated by commas (,) and the set of them is wrapped in parentheses ((...)).
Note that the class of the object comes from the vector’s contents rather than the fact that it is a vector. All elements in a vector therefore must share a class.
In Python the fundamental data structure is a “list”. Lists are assembled either by wrapping the items to include in square brackets ([...]) or by using the list function. In either case, each item is separated from the others by commas (,).
Note that the type of the variable comes from the list itself rather than its contents. Lists therefore support items of multiple different types.
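A short sketch of both ways of building a list, assuming some arbitrary example contents, might look like this. Note that the `list` function takes a single iterable (here a tuple) rather than separate items.

```python
# Build a list by wrapping items in square brackets
letters = ["a", "b", "c", "d", "e"]

# Build a list by passing an iterable (here, a tuple) to the list function
numbers = list((1, 2, 3))

# A single list can hold items of several different types
mixed = [1, "two", 3.0, True]

# The type of the variable is 'list' regardless of the contents
print(type(letters))
print(type(mixed))

# The individual items keep their own types
item_types = [type(item).__name__ for item in mixed]
print(item_types)  # ['int', 'str', 'float', 'bool']
```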
One crucial difference between R and Python is that Python is “0-based”, meaning that the first item is at index position 0, while in R the position of the equivalent element is 1.
Fortunately, in either language the syntax for indexing is the same.
To index a multi-element object, simply append square brackets to the end of the object name and specify the number of the index position in which you are interested.
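To make the 0-based behavior concrete, here is a minimal Python sketch using a hypothetical list named `letters`.

```python
letters = ["a", "b", "c", "d", "e"]

# Index position 0 is the first item in Python
first = letters[0]

# Index position 2 is therefore the third item
third = letters[2]

# The last item sits at index position (length - 1)
last = letters[len(letters) - 1]

print(first, third, last)  # a c e
```

In R, the first element of an equivalent vector would instead be retrieved with index position 1.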
When we index more than one position, this is known as “slicing”. We can still use square brackets in either language to slice multiple items/elements. The syntax inside of those brackets looks the same in both languages but yields different results because each language interprets it differently.
In R, when we write two numbers separated by a colon (:), that indicates that we want those two numbers and all integers between them.
We can use this to slice out multiple contiguous index positions from an object.
In order to slice in Python, we include the start and stop bounds of the items that we want to slice separated by a colon (:) inside of square brackets. The first bound (i.e., bound position 0) is actually the starting bracket of the list! This means that we can treat the first number in the slice in the same way we would in single indexing but the second number is actually the bound before the item with that index value.
Another way of thinking about this is that it is similar to a mathematical set. The starting bound is inclusive while the ending bound is exclusive.
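A minimal sketch of this inclusive-start, exclusive-stop behavior, again using a hypothetical `letters` list:

```python
letters = ["a", "b", "c", "d", "e"]

# Start bound 2 is inclusive; stop bound 4 is exclusive
middle = letters[2:4]
print(middle)  # ['c', 'd']

# Starting at bound 0 begins at the very first item
start = letters[0:2]
print(start)  # ['a', 'b']

# The number of items returned equals stop minus start
print(len(middle) == 4 - 2)  # True
```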
With a slice of 2:4, notice that we get only the third and fourth items (index positions 2 and 3), even though 4 follows the colon and, as a single index, 4 would return the fifth item. That is because bound 4 sits after the fourth item but before the fifth item.
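One way to keep the two conventions straight is to translate between them explicitly. The helper below, `r_style_slice`, is a hypothetical function written only for illustration (it is not part of Python or any library). It assumes `start` and `stop` are 1-based, inclusive positions with `1 <= start <= stop`, as they would be in an R slice.

```python
def r_style_slice(items, start, stop):
    """Return the items an R-style slice start:stop would select.

    R positions are 1-based and both ends are inclusive, so the
    start shifts down by one and the stop stays as-is, because
    Python's stop bound is already exclusive.
    """
    return items[start - 1:stop]


letters = ["a", "b", "c", "d", "e"]

# R-style 2:4 selects the second, third, and fourth items
r_result = r_style_slice(letters, 2, 4)
print(r_result)  # ['b', 'c', 'd']

# The native Python slice 2:4 selects only the third and fourth items
py_result = letters[2:4]
print(py_result)  # ['c', 'd']
```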