# What `R` you? (R matrixes and R arrays in python)

# Recap

Previously in this series, we discovered the equivalent `python` data structures for a number of `R` data structures.

In this post, we will look at translating `R` arrays (and matrixes) into `python`.

# 1D `R` array

A 1D `R` array prints like a vector.

```
library(tidyverse)
library(reticulate)
py_run_string("import numpy as np")
py_run_string("import pandas as pd")
(OneD<-array(1:6))
```

`## [1] 1 2 3 4 5 6`

But it is not truly a vector.

`OneD %>% is.vector()`

`## [1] FALSE`

It is more specifically an atomic.

`OneD %>% is.atomic()`

`## [1] TRUE`

An atomic is sometimes termed an atomic vector, which adds to the confusion. `?is.atomic` explains that “It is common to call the atomic types ‘atomic vectors’, but note that is.vector imposes further restrictions: an object can be atomic but not a vector (in that sense)”. Thus, `OneD` can be an atomic type but not a vector structure. Here, the `dim` attribute that makes `OneD` an array is what causes `is.vector()` to return `FALSE`.

## 1D `R` array is a `python` …

No tricks here. An `R` array is translated into a `python` array. Thus, a 1D `R` array is translated into a 1D `python` array. The `python` array is known as `ndarray` and is provided by the `python` package called `numpy`.

`r.OneD`

`## array([1, 2, 3, 4, 5, 6])`

`type(r.OneD)`

`## <class 'numpy.ndarray'>`

`r.OneD.ndim`

`## 1`
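The same checks can be reproduced in plain `python`, without going through `reticulate`. This is only a minimal sketch, assuming `numpy` is installed; `one_d` is a hypothetical stand-in for `r.OneD` that is built directly in `python`.

```python
import numpy as np

# Hypothetical stand-in for r.OneD, created directly in python
one_d = np.array([1, 2, 3, 4, 5, 6])

print(type(one_d).__name__)  # class name of the object
print(one_d.ndim)            # number of dimensions
print(one_d.shape)           # length along each dimension
```

`shape` is a useful companion to `ndim` because it also tells you how long each dimension is.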

All `python` code for this post will be run within the `{python}` code chunk to explicitly print out the display for a `python` array (i.e. `array([ ])`).

## 1D `python` array is a `R` …

1-dimensional `python` arrays are commonly used in data science with `python`.

```
p_one= np.arange(6)
p_one
```

`## array([0, 1, 2, 3, 4, 5])`

The 1D `python` array is translated into a 1D `R` array.

`py$p_one %>% class()`

`## [1] "array"`

The translated array is an atomic type.

`py$p_one %>% is.atomic()`

`## [1] TRUE`

And the translated array is not a vector, which is expected of a 1D `R` array.

`py$p_one %>% is.vector()`

`## [1] FALSE`

# 2D `R` array

A 2D `R` array is also known as a matrix.

`(TwoD<-array(1:6, dim=c(2,3)))`

```
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
```

`TwoD %>% class()`

`## [1] "matrix"`

## 2D `R` array is a `python` …

A 2D `python` array. `python` does not have a special name for its 2D array.

`r.TwoD`

```
## array([[1, 3, 5],
##        [2, 4, 6]])
```

`type(r.TwoD)`

`## <class 'numpy.ndarray'>`

`r.TwoD.ndim`

`## 2`

## 2D `python` array

Besides 1D `python` arrays, 2D `python` arrays are also common in data science with `python`.

```
p_two=np.random.randint(6, size=(2,3))
p_two
```

```
## array([[3, 1, 3],
##        [3, 5, 5]])
```

A 2D `python` array is translated into a 2D `R` array/matrix.

`py$p_two %>% class()`

`## [1] "matrix"`

### Reshaping 1D `python` array into 2D array

Sometimes a `python` function requires a 2-dimensional array and your input variable is a 1-dimensional array. Thus, you will need to reshape your 1-dimensional array into a 2-dimensional array with `numpy`’s `reshape` function. Let us convert our 1-dimensional array into a 2-dimensional array which has 2 rows and 3 columns.

`np.reshape(p_one, (2,3))`

```
## array([[0, 1, 2],
##        [3, 4, 5]])
```

Let’s convert it into a 2D array which has 6 rows and 1 column.

`np.reshape(p_one, (6,1))`

```
## array([[0],
##        [1],
##        [2],
##        [3],
##        [4],
##        [5]])
```

The number of rows above is the same as the length of the 1D array. Thus, if you replace the `6` with the length of the 1D array, you will achieve the same result.

`np.reshape(p_one, (len(p_one),1))`

```
## array([[0],
##        [1],
##        [2],
##        [3],
##        [4],
##        [5]])
```

Alternatively, you can also replace it with `-1` if the input is a 1D array. `-1` means that the dimension is unspecified and that it will be “inferred from the length of the array”.

`np.reshape(p_one, (-1,1))`

```
## array([[0],
##        [1],
##        [2],
##        [3],
##        [4],
##        [5]])
```
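The `-1` placeholder is not limited to the row count: `numpy` accepts it for any single dimension and works out that dimension from the rest. The sketch below is illustrative only; it assumes `numpy` is installed, uses the array method `.reshape()` instead of `np.reshape()`, and the names `as_column` and `two_rows` are hypothetical.

```python
import numpy as np

p_one = np.arange(6)  # the same 1D array used in this post

# Ask for one column and let numpy infer the number of rows
as_column = p_one.reshape(-1, 1)

# Ask for two rows and let numpy infer the number of columns
two_rows = p_one.reshape(2, -1)

print(as_column.shape)
print(two_rows.shape)
```

Only one dimension can be left as `-1`, since `numpy` needs the remaining dimensions to work out the missing one.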

# Difference between `R` and `python` array

One of the differences is the printing of values in the array. `R` arrays are column-major. The tables are filled column-wise. In other words, the leftmost column is filled from top to bottom before moving to the neighbouring column on the right, which is then filled in the same top-down fashion.

`TwoD`

```
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
```

The integrity of this column-major display is maintained when it is translated into `python`.

`r.TwoD`

```
## array([[1, 3, 5],
##        [2, 4, 6]])
```

You would have noticed that `python` prints its array without the row names (e.g. `[1,]`) and column names (e.g. `[,1]`).

While `python` is able to use column-major ordered arrays, it defaults to row-major ordering when arrays are created in `python`. In other words, values are filled from the first row in a left-to-right fashion before moving to the next row.

`np.reshape(p_one, (2,3))`

```
## array([[0, 1, 2],
## [3, 4, 5]])
```
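To see both fill orders side by side, `np.reshape` takes an `order` argument: `"C"` (the default) fills row by row, while `"F"` fills column by column as `R` does. The sketch below assumes `numpy` is installed; `values`, `row_major` and `col_major` are hypothetical names used only for illustration.

```python
import numpy as np

values = np.arange(1, 7)  # 1 to 6, like 1:6 in R

# Default row-major fill (order="C")
row_major = np.reshape(values, (2, 3))

# Column-major fill (order="F"), the same layout as R's array(1:6, dim=c(2,3))
col_major = np.reshape(values, (2, 3), order="F")

print(row_major)
print(col_major)
```

With `order="F"`, the result has the same layout as `r.TwoD` shown earlier.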

You may refer to the `reticulate` package page for more detailed explanations and the implications of such differences.

# `python` series

Besides lists, 1D arrays and 2D arrays, there are other `python` data structures which are commonly used in data science with `python`. They are series and data frames, which are provided by the `pandas` library. We will look at series in this post, and data frames will be covered in a separate post. A series is a 1D array with axis labels.

```
PD=pd.Series(['banana',2])
PD
```

```
## 0    banana
## 1         2
## dtype: object
```
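The axis labels default to `0, 1, …` as seen above, but they can also be set explicitly. A minimal sketch, assuming `pandas` is installed; the labels `fruit` and `count` and the name `labelled` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Same values as PD, but with hypothetical custom axis labels
labelled = pd.Series(["banana", 2], index=["fruit", "count"])

print(labelled["fruit"])     # look up a value by its label
print(list(labelled.index))  # the axis labels
print(labelled.dtype)        # mixed values are stored as object
```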

As a series is a 1D array, when translated to `R` it will be classified as an `R` array.

`py$PD %>% class()`

`## [1] "array"`

However, the translated series appears as an `R` named list. The index of the series appears as the names in the `R` list.

`py$PD`

```
## $`0`
## [1] "banana"
##
## $`1`
## [1] 2
```

What do you know? A translated series is both an `R` array and an `R` list.

`py$PD %>% is.array()`

`## [1] TRUE`

`py$PD %>% is.list()`

`## [1] TRUE`