Recap
Previously in this series, we discovered the equivalent python data structures for the following R data structures:
vectors
lists
arrays/matrices
In this post, we will look at translating R data frames into python. We will also compare and contrast data frames in R and python.
R data frame is a python…
Pretty straightforward: an R data frame is a python (pandas) data frame. We will use a built-in data frame, OrchardSprays, for our illustration.
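As a rough sketch of what this looks like on the python side, the block below builds a small pandas DataFrame with the same column names as OrchardSprays (decrease, rowpos, colpos, treatment). The values are placeholders invented for illustration, not rows from the dataset, and representing the R factor column as a pandas categorical is an assumption about how the conversion behaves.

```python
import pandas as pd

# Placeholder rows only; these are NOT the real OrchardSprays values.
orchard_like = pd.DataFrame({
    "decrease": [10.0, 20.0, 30.0],
    "rowpos": [1.0, 2.0, 3.0],
    "colpos": [1.0, 1.0, 1.0],
    # Assumption: an R factor shows up as a pandas categorical column.
    "treatment": pd.Categorical(["A", "B", "C"]),
})

print(type(orchard_like).__name__)  # class of the object
print(orchard_like.shape)           # (rows, columns)
print(orchard_like.dtypes)          # one dtype per column, as with str() in R
```

Checking `shape` and `dtypes` is the pandas counterpart of calling `dim()` and `str()` on the R data frame.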
Recap
Previously in this series, we discovered the equivalent python data structures for the following R data structures:
vectors
lists
In this post, we will look at translating R arrays (and matrices) into python.
1D R array
A 1D R array prints like a vector.
library(tidyverse)
library(reticulate)
py_run_string("import numpy as np")
py_run_string("import pandas as pd")
(OneD<-array(1:6))
## [1] 1 2 3 4 5 6
But it is not truly a vector
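The same distinction can be seen from the python side with a minimal numpy sketch. It assumes, in line with reticulate's conversion rules, that an R array corresponds to a numpy ndarray while a plain multi-element R vector corresponds to a python list. The variable names are hypothetical.

```python
import numpy as np

# Counterpart of array(1:6): a one-dimensional ndarray.
one_d = np.arange(1, 7)
# Counterpart of the plain vector 1:6 (assumed to arrive as a list).
vec = list(range(1, 7))

# The values are identical...
print(one_d.tolist() == vec)
# ...but the array carries dimension information that a list lacks.
print(one_d.ndim, one_d.shape)
print(type(one_d).__name__, type(vec).__name__)
```

So the two objects hold the same numbers, yet only the array has a shape, much as only the R array has a `dim` attribute.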
Previously, we uncovered what R vectors are in python. In this post, we will convert R lists to python.
An R list is a python …
As with R vectors, it depends. An R list will behave differently in python depending on whether it is named or not.
Unnamed R list
An unnamed list in R is a python list.
library(tidyverse)
library(reticulate)
conda_list()[[1]] %>% use_condaenv()
Relement_int=2L
Relement_bool=TRUE
Relement_char="banana"
Rlist_nameno<-list(Relement_int, Relement_bool, Relement_char)
class(Rlist_nameno)
## [1] "list"
r_to_py(Rlist_nameno) %>% class()
## [1] "python.
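On the python side, the unnamed list above would look roughly like the sketch below. The variable names are hypothetical and simply mirror the R ones. The named case is included on the assumption, consistent with reticulate's conversion rules, that a named R list arrives as a dict.

```python
# Python-side view of the unnamed R list built above.
# Names are illustrative only; they mirror the R objects.
py_list_nameno = [2, True, "banana"]

print(type(py_list_nameno).__name__)
# Like an R list, a python list can hold elements of mixed types.
print([type(x).__name__ for x in py_list_nameno])

# Assumption: a *named* R list, e.g. list(a = 2L, b = TRUE), maps to a dict.
py_list_named = {"a": 2, "b": True}
print(type(py_list_named).__name__)
```

Elements of the list are reached by position (`py_list_nameno[0]`, counting from zero rather than one), while elements of the dict are reached by name (`py_list_named["a"]`).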
reticulate allows us to toggle between R and python in the same session, calling R objects when running python scripts and vice versa. When R data structures are called in python, they are converted to the equivalent python structures where applicable. However, like translating English to Mandarin, translating R structures to python may not be straightforward, as we will see later.
There are 5 R data structures:
vector (more specifically atomic vector)