Speaker: Edward Ma

Distributed Computing using parallel, Distributed R, and SparkR

Data volume is ever increasing, while single-node performance is stagnant. To scale, analysts need to distribute computations. R has built-in support for parallel computing, and third-party contributions, such as Distributed R and SparkR, enable distributed analysis. However, analyzing large data in R remains a challenge, because interfaces to distributed computing environments, like Spark, are […]
