Published on December 14, 2016 by Microsoft

It is hard to believe that it has been almost 2 years since we last had Maxim on our show, and I can tell you we are extremely excited to have him back. Maxim is a Senior Program Manager in the Big Data team at Microsoft, and he joins us to talk about Interactive Spark on Azure.

Maxim begins our discussion by walking us through the process and challenges data scientists go through when processing data. He explains that data science is an iterative process, but that data scientists are often less productive than they could be because they spend a lot of time waiting for jobs to complete. One of the big factors behind the long wait times, Maxim explains, is the size and cleanliness of the data.

At the [05:20] mark Maxim shows us how Spark on Azure provides a solution to this problem by shortening each iteration, thus helping you be more productive. Maxim walks us through how that is accomplished. He first introduces us to Apache Spark, and then discusses how Spark on Azure makes data exploration even better.

At the [08:38] mark it’s DEMO TIME, where Maxim spends a few minutes showing us how to spin up a Spark HDInsight cluster, then spends the remaining 10 minutes demoing how to use Spark in HDInsight to execute jobs efficiently. I won’t give anything away here, so be sure to watch to see Maxim work his Spark magic! Awesome show!

We definitely look forward to having him back!




2 Comments on "Interactive Spark on Azure"


Luis Simoes
10 months 15 days ago

Tried to use this dataset but the count is about 128k and not even close to the billion…Am I doing something wrong?

10 months 18 days ago

I am not very familiar with Microsoft Azure, as it is learning online, in which I prefer one-on-one guidance.