In the previous blog post, we saw how to start a Spark cluster on EC2 using the inbuilt launch scripts. This is good if you want get something up and running quickly, but it won’t allow fine-grained control over our cluster. A lot of times, we would want to customize the machines that we spin up. Let’s say that you want to use different types of machines to handle production level traffic in different regions. May be you are not on EC2 and you want to launch some machines in your cluster. How would you do it? This is the reason we have Spark Standalone mode. Using this method, we can manually launch any number of machines independently in our private cluster and make them listen to our master machine. It gives us a lot of flexibility! Let’s go ahead and see how to do it, shall we? Continue reading
Launching A Spark Standalone Cluster
Reply