At this week’s Strata + Hadoop World Conference in San Jose, California, Microsoft rolled out a version of HDInsight – its cloud Hadoop distribution – that runs on Ubuntu. While Linux instances have been available on the Azure Infrastructure-as-a-Service cloud, this is the first Microsoft-managed and Azure-hosted service that runs on Linux.
Most Hadoop deployments run on Linux, and Microsoft’s move puts HDInsight in a better position to attract more developers with Hadoop experience. The company has worked on making Hadoop more Windows-friendly since at least 2012, when it first submitted to Apache a proposal to improve Hadoop for Windows Server and Azure. Hortonworks, a leading enterprise Hadoop distribution vendor, has also been involved in that effort.
One potentially big new capability is extending Linux-based Hadoop clusters sitting in customers’ own data centers to Azure and using the same Linux tools, documentation, and templates across both environments.
The service is currently available in four Azure regions – two in the U.S., one in Europe, and one in Asia – but the company plans to launch it in all regions over the next several months.
Microsoft chose Ubuntu, the popular enterprise Linux distribution by Canonical, because it was “the leading scale-out Linux,” T.K. Rengarajan, corporate vice president of Data Platform at Microsoft, wrote in a blog post announcing the news. Canonical will also provide support for HDInsight on Azure, according to a separate blog post by Corey Sanders, director of program management at Azure.
“Customers can leverage the same skills and tools they use for their Linux Hadoop deployments on premises and migrate to Azure or create simple hybrid deployments without having to manage the infrastructure,” Sanders wrote.
The Azure cloud VM service supports six Linux distributions. According to Sanders, one out of five customer VMs deployed on Azure runs on Linux.
Microsoft has been steadily shedding its old reputation as a stuffy software giant opposed to all things open source. Besides being more relaxed about Linux, the company recently open-sourced .NET, the popular developer framework for building Windows applications. Azure supports Docker, the popular open source application container technology that runs on Linux. Microsoft has also created a Windows command line interface for Docker.