Wednesday, 12 August 2015

Hadoop 2.7.1 for Windows 10 binary build with Visual Studio 2015 (unofficial)

I was looking for Windows binaries of the latest Hadoop version to deploy on my dev PC. Unfortunately, there are no binaries available for the Windows platform, so the best option may be to build them yourself.
I tried to follow the instructions from https://wiki.apache.org/hadoop/Hadoop2OnWindows (BUILDING.txt), but here is what I found:
  • The Windows SDK no longer includes a command prompt, so I was left with the Visual Studio 2010 option.
  • It's pretty tricky to find Visual Studio 2010 on Microsoft's website, if it hasn't been removed altogether.

Through a combination of being too lazy to hunt down VS 2010 and liking to have the latest and greatest, I decided to build Hadoop with VS 2015.

For the impatient

You can get my tailored hadoop-2.7.1 win64 build from https://onedrive.live.com/redir?resid=CA785B9261F68AF4!447&authkey=!AGnl0VM-t9Wt5-M&ithint=file%2cgz

This is an unofficial, unsupported build intended for use on a dev box. It is definitely not intended for production.

All the instructions below describe how to build the binaries from the official sources, so you can build and support them yourself.

My environment


Here's my shopping list:

  • Windows 10
  • JDK 1.8.0_51
  • Maven 3.3.3
  • Findbugs 1.3.9 (I haven't used this)
  • ProtocolBuffer 2.5.0 (I didn't pick the latest and greatest here - it has to be 2.5.0)
  • CMake 3.3.0
  • Visual Studio 2015 Community Edition
  • GnuWin32 0.6.3 (a bit painful to install, but so is Cygwin)
  • zlib 1.2.8
  • internet connection

Windows System Environment variables

JAVA_HOME = "C:\Program Files\Java\jdk1.8.0_51"
MAVEN_HOME=c:\apache-maven-3.3.3
(Make sure these point to your own JDK and Maven installations.)

I appended the following to my Windows system Path environment variable:

;%MAVEN_HOME%\bin;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;c:\zlib

The weird "C:\Windows\Microsoft.NET\Framework64\v4.0.30319" path is the location of MSBuild.exe, which is required during the build process.
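
A quick way to confirm that these settings took effect is to check them from a freshly opened prompt. Below is a minimal Python sketch of such a check. Python is not part of the toolchain listed above, and the check_environment helper and the REQUIRED_TOOLS list are my own assumptions for illustration, so adjust them to whatever your build actually needs.

```python
import os
import shutil

# Executables the build is expected to find on Path.
# This list is an assumption based on the tools mentioned in this post.
REQUIRED_TOOLS = ["mvn", "msbuild", "protoc", "cmake"]


def check_environment(env=None, which=shutil.which):
    """Return a list of readable problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    # The two variables defined in the section above.
    for var in ("JAVA_HOME", "MAVEN_HOME"):
        if not env.get(var):
            problems.append(f"{var} is not set")
    # Each tool must be resolvable through Path.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} was not found on Path")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("PROBLEM:", problem)
```

If the script prints nothing, the variables are set and the tools resolve. Any PROBLEM line points at the setting to revisit before starting the build.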


Protocol Buffers 2.5.0

Oh no, another Unix/Linux-only build? I downloaded the Google package named protoc-2.5.0-win32.zip and extracted the binary (protoc.exe) to c:\windows\system32, which is just a lazy way to put it on the path.

I'm not 100% sure what effect a win32 component has on this win64 build. But:
"Hadoop 0.23+ requires the protocol buffers JAR (protobufs.jar) to be on the classpath of both clients and servers; the native binaries are required to compile this and later versions of Hadoop." - http://wiki.apache.org/hadoop/ProtocolBuffers.

So, as I understand it, the win32 executable is used only during the build process (the equivalent jar should be packaged in the build).

If it is used in any way to compile native code, we may have been left with some pointers out of order. I'll come back to this when I can.
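
Since the build insists on exactly 2.5.0, it can be worth confirming which protoc is actually picked up from the path. The following is a small Python sketch under the assumption that protoc --version prints a line of the form "libprotoc 2.5.0". The helper names are hypothetical and only for illustration.

```python
import re
import subprocess

REQUIRED_PROTOC = "2.5.0"  # the version this post says the build needs


def parse_protoc_version(output):
    """Extract the version number from text such as 'libprotoc 2.5.0'."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    return match.group(1) if match else None


def installed_protoc_version():
    """Run protoc and return its version, or None if protoc cannot be started."""
    try:
        result = subprocess.run(
            ["protoc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        # protoc is not on the path (or cannot be executed).
        return None
    return parse_protoc_version(result.stdout)


if __name__ == "__main__":
    version = installed_protoc_version()
    if version == REQUIRED_PROTOC:
        print("protoc", version, "OK")
    else:
        print("expected protoc", REQUIRED_PROTOC, "but found", version)
```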

Tweaking the Hadoop sources

Well, this was necessary to allow the build to run. It shouldn't affect the quality of the build itself, but keep in mind that the result is an unofficial, unsupported, use-at-your-own-risk Hadoop, intended for a development environment.

Migrating VS projects


The following files need to be opened with Visual Studio 2015:

  • <hadoop_src_folder>\hadoop-common-project\hadoop-common\src\main\winutils\winutils.vcxproj
  • <hadoop_src_folder>\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj

Visual Studio will complain that they are from an older version. All you have to do is save everything and close.

Enabling cmake VS 2015 project generation for hdfs

On line 441 of <hadoop_src_folder>\hadoop-hdfs-project\hadoop-hdfs\pom.xml, edit the else value as follows:
 <condition property="generator" value="Visual Studio 10" else="Visual Studio 14 2015 Win64">

(The "value" attribute applies to win32; you may want to edit it too if you are building for win32.)
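
If you would rather script this edit than do it by hand, here is a minimal Python sketch. It assumes the file contains exactly one condition element for the "generator" property, shaped like the line shown above. The function names are hypothetical, and you should back up pom.xml before rewriting it.

```python
import re
from pathlib import Path

# Matches the generator <condition> element and captures the text around the else value.
GENERATOR_CONDITION = re.compile(
    r'(<condition\s+property="generator"\s+value="[^"]*"\s+else=")[^"]*(")'
)


def set_win64_generator(pom_text, generator="Visual Studio 14 2015 Win64"):
    """Return pom_text with the else value of the generator condition replaced."""
    new_text, count = GENERATOR_CONDITION.subn(
        lambda m: m.group(1) + generator + m.group(2), pom_text
    )
    if count != 1:
        # Refuse to guess if the file does not look the way this sketch assumes.
        raise ValueError("expected exactly one generator condition, found %d" % count)
    return new_text


def patch_pom_file(pom_path):
    """Rewrite the given pom.xml in place (hypothetical helper; keep a backup)."""
    path = Path(pom_path)
    patched = set_win64_generator(path.read_text(encoding="utf-8"))
    path.write_text(patched, encoding="utf-8")
```

Calling patch_pom_file with the path to hadoop-hdfs\pom.xml applies the change. The function raises an error instead of writing anything if the expected line is not found.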


Building it

You should find and open the "Developer Command Prompt for VS2015" on Windows. I'm still wondering what is so special about it, but the fact is that the build will only work from that prompt.

More Environment variables

These should be set in the command prompt:
  • set Platform=x64
  • set ZLIB_HOME=C:\zlib\include (unlike in the official instructions, this should point to the include folder).

Finally building it

Go to the Hadoop source folder and run:

mvn package -Pdist,native-win -DskipTests -Dtar
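
For repeated attempts, the two prompt variables and the Maven command can be wrapped in a small script. This is only a Python sketch of the steps above, not part of the official instructions. The function names are hypothetical, and it still has to be launched from the Visual Studio command prompt so that the rest of the compiler environment is in place.

```python
import os
import subprocess


def build_environment(zlib_include=r"C:\zlib\include", base=None):
    """Copy the current environment and add the two variables set at the prompt."""
    env = dict(os.environ if base is None else base)
    env["Platform"] = "x64"
    env["ZLIB_HOME"] = zlib_include  # points at the include folder, as noted above
    return env


def maven_command(clean=False):
    """Build the mvn argument list; clean=True adds the 'clean' goal for a rebuild."""
    goals = ["clean", "package"] if clean else ["package"]
    return ["mvn"] + goals + ["-Pdist,native-win", "-DskipTests", "-Dtar"]


def run_build(source_dir, clean=False):
    """Run Maven in source_dir and return its exit code.

    shell=True is used so that Windows can resolve mvn to its .cmd launcher.
    """
    completed = subprocess.run(
        " ".join(maven_command(clean)),
        cwd=source_dir,
        env=build_environment(),
        shell=True,
    )
    return completed.returncode
```

Calling run_build with the Hadoop source folder starts the build, and passing clean=True gives the clean rebuild variant.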



Still broken????

Investigate !!!

And then try to rebuild it with:
mvn clean package -Pdist,native-win -DskipTests -Dtar

Or just throw in the towel and get the binaries from: https://onedrive.live.com/redir?resid=CA785B9261F68AF4!447&authkey=!AGnl0VM-t9Wt5-M&ithint=file%2cgz

What next?


Follow the official docs to get your Hadoop instance configured and running.




22 comments:

  1. Hi Kplitz,
    many thanks for your post, it is one of the most helpful I found regarding this topic.
    One question before I throw in the towel: did you need to build winutils and native using Visual Studio?
    I am getting the error "[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin ..." while building. I tried to build the mentioned projects with Visual Studio 2015 but started to get weird errors and warnings.
    Any comment will be appreciated.
    Kind regards, Paul

  2. Hi Paul,

    Thanks for your comment.

    I spent so much time doing this build that I decided to share, so it could be helpful to others.

    It's been a while since I've done this, but if I remember correctly, winutils and native get built as part of your Maven build. You should still be able to build them in Visual Studio; if you can't, you will get the Maven errors you described.


    I would double check the environment variables. Make sure they are all defined - especially Path.

    Replies
    1. Hi Kplitz,
      I already solved it and shared my solution on my blog. Of course, I reference your post as the main material. Here is the link:
      https://hernandezpaul.wordpress.com/2016/05/08/my-experience-building-hadoop-2-7-2-on-windows-server-2012/

      Cheers,
      Paul

  3. Thank you.

    I built hadoop 2.7.2 (64-bit) successfully with this environment:
    Windows 10 x64
    Visual Studio 2015 Community
    ProtocolBuffer 2.5.0
    CMake 3.5.2
    JDK 1.8.0_92
    Maven 3.3.9

    I didn't use these tools:
    zlib
    Findbugs
    GnuWin32

  4. Thanks for sharing this article. You may also refer to http://www.s4techno.com/blog/2016/07/11/hadoop-administrator-interview-questions/

  5. great thanks. This is the most helpful blog I've ever found.
