I've tried to follow the instructions from https://wiki.apache.org/hadoop/Hadoop2OnWindows (BUILDING.txt), but here is what I found:
- Windows SDK no longer includes a command prompt, so I was left with the Visual Studio 2010 option.
- It's pretty tricky to find Visual Studio 2010 on Microsoft's website, if it hasn't been removed altogether.
For the impatient
You can get my tailored hadoop-2.7.1 win64 build from https://onedrive.live.com/redir?resid=CA785B9261F68AF4!447&authkey=!AGnl0VM-t9Wt5-M&ithint=file%2cgz
This is an unofficial build, it is unsupported, and intended for use on a dev box. Definitely not intended for production.
All other instructions below are on how to build the binaries from the official sources. So you can build it and support it yourself.
My environment
Here's my shopping list:
- Windows 10
- JDK 1.8.0_51
- Maven 3.3.3
- Findbugs 1.3.9 (I haven't used this)
- ProtocolBuffer 2.5.0 (I didn't pick the latest and greatest here - it has to be 2.5.0)
- CMake 3.3.0
- Visual Studio 2015 Community Edition
- GnuWin32 0.6.3 - a bit painful to install but so is cygwin
- zlib 1.2.8
- internet connection
Windows System Environment variables
JAVA_HOME = "C:\Program Files\Java\jdk1.8.0_51"
MAVEN_HOME=c:\apache-maven-3.3.3
(Make sure these point to your own JDK and Maven installations.)
I appended the following to my windows system environment Path variable:
;%MAVEN_HOME%\bin;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;c:\zlib
The weird "C:\Windows\Microsoft.NET\Framework64\v4.0.30319" path is the location of MSBuild.exe, which is required during the build process.
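A quick way to confirm the variables and Path entries above took effect is a small preflight script. This is only a sketch: the tool list and the function names are my own choices for illustration, and it assumes cmake and protoc were also put on the Path.

```python
import os
import shutil

# Tools assumed to be reachable from the Path for this build (adjust to taste).
REQUIRED_TOOLS = ["mvn", "msbuild", "cmake", "protoc"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be resolved on the current Path."""
    return [tool for tool in tools if which(tool) is None]


def missing_variables(names=("JAVA_HOME", "MAVEN_HOME"), environ=None):
    """Return the variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    for tool in missing_tools():
        print("not on Path:", tool)
    for name in missing_variables():
        print("not set:", name)
```

Run it from a freshly opened command prompt, since windows that were already open do not pick up changes to the system environment variables.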
Protocol Buffers 2.5.0
Oh no, another unix/linux only build? I downloaded Google's protoc-2.5.0-win32.zip package and extracted the binary (protoc.exe) to c:\windows\system32, which is just a lazy way to put it on the path.
I'm not 100% sure of the effect of having a win32 component for this win64 build. But:
"Hadoop 0.23+ requires the protocol buffers JAR (protobufs.jar) to be on the classpath of both clients and servers; the native binaries are required to compile this and later versions of Hadoop." - http://wiki.apache.org/hadoop/ProtocolBuffers.
So I understand the win32 executable is used only during the build process (the jar equivalent should be packaged in the build).
If it is used in any way to compile native code, we may be left with some pointers out of order. I'll come back to this when I can.
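Since the version has to be exactly 2.5.0, it is worth checking what the protoc on the path reports before starting a long build. A minimal sketch, assuming protoc prints a line such as "libprotoc 2.5.0" for --version; the function names are hypothetical helpers, not part of Hadoop.

```python
import subprocess

REQUIRED_PROTOC = "2.5.0"


def parse_protoc_version(output):
    """Extract the version from output such as 'libprotoc 2.5.0'."""
    parts = output.strip().split()
    return parts[-1] if parts else None


def installed_protoc_version():
    """Run `protoc --version`; return None if protoc cannot be run."""
    try:
        result = subprocess.run(
            ["protoc", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_protoc_version(result.stdout)


if __name__ == "__main__":
    found = installed_protoc_version()
    if found != REQUIRED_PROTOC:
        print("expected protoc", REQUIRED_PROTOC, "but found", found)
```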
Tweaking the Hadoop sources
Well, this was necessary to allow the build to execute. It shouldn't affect the quality of the build itself, but let's keep in mind the result is an unofficial, unsupported, use-at-your-own-risk Hadoop, intended for a development environment.
Migrating VS projects
The following files need to be opened with Visual Studio 2015:
- <hadoop_src_folder>\hadoop-common-project\hadoop-common\src\main\winutils\winutils.vcxproj
- <hadoop_src_folder>\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj
Visual Studio will complain that they are from an old version. All you have to do is save all and close.
Enabling cmake VS 2015 project generation for hdfs
On line 441 of <hadoop_src_folder>\hadoop-hdfs-project\hadoop-hdfs\pom.xml, edit the else value as follows:
<condition property="generator" value="Visual Studio 10" else="Visual Studio 14 2015 Win64">
(The "value" attribute applies to win32; you may want to edit it if building for win32.)
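If you rebuild often from fresh sources, the same edit can be scripted instead of hunting for the line number. This is a rough sketch and not something the Hadoop build provides: the regular expression and the function name are my own, and it only touches the else attribute of the generator condition.

```python
import re

# Matches the else="..." attribute on the generator <condition> element only.
GENERATOR_ELSE = re.compile(
    r'(<condition\s+property="generator"[^>]*?\belse=")[^"]*(")'
)


def set_win64_generator(pom_text, generator="Visual Studio 14 2015 Win64"):
    """Return pom_text with the generator's else value replaced."""
    return GENERATOR_ELSE.sub(
        lambda match: match.group(1) + generator + match.group(2), pom_text
    )
```

To apply it, read hadoop-hdfs-project\hadoop-hdfs\pom.xml into a string, pass it through set_win64_generator, and write the result back. Keep a copy of the original file in case the pattern matches nothing in your version of the sources.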
Building it
You should find the "Developer Command Prompt for VS2015" on Windows. I'm still wondering what is so special about it, but the fact is that the build will only work from there.
More Environment variables
These should be set in that command prompt:
- set Platform=x64
- set ZLIB_HOME=C:\zlib\include (unlike the official instructions, this should be pointing to the include folder).
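Because ZLIB_HOME is easy to get wrong, a small check that zlib.h really sits in the folder you point it to can save a failed build. A minimal sketch with hypothetical helper names; it assumes the header is directly inside the include folder and also shows how the two variables could be added to an environment for a scripted build.

```python
import os


def build_environment(zlib_include, base=None):
    """Copy the environment and add the two variables set above."""
    env = dict(os.environ if base is None else base)
    env["Platform"] = "x64"
    env["ZLIB_HOME"] = zlib_include
    return env


def has_zlib_header(zlib_include):
    """True if zlib.h sits directly inside the given folder."""
    return os.path.isfile(os.path.join(zlib_include, "zlib.h"))


if __name__ == "__main__":
    folder = r"C:\zlib\include"
    if not has_zlib_header(folder):
        print("zlib.h not found in", folder)
```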
Finally building it
Go to the hadoop source folder and issue:
mvn package -Pdist,native-win -DskipTests -Dtar
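If you prefer to drive the build from a script, the command above can be wrapped as follows. This is a sketch under assumptions: the function names are mine, and the script still has to be started from the same VS2015 command prompt so that it inherits that environment.

```python
import subprocess


def maven_command(clean=False):
    """Build the Maven argument list used in this post."""
    goals = ["clean", "package"] if clean else ["package"]
    return ["mvn", *goals, "-Pdist,native-win", "-DskipTests", "-Dtar"]


def run_build(source_dir, clean=False, env=None):
    """Run Maven in source_dir and return its exit code (0 means success)."""
    # shell=True lets Windows resolve mvn.cmd; the arguments are fixed strings.
    command = subprocess.list2cmdline(maven_command(clean))
    return subprocess.run(command, cwd=source_dir, env=env, shell=True).returncode
```

Calling run_build with clean=True corresponds to the "mvn clean package" variant, which is useful after a failed attempt.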
Still broken????
Investigate !!!
And then try to rebuild it with
mvn clean package -Pdist,native-win -DskipTests -Dtar
Or just throw in the towel and get the binaries from: https://onedrive.live.com/redir?resid=CA785B9261F68AF4!447&authkey=!AGnl0VM-t9Wt5-M&ithint=file%2cgz
What next?
Follow the official docs to get your hadoop instance configured and up and running.