Blog Archives

CentOS 7.2 VM hangs on startup

Host :  Windows 10
Guest : CentOS 7.2 64x
Tool : Oracle VirtualBox 5.2


For an unknown reason, your VM won’t startup and hangs on boot. You have tried like me many VBoxManage commands, transform raw to vdi, file access rights, … but you’re still on this screen:

CentOS hangs here


Solution :

When your machine gives you the boot choice, press “e” to enter the grub configuration. Look for “rhgb” parameter in the kernel boot commands and delete it.


This command should be in the line starting with “linux16”.

Add the command “” at the end of this line.

Press Ctrl+x to start the VM and maybe everything will work fine.

Running sample Mahout Job on Hadoop Multinode cluster

This slideshare introduction is quite interresting. It explains how K-Means algorithm works.

(Credit to Subhas Kumar Ghosh)

One common problem with Hadoop, is the unexplained hang when running a sample job. For instance, I’ve been testing Mahout (cluster-reuters) on a Hadoop multinode cluster (1 namenode, 2 slaves). A sample trace in my case looks like this listing:

15/10/17 12:09:06 INFO YarnClientImpl: Submitted application application_1445072191101_0026
15/10/17 12:09:06 INFO Job: The url to track the job:
15/10/17 12:09:06 INFO Job: Running job: job_1445072191101_0026
15/10/17 12:09:14 INFO Job: Job job_1445072191101_0026 running in uber mode : false
15/10/17 12:09:14 INFO Job:  map 0% reduce 0%

The jobs web console told me that the job State=Accepted, Final Status = UNDEFINED and the tracking UI was UNASSIGNED.

First thing I suspected, was a warning thrown by hadoop binary:

WARN NativeCodeLoader: Unable to load native-hadoop library for your 
platform... using builtin-java classes where applicable

Absolutely nothing to do with my problem. I rebuilt this jar from sources, but the job still hangs.

I reviewed the namenode logs. Nothing special. Then the Yarn different logs(resourcemanager, nodemanager). No problems. Slaves logs. Same thing.

As we don’t have much information from Hadoop logs, I went through the net for similar problems. It seems that this is a common problem related to memory configuration. I wonder why such problems are not yet logged (Hadoop 2.6). Even if I analyzed the memory consumption using JConsole, but nothing was alarming with it.

All the used machines are CentOS 6.5 virtual machines hosted on a 16 Gb RAM, i7-G5 laptop. After connecting and configuring the three machines, I realized that the allowed disk space (15Gb for the namenode, 6Gb for slaves ) is not enough. Checking the disk space usage (df  -h), only 3% of the disk space were available on the two slaves. This could be an issue, but Hadoop reports such errors.

Looking in yarn-site.xml, I remembered that I gave Yarn 2Gb to run this test jobs.


I tried doubling this value on the namenode:


Then I propagated changes to slaves and restarted everything ( && && && then


It works, finally 🙂

Now, I’m trying to visualize the clustering results using Gephi right now.

Gephi Sample clusters graph