De-Googling myself! (Step 1)

Many of us never realize how dependent on Google we are until we hit the maximum free storage capacity.

(screenshot)

But what if Google decides to withdraw the free offer? What can prevent them from doing it? WhatsApp tried it and backed down after a while … and Google is not WhatsApp. Did you ever think about how much Google knows about you?

So I decided to start de-googling myself. There are a lot of alternatives; I just have to be more patient/tolerant with the open source ones and choose carefully. I uninstalled the greedy Google Chrome, downloaded all my files from Google Drive, and am now looking for a new solution for mail and remote storage.

Many “Alternatives To” websites suggest using “mail.com”. It seems clean, and the domain name is quite interesting. But when I receive this kind of message on registration, I’m not sure I can go any further.

(screenshot)

Then I came across this beautiful “TutaNota”.

(screenshot)

A replacement for Google Services. This is what I’m looking for.

And here is where you can reach me now: sba@keemail.me


Roadmap to master BigData World

(image: data scientist roadmap)

Source: nirvacana.com

Continuous SMS service with Smslib

To create an SMS service that sends SMS messages periodically, you can use Quartz cron jobs mixed with Smslib on a Spring Framework boilerplate.

To set up a Quartz cron, please refer to this post. Next, we’ll need three SQL tables:

– Active Sms
– Archive Sms
– Smpp Gateway Configuration

The prerequisite library can be found here.

Provided scenario:

(scenario diagram)
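
To make the wiring concrete, here is a minimal standalone sketch, assuming SMSLib 3.5.x and Quartz 2.x on the classpath. In the Spring setup, the scheduler and the gateway would be declared as beans instead; the class name, phone number, and modem parameters below are placeholders.

import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;
import org.smslib.OutboundMessage;
import org.smslib.Service;
import org.smslib.modem.SerialModemGateway;

public class SmsSenderJob implements Job {

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        try {
            // In the full service, pending rows would be read from the
            // "Active Sms" table here and moved to "Archive Sms" after sending.
            OutboundMessage msg = new OutboundMessage("+21600000000", "Periodic message");
            Service.getInstance().sendMessage(msg);
        } catch (Exception e) {
            throw new JobExecutionException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Gateway parameters (port, speed, manufacturer, model) would come
        // from the "Smpp Gateway Configuration" table; these are placeholders.
        SerialModemGateway gateway =
                new SerialModemGateway("modem.gsm", "COM1", 115200, "Huawei", "E220");
        Service.getInstance().addGateway(gateway);
        Service.getInstance().startService();

        // Fire the job every 5 minutes (Quartz cron expression).
        JobDetail job = JobBuilder.newJob(SmsSenderJob.class)
                .withIdentity("smsSenderJob").build();
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("every5Minutes")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0/5 * * * ?"))
                .build();

        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}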

Download links for the latest offline Flash Player Installer

Latest Update (05 April 2016) – Flash Player 21

Tired of searching? Here are the direct download links for the offline Flash Player installer:

I found them after some searching, here:

https://helpx.adobe.com/flash-player/kb/installation-problems-flash-player-windows.html#main-pars_header

Load balancing with Apache HTTP

Balancing two workers on the same server

workers.properties

worker.list=router,status
worker.worker1.port=8009
worker.worker1.host=localhost
worker.worker1.type=ajp13
worker.worker1.lbfactor=1
worker.worker1.local_worker=1
worker.worker1.ping_timeout=1000
worker.worker1.socket_timeout=10
worker.worker1.ping_mode=A
#sticky sessions are not interesting here; they are kept just for the explanation
#sticky session = stick to the server that gave you the first session id
worker.worker1.sticky_session=true
#this property makes the load balancer work in active/passive mode:
#all requests are mapped to worker1, but
#on worker1 (server1:8009) failure, all requests are redirected to worker2
worker.worker1.redirect=worker2

worker.worker2.port=8090
worker.worker2.host=localhost
worker.worker2.type=ajp13
worker.worker2.lbfactor=1
worker.worker2.activation=disabled
worker.worker2.ping_timeout=1000
worker.worker2.socket_timeout=10
worker.worker2.ping_mode=A
worker.worker2.local_worker=0
worker.worker2.sticky_session=true

worker.router.type=lb
worker.router.balanced_workers=worker1,worker2
worker.router.sticky_session=true
worker.status.type=status

httpd.conf

...
LoadModule jk_module modules/mod_jk.so
...
#go to the end of the file and add the following
JkWorkersFile /etc/httpd/conf/workers.properties
JkShmFile /etc/httpd/logs/mod_jk.shm
JkLogFile /etc/httpd/logs/mod_jk.log
JkLogLevel debug
# Configure monitoring the LB using jkstatus
JkMount /jkstatus/* status
# Configure your applications (may be using context root)
JkMount /myRootContext* router
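
On the backend side (assuming the ajp13 workers above point at Tomcat instances), each instance must expose an AJP connector on the configured port, and for sticky sessions the jvmRoute of the Engine should match the worker name. A sketch of the relevant server.xml fragment for worker1:

<!-- server.xml of the first Tomcat instance (sketch; the second instance
     would use port="8090" and jvmRoute="worker2") -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
<Engine name="Catalina" defaultHost="localhost" jvmRoute="worker1">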

ANT could not find home… variables

Ant lost, could not find home

I was trying to compile Sqoop using the “ant package” command when I ran into the following error:

Error:

Could not find or load main class org.apache.tools.ant.launch.Launcher

Debug:

ant --execdebug

exec "/usr/lib/jvm/jdk1.7.0_79//bin/java" -classpath "/usr/bin/build-classpath: error: JVM_LIBDIR /usr/lib/jvm-exports/jdk1.7.0_79 does not exist or is not a directory:/usr/bin/build-classpath: error: JVM_LIBDIR /usr/lib/jvm-exports/jdk1.7.0_79 does not exist or is not a directory:/usr/lib/jvm/jdk1.7.0_79//lib/tools.jar" -Dant.home="/usr/share/ant" -Dant.library.dir="/usr/share/ant/lib" org.apache.tools.ant.launch.Launcher -cp ""
Error: Could not find or load main class org.apache.tools.ant.launch.Launcher

Quick fix:

The directory is missing; create an empty one:

             mkdir /usr/lib/jvm-exports/jdk1.7.0_79

Running sample Mahout Job on Hadoop Multinode cluster

This SlideShare introduction is quite interesting; it explains how the K-Means algorithm works.

(Credit to Subhas Kumar Ghosh)

One common problem with Hadoop is the unexplained hang when running a sample job. For instance, I’ve been testing Mahout (cluster-reuters) on a Hadoop multi-node cluster (1 namenode, 2 slaves). The trace in my case looked like this:

15/10/17 12:09:06 INFO YarnClientImpl: Submitted application application_1445072191101_0026
15/10/17 12:09:06 INFO Job: The url to track the job: http://master.phd.net:8088/proxy/application_1445072191101_0026/
15/10/17 12:09:06 INFO Job: Running job: job_1445072191101_0026
15/10/17 12:09:14 INFO Job: Job job_1445072191101_0026 running in uber mode : false
15/10/17 12:09:14 INFO Job:  map 0% reduce 0%

The jobs web console told me that the job’s State was ACCEPTED, its Final Status UNDEFINED, and its Tracking UI UNASSIGNED.

The first thing I suspected was a warning thrown by the Hadoop binary:

WARN NativeCodeLoader: Unable to load native-hadoop library for your 
platform... using builtin-java classes where applicable

It had absolutely nothing to do with my problem: I rebuilt the jar from source, but the job still hung.

I reviewed the namenode logs: nothing special. Then the various YARN logs (resourcemanager, nodemanager): no problems. The slaves’ logs: same thing.

Since the Hadoop logs didn’t give much information, I searched the net for similar problems. This seems to be a common problem related to memory configuration, and I wonder why such problems are still not logged (Hadoop 2.6). I even analyzed the memory consumption using JConsole, but nothing there was alarming.

All the machines are CentOS 6.5 virtual machines hosted on a 16 GB RAM, i7-G5 laptop. After connecting and configuring the three machines, I realized that the allocated disk space (15 GB for the namenode, 6 GB for the slaves) was not enough: checking the disk usage (df -h), only 3% of the disk space was still available on the two slaves. This could have been an issue, but Hadoop does report such errors.

Looking at yarn-site.xml, I remembered that I had given YARN only 2 GB to run these test jobs:

    <property>    
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2024</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>

I tried doubling this value on the namenode:

    <property>    
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>

Then I propagated the changes to the slaves, restarted everything (stop-dfs.sh && stop-yarn.sh && start-yarn.sh && start-dfs.sh), and ran:

    ./cluster-reuters.sh

It finally works 🙂
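
Why did this fix it? One plausible reading, assuming the stock Hadoop 2.6 defaults: the MapReduce ApplicationMaster alone requests 1536 MB (yarn.app.mapreduce.am.resource.mb) and each map task 1024 MB (mapreduce.map.memory.mb), so with only about 2 GB per NodeManager there is barely room to schedule anything beyond the AM, and the job sits in ACCEPTED. Shrinking the per-container requests instead of growing the NodeManager should also unblock it; a sketch for mapred-site.xml (illustrative values, not tuned):

    <property>
        <name>yarn.app.mapreduce.am.resource.mb</name>
        <value>512</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>512</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>512</value>
    </property>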

Now I’m trying to visualize the clustering results using Gephi.

(Gephi sample clusters graph)

CentOS hangs on startup after yum update


Usually it will hang right after “Starting certmonger: [OK]”.

Start up the hanging machine (H1).
From another machine, do:

> ssh H1
> su
> mv /etc/X11/xorg.conf /etc/X11/xorg.conf.bak

Now restart H1.

You can get more details by doing a “tail -n 200 /var/log/Xorg.0.log” on the hanging machine.

How to remove duplicate rows in MySQL in one command!

This simple MySQL command on the InnoDB engine may save you 30 minutes if you want to remove duplicate copies of a row. Consider a table called “your_table_name” with many columns: column1, column2, column3, …. You define a duplicate row as a row having the same values in column1 and column2. You just have to create a unique index on those columns:

ALTER IGNORE TABLE your_table_name ADD UNIQUE INDEX give_ur_index_a_name (column1, column2);

Commit, and check your table again… the duplicate rows are gone!

Update: in some versions of MySQL, “ALTER IGNORE” won’t ignore the duplicate-key errors, so you may have to run the following command first:

set session old_alter_table=1;

After adding the index, do not forget to set old_alter_table to “0” again.
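
A quick worked example (MySQL 5.5/5.6; the table and column names are illustrative, and depending on your version you may need the old_alter_table trick above first):

-- a table where (column1, column2) identifies a row
CREATE TABLE your_table_name (
    column1 INT,
    column2 INT,
    column3 VARCHAR(50)
) ENGINE=InnoDB;

INSERT INTO your_table_name VALUES
    (1, 1, 'first'),
    (1, 1, 'duplicate of the first'),
    (1, 2, 'second');

-- drops the second row, which duplicates (1, 1); the first one is kept
ALTER IGNORE TABLE your_table_name
    ADD UNIQUE INDEX give_ur_index_a_name (column1, column2);

SELECT COUNT(*) FROM your_table_name;  -- 2 rows left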

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

A New York City subway train holds 1,200 people. This blog was viewed about 3,800 times in 2014. If it were a NYC subway train, it would take about 3 trips to carry that many people.

Click here to see the complete report.