One of my old reported vulnerabilities was published: CVE-2014-8733


It was fun to work with Hadoop security in 2014… This vuln was a tricky one because I was responsible for Hadoop managed service platform security, and our clients had SSH access to Hadoop cluster nodes in some cases.

If I remember correctly, the fix wasn’t easy: it required a new CDH release that moved configuration parameters between files (the Hadoop client needed world-readable access to some of them in order to function). And then, several months later, the issue reappeared after another patch.

ESXi and Kali weekend

I installed Kali Linux in my virtual lab this weekend, just to take a snapshot of the currently available packages and, as usual, steal a couple of ideas for my own pentest Linux VM. Two ideas I will never steal from Kali are the Safari icon for Firefox and the use of GNOME 3.

Hadoop without Kerberos – simple attack examples

In this post, I am going to illustrate that it’s practically impossible to protect any data in Hadoop clusters without Kerberos (‘Secure mode’) enabled. I hope this will help admins and security folks see that Kerberos is the only way to make Hadoop more or less secure – without it, there is no authentication in Hadoop at all. But as you can see from my previous posts about Hadoop, even with Kerberos enabled, there are still very serious challenges, so Kerberos is just a start, not the final solution.

This time, I will focus on the most important component of the Hadoop ecosystem: HDFS, Hadoop’s distributed file system, which in most cases stores all the data in a Hadoop cluster.
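To make the "no authentication" point concrete, here is a minimal sketch (not the attack walkthrough from the full post) of how a client states its own identity when Kerberos is disabled. It relies on the documented WebHDFS `user.name` query parameter; the host name `namenode.example.com`, the path, and port 50070 (the usual NameNode HTTP port in Hadoop 2.x) are assumptions for illustration, and the helper `webhdfs_url` is hypothetical. The code only builds the URL and sends nothing over the network.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(namenode, path, op, user, port=50070):
    """Build a WebHDFS REST URL that asserts `user` via the user.name parameter.

    With simple (non-Kerberos) authentication the server trusts this value;
    the caller does not have to prove that it really is that user.
    """
    query = urlencode({"op": op, "user.name": user})
    return f"http://{namenode}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical cluster: list Bob's home directory while claiming to be 'hdfs'.
url = webhdfs_url("namenode.example.com", "/user/bob", "LISTSTATUS", "hdfs")
print(url)
```

Nothing in the request proves that the caller is `hdfs`; in simple mode the name is accepted as stated.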

Continue reading examples

In my previous post, “An important Hadoop security configuration parameter you may have missed”, I talked about the importance of the auth_to_local configuration parameter and promised to provide some solutions that use it.

I want to focus on a couple of practical use examples in this post, and if you want to learn more about this, here are links to the existing documentation:

Continue reading

Kaspersky Antivirus appears to have become just another piece of bloatware

So disappointed… After all these years I finally decided to buy Kaspersky, and it turns out it has become just another piece of bloatware. Guess the folks don’t care about their firm’s karma anymore…


Configuring Cloudera Navigator to use external authentication

Cloudera, author of one of the most popular Hadoop distributions, has created a great tool for Hadoop security monitoring and auditing, called Cloudera Navigator. I find its initial configuration process a little bit tricky, so I wanted to document it in this post. Cloudera’s original document on how to do this is located here:

I currently use the latest version of the Cloudera Hadoop distribution, with Cloudera Manager 5.3.1 (trial enterprise license) and Navigator 2.2.1. Navigator openly shows its full version and build in a tooltip on its logo and in the ‘About’ section right on the login page (so if a vulnerability is published in the future, hackers won’t need to spend time finding out the target’s version ;-) ):


Continue reading

An important Hadoop security configuration parameter you may have missed

Hadoop has one security parameter whose importance, I think, is not stressed enough in the currently published documentation. While there are instructions on how to configure it, I have not seen anyone talk about the consequences of leaving this parameter at its default value, and as far as I know, almost nobody ever changes it because of its complexity. This parameter is hadoop.security.auth_to_local: “Maps kerberos principals to local user names”

(description from current core-default.xml)

It tells Hadoop how to translate Kerberos principals into Hadoop user names. By default, it simply translates <user>/<part2>@<DOMAIN> into <user> for the default domain (it ignores the second part of the Kerberos principal). Here’s what the current Apache Hadoop documentation says about it:

“By default, it picks the first component of principal name as a user name if the realms matches to the default_realm (usually defined in /etc/krb5.conf). For example, host/ is mapped to host by default rule.”

This means that if, for example, you have users named hdfs, Alyce and Bob, and they use the following principals to authenticate with your cluster:

Alyce – alyce@YOUR.DOMAIN,

If auth_to_local is not configured in your cluster, those are not the only principals that can authenticate as your Hadoop users. The following principals, if they exist, will also become your HDFS, Alyce and Bob under the default mapping:

hdfs/host123.your.domain@YOUR.DOMAIN => hdfs
hdfs/clusterB@YOUR.DOMAIN => hdfs
alyce/team2@YOUR.DOMAIN => Alyce
alyce/something.else@YOUR.DOMAIN => Alyce
bob/library@YOUR.DOMAIN => Bob
bob/research@YOUR.DOMAIN => Bob

… (very, very large list of possible combinations of second part of Kerberos principal and domain name) …

hdfs/<anything>@YOUR.DOMAIN is HDFS
alyce/<anything>@YOUR.DOMAIN is Alyce
bob/<anything>@YOUR.DOMAIN is Bob
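The default behaviour described above can be modelled in a few lines. This is only a sketch of the rule as the documentation states it (keep the first component when the realm is the default one), not Hadoop’s actual implementation; the function name `default_auth_to_local` is hypothetical, and the mapped names come out in the same lower case as the principals.

```python
def default_auth_to_local(principal, default_realm):
    """Model of the default rule: return the first component of the principal
    if its realm equals the default realm, otherwise None (no mapping)."""
    name, sep, realm = principal.rpartition("@")
    if not sep:
        # Assumption: a principal written without a realm is in the default realm.
        name, realm = principal, default_realm
    if realm != default_realm:
        return None
    # Everything after the first '/' (the second component) is ignored.
    return name.split("/", 1)[0]


principals = [
    "hdfs@YOUR.DOMAIN",
    "hdfs/host123.your.domain@YOUR.DOMAIN",
    "alyce@YOUR.DOMAIN",
    "alyce/team2@YOUR.DOMAIN",
    "bob/library@YOUR.DOMAIN",
    "bob/research@OTHER.DOMAIN",
]

for p in principals:
    print(p, "=>", default_auth_to_local(p, "YOUR.DOMAIN"))
```

In this model every `<user>/<anything>` principal in the default realm collapses to the same short name, which is exactly the many-identities-per-user problem; only the principal from `OTHER.DOMAIN` is left unmapped.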

For many regulatory bodies and auditing companies, it is a baseline security requirement that every user on the system has exactly one unique identity. As we just learned, in Hadoop, by default, users can de facto be identified by an almost infinite number of IDs. Malicious users inside the company can exploit this to get access to sensitive data or to take full control of the cluster.

Let’s look at an example:

First, user Bob, with principal bob@LAB.LOCAL, uploads a file ‘secret.txt’ to his home directory in HDFS and ensures it’s protected by access lists:


Continue reading