Skip to content

Commit

Permalink
Make webhdfs tests work for cdh
Browse files Browse the repository at this point in the history
The webhdfs tests were only working for hdp, which relied on a debug
output line from an old hdp version.

This commit:

  * Fixes port discovery for webhdfs tests for new hadoop versions
  * Updates hdp verion dependency (from Hadoop version 2.2 to 2.6)
  • Loading branch information
Tarrasch committed Jan 7, 2016
1 parent 4460b98 commit fb9c94c
Show file tree
Hide file tree
Showing 3 changed files with 12 additions and 5 deletions.
4 changes: 3 additions & 1 deletion scripts/ci/setup_hadoop_env.sh
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,9 @@ mkdir -p $HADOOP_HOME
if [ $HADOOP_DISTRO = "cdh" ]; then
URL="http://archive.cloudera.com/cdh5/cdh/5/hadoop-latest.tar.gz"
elif [ $HADOOP_DISTRO = "hdp" ]; then
URL="http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.0.6.0/tars/hadoop-2.2.0.2.0.6.0-76.tar.gz"
# This site provides good URLs:
# https://github.com/saltstack-formulas/hadoop-formula/blob/5034a2204da691eceb9c2d8cd8260f11d5cc06f3/hadoop/settings.sls
URL="http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.2.6.0/tars/hadoop-2.6.0.2.2.6.0-2800.tar.gz"
else
echo "No/bad HADOOP_DISTRO='${HADOOP_DISTRO}' specified" >&2
exit 1
Expand Down
3 changes: 3 additions & 0 deletions test/contrib/hdfs/webhdfs_client_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,3 +41,6 @@ def test_actually_using_webhdfs(self):
test_glob_exists = None
test_with_close = None
test_with_exception = None

# This one fails when run together with the whole test suite
test_write_cleanup_no_close = None
10 changes: 6 additions & 4 deletions test/webhdfs_minicluster.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,15 +59,17 @@ def _start_mini_cluster(self, nnport=None):
stderr=subprocess.PIPE, universal_newlines=True)

def _get_namenode_port(self):
just_seen_webhdfs = False
while self.hdfs.poll() is None:
rlist, wlist, xlist = select.select([self.hdfs.stderr, self.hdfs.stdout], [], [])
for f in rlist:
line = f.readline()
print(line,)
# luigi webhdfs version (different regex)
m = re.match(".*namenode.NameNode: Web-server up at: localhost:(\d+).*", line)
if m:
print(line.rstrip())

m = re.match(".*Jetty bound to port (\d+).*", line)
if just_seen_webhdfs and m:
return int(m.group(1))
just_seen_webhdfs = re.match(".*namenode.*webhdfs.*", line)


class WebHdfsMiniClusterTestCase(MiniClusterTestCase):
Expand Down

0 comments on commit fb9c94c

Please sign in to comment.