Tutorial Aggregating CSV Data (Spatial Binning) Issue #30

Closed
dataunleashed opened this issue May 19, 2015 · 3 comments

@dataunleashed

I am using Cloudera CDH 5.3.3 with Hive 0.13.1 (/opt/cloudera/parcels/CDH-5.3.3-1.cdh5.3.3.p0.5/jars/hive-common-0.13.1-cdh5.3.3.jar) on a 4-node cluster VM environment.
I followed the Aggregating CSV Data tutorial (https://github.com/Esri/gis-tools-for-hadoop/wiki/Aggregating-CSV-Data-%28Spatial-Binning%29) and everything appeared to work. However, after step 11, a simple SELECT query returned nothing, even though the files existed in the table's Hive folder. As a result, when I followed step 12 to bring the resulting taxi_agg table into ArcMap, the JSON file was empty.

Here are some screenshots:

  1. Table taxi_demo location:
    image
  2. Files in the HDFS folder for Hive taxi_demo table:
    image
  3. Data in the taxi_demo table Hive folder:
    image
  4. Hive query returns nothing:
    image

image

Please help.

Thanks,

Tom

@smambrose
Contributor

Can you try the solution at the bottom here : #25

Please let us know if it works, or if you run into further issues.

@dataunleashed
Author

@smambrose, thank you for your quick response. I followed the solution at the bottom of #25 as you suggested, and it didn't work either. Pretty much everything ran, and I could see the JSON-format result bins in the output file in the Hive folder, but when I ran the Hive query or the ArcMap Copy from HDFS tool, I just got empty output.

hive> select * from agg_samp limit 1;
OK
Time taken: 0.048 seconds

I don't know if this is a Cloudera CDH 5.3.3 issue or not, but it worked fine when I tried the solution on the latest Hortonworks HDP 2.2.4 Sandbox with Hive 0.14. Is there anything else that I can try or may have overlooked?

We are looking to implement your great work on the ESRI for Hadoop Framework on my company's Cloudera production cluster, so any help would be greatly appreciated.

Thanks,

Tom

@sarah1billion

Hi @dataunleashed,

I have not been able to reproduce this yet; I have been trying on a Cloudera 5.3.0 VM. What result do you get when you run: select * from taxi_demo limit 1;
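
A few related checks might also help narrow this down. This is only a sketch using standard Hive CLI commands, not a confirmed fix: the table names are the ones mentioned in this thread, and the HDFS path is a placeholder that should be replaced with the Location value Hive reports.

```sql
-- Show where Hive thinks each table lives, plus its SerDe and input format.
DESCRIBE FORMATTED taxi_demo;
DESCRIBE FORMATTED agg_samp;

-- List the files under the table location from inside the Hive CLI.
-- Placeholder path (an assumption): use the Location reported above.
dfs -ls /path/to/table/location;

-- Check whether any rows are readable at all.
SELECT COUNT(*) FROM taxi_demo;
SELECT COUNT(*) FROM agg_samp;
```

If the reported Location differs from the folder that actually holds the files, or the counts come back as zero while the files are non-empty, that would point at the table definition rather than the data.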
