When joining two tables, I would like to select all columns except two from a large table with many columns, using PySpark SQL on Databricks. My PySpark SQL: %
I am attempting to use Scala with Apache Spark locally to query a Hive table that is secured with Kerberos. I have no issues connecting and querying the data pro
I am trying to build a data migration pipeline using Airflow, with a Hive table on a Dataproc cluster as the source and BigQuery as the destination. I'm using Datapro
HBase Table rowkey: 2020-02-02^ghfgewr3434555, cf:1 timestamp=1604405829275, value=true rowkey: 2020-02-02^ghfgewr3434555, cf:2 timestamp=1604405829275, value=
I am trying this query in Hive and it's not working. select ( ( select count(*) from click_streaming where page_
I'm trying to import MongoDB data into Hive. The JAR versions that I have used are: ADD JAR /root/HDL/mongo-java-driver-3.4.2.jar; ADD JAR /root/HDL/mongo-hado
I create a table in hive: CREATE TABLE `test3`.`shop_dim` ( `shop_id` bigint, `shop_name` string, `shop_company_id`
I'm new to Hive SQL and I'm trying to extract a value from the column col_a of the data df, which is in this format: \\\"id\\\":\\\"101_12345\\\" I only need to
I am running one SQL query in Hive and it gives different results with CBO enabled and disabled. The results are wrong when CBO is enabled (set hive.cbo.enable=
I am trying to create an external table using Hive with Hadoop, but it fails. These are the errors I get when I try to run my queries: 02:23:29.516 [Hive
What's the data format of the .csv.metadata files written by Amazon Athena? Alongside the output file of every query there is a metadata file. It looks like it
I have been working with Hive and found something peculiar. When using double as the data type for a column, we do not need to specify any precision (
My query looks like this, but I am getting an error: select a.account_number,b.reference_acc from hdd.master_record format1 a join hdd.monetary b on a.load_date = b.load_
When I run: hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console It shows Starting HiveServer2
From hive -h : --hiveconf <property=value> Use value for given property --hivevar <key=value> Variable subsitution to apply to hive
Hive has min(col) to find the minimum value of a column. But how do I find the minimum of multiple values (not of one column), for example select min(2,1,3,4
One of our gateway machines is getting a continuous error on Hive. While we are trying to execute any (select, insert, or drop) command in Beeline, frequent
I have a use case to store dynamic JSON objects in a column in BigQuery. The schema of the object is dynamically generated by the source and not known beforeha
I have collected Twitter data on HDFS using Flume. I have a 3-node cluster and a MySQL metastore for Hive. When I execute the query below: select user_name.screen_name, user_n
I have written the Hive query below. It gives me the error stated in the title. The query is: SELECT clnt_nbr, CASE WHEN clnt_nbr i