Category "hive"

What does the HiveWarehouseConnector executeUpdate() function return?

I can't believe I have to ask this here but there seems to be no documentation on what the HWC actually does. All I can find is that it returns a boolean: publi

How to select all columns except 2 of them from a large table on pyspark sql?

In joining two tables, I would like to select all columns except 2 of them from a large table with many columns on pyspark sql on databricks. My pyspark sql: %

How to use Apache Spark to query Hive table with Kerberos?

I am attempting to use Scala with Apache Spark locally to query a Hive table that is secured with Kerberos. I have no issues connecting and querying the data pro

How to extract the query result from a Hive job output logs using DataprocHiveOperator?

I am trying to build a data migration pipeline using Airflow, source being a Hive table on a Dataproc cluster and the destination is BigQuery. I'm using Datapro

How to split HBase row key into 2 columns in Hive table

HBase Table rowkey: 2020-02-02^ghfgewr3434555, cf:1 timestamp=1604405829275, value=true rowkey: 2020-02-02^ghfgewr3434555, cf:2 timestamp=1604405829275, value=

Hive query to find conversion ratio

I am trying this query in Hive and it's not working. select ( ( select count(*) from click_streaming where page_

Import MongoDB data into Hive Error: Splitter implementation is incompatible

I'm trying to import MongoDB data into Hive. The jar versions that I have used are ADD JAR /root/HDL/mongo-java-driver-3.4.2.jar; ADD JAR /root/HDL/mongo-hado

Hive record inserted, but then I get an error

I created a table in Hive: CREATE TABLE `test3`.`shop_dim` ( `shop_id` bigint, `shop_name` string, `shop_company_id`

Hive SQL regexp_extract (number)_(number)

I'm new to hiveSQL and I'm trying to extract a value from the column col_a from the data df which is in this format: \\\"id\\\":\\\"101_12345\\\" I only need to

HIVE CBO. Wrong results with Hive SQL query with MULTIPLE IN conditions in where clause

I am running one SQL query in Hive and it gives different results with CBO enabled and disabled. The results are wrong when CBO is enabled (set hive.cbo.enable=

Error while trying to create external table in hive

I am trying to create an external table using Hive with Hadoop, but it failed. These are the errors I get when I try to run my queries. 02:23:29.516 [Hive

What's the data format of Athena's .csv.metadata files?

What's the data format of the .csv.metadata files written by Amazon Athena? Alongside the output file of every query there is a metadata file. It looks like it

Hive - double precision

I have been working on hive and found something peculiar. Basically, while using double as a datatype for your column we need not have any precision specified (

encountered : identifier expected cross, having,inner left, limit,order,right,where,commacausedby:exception syntax error

My query looks like this, but I am getting an error: select a.account_number,b.reference_acc from hdd.master_record format1 a join hdd.monetary b on a.load_date = b.load_

HiveServer2 not listening on port 10000 and 10001

When I run: hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console It shows Starting HiveServer2

What is the difference between -hivevar and -hiveconf?

From hive -h : --hiveconf <property=value> Use value for given property --hivevar <key=value> Variable subsitution to apply to hive

How to find the min of multiple values in HIVE?

Hive has min(col) to find the minimum value of a column. But how about finding the minimum of multiple values (NOT one column), for example select min(2,1,3,4

Error :org.apache.thrift.transport.TTransportException java.net.SocketException: Broken pipe (Write failed) (State=08S01,code=0)

One of our gateway machines is getting a continuous error on Hive. While we are trying to execute any (select, insert, or drop) command in Beeline, frequent

How to store dynamically generated JSON object in BigQuery Table?

I have a use case to store dynamic JSON objects in a column in BigQuery. The schema of the object is dynamically generated by the source and not known beforeha

Hive Error : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

I have collected Twitter data on HDFS using Flume. I have a 3-node cluster and a MySQL metastore for Hive. When I execute the query below: select user_name.screen_name, user_n