Category "scala"

Kafka Admin Client giving Timeout Error for ListTopic

Hi, I am trying to run this code; it works fine on one EC2 Azkaban instance but gives the error below on another instance. private val adminprop
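A minimal sketch of the call in question, with a placeholder broker address; a timeout on listTopics usually means the bootstrap servers are unreachable from the failing instance (security groups, VPC routing, or advertised listeners) rather than a code problem:

    import java.util.Properties
    import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, ListTopicsOptions}

    object ListTopicsExample {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        // Placeholder address: verify this host is reachable from the instance
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092")
        props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "15000")

        val admin = AdminClient.create(props)
        try {
          val names = admin.listTopics(new ListTopicsOptions().timeoutMs(15000)).names().get()
          names.forEach(n => println(n))
        } finally admin.close()
      }
    }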

YAML Environment Variable Interpolation in SnakeYAML (Scala)

Leveraging the best from SnakeYAML & Jackson in Scala, I am using the following method to parse YAML files. This method supports the usage of anchors in YAM
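SnakeYAML does not interpolate environment variables on its own, so one common approach is to substitute them in the text before parsing. A sketch under that assumption (the pattern and helper names are mine):

    import org.yaml.snakeyaml.Yaml

    object EnvYaml {
      // Matches ${VAR} and ${VAR:-default}
      private val EnvPattern = raw"\$$\{([A-Za-z0-9_]+)(?::-([^}]*))?\}".r

      // Replace each placeholder with the environment value (or the default)
      def interpolate(text: String): String =
        EnvPattern.replaceAllIn(text, m =>
          scala.util.matching.Regex.quoteReplacement(
            sys.env.getOrElse(m.group(1), Option(m.group(2)).getOrElse(""))))

      def load(text: String): Object =
        new Yaml().load[Object](interpolate(text))
    }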

How to implement subtype resolution of typeclass in scala

I want to understand how to go about implementing the following use-case using typeclasses in Scala (or find out if it is even possible). Given a sealed trait a
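Typeclass instances are resolved statically, so a value typed as the sealed trait will not pick up a subtype's instance by itself; the usual workaround is a supertype instance that pattern-matches down to the concrete ones. A sketch with illustrative names:

    sealed trait Animal
    final case class Dog(name: String) extends Animal
    final case class Cat(name: String) extends Animal

    trait Show[A] { def show(a: A): String }
    object Show {
      implicit val dogShow: Show[Dog] = d => s"Dog(${d.name})"
      implicit val catShow: Show[Cat] = c => s"Cat(${c.name})"
      // The supertype instance delegates to the per-subtype instances
      implicit val animalShow: Show[Animal] = new Show[Animal] {
        def show(a: Animal): String = a match {
          case d: Dog => dogShow.show(d)
          case c: Cat => catShow.show(c)
        }
      }
    }

    def render[A](a: A)(implicit s: Show[A]): String = s.show(a)
    // render(Dog("Rex")) uses dogShow; render(Dog("Rex"): Animal) uses animalShow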

Return ids after upsert in Slick

I have a query that upserts the data to the database via Slick. I'd like to return the ids of the entities that were inserted. How can I do this using Slick in
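With a profile that supports native upserts (e.g. PostgreSQL), Slick lets you combine returning with insertOrUpdate. A sketch against a hypothetical users table:

    import slick.jdbc.PostgresProfile.api._

    // Hypothetical table; adjust names and types to your schema
    case class User(id: Long, email: String)
    class Users(tag: Tag) extends Table[User](tag, "users") {
      def id    = column[Long]("id", O.PrimaryKey)
      def email = column[String]("email")
      def * = (id, email) <> (User.tupled, User.unapply)
    }
    val users = TableQuery[Users]

    // returning + insertOrUpdate yields the id as an Option; the exact
    // semantics on the update path depend on the profile's upsert support
    def upsertReturningId(u: User): DBIO[Option[Long]] =
      (users returning users.map(_.id)).insertOrUpdate(u)

    // For a batch: DBIO.sequence(rows.map(upsertReturningId)).map(_.flatten)
    // (the .map needs an implicit ExecutionContext)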

Why is the Spark bucket number not equal to the number of files in the partition?

val spark = SparkSession.builder().appName("Spark SQL basic example").config("spark.master", "local").getOrCreate() import spark.implicits._ case class Someth
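For context, each write task can emit one file per bucket it holds data for, so the file count is roughly numWriteTasks x numBuckets rather than numBuckets. Repartitioning on the bucket column first collapses it to one file per bucket; a sketch:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("Bucketing sketch").master("local[*]").getOrCreate()
    import spark.implicits._

    val df = (1 to 1000).map(i => (i, s"v$i")).toDF("id", "value")

    // One shuffle partition per bucket => one file per bucket
    df.repartition(4, $"id")
      .write
      .bucketBy(4, "id")
      .sortBy("id")
      .mode("overwrite")
      .saveAsTable("bucketed_things")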

ScalaTest: how to assert a lengthy exception message securely and cleanly without hardcoding?

I have the following code, which is used to (sha) hash columns in a spark dataframe: import org.apache.spark.sql.DataFrame import org.apache.spark.sql.functions
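One common pattern is to intercept the exception and assert only on the stable fragments of the message rather than the full text. A sketch with a stand-in for the real hashing call:

    import org.scalatest.flatspec.AnyFlatSpec
    import org.scalatest.matchers.should.Matchers

    class HashingSpec extends AnyFlatSpec with Matchers {
      "hashing a missing column" should "fail with a helpful message" in {
        val ex = intercept[IllegalArgumentException] {
          // stand-in for the real call that hashes dataframe columns
          throw new IllegalArgumentException("Column 'foo' not found among [a, b, c]")
        }
        // Assert on stable fragments instead of hardcoding the whole message
        ex.getMessage should include ("Column 'foo'")
        ex.getMessage should include ("not found")
      }
    }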

Import custom UDF from jar to Spark

I am using Jupyter notebook for running Spark. My problem arises when I am trying to register a UDF from my custom imported jar. This is how I create the UDF in
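If the jar ships a Hive-style UDF class and the session has Hive support, it can be registered straight from SQL; the jar path and class name below are placeholders:

    spark.sql("ADD JAR /home/jovyan/jars/my-udfs.jar")
    spark.sql("CREATE TEMPORARY FUNCTION my_func AS 'com.example.udf.MyFunc'")
    spark.sql("SELECT my_func(value) FROM my_table").show()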

Receive a Kafka message through Spark Streaming and delete Phoenix/HBase data

In my project, I have the current workflow: Kafka message => Spark Streaming/processing => Insert/Update to HBase and/or Phoenix Both the Insert and Updat
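Phoenix supports DELETE over JDBC, so one sketch is to issue the deletes per partition inside foreachRDD (connection URL, table, and column names are placeholders):

    import java.sql.DriverManager
    import org.apache.spark.streaming.dstream.DStream

    def deleteKeys(stream: DStream[String]): Unit =
      stream.foreachRDD { rdd =>
        rdd.foreachPartition { keys =>
          val conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")
          try {
            val ps = conn.prepareStatement("DELETE FROM MY_TABLE WHERE ID = ?")
            keys.foreach { k => ps.setString(1, k); ps.executeUpdate() }
            conn.commit() // Phoenix connections do not autocommit by default
          } finally conn.close()
        }
      }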

Transform Stream

I have a GenericRecord stream with values deserialised using Avro; the schema has name and age. KafkaSource<GenericRecord> source = KafkaSource.<GenericRec
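A sketch of the transform step using the Flink Scala API, assuming `source` is the KafkaSource[GenericRecord] from the question and that age is an Avro int:

    import org.apache.avro.generic.GenericRecord
    import org.apache.flink.api.common.eventtime.WatermarkStrategy
    import org.apache.flink.streaming.api.scala._

    case class Person(name: String, age: Int)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env
      .fromSource(source, WatermarkStrategy.noWatermarks[GenericRecord](), "kafka-avro")
      // Project the Avro fields into a plain case class stream
      .map(rec => Person(rec.get("name").toString, rec.get("age").asInstanceOf[Int]))
      .print()
    env.execute("transform-stream")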

ScalaPb sbt: Import "data/common/num.proto" was not found or had errors

My project structure is: logs/data/pubs/invent.proto and logs/data/common/num.proto. NOTE - The .proto files are not under src/main/protobu
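One way to make that import resolve is to point sbt-protoc at the non-standard proto root, so that data/common/num.proto is relative to it. A build.sbt sketch under that assumption:

    import scalapb.compiler.Version.scalapbVersion

    // Treat logs/ as the proto root; imports like "data/common/num.proto"
    // are then resolved against it (protoSources are also include paths)
    Compile / PB.protoSources := Seq(baseDirectory.value / "logs")
    Compile / PB.targets := Seq(
      scalapb.gen() -> (Compile / sourceManaged).value / "scalapb"
    )
    libraryDependencies +=
      "com.thesamet.scalapb" %% "scalapb-runtime" % scalapbVersion % "protobuf"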

How can one execute Java code using Gatling load testing framework?

I am evaluating different load testing tools. After trying JMeter and having two exceptions when running and viewing the test result, I would like to give Gatli
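Yes: Gatling's exec block accepts a function over the session, so arbitrary JVM (including Java) code can run inside a scenario. A sketch, with MyJavaClient standing in for your own Java code:

    import io.gatling.core.Predef._

    class JavaCodeSimulation extends Simulation {
      // Any JVM code can run here; return the (possibly updated) session
      val scn = scenario("call java")
        .exec { session =>
          val result = new com.example.MyJavaClient().call()
          session.set("result", result)
        }

      setUp(scn.inject(atOnceUsers(1)))
    }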

Scala spark UDF function that takes input and puts it in an Array

I am trying to create a Scala UDF for Spark, that can be used in Spark SQL. The objective of the function is to accept any column type as input, and put it in a
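A UDF cannot be fully type-generic, so you either lean on the built-in array function (which wraps any column type without a UDF) or define one UDF per element type. A sketch, where df stands for your DataFrame:

    import org.apache.spark.sql.functions.{array, col, udf}

    // Built-in: wraps a column of any type into a single-element array
    val wrapped = df.withColumn("arr", array(col("value")))

    // UDF route: Spark needs a concrete element type per UDF
    val wrapString = udf((s: String) => Array(s))
    val wrapInt    = udf((i: Int)    => Array(i))
    val viaUdf = df.withColumn("arr", wrapString(col("value")))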

Spark: how to convert a JSON string to a struct column without a schema

Spark: 3.0.0 Scala: 2.12.8 My data frame has a column with JSON string and I want to create a new column from it with the StructType. |temp_json_string
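Spark can infer the schema from a sample row with schema_of_json and then parse the whole column with from_json, assuming all rows share the sample's shape. A sketch, where df is the frame from the question:

    import org.apache.spark.sql.functions.{col, from_json, lit, schema_of_json}
    import spark.implicits._

    // Take one sample value, infer its schema, parse every row with it
    val sample = df.select(col("temp_json_string")).as[String].head
    val parsed = df.withColumn(
      "temp_json",
      from_json(col("temp_json_string"), schema_of_json(lit(sample)))
    )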

Why can't the Scala compiler disambiguate a property called `type`?

Every day I build another case class and wish I could define a property called `type` on it, but to do so requires using the highly annoying backtick syntax: dooh
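For reference, type is a reserved keyword (type aliases and type members), so the parser cannot treat it as a plain identifier; backticks are the only escape hatch:

    // `type` escapes the keyword everywhere the identifier appears
    case class Document(`type`: String, body: String)

    val d = Document(`type` = "invoice", body = "...")
    println(d.`type`)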

How to find the number of Inserts and Updates of Merge command?

I have code similar to this in Spark (Scala). I would like to know the number of records this code updated/inserted when execute() is complete. Is there a way?
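If the merge is a Delta Lake MERGE (an assumption here), the commit's operationMetrics expose exactly those numbers, e.g. numTargetRowsInserted and numTargetRowsUpdated:

    import io.delta.tables.DeltaTable

    // Inspect the most recent commit on the (placeholder) target table
    val lastOp = DeltaTable.forName(spark, "target_table")
      .history(1)
      .select("operation", "operationMetrics")
    lastOp.show(truncate = false)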

Scala Kafka exception: NoSuchMethodError: org.apache.avro.Schema.toString

I'm developing Kafka producer code in Scala with these libs (I have to use version >6.X of the Kafka Avro serializer for TLS communication): <dependency
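NoSuchMethodError on org.apache.avro.Schema usually means two Avro versions on the classpath (the >6.x serializer pulls a newer Avro than another dependency). A quick Scala check of which jar actually supplies the class:

    // Prints the jar that loaded the conflicting class; then align or
    // exclude dependencies so only one Avro version remains
    val avroJar = classOf[org.apache.avro.Schema]
      .getProtectionDomain.getCodeSource.getLocation
    println(avroJar)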

Error writing dataframe to Cassandra table on Amazon Keyspaces

I'm trying to write a dataframe to AWS Keyspaces, but I'm getting the messages below: Stack: dfExploded.write.cassandraFormat(table = "table", keyspa
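For what it's worth, Keyspaces only accepts LOCAL_QUORUM for writes, which is a common cause of these errors; a sketch with placeholder keyspace/table names:

    import org.apache.spark.sql.cassandra._

    dfExploded.write
      .cassandraFormat(table = "table", keyspace = "ks")
      // Keyspaces rejects other consistency levels on writes
      .option("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")
      .mode("append")
      .save()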

CodecNotFoundException while writing to Amazon Keyspaces

I am trying to write a Spark DF to AWS Keyspaces. Randomly some of the records are getting updated and some of the records are throwing this exception com.datas
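CodecNotFoundException typically means a DataFrame column type does not match the CQL type the Keyspaces table declares; casting the columns explicitly before writing avoids it. Column names and types below are placeholders:

    import org.apache.spark.sql.functions.col

    // Align each DataFrame column with the target table's CQL type
    val aligned = df
      .withColumn("id",    col("id").cast("string"))     // CQL text
      .withColumn("price", col("price").cast("double"))  // CQL double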

Spark/Scala approximate group by

Is there a way of counting approximately after a group by on an SQL dataset in Spark? Or more generally, what is the fastest way of group-by counting in Spark?
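For distinct counts per group there is approx_count_distinct (HyperLogLog++); plain per-group counts have no approximate variant in the DataFrame API, since they are already a single shuffle. A sketch, where df stands for the dataset:

    import org.apache.spark.sql.functions.{approx_count_distinct, count}

    // Approximate distinct count per group; rsd is the max relative
    // standard deviation (accuracy/speed trade-off)
    val approx = df.groupBy("key")
      .agg(approx_count_distinct("value", rsd = 0.05).as("approx_distinct"))

    // Exact count per group: one shuffle, usually hard to beat
    val exact = df.groupBy("key").agg(count("*").as("cnt"))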

How to install a specific version of Spark with a specific version of Scala

I'm running Spark 2.4.5 on my Mac. When I execute spark-submit --version it prints the Spark ASCII banner followed by the version info
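If the goal is matching your build to the distribution, pin the Scala binary version Spark was built with in build.sbt (Spark 2.4.5 is published for Scala 2.11 by default, with 2.12 builds also available). A sketch:

    // build.sbt: scalaVersion must match the Spark distribution's Scala build
    scalaVersion := "2.11.12"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "2.4.5" % "provided",
      "org.apache.spark" %% "spark-sql"  % "2.4.5" % "provided"
    )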