Category "scala"

SonarQube FindBugs error and ##[error]java.lang.IllegalStateException: Can not execute Findbugs

I have been really struggling with this for months. We are trying to scan Scala code, which lives in Databricks, with SonarQube in Azure DevOps. We were getting around 30 errors. But…

Finding how many times a given string is a substring of another string using Scala

What is an elegant way to find how many times a given string occurs as a substring of another string in Scala? The test cases below should make clear wh…
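A minimal sketch of two common options; which one fits depends on whether the asker's test cases count overlapping matches:

```scala
// Overlapping occurrences via sliding windows: "aaa" contains "aa" twice.
def countOverlapping(haystack: String, needle: String): Int =
  if (needle.isEmpty) 0
  else haystack.sliding(needle.length).count(_ == needle)

// Non-overlapping occurrences via indexOf: "aaa" contains "aa" once.
def countNonOverlapping(haystack: String, needle: String): Int = {
  @annotation.tailrec
  def loop(from: Int, acc: Int): Int = haystack.indexOf(needle, from) match {
    case -1 => acc
    case i  => loop(i + needle.length, acc + 1)
  }
  if (needle.isEmpty) 0 else loop(0, 0)
}
```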

Explode a JSON array in a Spark Scala DataFrame

Let's say I have a DataFrame which looks like this: +--------------------+--------------------+--------------------------------------------------------------+…
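A sketch of the usual pattern, assuming the array sits in a string column named payload and each element has id and value fields (all three names are placeholders for the asker's actual schema):

```scala
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

// Describe one element of the JSON array, then parse and explode it.
val itemSchema = ArrayType(
  new StructType()
    .add("id", StringType)
    .add("value", StringType))

val exploded = df
  .withColumn("item", explode(from_json(col("payload"), itemSchema)))
  .select(col("item.id").as("id"), col("item.value").as("value"))
```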

How to escape ( parentheses in Spark Scala?

I am trying to replace parentheses in a string (i.e. column names). It works fine for whitespace but not for ( parentheses. I tried """, \(, \\( but I…
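Parentheses are regex metacharacters, so they must be escaped in regexp_replace; a character class sidesteps the double-escaping question entirely. A sketch on a hypothetical column name:

```scala
import org.apache.spark.sql.functions._

// "[()]" matches either parenthesis literally inside a character class.
val cleaned = df.withColumn("name_clean", regexp_replace(col("name"), "[()]", ""))

// Equivalent with explicit escapes: one backslash for the regex engine,
// doubled because it is written inside a plain Scala string literal.
val cleaned2 = df.withColumn("name_clean", regexp_replace(col("name"), "\\(|\\)", ""))
```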

Spark 3.0 is much slower to read json files than Spark 2.4

I have a large number of JSON files that Spark 2.4 can read in 36 seconds, but Spark 3.0 takes almost 33 minutes to read the same files. On closer analysis, it looks like Spark…
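One mitigation worth trying (a general technique, not a confirmed root cause for this report): supply an explicit schema so Spark skips schema inference over every file. Field names below are placeholders:

```scala
import org.apache.spark.sql.types._

// With a known schema, Spark does not have to scan the files just to infer types.
val schema = new StructType()
  .add("id", LongType)
  .add("name", StringType)

val df = spark.read.schema(schema).json("/path/to/json")
```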

Akka: persist to Cassandra and publish to Kafka multiple events

I need to store multiple events to Cassandra and publish them to Kafka, and call some final handler() only after all events are stored and published. I came across U…
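A minimal sketch of the required sequencing using plain Futures; the event type and function names are hypothetical stand-ins for the real Cassandra and Kafka calls:

```scala
import scala.concurrent.{ExecutionContext, Future}

final case class Event(payload: String)               // hypothetical event type
def persistToCassandra(e: Event): Future[Unit] = ???  // stand-in for the real persist call
def publishToKafka(e: Event): Future[Unit] = ???      // stand-in for the real publish call
def handler(): Unit = ()

// handler() runs only after every event has been stored and then published.
def storeAndPublishAll(events: Seq[Event])(implicit ec: ExecutionContext): Future[Unit] =
  Future
    .traverse(events)(e => persistToCassandra(e).flatMap(_ => publishToKafka(e)))
    .map(_ => handler())
```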

cats-effect: How to transform Map[x, IO[y]] to IO[Map[x, y]]

I have a map of String to IO, i.e. Map[String, IO[String]], and I want to transform it into IO[Map[String, String]]. How do I do that?
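This is exactly what traverse is for; one sketch goes through a list of pairs and back:

```scala
import cats.effect.IO
import cats.syntax.traverse._

def sequenceMap[K, V](m: Map[K, IO[V]]): IO[Map[K, V]] =
  m.toList
    .traverse { case (k, io) => io.map(k -> _) } // IO[List[(K, V)]]
    .map(_.toMap)
```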

How to organize Java and Scala code in Play?

activator new results in: Fetching the latest list of templates... Browse the list of templates: http://lightbend.com/activator/templates Choose from these f…
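For reference, Play's conventional layout lets Java and Scala sources live side by side under app/ (a sketch, not tied to any particular template):

```
app/
  controllers/          Java and Scala controllers compile together
    HomeController.scala
    LegacyController.java
  models/
  views/                Twirl templates
conf/
  routes
  application.conf
build.sbt
```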

List all objects in S3 with a given prefix in Scala

I am trying to list all objects in AWS S3 buckets with an input bucket name & filter prefix, using the following code. import scala.collection.JavaConverters._ import…
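A sketch with the v1 AWS SDK (matching the JavaConverters import); bucket and prefix are placeholders, and the loop matters because each call returns at most 1000 keys:

```scala
import scala.collection.JavaConverters._
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.model.ListObjectsV2Request

val s3 = AmazonS3ClientBuilder.defaultClient()

val request = new ListObjectsV2Request()
  .withBucketName("my-bucket") // placeholder bucket
  .withPrefix("input/2021/")   // placeholder prefix

// Page through the results with the continuation token.
var result = s3.listObjectsV2(request)
var keys   = result.getObjectSummaries.asScala.map(_.getKey).toList
while (result.isTruncated) {
  request.setContinuationToken(result.getNextContinuationToken)
  result = s3.listObjectsV2(request)
  keys ++= result.getObjectSummaries.asScala.map(_.getKey)
}
```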

Trigger an IF statement only when two Spark DataFrames meet the conditions

I have two identical Spark DataFrames with the same columns. I am trying to write an if-else statement in one line but couldn't find a better way to do it.
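If the condition is "both DataFrames hold the same rows", one sketch that avoids collecting either side (exceptAll on Spark 2.4+ would additionally respect duplicates):

```scala
// except() is empty in both directions exactly when the row sets match.
val same = df1.except(df2).isEmpty && df2.except(df1).isEmpty

if (same) println("DataFrames match") else println("DataFrames differ")
```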

Find the maximum value from JSON data in Scala

I am very new to programming in Scala. I am writing a test program to get the maximum value from JSON data. I have the following code: import scala.io.Source import sc…
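A sketch using the ujson library rather than whatever parser the original code imports; the file name, JSON shape, and "value" field are all placeholders:

```scala
import scala.io.Source

// Assumes JSON like: [{"name":"a","value":3},{"name":"b","value":7}]
val src = Source.fromFile("data.json")
val raw = try src.mkString finally src.close()

val maxValue: Double = ujson.read(raw).arr.map(_("value").num).max
```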

How do I verbalize the term F[_] in Scala/cats-effect?

I'm learning the concept of F[_] as a constructor for other types, but how do you pronounce this to another human, or say it in your head (for us internal monolo…
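Common readings are "F of underscore", "F with a hole", or simply "an effect F"; what the bracket actually declares is a higher-kinded type parameter, as in this tiny example:

```scala
import cats.Functor
import cats.syntax.functor._

// "For any type constructor F that has a Functor, double the wrapped Int."
def double[F[_]: Functor](fa: F[Int]): F[Int] = fa.map(_ * 2)
```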

Cannot connect to Cassandra in spark-shell

I am trying to connect to a remote Cassandra cluster from my spark-shell using the Spark Cassandra Connector, but it's throwing some unusual errors. I do the usual…
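For comparison, the usual wiring for the Spark Cassandra Connector; host, keyspace, and table below are placeholders, and the connector jar must already be on the shell's classpath (e.g. via --packages):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.cassandra.connection.host", "10.0.0.5") // placeholder host
  .config("spark.cassandra.connection.port", "9042")
  .getOrCreate()

val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_ks", "table" -> "my_table")) // placeholder names
  .load()
```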

Provider com.google.cloud.spark.bigquery.BigQueryRelationProvider could not be instantiated while reading from bigquery in Jupyter lab

I followed this post, "pyspark error reading bigquery: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class", and applied the resolution…
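That ClassNotFoundException is the classic signature of a Scala-version mismatch between the connector jar and Spark. A sketch of the fix (artifact coordinates are assumptions; the point is matching the _2.11/_2.12 suffix to your Spark build):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.jars.packages",
    "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.24.2") // assumed coordinates
  .getOrCreate()

val df = spark.read
  .format("bigquery")
  .option("table", "my-project.my_dataset.my_table") // placeholder table
  .load()
```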

Programmatically add/remove executors to a Spark Session

I'm looking for a reliable way in Spark (v2+) to programmatically adjust the number of executors in a session. I know about dynamic allocation and the ability
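SparkContext does expose developer-API hooks for this; they take effect only on cluster managers that support executor scaling (e.g. YARN, Kubernetes) and are no-ops in local mode:

```scala
val sc = spark.sparkContext

sc.requestExecutors(2)          // ask the cluster manager for 2 more executors
sc.killExecutors(Seq("3", "4")) // release specific executors by id

// Or set an absolute target: total executors, locality-aware task count,
// and per-host task counts (empty here).
sc.requestTotalExecutors(8, 0, Map.empty)
```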

What should I use instead of the deprecated FlinkKafkaConsumer? (Scala, Flink)

I am trying to get data from Kafka into Flink. I use FlinkKafkaConsumer, but IntelliJ shows me that it is deprecated, and the ssh console in Google Cloud also shows me this e…
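The replacement is KafkaSource (from the flink-connector-kafka artifact, Flink 1.14+). A sketch with placeholder broker/topic/group names, assuming env is the asker's StreamExecutionEnvironment:

```scala
import org.apache.flink.api.common.eventtime.WatermarkStrategy
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.connector.kafka.source.KafkaSource
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer

val source = KafkaSource.builder[String]()
  .setBootstrapServers("localhost:9092")
  .setTopics("my-topic")
  .setGroupId("my-group")
  .setStartingOffsets(OffsetsInitializer.earliest())
  .setValueOnlyDeserializer(new SimpleStringSchema())
  .build()

val stream = env.fromSource(source, WatermarkStrategy.noWatermarks[String](), "kafka-source")
```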

sbt package is trying to download a package whose path does not exist

These are the contents of my build.sbt file: name := "WordCounter" version := "0.1" scalaVersion := "2.13.1" libraryDependencies ++= Seq( "org.apache.spar…
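A likely cause given scalaVersion := "2.13.1": %% appends the Scala suffix to the artifact name, and older Spark releases were never published for Scala 2.13, so sbt asks the repository for a path that does not exist. A sketch of a consistent build (versions are assumptions):

```scala
name := "WordCounter"
version := "0.1"
scalaVersion := "2.12.15" // match a Scala version Spark is actually published for

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.8",
  "org.apache.spark" %% "spark-sql"  % "2.4.8"
)
```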

Find the max value of an array column and the associated value in another array within the DataFrame

I have a CSV file with the data below. Id Subject Marks 1 M,P,C 10,8,6 2 M,P,C 5,7,9 3 M,P,C 6,7,4 I need to find the max value in the Marks column for each Id an…
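A sketch of one approach: pair each mark with its subject by position, then keep the max pair per Id; max(struct(...)) works because structs compare field by field:

```scala
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(
  (1, "M,P,C", "10,8,6"),
  (2, "M,P,C", "5,7,9"),
  (3, "M,P,C", "6,7,4")
).toDF("Id", "Subject", "Marks")

val result = df
  .select($"Id", split($"Subject", ",").as("subjects"),
    posexplode(split($"Marks", ",").cast("array<int>"))) // yields pos, col
  .select($"Id", $"subjects"($"pos").as("subject"), $"col".as("mark"))
  .groupBy($"Id")
  .agg(max(struct($"mark", $"subject")).as("best")) // mark first, so the max mark wins
  .select($"Id", $"best.subject".as("Subject"), $"best.mark".as("MaxMark"))
```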

Spark: save a simple string to a text file

I have a Spark job that needs to store the last time it ran to a text file. This has to work both on HDFS and on the local fs (for testing). However, it seems…
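One way that satisfies both requirements: Hadoop's FileSystem API picks the implementation from the path's scheme, so the same code writes to hdfs:// and file:// paths (the path here is a placeholder):

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.fs.Path

def writeString(spark: org.apache.spark.sql.SparkSession, pathStr: String, s: String): Unit = {
  val path = new Path(pathStr)
  val fs   = path.getFileSystem(spark.sparkContext.hadoopConfiguration)
  val out  = fs.create(path, true) // overwrite if present
  try out.write(s.getBytes(StandardCharsets.UTF_8))
  finally out.close()
}

// e.g. writeString(spark, "hdfs:///jobs/last_run.txt", java.time.Instant.now().toString)
```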

Akka finite state machine: how do I log Behaviors.unhandled?

I have a question about Behaviors.unhandled. I know that Akka sends unhandled messages to dead letters, and with the following configuration it also logs i…
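For context, a minimal typed-FSM sketch of where Behaviors.unhandled typically appears (the protocol is hypothetical): each state handles only its legal messages and declares the rest unhandled, which publishes them to the event stream/dead letters:

```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

object DoorFsm {
  sealed trait Command
  case object Open  extends Command
  case object Close extends Command

  def closed: Behavior[Command] = Behaviors.receiveMessage {
    case Open => opened
    case _    => Behaviors.unhandled // e.g. Close while already closed
  }

  def opened: Behavior[Command] = Behaviors.receiveMessage {
    case Close => closed
    case _     => Behaviors.unhandled
  }
}
```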