Category "scala"

Provider com.google.cloud.spark.bigquery.BigQueryRelationProvider could not be instantiated while reading from BigQuery in JupyterLab

I have followed this post, pyspark error reading bigquery: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class, and applied its resolution
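
A minimal sketch of one way to pin a connector build that matches the cluster's Scala version from a notebook session; the connector coordinate, version, and table name here are assumptions, not the exact ones from the post above.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a Scala 2.12 Spark runtime, so the _2.12 connector build is pulled in.
// A Scala-version mismatch between the connector and the runtime is a common cause of
// "BigQueryRelationProvider could not be instantiated" / Logging$class errors.
// spark.jars.packages only takes effect if set before the session is created.
val spark = SparkSession.builder()
  .appName("bq-read")
  .config("spark.jars.packages",
    "com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.32.2")
  .getOrCreate()

val df = spark.read
  .format("bigquery")
  .option("table", "my-project.my_dataset.my_table") // hypothetical table
  .load()

df.show(5)
```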

Programmatically add/remove executors in a Spark session

I'm looking for a reliable way in Spark (v2+) to programmatically adjust the number of executors in a session. I know about dynamic allocation and the ability
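
For reference, SparkContext exposes developer-API hooks for this; a minimal sketch (the executor ids are hypothetical, and the calls are only honoured on coarse-grained cluster managers such as YARN, standalone, or Kubernetes):

```scala
val sc = spark.sparkContext // assumes an existing SparkSession named `spark`

// Ask the cluster manager for two additional executors (best-effort, returns Boolean).
sc.requestExecutors(2)

// Release specific executors by id (ids here are hypothetical).
sc.killExecutors(Seq("7", "8"))

// Or set an absolute target: (total executors, locality-aware task count, host -> task count map).
sc.requestTotalExecutors(4, 0, Map.empty)
```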

What should I use instead of the deprecated FlinkKafkaConsumer? Scala Flink

I am trying to get data from Kafka into Flink. I use FlinkKafkaConsumer, but IntelliJ shows me that it is deprecated, and the SSH console in Google Cloud also shows me this e
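
The replacement is the KafkaSource API (Flink 1.14+, in the flink-connector-kafka artifact); a minimal sketch with hypothetical broker and topic names:

```scala
import org.apache.flink.api.common.eventtime.WatermarkStrategy
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.connector.kafka.source.KafkaSource
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// KafkaSource replaces the deprecated FlinkKafkaConsumer.
val source = KafkaSource.builder[String]()
  .setBootstrapServers("localhost:9092")   // hypothetical broker
  .setTopics("my-topic")                   // hypothetical topic
  .setGroupId("my-group")
  .setStartingOffsets(OffsetsInitializer.earliest())
  .setValueOnlyDeserializer(new SimpleStringSchema())
  .build()

val stream: DataStream[String] =
  env.fromSource(source, WatermarkStrategy.noWatermarks[String](), "kafka-source")

stream.print()
env.execute("kafka-to-flink")
```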

sbt package is trying to download a package whose path does not exist

These are the contents of my build.sbt file: name := "WordCounter" version := "0.1" scalaVersion := "2.13.1" libraryDependencies ++= Seq( "org.apache.spar
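
This usually means the `%%` suffix resolves to an artifact that was never published for the chosen Scala version (Spark 2.x has no _2.13 builds, so the repository path does not exist). A minimal build.sbt sketch with assumed versions; alternatively keep Scala 2.13 and move to Spark 3.2+, which is published for it:

```scala
// build.sbt — the key point is that scalaVersion must be one the Spark
// artifacts are actually published for, otherwise "%%" produces a
// nonexistent _2.13 path in the repository.
name := "WordCounter"
version := "0.1"
scalaVersion := "2.12.15" // Spark 2.4.x is only published for Scala 2.11/2.12

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.8",
  "org.apache.spark" %% "spark-sql"  % "2.4.8"
)
```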

Find the max value of an Array column and the associated value in another Array within the DataFrame

I have a CSV file with the data below. Id Subject Marks 1 M,P,C 10,8,6 2 M,P,C 5,7,9 3 M,P,C 6,7,4 I need to find out the max value in the Marks column for each Id an
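
One possible approach with Spark 2.4+ array functions; the column names come from the question, the rest is a sketch:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("max-mark").master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(
  (1, "M,P,C", "10,8,6"),
  (2, "M,P,C", "5,7,9"),
  (3, "M,P,C", "6,7,4")
).toDF("Id", "Subject", "Marks")

val result = df
  .withColumn("subjects", split(col("Subject"), ","))
  // cast the split marks to ints (requires Spark 2.4+ higher-order functions)
  .withColumn("marks", expr("transform(split(Marks, ','), x -> cast(x as int))"))
  .withColumn("maxMark", expr("array_max(marks)"))
  // look up the subject sitting at the same (1-based) position as the max mark
  .withColumn("maxSubject",
    expr("element_at(subjects, cast(array_position(marks, maxMark) as int))"))
  .select("Id", "maxSubject", "maxMark")

result.show() // e.g. Id 1 -> (M, 10), Id 2 -> (C, 9), Id 3 -> (P, 7)
```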

Spark: save a simple string to a text file

I have a Spark job that needs to store the last time it ran in a text file. This has to work both on HDFS and on the local filesystem (for testing). However, it seems
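
A minimal sketch using the Hadoop FileSystem API, which picks its implementation from the path's scheme, so the same code works for hdfs:// and file:// paths (the helper name and paths are hypothetical):

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

// Writes a small string to a path on whatever filesystem the URI scheme points at.
def writeString(spark: SparkSession, path: String, content: String): Unit = {
  val hadoopConf = spark.sparkContext.hadoopConfiguration
  val fs = FileSystem.get(new java.net.URI(path), hadoopConf)
  val out = fs.create(new Path(path), true) // overwrite if it already exists
  try out.write(content.getBytes(StandardCharsets.UTF_8))
  finally out.close()
}

// writeString(spark, "hdfs:///tmp/last_run.txt", "2024-01-01T00:00:00Z")
// writeString(spark, "file:///tmp/last_run.txt", "2024-01-01T00:00:00Z")
```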

Akka Finite State Machine and how to log Behaviors.unhandled?

I have a question about Behaviors.unhandled. I know that Akka sends unhandled messages to dead letters, and with the following configuration it also logs i
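
For context, a minimal typed-FSM sketch (commands and states are hypothetical): messages that are invalid in the current state are returned as Behaviors.unhandled, which publishes them as UnhandledMessage events instead of silently dropping them.

```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

sealed trait Command
case object Start extends Command
case object Stop  extends Command

// Each state only handles the commands that are valid for it; everything else
// is declared unhandled so it can be observed (dead letters / event stream).
def idle: Behavior[Command] = Behaviors.receiveMessage[Command] {
  case Start => running
  case _     => Behaviors.unhandled
}

def running: Behavior[Command] = Behaviors.receiveMessage[Command] {
  case Stop => idle
  case _    => Behaviors.unhandled
}
```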

Improvement of a state machine implemented in Scala

I am working on an implementation of a state machine in Scala. The original version is written in Python, so I have a lot of if/else clauses in the co
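
A minimal sketch of the usual Scala shape for this: a sealed ADT for states and events plus a single pattern-matching transition function instead of if/else chains (all names are hypothetical placeholders).

```scala
sealed trait State
case object Idle    extends State
case object Running extends State
case object Done    extends State

sealed trait Event
case object Start  extends Event
case object Finish extends Event
case object Reset  extends Event

// One total transition function; the compiler warns if a (state, event) pair is forgotten
// only when the catch-all case is removed, which is often the point of the sealed ADT.
def transition(state: State, event: Event): State = (state, event) match {
  case (Idle, Start)     => Running
  case (Running, Finish) => Done
  case (Done, Reset)     => Idle
  case (s, _)            => s // ignore events that are invalid in the current state
}

// transition(Idle, Start) == Running
```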

Json4s not serializing Java classes

I have some Scala code that needs to be able to serialize/deserialize some Java classes using Json4s. I am using "org.json4s" %% "json4s-ext" % "4.0.5" and "org
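
A minimal sketch of one common workaround, assuming json4s-jackson is also on the classpath: register a FieldSerializer for each Java class so Json4s serializes its fields instead of looking for a case-class constructor (JavaUser is a hypothetical stand-in for the real Java classes).

```scala
import org.json4s.{DefaultFormats, FieldSerializer, Formats}
import org.json4s.jackson.Serialization

// Stand-in for a plain Java class: not a case class, mutable fields,
// nothing Json4s can pick up from a primary constructor.
class JavaUser {
  var name: String = ""
  var age: Int = 0
}

// FieldSerializer tells Json4s to serialize the object's fields directly.
implicit val formats: Formats = DefaultFormats + FieldSerializer[JavaUser]()

val user = new JavaUser
user.name = "alice"
user.age = 30

val json = Serialization.write(user) // {"name":"alice","age":30}
```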

Error building maven project in Intellij : "object apache is not a member of package org"

Whenever I try to run my main program directly in IntelliJ I get this error: Error:(5, 12) object apache is not a member of package org import org.apache.common

Using ScalaMock: Could not find implicit value for evidence parameter of type error

I am writing unit tests for my Spark/Scala application. I am also using ScalaMock to mock objects, specifically Session / Session Factory. In one of my test
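
For reference, the usual ScalaMock setup is to mix MockFactory into the suite and mock traits; a minimal sketch with hypothetical Session / SessionFactory traits standing in for the real ones:

```scala
import org.scalamock.scalatest.MockFactory
import org.scalatest.flatspec.AnyFlatSpec

// Hypothetical traits; mocking traits (or classes with no-arg constructors)
// is the path ScalaMock supports most smoothly.
trait Session {
  def run(sql: String): Int
}
trait SessionFactory {
  def openSession(url: String): Session
}

class MyJobSpec extends AnyFlatSpec with MockFactory {

  "the job" should "open a session once" in {
    val session = mock[Session]
    val factory = mock[SessionFactory]

    (factory.openSession _).expects("jdbc:test").returning(session)
    (session.run _).expects(*).returning(42)

    assert(factory.openSession("jdbc:test").run("select 1") == 42)
  }
}
```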

Update DeltaTable properties on S3

I have a DeltaTable at an AWS S3 location (s3://bucket/myDeltaTable) which has a default table property, delta.logRetentionDuration, set to 30 days. Is there a way I
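
One approach that works for path-based tables is ALTER TABLE ... SET TBLPROPERTIES through Spark SQL; a sketch using the path from the question and a hypothetical 7-day value:

```scala
// Set a table property on the path-based Delta table.
spark.sql("""
  ALTER TABLE delta.`s3://bucket/myDeltaTable`
  SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval 7 days')
""")

// Verify the property took effect.
spark.sql("SHOW TBLPROPERTIES delta.`s3://bucket/myDeltaTable`").show(truncate = false)
```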

How to make sure my DataFrame frees its memory?

I have a Spark/Scala job in which I do this: 1: Compute a big DataFrame df1 + cache it into memory 2: Use df1 to compute dfA 3: Read raw data into df2 (again,
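
A minimal sketch of the relevant pattern: cached blocks are only released when unpersist() is called, so it should happen after the downstream result is materialised and before the next large DataFrame is cached (all paths are hypothetical):

```scala
import org.apache.spark.sql.functions.col

val df1 = spark.read.parquet("s3://bucket/raw") // hypothetical input
  .groupBy("key").count()
  .cache()

val dfA = df1.filter(col("count") > 10)
dfA.write.mode("overwrite").parquet("s3://bucket/outA") // materialise while df1 is cached

df1.unpersist() // free df1's cached blocks before caching the next big DataFrame

val df2 = spark.read.parquet("s3://bucket/raw2").cache()
// ... use df2, then call df2.unpersist() when done
```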

How to run ConfigMapWrapperSuite?

I need to write an integration test, and it requires starting a server executable. I want to make the location of the server configurable so that I could set it on
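
A minimal sketch of the documented pattern: annotate the suite with @WrapWith(classOf[ConfigMapWrapperSuite]) so it receives the config map in its constructor, then pass entries as -Dkey=value runner arguments (the serverPath key is hypothetical):

```scala
import org.scalatest.{ConfigMap, ConfigMapWrapperSuite, WrapWith}
import org.scalatest.funsuite.AnyFunSuite

// ConfigMapWrapperSuite instantiates the wrapped suite with the config map,
// so the suite needs a single ConfigMap constructor parameter.
@WrapWith(classOf[ConfigMapWrapperSuite])
class ServerIntegrationSpec(configMap: ConfigMap) extends AnyFunSuite {

  // "serverPath" is a hypothetical key supplied on the command line.
  private val serverPath = configMap.getRequired[String]("serverPath")

  test("server binary is configured") {
    assert(serverPath.nonEmpty)
  }
}

// sbt:  testOnly *ServerIntegrationSpec -- -DserverPath=/opt/my-server/bin/server
```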

Range queries in Elasticsearch Java API

I have two fields in my ES index: min_duration and max_duration. I want to create a query to find all the documents for an input duration such that: min_duration
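
With the Elasticsearch Java API this is typically a bool query with two range clauses; a minimal sketch (field names from the question, the rest is assumed):

```scala
import org.elasticsearch.index.query.QueryBuilders

// Find documents whose [min_duration, max_duration] interval contains the input,
// i.e. min_duration <= duration AND max_duration >= duration.
val duration = 42

val query = QueryBuilders.boolQuery()
  .must(QueryBuilders.rangeQuery("min_duration").lte(duration))
  .must(QueryBuilders.rangeQuery("max_duration").gte(duration))

// e.g. pass `query` to a SearchSourceBuilder when building the search request
```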

Installing new Scala SDK 3.x.x on IntelliJ

I need to run Scala 3.x.x from IntelliJ. This is how I am trying to install it: Click File > Project Structure... Click on the "+" button (New Project Libra

Slick using mapped column type in update statement

I am having trouble using Slick's MappedColumnType; the code snippet is as follows: private trait UserTable { self: HasDatabaseConfigProvider[JdbcProfile] =>
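
A minimal sketch of how a MappedColumnType-backed column can be used in an update, assuming a Play-Slick setup like the snippet above (the Status ADT and users table are hypothetical); the implicit mapper has to be in scope where the update is compiled:

```scala
import play.api.db.slick.HasDatabaseConfigProvider
import slick.jdbc.JdbcProfile

sealed trait Status
case object Active   extends Status
case object Disabled extends Status

trait UserTable { self: HasDatabaseConfigProvider[JdbcProfile] =>
  import profile.api._

  // Map the ADT to a plain string column.
  implicit val statusColumnType: BaseColumnType[Status] =
    MappedColumnType.base[Status, String](
      { case Active => "active"; case Disabled => "disabled" },
      { case "active" => Active; case _ => Disabled }
    )

  class Users(tag: Tag) extends Table[(Long, Status)](tag, "users") {
    def id     = column[Long]("id", O.PrimaryKey)
    def status = column[Status]("status") // uses statusColumnType
    def *      = (id, status)
  }
  lazy val users = TableQuery[Users]

  // The implicit mapper must be visible here too, otherwise the update
  // on the mapped column does not compile.
  def disable(userId: Long) =
    db.run(users.filter(_.id === userId).map(_.status).update(Disabled))
}
```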

Scala Quill - queries always dynamic

I always get dynamic queries in Quill; even with the simplest query I get the dynamic-query compiler warning log: type DbContext = PostgresAsyncContext[Liter
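
A common cause is that the quotation is compiled against a context whose concrete type (including the naming strategy) is not visible at the call site, for example an abstract or injected DbContext. A hedged sketch of the static-query shape, assuming quill-async-postgres with a value-based naming strategy (older versions take the strategy as a type parameter instead):

```scala
import io.getquill.{Literal, PostgresAsyncContext}

// A concrete context at a stable path, with a hypothetical "db" config section;
// quotations built via `import Db.ctx._` can then be resolved at compile time.
object Db {
  lazy val ctx = new PostgresAsyncContext(Literal, "db")
}

import Db.ctx._

case class Person(id: Long, name: String, age: Int)

val adults = quote {
  query[Person].filter(_.age > 18)
}

// ctx.run(adults)  // compiles to a static query; no dynamic-query warning
```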

How to convert scala.Some to scala.mutable.Map?

I am trying to write code to mask nested JSON fields. def maskRecursively(map :mutable.Map[String,Object]):mutable.Map[String,Object] ={ val maskColumns = PII
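
A minimal sketch of recovering the nested map with a pattern match instead of converting the Some itself (the helper name is hypothetical):

```scala
import scala.collection.mutable

// The value coming out of the map is typed Object (and may be wrapped in Some),
// so the nested map is recovered by pattern matching rather than by casting Some.
def asNestedMap(value: Object): Option[mutable.Map[String, Object]] = value match {
  case m: mutable.Map[_, _]       => Some(m.asInstanceOf[mutable.Map[String, Object]])
  case Some(m: mutable.Map[_, _]) => Some(m.asInstanceOf[mutable.Map[String, Object]])
  case _                          => None
}

// Example use inside a recursive masker like maskRecursively from the question:
// map.get("address").flatMap(asNestedMap).foreach(nested => maskRecursively(nested))
```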

Ways to make a Maven build faster?

I have a multi-module Java project. Maven takes around 40 seconds to build it. I have tried Maven with multi-threaded builds too, by specifying the -T and -C arg