Category "scala"

Error writing a DataFrame to a Cassandra table on Amazon Keyspaces

I'm trying to write a DataFrame to AWS (Keyspaces), but I'm getting the messages below: Stack: dfExploded.write.cassandraFormat(table = "table", keyspa

CodecNotFoundException while writing to Amazon Keyspaces

I am trying to write a Spark DF to AWS Keyspaces. Randomly some of the records are getting updated and some of the records are throwing this exception com.datas

Spark/Scala approximate group by

Is there a way of counting approximately after a group by on a SQL dataset in Spark? Or more generally, what is the fastest way of counting by group in Spark?

How to install a specific version of Spark using a specific version of Scala

I'm running Spark 2.4.5 on my Mac. When I execute spark-submit --version ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/

Scala 3 Explicit Nulls flag makes String operations quite unusable

When using Scala 3's new -Yexplicit-nulls flag, all Java code that doesn't have explicit non-null annotations is treated as nullable, thus every Java met

Standardized method for writing an arbitrary Typesafe Config to a HOCON file?

In a Scala research application, I load a HOCON file using PureConfig's ConfigSource.file() method, which represents the default configuration for a research ex

How can I run all performance tests from a fat jar with Gatling?

I have been trying to execute all my performance tests from my Gatling fat jar created with the assemble plugin; however, when I try to execute my performance t

How to exclude a field from serialization when using circe in Scala

I am using circe in Scala and have the following requirement: Let's say I have some class like below and I want to prevent the password field from being serialised the

How does a Spark RDD map to a Cassandra table?

I am new to Spark, and recently I saw some code that saves data in RDD format to a Cassandra table. But I am not able to figure out how it is doing the column mapp

Scala error - Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z

I have a requirement where I am reading data from a CSV file and writing data to a Delta table using Scala on Windows. My Scala code is given below: import co

How to define a Monad for a function type?

I am trying Cats for the first time and am using Scala 3, and I am trying to implement a set of parser combinators for self-pedagogy; however, I am stuck on the

Is it possible to stream data from Beam (Scio) to an S3 bucket?

Currently, I'm working on a project which extracts data from a BigQuery table using Scio in Scala. I'm able to extract and ingest the data into ElasticSearch, b

Connection between Kafka and Spark: Failed to find data source: kafka

I am trying to link Kafka and Spark by reading data from one topic and trying to print the content of this topic into a DataFrame, but by doing connect

How to get Unit Test counts in SonarQube for a Scala SBT build

Note: We are executing this as part of a CI build in TeamCity. Step 1: Getting coverage details: addSbtPlugin("org.scoverage" % "sbt-scoverage" % "1.6.1") Step 2: S

Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$

Hi, I am trying to run Spark on my local laptop. I created a Maven project in IntelliJ IDEA, and in my main class I have one line like below, and when I try to run a projec

Run scalafmtCheck in an sbt assembly

I would like to run a scalafmtCheck in sbt assembly. I tried to add: (compile in Compile) := ((compile in Compile) dependsOn scalafmtCheck).value I got that e

Type inference in ZIO giving Any in for comprehension

So I have written a method to count the number of lines in a file in ZIO. def lines(file: String): Task[Long] = { def countLines(reader: BufferedReader): Ta

How to execute Scala tests programmatically

I'm looking for a way to execute Scala tests (implemented in munit, but it could also be ScalaTest) programmatically. I want to perform more or less what sbt te

Does AWS Glue support positional arguments?

How to capture a Glue job's arguments by position rather than using the getResolvedOptions function and passing the arguments as key value pairs?

Scala Spark partitionBy and get current partition name

I'm using Scala Spark and have a DataFrame: Source | Column1 | Column2 A ... ... B ... ... B ... ... C ...