My project structure is: logs - data - pubs - invent.proto - common - num.proto. NOTE: the .proto files are not under src/main/protobuf.
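A minimal sketch of how non-standard .proto locations are usually handled, assuming sbt with the sbt-protoc/ScalaPB plugin already enabled (the directory name below is taken from the question and may need adjusting):

```scala
// build.sbt — point the protoc plugin at a custom proto source directory
// instead of the default src/main/protobuf.
Compile / PB.protoSources := Seq(
  baseDirectory.value / "logs"
)
Compile / PB.targets := Seq(
  scalapb.gen() -> (Compile / sourceManaged).value / "scalapb"
)
```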
I am evaluating different load-testing tools. After trying JMeter and hitting two exceptions while running and viewing the test results, I would like to give Gatling a try.
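For reference, a minimal Gatling simulation looks like this (a sketch using the Gatling 3 Scala DSL; the base URL and user count are placeholders):

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._

class BasicSimulation extends Simulation {
  val httpProtocol = http.baseUrl("http://localhost:8080") // hypothetical target

  val scn = scenario("Smoke test")
    .exec(http("home").get("/"))

  setUp(scn.inject(atOnceUsers(10))).protocols(httpProtocol)
}
```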
I am trying to create a Scala UDF for Spark that can be used in Spark SQL. The objective of the function is to accept any column type as input, and put it in a
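A sketch of one common workaround: Spark UDFs are statically typed, so "any column type" is often handled by casting the column to string at the call site (the wrapping logic, column, and table names below are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[*]").getOrCreate()

// Register a string-typed UDF; callers cast whatever column they have.
spark.udf.register("stringify", (v: String) =>
  if (v == null) null else s"[$v]")

spark.sql("SELECT stringify(CAST(some_col AS STRING)) FROM some_table")
```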
Spark: 3.0.0, Scala: 2.12.8. My data frame has a column, temp_json_string, containing a JSON string, and I want to create a new column from it with a StructType.
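The standard tool for this is from_json with an explicit schema; a sketch, where the schema fields are hypothetical and should match the actual JSON payload:

```scala
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("name", StringType)
))

// df is the existing DataFrame with the JSON-string column.
val parsed = df.withColumn("temp_json", from_json(col("temp_json_string"), schema))
```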
Every day I build another case class and wish I could define a property called type on it, but doing so requires the highly annoying backtick syntax. D'oh!
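The backtick escape in question, for illustration:

```scala
// `type` is a reserved word, so it must be backquoted at the definition
// and at every use site.
case class Event(`type`: String, payload: String)

val e = Event(`type` = "click", payload = "{}")
println(e.`type`)
```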
I have code similar to this in Spark (Scala). I would like to know the number of records this code updated/inserted once execute() completes. Is there a way?
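Spark's DataFrame writers do not report an affected-row count the way JDBC does, so one common workaround is to count the batch before writing it. A sketch, assuming the write goes through a DataFrame (the sink below is a placeholder, not the questioner's code):

```scala
// Cache so the count and the write don't recompute the lineage twice.
val toWrite = df.cache()
val recordCount = toWrite.count() // records about to be inserted/updated

toWrite.write.mode("append").saveAsTable("target_table") // hypothetical sink
println(s"Wrote $recordCount records")
```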
I'm developing Kafka producer code in Scala with these libraries (I have to use version >6.x of the Kafka Avro serializer to use TLS communication): <dependency>
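A minimal sketch of a TLS-enabled producer configuration with the Confluent Avro serializer; the broker address, registry URL, store paths, and passwords are placeholders:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig}

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9093")
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
  "org.apache.kafka.common.serialization.StringSerializer")
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
  "io.confluent.kafka.serializers.KafkaAvroSerializer")
props.put("schema.registry.url", "https://schema-registry:8081")

// TLS settings for broker communication.
props.put("security.protocol", "SSL")
props.put("ssl.truststore.location", "/path/to/truststore.jks")
props.put("ssl.truststore.password", "changeit")

val producer = new KafkaProducer[String, AnyRef](props)
```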
I'm trying to write a dataframe to AWS Keyspaces, but I'm getting the messages below. Stack: dfExploded.write.cassandraFormat(table = "table", keyspa
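For comparison, a sketch of a spark-cassandra-connector write aimed at Keyspaces; AWS Keyspaces generally requires LOCAL_QUORUM consistency for writes, so that is set explicitly here (the keyspace name is a placeholder):

```scala
import org.apache.spark.sql.cassandra._

spark.conf.set("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")

dfExploded.write
  .cassandraFormat(table = "table", keyspace = "my_keyspace")
  .mode("append")
  .save()
```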
I am trying to write a Spark DF to AWS Keyspaces. Randomly, some of the records get updated while others throw this exception: com.datas
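A hedged sketch, not a guaranteed fix: intermittent Keyspaces write failures are often mitigated by raising the connector's retry count and throttling write concurrency, set when the session is built:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.cassandra.query.retry.count", "10")      // retry transient failures
  .config("spark.cassandra.output.concurrent.writes", "1") // throttle per-executor writes
  .config("spark.cassandra.output.batch.size.rows", "1")   // avoid multi-row batches
  .getOrCreate()
```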
Is there a way of counting approximately after a GROUP BY on an SQL dataset in Spark? Or, more generally, what is the fastest way of group-by counting in Spark?
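A sketch of the two usual options: approx_count_distinct trades exactness for speed via HyperLogLog, while a plain per-group count is already a single shuffle and hard to beat (df, key, and value are placeholders):

```scala
import org.apache.spark.sql.functions._

// Exact row count per group — one shuffle.
val exact = df.groupBy("key").count()

// Approximate distinct count per group, with a 5% relative error bound.
val approx = df.groupBy("key").agg(approx_count_distinct("value", rsd = 0.05))
```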
I'm running Spark 2.4.5 on my Mac. When I execute spark-submit --version, it prints the Spark ASCII-art banner followed by the version information.
When using the new Scala 3 flag -Yexplicit-nulls, any Java code without explicit non-null annotations is treated as nullable, so every Java method that returns a reference type is seen as returning a nullable value.
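A minimal sketch of what this looks like in practice under -Yexplicit-nulls: Java APIs come back as T | Null, and the .nn extension (provided by the Scala 3 standard library) asserts non-nullness, throwing a NullPointerException if the value is actually null:

```scala
// System.getProperty is a Java method, so its result is String | Null.
val prop: String | Null = System.getProperty("user.home")

// .nn strips the Null from the type, with a runtime check.
val home: String = prop.nn
```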
In a Scala research application, I load a HOCON file using PureConfig's ConfigSource.file() method, which represents the default configuration for a research experiment.
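A minimal sketch of that pattern with PureConfig: a default HOCON file plus an optional per-experiment override layered on top (the case class fields and file names are hypothetical):

```scala
import java.io.File
import pureconfig._
import pureconfig.generic.auto._

final case class ExperimentConf(name: String, iterations: Int)

// The override file wins where present; otherwise defaults apply.
val conf: ExperimentConf =
  ConfigSource.file(new File("override.conf"))
    .optional
    .withFallback(ConfigSource.file(new File("default.conf")))
    .loadOrThrow[ExperimentConf]
```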
I have been trying to execute all my performance tests from my Gatling fat-jar created with the sbt-assembly plugin; however, when I try to execute my performance tests
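A hedged starting point: Gatling fat-jars commonly break on duplicate resource files, so a merge strategy along these lines (a sketch, assuming sbt-assembly 1.x) is a typical first thing to try:

```scala
// build.sbt
assembly / assemblyMergeStrategy := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case "reference.conf"         => MergeStrategy.concat // keep Akka/Gatling defaults merged
  case _                        => MergeStrategy.first
}
assembly / mainClass := Some("io.gatling.app.Gatling")
```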
I am using circe in Scala and have the following requirement: let's say I have a class like the one below, and I want to prevent the password field from being serialised.
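One common approach, sketched here: derive the encoder and then drop the sensitive field from the resulting JSON object (the case class is a stand-in for the questioner's):

```scala
import io.circe.Encoder
import io.circe.generic.semiauto.deriveEncoder
import io.circe.syntax._

final case class User(name: String, password: String)

// Derive normally, then strip "password" from the emitted JSON.
implicit val userEncoder: Encoder[User] =
  deriveEncoder[User].mapJson(_.mapObject(_.remove("password")))

println(User("ada", "secret").asJson.noSpaces) // {"name":"ada"}
```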
I am new to Spark, and recently I saw code that saves data in RDD format to a Cassandra table. But I am not able to figure out how it is doing the column mapping.
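A sketch of how the spark-cassandra-connector mapping works: saveToCassandra matches case class fields (or tuple positions) to table columns, and SomeColumns makes the mapping explicit (keyspace, table, and column names below are placeholders):

```scala
import com.datastax.spark.connector._

case class Reading(sensorId: Int, ts: Long, value: Double)

// Field names are mapped to columns; camelCase fields line up with
// snake_case columns automatically, or can be listed explicitly.
rdd.saveToCassandra("my_ks", "readings",
  SomeColumns("sensor_id", "ts", "value"))
```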
I have a requirement where I am reading data from a CSV file and writing the data to a Delta table using Scala on Windows. My Scala code is given below: import co
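A minimal sketch of that flow, assuming delta-core is on the classpath and, on Windows, that winutils.exe is installed with HADOOP_HOME set (paths are placeholders):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("csv-to-delta")
  .master("local[*]")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog",
    "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()

val df = spark.read.option("header", "true").csv("C:/data/input.csv")
df.write.format("delta").mode("overwrite").save("C:/data/delta/events")
```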
I am trying Cats for the first time, using Scala 3, and I am implementing a set of parser combinators for self-pedagogy; however, I am stuck on the
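For context, a self-contained sketch of a tiny parser type with a Cats Monad instance in Scala 3 syntax; this is illustrative, not the questioner's actual code:

```scala
import cats.Monad
import cats.syntax.all._

final case class Parser[A](run: String => Option[(String, A)])

given Monad[Parser] with
  def pure[A](a: A): Parser[A] = Parser(s => Some((s, a)))
  def flatMap[A, B](fa: Parser[A])(f: A => Parser[B]): Parser[B] =
    Parser(s => fa.run(s).flatMap { case (rest, a) => f(a).run(rest) })
  def tailRecM[A, B](a: A)(f: A => Parser[Either[A, B]]): Parser[B] =
    Parser { s =>
      @annotation.tailrec
      def loop(s: String, a: A): Option[(String, B)] =
        f(a).run(s) match
          case Some((rest, Right(b))) => Some((rest, b))
          case Some((rest, Left(a2))) => loop(rest, a2)
          case None                   => None
      loop(s, a)
    }

def char(c: Char): Parser[Char] =
  Parser(s => if s.nonEmpty && s.head == c then Some((s.tail, c)) else None)

// usage: (char('a'), char('b')).tupled.run("abc") == Some(("c", ('a', 'b')))
```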
Currently, I'm working on a project that extracts data from a BigQuery table using Scio in Scala. I'm able to extract and ingest the data into Elasticsearch, but
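For reference, a hedged sketch of the BigQuery-read side with Scio (recent versions take a Query wrapper; the SQL, field name, and output argument are placeholders):

```scala
import com.spotify.scio.ContextAndArgs
import com.spotify.scio.bigquery._

def main(cmdlineArgs: Array[String]): Unit = {
  val (sc, args) = ContextAndArgs(cmdlineArgs)

  sc.bigQuerySelect(Query("SELECT id, payload FROM `project.dataset.table`"))
    .map(row => row.get("payload").toString) // TableRow access; field is hypothetical
    .saveAsTextFile(args("output"))

  sc.run()
}
```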
I am trying to link Kafka and Spark by reading data from one topic and trying to print the content of this topic as a DataFrame, but when making the connection
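A minimal structured-streaming sketch for reading a Kafka topic into a DataFrame and printing it; the broker address and topic name are placeholders:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("kafka-reader")
  .master("local[*]")
  .getOrCreate()

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "my-topic")
  .option("startingOffsets", "earliest")
  .load()

// Kafka keys/values arrive as binary, so cast before printing.
df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("console")
  .start()
  .awaitTermination()
```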