Category "apache-arrow"

How to read parquet files into tables in Java using Apache Arrow?

I want to do the same as pq_table = pq.read_table(file_path) in pyarrow. I tried Arrow 8.0.0 with the examples from here, but they do not work. https://arrow.a

Reading schema & metadata from a parquet file

I am reading a third-party parquet file using parquetjs-lite: const parquet = require("parquetjs-lite"); reader = await parquet.ParquetReader.openFile(fileNam

How does Pyarrow read_csv handle different file encodings?

I have a .dat file that I had been reading with pd.read_csv and always needed to use encoding="latin" for it to read properly / without error. When I use pyarr

Comparison of protobuf and arrow

Both are language-neutral and platform-neutral data exchange libraries. I wonder what the differences between them are and which library suits which situations.

How to properly create an apache plasma store from within a python program?

If I run the following as a program: import subprocess subprocess.run(['plasma_store -m 10000000000 -s /tmp/plasma'], shell=True, capture_output=True) and then

How to build an Apache Arrow message containing a list of structs with arrow-rs?

I'm using the arrow-rs crate (version 4.4) to declare the following schema: Schema::new(vec![ Field::new("name", DataType::Utf8, false),

Apache Arrow in Scala: AbstractMethodError on loadBatch

I'm trying to load an Arrow file into Scala, but every time I call either arrowStreamReader.loadNextBatch() or arrowFileReader.loadRecordBatch(arrowBlock), the JV