Scalable/Iterative Large Data Frame Dimensionality Reduction in R
I often have truly large data frames (i.e. 10 to 40 columns, millions to hundreds of millions of rows) that I would like to perform dimensionality reduction on in R/CRAN. The projects are remote sensing / raster / classification / regression problems. I have around 170 GB of RAM, but most of the algorithms I've tried still frequently exceed it. Are there sample-based or incremental / batch packages I should be aware of, or other standard approaches I'm missing? I understand that it may be possible to calculate the necessary statistics incrementally myself, but I'm surprised I can't find a ready-made package to do this, as I don't believe this situation is rare. I am aware of some GIS / remote sensing software with options for raster-based solutions (Orfeo, ArcGIS, ENVI, ERDAS, etc.). I'm most interested in techniques other than PCA.
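For concreteness, here is a minimal sketch of what "calculating the necessary statistics incrementally" could look like in base R. It accumulates the row count, column sums and cross-product matrix one chunk at a time and derives the sample covariance at the end. The helper names (`init_acc`, `update_acc`, `finalize_cov`) are hypothetical, the chunks are simulated rather than read from disk, and all columns are assumed to be numeric. It only covers methods that work from a covariance matrix, and the one-pass formula can lose precision when column means are large relative to their spread, so treat it as an illustration rather than a finished solution.

```r
# Accumulators: row count, column sums, and cross-product matrix X'X.
init_acc <- function(p) {
  list(n = 0, s = numeric(p), xtx = matrix(0, p, p))
}

# Fold one chunk (numeric data frame or matrix with p columns) into the totals.
update_acc <- function(acc, chunk) {
  x <- as.matrix(chunk)
  acc$n   <- acc$n + nrow(x)
  acc$s   <- acc$s + colSums(x)
  acc$xtx <- acc$xtx + crossprod(x)   # t(x) %*% x
  acc
}

# Sample covariance from the totals: (X'X - n * mu mu') / (n - 1).
finalize_cov <- function(acc) {
  mu <- acc$s / acc$n
  (acc$xtx - acc$n * tcrossprod(mu)) / (acc$n - 1)
}

# Demo: simulated data split into 10 chunks, standing in for chunks
# that would be read from disk one at a time in a real workflow.
set.seed(1)
full <- matrix(rnorm(1000 * 5), ncol = 5)
chunk_id <- rep(1:10, each = 100)

acc <- init_acc(ncol(full))
for (i in 1:10) {
  acc <- update_acc(acc, full[chunk_id == i, , drop = FALSE])
}
cov_inc <- finalize_cov(acc)
```

Only one chunk plus a p-by-p matrix is held in memory at any time, so the footprint depends on the number of columns (10 to 40 here) rather than the number of rows.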
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow