I'm an aspiring data scientist. I stumbled across the Titanic dataset. I tried to use logistic regression for the problem. However, I got stuck. Since I have tw
I have a mining dataset with the following features: Rock_type and Gold in grams (AU). Rock type has 8 different rock types and Gold (AU) has pr
I ran a series of simulations and want to create a response surface of the performance based on my two parameters, tol and eta. The issue I'm having is actuall
If I have data that easily fits into memory, but I need to iterate over it hundreds or thousands of times, is there a faster way? For instance, if I have 400k d
My y going in, as well as both y_train and y_eval, are binary ints. What am I doing wrong? I noticed the predictions coming out look like this: [0.,1.,0. ...] which is proba
parent_folder / subfolder1 / subsubfolder1/ a.py b.py subsubfolder2/ c.py d.py e.py subfolder2 / subsubfolder2/ f.py g.py subfolder3 / h.py i.py g.py I want to
While working on a project I have come across a weird error: fitting my model works perfectly, but when I apply grid search it gives me an error. The code p
I have the following 2 dfs: diag id encounter_key start_of_period end_of_period 1 AAA 2020-06-12 2021-07-07 1 BBB 2021-12-31 2022-01-04 drug id start_datetime
Following is my sample data: data = {850.0: 6, -852.0: 5, 992.0: 29, -993.0: 25, 990.0: 27, -992.0: 28, 965.0: 127, 988.0: 37, -994.0: 24, 996.0: 14, -996.0: 1
I tried to do it with sp.hstack() and with
I'm new to machine learning and I'm trying to train a model. I'm using this official Keras example as a guide to set up my dataset and feed it into the model: https
You can see my dataframe below. The x values differ, but the other values are the same as the values to their left; for example, column 15 and column 16 hold the same value. I
I'm struggling with a data-heavy project. I can run a file that uses a query file (all the queries and converters are in there) without problems, but whe
My current code works and produces a graph if there is only 1 sensor, i.e. if col2 and col3 are deleted in the example data provided below, leaving one col
I have a pandas DataFrame like the one given below: Id1 YEAR CLAIM_STATUS no_of_claims 1 2019-01 4 1 1 2019-01 5 1
I'm trying to create a new column on a cuDF DataFrame based on VWMA from ta_py: #creating df CJ_m30 = cudf.read_csv("/media/f333a/Data/CJ_m30.csv",
I am trying to install the CUDA toolkit in order to be able to use ThunderSVM on my personal computer. However, I keep getting the following message in the GUI i
One of the biggest struggles in ML research is creating objective functions that capture the researcher's goals. Especially when talk
This code block is from OR-Tools docs, and I want to remove these for-loops. Is there a way to vectorize the code? The issue here is that I expect to have the n
I'm getting a KeyError 'initialized_diffuse' while calling the following API, probably after joblib.load(). import joblib .......... @routes.route("/forecast",