Category "r"

How do I replace values in a category by group in R?

Hi, I have a dataframe sleep_data where I am attempting to change Id values to user1:user33 based on groups. So where Id == 1503960366 change to user_1, Id == 16

Manually estimate probit model with autoregressive structure in R

I would like to write an R algorithm which would perform the Maximum Likelihood estimation of a binary choice model (probit/logit, it does not really matter) wi

How can I make a time series plot?

I created a dataset with these variables. Can you help me, please?

Merging in R returns columns full of NA values

I am trying to finish a business case in Kaggle but I am having some issues when merging two data frames. Having dataframe "dailyActivity" as: dailyActivity <

Error in tq_get, x = 'MKL', get = 'stock.prices': Error in new.session(): Could not establish session after 5 attempts [duplicate]

I am having difficulties pulling stock data. The code worked earlier today; however, this afternoon I became unable to pull any stock data

Avoid label overlap in bar chart

How can I avoid y-axis label overlap in this case (top 5 most frequent words for each year)? Because I may need to change the number of words (i.e. top 10, top 20) la

(R) How can a bigger expression be written by combining shorter variable expressions?

Brief introduction: I have multiple data files that can be fitted by models based on mathematical equations that are a combination (by sum, multiplication, etc...

Statistical Tests in R

I want to run a Bonferroni-adjusted p-value test on a stacked data set. This is my code: stat.2 <- stack.2 %>% group_by(modules) %>% t_test(values ~ phen

Calculate changes in totals of subgroups in R

I have the following dataframe: # A tibble: 8 x 5 Year Group Unit Profit Sales <dbl> <chr> <chr> <dbl> <dbl> 1 2021 One

Configuring reticulate to use conda environment

I'm trying to get R's reticulate package to detect the Python interpreter in one of my conda environments. Following several posts and questions, I've attempted

Converting categorical data into numerical values

I have a dataset with a lot of categorical variables mixed with numerical ones. I am trying to run a regression about obesity where the variables I'm trying to include are sta

Why is my dplyr code to create multiple variables using mutate and zoo incredibly slow?

I am using dplyr to create multiple variables in my data frame using mutate. At the same time, I am using zoo to calculate a rolling average. As an example, I h

MDCEV model estimation - all observations have zero probability at starting value for model component

I am running an MDCEV model on a location choice dataset, and at first I ran into the error "Log-likelihood calculation fails at values close to the starting val

Multiple layers on TMAP

I have a dataframe looking like this: dataframe. Id is the zip code and the columns go from 2015 to 2019. Link to download the database (with the .shp file neede

dplyr summarize "sum" function works correctly only for a subset, not the larger dataset, in R

I have a dataset where I sampled abundance of 4 species across 12 months, 6 sites (5 replicates within a site). I am trying to calculate various summary stats (

What is the R equivalent to the e(sample) command in Stata?

I'm running conditional logistic regression models in R as part of a discordant sibling pair analysis and I need to isolate the total n for each model. Also, I

SMOTE_NC function in R: error in the output

Thank you in advance for your time! I'm having some trouble with the SMOTE_NC function in R (https://rdrr.io/github/dongyuanwu/RSBID/man/SMOTE_NC.html). Shortly

kwic() function returns fewer rows than it should

I'm currently trying to perform a sentiment analysis on a kwic object, but I'm afraid that the kwic() function does not return all rows it should return. I'm no

Copy column from one data.frame to another based on index

The problem is similar to what was posted in "Combine dataframe based on index R". I am trying to copy one column from df2 (huge df) to df1 (small df) but based on ind

Lag and lead a variable in a dataframe by 1 month and 6 business days for panel data

I have a large panel data set and I would like to lag and lead a variable by 1 month and 6 business days. I know, for instance, from dplyr there is the lag or