Category "dataframe"

Fill in missing variables of family relationship matrix

I have a dataframe of family relationships (parent, child, spouse, etc.) which is partially filled as per example below. I am trying to use R to fill in the mis

How to plot two plotly figures with common animation_frame

I am trying to plot both a scatterplot and a line plot, in the same figure. One is for objects and the other for lane markers. The outcome should be one figure

How to merge two dfs in pandas (based on datetime period), and add rows if duplicates

I have the following 2 dfs: diag id encounter_key start_of_period end_of_period 1 AAA 2020-06-12 2021-07-07 1 BBB 2021-12-31 2022-01-04 drug id start_datetime

How do I pass arguments to srvyr inside of a function?

I'm using srvyr to calculate survey means of a variable (y) from a survey object, grouping by a categorical variable (x) from that same survey object, and th

How to estimate similarity between sensor data based on the number of occurrence?

Following is my sample data: data = {850.0: 6, -852.0: 5, 992.0: 29, -993.0: 25, 990.0: 27, -992.0: 28, 965.0: 127, 988.0: 37, -994.0: 24, 996.0: 14, -996.0: 1

Python DataFrame manipulation: How to extract a set of columns in a fast way

I need to access and extract information from a DataFrame that is used by other colleagues in a research group. The DataFrame structure is: zee.loc[zee['layer'

Dataframe add new row if the index does not exist like a dictionary without checking existence

import pandas as pd a = [['a', 1, 2, 3], ['b', 4, 5, 6], ['c', 7, 8, 9]] df = pd.DataFrame(a, columns=['alpha', 'one', 'two', 'three']) df.set_index(['alpha'],
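
A minimal sketch of one way to get dictionary-like behaviour here, assuming the truncated `set_index` call simply makes `alpha` the index: assigning through `.loc` overwrites a row whose label already exists and appends a new row otherwise, so no existence check is needed. The row values written below are made up for illustration.

```python
import pandas as pd

a = [['a', 1, 2, 3], ['b', 4, 5, 6], ['c', 7, 8, 9]]
df = pd.DataFrame(a, columns=['alpha', 'one', 'two', 'three'])

# Assumption: the excerpt's set_index call is cut off; here 'alpha'
# becomes the index and the result is reassigned.
df = df.set_index('alpha')

# .loc assignment works like dict item assignment:
# an existing label is overwritten, a missing label is appended.
df.loc['b'] = [40, 50, 60]   # 'b' exists, so the row is replaced
df.loc['d'] = [10, 11, 12]   # 'd' is new, so a row is added

print(df)
```

Appending rows one at a time this way copies data on each enlargement, so for many new rows it is usually cheaper to collect them first and concatenate once.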

Constant warning message with reshape::melt in R

I am constantly getting a warning message like: as.is should be specified by the caller using true. The code is like: difficulty_data <- data_original[,c(-1)] %

Highlight element based on boolean pandas df

I have 2 data frames with identical indices/columns: df = pd.DataFrame({'A':[5.5, 3, 0, 3, 1], 'B':[2, 1, 0.2, 4, 5],

Simultaneously remove the first and last rows of a data frame until reaching a row that does not have an NA

I have a dataframe that contains NA values, and I want to remove some rows that have an NA (i.e., not complete cases). However, I only want to remove rows at th

Replace values in a dataframe based on lookup table

I am having some trouble replacing values in a dataframe. I would like to replace values based on a separate table. Below is an example of what I am trying to d

Interactive filtering data table in Plotly by using a dropdown

I am trying to make an interactive table where the values of the table change by selecting a value from a dropdown. This should be done only in Plotly (not Dash

Creating a new dataframe column with the number of overlapping words between dataframe and list

I'm having some trouble fixing the following problem: I have a dataframe with tokenised text on every row that looks (something) like the following index feelin

How to map single column in pandas using multiple columns (text and numbers) in a separate df

I'm trying to convert U.S. geolocation codes for states, counties and cities. The problem is, the county and city codes are duplicated -- meaning, multiple stat

How to select all columns whose names start with X in a pandas DataFrame

I have a DataFrame: import pandas as pd import numpy as np df = pd.DataFrame({'foo.aa': [1, 2.1, np.nan, 4.7, 5.6, 6.8], 'foo.fighters': [0
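
A short sketch of two common ways to select columns by prefix. Only the `foo.aa` column is taken from the excerpt above; the `foo.bar` and `other` columns and the `prefix` variable are hypothetical, added so the example is self-contained.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'foo.aa' comes from the excerpt, the rest is invented.
df = pd.DataFrame({
    'foo.aa': [1, 2.1, np.nan, 4.7, 5.6, 6.8],
    'foo.bar': [0, 1, 0, 0, 1, 0],
    'other': list('abcdef'),
})

prefix = 'foo.'

# Option 1: boolean mask over the column labels.
foo_cols = df.loc[:, df.columns.str.startswith(prefix)]

# Option 2: regex filter on the column labels (the dot is escaped
# so it matches a literal '.').
foo_cols_regex = df.filter(regex=r'^foo\.')

print(foo_cols.columns.tolist())
```

The mask version avoids regex escaping when the prefix contains special characters; `filter` is more compact when the pattern is already a regular expression.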

Show Method for Dynamic Frame in AWS glue returns empty field

When I try to use the dyF.show() it returns an empty field, even though I checked the schema and count() and I know the table is populated. I transformed it int

Retrieving data from the Air Quality Index (AQI) website through the API and only receiving a small number of stations

I'm working on a personal project and I'm trying to retrieve air quality data from the https://aqicn.org website using their API. I've used this code, which I'v

Get specific rows which match condition pandas [duplicate]

I have the following dataframe. My current code is as follows: The desired outcome is to only show instances where ImageFileName is services.exe and the P

How to join two very large dataframes together with same columns?

I have two datasets that look like this: df1: Date City State Quantity 2019-01 Chicago IL 35 2019-01 Orlando FL 322 ... .... ... ... 2021-07 Chicago IL 334 202

Get records that are a time interval away from a given date and specific conditions on a pandas DataFrame

Given the following pandas DataFrame: | ID | date | direction | country_ID | |-----------|-------------------------|----