Is there a way to dynamically create new arrays from a dataframe?
I have a table that looks like
|Category|number|absorbance|protein1|protein2|
|--------|------|----------|--------|--------|
|a|int|float|float|float|
|a|int|float|float|float|
|a|int|float|float|float|
|a|int|float|float|float|
|b|int|float|float|float|
|b|int|float|float|float|
|b|int|float|float|float|
|b|int|float|float|float|
|b|int|float|float|float|
|c|int|float|float|float|
|c|int|float|float|float|
|c|int|float|float|float|
|c|int|float|float|float|
|c|int|float|float|float|
and I want to extract vectors of absorbance, protein1, and protein2 for a, b, and c.
There could be more than 3 categories, and the categories need not all have the same number of rows. Is there something built into dataframes that can extract these subarrays?
I tried dynamically coding this using
```julia
for sample_num=1:number_samples
    for i=1:3
        #determine col type
        if i==1
            data_col_name="abs"
        elseif i==2
            data_col_name="B-actin"
        else
            data_col_name="pink-1"
        end
        temp=string(sample_num,"_",data_col_name)
        column_count=4*(sample_num-1)+i
        @eval($:string(sample_num,data_col_name)=$file_contents[:,column_count])
        #string_as_varname(temp,file_contents[:,column_count])
    end
end
```
But I got this error:

```
error in method definition: function Base.string must be explicitly imported to be extended
top-level scope@none:0
top-level scope@none:1
eval@boot.jl:360[inlined]
top-level scope@Local: 13[inlined]
```
Solution 1:
I am not entirely sure what you need, but most likely you are looking for a `GroupedDataFrame`.
You start by creating it:

```julia
julia> using DataFrames

julia> df = DataFrame(id=["a", "a", "b", "b"], col1=1:4, col2=11:14)
4×3 DataFrame
 Row │ id      col1   col2
     │ String  Int64  Int64
─────┼──────────────────────
   1 │ a           1     11
   2 │ a           2     12
   3 │ b           3     13
   4 │ b           4     14

julia> gdf = groupby(df, :id)
GroupedDataFrame with 2 groups based on key: id
First Group (2 rows): id = "a"
 Row │ id      col1   col2
     │ String  Int64  Int64
─────┼──────────────────────
   1 │ a           1     11
   2 │ a           2     12
⋮
Last Group (2 rows): id = "b"
 Row │ id      col1   col2
     │ String  Int64  Int64
─────┼──────────────────────
   1 │ b           3     13
   2 │ b           4     14
```
Now, to get column `:col1` from group `"a"`, you can just write:

```julia
julia> gdf[("a",)].col1
2-element view(::Vector{Int64}, [1, 2]) with eltype Int64:
 1
 2
```
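Note that the result above is a view into the parent column, not an independent vector. The following is a minimal sketch of the difference, reusing the same `df`; the names `v` and `c` are only illustrative. Property access on a group returns a view, while indexing with `[:, col]` returns a copy.

```julia
using DataFrames

df = DataFrame(id=["a", "a", "b", "b"], col1=1:4, col2=11:14)
gdf = groupby(df, :id)

v = gdf[("a",)].col1        # a view into df.col1 (no data is copied)
c = gdf[("a",)][:, :col1]   # an independent copy of the same values

v[1] = 100                  # writes through to the parent data frame
c[2] = -1                   # changes only the copy

df.col1                     # now [100, 2, 3, 4]
```

If the vectors are going to be modified later, the copying form (or `copy(v)`) is probably the safer choice.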
You can also define the group value and column name programmatically, e.g.:

```julia
julia> group = "b"
"b"

julia> column = "col2"
"col2"

julia> gdf[(group,)][:, column]
2-element Vector{Int64}:
 13
 14
```
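When the number of categories is not known in advance, one option is to iterate over the groups instead of naming each one. Below is a minimal sketch, assuming a table shaped like the one in the question; the values are made up for illustration, and the group sizes differ on purpose.

```julia
using DataFrames

# Made-up data using the question's column layout.
df = DataFrame(Category=["a", "a", "b", "b", "b"],
               number=1:5,
               absorbance=[0.1, 0.2, 0.3, 0.4, 0.5],
               protein1=[1.0, 2.0, 3.0, 4.0, 5.0],
               protein2=[10.0, 20.0, 30.0, 40.0, 50.0])

gdf = groupby(df, :Category)

# pairs(gdf) yields a (group key => sub-data-frame) pair for every group present.
for (key, sub) in pairs(gdf)
    println(key.Category, ": absorbance = ", sub.absorbance,
            ", protein1 = ", sub.protein1)
end
```

Because the loop runs over whatever groups exist, nothing has to change when a fourth or fifth category appears in the data.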
EDIT
How to get a dictionary mapping (key, column) pairs into subvectors:
```julia
julia> Dict((key[1], col) => gdf[key][:, col] for key in keys(gdf), col in valuecols(gdf))
Dict{Tuple{String, Symbol}, Vector{Int64}} with 4 entries:
  ("a", :col2) => [11, 12]
  ("b", :col2) => [13, 14]
  ("a", :col1) => [1, 2]
  ("b", :col1) => [3, 4]
```
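A dictionary like this is indexed with a `(group, column)` tuple. The sketch below builds the same dictionary from an explicit list of columns and then looks up one entry; the names `cols` and `vectors` are illustrative and not part of DataFrames.jl. Restricting the columns may be useful for the table in the question, where `valuecols` would also include `number`.

```julia
using DataFrames

df = DataFrame(id=["a", "a", "b", "b"], col1=1:4, col2=11:14)
gdf = groupby(df, :id)

# Hypothetical column selection; valuecols(gdf) would return all non-key columns.
cols = [:col1, :col2]

vectors = Dict((key[1], col) => gdf[key][:, col]
               for key in keys(gdf), col in cols)

vectors[("b", :col2)]   # [13, 14]
```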
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow