'How can I use a HashMap with f64 as key in Rust?
I want to use a HashMap<f64, f64>
, for saving the distances of a point with known x and key y to another point. f64
as value shouldn't matter here, the focus should be on key.
let mut map = HashMap<f64, f64>::new();
map.insert(0.4, f64::hypot(4.2, 50.0));
map.insert(1.8, f64::hypot(2.6, 50.0));
...
let a = map.get(&0.4).unwrap();
As f64
is neither Eq
nor Hash
, but only PartialEq
, f64
is not sufficient as a key. I need to save the distances first, but also access the distances later by y. The type of y needs to be floating point precision, but if doesn't work with f64
, I'll use an i64
with an known exponent.
I tried some hacks by using my own struct Dimension(f64)
and then implementing Hash
by converting the float into a String
and then hashing it.
#[derive(PartialEq, Eq)]
struct DimensionKey(f64);
impl Hash for DimensionKey {
fn hash<H: Hasher>(&self, state: &mut H) {
format!("{}", self.0).hash(state);
}
}
It seems very bad and both solutions, my own struct or float as integers with base and exponent seem to be pretty complicated for just a key.
Update:
I can guarantee that my key never will be NaN
, or an infinite value. Also, I won't calculate my keys, only iterating over them and using them. So there should no error with the known error with 0.1 + 0.2 ≠ 0.3
.
How to do a binary search on a Vec of floats? and this question have in common to implement total ordering and equality for a floating number, the difference lies only in the hashing or iterating.
Solution 1:[1]
You could split the f64
into the integral and fractional part and store them in a struct in the following manner:
#[derive(Hash, Eq, PartialEq)]
struct Distance {
integral: u64,
fractional: u64
}
The rest is straightforward:
use std::collections::HashMap;
#[derive(Hash, Eq, PartialEq)]
struct Distance {
integral: u64,
fractional: u64
}
impl Distance {
fn new(i: u64, f: u64) -> Distance {
Distance {
integral: i,
fractional: f
}
}
}
fn main() {
let mut map: HashMap<Distance, f64> = HashMap::new();
map.insert(Distance::new(0, 4), f64::hypot(4.2, 50.0));
map.insert(Distance::new(1, 8), f64::hypot(2.6, 50.0));
assert_eq!(map.get(&Distance::new(0, 4)), Some(&f64::hypot(4.2, 50.0)));
}
Edit: As Veedrac said, a more general and efficient option would be to deconstruct the f64
into a mantissa-exponent-sign triplet. The function that can do this, integer_decode()
, is deprecated in std
, but it can be easily found in Rust GitHub.
The integer_decode()
function can be defined as follows:
use std::mem;
fn integer_decode(val: f64) -> (u64, i16, i8) {
let bits: u64 = unsafe { mem::transmute(val) };
let sign: i8 = if bits >> 63 == 0 { 1 } else { -1 };
let mut exponent: i16 = ((bits >> 52) & 0x7ff) as i16;
let mantissa = if exponent == 0 {
(bits & 0xfffffffffffff) << 1
} else {
(bits & 0xfffffffffffff) | 0x10000000000000
};
exponent -= 1023 + 52;
(mantissa, exponent, sign)
}
The definition of Distance
could then be:
#[derive(Hash, Eq, PartialEq)]
struct Distance((u64, i16, i8));
impl Distance {
fn new(val: f64) -> Distance {
Distance(integer_decode(val))
}
}
This variant is also easier to use:
fn main() {
let mut map: HashMap<Distance, f64> = HashMap::new();
map.insert(Distance::new(0.4), f64::hypot(4.2, 50.0));
map.insert(Distance::new(1.8), f64::hypot(2.6, 50.0));
assert_eq!(map.get(&Distance::new(0.4)), Some(&f64::hypot(4.2, 50.0)));
}
Solution 2:[2]
Presented with no comment beyond read all the other comments and answers to understand why you probably don't want to do this:
use std::{collections::HashMap, hash};
#[derive(Debug, Copy, Clone)]
struct DontUseThisUnlessYouUnderstandTheDangers(f64);
impl DontUseThisUnlessYouUnderstandTheDangers {
fn key(&self) -> u64 {
self.0.to_bits()
}
}
impl hash::Hash for DontUseThisUnlessYouUnderstandTheDangers {
fn hash<H>(&self, state: &mut H)
where
H: hash::Hasher,
{
self.key().hash(state)
}
}
impl PartialEq for DontUseThisUnlessYouUnderstandTheDangers {
fn eq(&self, other: &DontUseThisUnlessYouUnderstandTheDangers) -> bool {
self.key() == other.key()
}
}
impl Eq for DontUseThisUnlessYouUnderstandTheDangers {}
fn main() {
let a = DontUseThisUnlessYouUnderstandTheDangers(0.1);
let b = DontUseThisUnlessYouUnderstandTheDangers(0.2);
let c = DontUseThisUnlessYouUnderstandTheDangers(0.3);
let mut map = HashMap::new();
map.insert(a, 1);
map.insert(b, 2);
println!("{:?}", map.get(&a));
println!("{:?}", map.get(&b));
println!("{:?}", map.get(&c));
}
Basically, if you want to treat a f64
as a set of bits that have no meaning, well, we can treat them as an equivalently sized bag of bits that know how to be hashed and bitwise-compared.
Don't be surprised when one of the 16 million NaN
values doesn't equal another one.
Solution 3:[3]
Unfortunately, floating types equality is hard and counter-intuitive:
fn main() {
println!("{} {} {}", 0.1 + 0.2, 0.3, 0.1 + 0.2 == 0.3);
}
// Prints: 0.30000000000000004 0.3 false
And therefore hashing is hard too, since hashes of equal values should be equal.
If, in your case, you have a small enough range to fit your number in a i64
and you can accept the loss of precision, then a simple solution is to canonicalize first and then define equal/hash in terms of the canonical value:
use std::cmp::Eq;
#[derive(Debug)]
struct Distance(f64);
impl Distance {
fn canonicalize(&self) -> i64 {
(self.0 * 1024.0 * 1024.0).round() as i64
}
}
impl PartialEq for Distance {
fn eq(&self, other: &Distance) -> bool {
self.canonicalize() == other.canonicalize()
}
}
impl Eq for Distance {}
fn main() {
let d = Distance(0.1 + 0.2);
let e = Distance(0.3);
println!("{:?} {:?} {:?}", d, e, d == e);
}
// Prints: Distance(0.30000000000000004) Distance(0.3) true
Hash
just follows, and from then on you can use Distance
as a key in the hash map:
impl Hash for Distance {
fn hash<H>(&self, state: &mut H) where H: Hasher {
self.canonicalize().hash(state);
}
}
fn main() {
let d = Distance(0.1 + 0.2);
let e = Distance(0.3);
let mut m = HashMap::new();
m.insert(d, "Hello");
println!("{:?}", m.get(&e));
}
// Prints: Some("Hello")
Warning: To reiterate, this strategy only works if (a) the dynamic range of values is small enough to be captured in a i64
(19 digits) and if (b) the dynamic range is known in advance as the factor is static. Fortunately, this holds for many common problems, but it is something to document and test...
Solution 4:[4]
You can use the ordered_float crate which does this for you.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | Matthieu M. |
Solution 4 | optevo |