New state of the art in unlimited generation of Hamming sequence
(this is exciting!) I know, the subject matter is well known. The state of the art (in Haskell as well as other languages) for efficient generation of an unbounded increasing sequence of Hamming numbers, without duplicates and without omissions, has long been the following (AFAIK - and by the way, it is equivalent to Edsger Dijkstra's original solution, too):
hamm :: [Integer]
hamm = 1 : map (2*) hamm `union` map (3*) hamm `union` map (5*) hamm
  where
    union a@(x:xs) b@(y:ys) = case compare x y of
        LT -> x : union xs b
        EQ -> x : union xs ys
        GT -> y : union a  ys
The question I'm asking is, can you find a way to make it more efficient in any significant measure? Is it still the state of the art, or is it in fact possible to improve this to run twice as fast?
If your answer is yes, please show the code and discuss its speed and empirical orders of growth in comparison to the above (it runs at about ~ n^(1.05…1.10) for the first few hundreds of thousands of numbers produced). Also, if it exists, can this efficient algorithm be extended to producing a sequence of smooth numbers with any given set of primes?
(clarification: I'm not asking about the much faster direct generation of an nth Hamming number, but rather generating all first n numbers in the sequence.)
Solution 1:[1]
If a constant factor(1) speedup counts as significant, then I can offer a significantly more efficient version:
hamm :: [Integer]
hamm = mrg1 hamm3 (map (2*) hamm)
  where
    hamm5 = iterate (5*) 1
    hamm3 = mrg1 hamm5 (map (3*) hamm3)
    merge a@(x:xs) b@(y:ys)
        | x < y     = x : merge xs b
        | otherwise = y : merge a ys
    mrg1 (x:xs) ys = x : merge xs ys
You can easily generalise it to smooth numbers for a given set of primes:
import Data.List (foldl', sortBy)

hamm :: [Integer] -> [Integer]
hamm []  = [1]
hamm [p] = iterate (p*) 1
hamm ps  = foldl' next (iterate (q*) 1) qs
  where
    (q:qs) = sortBy (flip compare) ps   -- largest prime first
    next prev m = let res = mrg1 prev (map (m*) res) in res
    merge a@(x:xs) b@(y:ys)
        | x < y     = x : merge xs b
        | otherwise = y : merge a ys
    mrg1 (x:xs) ys = x : merge xs ys
It's more efficient because that algorithm doesn't produce any duplicates and it uses less memory. In your version, when a Hamming number near h is produced, the part of the list between h/5 and h has to be in memory. In my version, only the part between h/2 and h of the full list, and the part between h/3 and h of the 3-5-list, need to be in memory. Since the 3-5-list is much sparser, and the density of k-smooth numbers decreases, those two list parts need much less memory than the larger part of the full list.
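As a rough illustration of the memory argument (a sketch only, not a measurement of GHC's heap), the following counts how many list elements fall into the windows named above. The names `hamm35`, `countBetween`, `retainedClassic` and `retainedStaged` are made up for this illustration, and the 3-5-list is lifted to the top level so it can be inspected.

```haskell
-- The 3-5-list and the full list, defined as in the two-stage version above.
hamm35 :: [Integer]
hamm35 = mrg1 (iterate (5*) 1) (map (3*) hamm35)

hamm :: [Integer]
hamm = mrg1 hamm35 (map (2*) hamm)

merge :: [Integer] -> [Integer] -> [Integer]
merge a@(x:xs) b@(y:ys)
    | x < y     = x : merge xs b
    | otherwise = y : merge a ys
merge a b = a ++ b            -- only reached if one input is finite

mrg1 :: [Integer] -> [Integer] -> [Integer]
mrg1 (x:xs) ys = x : merge xs ys
mrg1 []     ys = ys

-- Number of elements e of an ascending list with lo < e <= hi.
countBetween :: Integer -> Integer -> [Integer] -> Int
countBetween lo hi = length . takeWhile (<= hi) . dropWhile (<= lo)

-- Classical version: the window (h/5, h] of the full list.
retainedClassic :: Integer -> Int
retainedClassic h = countBetween (h `div` 5) h hamm

-- Two-stage version: (h/2, h] of the full list plus (h/3, h] of the 3-5-list.
retainedStaged :: Integer -> Int
retainedStaged h = countBetween (h `div` 2) h hamm
                 + countBetween (h `div` 3) h hamm35
```

For h = 30 this counts 12 elements for the classical window against 7 + 3 = 10 for the two-stage one; the gap should widen as h grows, since the 3-5-list gets sparser.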
Some timings for the two algorithms to produce the k-th Hamming number, with empirical complexity of each target relative to the previous, excluding and including GC time:
k        Yours (MUT/GC)  excl. GC  incl. GC   Mine (MUT/GC)  excl. GC  incl. GC
10^5     0.03/0.01                            0.01/0.01                          -- too short to say much, really
2*10^5   0.07/0.02                            0.02/0.01
5*10^5   0.17/0.06       0.968     1.024      0.06/0.04      1.199     1.314
10^6     0.36/0.13       1.082     1.091      0.11/0.10      0.874     1.070
2*10^6   0.77/0.27       1.097     1.086      0.21/0.21      0.933     1.000
5*10^6   1.96/0.71       1.020     1.029      0.55/0.59      1.051     1.090
10^7     4.05/1.45       1.047     1.043      1.14/1.25      1.052     1.068
2*10^7   8.73/2.99       1.108     1.091      2.31/2.65      1.019     1.053
5*10^7   21.53/7.83      0.985     1.002      6.01/7.05      1.044     1.057
10^8     45.83/16.79     1.090     1.093      12.42/15.26    1.047     1.084
As you can see, the factor between the MUT times is about 3.5, but the GC time is not much different.
(1) Well, it looks constant, and I think both variants have the same computational complexity, but I haven't pulled out pencil and paper to prove it, nor do I intend to.
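For readers who want to reproduce the "empirical complexity" columns, here is a minimal sketch of the calculation as I understand it from the table: the exponent a for which t ~ n^a fits two consecutive measurements. The names `empiricalOrder` and `exampleOrder` are hypothetical, introduced only for this illustration.

```haskell
-- | Empirical order of growth between two (problem size, run time) points:
--   the exponent a such that t2 / t1 == (n2 / n1) ** a.
empiricalOrder :: (Double, Double) -> (Double, Double) -> Double
empiricalOrder (n1, t1) (n2, t2) = logBase (n2 / n1) (t2 / t1)

-- MUT-only times of the question's code at k = 2*10^5 and k = 5*10^5,
-- taken from the table above.
exampleOrder :: Double
exampleOrder = empiricalOrder (2e5, 0.07) (5e5, 0.17)   -- roughly 0.968
```

With run times this short, a single measurement pair is noisy, which is presumably why the values in the table scatter around 1.0 to 1.1.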
Solution 2:[2]
So basically, now that Daniel Fischer gave his answer, I can say that I came across this recently, and I think this is an exciting development, since the classical code was known for ages, since Dijkstra.
Daniel correctly identified the redundancy of the duplicates generation which must then be removed, in the classical version.
The credit for the original discovery (AFAIK) goes to Rosettacode.org's contributor Ledrug, as of 2012-08-26. And of course the independent discovery by Daniel Fischer, here (2012-09-18).
Re-written slightly, that code is:
import Data.Function (fix)
hamm = 1 : foldr (\n s -> fix (merge s . (n:) . map (n*))) [] [2,3,5]
with the usual implementation of merge,
merge a@(x:xs) b@(y:ys) | x < y     = x : merge xs b
                        | otherwise = y : merge a ys
merge [] b = b
merge a [] = a
It gives about a 2.0x - 2.5x speedup vs. the classical version.
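Since the question also asks about arbitrary sets of primes, here is a small sketch of how the same fold might be parameterised. It assumes the list `ps` contains distinct primes; the name `smooth` is introduced here for illustration and is not from the original answers.

```haskell
import Data.Function (fix)

-- Smooth numbers over the given primes (assumed distinct), in increasing order.
smooth :: [Integer] -> [Integer]
smooth ps = 1 : foldr (\n s -> fix (merge s . (n:) . map (n*))) [] ps

-- Ordered merge; the inputs never share an element here, so no
-- duplicate handling is needed.
merge :: [Integer] -> [Integer] -> [Integer]
merge a@(x:xs) b@(y:ys) | x < y     = x : merge xs b
                        | otherwise = y : merge a ys
merge [] b = b
merge a [] = a
```

For example, `take 10 (smooth [2,3,5])` gives the first ten Hamming numbers, `[1,2,3,4,5,6,8,9,10,12]`.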
Solution 3:[3]
Well this was easier than I thought. This will do 1000 Hammings in 0.05 seconds on my slow PC at home. This afternoon at work, on a faster PC, times for fewer than 600 were coming out as zero seconds.
This takes Hammings from Hammings. It's based on doing it fastest in Excel.
I was getting wrong numbers after 250000, with Int. The numbers grow very big very fast, so Integer must be used to be sure, because Int is bounded.
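A tiny sketch of the Int problem, assuming GHC, where Int arithmetic wraps around silently on overflow (the Haskell report itself leaves this behaviour unspecified). The names `overflowsInt` and `wrapped` are hypothetical, chosen for this illustration.

```haskell
-- True when a value is too large to be represented as an Int on this platform.
overflowsInt :: Integer -> Bool
overflowsInt h = h > fromIntegral (maxBound :: Int)

-- One step past the largest Int: under GHC this wraps around to minBound
-- instead of raising an error, which is how wrong numbers creep in unnoticed.
wrapped :: Int
wrapped = maxBound + 1
```

Integer has no such bound, at the cost of somewhat slower arithmetic.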
mkHamm :: [Integer] -> [Integer] -> [Integer] -> [Integer]
       -> Int -> (Integer, [Int])
mkHamm ml (x:xs) (y:ys) (z:zs) n =
    if n <= 1
      then (last ml, map length [(x:xs), (y:ys), (z:zs)])
      else mkHamm (ml++[m]) as bs cs (n-1)
  where
    m  = minimum [x,y,z]
    as = if x == m then xs ++ [m*2] else (x:xs) ++ [m*2]
    bs = if y == m then ys ++ [m*3] else (y:ys) ++ [m*3]
    cs = if z == m then zs ++ [m*5] else (z:zs) ++ [m*5]
Testing,
> mkHamm [1] [2] [3] [5] 5000
(50837316566580,[306,479,692]) -- (0.41 secs)
> mkHamm [1] [2] [3] [5] 10000
(288325195312500000,[488,767,1109]) -- (1.79 secs)
> logBase 2 (1.79/0.41) -- log of times ratio =
2.1262637726461726 -- empirical order of growth
> map (logBase 2) [488/306, 767/479, 1109/692] :: [Float]
[0.6733495, 0.6792009, 0.68041545] -- leftovers sizes ratios
This means that this code's run time's empirical order of growth is above quadratic (~n^2.13 as measured, interpreted, at GHCi prompt).
Also, the sizes of the three dangling overproduced segments of the sequence are each ~n^0.67 i.e. ~n^(2/3).
Additionally, this code is non-lazy: the resulting sequence's first element can only be accessed after the very last one is calculated.
The state of the art code in the question is linear, overproduces exactly 0 elements past the point of interest, and is properly lazy: it starts producing its numbers immediately.
So, though an immense improvement over the previous answers by this poster, it is still significantly worse than the original, let alone its improvement as appearing in the top two answers.
12.31.2018
Only the very best people educate. @Will Ness also has authored or co-authored 19 chapters in GoalKicker.com “Haskell for Professionals”. The free book is a treasure.
I had carried around the idea of a function that would do this, like this. I was apprehensive because I thought it would be convoluted and involved logic like in some modern languages. I decided to start writing and was amazed how easy Haskell makes the realization of even bad ideas.
I've not had difficulty generating unique lists. My problem is the lists I generate do not end well. Even when I use diagonalization they leave residual values making their use unreliable at best.
Here is a reworked 3's and 5's list with nothing residual at the end. The diagonalization is to reduce residual values, not to eliminate duplicates, which are never included anyway.
import Data.List (sort)

g3s5s n = [ t*b | (a,b) <- [ (((d+n)-(d*2)), 5^d) | d <- [0..n] ],
                  t <- [ 3^e | e <- [0..a+8] ],
                  (t*b) <= (3^(n+6))+a ]   -- guard: keep only products within the limit

ham2 n = take n $ ham2' (drop 1.sort.g3s5s $ 48) [1]

ham2' o@(f:fo) e@(h:hx) = if h == min h f
                            then h:ham2' o (hx ++ [h*2])
                            else f:ham2' fo ( e ++ [f*2])
The twos list can be generated with all 2^e values multiplied by each of the 3s5s, but when the identity 2^0 is included, then, in total, it is the Hammings.
3/25/2019
Well, finally. I knew this some time ago but could not implement it without excess values at the end. The problem was how to not generate the excess that is the result of a Cartesian Product. I use Excel a lot and could not see the pattern of values to exclude from the Cartesian Product worksheet. Then, eureka! The functions generate lists of each lead factor. The value to limit the values in each list is the end point of the first list. When this is done, all Hammings are produced with no excess.
Two functions for Hammings. The first is a new 3's & 5's list which is then used to create multiples with the 2's. The multiples are Hammings.
h35r  x   = h3s5s x (5^x)
h3s5s x c = [t| n<-[3^e|e<-[0..x]],
                m<-[5^e|e<-[0..x]],
                t<-[n*m],
                t <= c ]

a2r n   = sort $ a2s n (2^n)
a2s n c = [h| b<-h35r n,
              a<-[2^e| e<-[0..n]],
              h<-[a*b],
              h <= c ]
last $ a2r 50
1125899906842624
(0.16 secs, 321,326,648 bytes)
2^50
1125899906842624
(0.00 secs, 95,424 bytes)
This is an alternative implementation: cleaner, faster, and with less memory usage.
gnf n f = scanl (*) 1 $ replicate f n
mk35 n = (\c-> [m| t<- gnf 3 n, f<- gnf 5 n, m<- [t*f], m<= c]) (2^(n+1))
mkHams n = (\c-> sort [m| t<- mk35 n, f<- gnf 2 (n+1), m<- [t*f], m<= c]) (2^(n+1))
last $ mkHams 50
2251799813685248
(0.03 secs, 12,869,000 bytes)
2^51
2251799813685248
5/6/2019
Well, I tried limiting differently but always come back to what is simplest. I am opting for the least memory usage as also seeming to be the fastest.
I also opted to use map with an implicit parameter.
I also found that mergeAll from Data.List.Ordered is faster than sort, or sort and concat.
I also like when sublists are created so I can analyze the data much easier.
Then, because of @Will Ness, I switched to iterate instead of scanl, making much cleaner code. Also because of @Will Ness, I stopped using the last of the 2s list and switched to one value determining all lengths.
I do think recursively defined lists are more efficient, the previous number multiplied by a factor.
Just separating the function into two doesn't make a difference so the 3 and 5 multiples would be
m35 lim = mergeAll $
          map (takeWhile (<=lim).iterate (*3)) $
               takeWhile (<=lim).iterate (*5)  $ 1
And the 2s each multiplied by the product of 3s and 5s
ham n = mergeAll $
        map (takeWhile (<=lim).iterate (*2)) $ m35 lim
    where lim= 2^n
After editing the function I ran it
last $ ham 50
1125899906842624
(0.00 secs, 7,029,728 bytes)
then
last $ ham 100
1267650600228229401496703205376
(0.03 secs, 64,395,928 bytes)
It is probably better to use 10^n but for comparison I again used 2^n
5/11/2019
Because I so prefer infinite and recursive lists I became a bit obsessed with making these infinite.
I was so impressed and inspired with @Daniel Wagner and his Data.Universe.Helpers that I started using +*+ and +++ but then added my own infinite list. I had to mergeAll my list to work but then realized the infinite 3 and 5 multiples were exactly what they should be. So, I added the 2s and mergeAll-ed everything and they came out. Before, I stupidly thought mergeAll would not handle infinite lists but it does most marvelously.
When a list is infinite in Haskell, Haskell calculates just what is needed, that is, is lazy. The adjunct is that it does calculate from the start.
Now, since Haskell multiplies until the limit of what is wanted, no limit is needed in the function, that is, no more takeWhile. The speed up is incredible and the memory lowered too.
The following is on my slow home PC with 3GB of RAM.
tia = mergeAll.map (iterate (*2)) $
      mergeAll.map (iterate (*3)) $ iterate (*5) 1
last $ take 10000 tia
288325195312500000
(0.02 secs, 5,861,656 bytes)
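To see why merging an infinite list of infinite lists can work at all, here is a dependency-free sketch. `mergeAll'` is a simplified, hypothetical stand-in and not the actual implementation in Data.List.Ordered (which I would expect to be considerably faster); it assumes each inner list is ascending and that the heads of the inner lists are ascending as well, which holds for the lists built from iterate here. `tia'` mirrors tia above using that stand-in.

```haskell
-- Simplified stand-in for mergeAll (an assumption of this sketch).
-- Because heads are ascending, the head of the first list is the overall
-- minimum and can be emitted before any later list is looked at.
mergeAll' :: Ord a => [[a]] -> [a]
mergeAll' = foldr step []
  where
    step []     rest = rest
    step (x:xs) rest = x : merge xs rest

-- Plain ordered merge of two ascending lists.
merge :: Ord a => [a] -> [a] -> [a]
merge a@(x:xs) b@(y:ys)
    | x <= y    = x : merge xs b
    | otherwise = y : merge a ys
merge a b = a ++ b            -- one of the two is empty

-- Same shape as tia above, built on the stand-in.
tia' :: [Integer]
tia' = mergeAll' . map (iterate (*2)) $
       mergeAll' . map (iterate (*3)) $ iterate (*5) 1
```

Laziness does the rest: `take 10 tia'` only forces as many inner lists as are needed to decide the first ten elements.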
6.5.2019
I learned how to ghc -O2. So the following is for 50000 Hammings to 2.38E+30. And this is further proof my code is garbage.
INIT time 0.000s ( 0.000s elapsed)
MUT time 0.000s ( 0.916s elapsed)
GC time 0.047s ( 0.041s elapsed)
EXIT time 0.000s ( 0.005s elapsed)
Total time 0.047s ( 0.962s elapsed)
Alloc rate 0 bytes per MUT second
Productivity 0.0% of total user, 95.8% of total elapsed
6.13.2019
@Will Ness rawks. He provided a clean and elegant revision of tia above and it proved to be five times as fast in GHCi. When I ghc -O2 +RTS -s his against mine, mine was several times as fast. There had to be a compromise.
So, I started reading about fusion that I had encountered in R. Bird's Thinking Functionally with Haskell and almost immediately tried this.
mai n = mergeAll.map (iterate (*n))
mai 2 $ mai 3 $ iterate (*5) 1
It matched Will's at 0.08 for 100K Hammings in GHCi, but what really surprised me is this (also for 100K Hammings), and especially the elapsed times. 100K is up to 2.9e+38.
TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.000s ( 0.000s elapsed)
MUT time 0.000s ( 0.002s elapsed)
GC time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.000s ( 0.002s elapsed)
Alloc rate 0 bytes per MUT second
Productivity 100.0% of total user, 90.2% of total elapsed
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 |