'Setting the values ​of a COUNT(*)

I'm using SAS and I'm stuck when I want to relay this in SQL :

I have a column that counts the number of line repetitions :

DUPLICATES | NUMBER   |  DATE
---------------------------------
 2         |  123     |  22-05
 2         |  123     |  22-05
 3         |  222     |  21-05
 3         |  222     |  21-05
 3         |  222     |  21-05
 1         |  111     |  23-05
 ...

Code :

CREATE WORK.TABLE TABLE_TEST AS
   SELECT COUNT(*) AS DUPLICATES,
          NUMBER,
          DATE
   FROM WORK.INITIAL_TABLE
   GROUP BY NUMBER, DATE;

I would like to block the result of the COUNT(*) so that when I do a SELECT DISTINCT... on the table, I only have one row per value but with its initial number of repetitions :

DUPLICATES | NUMBER   |  DATE
---------------------------------
 2         |  123     |  22-05
 3         |  222     |  21-05
 1         |  111     |  23-05
 ...

I tried to make a copy of the DUPLICATES column but it also copied the calculation method COUNT(*) which gives me a bad result when I do a SELECT DISTINCT :

DUPLICATES_V2 | NUMBER   |  DATE
---------------------------------
 1            |  123     |  22-05
 1            |  222     |  21-05
 1            |  111     |  23-05
 ...

If you have an idea of ​​how to proceed or documentation it would help me a lot. Thanks in advance



Solution 1:[1]

There are several ways to do it, IMHO the simplest is proc sort:

proc sort data=work.INITIAL_TABLE out=TABLE_TEST nodupkey;
  by NUMBER DATE;
run;

Solution 2:[2]

I think you want PROC FREQ here. I suspect your query isn't as shown and you have another column included, but not in the GROUP BY which is causing your issues. The query as shown has errors and would not successfully run.

proc freq data=initial_data noprint;
table number*date / out=want;
run;

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Egor Lipchinskiy
Solution 2 Reeza