sql - How can I normalize the capitalization of a group-by column? -
on sql server configured case-insensitive, group by
can have interesting results when [n][var]char
column not first group by
column. essentially, looks whatever row encounters "first" (where "first" undefined in absence of order): wins grouping. example:
select x.[day], x.[name], count(1) [count] ( select 1 [day], 'a' [name] union select 1, 'a' union select 2, 'a' union select 2, 'a' ) x group x.[day], x.[name]
which returns, me:
day name count ----------- ---- ----------- 1 2 2 2
using min(x.[name])
has no effect, since grouping happened.
i can't add order by
before group by
, illegal; , adding order by
after group by
defines output order after grouping - still gives a
, a
.
so: there sane way of doing capitalization @ least consistent groupings? (i'll leave day problem of being consistent separate runs)
desired output, either:
day name count ----------- ---- ----------- 1 2 2 2
or:
day name count ----------- ---- ----------- 1 2 2 2
edit: without destroying capitalisation when consistent between groups. no upper/lower. if 1 of groups consistently has value bcdef
, want result of row bcdef
, not bcdef
or bcdef
.
i use windowing functions this. using row_number
, partitioning using case insensitive collation, ordering case sensitive one, choose consistently 1 result original capitalisation, group them if same:
with cte ( select *, rn = row_number() over(partition [day], [name] order [name] collate sql_latin1_general_cp1_cs_as), n = count(*) over(partition [day], [name]) ( select 1 [day], 'a' [name] union select 1, 'a' union select 2, 'a' union select 2, 'a' union select 3, 'bcdef' union select 3, 'bcdef') x ) select * cte rn = 1;
it returns:
╔═════╦═══════╦════╦═══╗ ║ day ║ name ║ rn ║ n ║ ╠═════╬═══════╬════╬═══╣ ║ 1 ║ ║ 1 ║ 2 ║ ║ 2 ║ ║ 1 ║ 2 ║ ║ 3 ║ bcdef ║ 1 ║ 2 ║ ╚═════╩═══════╩════╩═══╝
following @andriym's comment, if want same capitalisation on whole result set, , not same day, can use:
with cte ( select *, rn = row_number() over(partition [day], [name] order [name] collate sql_latin1_general_cp1_cs_as), n = count(*) over(partition [day], [name]) ( select 1 [day], 'a' [name] union select 1, 'a' union select 2, 'a' union select 2, 'a' union select 3, 'bcdef' union select 3, 'bcdef') x ) select [day], max([name] collate sql_latin1_general_cp1_cs_as) on (partition [name]) [name], n cte rn = 1;
Comments
Post a Comment