Removing true duplicates from greenplum table
I am trying to remove true duplicates from a table. I have removed dupes many times in the past, but I can't figure out what's wrong with my syntax this time.
My code:
DELETE
FROM my_table_name
WHERE (
column1, column2, column3, column4, column5, column6, column7, column8, column9) IN
(
SELECT Row_number() OVER( partition BY column1, column2,column3, column4,column5,column6,column7,column8 ORDER BY column2 DESC, column3 ASC ) AS row_num,
column1,
column2,
column3,
column4,
column5,
column6,
column7,
column8,
column9
FROM my_table_name
WHERE column1='some_value') a
WHERE row_num=2;
Error
********** Error **********
ERROR: syntax error at or near ""a""
SQL state: 42601
Character: 1607
I can see that the error is at the alias a on the subquery, but I can't pinpoint what's wrong.
Any help is appreciated.
Edit 1:
If I remove a, I get the error below:
********** Error **********
ERROR: syntax error at or near "where"
SQL state: 42601
Character: 1608
sql duplicates greenplum
asked Mar 27 at 18:32 by Pirate X
edited Mar 27 at 18:42 by Pirate X
Try removing the 'a' alias, you are not even using it.
– Michael Muryn
Mar 27 at 18:36
I'm not familiar with greenplum tables at all (I'm specifically T-SQL), but help me understand this: why do you have 3 where clauses for 2 queries - albeit nested? If I were in T-SQL, I'd probably suggest changing the 3rd 'Where' statement to an 'AND' filter.
– Tiny Haitian
Mar 27 at 19:11
I tried 'and' instead of 'where' in the last line as well. No help. The reason I have the last 'where' clause is that I cannot put the row_num filter inside the subquery 'a', because it's a function and not a column name.
– Pirate X
Mar 27 at 19:15
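For readers hitting the same error, a hedged note on the syntax: in PostgreSQL-derived SQL such as Greenplum's, a table alias like a is accepted on a FROM item, not on the subquery that follows IN, and a statement can carry only one top-level WHERE. The row_num filter therefore has to sit in an outer query that selects from the windowed subquery. The sketch below shows that shape as a plain SELECT, using the question's table and column names. It only lists the candidate rows and is not a working deduplicating DELETE.

```sql
-- Sketch: filter on a window-function result by wrapping the subquery in FROM.
-- Table and column names are taken from the question; nothing is deleted here.
select a.column1, a.column2, a.column3, a.column4, a.column5,
       a.column6, a.column7, a.column8, a.column9
from (
    select row_number() over (
               partition by column1, column2, column3, column4,
                            column5, column6, column7, column8
               order by column2 desc, column3 asc
           ) as row_num,
           column1, column2, column3, column4, column5,
           column6, column7, column8, column9
    from my_table_name
    where column1 = 'some_value'
) a                      -- the alias is legal here because the subquery is a FROM item
where a.row_num > 1;     -- every extra copy, not only the second one
```

Feeding these rows back into a DELETE with a (column1, ...) IN (...) test would still not help with true duplicates: the extra copies carry the same column values as the copy you want to keep, so the IN test matches all of them.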
1 Answer
If you have true duplicate rows, you can't delete all but one copy with a single DELETE that filters on column values, because any condition that matches one copy matches them all. You have to either delete all the duplicates and then insert one copy of each back, or build a new table without duplicates (preferred).
Let's start with the preferred method: create a new table without the duplicates. This uses disk space more efficiently than deleting in place, which leaves the table fragmented.
Example:
create table foo
(id int, fname text)
with (appendonly=true)
distributed by (id);
Insert some data with duplicates:
insert into foo values (1, 'jon');
insert into foo values (1, 'jon');
insert into foo values (2, 'bill');
insert into foo values (2, 'bill');
insert into foo values (3, 'sue');
insert into foo values (4, 'ted');
insert into foo values (4, 'ted');
insert into foo values (4, 'ted');
insert into foo values (4, 'ted');
Create a new version of the table without the duplicates:
create table foo_new with (appendonly=true) as
select id, fname
from (
select row_number() over (partition by id) as row_num, id, fname
from foo
) as sub
where sub.row_num = 1
distributed by (id);
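Before renaming anything, it may be worth checking the result. A minimal sketch against the foo and foo_new tables above; with the sample rows inserted earlier, the first query should return no rows and the second should show 9 rows before and 4 after.

```sql
-- Any row returned here is an (id, fname) pair that still appears more than once.
select id, fname, count(*) as copies
from foo_new
group by id, fname
having count(*) > 1;

-- Row counts before and after deduplication.
select (select count(*) from foo)     as rows_before,
       (select count(*) from foo_new) as rows_after;
```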
And now rename the tables:
alter table foo rename to foo_old;
alter table foo_new rename to foo;
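Applied to the table in the question, the same pattern might look like the sketch below. The names my_table_name_new and my_table_name_old are made up for illustration, and distributed randomly is only a placeholder, since the question does not say what the distribution key is. The appendonly storage option from the example above can be added the same way if wanted. The question's where column1='some_value' filter is left out, because the new table has to contain every row, not just the ones being deduplicated. The partition lists all nine columns on the assumption that "true duplicates" means rows identical in every column.

```sql
-- Sketch only: target table names and the distribution policy are assumptions.
create table my_table_name_new as
select column1, column2, column3, column4, column5,
       column6, column7, column8, column9
from (
    select row_number() over (
               partition by column1, column2, column3, column4, column5,
                            column6, column7, column8, column9
           ) as row_num,
           column1, column2, column3, column4, column5,
           column6, column7, column8, column9
    from my_table_name
) as sub
where sub.row_num = 1
distributed randomly;

-- Run both renames in one transaction so they take effect together.
begin;
alter table my_table_name rename to my_table_name_old;
alter table my_table_name_new rename to my_table_name;
commit;
```

Once the new table has been checked, my_table_name_old can be dropped to free its space.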
The second method uses DELETE, but as you'll see, it takes more steps.
First, create a temp table with the IDs you want to delete. Primary keys are typically not enforced in Greenplum, but your data still has a logical PK in columns such as customer_id or product_id. So, find the dups first based on that PK.
drop table if exists foo_pk_delete;
create temporary table foo_pk_delete with (appendonly=true) as
select id
from foo
group by id
having count(*) > 1
distributed by (id);
Next, get the entire row for each duplicated ID, keeping only one copy of it.
drop table if exists foo_dedup;
create temporary table foo_dedup with (appendonly=true) as
select id, fname
from (
select row_number() over (partition by f.id) as row_num, f.id, f.fname
from foo f
join foo_pk_delete fd on f.id = fd.id
) as sub
where sub.row_num = 1
distributed by (id);
Now you can delete the duplicates:
delete
from foo f
using foo_pk_delete fk
where f.id = fk.id;
And then you can insert the deduplicated data back into the table.
insert into foo (id, fname)
select id, fname from foo_dedup;
You'll want to vacuum the table afterwards to reclaim the space left by the deleted rows.
vacuum foo;
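A hedged variation on the steps above: running the delete and the re-insert in one transaction means a failure between them cannot leave foo without the rows that were deleted. The check query in the middle is an addition for illustration; with the sample data it should return no rows. If it does return rows, issue rollback instead of commit.

```sql
begin;

-- Remove every copy of each duplicated id.
delete
from foo f
using foo_pk_delete fk
where f.id = fk.id;

-- Put back exactly one copy per id.
insert into foo (id, fname)
select id, fname from foo_dedup;

-- Sanity check: any row returned is an id that is still duplicated.
select id, count(*) as copies
from foo
group by id
having count(*) > 1;

commit;

-- VACUUM cannot run inside a transaction block, so it comes after the commit.
vacuum foo;
```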
answered Mar 27 at 19:47 by Jon Roberts