Count number of episodes that has a hash
I would like some help with a SQL query. I'm using SQLAlchemy, but I don't even understand how I can express the query in raw SQL.
I'm computing a perceptual hash (pHash) of every frame of every video in a season and adding the hashes to the database.
My goal is to find the intros in the videos by checking for frames that recur across episodes.
My table looks like:
id | tvdbid | hash  | season | episode | offset
---+--------+-------+--------+---------+-------
1  | 1337   | a1a1a | 1      | 1       | 42
2  | 1337   | a1a1a | 1      | 1       | 68
3  | 1337   | a1a1b | 1      | 2       | 92
4  | 1337   | a1a1a | 1      | 2       | 116
5  | 1337   | a1a1a | 1      | 3       | 42
6  | 1337   | a1a1a | 1      | 3       | 42
The result I'm looking for is a list of rows where the hash appears in at least n episodes (a hash should count only once per episode, however many times it occurs there) and the rows share the same tvdbid and season number.
At the moment I'm doing:
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Hashes(Base):
    __tablename__ = 'hashes'

    id = sa.Column(sa.Integer, primary_key=True)
    season = sa.Column(sa.Integer)
    episode = sa.Column(sa.Integer)
    tvdbid = sa.Column(sa.Text(length=100))
    hash = sa.Column(sa.Text(length=16))
    offset = sa.Column(sa.Integer)


h = Hashes.__table__


async def some_web_request(request):
    # I need to use raw sql or core as the db library requires it.
    # my cli tool uses a sync method to insert the rows in the db.
    query = h.select().where(sa.and_(
        h.c.tvdbid == request.path_params['tvdbid'],
        h.c.season == request.path_params['season'])).group_by('hash', 'episode')
    result = await DB.fetch_all(query)
    return result
This seems to work, but it isn't exactly what I want, so I have to clean up the result in Python. That will not be viable in the long run, because the table will have between 5 and 500 million rows.
My current workaround:
from collections import defaultdict


def clean_up(result):
    # Collect the set of episodes in which each hash occurs.
    d = defaultdict(set)
    for row in result:
        d[row.hash].add(row.episode)

    final_result = []
    for k, v in d.items():
        if len(v) > 4:  # 4 is the number of episodes.
            final_result.append(k)
    return final_result
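The grouping that `clean_up` performs in Python can probably be pushed into the database with `GROUP BY` and `HAVING COUNT(DISTINCT episode)`. The following is only a minimal sketch under stated assumptions: it uses the standard-library `sqlite3` module with an in-memory copy of the sample table, purely so that it runs on its own, and the names `make_demo_db`, `hashes_in_min_episodes` and `min_episodes` are made up for this illustration. The real database and driver may need a different placeholder style.

```python
import sqlite3

# Sample rows copied from the table in the question:
# (id, tvdbid, hash, season, episode, offset)
SAMPLE_ROWS = [
    (1, "1337", "a1a1a", 1, 1, 42),
    (2, "1337", "a1a1a", 1, 1, 68),
    (3, "1337", "a1a1b", 1, 2, 92),
    (4, "1337", "a1a1a", 1, 2, 116),
    (5, "1337", "a1a1a", 1, 3, 42),
    (6, "1337", "a1a1a", 1, 3, 42),
]

# COUNT(DISTINCT episode) counts each episode once per hash, no matter
# how many frames of that episode share the hash.
QUERY = """
SELECT hash
FROM hashes
WHERE tvdbid = ? AND season = ?
GROUP BY hash
HAVING COUNT(DISTINCT episode) >= ?
ORDER BY hash
"""


def make_demo_db():
    """Create an in-memory SQLite database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hashes (id INTEGER PRIMARY KEY, tvdbid TEXT, hash TEXT, "
        'season INTEGER, episode INTEGER, "offset" INTEGER)'
    )
    conn.executemany("INSERT INTO hashes VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def hashes_in_min_episodes(conn, tvdbid, season, min_episodes):
    """Return the hashes that occur in at least `min_episodes` distinct episodes."""
    rows = conn.execute(QUERY, (tvdbid, season, min_episodes)).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    print(hashes_in_min_episodes(conn, "1337", 1, 2))  # ['a1a1a']
```

With this shape only one row per qualifying hash leaves the database, instead of every grouped row being filtered afterwards in Python.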
The desired output should have been:
id | tvdbid | hash  | season | episode | offset
---+--------+-------+--------+---------+-------
1  | 1337   | a1a1a | 1      | 1       | 42
as the hash needs to be present in at least 50% of the episodes.
Or it could simply be a1a1a; I don't really need the entire rows right now. (They will be needed later to check for recaps etc.)
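The "at least 50% of the episodes" rule and the single representative row might be expressed as below. This is a hedged sketch, not a tested solution for the real table. It assumes that the total number of episodes is the number of distinct `episode` values stored for that tvdbid and season, that "at least 50%" means twice the per-hash episode count is at least that total, and that the row with the lowest `id` is an acceptable representative. The names `build_sample_db` and `intro_candidate_rows` are invented for this example, and `sqlite3` is used only so the sketch is self-contained.

```python
import sqlite3

# Sample rows copied from the table in the question:
# (id, tvdbid, hash, season, episode, offset)
SAMPLE_ROWS = [
    (1, "1337", "a1a1a", 1, 1, 42),
    (2, "1337", "a1a1a", 1, 1, 68),
    (3, "1337", "a1a1b", 1, 2, 92),
    (4, "1337", "a1a1a", 1, 2, 116),
    (5, "1337", "a1a1a", 1, 3, 42),
    (6, "1337", "a1a1a", 1, 3, 42),
]

# Inner query: one MIN(id) per hash that occurs in at least half of the
# distinct episodes of the season. Outer query: fetch those full rows.
QUERY = """
SELECT h.id, h.tvdbid, h.hash, h.season, h.episode, h."offset"
FROM hashes AS h
JOIN (
    SELECT MIN(id) AS first_id
    FROM hashes
    WHERE tvdbid = :tvdbid AND season = :season
    GROUP BY hash
    HAVING COUNT(DISTINCT episode) * 2 >= (
        SELECT COUNT(DISTINCT episode)
        FROM hashes
        WHERE tvdbid = :tvdbid AND season = :season
    )
) AS first_rows ON first_rows.first_id = h.id
ORDER BY h.id
"""


def build_sample_db():
    """Create an in-memory SQLite database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hashes (id INTEGER PRIMARY KEY, tvdbid TEXT, hash TEXT, "
        'season INTEGER, episode INTEGER, "offset" INTEGER)'
    )
    conn.executemany("INSERT INTO hashes VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def intro_candidate_rows(conn, tvdbid, season):
    """Return one full row per hash seen in at least 50% of the episodes."""
    params = {"tvdbid": tvdbid, "season": season}
    return conn.execute(QUERY, params).fetchall()


if __name__ == "__main__":
    for row in intro_candidate_rows(build_sample_db(), "1337", 1):
        print(row)  # (1, '1337', 'a1a1a', 1, 1, 42)
```

On the sample data there are three distinct episodes; a1a1a occurs in all three and a1a1b in only one, so only the first a1a1a row is returned.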
python sqlalchemy
asked Mar 22 at 14:58 by steffen fredriksen
edited Mar 22 at 15:35
I'm sorry if I didn't understand your data well, but wouldn't it be enough to just `SELECT DISTINCT`? Or rather make a `SELECT DISTINCT` out of your `SELECT` results?
– reportgunner
Mar 22 at 15:00
Could you please update the question by adding some more rows of sample data and the desired result? That would help in understanding the problem better.
– Haleemur Ali
Mar 22 at 15:12
I have updated the question to explain what I'm trying to do, with more sample data and the desired output. Thanks!
– steffen fredriksen
Mar 22 at 15:36