Can I have tasks under one DAG with different start dates in Airflow?


I have a DAG which runs two tasks: A and B.



Instead of specifying the start_date at the DAG level, I have added it as an attribute on the operators (I am using a PythonOperator in this case) and removed it from the DAG dictionary. Both tasks run daily.



The start_date for A is 2013-01-01 and the start_date for B is 2015-01-01. My problem is that Airflow runs task A for 16 days starting from 2013-01-01 (I guess because I have left the default dag_concurrency = 16 in my airflow.cfg) and then stops. The DAG runs stay in the running state and the task instances for B have no status.
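
Roughly, the setup described above looks like the following sketch (task ids, the DAG name, and the callables are illustrative, not the actual code):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    # No start_date at the DAG level; each operator carries its own.
    dag = DAG("my_dag", schedule_interval="@daily")

    task_a = PythonOperator(
        task_id="task_a",
        python_callable=lambda: print("running A"),
        start_date=datetime(2013, 1, 1),
        dag=dag,
    )

    task_b = PythonOperator(
        task_id="task_b",
        python_callable=lambda: print("running B"),
        start_date=datetime(2015, 1, 1),
        dag=dag,
    )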



Clearly I am doing something wrong. I could simply set the start_date at the DAG level and have B run from the start_date of A, but that's not what I want to do.



Alternatively, I could split them into separate DAGs, but again, that's not how I want to monitor them.



Is there a way to have a DAG with multiple tasks, each having its own start_date? If so, how do I do this?



UPDATE:



I know that a ShortCircuitOperator could be added, but this only seems to work for a flow of dependent tasks where there is a downstream task. In my case, A is independent of B.










  • How about SubDagOperator?

    – RyanTheCoder
    Mar 25 at 1:14











  • How can I use it to achieve this?

    – Newskooler
    Mar 25 at 1:50











  • Can you elaborate on your use case? If the tasks are completely idempotent, just make two DAGs.

    – dorvak
    Mar 25 at 8:13











  • They are completely idempotent, but it makes logical sense to group them as they are very similar.

    – Newskooler
    Mar 25 at 12:03

















1 Answer
Use a BranchPythonOperator and, in that task, check whether your execution_date >= '2015-01-01'. If it is, it should execute task B; if not, it should execute a dummy task.






However, I would recommend using a separate DAG.



Documentation on branching: https://airflow.readthedocs.io/en/1.10.2/concepts.html#branching
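
A minimal sketch of what this answer describes, assuming Airflow 1.10.x (task ids, the DAG name, and the callable name are illustrative):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator
    from airflow.operators.python_operator import BranchPythonOperator, PythonOperator


    def choose_branch(**context):
        # "ds" is the execution date as a YYYY-MM-DD string; ISO dates compare correctly as strings.
        if context["ds"] >= "2015-01-01":
            return "task_b"
        return "skip_b"


    with DAG("my_dag", start_date=datetime(2013, 1, 1), schedule_interval="@daily") as dag:
        task_a = PythonOperator(task_id="task_a", python_callable=lambda: print("running A"))

        branch = BranchPythonOperator(
            task_id="check_b_start_date",
            python_callable=choose_branch,
            provide_context=True,
        )
        task_b = PythonOperator(task_id="task_b", python_callable=lambda: print("running B"))
        skip_b = DummyOperator(task_id="skip_b")

        # Only the branch returned by choose_branch runs; the other is marked as skipped.
        branch >> task_b
        branch >> skip_b

For each run, the branch that is not chosen should show up as skipped in the UI.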






  • Adding a dummy task does not sound like a very good design. Could you please provide an example and also advise whether this is indeed the best way to address such issues? Wouldn't having two DAGs be cleaner?

    – Newskooler
    Mar 25 at 12:02












  • After reading about this operator, it does not do what I wish it did, since it branches; so when I check the UI, the job will not be "skipped".

    – Newskooler
    Mar 25 at 12:11











  • @Newskooler I would definitely recommend using a separate DAG, but I read in the comments below the question that you are looking for a solution other than separating it into another DAG. Also, have a look at airflow.readthedocs.io/en/1.10.2/concepts.html#branching, which explains branching in more detail. And it would show in the UI if your task is skipped. I have updated my answer as well.

    – kaxil
    Mar 25 at 12:43












  • I have about 750k tasks with different start dates. In this case, would you still have them as separate DAGs?

    – Newskooler
    Mar 25 at 13:28











  • If you have 750k tasks, I am sure you are generating them dynamically; if so, use BranchPythonOperator. I would have separated them into different DAGs based on logically dependent groups (see the sketch below). If that wasn't possible, I would have used BranchPythonOperator so that I can see when a task was skipped or ran.

    – kaxil
    Mar 25 at 19:37
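
For reference, a hedged sketch of the kind of dynamic, per-group DAG generation suggested in the last comment (the TASK_GROUPS structure, group names, and dates are purely illustrative):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator


    def run_task(name):
        print("running {}".format(name))


    # Purely illustrative grouping: each logically related group becomes its own DAG,
    # and every task in a group inherits that group's start date.
    TASK_GROUPS = {
        "group_2013": {"start_date": datetime(2013, 1, 1), "tasks": ["a", "b"]},
        "group_2015": {"start_date": datetime(2015, 1, 1), "tasks": ["c", "d"]},
    }

    for group_name, spec in TASK_GROUPS.items():
        dag = DAG(
            dag_id="my_pipeline_{}".format(group_name),
            start_date=spec["start_date"],
            schedule_interval="@daily",
        )
        for task_name in spec["tasks"]:
            PythonOperator(
                task_id="run_{}".format(task_name),
                python_callable=run_task,
                op_kwargs={"name": task_name},
                dag=dag,
            )
        # The scheduler discovers DAGs by scanning module-level globals, so register each one.
        globals()[dag.dag_id] = dag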










