Another question: does each flight run on a separate instance? If I run 2 flights in parallel, does MD handle write concurrency when it occurs? I mean, if the tasks write to the same table/db.
Hey Patrick, a Flight handles write concurrency only if that logic is written into the pipeline. If two flights run in parallel and modify the same database, you can hit a race condition, the same as in any other parallelized situation. Parallel flights actually happened to me earlier today:
Called MotherDuck (ctrl+o to expand)
⏺ There's now a run 4 on version 4 - run 3 must have finished. Let me check what happened.
Called MotherDuck (ctrl+o to expand)
⏺ Both run 3 AND run 4 are showing RUNNING simultaneously - something triggered a second
concurrent backfill. Let me check run 4's logs to understand the situation.
Called MotherDuck (ctrl+o to expand)
⏺ Good news - the two runs are cooperating. Run 4 correctly skipped everything run 3 already
loaded (1990-2010) and picked up from 2011. It's now through 2018 with 1.87M rows. The
idempotency logic is working exactly as designed.
The key here was that the pipeline had an upsert, so rows would not be duplicated.
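To make the upsert point concrete, here is a minimal sketch of an idempotent load. It is not the actual pipeline: the `observations` table, its columns, and the `load_year` helper are hypothetical, and Python's built-in `sqlite3` stands in for the real database only so the example runs without extra dependencies. It also runs the two loads one after the other, so it shows why a repeated write does not duplicate rows, not true parallel execution. DuckDB documents a similar `INSERT ... ON CONFLICT` clause, but check its docs for the exact syntax.

```python
import sqlite3

# Stand-in for the shared database; sqlite3 is used only so the sketch
# runs without extra dependencies. Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observations (
        year  INTEGER,
        id    INTEGER,
        value REAL,
        PRIMARY KEY (year, id)  -- the upsert needs a key to detect conflicts
    )
    """
)


def load_year(conn, year, rows):
    """Upsert one year's rows; re-running with the same rows adds no duplicates."""
    conn.executemany(
        """
        INSERT INTO observations (year, id, value)
        VALUES (?, ?, ?)
        ON CONFLICT (year, id) DO UPDATE SET value = excluded.value
        """,
        [(year, row_id, value) for row_id, value in rows],
    )
    conn.commit()


def row_count(conn):
    """Return the total number of rows in the table."""
    return conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]


rows_2011 = [(1, 10.0), (2, 20.0)]
load_year(conn, 2011, rows_2011)  # first run loads 2011
load_year(conn, 2011, rows_2011)  # an overlapping second run writes the same keys
print(row_count(conn))  # prints 2: the second load updated rows instead of adding them
```

Without the primary key and the `ON CONFLICT` clause, the second call would either fail or insert two more rows, which is the duplication the upsert prevented in the runs above.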
Ok, thank you. Now our team needs to build something and try it. Since each instance has good computing power, it can probably run most ETL workloads.
Build away! I look forward to seeing what you build.