Async Python Is Secretly Deterministic - Comments

Async Python Is Secretly Deterministic

lexicality

> This makes it possible to write simple code that’s both concurrent and safe.

Yeah, great, my hello world program is deterministic.

What happens when you introduce I/O? Is every network call deterministic? Can you depend on reading a file taking the same amount of time and being woken up by the scheduler in the same order every time?

KraftyOne

That's the cool thing about this behavior--it doesn't matter how complex your program is, your async functions start in the same order they're called (though after that, they may interleave and finish in any order).

PufPufPuf

This is about durable execution -- being able to resume execution "from the middle", which is often done by executing from the beginning but skipping external calls. Second time around, the I/O is exactly replayed from stored values, and the "deterministic" part only refers to the async scheduler which behaves the same as long as the results are the same.

Coincidentally I have been experimenting with something very similar in JavaScript in the past and there the scheduler also has the same property.

arn3n

While not production ready, I’ve been happily surprised at this functionality when building with it. I love my interpreters to be deterministic, or when random to be explicitly seeded. It makes debugging much easier when I can rerun the same program multiple times and expect identical results.

frizlab

Interestingly I think things that should not be deterministic should actually forced not to be.

Swift for instance will explicitly make iterating on a dictionary not deterministic (by randomizing the iteration), in order to catch weird bugs early if a client relies (knowingly or not) on the specific order the elements of the dictionary are ordered.

12_throw_away

No, determinstic scheduling is not a property of async python.

Yes, the stdlib asyncio event loop does have deterministic scheduling, but that's an implementation detail and I would not rely on it for anything critical. Other event loops - for instance trio [1] - explicitly randomize startup order so that you won't accidentally write code that relies on it.

[1] https://github.com/python-trio/trio/issues/32

KraftyOne

It's been a stable (and documented) behavior of the Python standard library for almost a decade now. It's possible it may change--nothing is ever set in stone--but that would be a large change in Python that would come with plenty of warning and time for adjustment.

StableAlkyne

> but that's an implementation detail

That sounds familiar...

https://stackoverflow.com/questions/39980323/are-dictionarie...

whinvik

Is this guaranteed by the async specification? Or is this just current behavior which could be changed in a future update. Feels like a brittle dependency if its not part of the spec.

KraftyOne

It's documented behavior for the low-level API (e.g. asyncio.call_soon https://docs.python.org/3/library/asyncio-eventloop.html#asy...). More broadly, this has been a stable behavior of the Python standard library for almost a decade now. If it does change, that would be a huge behavioral change that would come with plenty of warning and time for adjustment.

jpollock

That's deterministic dispatch, as soon as it forks or communicates, it is non deterministic again?

Don't you need something like a network clock to get deterministic replay?

It can't use immediate return on replay, or else the order will change.

This makes me twitchy. The dependencies should be better modelled, and idempotency used instead of logging and caching.

lexicality

> This makes it possible to write simple code that’s both concurrent and safe.

Yeah, great, my hello world program is deterministic.

What happens when you introduce I/O? Is every network call deterministic? Can you depend on reading a file taking the same amount of time and being woken up by the scheduler in the same order every time?

KraftyOne

That's the cool thing about this behavior--it doesn't matter how complex your program is, your async functions start in the same order they're called (though after that, they may interleave and finish in any order).

PufPufPuf

This is about durable execution -- being able to resume execution "from the middle", which is often done by executing from the beginning but skipping external calls. Second time around, the I/O is exactly replayed from stored values, and the "deterministic" part only refers to the async scheduler which behaves the same as long as the results are the same.

Coincidentally I have been experimenting with something very similar in JavaScript in the past and there the scheduler also has the same property.

arn3n

While not production ready, I’ve been happily surprised at this functionality when building with it. I love my interpreters to be deterministic, or when random to be explicitly seeded. It makes debugging much easier when I can rerun the same program multiple times and expect identical results.

frizlab

Interestingly I think things that should not be deterministic should actually forced not to be.

Swift for instance will explicitly make iterating on a dictionary not deterministic (by randomizing the iteration), in order to catch weird bugs early if a client relies (knowingly or not) on the specific order the elements of the dictionary are ordered.

12_throw_away

No, determinstic scheduling is not a property of async python.

Yes, the stdlib asyncio event loop does have deterministic scheduling, but that's an implementation detail and I would not rely on it for anything critical. Other event loops - for instance trio [1] - explicitly randomize startup order so that you won't accidentally write code that relies on it.

[1] https://github.com/python-trio/trio/issues/32

KraftyOne

It's been a stable (and documented) behavior of the Python standard library for almost a decade now. It's possible it may change--nothing is ever set in stone--but that would be a large change in Python that would come with plenty of warning and time for adjustment.

StableAlkyne

> but that's an implementation detail

That sounds familiar...

https://stackoverflow.com/questions/39980323/are-dictionarie...

whinvik

Is this guaranteed by the async specification? Or is this just current behavior which could be changed in a future update. Feels like a brittle dependency if its not part of the spec.

KraftyOne

It's documented behavior for the low-level API (e.g. asyncio.call_soon https://docs.python.org/3/library/asyncio-eventloop.html#asy...). More broadly, this has been a stable behavior of the Python standard library for almost a decade now. If it does change, that would be a huge behavioral change that would come with plenty of warning and time for adjustment.

jpollock

That's deterministic dispatch, as soon as it forks or communicates, it is non deterministic again?

Don't you need something like a network clock to get deterministic replay?

It can't use immediate return on replay, or else the order will change.

This makes me twitchy. The dependencies should be better modelled, and idempotency used instead of logging and caching.