Some Notes on Asynchrony in Scala

2017-12-07 12:18:53 Source: oschina Author: Littlebox


Asynchrony is everywhere and it subsumes concurrency. This article explains what asynchronous processing is and its challenges.


Table of Contents

1. Introduction
2. The Big Illusion
3. Callback Hell
3.1. Sequencing (Purgatory of Side-effects)
3.2. Parallelism (Limbo of Nondeterminism)
3.3. Recursivity (Wrath of StackOverflow)
4. Futures and Promises
4.1. Sequencing
4.2. Parallelism
4.3. Recursivity
4.4. Performance Considerations
5. Task, Scala's IO Monad
5.1. Sequencing
5.2. Parallelism
5.3. Recursivity
6. Functional Programming and Type-classes
6.1. Monad (Sequencing and Recursivity)
6.2. Applicative (Parallelism)
6.3. Can We Define a Type-class for Async Evaluation?
7. Picking the Right Tool

1. Introduction

As a concept, asynchrony is more general than multithreading, although some people confuse the two. If you're looking for a relationship, you could say:


Multithreading <: Asynchrony

We can represent asynchronous computations with a type:


type Async[A] = (Try[A] => Unit) => Unit

If it looks ugly with those Unit return types, that's because asynchrony is ugly. An asynchronous computation is any task, thread, process, or node somewhere on the network that:

executes outside of your program's main flow or, from the point of view of the caller, doesn't execute on the current call-stack
receives a callback that will get called once the result is finished processing
provides no guarantee about when the result is signaled, and no guarantee that a result will be signaled at all
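
To make the shape of that type concrete, here is a minimal sketch of a value that fits it. The names AsyncShapeSketch and divideAsync are made up for illustration, and running the work on Scala's global execution context is an assumption; any thread, process or remote node could play the same role.

```scala
import scala.concurrent.ExecutionContext.global
import scala.util.Try

object AsyncShapeSketch {
  // The type from the text: takes a callback, returns nothing.
  type Async[A] = (Try[A] => Unit) => Unit

  // Hypothetical helper: computes `n / d` off the caller's call-stack and
  // signals the outcome, success or failure, through the callback.
  def divideAsync(n: Int, d: Int): Async[Int] =
    callback =>
      global.execute(new Runnable {
        // Try captures the ArithmeticException thrown when d == 0
        def run(): Unit = callback(Try(n / d))
      })
}
```

Calling divideAsync(1, 0) with a callback should hand that callback a Failure instead of throwing on the caller's thread, which is why the callback takes a Try[A] rather than a plain A.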

It's important to note that asynchrony subsumes concurrency, but not necessarily multithreading. Remember that in Javascript the majority of all I/O actions (input or output) are asynchronous, and even heavy business logic is made asynchronous (with setTimeout based scheduling) in order to keep the interface responsive. But no kernel-level multithreading is involved, Javascript being an N:1 multithreaded platform.


Introducing asynchrony into your program means you'll have concurrency problems, because you never know when asynchronous computations will be finished. Composing the results of multiple asynchronous computations running at the same time therefore means you have to do synchronization, as you can no longer rely on ordering. And not relying on an order is a recipe for nondeterminism.


Wikipedia says: a nondeterministic algorithm is an algorithm that, even for the same input, can exhibit different behaviors on different runs, as opposed to a deterministic algorithm … A concurrent algorithm can perform differently on different runs due to a race condition.
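
As a small illustration of why composing concurrent results needs synchronization, the sketch below starts several tasks and records both a combined sum and the order in which the tasks happened to finish. CompletionOrderSketch and runAll are hypothetical names, and the latch is there only so the demo can inspect the outcome.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.ExecutionContext.global

object CompletionOrderSketch {
  // Starts `count` tasks on the global pool. Each task adds its index to a
  // shared sum and records when it ran. Returns (sum, completion order).
  def runAll(count: Int): (Int, List[Int]) = {
    val sum = new AtomicInteger(0) // atomic updates, so no writes are lost
    val lock = new Object          // guards `order`
    var order = List.empty[Int]
    val done = new CountDownLatch(count)

    (1 to count).foreach { i =>
      global.execute(new Runnable {
        def run(): Unit = {
          sum.addAndGet(i)
          lock.synchronized { order = i :: order }
          done.countDown()
        }
      })
    }

    done.await() // blocking only so that the demo can look at the outcome
    (sum.get(), order.reverse)
  }
}
```

The sum is the same on every run because the updates are synchronized and addition doesn't care about ordering. The recorded completion order carries no such guarantee and may differ between runs.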




The astute reader may notice that the type in question can be seen everywhere, with some modifications depending on use-case and contract:

in the Observer pattern from the Gang of Four
in Scala's Future, which is defined by its abstract onComplete method
in Java's ExecutorService.submit(Callable)
in Javascript's EventTarget.addEventListener
in Akka actors, although there the given callback is replaced by the sender() reference
in the Monix Task.Async definition
in the Monix Observable and Observer pair
in the Reactive Streams specification
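
Taking Scala's Future from the list above as one case, its onComplete method already accepts a Try[A] => U callback, so adapting it to our type is nearly a one-liner. This is only a sketch: fromFuture and the enclosing object are hypothetical names, and the execution context is passed explicitly so that the returned function can be applied directly to a callback.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.Try

object FutureAsAsyncSketch {
  type Async[A] = (Try[A] => Unit) => Unit

  // Hypothetical adapter: views a Future through the callback-taking type.
  // Future#onComplete registers the callback and returns Unit.
  def fromFuture[A](future: Future[A], ec: ExecutionContext): Async[A] =
    callback => future.onComplete(callback)(ec)
}
```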

What do all of these abstractions have in common? They provide ways to deal with asynchrony, some more successful than others.


2. The Big Illusion

We like to pretend that we can describe functions that can convert asynchronous results to synchronous ones:


def await[A](fa: Async[A]): A

The fact of the matter is that we can't pretend that asynchronous processes are equivalent to normal functions. If you need a history lesson on why we can't pretend that, you only need to look at why CORBA failed.
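
To see what such a function costs, here is a hedged sketch of a blocking await built on a latch. The timeoutMillis parameter is an addition that is not in the signature above: without it, a callback that never fires would block the calling thread forever, which is exactly the kind of guarantee an asynchronous process doesn't give us.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit, TimeoutException}
import java.util.concurrent.atomic.AtomicReference
import scala.util.Try

object BlockingAwaitSketch {
  type Async[A] = (Try[A] => Unit) => Unit

  // Holds the calling thread hostage until the callback fires or the
  // (assumed) timeout elapses. Rethrows the error if the result is a Failure.
  def await[A](fa: Async[A], timeoutMillis: Long): A = {
    val latch = new CountDownLatch(1)
    val result = new AtomicReference[Try[A]]()
    fa { r =>
      result.set(r)
      latch.countDown()
    }
    if (!latch.await(timeoutMillis, TimeUnit.MILLISECONDS))
      throw new TimeoutException(s"No result after $timeoutMillis ms")
    result.get.get
  }
}
```

Even in this form the caller's thread does nothing useful while waiting, and on a platform without threads to block (Javascript, for example) the function cannot be written at all.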


With asynchronous processes we have the following very common fallacies of distributed computing:

The network is reliable
Latency is zero
Bandwidth is infinite
The network is secure
Topology doesn't change
There is one administrator
Transport cost is zero
The network is homogeneous

None of them are true, of course. Assuming otherwise means code gets written with little error handling for network failures, in ignorance of network latency, packet loss and bandwidth limits, and in general in ignorance of the ensuing nondeterminism.


People have tried to cope with this through:

callbacks, callbacks everywhere, equivalent to basically ignoring the problem, as happens in Javascript, which leads to the well known effect of callback hell, paid for with the sweat and blood of programmers that constantly imagine having chosen a different life path
blocking threads, on top of 1:1 (kernel-level) multithreading platforms
first-class continuations, implemented for example by Scheme in call/cc, being the ability to save the execution state at any point and return to that point at a later point in the program
the async/await language extension from C#, also implemented in the scala-async library and in the latest ECMAScript
green threads managed by the runtime, possibly in combination with M:N multithreading, to simulate blocking for asynchronous actions; examples include Golang but also Haskell
the actor model as implemented in Erlang or Akka, or CSP such as in Clojure's core.async or in Golang
monads being used for ordering and composition, such as Haskell's Async type in combination with the IO type, or F# asynchronous workflows, or Scala's Futures and Promises, or the Monix Task or the Scalaz Task, etc.
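
To give a taste of the last item in that list, the sketch below uses Scala's standard Future, where a for-comprehension (flatMap and map underneath) expresses the ordering between two asynchronous steps. The names double and sumOfDoubles are hypothetical and chosen only for this illustration.

```scala
import scala.concurrent.{ExecutionContext, Future}

object MonadicOrderingSketch {
  // Hypothetical asynchronous step: doubles a number on the given context.
  def double(n: Int)(implicit ec: ExecutionContext): Future[Int] =
    Future(n * 2)

  // The second step is only started after the first has produced `a`,
  // so the ordering is explicit in the code instead of nested callbacks.
  def sumOfDoubles(x: Int, y: Int)(implicit ec: ExecutionContext): Future[Int] =
    for {
      a <- double(x)
      b <- double(y)
    } yield a + b
}
```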

If there are so many solutions, that's because none of them is suitable as a general purpose mechanism for dealing with asynchrony. The no silver bullet dilemma is relevant here, with memory management and concurrency being the biggest problems that we face as software developers.


WARNING - personal opinion and rant: People like to boast about M:N platforms like Golang, however I prefer 1:1 multithreaded platforms, like the JVM or dotNET.

Because you can build M:N multithreading on top of 1:1 given enough expressiveness in the programming language (e.g. Scala's Futures and Promises, Task, Clojure's core.async, etc), but if that M:N runtime starts being unsuitable for your usecase, then you can't fix it or replace it without replacing the platform. And yes, most M:N platforms are broken in one way or another.

Indeed learning about all the possible solutions and making choices is freaking painful, but it is much less painful than making uninformed choices, with the TOOWTDI and "worse is better" mentalities being in this case actively harmful. People complaining about the difficulty of learning a new and expressive language like Scala or Haskell are missing the point, because if they have to deal with concurrency, then learning a new programming language is going to be the least of their problems. I know people that have quit the software industry because of the shift to concurrency.


3. Callback Hell

Let's build an artificial example made to illustrate our challenges. Say we need to initiate two asynchronous processes and combine their result.


First let's define a function that executes stuff asynchronously:


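import scala.concurrent.ExecutionContext.global

// Simplified for this example: errors are ignored,
// so the callback receives a plain A instead of a Try[A]
type Async[A] = (A => Unit) => Unit

def timesTwo(n: Int): Async[Int] =
  onFinish => {
    // Run the computation on the global thread pool,
    // then signal the result through the callback
    global.execute(new Runnable {
      def run(): Unit = {
        val result = n * 2
        onFinish(result)
      }
    })
  }

// Usage
timesTwo(20) { result => println(s"Result: $result") }
//=> Result: 40

3.1. Sequencing (Purgatory of Side-effects)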

Let's combine two asynchronous results, with the execution happening one after another, in a neat sequence:


def timesFour(n: Int): Async[Int] =
  onFinish => {
    timesTwo(n) { a =>
      timesTwo(n) { b =>
        // Combining the two results
        onFinish(a + b)
      }
    }
  }

// Usage
timesFour(20) { result => println(s"Result: $result") }
//=> Result: 80

Looks simple now, but we are only combining two results, one after another.


The big problem however is that asynchrony infects everything it touches. Let's assume for the sake of argument that we start with a pure function:


def timesFour(n: Int): Int = n * 4

But then your enterprise architect, after hearing about these Enterprise JavaBeans and a lap dance, decides that you should depend on this asynchronous timesTwo function. And now our timesFour implementation changes from a pure mathematical function to a side-effectful one and we have no choice in the matter. And without a well grown Async type, we are forced to deal with side-effectful callbacks for the whole pipeline. And blocking for the result won't help, as you're just hiding the problem, see section 2 for why.
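
To hint at what a more grown-up Async type might offer, here is a minimal sketch of map and flatMap for the simplified callback type used in this section. The AsyncCombinatorsSketch object and both helpers are illustrative assumptions rather than part of any library, and like the examples above they ignore errors entirely.

```scala
import scala.concurrent.ExecutionContext.global

object AsyncCombinatorsSketch {
  type Async[A] = (A => Unit) => Unit

  // Same shape as the timesTwo example in this section
  def timesTwo(n: Int): Async[Int] =
    onFinish => global.execute(new Runnable {
      def run(): Unit = onFinish(n * 2)
    })

  // Transforms the eventual result without touching the callback plumbing
  def map[A, B](fa: Async[A])(f: A => B): Async[B] =
    onFinish => fa(a => onFinish(f(a)))

  // Sequences two computations: the second starts when the first signals
  def flatMap[A, B](fa: Async[A])(f: A => Async[B]): Async[B] =
    onFinish => fa(a => f(a)(onFinish))

  // timesFour again, with the nesting hidden inside the helpers
  def timesFour(n: Int): Async[Int] =
    flatMap(timesTwo(n))(a => map(timesTwo(n))(b => a + b))
}
```

The callbacks haven't gone away and the side effects are still there, but the sequencing logic now lives in one place instead of being repeated at every call site.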


But wait, things are about to get worse
