Skip to content
This repository was archived by the owner on Aug 12, 2022. It is now read-only.

Conversation

mickael-menu
Copy link
Member

A first stab at readium/swift-toolkit#117

There are a few issues to address.

Cancelled WKURLSchemeTask

The scheme handler is called on the main thread, so we must spawn a background task to read the resource and notify back the WKURLSchemeTask. However, if we received a cancellation notification for this task, any new call to WKURLSchemeTask crashes the app (low-level exception). But guarding the access to the isCancelled property with a serialized queue creates some random deadlocks.

Byte range requests

We don't get any byte range information, however the WKURLSchemeTask is cancelled before loading the full video content when first loading the page. We can also serve the video progressively so there's no need to load everything in memory at once, however I didn't find any way to start streaming from a particular offset. Need some more testing with large videos to see if random access would work fine.

According to this bug report on Mac we get byte range info but not on iOS. This Twitter thread provides more context.

ifNotCancelled {
isCancelled = true
}
assert(Thread.isMainThread)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a bit like a hack :) meaning, there's no reason from this point of view to demand that cancel must be invoked on the main thread, since there's no UI logic in the body. From my (very limited!) understanding of the context this is in, this is maybe because there are multithreading synchronization requirements here? If so I think it would be better to use our own queue without forcing ourself into the main thread.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially implemented my own serial queue to synchronise the access to isCancelled, but unfortunately it triggered deadlocks. This is actually the main reason I raised this PR to Apple's engineer.

Basically it goes like this:

Now the trouble is, if we call didReceive() (or any other API) on a cancelled task, the app crashes with a low-level Objective-C exception.

Screen Shot 2021-06-09 at 20 40 44

So like you suggested I tried to fix this by protecting the access to isCancelled with a serial queue, but got a deadlock because:

  1. The request to cancel a task is called from the main thread.

Screen Shot 2021-06-09 at 20 45 36

  1. The task's didReceive(Data) API (which is always called from my serial queue) is also synchronously calling to the main thread...

Screen Shot 2021-06-09 at 20 46 05

At this point, we agreed with Apple's engineer that something was fishy in the WKWebView behavior and he asked me to post a bug report on WebKit's issue tracker.

This morning though I came up with this hack to resolve the situation without waiting for an hypothetic release: using the main thread as the "serial queue" protecting isCancelled. This definitely sucks because it depends on undocumented behavior, but I don't know how to solve this differently at this point. I'm open for suggestions!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I could fix WKWebView's behavior, I would either:

  • Not crash when calling APIs on a cancelled task, just ignore it.
  • Take a DispatchQueue as argument that will be used to start/cancel tasks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, now I understand. Would you mind adding a link to this discussion in the code so that the reasoning behind these choices is not lost once you merge the code?

regarding the serial queue solution, would it be possible to schedule work on the serial queue asynchronously instead of using sync?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, now I understand. Would you mind adding a link to this discussion in the code so that the reasoning behind these choices is not lost once you merge the code?

Sure, good idea.

regarding the serial queue solution, would it be possible to schedule work on the serial queue asynchronously instead of using sync?

The calls to didReceive(Data) must happen in sequence, otherwise the resources will be mangled in the web view. However, we probably could spawn multiple background tasks in sequence instead of a single background task with a for loop reading the bytes. That would probably solve the deadlock introduced with the custom serial queue.

But taking your other comment into consideration, I wonder if using the main thread as the synchronization queue is not the proper approach here. I considered WKURLSchemeHandler to be background task, but it is true that it is coming from a UI framework. That explains why its APIs are coming from or going to (synchronously) the main thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calls to didReceive(Data) must happen in sequence, otherwise the resources will be mangled in the web view. However, we probably could spawn multiple background tasks in sequence instead of a single background task with a for loop reading the bytes. That would probably solve the deadlock introduced with the custom serial queue.

using the main thread to synchronize all the pieces is clever and it does work, so that's a big +1 and we should probably leave it in the commit history at the very least.

That being said we are doing various work on the main thread that ideally we'd do in a background thread. I don't have a clear idea about how expensive that work can be: if it can't be very expensive, probably we could leave it on the main thread as-is and keep it simple.

If it can be expensive, and since there are pieces of work that have to be synchronous and sequential, if we can rewrite this code in a way that we can post the chunks of work we need asynchronously to the serial queue, this would still preserve the sequential order and avoid deadlocks. One other equivalent approach could be to use NSOperations and chain dependencies explicitly but that's probably overkill.

To summarize: using sync related to the main thread makes me nervous 😬.

My understanding of WKURLSchemeHandler (very small, I have zero experience with it) is that yes it's part of a UI framework, but I wouldn't draw the conclusion that everything implicitly and reliably syncs on the main thread. I could be wrong though: the docs of that protocol are vague! We could ask a quick question at WWDC maybe in the lounges about whether we can expect its methods to reliably be called on the main thread?

Copy link
Member Author

@mickael-menu mickael-menu Jun 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said we are doing various work on the main thread that ideally we'd do in a background thread. I don't have a clear idea about how expensive that work can be: if it can't be very expensive, probably we could leave it on the main thread as-is and keep it simple.

I think we're fine, there's no work done on the main thread except when calling back to WKURLSchemeHandler APIs (which is anyway calling to the main thread from my observations).

Using the main thread as the synchronization queue means that we never have to explicitly wait for something from the main thread.

If you look at the two delegate APIs (which are called from the main thread), they are not waiting on any queue:

func webView(_ webView: WKWebView, start urlSchemeTask: WKURLSchemeTask) {
guard let url = urlSchemeTask.request.url else {
urlSchemeTask.didFailWithError(HandlerError.noURLProvided)
return
}
guard url.scheme == scheme else {
urlSchemeTask.didFailWithError(HandlerError.unsupportedScheme(url.scheme))
return
}
let href = url.absoluteString.removingPrefix(scheme + "://")
let resource = publication.get(href)
let task = Task(url: url, task: urlSchemeTask, resource: resource)
tasks[ObjectIdentifier(urlSchemeTask)] = task
task.start()
}
func webView(_ webView: WKWebView, stop urlSchemeTask: WKURLSchemeTask) {
guard let task = tasks[ObjectIdentifier(urlSchemeTask)] else {
urlSchemeTask.didFailWithError(HandlerError.taskNotStarted(urlSchemeTask))
return
}
task.cancel()
}

An improvement I could see is that instead of asserting we are (or not) on the main thread, we could schedule a main thread work when needed. This way if the WKURLSchemeHandler changes the used queue, it should still work.

func cancel() {
assert(Thread.isMainThread)
self.isCancelled = true
}
/// Any API call to WKURLSchemeTask will crash the app if the task is cancelled by a call to:
/// `webView(_ webView: WKWebView, stop urlSchemeTask: WKURLSchemeTask)`.
///
/// To prevent this, we need to synchronize on the main thread every API calls to the task.
private func withTask(callback: (WKURLSchemeTask) -> Void) {
assert(!Thread.isMainThread)
DispatchQueue.main.sync {
if !self.isCancelled {
callback(self._task)
}
}
}

To summarize: using sync related to the main thread makes me nervous 😬.

Agreed but I don't think it's the case here. Do you have an example from the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only example I believe is the one in withTask(callback:) and from your previous implementation with the custom queue isCancelledQueue.sync { ... }. If those could become async, that would be better since we'd at least remove the wait, and the order of WKURLSchemeTask api calls would still be preserved. But I think you said previously that that doesn't work for other reasons.

In any case, if we are sure that the work scheduled on the main thread is never going to be too much, I agree this should be fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only example I believe is the one in withTask(callback:)

I think the closures are pretty harmless, but I can't guarantee that WKURLSchemeHandler will never do heavy work. Especially in didReceive(Data). For now as we can see in the stacktrace, it is calling on the main thread synchronously anyway:

and from your previous implementation with the custom queue isCancelledQueue.sync { ... }.

Yeah this was bad. If we ever go back to having a custom queue, we should do the approach we talked about earlier to replace the for loop.

self.webView = WebView(editingActions: editingActions)
self.contentInset = contentInset
if #available(iOS 11.0, *) {
let scheme = "readium-pub"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this literal be a constant on the super class, since it's also used in EPUBReflowableSpreadView?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part and the one in EPUBReflowableSpreadView will probably not look like this in the final version. I just set this up quickly to try out the WKURLSchemeHandler approach.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants