-
Notifications
You must be signed in to change notification settings - Fork 46
Proof of concept to replace the HTTP server with WKURLSchemeHandler #160
base: develop
Are you sure you want to change the base?
Conversation
ifNotCancelled { | ||
isCancelled = true | ||
} | ||
assert(Thread.isMainThread) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks a bit like a hack :) meaning, there's no reason from this point of view to demand that cancel
must be invoked on the main thread, since there's no UI logic in the body. From my (very limited!) understanding of the context this is in, this is maybe because there are multithreading synchronization requirements here? If so I think it would be better to use our own queue without forcing ourself into the main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially implemented my own serial queue to synchronise the access to isCancelled
, but unfortunately it triggered deadlocks. This is actually the main reason I raised this PR to Apple's engineer.
Basically it goes like this:
- The start delegate is called from the main thread, so we need to start a background task to return the data otherwise for large resources the UI will freeze.
func webView(_ webView: WKWebView, start urlSchemeTask: WKURLSchemeTask) { - In the background task, we call multiple times
didReceive(Data)
on the task, until we exhaust the resource.
self.withTask { $0.didReceive(response) } - iOS can at any moment cancel the task, it does it also from the main thread.
func webView(_ webView: WKWebView, stop urlSchemeTask: WKURLSchemeTask) {
Now the trouble is, if we call didReceive()
(or any other API) on a cancelled task, the app crashes with a low-level Objective-C exception.
So like you suggested I tried to fix this by protecting the access to isCancelled
with a serial queue, but got a deadlock because:
- The request to cancel a task is called from the main thread.
- The task's
didReceive(Data)
API (which is always called from my serial queue) is also synchronously calling to the main thread...
At this point, we agreed with Apple's engineer that something was fishy in the WKWebView
behavior and he asked me to post a bug report on WebKit's issue tracker.
This morning though I came up with this hack to resolve the situation without waiting for an hypothetic release: using the main thread as the "serial queue" protecting isCancelled
. This definitely sucks because it depends on undocumented behavior, but I don't know how to solve this differently at this point. I'm open for suggestions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I could fix WKWebView
's behavior, I would either:
- Not crash when calling APIs on a cancelled task, just ignore it.
- Take a
DispatchQueue
as argument that will be used to start/cancel tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, now I understand. Would you mind adding a link to this discussion in the code so that the reasoning behind these choices is not lost once you merge the code?
regarding the serial queue solution, would it be possible to schedule work on the serial queue async
hronously instead of using sync
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, now I understand. Would you mind adding a link to this discussion in the code so that the reasoning behind these choices is not lost once you merge the code?
Sure, good idea.
regarding the serial queue solution, would it be possible to schedule work on the serial queue
async
hronously instead of usingsync
?
The calls to didReceive(Data)
must happen in sequence, otherwise the resources will be mangled in the web view. However, we probably could spawn multiple background tasks in sequence instead of a single background task with a for loop reading the bytes. That would probably solve the deadlock introduced with the custom serial queue.
But taking your other comment into consideration, I wonder if using the main thread as the synchronization queue is not the proper approach here. I considered WKURLSchemeHandler
to be background task, but it is true that it is coming from a UI framework. That explains why its APIs are coming from or going to (synchronously) the main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calls to didReceive(Data) must happen in sequence, otherwise the resources will be mangled in the web view. However, we probably could spawn multiple background tasks in sequence instead of a single background task with a for loop reading the bytes. That would probably solve the deadlock introduced with the custom serial queue.
using the main thread to synchronize all the pieces is clever and it does work, so that's a big +1 and we should probably leave it in the commit history at the very least.
That being said we are doing various work on the main thread that ideally we'd do in a background thread. I don't have a clear idea about how expensive that work can be: if it can't be very expensive, probably we could leave it on the main thread as-is and keep it simple.
If it can be expensive, and since there are pieces of work that have to be synchronous and sequential, if we can rewrite this code in a way that we can post the chunks of work we need asynchronously to the serial queue, this would still preserve the sequential order and avoid deadlocks. One other equivalent approach could be to use NSOperations and chain dependencies explicitly but that's probably overkill.
To summarize: using sync
related to the main thread makes me nervous 😬.
My understanding of WKURLSchemeHandler
(very small, I have zero experience with it) is that yes it's part of a UI framework, but I wouldn't draw the conclusion that everything implicitly and reliably syncs on the main thread. I could be wrong though: the docs of that protocol are vague! We could ask a quick question at WWDC maybe in the lounges about whether we can expect its methods to reliably be called on the main thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That being said we are doing various work on the main thread that ideally we'd do in a background thread. I don't have a clear idea about how expensive that work can be: if it can't be very expensive, probably we could leave it on the main thread as-is and keep it simple.
I think we're fine, there's no work done on the main thread except when calling back to WKURLSchemeHandler
APIs (which is anyway calling to the main thread from my observations).
Using the main thread as the synchronization queue means that we never have to explicitly wait for something from the main thread.
If you look at the two delegate APIs (which are called from the main thread), they are not waiting on any queue:
r2-navigator-swift/r2-navigator-swift/Toolkit/WebViewResourceHandler.swift
Lines 29 to 53 in 6c603de
func webView(_ webView: WKWebView, start urlSchemeTask: WKURLSchemeTask) { | |
guard let url = urlSchemeTask.request.url else { | |
urlSchemeTask.didFailWithError(HandlerError.noURLProvided) | |
return | |
} | |
guard url.scheme == scheme else { | |
urlSchemeTask.didFailWithError(HandlerError.unsupportedScheme(url.scheme)) | |
return | |
} | |
let href = url.absoluteString.removingPrefix(scheme + "://") | |
let resource = publication.get(href) | |
let task = Task(url: url, task: urlSchemeTask, resource: resource) | |
tasks[ObjectIdentifier(urlSchemeTask)] = task | |
task.start() | |
} | |
func webView(_ webView: WKWebView, stop urlSchemeTask: WKURLSchemeTask) { | |
guard let task = tasks[ObjectIdentifier(urlSchemeTask)] else { | |
urlSchemeTask.didFailWithError(HandlerError.taskNotStarted(urlSchemeTask)) | |
return | |
} | |
task.cancel() | |
} |
An improvement I could see is that instead of asserting we are (or not) on the main thread, we could schedule a main thread work when needed. This way if the WKURLSchemeHandler
changes the used queue, it should still work.
r2-navigator-swift/r2-navigator-swift/Toolkit/WebViewResourceHandler.swift
Lines 110 to 126 in 6c603de
func cancel() { | |
assert(Thread.isMainThread) | |
self.isCancelled = true | |
} | |
/// Any API call to WKURLSchemeTask will crash the app if the task is cancelled by a call to: | |
/// `webView(_ webView: WKWebView, stop urlSchemeTask: WKURLSchemeTask)`. | |
/// | |
/// To prevent this, we need to synchronize on the main thread every API calls to the task. | |
private func withTask(callback: (WKURLSchemeTask) -> Void) { | |
assert(!Thread.isMainThread) | |
DispatchQueue.main.sync { | |
if !self.isCancelled { | |
callback(self._task) | |
} | |
} | |
} |
To summarize: using
sync
related to the main thread makes me nervous 😬.
Agreed but I don't think it's the case here. Do you have an example from the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only example I believe is the one in withTask(callback:)
and from your previous implementation with the custom queue isCancelledQueue.sync { ... }
. If those could become async
, that would be better since we'd at least remove the wait, and the order of WKURLSchemeTask api calls would still be preserved. But I think you said previously that that doesn't work for other reasons.
In any case, if we are sure that the work scheduled on the main thread is never going to be too much, I agree this should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only example I believe is the one in
withTask(callback:)
I think the closures are pretty harmless, but I can't guarantee that WKURLSchemeHandler
will never do heavy work. Especially in didReceive(Data)
. For now as we can see in the stacktrace, it is calling on the main thread synchronously anyway:
and from your previous implementation with the custom queue
isCancelledQueue.sync { ... }
.
Yeah this was bad. If we ever go back to having a custom queue, we should do the approach we talked about earlier to replace the for loop.
self.webView = WebView(editingActions: editingActions) | ||
self.contentInset = contentInset | ||
if #available(iOS 11.0, *) { | ||
let scheme = "readium-pub" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this literal be a constant on the super class, since it's also used in EPUBReflowableSpreadView
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part and the one in EPUBReflowableSpreadView
will probably not look like this in the final version. I just set this up quickly to try out the WKURLSchemeHandler
approach.
A first stab at readium/swift-toolkit#117
There are a few issues to address.
Cancelled
WKURLSchemeTask
The scheme handler is called on the main thread, so we must spawn a background task to read the resource and notify back the
WKURLSchemeTask
. However, if we received a cancellation notification for this task, any new call toWKURLSchemeTask
crashes the app (low-level exception). But guarding the access to theisCancelled
property with a serialized queue creates some random deadlocks.Byte range requests
We don't get any byte range information, however the
WKURLSchemeTask
is cancelled before loading the full video content when first loading the page. We can also serve the video progressively so there's no need to load everything in memory at once, however I didn't find any way to start streaming from a particular offset. Need some more testing with large videos to see if random access would work fine.According to this bug report on Mac we get byte range info but not on iOS. This Twitter thread provides more context.