Script cache experimentation #5059
Conversation
This pull request has been mentioned on openHAB Community. There might be relevant details there: https://community.openhab.org/t/rules-and-concurrency/166577/49
(Force-pushed from 77660c2 to 7456138.)
This is what #4704 was meant for.
IMO, this is kinda unnecessary. People can just store a JSON string if they deem that necessary. Cross-language usage is probably extremely rare, but in such cases JSON can be used, and every language can deal with it because JSON is JSON.
That might be; you didn't really document what it was for, so I assumed it was done as a convenience function. But it doesn't work for thread-safety: only one thread is allowed to modify the cache object while the […]. This was the whole point of the solution I tried to make here: let the cache entries be lockable on an individual basis, while letting the cache continue to operate for unrelated entries.
Yes, they can. But do they? Do they understand why they should? It's also very easy to do this "for them", and the point is that using this cache actually is thread-safe. Because of the serialization/deserialization, the same object is never returned; it's always "a new copy". The same happens when it's put into the cache. As a result, everything operates on its own copy, and there will be no threading issues.

Let's not forget that the current shared cache isn't safe to use for anything other than objects that are in themselves thread-safe, which basically means Java objects written to be thread-safe. Timers will be if #5051 is merged, but not a lot of objects in general are thread-safe. Objects made by any of the "no concurrency" languages can't be thread-safe, since those languages lack the mechanisms to implement thread-safety. The current shared cache is thus in reality only safe for Timers and primitive types, and the reason it's safe for primitive types is that they are inherently immutable in Java: you can never modify a Java primitive, only replace it.

These ideas came as I thought about how to find safe ways to do the things that are unsafe with the current shared cache. The work of implementing this in OH is very minor; the biggest "problem" is that people will keep using the "unsafe" shared cache for things it can't handle for a long time anyway. But, with these other solutions, at least there's a way that can be documented where it is safe to do these things; it's possible to explain how this can be done safely. As of now, all you can really do is say "You shouldn't do that", or suggest that users implement serialization themselves or find some other creative solution using some other signalling mechanism for when they can modify things and when not. And that most likely won't be thread-safe anyway, because it's not easy to ensure thread-safety without the proper tools.
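For illustration, a minimal sketch of that copy semantics from a rule's point of view, using the existing openhab-js `cache.shared` API with manual JSON round-tripping (an automatic serializing cache would just hide the two `JSON` calls):

```javascript
// Store: serialize first, so the cache only ever holds an immutable string.
const state = { counter: 3, lastRoom: 'kitchen' };
cache.shared.put('state', JSON.stringify(state));

// Retrieve: deserialize, which always produces a fresh local copy.
const copy = JSON.parse(cache.shared.get('state'));
copy.counter++;                                   // modifies only the local copy
cache.shared.put('state', JSON.stringify(copy));  // visible to others only when stored back
```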
Good catch! I was copying how it was done in TrackingValueCacheImpl.get, which takes the same lock. So both […]
Won't that cause confusion? "I tried modifying this object, and there were no errors, but why are my changes not "seen" by the others? I thought this was a shared cache."
Sorry, I didn't read your comment well enough, so I answered as if you were talking about the locking cache. To address what you actually wrote: that depends on the documentation too, I'd say. Maybe "cache" is the wrong name; maybe "storage" is more fitting. It would have to be pointed out that changes must be explicitly "put" in the storage, and refreshed with "get" as needed. But, if done like that, it should work well.

This is also how the current "shared cache" works for primitives, which in some places is described as the only thing you can really use it for. They won't update "automatically" if you change them locally, nor will cache updates be reflected locally, unless there is some "real magic" going on with the conversion. That is because they are copied/converted during retrieval and storage, so they aren't actually the same objects. And that, again, is why "they work" with the current shared cache.

The bottom line is that you can't have both: you must either "break the link" to the cached object, or you must use locking in some form. That can either be in the code of the object itself (like a thread-safe Java object), or it must be done using explicit locking, like a `ReentrantLock`.
Yeah, neither of them is "ideal" for a thread-safe cache, but if the functions are quick, it won't hurt... there should probably be a warning somewhere about giving them long-running functions.
I like both ideas. I worry about people causing problems for themselves with the locking, though.

Is there a way for the cache to track when the script that accesses the cache starts running and returns to idle? It could lock and unlock the entry based on that and relieve the user from needing to do this manually. Though that does give less control to the end users in how long the entry remains locked, and I don't know if timers can be tracked in a way that is useful for the cache. We'd need to auto-lock/unlock for timers too. Something to think about, but definitely not required for an MVP PR.

From an interface perspective, are you proposing making two new caches, or augmenting the existing shared cache (there's no reason to touch the private cache, I think)? It looks like these are separate caches. I wonder if it would be easier to understand if we used arguments on the existing caches. I can see arguments either way. Probably best to keep them as separate caches.

Does there need to be a distinction between read access and write access? Is that even feasible? I'm not sure. But to read, you don't necessarily need a lock for any longer than it takes to get the value, unless you need to ensure it doesn't change until you are done with it. Maybe the serialization/deserialization feature could be used in a lot of cases to avoid deadlocks. Though the complexity for the end users keeps growing.
I can't quite answer what is and isn't possible. What I can say is that I think it's quite risky: lots of possibilities that it doesn't unlock because something doesn't happen in the order we planned. It would also partially defeat the purpose, because the idea is that you only lock it while you work on that specific value/object. Once you're out of the "locked block" (that's why the best practice is to always use `try`/`finally`), the lock is released. I also did a little "trick" in this […]
I'm not sure that I'm proposing anything that concrete at this stage. I'm primarily focused on the "challenges" with shared objects and concurrency, and possible ways to handle them. The private cache definitely doesn't need any of this, and works best the way it is. When it comes to putting some of these things into the existing shared cache, or making new ones, I'm not really sure what would be the best design, but I'm afraid that it could be very confusing to users if what they think of as "one cache" is actually several different ones that work entirely differently. But, ultimately, that's a design decision, not so much a technical one.
Java does have a `ReadWriteLock`. So, in reality, I only really use it in situations where things are read a lot and only very rarely written; then it can make sense. Apart from that, I've found that locks aren't as expensive "as many seem to think" if you just think a bit about what you're doing inside the lock. If we're at all worried that this could be too complicated for users, and I think we should be, read/write locks just aren't worth it, IMO.

With locks in general, you never want to hold them longer than you need to. You should organize your logic so that, to the extent possible, you do what you need to do while you hold the lock, in the lock, and then do the rest outside of it. I often make local copies inside the lock, and then do longer operations with the local copies after having released the lock. Those local copies won't be updated if the shared variable changes, but that turns out not to be a problem in many situations.

The main purpose I had in mind with the locking cache was exactly such "shared decision-making". In every situation where you don't need it, I think it would be better not to use locking at all. The serialization/deserialization cache avoids all deadlock potential and is generally easy to use, as long as people realize that their local instance isn't "live" and must be stored after changes and re-read to get the up-to-date state.
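As a sketch of that "copy inside the lock, work outside it" pattern, using the hypothetical script-facing `cache.locking` API from the pseudocode later in this thread (assuming `get()` acquires the entry's lock and the retrieved object stays "live"):

```javascript
let snapshot;
const shared = cache.locking.get('readings', () => ({ values: [] })); // acquires the entry lock
try {
  // Do only the cheap part under the lock: take a local copy.
  snapshot = [...shared.values];
  shared.values = [];
} finally {
  cache.locking.unlock('readings'); // release as early as possible
}
// The long-running part operates on the local copy, outside the lock.
snapshot.forEach((v) => console.log('processing', v));
```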
Then leaving the responsibility to the end user is downright frightening.
Relying on best practice from a bunch of enthusiastic but inexperienced programmers (for a huge portion of OH's user base, OH is their only exposure to programming at all) is going to guarantee disaster for most of them. That's why it would be better if we can implement best practice on their behalf. Not to mention that there is no guarantee that a Rules DSL rule will always execute a `finally` block.

If we have to depend on the end users locking and unlocking, I foresee a return to the bad old days of OH 1/2, where rules were not locked and users had to use ReentrantLocks inside their rules. It worked some of the time.

We already hammer on users that rules should run fast and exit. If you need to wait for something, use a timer. If the locking and unlocking were done on their behalf, all we need to do is continue giving them that same advice we've always been giving them. They will still ignore it, but at least we are not introducing (for most of them) a new concept they have to worry about. That's why I think it's worth it to lock the entry for the whole run of the script that uses it.
I agree; using the return value from unlock() is likely to be misunderstood and to generate a lot of threads on the forum when it's not used or misused. There is no end to the creativity of OH users. Throw in vibe coding, and the strangeness of code people ask for help with has become quite interesting. If we can do it for them, even if it means the lock is usually held a few dozen msec longer than it strictly needs to be, that would greatly increase the likelihood that OH users would be able to successfully use this without having to hit up the forum for help.
Just remember that most OH users have neither the knowledge nor the skill to do that thinking. If we rely on the end users to make good decisions instead of bad ones, a lot of users will fail. Working with locks is hard even for experienced developers.
Yes and no. It's much simpler for the user than it is to try to "handle it for them". But, clearly, it can be and probably would be misunderstood/misused. There would have to be documentation that explained what is essential when using the locking cache.

If we found a waterproof way to unlock any locks held by the script when it ended (which I'm not even sure is doable), there would still be a lot of implications. That script would then have to avoid long-running or potentially blocking operations, even more so things like "sleep", for the rest of the duration of the script. Another problem would be that if a "set of scripts" used two such "locking objects/values", every script would have to acquire the locks in the same order. If you have two values A and B, and two scripts, S and T, where S first used A, then B, while T first used B, then A, a deadlock would occur (see the sketch below).

I think using such a solution would require at least as much "understanding" and education of users as explaining some rules that only apply to a limited block of code. Obviously, issues like the one you mention with Rules DSL, where a `finally` block isn't guaranteed to execute, would also have to be dealt with.
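The A/B ordering hazard sketched in code, again with the hypothetical `cache.locking` API (assuming `get()` blocks until it can acquire the entry's lock):

```javascript
// Script S:
cache.locking.get('A', () => ({})); // S acquires A
cache.locking.get('B', () => ({})); // S blocks here if T already holds B

// Script T, running concurrently:
cache.locking.get('B', () => ({})); // T acquires B
cache.locking.get('A', () => ({})); // T blocks waiting for A: deadlock

// Neither script can proceed. The fix: every script must acquire
// A and B in the same order (and unlock in a finally block).
```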
I think this boils down to a larger problem that many great minds have tried, but failed, to "solve" to this day: the desire to find a way to write concurrency-capable code where the author of the code doesn't have to understand how it works. Most programming languages face this challenge, but the situation remains pretty much the same: the "simpler" languages opt to simply deny their users these things (JavaScript, Python, Ruby etc.), while the more "advanced" languages leave the entire responsibility with the developers. There are some attempts at a "middle road", like Go and Rust, but it doesn't really work: there are solutions with "training wheels", but these always come with a cost. In the end, a language must either offer "the full system", like C, C++, Java etc., or accept limitations in what it can do. It's a tradeoff: less complexity, fewer possibilities, and vice versa.

Programming languages have far more possibilities to "design easier-to-handle mechanisms" than we do here. They can enforce immutability and all kinds of "rules" to avoid certain pitfalls. We have very little such "power" here, since neither Java nor the scripting languages adhere to any "rules" we try to impose. And even with that much better control of the environment and ability to enforce "contracts", programming languages have only marginally managed to do things better, in some situations.

The way it currently "works" isn't better than back when users used locks manually; it only has slightly different problems. With the current shared cache, you can do these things, but not in a thread-safe way, so sometimes this will fail. Maybe not most of the time; that heavily depends on the level of actual concurrency in reality (how often these rules/scripts actually run simultaneously, despite the fact that they theoretically can). But it clearly isn't "better" than using locks manually, because with manual locks it's at least possible to make it work consistently. It might lead to fewer frustrated users on the forum, though, because it's easier to do, and they might not notice when it goes wrong, or at least not understand that this is the reason.

My idea was that maybe the best approach was to discourage doing things that require locks if at all possible, and then offer a solution where it could actually be done "safely" for those that want that ability, despite the complexity. Maybe that's wrong; maybe that's highly subjective and depends on your perspective. I can't really say, other than that, to me, it looks like it might be the best overall approach.
Which is something we already recommend constantly.
I would assume we would be very clear in the docs for this: do not do that. If you have more than one entry in this cache, use a map or array to put them all into one entry so only one lock is acquired. I imagine a warning could be written to the log in this case; cluttering the logs with warnings is very good at changing users' behavior.

The number of users who would need a locking cache in the first place is relatively low (especially once the timer synchronization gets merged). The number who would need to share more than one synchronized value between two different rules is even lower. It is a pretty small population of users that such a limitation would impact.
I'm not asking for that. I'm asking for simplification. I'm asking for a great reduction in the amount that users need to understand and keep in their brain to use this mostly safely. Something where we can give the users three or four "rules" they can follow and end up with reasonable code.
That's not too hard to follow and understand, nor is it too much for users to keep in their mind at the same time. But if they have to lock and unlock manually, we have to double or triple the number of "rules", the number of edge cases explodes, and their overall understanding plummets.
Can't we just make the existing methods perform locking in a more granular way, i.e. lock per key instead of the entire cache? Java's ConcurrentHashMap does note that your supplier/mapper function needs to finish its job quickly.
While recommending against sleeping is generally applicable, since it's not something people need to do, avoiding blocking operations is much harder. There is nothing inherently "wrong" with blocking operations, and it's often not easy to know whether something you call might block. A blocking operation will normally return quickly, until suddenly it doesn't. Any HTTP call, for example, is blocking; in fact, most communication operations are, unless you are using listeners with async callbacks (which makes things much more complicated). I'd say that if you are to "ban" users from using blocking operations, you're significantly restricting what they can do with this.
I just don't understand how this is actually easier to use than an explicit unlock. To me, it seems easier to explain that they can only operate on that variable in the "locked section", and that they should exit that section before doing unrelated or potentially longer-running operations. I was imagining a use case like this:

```javascript
let decisionObject = cache.locking.get('decisionObject', () => ({ counter: 0 }));
try {
  if (decisionObject.counter < 10 && time.ZonedDateTime.now().hour() <= 15) {
    decisionObject.counter++;
  }
} finally {
  decisionObject = cache.locking.unlock('decisionObject');
}
```

However, when looking at the code now, I'm not convinced that this will work. I'd have to test this to be sure how it would play out, but I can't quite understand how they can manage to keep objects "live" through the "translation" to and from Java objects. As far as I understand, a JavaScript object will be converted into a Java `Map` […]. Because, if not, the pattern above wouldn't work. It would require a […]. It's possible to change this, I could add a […]
We can; the question is if we should. My idea has been to make it as simple as possible, so that only those who need locking are exposed to locking. Of course, the regular cache could simply be equipped with "optional" lock/unlock methods.

If the locking cache is separate, they could use it only for the particular purposes where locking is needed, when their focus is on how to do that, and forget about it the rest of the time. It's hard to know, really. I'd also like to add that since this is untested, I don't know if the code I've made behaves as expected.
Avoid != ban
Then you have no experience helping non-coders on the forum to be successful creating moderately complicated rules. But I'm not going to argue about it. You've made up your mind, and I've learned there is no shifting that.
I've never seen one "suppressed" like they are in Rules DSL. But it's certainly possible for the individual add-ons to do that. Hopefully they don't, but 🤷♂️. It's hard to prove a negative.
That's not true, but I do resist if I'm not convinced. As I've said from the start, I'm a bit hesitant about whether it's at all doable; you'd have to track what they do very precisely. But I was contemplating one "possible solution", which I'm still not sure is doable, that effectively tried to track any "locking object/value" that a script came in contact with, and when the script execution was over, unlocked all levels of locks held by that thread on the affected locks. It would allow users to explicitly unlock, but be a "safety net" in case they didn't, so that any locks "left hanging" would automatically be released.
Can you show some pseudocode / examples of how to deal with this locking issue, and why a separate lock/unlock is needed, from the user API perspective?
```javascript
let decisionObject = cache.locking.get('decisionObject', () => ({ counter: 0 }));
try {
  if (decisionObject.counter < 10 && time.ZonedDateTime.now().hour() <= 15) {
    decisionObject.counter++;
  }
} finally {
  decisionObject = cache.locking.unlock('decisionObject');
}
```

This is the pseudocode I gave above. As I also said above, I'm not so sure that it will work like that, but that all depends on exactly how the objects cross the language boundary.

As to why it's needed, it goes to the core of concurrency: everything can change at any moment, so if you evaluate something and then take action, the action won't necessarily be correct, because the evaluation you just made might no longer hold. It's called TOCTOU (time-of-check to time-of-use), and it applies to concurrency in general; it's not particular to OH or scripts in any way.

Some languages have decided to "shield" their users from such dilemmas, like JavaScript (single-threaded only), Python (GIL) and Ruby (GIL). That means that users of these languages never have to deal with this, but it also means that they are severely limited in what they can do. Threads are a great way to compartmentalize tasks that would otherwise be extremely complicated to coordinate. In addition, they are required to utilize more than one CPU core (CPUs typically get more and more cores, leaving the single-threaded languages further and further behind). When using a GIL, you can have multiple threads, but only one of them can do anything at any one time, so using threads is kind of pointless. A GIL wraps all your code in an "invisible lock": even though you don't see it, it's there.

Java isn't like that. It supports full concurrency, luckily, because it would have been impossible to write something like OH in a single-threaded language. Just think of all the different things that go on at the same time; trying to "schedule" all of this to run sequentially would have been a complete nightmare. The slightest "mishap" somewhere would have brought everything to a grinding halt, not to mention that it would have run much slower to begin with. But this is also where the problem comes in: we're trying to combine languages that can't handle concurrency with a system that actually does a lot of things concurrently.

So, I don't know how to answer "why it's needed from a user API perspective" other than this: if you want users to be able to make their independent rules/scripts exchange information or otherwise coordinate, you must address how to do it in a coordinated way. Locks/mutexes are the most obvious way. There are some "alternative approaches", like circular buffers that pass messages back and forth, but as far as I know they are all vastly more complicated to use than locks. My suggestions here are what I've been able to come up with as the easiest way to actually make independent rules/scripts communicate without experiencing "random behavior" as a result of ignoring concurrency.

The risks surrounding concurrency are many. Part of it has to do with CPU caches and the fact that the same variable can have different values in different caches; part of it with one thread reading while another is writing, and thus retrieving a partial or completely invalid value/object. Reordering is also part of the puzzle: CPUs are free to reorder execution as they see fit, so things don't necessarily happen in the order your code is written. Locks and other "concurrency tools" are therefore also used to create barriers to reordering, where the CPU isn't allowed to "reorder across the barrier".
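For concreteness, TOCTOU in miniature with the current shared cache (real `cache.shared` API, no locking involved):

```javascript
// Time of check:
const count = cache.shared.get('count') ?? 0;
if (count < 10) {
  // Another rule can run right here, see the same value, and also pass the check.
  cache.shared.put('count', count + 1); // Time of use: one of the two increments is lost.
}
```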
Thanks for the explanation, and sorry I haven't been following things closely and missed your example. I asked AI about JRuby; this is from the "AI Overview": […]
How about using the existing shared cache implementation, is this the same thing? […]
Yes, careful with listening too much to the ML models 😉 All the JSR223 implementations "can run in parallel" because different Java threads can launch different "scripts". The "problem" is that since "standard Ruby" uses a GIL, the language itself has no mechanisms for dealing with concurrency, and the developers using it have no experience dealing with it. So, I'd say that while the ML answer isn't "directly wrong", it takes the part about mutexes/locks very lightly. This is the exact thing we're trying to deal with here: there is no problem if several JRuby scripts run in parallel as long as they have no interaction with each other or with other threads in Core. This leaves it up to us to figure out how to handle concurrency for the languages with no "native tools" for doing it.

Another question is, if the language handles concurrency natively, whether those mechanisms are "compatible" with Java's mechanisms, since they have to "work together" here. I suspect that languages that are very close to Java, like Groovy, probably can cooperate nicely, but I'm not sure it would be so easy if the language "is more different". That's pure speculation though.
No, that is "use the cache lock to protect my transaction". The cache lock is there to protect the Map holding the cache keys/values itself from concurrency issues. It's only supposed to be locked briefly for getting things in and out of the cache, and it is global for the entire cache. So, when you do the above, you lock the entire cache for the duration of the operation. Everything else, completely unrelated rules/scripts that try to use the shared cache, will have to wait for this to finish. If they do the same thing, things will soon get very slow.

It also has other implications. There are other locks at play here, locks protecting script execution for example. They will also be held while the scripts are waiting for this lock, so the whole thing can cascade through the system and make OH as a whole barely move at all. This is what I've been trying to avoid. And it has other problems too, because if you need to access this value from outside the […]
This one is from JRuby's own wiki, not AI generated: https://github.com/jruby/jruby/wiki/Concurrency-in-jruby
Is this a bit exaggerated? One could say that the get...unlock approach can do the same if the code inside takes some time to execute and a lot of other parallel executions are trying to get...unlock the same key. What I'm trying to say is that we could make the existing locking mechanism more granular, so that it would only lock the given key instead of taking a global lock. Then we won't need yet another class of cache plus a new paradigm to do it.
Then do it inside compute(). Treat the whole compute() the same as you would the get...unlock block in your example above. If you make it granular, it won't lock the entire cache. It seems to be good enough for ConcurrentHashMap. So far I haven't seen any need to implement/offer something new. Could you show a different scenario that could not be achieved using compute()?
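If such a per-key `compute()` were exposed to scripts (a hypothetical method, mirroring `ConcurrentHashMap.compute()`; it does not exist in the current cache API), the earlier example would collapse into one atomic block:

```javascript
cache.shared.compute('decisionObject', (key, value) => {
  const obj = value ?? { counter: 0 };
  if (obj.counter < 10 && time.ZonedDateTime.now().hour() <= 15) {
    obj.counter++; // runs while only this entry is locked; keep it fast and non-blocking
  }
  return obj;      // the returned value becomes the new cache entry
});
```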
I've read through it, and some of the things mentioned there I assume are Ruby-specific, since they don't mean a lot to me/I don't know what they are. It seems that "standard Ruby" has some concurrency tools in the "thread" library, which is surprising to me considering the GIL: […]
In addition to that, JRuby seems to have added some extras to improve the situation further (because the JVM under which it runs is multithreaded): […]
So, JRuby isn't entirely without tools to manage this. That said, there were quite a lot of caveats listed as well, like the core classes and the standard library, and even basic comparison operations, not being "safe". It's unclear to me exactly what this means though, because it's not like Java's classes or objects are thread-safe either. Java started out (in Java 1.0) seemingly trying to make everything thread-safe, but this was soon abandoned; it's too slow and causes too many problems to do it that way. The trick is to only apply protections where they are actually needed, not "everywhere", and as such, I can't quite judge whether the lack of thread-safety in the standard library and core classes is a problem or not. My guess is that it isn't, at least most of the time.

But I don't quite see that any of this makes a difference for OH's shared cache. If we were to lay the responsibility for synchronizing the use of the shared cache on the scripting languages, they'd all have to support it.
It depends entirely. I've "investigated" several user cases recently where OH has effectively ceased to operate. The thread dumps reveal lots of threads that are stuck waiting for locks, and the core of such an "investigation" is to figure out what started the whole cascade. These things will happen under the right circumstances. If there's a deadlock between a couple of threads, or if some locking operation is operating "so slowly" that other things pile up, you do get these situations. None of the cases I've looked at have had the shared cache as the starting point, but to me, using a global lock on the cache to run "transactions" is a huge red flag. It's just the kind of thing that causes such "pile-ups".

All it takes for a complete deadlock to occur is that some of the code in one of the "transactions" does something that blocks while some other part of the system retrieves the result, and that other part of the system must wait for something that in the end ends up waiting for the cache lock to be released. Even without a complete deadlock, you can get such long delays that tasks "pile up" in thread pool queues because they arrive faster than they can be resolved. To me, it's just plain common sense that you don't want to do that. The chance of the same happening when using per-entry locking is much smaller, although still possible of course.

Implicit locking has another huge downside as I see it: the user probably has no idea that it's going on, and has no way to do long-running or blocking operations outside the lock. With explicit locking, it's at least very clear to the user when the code is inside a lock and when it isn't.
The problem is that I don't understand what this means (I don't know how to do that) except for what I have shown. It doesn't matter if it's done to the existing cache or a new one; that is a "design choice", but the "locking concept" remains the same. With the existing cache, there is no locking except for when values are retrieved or stored (except for […]). Implementing per-key/entry locking isn't just "more granular", it's fundamentally a different concept, because the cache lock only protects the cache. The key/entry/value lock has to have a start and an end.

What @rkoshak suggested was basically that the lock would start whenever you accessed the value (get or put) and end when the script finishes execution. In addition to potentially being difficult to achieve technically (it must "never" fail, no matter what happens, or that cache value will remain locked forever), it means that everything else the script does after using the shared cache value will impact the time the lock is held. There is no option to do long-running or blocking operations "outside the lock", unless you can do them before you access the value. And it's all pretty fuzzy exactly what's going on for the author of the script.

It also implicitly has the limitation that one script can only access one shared cache value in practice, because if you acquire a lock while already holding another, you must ensure that the two locks are always acquired in the same order, or you have a potential deadlock. It sounds much easier to me to handle explicit lock and unlock than to deal with a lot of limitations on what you can do after you access a shared cache value. So I'm just not convinced that doing it "the implicit way" is actually simpler for the user, and it certainly isn't better when it comes to the risk of "lock cascades", since you breach the fundamental principle of holding a lock only for the time it's actually needed.
I'm not sure that I understand what you mean here. If your cache/map only holds immutable objects, it does work, in a way: you can retrieve an entry and inspect it, but you can't modify it. You can replace it with another immutable object, but that still won't modify the existing object for other threads that might hold a reference to it. In any case, I've said from pretty early in this discussion that to really assess what's needed and where, we'd need to know much more about what happens when objects/values cross the language boundaries. It seems reasonably clear that Java objects (other than primitives and […]
I have explained it time and time again: take some mutable object, put it in or retrieve it from the cache, and then modify it. What I can't answer is which objects retain their "identity" and which are copied/converted on their way back and forth between Java and the scripting language.
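The pattern in miniature (real `cache.shared` API; whether it "works" depends entirely on whether the engine hands back the same live object or a converted copy):

```javascript
const list = cache.shared.get('list', () => []);
// If 'list' is the same live object, other rules see this immediately (and
// unsynchronized); if it was copied/converted on the way, the change is silently local.
list.push('new entry');
```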
I found the Nashorn JSR223 engine notes. It seems to me like it might be entirely up to the individual scripting engine whether they convert/copy objects or just pass on the Java objects. If that's the case, I really don't know how we can know what should be treated how.

Objects that are converted/copied can be used with the existing shared cache, except for the TOCTOU issue, which can partially be "solved" using […]. Objects that are passed through cannot be used with the existing shared cache unless they are thread-safe or immutable (and immutable might not be a thing in this context).

If there's no way to figure out what kind of object we're dealing with, there's no way to figure out what's the right way to handle it. The only way to be "safe" would be "the paranoid approach": treat everything as mutable objects, not copies. But that would complicate and slow down "most operations" (depending on how many objects are copied/converted) unnecessarily. Is there a way, from within the scripting language/helper library, to determine if you're dealing with a Java object or not?
In JS the answer is yes, though the helper library goes to great lengths to always give you JS objects. Probably the simplest, though not foolproof, way would be to just test if […]. I don't know how […].

As far as I understand, in JS, String, primitives, and ZonedDateTime are the only things that are converted back and forth by the language (and ZDT is handled by the add-on, not GraalVM). But you actually need to run some code to do the conversion. For example, when the […].

This weirdness is one of the reasons JS does so much to help users avoid Java as much as possible, and why wrapping the event object for UI users is finally such a boon.
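For GraalJS specifically, one possible probe is the interop helper `Java.isJavaObject()` (assuming it is reachable from the openHAB JS context; this is a hedged sketch, not a documented helper-library feature):

```javascript
const value = cache.shared.get('obj');
if (value !== null && Java.isJavaObject(value)) {
  console.log('Java object of type ' + value.getClass().getName());
} else {
  console.log('script-native (JS) value or null');
}
```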
OK, I see now. You're talking about the stored object itself and not the cache. Sorry I'm slow to get this. I've just tested: in a JRuby-only implementation, locking the object can be done by creating a mutex and storing it in the shared cache. However, this does add an extra level of complexity that your proposed […]
There is a problem we identified with JRuby, not related to threading. Each JRuby (or, for that matter, any language) script, whether it is a file or a UI script, is created inside a separate (JRuby) scripting engine. JRuby objects created in one engine cannot be used in another engine, because each engine maintains a separate context space, and each object would have a unique hashcode. The JRuby object hashcode from one engine isn't valid/recognised in another engine.

The previous test I mentioned above, where Ruby stores a hash object into the shared cache and it can be accessed in JS - well, guess what, it can't be accessed by another Ruby script properly! That's why we warn users not to store complex objects in the shared cache. This limitation doesn't apply to Java objects, hence why it's possible to store the timer object, which is a Java object.

It sounds like JS has a somewhat similar limitation, albeit not exactly for the same reasons, but I'm not sure - @rkoshak / @florian-h05 can you confirm? I tried storing a JS Object in the shared cache and got this warning: […]
So the question is: how often will the average user store and work with "mutable objects" in the shared cache? Are there any common use cases for this other than manipulating timers?
Yes, a similar limitation but not the same. Each script is in its own context, which is the same. But a hash issue like you describe isn't the problem. You can access complex JS Objects from two different rules in JS. The problem is that JS only lets one Java thread touch the object at a time. If one rule is using the object, a multi-threaded exception will be thrown when a second rule tries to use it. As long as they don't try to access it at the same time, it's all OK. But of course there's no way to guarantee that *never* happens.
That warning was added based on an issue I filed when I asked whether it was possible to lock the entries in the cache, as described here. The answer was "no, but we will add a warning telling users not to put anything but primitives into the cache".
I have some classes that do things with timers, which are complicated JS Objects I'd like to be able to share. But I work around the limitation.
This is what the serializing cache was meant to address. When objects are serialized to JSON before caching, the object retrieved is never the same instance, so no issues should exist, neither for "invalid" hash codes nor for Graal's "sharing forbidden" policy. It would allow sharing objects of any complexity, as long as the scripting language in question is able to serialize them to JSON.

I think the scripting language/helper library would have to be responsible for serialization/deserialization; I imagine that doing it from Java on behalf of the scripting languages would be a huge mess. So, "behind the scenes", it would be a pure Java string cache, holding JSON objects. It doesn't need locking, since the objects inserted and retrieved are always copies. For the same reason, these objects wouldn't be "live", so you would have to explicitly store changes back and re-read to get the current state.

There is always some drawback with every solution. It would be possible to equip the JSON cache with locking of course, but I don't think that would be worth it, given that you only need locks for "transaction control", not for "object protection".
I think those kinds of questions are hard to answer. People tend to "evolve" their use according to what is possible/works, so observation of current practices might carry a hefty bias. The reason I suggested potentially two new caches, instead of modifying the existing one, is partly because I think all these things might be difficult to combine into one in a way that wouldn't be very confusing to use, and partly because it would mean that nothing would change with the existing cache. That way, it wouldn't break anything; it would only be additional opportunities for those that have the initiative and interest to learn how to use them. Everybody else could just keep going as they do, within the limitations that exist there.
I've wondered how the script engines could know which objects were "native" and which were Java. I put an object into the cache from JS, and then attached my debugger to the Java cache. Here is the object creation:

```javascript
rules.JSRule({
  name: 'Cache Test',
  execute: (event) => {
    let obj = cache.shared.get('obj', () => {
      return {
        name: 'Complex Object',
        rating: 5.2,
        tags: ['tag1', 'tag2'],
        created: new Date(),
        nested: { a: 1, b: 2 },
        level: 15
      }});
    utils.dumpObject(obj);
  }
})
```

...and here is the result:

(debugger screenshot of the cached object)

It is thus no longer a mystery how it can keep track of that. It's not stored as a traditional object; it holds all kinds of references to the script engine context and whatnot. It's not surprising that JS can track shared access to JS objects, or that Ruby refuses to share these across different rules/scripts. They aren't "data objects", they are more like instances that only have meaning in that particular engine and context. What I don't understand is how @jimtng could work with the same object from both JS and Ruby. But that was maybe a Java map?
So what's the difference between the serializing cache and the existing shared cache?
Yes, the map was created in JRuby. JRuby's native Hash object is basically a Java map (it is an implementation of java.util.Map), which was then accessed by JS. The JS could also manipulate the object.
The only technical difference is that the serializing cache only accepts Java `String`s. The serializing cache would need "help" from the scripting language for the actual serialization/deserialization though. I didn't envision that users should do the serialization themselves, but that it would be handled automatically when things were stored or retrieved. If that's not possible to do, it's pointless.

The fact that it would do implicit serialization/deserialization would require that you access it in a "different way" than the existing shared cache, because you still need to handle other types of values/objects, like Java objects or primitives. So, it might as well be a different cache. Also, just adding the serialization/deserialization as "new methods" to the existing shared cache would "introduce complications", in that you'd have to throw an exception (or is there another way?) if an object that wasn't a Java string was requested for retrieval (not to mention that the deserialization of "regular strings" would fail). So, I think it would be most practical to have it as a separate entity, but it's completely possible to just extend the existing cache.

I need input from you "scripting language experts" to know if this implicit serialization/deserialization is even possible, and how it could best be solved if so.
It's pretty easy to do this automatically, or on request, within the JRuby helper library, directly on top of the existing shared cache. There are several mechanisms we can introduce that would be compatible with existing mechanisms for handling similar things. It won't need a separate cache class. The same could be done for locking mechanisms, should the need arise, although that would probably limit cross-language interoperability. But IMO cross-language interoperability is probably not something you'd need often. None of the current JRuby helper library maintainers actually use the shared cache; we only added the support for feature parity.
As I said above, I didn't think there was a technical requirement to keep them separate. I think it might be more logical to work with, and it would limit the exceptions one might receive. But these are just suggestions; if people think it's easier to work with as part of the existing cache, that's doable as well.

I was just about to try to figure out how to insert an object, like the Ruby hash, and potentially also a more "traditional object", into the shared cache from Ruby, so that I can check out how that looks in the debugger. It takes me quite some time to figure out these syntaxes, so maybe you could just share something I can copy/paste to do that?
shared_cache["ruby_hash"] = { "key1" => "value1" }
shared_cache["java_map"] = java.util.HashMap.new({ "key1" => "value1"}) |
Don't I have to do anything to make the shared cache "available" (import/require etc.)? Also, is there something like a "Ruby object" with key/value pairs similar to JS, or isn't that a thing? A record, struct, or some similar concept?
No. Just those two lines, nothing else.
No, there isn't. Ruby has Hash, which is the same as Java's Map interface; in fact, in JRuby it's an implementation of Java's Map interface.
What about a mix of JS/JRuby/whatever and Java? Those timer management classes I mentioned are written in JS, but the timers themselves are Java. While that would be unsuitable for this serialized cache, I can see other use cases where the complex Object might be a mix of Java and JS, particularly in scripting languages that don't do so much to hide the Java from the end users.
In that case, the serializing cache might be of much relevance to Ruby... except perhaps for sharing hashes?
I can't quite answer that until I can see how the objects look in Java. Could you come up with some code that would let me create such a "mixed object", so I can store it in the cache and inspect it from Java? My guess is that these are still JS objects though, so they might be a hard nut to crack, given that Graal won't allow them to be shared. Wrapping them in a Java map won't help either, as far as I can tell, because the JS "object entry" would still be the JS object. I think maybe you'd have to manually serialize the JS object and then put it in the shared cache, leaving the Timer outside the serialization.
The whole serializing cache functionality could be implemented using the existing shared cache if handled by the helper libraries on storage, but I don't know how they could handle retrieval. My thinking is that when asked to retrieve such a value, the helper library would somehow have to know that it should deserialize it. If the deserialization trigger could be solved, this could be a neat way to handle scripting-language-native objects. It doesn't handle what the locking cache addresses, but it does handle what the serializing cache does.
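A minimal JS-side sketch of that idea (a hypothetical wrapper, not an existing helper-library API), sidestepping the retrieval-trigger question by giving the serialized entries their own namespace:

```javascript
// Wraps the existing shared cache: native objects are serialized on the way
// in and deserialized on the way out, so callers always get a fresh copy.
const jsonCache = {
  put: (key, obj) => cache.shared.put(key, JSON.stringify(obj)),
  get: (key) => {
    const raw = cache.shared.get(key);
    return raw === null ? null : JSON.parse(raw);
  },
  remove: (key) => cache.shared.remove(key)
};

// Usage: changes must be stored back explicitly to become visible to others.
const cfg = jsonCache.get('cfg') ?? { threshold: 20 };
cfg.threshold = 25;
jsonCache.put('cfg', cfg);
```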
Add to your existing JS class an entry that is a Java Object.
I'm getting an error:

```text
Failed to execute rule Cache-Test-ebea05b6-73f7-4789-9a5f-4803c459dc78: TypeError: function type() { [native code] } is not a constructor function
        at <program> (test.js:36)
        at get (openhab.js:2)
        at execute (test.js:30)
        at doExecute (openhab.js:2)
Failed to execute rule Cache-Test-5ad539bb-59c7-46ad-bd6e-5be2ca025282: TypeError: Access to host class java.time.LocalDatetime is not allowed or does not exist.
        at <program> (test.js:36)
        at get (openhab.js:2)
        at execute (test.js:30)
        at doExecute (openhab.js:2)
```

Ah... it was a typo: `java.time.LocalDatetime` instead of `java.time.LocalDateTime`.
I still get an error now if I use "new":

```text
Failed to execute rule Cache-Test-d854fd3b-eda0-4053-88a0-1a42ae97ff49: TypeError: instantiate on java.time.LocalDateTime.now failed due to: Message not supported.
        at <program> (test.js:37)
        at get (openhab.js:2)
        at execute (test.js:30)
        at doExecute (openhab.js:2)
```

If I drop "new" and use this:

```javascript
rules.JSRule({
  name: 'Cache Test',
  execute: (event) => {
    cache.shared.remove('obj');
    let obj = cache.shared.get('obj', () => {
      let LocalDateTimeClass = Java.type('java.time.LocalDateTime');
      return {
        name: 'Complex Object',
        rating: 5.2,
        tags: ['tag1', 'tag2'],
        created: new Date(),
        javaObj: LocalDateTimeClass.now(),
        nested: { a: 1, b: 2 },
        level: 15
      }});
    utils.dumpObject(obj);
  }
})
```

...I don't get an error. But I don't see the […]
This isn't a traditional PR and might never become one. It's a way to illustrate some ideas so that others can judge if they might be useful. The code isn't tested, and isn't intended to be complete in any way. It's closer to "compilable pseudocode".
It's somewhat tricky to get the logic in the `LockingCache` right. I've made an attempt, but haven't tested or debugged it, so it might be flawed and need further refactoring if the idea is accepted as worthwhile.

This contains two potential implementations for shared caches:

`ObjectCache` is meant to store serialized objects. Objects would have to be serialized and deserialized on storage and retrieval. This would have to be done by the individual scripting language add-on, and objects are unlikely to be valid across different scripting languages. The deserialization procedure would have to have some way to indicate that deserialization failed/that the object is invalid, preferably a way that differs from the object simply not existing in the cache.

`LockingCache` is meant to store (mutable) objects that can be locked, processed and then unlocked. This could be used to "coordinate" several scripts, for example with counters. Cache entries are locked on an individual basis, so one entry can remain locked while other entries are processed normally. While locked, exclusive access to the object is obtained, so that decisions can be made based on the current value without the possibility that the value changes in the meantime, thus avoiding TOCTOU. What is especially tricky here is preventing potential deadlocks if the entry is removed from the cache while scripts still hold references to it. Once it has been removed from the cache, it's no longer possible to get the `Lock` instance, and thus it can't be unlocked. Therefore, I've tried to write logic that makes that impossible, but I'm not sure it's actually "waterproof".