-
Notifications
You must be signed in to change notification settings - Fork 182
ASN1.decode/traverse/decode_all and ASN1#to_der in pure ruby #777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
I think the tests outside
What breaking change do you expect? |
09c3fbf to
e92f837
Compare
|
@rhenium made a few adjustments to the ruby code. I believe it got cleaner. The same point stands regarding I broke down Didn't start cutting C code just yet. |
e92f837 to
c4823d9
Compare
I think so, unfortunately.
Encoding the value should be trivial. I added in an inline comment.
Please check X.690 for exact details, but encoding these types should be relatively simple too, since they are just encoded into ISO 8601 (except UTCTime uses 2-digit years). |
95e2d8d to
005d32b
Compare
2f32854 to
1e35012
Compare
|
@rhenium did UTCTime and GeneralizedTime, lmk what you think. |
13e200e to
5785230
Compare
lib/openssl/asn1.rb
Outdated
| xclass = take_asn1_tag_class(tag_class) | ||
|
|
||
| i = constructed ? V_ASN1_CONSTRUCTED : 0 | ||
| i |= (xclass & V_ASN1_PRIVATE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
& V_ASN1_PRIVATE doesn't change the value, as a tag class can only be one of the 4 classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ported this from here
47e6cf5 to
cfcbe9e
Compare
|
@rhenium anything still pending? |
faa1dc0 to
6f22cbd
Compare
|
@rhenium it's been a while since I worked on this, but got a buffer to do some benchmark / optimizations while doing similar work in # benchmarks/asn1.rb
require "benchmark/ips"
require "openssl"
OBJS = {
"integer" => OpenSSL::ASN1::Integer.new(128),
"negative integer" => OpenSSL::ASN1::Integer.new(-128),
"oid" => OpenSSL::ASN1::ObjectId.new("1.2.34.56789".b),
"nid" => OpenSSL::ASN1::ObjectId.new("sha256"),
"emptyseq" => OpenSSL::ASN1::Sequence.new([]),
"seq" => OpenSSL::ASN1::Sequence.new([OpenSSL::ASN1::EndOfContent.new]),
"emptyset" => OpenSSL::ASN1::Set.new([]),
"set" => OpenSSL::ASN1::Set.new([OpenSSL::ASN1::EndOfContent.new]),
"utctime" => OpenSSL::ASN1::UTCTime.new(Time.utc(2016, 9, 8, 23, 43, 39)),
"gentime" => OpenSSL::ASN1::GeneralizedTime.new(Time.utc(2016, 12, 8, 19, 34, 29))
}
RubyVM::YJIT.enable
def test_name(name)
"#{name} (#{OpenSSL::VERSION})"
end
Benchmark.ips do |x|
# x.config(warmup: 2, time: 5)
name = ENV.fetch("TEST", "integer")
obj = OBJS.fetch(name)
x.report(test_name(name)) { obj.to_der }
x.save! "asn1-results.json"
x.compare!
endThe measurements I got locally are below. The tl;dr:
|
lib/openssl/asn1.rb
Outdated
|
|
||
| return "\x00".b if @value.empty? | ||
|
|
||
| @unused_bits.chr << super |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integer#chr produces a US-ASCII String if the value is lower than 128.
I think we should avoid using Integer#chr in OpenSSL::ASN1 at all since automatic encoding promotion with ascii-only strings can be very tricky, for example:
(1.chr << "a".b).encoding #=> #<Encoding:US-ASCII>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've kept .chr followed by .force_encoding. I don't know of a more efficient way to perform the op in ruby (using pack would involve allocating an array). If you have a better alternative, let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's worth it. Array#pack or String#setbyte seems much easier to understand IMO, even if that introduces a slight performance loss.
If this level of optimization is necessary, I think that might mean we should stick with a C implementation, but I'm hoping it's not the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you past example snippets of how you'd see this rewritten with pack and/or. setbyte? I'm struggling with seeing how this would end up more maintainable in this case, as setbyte does not afaik do appends.
|
Sorry for the slow response. I've added some additional inline comments. My motivation for this series is that we can stop relying on those OpenSSL C API for BER encoding/decoding, which are screaming that it's not intended to be used outside OpenSSL, are completely undocumented, often have inconsistent semantics, and have undergone a lot of unexpected breaking changes. We also want to avoid performing string manipulation in C in general, especially when handling untrusted input. I think our top priority should be to make the code more understandable and maintainable, while keeping backwards compatibility. Good performance is nice to have, but I feel it's too early to focus on that. Some of the current optimizations feel a bit too aggressive. Using As for the speed of the OID encoding (or of anything else), I wouldn't rule out adding a specialized helper method implemented in C if it's necessary. However, I'm not yet sure if the current performance is an actual issue. I think it's too early to tell it. For measuring performance, I think using a complex and realistic structure like an X.509 certificate might provide a more useful result. |
6f22cbd to
a46b07b
Compare
Wholeheartedly agree. FWIW the main motivation for this work is the belif that ruby code is more maintainable/readable than C code, and the ASN1 benefits little of being in C.
I agree, I just want to make sure that any regression is within a certain margin. There are people using this in production, after all. I don't think that some of the optimizations done here make the code less readable though:
Agree with the edge cases, but just to clarify, the goal of this MR is not to touch parsing (that can come later).
Agreed. It'd be great to have some input from someone that knows about YJIT or ZJIT whether these operations are something that future versions will be able to optimize though. The less C, the more maintainable it is.
Anything in the repo that I could use? |
b3ce3b6 to
cba1a5f
Compare
|
@rhenium I've also added a pure ruby implementation of the decoding functions. LMK how does that look. The truffleruby jobs failed, but that seems to be rather a truffleruby bug. |
lib/openssl/asn1.rb
Outdated
| no_id_idx = if id == 0x1f | ||
| id = 0 | ||
| count = 1 | ||
| data[1..].each_byte do |byte| | ||
| count += 1 | ||
|
|
||
| id = (id << 7) | (byte & 0x7f) | ||
| break if byte.nobits?(0x80) | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take extra care with the decoding code since the input data is not trusted. Some parts of this PR appear to have quadratic time complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would you suggest doing it more efficiently?
|
I didn't review deeper because this looks like a WIP. There are a number of unreachable code paths and missing input validations. At the moment, this feels like a fairly direct porting from C to Ruby. Could we try to restructure it so that it becomes more idiomatic Ruby? Our goal of a rewrite should be maintainability, not just doing something in Ruby. The existing C code does have issues, as it's built on top of the undocumented OpenSSL functions which were probably never meant to be used in the way they currently are, and also the general code quality is less than ideal. That's why I think a rewrite makes sense, and doing so in Ruby instead of C would be a good idea. But doing a 1:1 porting of problematic code to Ruby wouldn't really be an improvement IMO. |
this allows (when value is an integer) the runtime and JIT to optimize the call, instead of relying on C-extension openssl
…ppends) the former is a data structure that the runtime can improve processing of, and avoids realloc'ing the string on each call (as well as packing one byte at a time)
which means that it'll be nil; this improves encoding of asn1 nulls and cons with no elements
it's US-ASCII for integers < 128
* utctime should not take dates into account with a year below 1950 and above 2049 * gentime year must be 4 digits
cba1a5f to
ac643fb
Compare
|
@rhenium thx for the review
You'll have to be more concrete about what you mean. While parts of it may follow the same flow, the aim was to make the review more straightforward, and also privilege locality of similar logic. I also privilege a more imperative way of structuring functions when it comes to writing encode/decode logic, but that's a personal preference, I don't mind doing it differently. |
This is a draft request-for-comment proposal for implementing
OpenSSL::ASN1::ASN1Data#to_derin ruby, as per suggestion on the previous MR. It already passes the test, although there is a part which hasn't been ported yet, and seems not be covered in the tests. I guess that the ruby code can benefit from being broke down in multiple files, but I'll keep it here just for easy of review, ,for now. I also didn't delete C code, not yet.Currently, the whole of
#to_deris implemented in the ASN1Data class. The reason is, one can build a primitive or constructive object "manually" (if tag and/or tag_class is known), or can just use arrays directly. It would benefit from moving specific implementations into the respective child classes, but then it'd be a breaking change because of what I said. Any suggestions?I also didn't do any benchmarks yet. Is it relevant?
LMK what you think.