feat(services/hdfs): refactor HdfsBackend to use HdfsCore for client operations #6033
Conversation
Hi, it seems all hdfs tests have failed: https://github.com/apache/opendal/actions/runs/14482506154/job/40622110318?pr=6033
@Xuanwo Yes, it just took some time to fix that. The PR is ready for review.
Thanks for helping with HdfsCore. Hope you had a great Easter.
One comment about how the backend references the core.
#[derive(Clone, Debug)]
pub struct HdfsCore {
    pub info: Arc<AccessorInfo>,
    pub client: Arc<hdrs::Client>,
Minor: if we push this further, we should wrap Client's methods instead of exposing the client directly. I doubt this will cause trouble in the future, so it's either another maintainer's call or your decision on the direction.
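As a rough illustration of that direction, a wrapper in HdfsCore could look like the sketch below. The method name hdfs_delete and the error mapping via new_std_io_error are illustrative assumptions, not code from this PR.

use crate::raw::*;
use crate::*;

impl HdfsCore {
    // Hypothetical thin wrapper: callers go through HdfsCore instead of
    // reaching into `core.client` directly.
    pub fn hdfs_delete(&self, abs_path: &str) -> Result<()> {
        self.client.remove_file(abs_path).map_err(new_std_io_error)
    }
}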
Good call. I originally started by abstracting the client calls but then decided against it, since the wrappers would be quite thin (mostly wrapping the exact call) and the core is only used internally within this module. I'm leaning more toward letting future changes drive this if a need arises to abstract more than just the nested client call.
Alternatively, we can get a clone of the client and reduce the call chain. I'll add that in a new commit and you can let me know what you think
I have applied this change here fb2e349. Let me know if it makes sense. Thanks
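For reference, the shape implied by that change (as reflected in the diff below) would be roughly the following; the field layout here is an assumption based on the call site, not the commit itself.

use std::sync::Arc;

// Sketch: the deleter holds a clone of the core (and the root) rather than
// the whole backend, which shortens the call chain.
pub struct HdfsDeleter {
    core: Arc<HdfsCore>,
    root: String,
}

impl HdfsDeleter {
    pub fn new(core: Arc<HdfsCore>, root: String) -> Self {
        Self { core, root }
    }
}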
 Ok((
     RpDelete::default(),
-    oio::OneShotDeleter::new(HdfsDeleter::new(Arc::new(self.clone()))),
+    oio::OneShotDeleter::new(HdfsDeleter::new(Arc::clone(&self.core), self.root.clone())),
👍
Thank you, @uruemu. We now have a base to start adding code to HdfsCore. We'll continue iterating to build out this core. Some implementation details can be extracted into functions within the core itself. This will make the backend easier to read and understand, as it will delegate more of the implementation to either the core or the reader/writer.
 fn info(&self) -> Arc<AccessorInfo> {
-    self.info.clone()
+    self.core.info()
Use self.core.info.
 pub struct HdfsBackend {
-    pub info: Arc<AccessorInfo>,
-    pub root: String,
+    root: String,
Move this into HdfsCore. When using the root in the HdfsBackend, use self.core.root.
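A minimal sketch of that layout, assuming the backend keeps only an Arc<HdfsCore> and reads the root through it; the rooted helper is illustrative, not from the PR.

use std::sync::Arc;

use crate::raw::*;

#[derive(Clone, Debug)]
pub struct HdfsCore {
    pub info: Arc<AccessorInfo>,
    // Root moved from HdfsBackend into the core.
    pub root: String,
    pub client: Arc<hdrs::Client>,
}

#[derive(Clone, Debug)]
pub struct HdfsBackend {
    core: Arc<HdfsCore>,
}

impl HdfsBackend {
    // Illustrative helper: the backend resolves paths via the core's root.
    fn rooted(&self, path: &str) -> String {
        build_rooted_abs_path(&self.core.root, path)
    }
}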
}

async fn stat(&self, path: &str, _: OpStat) -> Result<RpStat> {
    let p = build_rooted_abs_path(&self.root, path);
From this line down to line 272 (m.set_last_modified(meta.modified().into());), this can become a convenience function, maybe HdfsCore::hdfs_get_metadata. Then we can use hdfs_get_metadata in these functions:
- blocking_stat
- stat
self.core.client.metadata is a good function. We will keep using self.core.client.metadata.
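Following that suggestion, a sketch of the helper might look like this. The metadata mapping mirrors the quoted stat body; the exact signature is an assumption.

use crate::raw::*;
use crate::*;

impl HdfsCore {
    // Sketch of the suggested convenience function shared by `stat` and
    // `blocking_stat`; field mapping follows the existing stat body.
    pub fn hdfs_get_metadata(&self, path: &str) -> Result<Metadata> {
        let p = build_rooted_abs_path(&self.root, path);
        let meta = self.client.metadata(&p).map_err(new_std_io_error)?;

        let mode = if meta.is_dir() {
            EntryMode::DIR
        } else if meta.is_file() {
            EntryMode::FILE
        } else {
            EntryMode::Unknown
        };
        let mut m = Metadata::new(mode);
        m.set_content_length(meta.len());
        m.set_last_modified(meta.modified().into());

        Ok(m)
    }
}

Both stat and blocking_stat would then reduce to roughly Ok(RpStat::new(self.core.hdfs_get_metadata(path)?)).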
}

async fn read(&self, path: &str, args: OpRead) -> Result<(RpRead, Self::Reader)> {
    let p = build_rooted_abs_path(&self.root, path);
Maybe we can move parts of this implementation to core as HdfsCore::hdfs_read_from_path. Once we have done that, the read function will become:
- read the file
- return results (HdfsReader is what OpenDAL returns for users to read data from)
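One way to sketch that helper, with the return type and option handling assumed from the quoted snippet rather than taken from the PR:

use crate::raw::*;
use crate::*;

impl HdfsCore {
    // Sketch: resolve the rooted path and open the file for async reading.
    pub async fn hdfs_read_from_path(&self, path: &str) -> Result<hdrs::AsyncFile> {
        let p = build_rooted_abs_path(&self.root, path);
        self.client
            .open_file()
            .read(true)
            .async_open(&p)
            .await
            .map_err(new_std_io_error)
    }
}

read would then mostly call self.core.hdfs_read_from_path(path) and wrap the handle in HdfsReader.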
}

async fn write(&self, path: &str, op: OpWrite) -> Result<(RpWrite, Self::Writer)> {
    let target_path = build_rooted_abs_path(&self.root, path);
Similarly, we can move parts of this implementation to core as HdfsCore::hdfs_open_path_for_writing. Once we have done that, the write function will become:
- open the file for writing
- return results (HdfsWriter is what OpenDAL returns for users to write data to)
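A sketch under the same assumptions; append handling and parent-directory creation are simplified, and the real write path may still need the existing tmp-file handling.

use crate::raw::*;
use crate::*;

impl HdfsCore {
    // Sketch: ensure the parent directory exists, then open the target for
    // writing. `write` would then only wrap the handle in `HdfsWriter`.
    pub async fn hdfs_open_path_for_writing(
        &self,
        path: &str,
        op: &OpWrite,
    ) -> Result<hdrs::AsyncFile> {
        let target_path = build_rooted_abs_path(&self.root, path);

        // HDFS directory creation behaves like `mkdir -p`, so this is assumed
        // to be safe when the parent already exists.
        let parent = get_parent(&target_path);
        self.client.create_dir(parent).map_err(new_std_io_error)?;

        let mut opts = self.client.open_file();
        opts.create(true);
        if op.append() {
            opts.append(true);
        } else {
            opts.write(true);
        }
        opts.async_open(&target_path)
            .await
            .map_err(new_std_io_error)
    }
}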
    Ok((RpList::default(), Some(rd)))
}

async fn rename(&self, from: &str, to: &str, _args: OpRename) -> Result<RpRename> {
I didn't read the details of the implementation. At a glance, this looks similar to blocking_rename. Maybe we can build HdfsCore::hdfs_rename. This will benefit:
- rename
- blocking_rename
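A shared helper could be sketched like this; parent-directory handling is simplified and assumed, not taken from the PR's code.

use crate::raw::*;
use crate::*;

impl HdfsCore {
    // Sketch of a rename helper that both `rename` and `blocking_rename`
    // could delegate to.
    pub fn hdfs_rename(&self, from: &str, to: &str) -> Result<RpRename> {
        let from_path = build_rooted_abs_path(&self.root, from);
        let to_path = build_rooted_abs_path(&self.root, to);

        // Assumed: make sure the destination's parent directory exists.
        self.client
            .create_dir(get_parent(&to_path))
            .map_err(new_std_io_error)?;

        self.client
            .rename_file(&from_path, &to_path)
            .map_err(new_std_io_error)?;

        Ok(RpRename::default())
    }
}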
}

fn blocking_write(&self, path: &str, op: OpWrite) -> Result<(RpWrite, Self::BlockingWriter)> {
    let target_path = build_rooted_abs_path(&self.root, path);
We should be able to extract this implementation to core as well.
}

fn blocking_read(&self, path: &str, args: OpRead) -> Result<(RpRead, Self::BlockingReader)> {
    let p = build_rooted_abs_path(&self.root, path);
We should be able to extract this to core too.
}

impl HdfsCore {
    pub fn info(&self) -> Arc<AccessorInfo> {
We don't need this function. pub info: Arc<AccessorInfo> is enough.
@erickguan I'll come back to this PR this week. Sorry, I've been a bit busy.
@uruemu Thanks for working on this and for the update on your schedule!
Which issue does this PR close?
Addresses #5702: Split service logic into backend and core modules. Part of #5702.
Rationale for this change
What changes are included in this PR?
This pull request refactors the GridFsBackend by introducing a new GridFsBackendCore struct to host connection functionality and state. GridFsCore implements the kv::Adapter trait, allowing the removal of the Adapter struct.
Are there any user-facing changes?