github fallback to git ls-remote#129
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 28 minutes and 27 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughRefactors Git/GitHub resolution: introduces new git headers ( Changes
Sequence DiagramsequenceDiagram
participant Caller
participant Git as Git::resolveSCMBranch
participant Settings as Settings::remotes()
participant Cache as Local Cache (.mkn/<type>/ver)
participant GH as Github API
participant SCM as Git SCM
Caller->>Git: resolveSCMBranch(repo, cacheDirName)
Git->>Cache: read .mkn/{cacheDirName}/ver
alt cache hit
Cache-->>Git: branch
Git-->>Caller: return branch
else cache miss
Git->>Settings: remotes(module?)
Settings-->>Git: list of remote prefixes
Git->>GH: try GET_LATEST(repo/module) (if GH available)
alt GH success
GH-->>Git: branch
else GH fail
Git->>SCM: defaultRemoteBranch(candidate URL)
SCM-->>Git: branch
end
Git->>Cache: write branch to .mkn/{cacheDirName}/ver
Git-->>Caller: return branch
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (7)
inc/maiken/settings.hpp (1)
63-66:remotes(module)returns by value — silently copies the remotes vector on every call.Since
remoteModules()/remoteRepos()returnstd::vector<std::string> const&, plainautostrips the reference and deducesstd::vector<std::string>, so each call site (e.g. range-for insrc/maiken/scm/getter.cpp:45andinc/maiken/git/github.hpp:235) copies the full vector. Useauto const&(ordecltype(auto)) to preserve the reference and match the style of the underlying accessors.♻️ Proposed change
- auto remotes(bool const module) const { - if (module) return remoteModules(); - return remoteRepos(); - } + auto const& remotes(bool const module) const { + if (module) return remoteModules(); + return remoteRepos(); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inc/maiken/settings.hpp` around lines 63 - 66, The remotes(bool module) function currently returns by value (using plain auto) which copies the vector from remoteModules() or remoteRepos(); change its return to preserve the reference (e.g., use "auto const&" or "decltype(auto)") so remotes(...) returns the same const& as remoteModules()/remoteRepos(), preventing unnecessary copies in callers like getRemote lists and GitHub code paths referencing remotes(), and ensure the function signature remains const-qualified as in remotes(bool const module) const.inc/maiken/git/github.hpp (2)
51-52: Consider markingURLconst.
api.github.comis effectively a compile-time constant; making itstatic inline std::string const URL = ...;(orconstexpr std::string_view) prevents accidental mutation from any TU that includes this header. Same applies toportif it’s not meant to be tunable at runtime.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inc/maiken/git/github.hpp` around lines 51 - 52, The header defines mutable globals URL and port; make them immutable to prevent accidental mutation by other translation units: change static inline std::string URL to a constant form (e.g., static inline const std::string URL or better static inline constexpr std::string_view URL = "api.github.com") and mark port as const (e.g., static inline const int port = 443) so URL and port are compile-time / read-only symbols used by functions referencing URL and port.
194-194: Replace magic number3withYAML::NodeType::Sequencefor clarity and maintainability.The comparison
yaml.root().Type() == 3relies on the underlying enum ordering (Undefined=0, Null=1, Scalar=2, Sequence=3, Map=4). Using the named constant improves readability and guards against potential issues if yaml-cpp ever reorders the enum values.Proposed change
- if (yaml.root().Type() == 3) { + if (yaml.root().Type() == YAML::NodeType::Sequence) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inc/maiken/git/github.hpp` at line 194, Replace the magic numeric literal in the YAML node type check: change the comparison using yaml.root().Type() == 3 to use the named enum YAML::NodeType::Sequence (i.e., compare yaml.root().Type() against YAML::NodeType::Sequence) so the intent is clear and maintainable in github.hpp where the yaml.root() type is tested.src/maiken/depmod.cpp (1)
121-129: Cache lookup is duplicated betweenget_cache_versionandGit::resolveSCMBranch.
Git::resolveSCMBranch(depName, type)(perinc/maiken/git/git.hpp) opens the same.mkn/<type>/ver/<name>file via its ownmkn::kul::File verFile{...}check and short-circuits withmkn::kul::io::Reader(verFile).readLine(). With Line 126 already doingget_cache_version(depName, type), the cache is checked twice in succession. Behaviourally harmless, but you can drop the localget_cache_versioncall (and the helper, if not used elsewhere) and letGit::resolveSCMBranchown the cache contract — keeping a single source of truth for the.mkn/<type>/verlayout.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/maiken/depmod.cpp` around lines 121 - 129, The cache lookup is duplicated: remove the local get_cache_version(depName, type) call inside the resolveSCMBranch lambda and let Git::resolveSCMBranch(depName, type) handle the .mkn/<type>/ver/<name> cache lookup; update the lambda to only check n[STR_VERSION] and Applications::INSTANCE().getOrNullptr(depName) before calling Git::resolveSCMBranch(depName, type), and if get_cache_version is not used elsewhere, delete that helper to keep the cache handling in Git::resolveSCMBranch as the single source of truth.src/project.cpp (1)
38-39: Drop the now-stale[[maybe_unused]]oncacheDirName.The parameter is now actually used at Line 63 (
Git::resolveSCMBranch(scm, cacheDirName)), so the attribute is misleading and should be removed.♻️ Proposed change
-maiken::ProjectInfo maiken::ProjectInfo::PARSE_LINE( - std::string const& line, [[maybe_unused]] std::string const& cacheDirName) { +maiken::ProjectInfo maiken::ProjectInfo::PARSE_LINE( + std::string const& line, std::string const& cacheDirName) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/project.cpp` around lines 38 - 39, The attribute [[maybe_unused]] on the cacheDirName parameter of maiken::ProjectInfo::PARSE_LINE is stale because cacheDirName is actually used (Git::resolveSCMBranch(scm, cacheDirName)); remove the [[maybe_unused]] annotation from the function signature so the parameter accurately reflects its usage and recompile to ensure no warnings/errors remain; update the declaration and any matching definitions (maiken::ProjectInfo::PARSE_LINE) to keep signatures consistent.inc/maiken/git/git.hpp (1)
78-78: PreferConstants::STR_MODover the literal"mod".
defs.hppalready definesstatic constexpr auto STR_MOD = "mod";. Using it here avoids a magic string and keeps the cache-dir convention discoverable from a single source of truth.♻️ Proposed refactor
- bool const module = cacheDirName == "mod"; + bool const module = cacheDirName == Constants::STR_MOD;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inc/maiken/git/git.hpp` at line 78, Replace the magic string "mod" with the defined constant Constants::STR_MOD in the module initializer so it reads something like using the cacheDirName comparison against Constants::STR_MOD; update the boolean declaration (the variable named module that compares cacheDirName) to use Constants::STR_MOD and ensure the header where Constants is defined (defs.hpp) is available/visible to git.hpp so the symbol resolves.inc/maiken/defs.hpp (1)
57-63: Confirm placeholder credentials are the intended strategy forgit ls-remotecallers.Embedding
u:p@directly in the default URLs aligns with the pattern insrc/maiken/scm/getter.cpp(line ~55) where the same prefix is injected dynamically for HTTP URLs lacking@. That keeps git from prompting interactively on a 401.Two related observations worth verifying:
- The new
Git::resolveSCMBranchpath ininc/maiken/git/git.hppcallsmkn::kul::scm::Git().defaultRemoteBranch(url)directly with whateverSettings::INSTANCE().remotes(module)returns; it does not perform the sameu:p@injection thatgetter.cppdoes. With these new defaults that's fine, but a user who overrides_MKN_REMOTE_REPO_at compile time with a plainhttps://...(no@) would still be exposed to interactive auth prompts in this new code path.- Consider applying the same
u:p@injection insideGit::resolveSCMBranch(or pushing it down intokul::scm::Git) so the two SCM paths behave consistently regardless of how remotes are configured.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inc/maiken/defs.hpp` around lines 57 - 63, The new code path in Git::resolveSCMBranch calls mkn::kul::scm::Git().defaultRemoteBranch(url) with whatever Settings::INSTANCE().remotes(module) returns, so a compile-time override of _MKN_REMOTE_REPO_ that is a plain "https://..." can cause interactive auth prompts; update Git::resolveSCMBranch to detect HTTP(S) remotes without an '@' and inject the "u:p@" placeholder (the same logic used in src/maiken/scm/getter.cpp) before calling mkn::kul::scm::Git().defaultRemoteBranch(url), or alternatively move this injection into mkn::kul::scm::Git::defaultRemoteBranch so both call sites (getter.cpp and Git::resolveSCMBranch) behave consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@inc/maiken/git/git.hpp`:
- Around line 80-115: The code currently treats a true return from
Github<>::GET_LATEST as sufficient to mark resolved even if the returned version
string is empty; update the logic so that after calling
Github<>::GET_LATEST(repo, version, module) you verify version is non-empty
before setting resolved to true (i.e., set resolved = (Github<>::GET_LATEST(...)
&& !version.empty())), and ensure the cache write and early return only occur
when version is non-empty (do not rm/mk/write verFile or return when
version.empty(), instead let the code fall back to GET_DEFAULT_BRANCH or
propagate failure).
In `@inc/maiken/git/github.hpp`:
- Around line 89-220: The retry logic and multi-strategy fallback are being
aborted by KEXCEPT thrown inside the withResponse lambdas (seen in
GET_DEFAULT_BRANCH, GET_LATEST_RELEASE, GET_LATEST_TAG), so either catch
github::Exception around each request.send() inside those retry loops or stop
throwing from the response callbacks (e.g., set an error flag/status and let
send() return) so retries can occur and GET_LATEST can iterate strategies;
specifically, update GET_DEFAULT_BRANCH/GET_LATEST_RELEASE/GET_LATEST_TAG to
handle non-200 responses without propagating an exception (or wrap their send()
calls in try/catch that logs and continues), and/or add per-strategy try/catch
in GET_LATEST to ensure a thrown github::Exception from one strategy does not
prevent trying the next.
---
Nitpick comments:
In `@inc/maiken/defs.hpp`:
- Around line 57-63: The new code path in Git::resolveSCMBranch calls
mkn::kul::scm::Git().defaultRemoteBranch(url) with whatever
Settings::INSTANCE().remotes(module) returns, so a compile-time override of
_MKN_REMOTE_REPO_ that is a plain "https://..." can cause interactive auth
prompts; update Git::resolveSCMBranch to detect HTTP(S) remotes without an '@'
and inject the "u:p@" placeholder (the same logic used in
src/maiken/scm/getter.cpp) before calling
mkn::kul::scm::Git().defaultRemoteBranch(url), or alternatively move this
injection into mkn::kul::scm::Git::defaultRemoteBranch so both call sites
(getter.cpp and Git::resolveSCMBranch) behave consistently.
In `@inc/maiken/git/git.hpp`:
- Line 78: Replace the magic string "mod" with the defined constant
Constants::STR_MOD in the module initializer so it reads something like using
the cacheDirName comparison against Constants::STR_MOD; update the boolean
declaration (the variable named module that compares cacheDirName) to use
Constants::STR_MOD and ensure the header where Constants is defined (defs.hpp)
is available/visible to git.hpp so the symbol resolves.
In `@inc/maiken/git/github.hpp`:
- Around line 51-52: The header defines mutable globals URL and port; make them
immutable to prevent accidental mutation by other translation units: change
static inline std::string URL to a constant form (e.g., static inline const
std::string URL or better static inline constexpr std::string_view URL =
"api.github.com") and mark port as const (e.g., static inline const int port =
443) so URL and port are compile-time / read-only symbols used by functions
referencing URL and port.
- Line 194: Replace the magic numeric literal in the YAML node type check:
change the comparison using yaml.root().Type() == 3 to use the named enum
YAML::NodeType::Sequence (i.e., compare yaml.root().Type() against
YAML::NodeType::Sequence) so the intent is clear and maintainable in github.hpp
where the yaml.root() type is tested.
In `@inc/maiken/settings.hpp`:
- Around line 63-66: The remotes(bool module) function currently returns by
value (using plain auto) which copies the vector from remoteModules() or
remoteRepos(); change its return to preserve the reference (e.g., use "auto
const&" or "decltype(auto)") so remotes(...) returns the same const& as
remoteModules()/remoteRepos(), preventing unnecessary copies in callers like
getRemote lists and GitHub code paths referencing remotes(), and ensure the
function signature remains const-qualified as in remotes(bool const module)
const.
In `@src/maiken/depmod.cpp`:
- Around line 121-129: The cache lookup is duplicated: remove the local
get_cache_version(depName, type) call inside the resolveSCMBranch lambda and let
Git::resolveSCMBranch(depName, type) handle the .mkn/<type>/ver/<name> cache
lookup; update the lambda to only check n[STR_VERSION] and
Applications::INSTANCE().getOrNullptr(depName) before calling
Git::resolveSCMBranch(depName, type), and if get_cache_version is not used
elsewhere, delete that helper to keep the cache handling in
Git::resolveSCMBranch as the single source of truth.
In `@src/project.cpp`:
- Around line 38-39: The attribute [[maybe_unused]] on the cacheDirName
parameter of maiken::ProjectInfo::PARSE_LINE is stale because cacheDirName is
actually used (Git::resolveSCMBranch(scm, cacheDirName)); remove the
[[maybe_unused]] annotation from the function signature so the parameter
accurately reflects its usage and recompile to ensure no warnings/errors remain;
update the declaration and any matching definitions
(maiken::ProjectInfo::PARSE_LINE) to keep signatures consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: add00440-9621-4454-aeb3-32952a1ad263
📒 Files selected for processing (11)
inc/maiken/defs.hppinc/maiken/git.hppinc/maiken/git/git.hppinc/maiken/git/github.hppinc/maiken/github.hppinc/maiken/settings.hppmkn.yamlsrc/maiken/cli.cppsrc/maiken/depmod.cppsrc/maiken/scm/getter.cppsrc/project.cpp
| mkn::kul::File const verFile{name, ".mkn/" + cacheDirName + "/ver"}; | ||
| if (verFile) return mkn::kul::io::Reader(verFile).readLine(); | ||
|
|
||
| KOUT(NON) << "Attempting branch deduction resolution for: " << name; | ||
| std::string version; | ||
| bool resolved = false; | ||
|
|
||
| #ifdef _MKN_WITH_MKN_RAM_ | ||
| try { | ||
| resolved = Github<>::GET_LATEST(repo, version, module); | ||
| } catch (...) { | ||
| KLOG(ERR) << "Github API rejected - falling back to git for: " << name; | ||
| } | ||
| #endif // _MKN_WITH_MKN_RAM_ | ||
|
|
||
| if (!resolved) { | ||
| std::vector<std::string> repos; | ||
| if (IS_SOLID(repo)) | ||
| repos.push_back(repo); | ||
| else | ||
| for (auto const& s : Settings::INSTANCE().remotes(module)) repos.push_back(s + repo); | ||
|
|
||
| for (std::string const& s : repos) { | ||
| if (GET_DEFAULT_BRANCH(s, version)) { | ||
| resolved = true; | ||
| break; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if (resolved) { | ||
| verFile.rm(); | ||
| verFile.dir().mk(); | ||
| mkn::kul::io::Writer(verFile) << version; | ||
| return version; | ||
| } |
There was a problem hiding this comment.
Defend against caching/returning an empty version.
GET_DEFAULT_BRANCH (line 66) only sets resolved indirectly via !branch.empty(), but the GitHub path at line 89 trusts whatever boolean Github<>::GET_LATEST returns. If GET_LATEST ever returns true with an empty version (e.g., a repo where the underlying default-branch/release/tag callback produced an empty string but still returned success), this block will:
- Skip the git fallback at lines 95–108.
- Write an empty string to
.mkn/<cacheDirName>/verat line 113. - Return
""to the caller — and on every subsequent invocation, line 81 will short-circuit to that empty cached value instead of re-resolving.
That likely breaks downstream consumers that expect a real branch/tag name.
🛡️ Proposed defensive fix
KOUT(NON) << "Attempting branch deduction resolution for: " << name;
std::string version;
bool resolved = false;
`#ifdef` _MKN_WITH_MKN_RAM_
try {
- resolved = Github<>::GET_LATEST(repo, version, module);
+ resolved = Github<>::GET_LATEST(repo, version, module) && !version.empty();
} catch (...) {
KLOG(ERR) << "Github API rejected - falling back to git for: " << name;
}
`#endif` // _MKN_WITH_MKN_RAM_
@@
- if (verFile) return mkn::kul::io::Reader(verFile).readLine();
+ if (verFile) {
+ auto const cached = mkn::kul::io::Reader(verFile).readLine();
+ if (!cached.empty()) return cached;
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@inc/maiken/git/git.hpp` around lines 80 - 115, The code currently treats a
true return from Github<>::GET_LATEST as sufficient to mark resolved even if the
returned version string is empty; update the logic so that after calling
Github<>::GET_LATEST(repo, version, module) you verify version is non-empty
before setting resolved to true (i.e., set resolved = (Github<>::GET_LATEST(...)
&& !version.empty())), and ensure the cache write and early return only occur
when version is non-empty (do not rm/mk/write verFile or return when
version.empty(), instead let the code fall back to GET_DEFAULT_BRANCH or
propagate failure).
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
inc/maiken/git/git.hpp (1)
49-56: Consider broadening thecatchclause to handle all exceptions.The function catches only
mkn::kul::Exception, butdefaultRemoteBranch(from the externalmkn::kul::scmlibrary) may throw other exception types—such asstd::runtime_errororstd::system_errorfrom process operations,std::bad_alloc, or other standard exceptions. These would escapeGET_DEFAULT_BRANCHand interrupt the fallback loop inresolveSCMBranch(lines 88–93), preventing subsequent remotes from being tried. Since this helper functions as a probe ("did this remote work?"), catchingstd::exception(or...) and returningfalsealigns with its usage pattern.- } catch (mkn::kul::Exception const&) { + } catch (std::exception const&) { return false; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inc/maiken/git/git.hpp` around lines 49 - 56, GET_DEFAULT_BRANCH currently only catches mkn::kul::Exception so non-library exceptions (std::runtime_error, std::bad_alloc, etc.) can escape and break resolveSCMBranch's fallback loop; update the function to catch broader exceptions (e.g., catch (std::exception const&) or a catch-all ...) around the call to mkn::kul::scm::Git().defaultRemoteBranch(url) and return false on any caught exception so GET_DEFAULT_BRANCH always fails safely when any error occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@inc/maiken/git/github.hpp`:
- Line 97: The process-wide non-atomic latch _403_rate_limited currently causes
permanent skipping of GitHub requests after a single 403 and has a data race
when used concurrently; replace or scope it so it is safe and not permanent:
either change the declaration of _403_rate_limited to std::atomic<bool> and use
atomic load/store (with appropriate memory_order) in GET_DEFAULT_BRANCH,
GET_LATEST_RELEASE, GET_LATEST_TAG and any GET_LATEST callers, and add logic to
clear/reset the atomic when a subsequent successful GitHub request occurs (or
after a configured time window), or instead remove the static inline and make
the latch local to a single resolution attempt (e.g., a thread-local or
function-scoped flag) so each resolution can retry GitHub instead of permanently
falling back to git ls-remote.
- Around line 242-254: The code accesses owner[0] before ensuring owner is
non-empty, causing UB when s ends with "github.com"; change the checks so
owner.empty() is tested before indexing: after computing owner =
s.substr(s.find("github.com") + 10) immediately check if owner.empty() and
continue if so, then validate owner[0] != '/' && owner[0] != ':' and proceed
with owner.erase/substring logic (i.e., reorder the guards around the local
variables s and owner rather than indexing owner first).
- Around line 198-204: Replace the magic check and incorrect node access: in the
logic around mkn::kul::yaml::String const yaml(r.body()) (the block that
currently uses yaml.root().Type() == 3 and yaml.root()["ref"]), use
yaml.root().IsSequence() instead and iterate the sequence (e.g., for i over
yaml.root()) using indexed access yaml.root()[i] to read the correct field (such
as "tag" or "sha") from each element; alternatively, if you actually need a ref
name, call the /git/refs/tags/{tag} endpoint instead of expecting a "ref" key in
the /git/tags array. Ensure you do not use operator[] on the sequence to mutate
it and only access element fields via the element node (yaml.root()[i]["tag"] or
["sha"]).
---
Nitpick comments:
In `@inc/maiken/git/git.hpp`:
- Around line 49-56: GET_DEFAULT_BRANCH currently only catches
mkn::kul::Exception so non-library exceptions (std::runtime_error,
std::bad_alloc, etc.) can escape and break resolveSCMBranch's fallback loop;
update the function to catch broader exceptions (e.g., catch (std::exception
const&) or a catch-all ...) around the call to
mkn::kul::scm::Git().defaultRemoteBranch(url) and return false on any caught
exception so GET_DEFAULT_BRANCH always fails safely when any error occurs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0a503b78-4b54-48ab-a617-c0ec5b020a7e
📒 Files selected for processing (3)
inc/maiken/git/git.hppinc/maiken/git/github.hpptest/github/test.cpp
✅ Files skipped from review due to trivial changes (1)
- test/github/test.cpp
Summary by CodeRabbit
New Features
Refactor
Chores