feat(skills): mongodb-connection MCP-419#5
Conversation
rozza
left a comment
Added a note about the Java driver.
Pull request overview
Adds a new mongodb-connection skill aimed at providing context-aware guidance for MongoDB client connection configuration and troubleshooting across multiple driver languages.
Changes:
- Introduces the core skill prompt/instructions for context-first MongoDB connection configuration.
- Adds reference material for monitoring/metrics interpretation and diagnosing connection churn.
- Adds language-specific connection-pattern guidance and curated links to official MongoDB/driver documentation.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| skills/mongodb-connection/SKILL.md | Core skill definition and workflow for context-driven MongoDB connection configuration and troubleshooting. |
| skills/mongodb-connection/references/monitoring-guide.md | Detailed monitoring/metrics guide for pool health and operational diagnosis. |
| skills/mongodb-connection/references/language-patterns.md | Language-by-language driver patterns, defaults, and best practices. |
| skills/mongodb-connection/references/external-links.md | Centralized external documentation links for infrastructure, driver options, and monitoring docs. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
RaschidJFR
left a comment
@rozza Can you help me verify some sections I recently updated? I tagged you in the comments.
rozza
left a comment
I think it looks really good.
Great catch on the Node default pool size - seems the models have out-of-date data.
Just a couple of feedback comments.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
dacharyc
left a comment
Hey @RaschidJFR - back to you with what I've gotten through so far. I've spent a couple of hours with this, and my feedback is mainly about a lack of specificity throughout. I've done a quick skim through the rest of monitoring-guide.md, and if I continued to review, my comments would be more of the same.
What we have here seems to look useful on the surface, but it's unclear to me if it provides enough information to effectively improve agent outputs. I'd love to learn more about how you identified the areas of focus for this skill, and what types of performance differences you observed when testing without the skill and with the skill.
**These are reference templates—adapt them to the user's specific context from Phase 1.** Each scenario below applies when the user described that environment during context gathering.

**Language-specific implementations**: For Python, Java, Go, C#, Ruby, or PHP, see `references/language-patterns.md` for complete code examples and driver-specific patterns.
Looks like there aren't any code examples in the referenced file, so we may want to remove this reference to them.
Suggested change:
- **Language-specific implementations**: For Python, Java, Go, C#, Ruby, or PHP, see `references/language-patterns.md` for complete code examples and driver-specific patterns.
+ **Language-specific implementations**: For Python, Java, Go, C#, Ruby, or PHP, see `references/language-patterns.md` for driver-specific patterns.
## Language-Specific Considerations

Configuration examples above are Node.js-based. For Python, Java, Go, C#, Ruby, or PHP: consult `references/language-patterns.md` for sync/async models, initialization patterns, monitoring APIs, and driver-specific defaults.
Since we're not actually showing any examples, we may want to avoid using that term here.
Also, can we be specific about which things above are Node.js-based? Is it the parameter names we're providing, or availability/implementation details in each Driver, or something else? I'm seeing Driver-specific patterns in the referenced file, but nothing like what we're showing above, so I'm having trouble finding the connection between the things above that might be Node.js-based and their analogs in the language-patterns.md file for the other Drivers.
Also, we mention here that users will find driver-specific defaults, but the only default supplied in the referenced file is the 100-connection maxPoolSize. Since that's the same across all drivers, it seems misleading to characterize info in the related file as providing "driver-specific defaults."
We also say "monitoring APIs" here, but this is the only monitoring-related content in the referenced file:
### Monitoring Access
Most drivers provide:
- **Event listeners**: Subscribe to connection pool events
- **Statistics APIs**: Query current pool state
- **Logging**: Enable debug logging for troubleshooting
I wouldn't characterize that as "monitoring APIs", nor does it seem particularly helpful or to cover anything beyond what's probably already in the LLM's base training data.
Suggested change:
- Configuration examples above are Node.js-based. For Python, Java, Go, C#, Ruby, or PHP: consult `references/language-patterns.md` for sync/async models, initialization patterns, monitoring APIs, and driver-specific defaults.
+ Configuration scenarios above are Node.js-based. For Python, Java, Go, C#, Ruby, or PHP: consult `references/language-patterns.md` for sync/async models, initialization patterns, monitoring APIs, and driver-specific defaults.
- **Pool metrics**: Connections in use? Wait queue?
- **Connectivity test**: Connects via mongo shell from same environment?

Ask follow-up questions if responses are vague.
Can we provide more clarity here for what the agent should consider "vague" and what types of follow up questions to ask? For example, something like this: "If the user does not specify deployment type, concurrency level, or workload pattern, ask for those details before proceeding."
In other words, what is the minimum information an agent needs to proceed past this step? We need to make it clear what's required and how to elicit relevant details.
Analyze whether this is a client config issue or infrastructure problem.

**Infrastructure Issues (Out of Scope)** - redirect appropriately:
Maybe I am missing it somewhere in this PR, but we're instructing agents to analyze whether this is a client config issue or infrastructure problem, and not giving agents any details about how to identify infrastructure issues. Can we provide a concrete decision tree or diagnostic sequence to help agents make this determination?
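As a sketch of what such a decision tree could look like in the skill (the error-message patterns below are illustrative assumptions, not a complete taxonomy):

```javascript
// Hypothetical triage helper: maps common connection-error text to either
// an infrastructure area (out of scope) or a client-config area (in scope).
// The patterns are illustrative, not exhaustive.
function classifyConnectionIssue(errorMessage) {
  const infra = [
    { pattern: /ENOTFOUND|EAI_AGAIN|querySrv/i, area: 'DNS' },
    { pattern: /ECONNREFUSED|EHOSTUNREACH/i, area: 'network/VPC' },
    { pattern: /access list|whitelist/i, area: 'IP access list' },
  ];
  for (const { pattern, area } of infra) {
    if (pattern.test(errorMessage)) return { scope: 'infrastructure', area };
  }
  if (/wait queue|pool cleared|checking out a connection/i.test(errorMessage)) {
    return { scope: 'client-config', area: 'pool sizing/timeouts' };
  }
  return { scope: 'unknown', area: null };
}

console.log(classifyConnectionIssue('getaddrinfo ENOTFOUND cluster0.mongodb.net'));
// { scope: 'infrastructure', area: 'DNS' }
```

Even expressed as prose rather than code, a mapping like this would give the agent a concrete basis for the client-vs-infrastructure call.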
**Client Configuration Issues (Your Territory)**:
- Pool exhaustion, inappropriate timeouts, poor reuse patterns, suboptimal sizing, missing serverless caching, connection churn

When identifying infrastructure issues, explain: "This appears to be a [DNS/VPC/IP] issue rather than client config. It's outside the scope of the client configuration skill, but here's how to resolve: [guidance/docs link]."
If we intend an agent to provide docs links or guidance, we need to give the agent that info to pass along. Can we add relevant guidance or links to docs where agents can find the info to pass to the user?
**Healthy pattern**: Fluctuates with application traffic while maintaining headroom. Should correlate with request volume.

**Action thresholds**:
- **Sustained >80% of maxPoolSize**: Increase `maxPoolSize` by 20-30%
These thresholds are specific and that's good! But we could provide some more details, like:
- Sustained over what period of time?
- Consistently 100% over what period of time? At all? Over 5 minutes? Over an hour?
- When we say "high wait queue times" - what constitutes "high?" What should the agent expect wait queue times to be when the connection pool is healthy?
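For illustration, here is one way the threshold could be pinned down. The 5-minute sustained window and the 25% growth factor (the midpoint of the skill's 20-30%) are assumptions chosen for this sketch, not values from the skill:

```javascript
// Hypothetical helper making the ">80% sustained" rule concrete.
// Assumptions for illustration: a 5-minute window, and a 25% increase.
function recommendPoolSize(samples, maxPoolSize, windowMinutes = 5) {
  // samples: [{ minute, connectionsInUse }] for the most recent minutes
  const recent = samples.slice(-windowMinutes);
  const sustainedAbove80 =
    recent.length === windowMinutes &&
    recent.every(s => s.connectionsInUse > 0.8 * maxPoolSize);
  if (!sustainedAbove80) return maxPoolSize; // healthy headroom, no change
  // "Increase by 20-30%" -> midpoint of 25%, rounded up.
  return Math.ceil(maxPoolSize * 1.25);
}

const samples = [85, 90, 88, 92, 95].map((v, i) => ({ minute: i, connectionsInUse: v }));
console.log(recommendPoolSize(samples, 100)); // 125
```

Spelling the rule out like this also forces answers to the open questions above (window length, what counts as "sustained").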
- **High percentage with high wait queue times**: Clear sign of undersized pool

**Diagnosis questions**:
- Does it correlate with traffic spikes?
Diagnostic questions are good! But can we tie these back to specific mitigations?
---

### Connections Available (Idle)
Maybe this is a silly question, but I'm asking since I'm not familiar with how we normally communicate about these things. Is "available connections" really different than "connections in use?" We're talking about them like they're different things here, but it seems to me like these are both facets of a "connection usage" bucket, and you can derive this as a corollary of the section above.
I'm asking because LLMs have a tendency to attribute to or intuit meaning from distinctions like this, and if there really isn't a distinction, we may accidentally convince one that there is - i.e. that it should take different actions for the same underlying problem, not enough connections in the pool.
In other words, if these are really just different sides of the same thing, we can create cognitive load on the part of the agent by presenting them as though they're different, and that can have downstream consequences on agent performance and whether agents take the correct actions.
### Wait Queue Size

**What it is**: The number of operations currently waiting for an available connection because the pool is at capacity.
Apologies again if this is a silly q, but I had a quick look around the docs, and it wasn't obvious to me if there is a way to know the size of the wait queue. We're speaking here like there is, but I didn't spot anything obvious. Is this something we need to provide instructions for?
## Server-Level Metrics (MongoDB-Side)

Server-side metrics provide the MongoDB server's perspective on connection usage. Access via:
These seem like three different abstractions, and it's unclear to me what we expect the agent to do with this info.
I don't think there's a way for agents to access MongoDB Atlas monitoring dashboards (if there is, we should provide details about it), and it's unclear to me whether agents have a way to understand whether an integration with a monitoring platform is available or how to interface with it. That leaves db.adminCommand() in the agent's purview, but as far as I know, from the Driver side, executing admin commands is a one-time operation - not a listener that would facilitate monitoring. So if there is a monitoring pattern we expect agents to use, we probably need to provide details about what that should be.
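If the skill keeps the `db.adminCommand()` route, a short mongosh sketch could make the expectation concrete. The `serverStatus().connections` fields below are standard server output, but framing it as a one-shot inspection (not a listener) is the point:

```javascript
// Run in mongosh (not a standalone script): a one-shot, point-in-time
// inspection of the server-side connection view - not continuous monitoring.
const conn = db.serverStatus().connections;
printjson({
  current: conn.current,            // connections currently open to this server
  available: conn.available,        // remaining capacity before the server's limit
  totalCreated: conn.totalCreated   // cumulative since startup; rapid growth suggests churn
});
```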
MongoDB Agent Skill Submission
Skill Information
Skill Name: mongodb-connection
Skill Directory: skills/mongodb-connection
Use Case
MongoDB connection configuration is deceptively complex. Developers often copy connection code from tutorials or Stack Overflow without understanding how pool sizes, timeouts, and connection patterns affect their specific environment. This leads to common production issues like connection pool exhaustion in serverless functions, timeout errors under load, and performance degradation in high-traffic applications. This skill addresses these challenges by providing context-aware guidance for MongoDB connection management across all officially supported driver languages.
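The serverless pool-exhaustion issue mentioned above is typically solved by caching one client at module scope and reusing it across warm invocations. A minimal sketch, with a hypothetical `createClient` factory standing in for `new MongoClient(uri).connect()` so it runs without the driver installed:

```javascript
// Serverless connection-reuse pattern (sketch). `createClient` is a
// hypothetical factory standing in for the real driver call.
let cachedClientPromise = null;

function getClient(createClient) {
  // Reuse one client (and its pool) across warm Lambda invocations;
  // creating a fresh client per invocation causes connection churn.
  if (!cachedClientPromise) {
    cachedClientPromise = createClient();
  }
  return cachedClientPromise;
}

// Demo with a counting factory in place of the driver:
let created = 0;
const factory = async () => ({ id: ++created });

async function demo() {
  const a = await getClient(factory);
  const b = await getClient(factory); // warm invocation: same client reused
  console.log(created, a === b); // 1 true
}
demo();
```

Caching the promise (rather than the resolved client) also deduplicates concurrent cold-start calls.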
Value Proposition
This skill prevents costly misconfigurations by enforcing the principle of "context before configuration." Instead of providing arbitrary parameter values, it asks about your environment (serverless, long-running, traffic patterns) and provides tailored recommendations. This approach:
- **Saves debugging time** - Catch connection issues before they reach production
- **Improves performance** - Right-sized pools and timeouts for your use case
- **Multi-language support** - Works with Node.js, Python, Java, Go, C#, Ruby, PHP, and more
- **Production-ready patterns** - Optimized for Lambda, Express APIs, and high-traffic applications
- **Educational** - Learn MongoDB connection best practices while building
Special Considerations
None.
Validation Prompts
Author Self-Validation
`skill-validator` run locally
SME Review
SME: @rozza
Additional Context
Initial results look promising. More comprehensive testing is required. ~~Do not merge yet!~~ Ready to merge.