Commit 3be2e68

Authored by ewilliams-cloudera, jkwatson, actions-user, mliu-cloudera, and baasitsharief
Support External Postgres DB (#276)
* ai generated tests for the evaluators functions
* don't try to look up node ids in empty vector stores
* move suggested questions under the sessions route
* fix things up for postgres db access
* formatting
* fixes for not being able to create new dbs
* only set the DB_URL if it isn't already set
* fix the install directory
* change location of .nvm and source bash from install_node
* Update release version to dev-testing
* removed unused import
* add logging for initializing the JDBI instance
* Update release version to dev-testing
* wip on ui for metadata lastFile:ui/src/pages/Settings/MetadataDBFields.tsx
* wip lastFile:llm-service/app/config.py
* update FE types to match python land
* fix margin bottom consistency
* Update release version to dev-testing
* set the username/password for the database if set from env
* drop databases lastFile:ui/src/pages/Settings/MetadataDBFields.tsx
* Update release version to dev-testing
* limit number of retries
* Update release version to dev-testing
* bumped bedrock converse and fixed a bug in tool calling check
* remove unused
* Update release version to dev-testing
* minor error handling improvement
* fixed bug with Empty Response with no documents in data source and tool calling enabled
* fix mypy issues
* add a main method to test if a db connection string is valid
* Update release version to dev-testing
* add python endpoint to test a jdbc connection string
* export the install dir so it can be used by the fastapi process
* make sure to use the right java
* pass in the db type so we can do a bare server connection
* Update release version to dev-testing
* better error handling for api proxy
* pass through error on non-502s, use 502 error instaed of 503, dont retry on 502
* Update release version to dev-testing
* more config details
* wip settings page for external metadata db lastFile:ui/src/pages/Settings/AmpSettingsPage.tsx
* fix
* drop databases lastFile:llm-service/summaries/doc_summary_index_global/graph_store.json
* wip on formatting warnings lastFile:ui/src/pages/Settings/AmpSettingsPage.tsx
* wip lastFile:ui/src/api/ampMetadataApi.ts
* wip test connection lastFile:ui/src/pages/Settings/MetadataDBFields.tsx
* update form items
* fix connection test
* use formValues
* refactor: make username and password required for JDBC connection
* improve handling for testing connection
* conditionally render test button
* fix mypy issues
* disable test button if no password or username
* Update ui/src/pages/Settings/MetadataDBFields.tsx Co-authored-by: Copilot <[email protected]>
* Update ui/src/pages/Settings/MetadataDBFields.tsx Co-authored-by: Copilot <[email protected]>
* Update release version to dev-testing
* handle clearing values for external db when switching to h2
* clear field values in ui when using h2
* Update release version to dev-testing
* refactor environment variable handling for H2 database configuration
* Update release version to dev-testing
* refactor: update H2 database URL to use absolute path
* refactor: change metadata_db_provider comparison to string literal for H2
* refactor: fix comparison operator for metadata_db_provider in H2 check
* Update release version to dev-testing
* refactor: remove DB_URL, DB_USERNAME, and DB_PASSWORD from environment variables for H2
* Update release version to dev-testing
* refactor: update config_to_env to use Optional for environment variable values
* refactor: change config_to_env to return non-optional environment variable values
* Update release version to dev-testing
* refactor: update DB_URL retrieval to use a fallback value for H2 configuration
* refactor: streamline JDBC configuration for H2 by using a default DB_URL and bypassing validation
* refactor: improve validation message handling and remove messageQueue for retry
* Update release version to dev-testing
* refactor: enhance input validation for JDBC URL, username, and password in H2 configuration
* Vite dev changes to maybe address import error in dev, switch to using default export for NotFoundComponent.tsx
* Update release version to dev-testing
* test config change
* Update release version to dev-testing
* title change
* remove restriction on username

---------

Co-authored-by: jwatson <[email protected]>
Co-authored-by: actions-user <[email protected]>
Co-authored-by: Michael Liu <[email protected]>
Co-authored-by: Baasit Sharief <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent 047e585 commit 3be2e68

50 files changed: +3097 -1476 lines changed

backend/src/main/java/com/cloudera/cai/rag/configuration/JdbiConfiguration.java

Lines changed: 12 additions & 2 deletions
@@ -66,6 +66,7 @@ private static Jdbi createJdbi() {
     if (jdbi == null) {
       synchronized (LOCK) {
         if (jdbi == null) {
+          log.info("Initializing new Jdbi instance");
           jdbi = Jdbi.create(createDataSource());
         }
       }
@@ -92,10 +93,19 @@ private static Migrator migrator(DataSource dataSource, RdbConfig dbConfig) {
   private static DatabaseConfig createDatabaseConfig() {
     String dbUrl = System.getenv().getOrDefault("DB_URL", "jdbc:h2:mem:rag");
     String rdbType = System.getenv().getOrDefault("DB_TYPE", RdbConfig.H2_DB_TYPE);
+    String password = System.getenv().get("DB_PASSWORD");
+    String username = System.getenv().get("DB_USERNAME");
     RdbConfig rdbConfiguration =
-        RdbConfig.builder().rdbUrl(dbUrl).rdbType(rdbType).rdbDatabaseName("rag").build();
+        RdbConfig.builder()
+            .rdbUrl(dbUrl)
+            .rdbType(rdbType)
+            .rdbDatabaseName("rag")
+            .rdbUsername(username)
+            .rdbPassword(password)
+            .build();
     if (rdbConfiguration.isPostgres()) {
-      rdbConfiguration = rdbConfiguration.toBuilder().rdbUsername("postgres").build();
+      rdbConfiguration =
+          rdbConfiguration.toBuilder().rdbUsername("postgres").rdbDatabaseName(null).build();
     }
     return DatabaseConfig.builder().RdbConfiguration(rdbConfiguration).build();
   }
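The environment handling in `createDatabaseConfig` can be restated as a small Python sketch. This is an illustration only, not code from the commit: `RdbSettings` and `resolve_rdb_settings` are hypothetical names, the value `"H2"` stands in for `RdbConfig.H2_DB_TYPE`, and the case-insensitive comparison is an assumption about how `isPostgres()` behaves.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: stands in for RdbConfig.H2_DB_TYPE, whose value is not shown in the diff.
H2_DB_TYPE = "H2"


@dataclass(frozen=True)
class RdbSettings:
    """Hypothetical stand-in for the Java RdbConfig builder result."""

    url: str
    db_type: str
    database_name: Optional[str]
    username: Optional[str]
    password: Optional[str]


def resolve_rdb_settings(env: Mapping[str, str] = os.environ) -> RdbSettings:
    # DB_URL and DB_TYPE fall back to an in-memory H2 database.
    url = env.get("DB_URL", "jdbc:h2:mem:rag")
    db_type = env.get("DB_TYPE", H2_DB_TYPE)
    # Credentials are optional and stay None when unset.
    username = env.get("DB_USERNAME")
    password = env.get("DB_PASSWORD")
    database_name: Optional[str] = "rag"
    # Assumption: isPostgres() is modeled as a case-insensitive type check.
    if db_type.lower() == "postgresql":
        # As in the diff, the Postgres branch sets the username to "postgres"
        # and clears the database name so it is derived from the URL later.
        username = "postgres"
        database_name = None
    return RdbSettings(url, db_type, database_name, username, password)
```

Note that, as written in the diff, the Postgres branch replaces whatever `DB_USERNAME` provided, while `DB_PASSWORD` is passed through unchanged.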

backend/src/main/java/com/cloudera/cai/util/db/JdbiUtils.java

Lines changed: 43 additions & 1 deletion
@@ -1,4 +1,4 @@
-/*******************************************************************************
+/*
  * CLOUDERA APPLIED MACHINE LEARNING PROTOTYPE (AMP)
  * (C) Cloudera, Inc. 2024
  * All rights reserved.
@@ -97,4 +97,46 @@ public static void createDBIfNotExists(RdbConfig rdbConfig) throws SQLException
       }
     }
   }
+
+  /**
+   * A utility class to test database connectivity using JDBC.
+   *
+   * <p>Run this with: java -cp prebuilt_artifacts/rag-api.jar
+   * -Dloader.main=com.cloudera.cai.util.db.JdbiUtils
+   * org.springframework.boot.loader.launch.PropertiesLauncher <jdbc_url> <username> <password>
+   *
+   * <p>An exit code of 0 indicates success, 1 indicates failure, and 2 indicates incorrect usage.
+   */
+  public static void main(String[] args) {
+    if (args.length != 4) {
+      System.err.println("Usage: JdbiUtils <db_url> <username> <password> <db_type>");
+      System.exit(2); // Incorrect usage
+    }
+    String dbUrl = args[0];
+    String username = args[1];
+    String password = args[2];
+    String dbType = args[3];
+    RdbConfig rdbConfiguration =
+        RdbConfig.builder()
+            .rdbUrl(dbUrl)
+            .rdbType(dbType)
+            .rdbDatabaseName("rag")
+            .rdbUsername(username)
+            .rdbPassword(password)
+            .build();
+    var connectionString = RdbConfig.buildDatabaseServerConnectionString(rdbConfiguration);
+    try (Connection connection =
+        DriverManager.getConnection(connectionString, username, password)) {
+      if (connection != null && !connection.isClosed()) {
+        System.out.println("Connection successful.");
+        System.exit(0); // Success
+      } else {
+        System.err.println("Connection failed: Connection is null or closed.");
+        System.exit(1); // Failure
+      }
+    } catch (Exception e) {
+      System.err.println("Connection failed: " + e.getMessage());
+      System.exit(1); // Failure
+    }
+  }
 }
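The exit-code contract of this `main` method (0 success, 1 failure, 2 incorrect usage) is what a caller such as the Python service would need to interpret. The sketch below shows one possible way to build the command line and map the result. It is not the commit's `validate_jdbc` implementation, which is not shown here: `ValidationOutcome`, `build_validator_command`, and `interpret_exit` are hypothetical names. The command passes four arguments because the `main` method checks `args.length != 4`, even though the Javadoc lists only three.

```python
from typing import List, NamedTuple


class ValidationOutcome(NamedTuple):
    """Hypothetical result type; the real ValidationResult is not shown in the diff."""

    valid: bool
    message: str


def build_validator_command(
    jar_path: str, db_url: str, username: str, password: str, db_type: str
) -> List[str]:
    # Mirrors the invocation from the Javadoc, plus the db_type argument
    # that main() requires as its fourth parameter.
    return [
        "java",
        "-cp",
        jar_path,
        "-Dloader.main=com.cloudera.cai.util.db.JdbiUtils",
        "org.springframework.boot.loader.launch.PropertiesLauncher",
        db_url,
        username,
        password,
        db_type,
    ]


def interpret_exit(returncode: int, stdout: str, stderr: str) -> ValidationOutcome:
    # 0 = connection succeeded, 2 = wrong argument count, anything else = failure.
    if returncode == 0:
        return ValidationOutcome(True, stdout.strip() or "Connection successful.")
    if returncode == 2:
        return ValidationOutcome(False, "Incorrect usage: " + stderr.strip())
    return ValidationOutcome(False, stderr.strip() or "Connection failed.")
```

A caller could pass the list from `build_validator_command` to `subprocess.run(..., capture_output=True, text=True)` and feed the return code and captured streams to `interpret_exit`. Passing the password as a process argument makes it visible in process listings, which is a trade-off of this approach.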

backend/src/main/java/com/cloudera/cai/util/db/RdbConfig.java

Lines changed: 24 additions & 11 deletions
@@ -123,7 +123,7 @@ public static String buildDatabaseConnectionString(RdbConfig rdb) {
       return adjustMsSqlRdbUrl(rdb.rdbUrl) + ";databaseName=" + rdb.getRdbDatabaseName();
     }
     if (rdb.isPostgres()) {
-      return rdb.rdbUrl + "/" + rdb.getRdbDatabaseName();
+      return rdb.rdbUrl;
     }

     final var url =
@@ -153,7 +153,15 @@ public static String buildDatabaseServerConnectionString(RdbConfig rdb) {
     }

     if (rdb.isPostgres()) {
-      return rdb.rdbUrl + "/" + rdb.getRdbDatabaseName();
+      var pattern =
+          Pattern.compile("^jdbc:postgresql:(//[^/]+/)?(\\w+)(.*)", Pattern.CASE_INSENSITIVE);
+      var matcher = pattern.matcher(rdb.rdbUrl);
+      if (!matcher.matches()) {
+        throw new IllegalStateException("URL doesn't match the expected regex");
+      }
+      var firstPart = matcher.group(1);
+      var lastPart = matcher.group(3);
+      return "jdbc:postgresql:" + firstPart + rdb.getRdbDatabaseName() + lastPart;
     }
     final var url =
         rdb.rdbUrl
@@ -181,7 +189,11 @@ private static String adjustMsSqlRdbUrl(String rdbUrl) {
   public static String buildDatabaseName(RdbConfig rdb) {
     String dbName = rdb.getRdbDatabaseName();
     if (rdb.getDbConnectionUrl() != null) {
-      dbName = getDBNameFromDBConnectionURL(rdb);
+      dbName = getDBNameFromDBConnectionURL(rdb, rdb.dbConnectionUrl);
+    }
+
+    if (dbName == null) {
+      dbName = getDBNameFromDBConnectionURL(rdb, rdb.rdbUrl);
     }

     if (dbName.contains("-")) {
@@ -195,25 +207,26 @@ public static String buildDatabaseName(RdbConfig rdb) {
     return dbName;
   }

-  private static String getDBNameFromDBConnectionURL(RdbConfig rdb) {
+  private static String getDBNameFromDBConnectionURL(RdbConfig rdb, String url) {
     String regex;
     if (rdb.isMssql()) {
       // Regex reference: https://regex101.com/r/yaU0DY/1
       regex = ";databaseName=([^;]*)";
-    } else {
+    } else if (rdb.isMysql()) {
       regex = "^jdbc:mysql:(?://[^/]+/)?(\\w+)";
+    } else if (rdb.isPostgres()) {
+      regex = "^jdbc:postgresql:(?://[^/]+/)?(\\w+)";
+    } else {
+      throw new IllegalStateException(
+          "database url parsing not supported for db type: " + rdb.rdbType);
     }
     Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
     var dbName =
-        pattern
-            .matcher(rdb.dbConnectionUrl)
-            .results()
-            .map(mr -> mr.group(1))
-            .collect(Collectors.joining());
+        pattern.matcher(url).results().map(mr -> mr.group(1)).collect(Collectors.joining());

     if (dbName.isEmpty()) {
       throw new InvalidDbConfigException(
-          rdb.dbConnectionUrl, "Database name not found in the database connection URL");
+          url, "Database name not found in the database connection URL");
     }
     return dbName;
   }
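The two Postgres regexes in this diff can be tried out in a few lines of Python. The following is a sketch that reuses the same patterns with Python's `re` module, not a port of `RdbConfig`: the function names are hypothetical, and `re.ASCII` is added on the assumption that it approximates Java's default ASCII-only `\w`. Unlike the Java string concatenation, the sketch substitutes an empty string when the optional host group does not participate in the match.

```python
import re

# Same patterns as in the diff; re.ASCII is an assumption to approximate Java's \w.
_SERVER_URL = re.compile(
    r"^jdbc:postgresql:(//[^/]+/)?(\w+)(.*)", re.IGNORECASE | re.ASCII
)
_DB_NAME = re.compile(r"^jdbc:postgresql:(?://[^/]+/)?(\w+)", re.IGNORECASE | re.ASCII)


def with_database_name(url: str, database_name: str) -> str:
    """Swap the database segment of a Postgres JDBC URL, keeping host and query."""
    match = _SERVER_URL.fullmatch(url)  # fullmatch mirrors Java's Matcher.matches()
    if match is None:
        raise ValueError("URL doesn't match the expected regex")
    host_part = match.group(1) or ""  # None when the URL has no //host/ section
    return "jdbc:postgresql:" + host_part + database_name + match.group(3)


def database_name_from_url(url: str) -> str:
    """Extract the database name that follows the optional //host:port/ section."""
    match = _DB_NAME.search(url)
    if match is None:
        raise ValueError("Database name not found in the database connection URL")
    return match.group(1)
```

With these patterns a URL such as `jdbc:postgresql://db:5432`, which has no database segment, matches neither regex, so a `DB_URL` used with this code has to carry a database name.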

backend/src/main/resources/migrations/postgres/25_alter_datasource_config.up.sql

Lines changed: 1 addition & 1 deletion
@@ -38,6 +38,6 @@

 BEGIN;

-ALTER TABLE rag_data_source ALTER COLUMN chunk_overlap_percent INTEGER DEFAULT 10;
+ALTER TABLE rag_data_source ALTER COLUMN chunk_overlap_percent SET DEFAULT 10;

 COMMIT;
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+/*
+ * CLOUDERA APPLIED MACHINE LEARNING PROTOTYPE (AMP)
+ * (C) Cloudera, Inc. 2025
+ * All rights reserved.
+ *
+ * Applicable Open Source License: Apache 2.0
+ *
+ * NOTE: Cloudera open source products are modular software products
+ * made up of hundreds of individual components, each of which was
+ * individually copyrighted. Each Cloudera open source product is a
+ * collective work under U.S. Copyright Law. Your license to use the
+ * collective work is as provided in your written agreement with
+ * Cloudera. Used apart from the collective work, this file is
+ * licensed for your use pursuant to the open source license
+ * identified above.
+ *
+ * This code is provided to you pursuant a written agreement with
+ * (i) Cloudera, Inc. or (ii) a third-party authorized to distribute
+ * this code. If you do not have a written agreement with Cloudera nor
+ * with an authorized and properly licensed third party, you do not
+ * have any rights to access nor to use this code.
+ *
+ * Absent a written agreement with Cloudera, Inc. ("Cloudera") to the
+ * contrary, A) CLOUDERA PROVIDES THIS CODE TO YOU WITHOUT WARRANTIES OF ANY
+ * KIND; (B) CLOUDERA DISCLAIMS ANY AND ALL EXPRESS AND IMPLIED
+ * WARRANTIES WITH RESPECT TO THIS CODE, INCLUDING BUT NOT LIMITED TO
+ * IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY AND
+ * FITNESS FOR A PARTICULAR PURPOSE; (C) CLOUDERA IS NOT LIABLE TO YOU,
+ * AND WILL NOT DEFEND, INDEMNIFY, NOR HOLD YOU HARMLESS FOR ANY CLAIMS
+ * ARISING FROM OR RELATED TO THE CODE; AND (D)WITH RESPECT TO YOUR EXERCISE
+ * OF ANY RIGHTS GRANTED TO YOU FOR THE CODE, CLOUDERA IS NOT LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, PUNITIVE OR
+ * CONSEQUENTIAL DAMAGES INCLUDING, BUT NOT LIMITED TO, DAMAGES
+ * RELATED TO LOST REVENUE, LOST PROFITS, LOSS OF INCOME, LOSS OF
+ * BUSINESS ADVANTAGE OR UNAVAILABILITY, OR LOSS OR CORRUPTION OF
+ * DATA.
+ */
+
+package com.cloudera.cai.util.db;
+
+import static org.assertj.core.api.Assertions.assertThat;
+import static org.junit.jupiter.api.Assertions.*;
+
+import org.junit.jupiter.api.Test;
+class RdbConfigTest {
+
+  @Test
+  void buildDatabaseConnectionString() {
+
+    var url =
+        "jdbc:postgresql://rag-dev-testing.cluster.us-west-2.rds.amazonaws.com:5432/rag?username=foo&password=bar";
+
+    var rdb =
+        RdbConfig.builder().rdbUrl(url).rdbDatabaseName("postgres").rdbType("PostgreSQL").build();
+    var result = RdbConfig.buildDatabaseServerConnectionString(rdb);
+    assertThat(result)
+        .isEqualTo(
+            "jdbc:postgresql://rag-dev-testing.cluster.us-west-2.rds.amazonaws.com:5432/postgres?username=foo&password=bar");
+  }
+}

docker-compose.yaml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ services:
       - "9464:9464"
     environment:
       - API_HOST=0.0.0.0
-      - DB_URL=jdbc:postgresql://db:5432
+      - DB_URL=jdbc:postgresql://db:5432/rag
       - DB_TYPE=PostgreSQL
       - OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4318
       - OTEL_METRICS_EXPORTER=none # we configure this by hand

llm-service/app/config.py

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@
 SummaryStorageProviderType = Literal["Local", "S3"]
 ChatStoreProviderType = Literal["Local", "S3"]
 VectorDbProviderType = Literal["QDRANT", "OPENSEARCH"]
+MetadataDbProviderType = Literal["H2", "PostgreSQL"]


 class _Settings:
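A `Literal` alias like `MetadataDbProviderType` is only enforced by static type checkers, so a value read from the environment still needs a runtime check somewhere. One possible way to do that is sketched below. `parse_metadata_db_provider` is a hypothetical helper, not a function from this commit.

```python
from typing import Literal, get_args

MetadataDbProviderType = Literal["H2", "PostgreSQL"]


def parse_metadata_db_provider(value: str) -> MetadataDbProviderType:
    """Hypothetical helper: reject strings outside the Literal's allowed values."""
    allowed = get_args(MetadataDbProviderType)  # ("H2", "PostgreSQL")
    if value not in allowed:
        raise ValueError(f"unsupported metadata DB provider {value!r}; expected one of {allowed}")
    return value  # type: ignore[return-value]
```

The comparison is case-sensitive, so `"postgresql"` would be rejected while `"PostgreSQL"` passes.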

llm-service/app/routers/index/__init__.py

Lines changed: 1 addition & 2 deletions
@@ -46,16 +46,15 @@
 from . import amp_metadata
 from . import models
 from . import metrics
-from . import chat

 logger = logging.getLogger(__name__)


 router = APIRouter()
-router.include_router(chat.router)
 router.include_router(summaries.router)
 router.include_router(data_source.router)
 router.include_router(sessions.router)
+router.include_router(sessions.no_id_router)
 router.include_router(amp_metadata.router)
 # include this for legacy UI calls
 router.include_router(amp_metadata.router, prefix="/index", deprecated=True)

llm-service/app/routers/index/amp_metadata/__init__.py

Lines changed: 21 additions & 0 deletions
@@ -46,6 +46,7 @@
 from fastapi.params import Header

 from .... import exceptions
+from ....config import MetadataDbProviderType
 from ....services.amp_metadata import (
     ProjectConfig,
     ProjectConfigPlus,
@@ -54,6 +55,8 @@
     update_project_environment,
     get_project_environment,
     get_application_config,
+    validate_jdbc,
+    ValidationResult,
 )
 from ....services.amp_update import does_amp_need_updating
 from ....services.models.providers import CAIIModelProvider
@@ -198,6 +201,24 @@ def save_auth_token(auth_token: Annotated[str, Body(embed=True)]) -> str:
     return "Auth token saved successfully"


+@router.post(
+    "/validate-jdbc-connection",
+    summary="Validates a JDBC connection string, username, and password.",
+)
+@exceptions.propagates
+def validate_jdbc_connection(
+    db_url: Annotated[str, Body(embed=True)],
+    username: Annotated[str, Body(embed=True)],
+    password: Annotated[str, Body(embed=True)],
+    db_type: Annotated[MetadataDbProviderType, Body(embed=True)],
+) -> ValidationResult:
+    """
+    Calls the JdbiUtils main method to validate JDBC connection parameters.
+    Returns a dict with 'valid': True/False and 'message'.
+    """
+    return validate_jdbc(db_type, db_url, password, username)
+
+
 def save_cdp_token(auth_token: str) -> None:
     token_data = {"access_token": auth_token}
     with open("cdp_token", "w") as file:
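Because each parameter of `validate_jdbc_connection` is declared with `Body(embed=True)`, FastAPI expects a single JSON object whose keys are the parameter names. The sketch below builds such a request body and reads a response shaped like the docstring describes. `build_validation_request` and `is_valid_response` are hypothetical helpers, and the full route prefix is not shown in the diff, so no URL is assumed.

```python
import json
from typing import Any, Dict


def build_validation_request(
    db_url: str, username: str, password: str, db_type: str
) -> Dict[str, str]:
    # Keys match the endpoint's parameter names one-to-one.
    return {
        "db_url": db_url,
        "username": username,
        "password": password,
        "db_type": db_type,  # must be "H2" or "PostgreSQL" per MetadataDbProviderType
    }


def is_valid_response(response_body: str) -> bool:
    # Assumption: the response carries 'valid' and 'message', as the docstring states.
    payload: Dict[str, Any] = json.loads(response_body)
    return bool(payload.get("valid", False))


request_json = json.dumps(
    build_validation_request("jdbc:postgresql://db:5432/rag", "postgres", "secret", "PostgreSQL")
)
```

The resulting `request_json` string is what a client would send as the POST body with a `Content-Type: application/json` header.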

llm-service/app/routers/index/chat/__init__.py

Lines changed: 0 additions & 69 deletions
This file was deleted.
