Cube to looker ml #12

Merged
merged 4 commits into from
Jul 14, 2025
70 changes: 70 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,70 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

lkml2cube is a Python CLI tool that converts LookML models into Cube data models. It uses the `lkml` library to parse LookML files and generates YAML-based Cube definitions.

## Development Commands

### Environment Setup
- This project uses PDM for dependency management
- Install dependencies: `pdm install`
- Run tests: `pdm run pytest` or `pytest`

### Testing
- Tests are located in the `tests/` directory
- Main test file: `tests/test_e2e.py`
- Test samples are in `tests/samples/` with both `lkml/` and `cubeml/` subdirectories
- Tests compare generated output against expected YAML files

### CLI Usage
The tool provides three main commands:
- `lkml2cube cubes` - Converts LookML views to Cube definitions (cubes only)
- `lkml2cube views` - Converts LookML explores to Cube definitions (cubes + views)
- `lkml2cube explores` - Generates LookML views and explores from the Cube meta API (Cube cubes → LookML views, Cube views → LookML explores)

## Architecture

### Core Components

#### Parser Module (`lkml2cube/parser/`)
- `loader.py` - File loading, writing, and summary utilities (includes LookML generation)
- `views.py` - Converts LookML views to Cube definitions
- `explores.py` - Handles explore parsing and join generation
- `cube_api.py` - Fetches the Cube meta API and separates Cube cubes from Cube views
- `types.py` - Custom YAML types for proper formatting

#### Main Entry Point
- `main.py` - Typer-based CLI with three commands: cubes, views, explores
- Uses Rich for console output formatting

### Key Concepts
- **Cubes vs Views**: The `cubes` command only generates Cube model definitions, while `views` creates both cubes and views with join relationships
- **Explores**: LookML explores define join relationships equivalent to Cube's view definitions
- **Include Resolution**: Uses `--rootdir` parameter to resolve LookML `include:` statements
- **Cube API Mapping**:
  - Cube cubes (with `sql_table`/`sql`) → LookML views
  - Cube views (members carrying `aliasMember` references) → LookML explores with join definitions
- **LookML Enhancement**: Generates LookML with includes, joins, primary keys, and drill fields. Join conditions are currently placeholders of the form `${primary.id} = ${joined.id}` and should be reviewed before use
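
The include resolution mentioned above can be pictured with a small sketch. This is an illustration only, not lkml2cube's actual loader: the helper name `resolve_include` is hypothetical, and it assumes an `include:` value is a glob pattern interpreted relative to the directory passed as `--rootdir`.

```python
from pathlib import Path
from typing import List


def resolve_include(rootdir: str, pattern: str) -> List[str]:
    """Hypothetical helper: expand a LookML include pattern under rootdir."""
    # Assumption: a leading "/" means "project root", so strip it and
    # join the pattern onto rootdir instead of the filesystem root.
    relative = pattern.lstrip("/")
    # Sort the matches so the result is stable across platforms.
    return sorted(str(path) for path in Path(rootdir).glob(relative))
```

With this sketch, `resolve_include("my_project", "/views/*.view.lkml")` would return every matching file under `my_project/views/`.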

### File Structure
- `examples/` - Contains sample output files (cubes and views)
- `tests/samples/` - Test fixtures with both LookML input and expected Cube output
- `lkml2cube/` - Main source code
- `dist/` - Built distribution files

## Development Notes

### YAML Formatting
The tool uses custom YAML representers for proper formatting:
- `folded_unicode` and `literal_unicode` types for multi-line strings
- Configured in `main.py` with `yaml.add_representer()`

### CLI Options
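Options (not every command accepts every option):
- `--parseonly` - Shows the parsed input (LookML, or the Cube meta response for `explores`) as a Python dict
- `--printonly` - Prints the generated output to stdout instead of writing files
- `--outputdir` - Directory for output files
- `--rootdir` - Base path for resolving includes (not accepted by `explores`, which reads from the Cube meta API)
- `--token` - JWT token for the Cube meta API (`explores` only)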
18 changes: 18 additions & 0 deletions export_pdm.sh
@@ -0,0 +1,18 @@
#!/bin/bash
# Export the PDM project dependencies to requirements.txt, then install them with pip.

# Ensure PDM is installed
if ! command -v pdm &> /dev/null
then
    echo "PDM could not be found. Please install PDM first." >&2
    exit 1
fi

# Stop here if the export fails, so pip never reads a stale or empty file
if ! pdm export --without-hashes --format requirements > requirements.txt
then
    echo "Failed to export dependencies." >&2
    exit 1
fi

if pip install -r requirements.txt; then
    echo "Dependencies installed successfully."
else
    echo "Failed to install dependencies." >&2
    exit 1
fi
50 changes: 49 additions & 1 deletion lkml2cube/main.py
@@ -3,8 +3,9 @@
import typer
import yaml

from lkml2cube.parser.cube_api import meta_loader, parse_meta
from lkml2cube.parser.explores import parse_explores, generate_cube_joins
from lkml2cube.parser.loader import file_loader, write_files, print_summary
from lkml2cube.parser.loader import file_loader, write_files, write_lookml_files, print_summary
from lkml2cube.parser.views import parse_view
from lkml2cube.parser.types import (
folded_unicode,
@@ -126,5 +127,52 @@ def views(
    print_summary(summary)


@app.command()
def explores(
    metaurl: Annotated[str, typer.Argument(help="The url for cube meta endpoint")],
    token: Annotated[str, typer.Option(help="JWT token for Cube meta")],
    parseonly: Annotated[
        bool,
        typer.Option(
            help=(
                "When present it will only show the python"
                " dict read from the Cube meta API"
            )
        ),
    ] = False,
    outputdir: Annotated[
        str, typer.Option(help="The path for the output files to be generated")
    ] = ".",
    printonly: Annotated[
        bool, typer.Option(help="Print the generated LookML model to stdout")
    ] = False,
):
    """
    Generate LookML views and explores from a Cube meta API endpoint.
    """

    cube_model = meta_loader(
        meta_url=metaurl,
        token=token,
    )

    if cube_model is None:
        console.print(f"No response received from: {metaurl}", style="bold red")
        # Exit with a non-zero code so callers can detect the failure
        raise typer.Exit(code=1)

    if parseonly:
        console.print(pprint.pformat(cube_model))
        return

    lookml_model = parse_meta(cube_model)

    if printonly:
        console.print(yaml.dump(lookml_model, allow_unicode=True))
        return

    summary = write_lookml_files(lookml_model, outputdir=outputdir)
    print_summary(summary)


if __name__ == "__main__":
    app()
196 changes: 196 additions & 0 deletions lkml2cube/parser/cube_api.py
@@ -0,0 +1,196 @@
import requests
from lkml2cube.parser.types import reverse_type_map, literal_unicode, console


def meta_loader(
    meta_url: str,
    token: str,
) -> dict:
    """
    Load the Cube meta API and return the model as a dictionary.
    """

    if not token:
        raise ValueError("A valid token must be provided to access the Cube meta API.")

    # We need the extended version of the meta API to get the full model
    if not meta_url.endswith("?extended"):
        meta_url += "?extended"

    headers = {"Authorization": f"Bearer {token}"}
    # Set a timeout so an unreachable endpoint cannot hang the CLI indefinitely
    response = requests.get(meta_url, headers=headers, timeout=30)

    if response.status_code != 200:
        raise Exception(
            f"Failed to fetch meta data ({response.status_code}): {response.text}"
        )

    return response.json()
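
Review note: the `endswith("?extended")` check produces a malformed URL when the meta URL already carries a query string (for example `.../meta?foo=1` becomes `.../meta?foo=1?extended`). A minimal sketch of a more tolerant approach, using only the standard library, is shown below. The helper name `ensure_extended` is hypothetical and not part of this PR.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_extended(meta_url: str) -> str:
    """Hypothetical helper: add the bare `extended` flag to a meta URL once."""
    parts = urlsplit(meta_url)
    # Split the existing query into its "&"-separated pieces, dropping empties.
    flags = [piece for piece in parts.query.split("&") if piece]
    if "extended" not in flags:
        flags.append("extended")
    # Rebuild the URL with the updated query string.
    return urlunsplit(parts._replace(query="&".join(flags)))
```

The function is idempotent, so calling it on a URL that already ends in `?extended` returns the URL unchanged.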


def parse_members(members: list) -> list:
    """
    Parse measures and dimensions from the Cube meta model.
    """

    def rpl_table(s: str) -> str:
        # Normalize "${" to "{", rename {CUBE} to {TABLE}, then restore "${"
        return s.replace("${", "{").replace("{CUBE}", "{TABLE}").replace("{", "${")

    def convert_to_literal(s: str):
        # Multi-line SQL is emitted as a YAML literal block
        return literal_unicode(rpl_table(s)) if "\n" in s else rpl_table(s)

    parsed_members = []

    for member in members:
        # Measures carry their aggregation in aggType; dimensions only have type.
        # Check the same value that is mapped below so "type" is never None.
        member_type = member.get("aggType", member.get("type"))
        if member_type not in reverse_type_map:
            console.print(
                f"Member type: {member_type} not implemented yet:\n {member}",
                style="bold red",
            )
            continue

        dim = {
            "name": member.get("name"),
            "label": member.get("title", member.get("name")),
            "description": member.get("description", ""),
            "type": reverse_type_map[member_type],
        }
        if "sql" in member:
            dim["sql"] = convert_to_literal(member["sql"])

        if not member.get("public"):
            dim["hidden"] = "yes"

        parsed_members.append(dim)
    return parsed_members
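
Review note: the three chained `replace` calls in `parse_members` are easy to misread, so here is the same transformation as a standalone sketch with a few inputs. The function name `cube_sql_to_lookml` is made up for this illustration. The logic mirrors the `rpl_table` expression in the diff.

```python
def cube_sql_to_lookml(sql: str) -> str:
    """Rewrite Cube member references into LookML's ${...} form."""
    # 1. Drop any existing "$" so "${CUBE}" and "{CUBE}" look the same.
    # 2. Rename the Cube self-reference to LookML's TABLE.
    # 3. Put "$" back in front of every opening brace.
    return sql.replace("${", "{").replace("{CUBE}", "{TABLE}").replace("{", "${")


print(cube_sql_to_lookml("{CUBE}.amount"))
print(cube_sql_to_lookml("${CUBE}.amount"))
print(cube_sql_to_lookml("{orders.total} / {CUBE}.qty"))
```

One consequence of step 3 is that every opening brace receives a `$`, including braces that are not member references (for example inside a string literal in the SQL), so such expressions may need manual review.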


def parse_meta(cube_model: dict) -> dict:
    """
    Parse the Cube meta model and return a simplified version.
    Separates Cube cubes (-> LookML views) from Cube views (-> LookML explores).
    """

    lookml_model = {
        "views": [],
        "explores": [],
    }

    for model in cube_model.get("cubes", []):
        # Determine if this is a cube (table-based) or view (join-based)
        is_view = _is_cube_view(model)

        if is_view:
            # This is a Cube view -> LookML explore
            explore = _parse_cube_view_to_explore(model)
            lookml_model["explores"].append(explore)
        else:
            # This is a Cube cube -> LookML view
            view = _parse_cube_to_view(model)
            lookml_model["views"].append(view)

    return lookml_model


def _is_cube_view(model: dict) -> bool:
    """
    Determine if a Cube model is a view (has joins) or a cube (has its own data source).
    Views typically have aliasMember references and no sql_table/sql property.
    """
    # Check if any dimensions or measures use aliasMember (indicating joins)
    members = model.get("dimensions", []) + model.get("measures", [])
    has_alias_members = any("aliasMember" in member for member in members)

    # If it has alias members and no own data source, it's a view
    has_own_data_source = "sql_table" in model or "sql" in model

    return has_alias_members and not has_own_data_source
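
Review note: a small worked example of the cube/view split may help future readers. The two entries below are hypothetical and heavily trimmed. Real meta API entries carry many more fields. The `is_cube_view` function restates the heuristic from this diff so the snippet runs on its own.

```python
def is_cube_view(model: dict) -> bool:
    """Same heuristic as in the diff: alias members and no own data source."""
    members = model.get("dimensions", []) + model.get("measures", [])
    has_alias_members = any("aliasMember" in member for member in members)
    has_own_data_source = "sql_table" in model or "sql" in model
    return has_alias_members and not has_own_data_source


# Hypothetical, trimmed meta entries (not taken from a real Cube deployment)
orders_cube = {
    "name": "orders",
    "sql_table": "public.orders",
    "dimensions": [{"name": "orders.id", "type": "number"}],
    "measures": [],
}
orders_view = {
    "name": "orders_view",
    "dimensions": [
        {"name": "orders_view.id", "aliasMember": "orders.id", "type": "number"}
    ],
    "measures": [],
}

# Route each entry to the LookML construct it would become.
lookml = {"views": [], "explores": []}
for entry in (orders_cube, orders_view):
    key = "explores" if is_cube_view(entry) else "views"
    lookml[key].append(entry["name"])

print(lookml)
```

Note that an entry with neither alias members nor a data source falls through to the LookML view branch under this heuristic.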


def _parse_cube_to_view(model: dict) -> dict:
    """
    Parse a Cube cube into a LookML view.
    """
    view = {
        "name": model.get("name"),
        "label": model.get("title", model.get("description", model.get("name"))),
        "extends": [],
        "dimensions": [],
        "measures": [],
        "filters": [],
    }

    if "extends" in model:
        view["extends"] = [model["extends"]]

    if "sql_table" in model:
        view["sql_table_name"] = model["sql_table"]

    if "sql" in model:
        view["derived_table"] = {"sql": model["sql"]}

    if "dimensions" in model:
        view["dimensions"] = parse_members(model["dimensions"])
    if "measures" in model:
        view["measures"] = parse_members(model["measures"])

    return view


def _parse_cube_view_to_explore(model: dict) -> dict:
    """
    Parse a Cube view into a LookML explore with joins.
    """
    explore = {
        "name": model.get("name"),
        "label": model.get("title", model.get("description", model.get("name"))),
        "joins": [],
    }

    # Collect every cube referenced by an aliasMember ("cube.member")
    joined_cubes = set()
    for member in model.get("dimensions", []) + model.get("measures", []):
        if "aliasMember" in member:
            joined_cubes.add(member["aliasMember"].split(".")[0])

    if joined_cubes:
        # For now, use the first cube alphabetically as primary
        # (the base of the explore); this is a placeholder heuristic
        primary_cube = min(joined_cubes)
        joined_cubes.remove(primary_cube)

        explore["view_name"] = primary_cube

        # Create joins for the remaining cubes
        for cube_name in sorted(joined_cubes):
            join = {
                "name": cube_name,
                "view_label": cube_name.replace("_", " ").title(),
                "type": "left_outer",  # Default join type
                "relationship": "many_to_one",  # Default relationship
                # Placeholder condition: the real join conditions are not
                # extracted from the Cube model's join definitions yet
                "sql_on": f"${{{primary_cube}.id}} = ${{{cube_name}.id}}",
            }
            explore["joins"].append(join)

    return explore
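
Review note: the comment in this function mentions "the most referenced cube" as a possible primary, while the code picks the alphabetically first one. A minimal sketch of the reference-count heuristic follows. The helper name `pick_primary_cube` is hypothetical, and whether this heuristic matches the real base cube of a Cube view is an assumption that would need checking against actual models.

```python
from collections import Counter
from typing import Optional


def pick_primary_cube(members: list) -> Optional[str]:
    """Hypothetical helper: choose the cube referenced by the most members."""
    # Count how many members point at each cube via "cube.member".
    counts = Counter(
        member["aliasMember"].split(".")[0]
        for member in members
        if "aliasMember" in member
    )
    if not counts:
        return None
    # Highest count wins; ties fall back to alphabetical order for determinism.
    return min(counts, key=lambda name: (-counts[name], name))


members = [
    {"aliasMember": "orders.id"},
    {"aliasMember": "orders.total"},
    {"aliasMember": "customers.name"},
]
print(pick_primary_cube(members))
```

For this input the sketch selects `orders`, whereas the alphabetical rule in the diff would select `customers`.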