Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
17 commits
Select commit Hold shift + click to select a range
1b82c68
Initial commit with task details for issue #5
konard Nov 16, 2025
0b5f16f
Implement built-in references in Python codec per issue #5
konard Nov 16, 2025
43a6f9c
Implement built-in references in JavaScript codec per issue #5
konard Nov 16, 2025
fdc335c
Revert "Initial commit with task details for issue #5"
konard Nov 16, 2025
9e75f2d
Fix parser compatibility by using hybrid reference format
konard Nov 16, 2025
ac9726b
Add experiment scripts used during built-in references implementation
konard Nov 16, 2025
0671895
Implement built-in self-reference syntax per issue #5
konard Nov 16, 2025
59b775e
Fix encoder format and pyproject.toml dependency version
konard Nov 16, 2025
1f6ca74
Update to use built-in self-reference syntax (obj_0: type ...)
konard Nov 16, 2025
6318f82
Revert links-notation to 0.9.0 for Python 3.8-3.12 compatibility
konard Nov 16, 2025
aeafa93
Implement multi-link encoder to fix parser bug with nested self-refer…
konard Nov 16, 2025
d533c7a
Fix self-reference format and update to Python 3.13 + links-notation …
konard Nov 16, 2025
07e7f06
Implement multi-link encoder to fix parser bug with nested self-refer…
konard Nov 16, 2025
4672b3b
Fix CI workflow to test only Python 3.13 and Node.js 22
konard Nov 16, 2025
a769d70
Fix encoder to inline self-referenced definitions
konard Nov 16, 2025
97706dc
Fix Python decoder to handle back-references in inline definitions
konard Nov 16, 2025
4130f0a
Document Python links-notation parser bug with nested self-references
konard Nov 16, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions .github-workflows-test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
name: Tests

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

jobs:
test-python:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.13']

steps:
- uses: actions/checkout@v4

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
working-directory: ./python
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"

- name: Run tests with coverage
working-directory: ./python
run: |
pytest tests/ -v --cov=link_notation_objects_codec --cov-report=term-missing

- name: Run linter (ruff)
working-directory: ./python
run: |
ruff check src/ tests/
continue-on-error: true

- name: Run type checker (mypy)
working-directory: ./python
run: |
mypy src/
continue-on-error: true

test-javascript:
runs-on: ubuntu-latest
strategy:
matrix:
node-version: ['22']

steps:
- uses: actions/checkout@v4

- name: Set up Node.js ${{ matrix.node-version }}
uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node-version }}

- name: Install dependencies
working-directory: ./js
run: npm install

- name: Run tests
working-directory: ./js
run: npm test

- name: Run example
working-directory: ./js
run: npm run example
8 changes: 4 additions & 4 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ name: Tests

on:
push:
branches: [ main, issue-* ]
branches: [ main ]
pull_request:
branches: [ main ]

Expand All @@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']
python-version: ['3.13']

steps:
- uses: actions/checkout@v4
Expand Down Expand Up @@ -48,12 +48,12 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
node-version: ['18', '20', '22']
node-version: ['22']

steps:
- uses: actions/checkout@v4

- name: Set up Node.js ${{ matrix.node-version }}
- name: Set up Node.js ${{ matrix.python-version }}
uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node-version }}
Expand Down
148 changes: 148 additions & 0 deletions PARSER_BUG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
# Links Notation Parser Bug - Nested Self-References in Pairs

## Summary

The Python `links-notation` library (versions 0.11.0-0.11.2) has a parsing bug when handling self-referenced objects nested inside pairs (key-value structures).

## Environment

- **Package**: `links-notation`
- **Versions Tested**: 0.11.0, 0.11.1, 0.11.2
- **Python Version**: 3.13
- **Status**: JavaScript implementation works correctly, Python implementation fails

## Problem Description

When a self-referenced object definition (using the `(id: type ...)` syntax) is nested as a VALUE inside a pair, the parser fails to correctly parse the structure.

### What Works ✅

**Self-reference as direct child:**
```
(obj_0: list (int 1) (int 2) (obj_1: list (int 3) (int 4) obj_0))
```
This parses correctly because `(obj_1: list ...)` is a direct child of the list, not nested inside a pair.

**Simple self-reference:**
```
(obj_0: dict ((str c2VsZg==) obj_0))
```
This works because `obj_0` is a reference (no inline definition), not a nested definition.

### What Fails ❌

**Self-reference nested in pair:**
```
(obj_0: dict ((str bmFtZQ==) (str ZGljdDE=)) ((str b3RoZXI=) (obj_1: dict ((str bmFtZQ==) (str ZGljdDI=)) ((str b3RoZXI=) obj_0))))
```

In this example:
- The second pair has key `(str b3RoZXI=)` (base64 for "other")
- The second pair's value should be `(obj_1: dict ...)`
- But the parser fails to correctly identify this as a self-referenced dict definition

## Minimal Reproducible Example

```python
from links_notation import Parser

# This notation should represent two dicts that reference each other
notation = '(obj_0: dict ((str bmFtZQ==) (str ZGljdDE=)) ((str b3RoZXI=) (obj_1: dict ((str bmFtZQ==) (str ZGljdDI=)) ((str b3RoZXI=) obj_0))))'

parser = Parser()
links = parser.parse(notation)

# Expected: One top-level link with id="obj_0", containing:
# - First pair: (str bmFtZQ==) → (str ZGljdDE=)
# - Second pair: (str b3RoZXI=) → (obj_1: dict ...)
#
# Actual: The parser likely misinterprets the nested (obj_1: dict ...) structure
# causing the second pair to be malformed or missing

print(f"Number of links parsed: {len(links)}")
if links:
link = links[0]
print(f"Link ID: {link.id}")
print(f"Number of values: {len(link.values) if link.values else 0}")

if link.values and len(link.values) > 1:
# First value should be the type marker "dict"
print(f"Type marker: {link.values[0].id if hasattr(link.values[0], 'id') else 'NO ID'}")

# Remaining values should be pairs
pairs = link.values[1:]
print(f"Number of pairs: {len(pairs)}")

for i, pair in enumerate(pairs):
print(f"\nPair {i+1}:")
if hasattr(pair, 'values') and pair.values:
print(f" Pair has {len(pair.values)} elements")
if len(pair.values) >= 1:
key = pair.values[0]
print(f" Key: {key.id if hasattr(key, 'id') else 'NO ID'}")
if len(pair.values) >= 2:
value = pair.values[1]
print(f" Value ID: {value.id if hasattr(value, 'id') else 'NO ID'}")
print(f" Value has values: {bool(value.values) if hasattr(value, 'values') else False}")
else:
print(f" Pair has no values or is malformed")
```

## Expected Output

```
Number of links parsed: 1
Link ID: obj_0
Number of values: 3
Type marker: dict
Number of pairs: 2

Pair 1:
Pair has 2 elements
Key: bmFtZQ==
Value ID: ZGljdDE=
Value has values: False

Pair 2:
Pair has 2 elements
Key: b3RoZXI=
Value ID: obj_1
Value has values: True
```

## Actual Output

*(To be filled in after running the test)*

The parser likely produces incorrect structure for Pair 2, where the nested `(obj_1: dict ...)` is not properly recognized as a self-referenced dict definition.

## Workaround

Currently, the only workaround is to output separate top-level link definitions:

```
(obj_1: dict ((str bmFtZQ==) (str ZGljdDI=)) ((str b3RoZXI=) obj_0))
(obj_0: dict ((str bmFtZQ==) (str ZGljdDE=)) ((str b3RoZXI=) obj_1))
```

This avoids nesting self-referenced definitions inside pairs, but sacrifices the desired inline format.

## Comparison with JavaScript

The JavaScript implementation of `links-notation` correctly parses the nested self-reference syntax. Tests using the same notation format pass in JavaScript but fail in Python.

## Impact

This bug prevents the `link-notation-objects-codec` library from properly encoding/decoding mutually-referential dict structures using the inline self-reference format. It limits the library to either:
1. Using the multi-line workaround (separate top-level definitions)
2. Only supporting list-based circular references (which work because they don't nest definitions in pairs)

## References

- Issue: https://github.com/link-foundation/link-notation-objects-codec/issues/5
- Pull Request: https://github.com/link-foundation/link-notation-objects-codec/pull/6
- Links Notation Specification: https://github.com/link-foundation/links-notation

## Requested Action

Please fix the Python `links-notation` parser to correctly handle self-referenced object definitions when they appear as values inside pairs, matching the behavior of the JavaScript implementation.
12 changes: 8 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -142,15 +142,19 @@ console.log(JSON.stringify(decode(encode(data))) === JSON.stringify(data));

The library uses the [links-notation](https://github.com/link-foundation/links-notation) format as the serialization target. Each object is encoded as a Link with type information:

- Basic types are encoded with type markers: `(int 42)`, `(str "hello")`, `(bool True)`
- Basic types are encoded with type markers: `(int 42)`, `(str aGVsbG8=)`, `(bool True)`
- Strings are base64-encoded to handle special characters and newlines
- Collections include object IDs for reference tracking: `(list obj_0 item1 item2 ...)`
- Circular references use special `ref` links: `(ref obj_0)`
- Collections with self-references use built-in links notation self-reference syntax:
- **Format**: `(obj_id: type content...)`
- **Python example**: `(obj_0: dict ((str c2VsZg==) obj_0))` for `{"self": obj}`
- **JavaScript example**: `(obj_0: array (int 1) (int 2) obj_0)` for self-referencing array
- Simple collections without shared references use format: `(list item1 item2 ...)` or `(dict (key val) ...)`
- Circular references use direct object ID references: `obj_0` (without the `ref` keyword)

This approach allows for:
- Universal representation of object graphs
- Preservation of object identity
- Natural handling of circular references
- Natural handling of circular references using built-in links notation syntax
- Cross-language compatibility

## Development
Expand Down
1 change: 1 addition & 0 deletions issue_details.json
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{"body":"```\nCircular references use special ref links: (ref obj_0)\n```\n\nNow I see in readme we add special marker/reference/keyword `ref`. And it is redundant.\n\nFor example:\n\n```js\nconst obj = {\n \"self\": obj\n \"other\": { \"1\": 1, \"2\": 2 }\n};\n```\n\nSelf reference should be translated as (or similar):\n\n```\n(obj: \n (self obj)\n (other (\n (1 1)\n (2 2)\n ))\n)\n```\n\nto links notation\n\nHow to read links notation:\n\n```\n(self-reference: first-reference second-reference ...)\n```\n\nImplement new style in both JS and Python versions.","comments":[],"title":"Instead of `ref` reference/marker use built-in references in links notation"}
25 changes: 25 additions & 0 deletions js/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading
Loading