Commit a551714

Update README
1 parent 8e265da commit a551714

1 file changed: +103 -2 lines changed

README.md

Lines changed: 103 additions & 2 deletions
@@ -1,2 +1,103 @@
-# code_diff
-A library for effective processing of code changes
# Code Diff
------------------------------------------------
> Fast AST-based code differencing in Python

Software projects constantly evolve to integrate new features or improve existing implementations. To keep track of this progress, it is important to track individual code changes. Code differencing provides a way to identify the smallest code change between two implementations.

**code.diff** provides a fast alternative to standard code differencing techniques with a focus on AST-based code differencing. As part of this library, we include a fast reimplementation of the [**GumTree**](https://github.com/GumTreeDiff/gumtree) algorithm. Because it relies on a best-effort AST parser, code.diff can also generate AST-level code changes for individual code snippets. Many programming languages, including Python, Java and JavaScript, are supported!

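To make the contrast with line-based differencing concrete, here is a minimal sketch that uses only Python's standard library (`difflib` and `ast`). It does not use code.diff or GumTree, and the helpers `line_diff` and `changed_fields` are hypothetical names introduced for this illustration. It assumes both ASTs have the same shape, which a real tree-differencing algorithm cannot assume.

```python
import ast
import difflib

OLD = 'def my_func():\n    print("Hello World")\n'
NEW = 'def say_helloworld():\n    print("Hello World")\n'


def line_diff(old, new):
    """Return the removed and added lines of a textual, line-based diff."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


def changed_fields(old_node, new_node, path="module"):
    """Yield (path, old, new) for every differing leaf value.

    Assumes both trees have the same shape; a real differ must also
    handle inserted, deleted and moved nodes.
    """
    for name, old_value in ast.iter_fields(old_node):
        new_value = getattr(new_node, name)
        here = f"{path}.{name}"
        if isinstance(old_value, ast.AST):
            yield from changed_fields(old_value, new_value, here)
        elif isinstance(old_value, list):
            for i, (old_item, new_item) in enumerate(zip(old_value, new_value)):
                if isinstance(old_item, ast.AST):
                    yield from changed_fields(old_item, new_item, f"{here}[{i}]")
                elif old_item != new_item:
                    yield (f"{here}[{i}]", old_item, new_item)
        elif old_value != new_value:
            yield (here, old_value, new_value)


if __name__ == "__main__":
    print(line_diff(OLD, NEW))
    print(list(changed_fields(ast.parse(OLD), ast.parse(NEW))))
```

The line-based diff reports the whole `def` line as removed and re-added, while the field-by-field comparison reports only the function name as changed.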
## Installation
The package is tested under Python 3. It can be installed via:
```
pip install code-diff
```

## Usage
code.diff can compute a code difference for nearly any program code in a few lines:
```python
import code_diff as cd

# Python
output = cd.difference(
    '''
        def my_func():
            print("Hello World")
    ''',
    '''
        def say_helloworld():
            print("Hello World")
    ''',
lang = "python")

# Output: my_func -> say_helloworld

output.edit_script()

# Output:
# [
#  Update((identifier:my_func, line 1:12 - 1:19), say_helloworld)
# ]


# Java
output = cd.difference(
    '''
        int x = x + 1;
    ''',
    '''
        int x = x / 2;
    ''',
lang = "java")

# Output: x + 1 -> x / 2

output.edit_script()

# Output: [
#  Insert(/:/, (binary_operator, line 0:4 - 0:9), 1),
#  Update((integer:1, line 0:8 - 0:9), 2),
#  Delete((+:+, line 0:6 - 0:7))
# ]


```
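The position in an edit-script entry such as `line 1:12 - 1:19` can be read as a source span. The sketch below assumes these are zero-based `row:column` pairs with an exclusive end column, which is the convention tree-sitter uses for node positions. Whether code.diff reports them exactly this way is an assumption, as is the eight-space indentation of the snippet (which is what columns 12 to 19 suggest). `apply_update` is a hypothetical helper, not part of the library.

```python
def apply_update(source, start, end, new_text):
    """Replace the span start..end in `source` with `new_text`.

    `start` and `end` are zero-based (row, column) pairs and the end
    column is exclusive. Only single-line spans are handled here.
    """
    (start_row, start_col), (end_row, end_col) = start, end
    if start_row != end_row:
        raise ValueError("multi-line spans are not supported in this sketch")
    lines = source.split("\n")
    line = lines[start_row]
    lines[start_row] = line[:start_col] + new_text + line[end_col:]
    return "\n".join(lines)


# Row 0 is the empty line after the opening quotes of the snippet;
# the function header on row 1 is assumed to be indented by eight spaces.
SOURCE = (
    "\n"
    + " " * 8 + "def my_func():\n"
    + " " * 12 + 'print("Hello World")\n'
)

# Apply Update((identifier:my_func, line 1:12 - 1:19), say_helloworld) by hand.
patched = apply_update(SOURCE, (1, 12), (1, 19), "say_helloworld")

if __name__ == "__main__":
    print(patched)
```

Under these assumptions the span covers exactly the identifier `my_func`, so the update rewrites only the function name and leaves the rest of the line untouched.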
## Language support
code.diff supports most programming languages for which an AST can be computed. To parse code into an AST, the underlying parser employs:
* [**code.tokenize:**](https://github.com/cedricrupb/code_tokenize) A frontend for tree-sitter to effectively parse and tokenize program code in Python.

* [**tree-sitter:**](https://tree-sitter.github.io/tree-sitter/) A best-effort AST parser supporting many programming languages including Python, Java and JavaScript.

To decide whether your code can be handled by code.diff, please review the libraries above.

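Why a best-effort parser matters for snippets is easiest to see with a strict parser. As a small illustration using only the standard library (not code.diff or tree-sitter), Python's built-in `ast.parse` rejects an incomplete snippet outright, so there is no tree to diff. A best-effort parser such as tree-sitter instead returns a tree that contains error nodes. `try_strict_parse` is a hypothetical helper name.

```python
import ast


def try_strict_parse(snippet):
    """Return the AST for `snippet`, or None if the strict parser rejects it."""
    try:
        return ast.parse(snippet)
    except SyntaxError:
        return None


complete = "x = x + 1"
incomplete = "if x > 0:"  # statement header without a body

if __name__ == "__main__":
    print(type(try_strict_parse(complete)).__name__)
    print(try_strict_parse(incomplete))
```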
**GumTree:** To compute an edit script between a source and a target AST, we employ a Python reimplementation of the [GumTree](https://github.com/GumTreeDiff/gumtree) algorithm. Note, however, that the computed scripts depend heavily on the AST representation of the given code. Therefore, an AST edit script computed with code.diff might differ significantly from the one computed by GumTree.


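The following is a simplified sketch of the idea behind GumTree's first, top-down phase: pairing subtrees that are identical in the source and target trees, so that only the remaining nodes need edit actions. It is not code.diff's implementation. The real algorithm processes subtrees by decreasing height, adds a bottom-up phase, and then derives the edit script. The `Node` class and `top_down_matches` are hypothetical names for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    label: str
    value: str = ""
    children: list = field(default_factory=list)

    def signature(self):
        """Structural fingerprint: equal for identical subtrees."""
        return (self.label, self.value,
                tuple(child.signature() for child in self.children))

    def subtrees(self):
        """Yield this node and all of its descendants."""
        yield self
        for child in self.children:
            yield from child.subtrees()


def top_down_matches(source, target):
    """Greedily pair identical subtrees, preferring larger ones."""
    by_signature = {}
    for node in target.subtrees():
        by_signature.setdefault(node.signature(), []).append(node)

    used = set()   # ids of target nodes that are already paired
    matches = []

    def visit(node):
        for candidate in by_signature.get(node.signature(), []):
            if all(id(n) not in used for n in candidate.subtrees()):
                used.update(id(n) for n in candidate.subtrees())
                matches.append((node, candidate))
                return  # the whole subtree is matched, no need to descend
        for child in node.children:
            visit(child)

    visit(source)
    return matches


# Hand-built trees for `x + 1` and `x / 2` (node labels are illustrative).
old = Node("binary_operator", children=[
    Node("identifier", "x"), Node("+", "+"), Node("integer", "1")])
new = Node("binary_operator", children=[
    Node("identifier", "x"), Node("/", "/"), Node("integer", "2")])

pairs = top_down_matches(old, new)

if __name__ == "__main__":
    print([(s.label, s.value) for s, _ in pairs])
```

In this example only the identifier `x` is paired. The operator and the integer literal stay unmatched, which is where a differ has to emit actions such as update, insert, or delete. How those are chosen depends on the AST representation, as noted above.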
## Release history
* 0.0.1
    * Initial functionality
    * Documentation
    * SStuB Testing

## Project Info
The goal of this project is to provide developers with easy access to AST-based code differencing. It is currently developed as a helper library for internal research projects and will therefore only be updated as needed.

Feel free to open an issue if anything unexpected happens.

[Cedric Richter](https://uol.de/informatik/formale-methoden/team/cedric-richter) - [@cedricrupb](https://twitter.com/cedricrupb) - [email protected]

Distributed under the MIT license. See ``LICENSE`` for more information.
