Commit 568f7e2

Update README.md
1 parent 4abd1ea commit 568f7e2

1 file changed: +50 -26 lines changed

README.md

Lines changed: 50 additions & 26 deletions
@@ -66,47 +66,71 @@ DataFog can be installed via pip:
 pip install datafog
 ```
 
-and in your python environment:
+## Examples - Updated for v3
+
 
-```
-from datafog import PresidioEngine as presidio
-datafog = datafog.DataFog()
 
 ```
+# Example: Annotating PII
+from datafog import PIIAnnotationPipeline, PIIAnnotationRequest
 
-## Examples
+# Initialize the PII annotation pipeline
+pii_pipeline = PIIAnnotationPipeline()
 
-Here are some examples of datafog being used to redact information in business contexts. Please see '/examples' for our [Getting Started](examples/getting-started.ipynb) notebook. We'll be regularly updating content and providing comprehensive guides to using DataFog in production contexts. If you have any ideas for a tutorial or guide that you would like to see, please let us know!
+# Provide the text or document containing PII
+pii_text = "Name: John Doe\nAddress: 123 Main St, Anytown, USA"
 
-### Scanning a single string
+# Submit the text for PII annotation
+annotated_text = pii_pipeline.annotate_pii(pii_text)
 
-```
-ceo_email_chunk = "I'm announcing on Friday that Jeff is going to be CTO."
+# Print the annotated text with identified PII
+print("Annotated Text:")
+print(annotated_text)
 
-scan_results1 = presidio.scan(ceo_email_chunk)
-print("PII Detected - base case:", scan_results1)
-# PII Detected - base case: [type: PERSON, start: 30, end: 34, score: 0.85]
 
+# Example: Text Extraction from images
+
+from datafog import DonutImageProcessor, PipelineOperationType
+
+# Initialize the image processor
+processor = DonutImageProcessor(operation_type=PipelineOperationType.PARSE_IMAGE)
+
+# Load the image containing the invoice
+sample_image_path = "path/to/your/invoice/image.png"
+
+# Parse the invoice image to extract details
+result = processor.parse_invoice(sample_image_path)
+
+# Print the extracted details
+print("Invoice Details:")
+for item in result:
+    print(f"- {item['name']}: {item['price']}")
+
+
+
+# Example: Text Extraction
+from datafog import DataFog, PipelineOperationType
+
+# Initialize DataFog for text processing
+data_processor = DataFog(operation_type=PipelineOperationType.PROCESS_TEXT)
+
+# Provide the text to be analyzed
+text = "Customer: John Smith\nProduct: Laptop\nPrice: $1200"
+
+# Extract entities from the text
+entities = data_processor.extract_entities(text)
+
+# Print the extracted entities
+print("Entities Detected:")
+for entity in entities:
+    print(f"- {entity['type']}: {entity['text']}")
 
-scan_results2 = presidio.scan(ceo_email_chunk, deny_list=['CTO'])
-print("PII Detected with deny list:", scan_results2)
-# PII Detected with deny list: [type: CUSTOM_PII, start: 50, end: 53, score: 1.0, type: PERSON, start: 30, end: 34, score: 0.85]
 
-```
 
-### Scanning a list of PDFs
 
 ```
-file_dir = ["/Users/sidmohan/Desktop/datafog-v2.4.0/datafog-python/tests/files/input_files/agi-builder-meetup.pdf",
-"/Users/sidmohan/Desktop/datafog-v2.4.0/datafog-python/tests/files/input_files/pypdf-readthedocs-io-en-stable.pdf"]
-datafog = datafog.DataFog()
-result = datafog.upload_files(uploaded_files=file_dir)
-print(result)
-```
 
-The output here will be a dictionary where the keys are the file names and the values are the scan results for that file.
-for ex:
-`{'agi-builder-meetup.pdf': "2/26/24, 2:16 PM\nAGI Builders Meetup SF · Luma\nContact the HostReport Event29\nEvent FullIf youʼd like"}`
+
 
 ## Contributing
 
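Note on the removed "Scanning a single string" example: its output comments report each hit as a `start`/`end` character offset into the input string. The sketch below does not call DataFog at all. It only checks, in plain Python, how the offsets quoted in those comments map back to the text, and shows one possible way to apply them for redaction. The dict shape of `spans` and the helpers `span_text` and `redact` are hypothetical names introduced for illustration, and the end-exclusive reading of `end` is an assumption that happens to fit the numbers shown in the diff.

```python
# The input string from the removed README example.
text = "I'm announcing on Friday that Jeff is going to be CTO."

# Offsets copied from the removed example's output comments.
# The dict layout is a hypothetical stand-in, not DataFog's result type.
spans = [
    {"type": "PERSON", "start": 30, "end": 34},
    {"type": "CUSTOM_PII", "start": 50, "end": 53},
]


def span_text(source, span):
    """Return the substring a span points at, treating `end` as exclusive (assumption)."""
    return source[span["start"]:span["end"]]


def redact(source, found):
    """Replace each span with its type label (hypothetical helper).

    Spans are applied right to left so that replacing a later span
    does not shift the offsets of the spans before it.
    """
    out = source
    for span in sorted(found, key=lambda s: s["start"], reverse=True):
        out = out[:span["start"]] + "[" + span["type"] + "]" + out[span["end"]:]
    return out


for span in spans:
    print(span["type"], "->", repr(span_text(text, span)))

print(redact(text, spans))
```

Under that end-exclusive assumption the two spans slice out `Jeff` and `CTO`, which matches the entity types reported in the removed comments.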