Pastec is an open source index and search engine for image recognition based on OpenCV. It can recognize flat objects such as covers, packaged goods or artworks. It has, however, not been designed to recognize faces, 3D objects, barcodes, or QR codes.
For example, Pastec can be used to recognize DVD covers in a mobile app or to detect near-duplicate images in a large database.
Pastec does not store the pixels of the images in its database. Instead, it stores a signature of each image, computed with the visual words technique.
Pastec offers an HTTP API using JSON to add, remove, and search for images in the index.
Pastec is developed by Visualink and licensed under the GNU LGPL v3.0. It is based on the free packages of OpenCV that are available for commercial purposes; you should therefore be free to use Pastec without paying for any patent license.
More precisely, Pastec uses the patent-free ORB descriptor and not the well-known SIFT and SURF descriptors that are patented.
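To give a feel for the visual words idea mentioned above, here is a toy sketch in Python. It is not Pastec's code: the descriptors are modeled as 8-bit integers rather than real ORB descriptors, the vocabulary values are made up, and the function names are hypothetical. Each descriptor is assigned to its nearest vocabulary entry by Hamming distance, and the image signature is the count of descriptors per word.

```python
from collections import Counter

def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def nearest_word(descriptor, vocabulary):
    """Index of the closest visual word (ties go to the lowest index)."""
    return min(range(len(vocabulary)),
               key=lambda i: hamming(descriptor, vocabulary[i]))

def signature(descriptors, vocabulary):
    """Bag of visual words: how many descriptors fall on each word."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)

# Made-up 8-bit vocabulary and descriptors, for illustration only.
vocabulary = [0b00000000, 0b11110000, 0b11111111]
descriptors = [0b00000001, 0b11110001, 0b11100000, 0b11111110]
print(signature(descriptors, vocabulary))
```

Two images with similar signatures share many visual words, which is what makes searching possible without storing pixels.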
The easiest way to run Pastec is using Docker. A Dockerfile and docker-compose configuration are provided.
- Clone the repository:
git clone https://github.com/lklic/pastec.git
cd pastec
- Start with Docker Compose:
docker compose up -d
This will start Pastec on port 4212.
To compile Pastec, you need OpenCV 3.x, libmicrohttpd, and libcurl. On Ubuntu 18.04, these packages can be installed using:
sudo apt-get install libopencv-dev libmicrohttpd-dev libcurl4-openssl-dev
Pastec uses CMake as its build system. You also need Git to get the source code:
sudo apt-get install cmake git
To compile Pastec:
git clone https://github.com/Visu4link/pastec.git
cd pastec
mkdir build
cd build
cmake ../
make
To start Pastec, run the pastec executable. Its one mandatory argument is the path to a file containing a list of ORB visual words:
./pastec visualWordsORB.dat
Optional arguments:
- -p <port>: Set the HTTP port (default: 4212)
- -i <index_file>: Load an existing index file
- --https: Enable HTTPS
- --auth-key <key>: Set authentication key
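If Pastec is launched from a script, the options above can be assembled programmatically. The following is a minimal sketch: the function name and its parameters are hypothetical, and placing the options before the visual words file is an assumption rather than something this document specifies. Nothing is executed; the function only builds the argument list.

```python
def build_pastec_command(visual_words_path, port=None, index_file=None,
                         https=False, auth_key=None, executable="./pastec"):
    """Assemble the argument list for launching Pastec (nothing is run here)."""
    command = [executable]
    if port is not None:
        command += ["-p", str(port)]
    if index_file is not None:
        command += ["-i", index_file]
    if https:
        command.append("--https")
    if auth_key is not None:
        command += ["--auth-key", auth_key]
    # Assumption: options come before the mandatory visual words file.
    command.append(visual_words_path)
    return command

print(build_pastec_command("visualWordsORB.dat", port=8080, index_file="index.dat"))
```

The resulting list can be handed to `subprocess.Popen` to start the server in the background.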
Pastec can be controlled using a simple HTTP API. By default, it listens on port 4212.
All uploaded images must have their dimensions above 150 pixels. If one of the image dimensions exceeds 1000 pixels, the image is resized so that the maximum dimension is set to 1000 pixels and the original aspect ratio is kept.
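The size rule above can be checked on the client side before uploading. This sketch mirrors the description only; it is not taken from Pastec's source. It assumes that both dimensions must be strictly greater than 150 pixels and that the scaled size is rounded to the nearest pixel, and the function name is hypothetical.

```python
MIN_DIMENSION = 150   # from the text: dimensions must be above 150 pixels
MAX_DIMENSION = 1000  # from the text: larger images are scaled down to this

def expected_dimensions(width, height):
    """Return the (width, height) the server would work with, or None if too small."""
    # Assumption: "above 150" is read as strictly greater, for both dimensions.
    if width <= MIN_DIMENSION or height <= MIN_DIMENSION:
        return None
    longest = max(width, height)
    if longest <= MAX_DIMENSION:
        return (width, height)
    # Scale so the longest side becomes MAX_DIMENSION, keeping the aspect ratio.
    # Assumption: the result is rounded to the nearest pixel.
    return (round(width * MAX_DIMENSION / longest),
            round(height * MAX_DIMENSION / longest))

print(expected_dimensions(2000, 1000))
print(expected_dimensions(100, 500))
```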
Add the signature of an image to make it available for searching.
- Path: /index/images/<image_id>
- HTTP method: POST
- Data: Image binary data (JPEG) or JSON with image URL
- Response:
{
  "type": "IMAGE_ADDED",
  "image_id": 23,
  "nb_features_extracted": 542
}
Example using binary data:
curl -X POST --data-binary @/path/to/image.jpg http://localhost:4212/index/images/23
Example using URL:
curl -X POST -d '{"url":"http://example.com/image.jpg"}' http://localhost:4212/index/images/23
Remove an image from the index.
- Path: /index/images/<image_id>
- HTTP method: DELETE
- Response:
{
  "type": "IMAGE_REMOVED",
  "image_id": 23
}
Example:
curl -X DELETE http://localhost:4212/index/images/23
Add a tag to an indexed image.
- Path: /index/images/<image_id>/tag
- HTTP method: POST
- Data: Tag string
- Response:
{
  "type": "IMAGE_TAG_ADDED"
}
Example:
curl -X POST --data "example_tag" http://localhost:4212/index/images/23/tag
Remove the tag of an indexed image.
- Path: /index/images/<image_id>/tag
- HTTP method: DELETE
- Response:
{
  "type": "IMAGE_TAG_REMOVED"
}
Example:
curl -X DELETE http://localhost:4212/index/images/23/tag
Search for matches using an image.
- Path: /index/searcher
- HTTP method: POST
- Data: Image binary data (JPEG) or JSON with image URL
- Response:
{
  "type": "SEARCH_RESULTS",
  "results": [
    {
      "image_id": 2,
      "score": 0.85,
      "tag": "example_tag",
      "bounding_rect": {
        "x": 100,
        "y": 200,
        "width": 300,
        "height": 400
      }
    },
    {
      "image_id": 5,
      "score": 0.75,
      "tag": "another_tag",
      "bounding_rect": {
        "x": 150,
        "y": 250,
        "width": 350,
        "height": 450
      }
    }
  ]
}
Each result object contains:
- image_id: ID of the matched image
- score: Confidence score (higher is better)
- tag: Associated tag (if any)
- bounding_rect: Match location in image
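A client usually wants the results filtered and ordered by score. The sketch below parses a response shaped like the one shown above; the function name and the `min_score` threshold are hypothetical, and the sample body simply repeats the example values from this section.

```python
import json

def best_matches(response_text, min_score=0.0):
    """Return (image_id, score, tag, bounding_rect) tuples, best score first."""
    response = json.loads(response_text)
    if response.get("type") != "SEARCH_RESULTS":
        raise ValueError(f"Unexpected response type: {response.get('type')}")
    matches = [
        # The tag is optional, so read it with .get().
        (r["image_id"], r["score"], r.get("tag"), r.get("bounding_rect"))
        for r in response.get("results", [])
        if r["score"] >= min_score
    ]
    return sorted(matches, key=lambda m: m[1], reverse=True)

sample = """{"type": "SEARCH_RESULTS", "results": [
  {"image_id": 5, "score": 0.75, "tag": "another_tag",
   "bounding_rect": {"x": 150, "y": 250, "width": 350, "height": 450}},
  {"image_id": 2, "score": 0.85, "tag": "example_tag",
   "bounding_rect": {"x": 100, "y": 200, "width": 300, "height": 400}}]}"""

for image_id, score, tag, rect in best_matches(sample, min_score=0.8):
    print(image_id, score, tag, rect["width"], rect["height"])
```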
Example using binary data:
curl -X POST --data-binary @/path/to/query.jpg http://localhost:4212/index/searcher
Example using URL:
curl -X POST -d '{"url":"http://example.com/query.jpg"}' http://localhost:4212/index/searcher
List the IDs of all images in the index.
- Path: /index/imageIds
- HTTP method: GET
- Response:
{
  "type": "INDEX_IMAGE_IDS",
  "image_ids": [1, 2, 3, 23, 45]
}
Example:
curl -X GET http://localhost:4212/index/imageIds
Index input/output operations are sent as POST requests to /index/io.
Write the index to a file:
curl -X POST -d '{"type":"WRITE", "index_path":"index.dat"}' http://localhost:4212/index/io
Load an index from a file:
curl -X POST -d '{"type":"LOAD", "index_path":"index.dat"}' http://localhost:4212/index/io
Clear the index:
curl -X POST -d '{"type":"CLEAR"}' http://localhost:4212/index/io
Write the index tags to a file:
curl -X POST -d '{"type":"WRITE_TAGS", "index_tags_path":"tags.dat"}' http://localhost:4212/index/io
Load the index tags from a file:
curl -X POST -d '{"type":"LOAD_TAGS", "index_tags_path":"tags.dat"}' http://localhost:4212/index/io
Simple health check:
curl -X POST -d '{"type":"PING"}' http://localhost:4212/
Response:
{
  "type": "PONG"
}
All API responses include a type field indicating success or error:
{
  "type": "IMAGE_NOT_DECODED"
}
Common error types:
- IMAGE_NOT_DECODED: Image could not be decoded
- IMAGE_SIZE_TOO_BIG: Image dimensions exceed limits
- IMAGE_SIZE_TOO_SMALL: Image dimensions below minimum
- IMAGE_NOT_FOUND: Referenced image ID not found
- IMAGE_TAG_NOT_FOUND: No tag found for image
- AUTHENTIFICATION_ERROR: Invalid authentication key
- IMAGE_DOWNLOADER_HTTP_ERROR: Error downloading image from URL
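Because success and failure are both reported through the type field, a client can turn the error types listed above into exceptions. This is a minimal sketch: `PastecResponseError` and `check_response` are hypothetical names, not part of Pastec or its Python library, and only the error types named in this document are covered.

```python
import json

# Error types listed in this document; the server may define others.
ERROR_TYPES = {
    "IMAGE_NOT_DECODED",
    "IMAGE_SIZE_TOO_BIG",
    "IMAGE_SIZE_TOO_SMALL",
    "IMAGE_NOT_FOUND",
    "IMAGE_TAG_NOT_FOUND",
    "AUTHENTIFICATION_ERROR",
    "IMAGE_DOWNLOADER_HTTP_ERROR",
}

class PastecResponseError(Exception):
    """Raised when a response carries one of the known error types."""

def check_response(response_text):
    """Parse a JSON response body and raise if its type is a known error."""
    response = json.loads(response_text)
    response_type = response.get("type")
    if response_type in ERROR_TYPES:
        raise PastecResponseError(response_type)
    return response

print(check_response('{"type": "PONG"}'))
try:
    check_response('{"type": "IMAGE_NOT_DECODED"}')
except PastecResponseError as error:
    print("Request failed:", error)
```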
A Python client library is provided in the python directory. Example usage:
from PastecLib import PastecConnection
pastec = PastecConnection("localhost", 4212)
# Add image from file
pastec.indexImageFile(1, "image.jpg")
# Add image from URL
# Not supported directly by the Python library; send the JSON request to the HTTP API yourself
# Add tag
pastec.addTag(1, "example_tag")
# Search with image file
results = pastec.imageQueryFile("query.jpg")
for image_id, tag in results:
    print(f"Match: Image ID {image_id}, Tag: {tag}")
# Save and load index
pastec.writeIndex("index.dat")
pastec.loadIndex("index.dat")
# Save and load tags
pastec.writeIndexTags("tags.dat")
pastec.loadIndexTags("tags.dat")
# Clear index
pastec.clearIndex()
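The example above notes that the Python library has no direct URL support. One way to fill that gap is to send the same JSON body as the curl example, using only the standard library. This is a sketch under that assumption; `build_index_url_request` is a hypothetical helper, not part of PastecLib, and it does not handle the authentication key.

```python
import json
import urllib.request

def build_index_url_request(image_id, image_url, host="localhost", port=4212):
    """Build (but do not send) the POST request that indexes an image by URL."""
    endpoint = f"http://{host}:{port}/index/images/{image_id}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(endpoint, data=body, method="POST")

request = build_index_url_request(23, "http://example.com/image.jpg")
print(request.get_method(), request.full_url)

# Sending the request needs a running Pastec server:
# with urllib.request.urlopen(request) as reply:
#     print(json.loads(reply.read()))
```

Building the request separately from sending it keeps the helper easy to inspect without a server.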