Shop-Sense is a smart eyewear solution designed to empower visually impaired persons (VIPs) to navigate shopping malls, recognize products, and manage their shopping independently. By combining advanced object recognition, navigation assistance, and intuitive feedback mechanisms, Shop-Sense aims to enhance autonomy and confidence for visually impaired individuals in unfamiliar shopping environments.
For individuals with visual impairments, shopping in a mall or buying groceries can be a challenging task. Relying on others for assistance may impact a person's confidence and independence. Shop-Sense addresses this need by providing:
- Independent shopping experiences: Enabling VIPs to identify products and make informed purchasing decisions.
- Easy navigation: Assisting VIPs in navigating unfamiliar spaces, like malls, with intuitive feedback mechanisms.
- Product Recognition: Identify items on store shelves using a video stream and provide details such as price and composition through audio feedback.
- Mall Navigation: Use visual markers and haptic feedback to guide users along the shortest path to their destination.
- Shopping History Management: Store shopping history locally on a connected app.
- Real-Time Object Detection: Incorporate IR sensors to detect unexpected objects in the user's path in real time.
- Decentralized Computation: Use QR codes to create local servers on mobile devices for handling computations.
- Enhanced App Integration: Improve shopping history records and connect the app with eyewear using QR codes.
- Product Recognition: Uses a YOLO-NAS model trained on the SKU-110K dataset for object detection and VGG-16 (with transfer learning) for feature extraction.
- Navigation Assistance: Navigation instructions are verified through a serial monitor because haptic motors and speakers were unavailable.
- Audio Feedback: Implements keyword spotting and speech-to-text transcription using the DeepSpeech API hosted on a Flask server.
- Pipeline:
  - Detect products in dense environments using YOLO-NAS.
  - Crop and preprocess detected products.
  - Use VGG-16 with K-Nearest Neighbor for product identification.
- Currency Recognition: Applies the same pipeline as product recognition.
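
The identification step of the pipeline above can be sketched as a nearest-neighbour vote over feature vectors. This is a minimal illustration, not the project's code: it assumes the embeddings have already been produced by a feature extractor such as VGG-16, and the names `cosine_distance`, `knn_identify`, and `GALLERY`, the cosine metric, and the toy three-dimensional vectors are all hypothetical.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def knn_identify(query, gallery, k=3):
    """Label a query embedding by majority vote among its k nearest
    gallery embeddings. `gallery` is a list of (label, embedding) pairs."""
    if not gallery:
        raise ValueError("gallery is empty")
    ranked = sorted(gallery, key=lambda item: cosine_distance(query, item[1]))
    top = ranked[:k]
    votes = Counter(label for label, _ in top)
    best_count = max(votes.values())
    # Break ties in favour of the closest neighbour among the tied labels.
    for label, _ in top:
        if votes[label] == best_count:
            return label


# Toy gallery: real embeddings would be much longer feature vectors
# (assumption: one or more reference embeddings per known product).
GALLERY = [
    ("cereal", [0.9, 0.1, 0.0]),
    ("cereal", [0.8, 0.2, 0.1]),
    ("soap", [0.1, 0.9, 0.2]),
    ("soap", [0.0, 0.8, 0.3]),
    ("juice", [0.1, 0.2, 0.9]),
]

print(knn_identify([0.85, 0.15, 0.05], GALLERY))
```

Because currency recognition is described as reusing the same pipeline, the same lookup would apply with a gallery of banknote embeddings instead of product embeddings.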
- Visual Markers:
  - Detect floor markers using computer vision (HSV values, Canny Edge Detection, and Hough Transform).
  - Combine marker data with IMU sensor readings for precise localization.
- Pathfinding: Utilize the A* algorithm to guide users to their destination.
- Haptic Feedback: Simulated via coin buzzers for directional guidance.
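
As a rough sketch of the pathfinding step, the following runs A* on a small occupancy grid and converts the resulting path into direction cues of the kind a buzzer or serial monitor could relay. The grid layout, the 4-connected movement model, the Manhattan heuristic, and the names `astar`, `path_to_moves`, and `MALL` are assumptions made for illustration; the project's actual map representation is not described here.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of strings ('#' = blocked).
    Returns a list of (row, col) cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = current
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = current
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None


# Map each grid step to a direction cue (hypothetical naming).
MOVES = {(-1, 0): "north", (1, 0): "south", (0, -1): "west", (0, 1): "east"}


def path_to_moves(path):
    """Convert consecutive cells into a list of direction cues."""
    return [MOVES[(b[0] - a[0], b[1] - a[1])] for a, b in zip(path, path[1:])]


MALL = [
    "....",
    ".##.",
    "....",
]

route = astar(MALL, (0, 0), (2, 3))
print(path_to_moves(route))
```

In a real deployment each grid cell would correspond to a floor marker or a fixed patch of floor, and each cue would trigger the matching buzzer pattern.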
- Flask Server: Handles image processing and communicates with the eyewear.
- Firebase Integration: Uses Notecard and Notehub.io for data exchange with the cloud database.
- STM32F411CE (Blackpill): Microcontroller for navigation and control.
- ESP32S3-SENSE: For audio data collection and communication.
- MPU9050 Sensor (IMU): For dead reckoning and orientation.
- Notecard and Notecarrier: For cloud data exchange.
- VCC: 3.3V on STM32F411
- GND: GND on STM32F411
- SCL: PB6 on STM32F411
- SDA: PB7 on STM32F411
- AD0: GND (0x68) or VCC (0x69)
- RX/TX to corresponding TX/RX on STM32F411CE
- SCL: STM32F411 PB6 (I2C1 SCL)
- SDA: STM32F411 PB7 (I2C1 SDA)
- V+: 5V USB Power Supply or 3.3V from STM32F411
- GND: GND of STM32F411
- A 3.3V voltage regulator ensures stable power for STM32F411CE and ESP32S3-SENSE.
- Both devices are powered through regulated 3.3V outputs, with grounds connected.
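
The IMU wired above is listed for dead reckoning and orientation, and the navigation design combines it with floor markers. One way that could work is sketched below in Python for readability (firmware on the STM32 would be written differently). The class name `DeadReckoner`, the use of a yaw rate in degrees per second, the externally supplied walking speed, and the marker-snapping correction are all assumptions, not the project's implementation.

```python
import math


class DeadReckoner:
    """Tracks a 2D position estimate from yaw rate and forward speed."""

    def __init__(self, x=0.0, y=0.0, heading_deg=0.0):
        self.x = x
        self.y = y
        self.heading_deg = heading_deg

    def update(self, gyro_z_dps, speed_mps, dt):
        """Integrate the gyro yaw rate over dt seconds, then advance the
        position along the new heading. speed_mps is assumed to come from
        elsewhere (for example, step detection)."""
        self.heading_deg = (self.heading_deg + gyro_z_dps * dt) % 360.0
        theta = math.radians(self.heading_deg)
        self.x += speed_mps * dt * math.cos(theta)
        self.y += speed_mps * dt * math.sin(theta)

    def snap_to_marker(self, marker_x, marker_y):
        """Discard accumulated drift when a floor marker with a known
        position is detected."""
        self.x, self.y = marker_x, marker_y


tracker = DeadReckoner()
tracker.update(gyro_z_dps=0.0, speed_mps=1.0, dt=2.0)   # walk straight for 2 s
tracker.update(gyro_z_dps=90.0, speed_mps=1.0, dt=1.0)  # turn 90 degrees while walking
print(round(tracker.x, 3), round(tracker.y, 3))
```

Integration error grows with time, which is why an absolute reference such as a detected marker is useful for resetting the estimate.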
- Install Flask and required dependencies:

      pip install flask firebase-admin deepspeech

- Run the Flask application:

      python app.py

- Register on Notehub.io.
- Create routes to connect Notecard to the Flask server.
- Add the provided endpoints to handle data exchange:

      @app.route('/notehubWebhook', methods=['POST'])
      def notehub_webhook():
          # Handle data and save to database
          return '', 204  # placeholder response until the handler is filled in

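
The body of the webhook handler is left open above. A minimal sketch of what the "handle data and save to database" step might look like follows, written as a plain function so it can be exercised without a running server. The function name `handle_notehub_event`, the in-memory `SHOPPING_HISTORY` list standing in for the cloud database, and the assumption that each event is a JSON object with `device`, `file`, and `body` fields are illustrative; check the actual event format configured in your Notehub route.

```python
import json

# In-memory stand-in for the cloud database (assumption for this sketch).
SHOPPING_HISTORY = []


def handle_notehub_event(raw_body):
    """Validate a routed event and store it.
    Returns a (response_dict, http_status) pair."""
    try:
        event = json.loads(raw_body)
    except (ValueError, TypeError):
        return {"error": "body is not valid JSON"}, 400
    if not isinstance(event, dict) or not isinstance(event.get("body"), dict):
        return {"error": "missing 'body' object"}, 400
    record = {
        "device": event.get("device"),  # which eyewear unit sent the note
        "file": event.get("file"),      # which notefile it was queued in
        "data": event["body"],          # the application payload itself
    }
    SHOPPING_HISTORY.append(record)
    return {"status": "stored", "count": len(SHOPPING_HISTORY)}, 200


sample = '{"device": "dev:1", "file": "history.qo", "body": {"item": "soap"}}'
print(handle_notehub_event(sample))
```

Inside the Flask view, this could be called with `request.get_data()` and the result returned through `jsonify`, with the list replaced by a write to the real database.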
- Product Identification: Capture an image using the eyewear camera. Audio feedback will provide product details.
- Mall Navigation: Follow haptic feedback and audio instructions for guidance.
- Data Management: Use the connected app to view shopping history.
- Incorporate real-time object detection using IR sensors.
- Enable decentralized processing using mobile-hosted servers.
- Enhance navigation with advanced visual marker detection and integration.
- SKU-110K Dataset: For providing realistic retail images.
- DeepSpeech API: For enabling speech-to-text conversion.
- Flask: For powering the backend server.