CSV Bulk Upload with Parallel Processing using Node.js, MongoDB, and Worker Threads

This repository contains a Node.js application that processes large CSV files for bulk upload to a MongoDB database. The application leverages worker threads for parallel processing to efficiently handle large datasets. Progress updates are provided via socket.io to keep the user informed during the upload process.
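
The worker-thread fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `runBatchInWorker`, `processBatches`, and `progressPercent` are hypothetical names, the worker body is inlined as a string to keep the sketch in one file, and the worker only counts rows where a real one would insert them into MongoDB.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker body. A real worker would write the batch to MongoDB;
// this one only reports how many rows it received.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  parentPort.postMessage({ processed: workerData.batch.length });
`;

// Percentage of batches finished, rounded to a whole number.
function progressPercent(done, total) {
  return total === 0 ? 100 : Math.round((done / total) * 100);
}

// Run one batch in its own worker thread and resolve with the worker's reply.
function runBatchInWorker(batch) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { batch } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

// Process all batches in parallel, calling onProgress after each one completes.
async function processBatches(batches, onProgress = () => {}) {
  let done = 0;
  const results = await Promise.all(
    batches.map(async (batch) => {
      const result = await runBatchInWorker(batch);
      done += 1;
      onProgress(progressPercent(done, batches.length));
      return result;
    })
  );
  return results.reduce((sum, r) => sum + r.processed, 0);
}
```

The `onProgress` callback is where a socket.io emit could be placed. Note that this sketch starts one worker per batch; for very large files, a fixed-size worker pool would likely be a better fit.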

Features

  • CSV Parsing: Efficiently parses large CSV files using streams.
  • Batch Processing: Organizes data into batches for optimized database insertion.
  • Parallel Processing: Uses worker threads to process batches in parallel.
  • Progress Updates: Real-time progress updates via socket.io.
  • Server Crash Recovery: Mechanism to resume uploads in case of server failure.
  • Error Handling: Robust error handling and logging.
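
As an illustration of the batch processing feature, the following sketch groups rows into fixed-size batches. The helper name `toBatches` is hypothetical and not taken from the repository. It accepts any iterable of rows; in the application, the rows would come from the CSV stream.

```javascript
// Group any iterable of rows into arrays of at most `batchSize` rows.
function* toBatches(rows, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  let batch = [];
  for (const row of rows) {
    batch.push(row);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final, partial batch
}

// Example: five rows with a batch size of two give three batches.
const batches = [...toBatches(['a', 'b', 'c', 'd', 'e'], 2)];
// batches is [['a', 'b'], ['c', 'd'], ['e']]
```

Because it is a generator, batches are produced one at a time and the full dataset does not have to be held in memory at once.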

Getting Started

Prerequisites

  • Node.js
  • MongoDB

Installation

  1. Clone the repository

    git clone https://github.com/Mrityunjay383/BulkUpload.git
    cd BulkUpload/server
  2. Install dependencies

    npm install
  3. Set up environment variables. Create a .env file in the root directory with the following contents:

    MONGO_URI=mongodb://localhost:27017/yourdbname
    PORT=5000
    BATCH_SIZE=1000
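
Once these variables are in the process environment, they could be read and validated along the following lines. This README does not say how the project loads the .env file, so that step is left out, and `loadConfig` is a hypothetical helper. The defaults mirror the example values above.

```javascript
// Read the three settings from an environment object, defaulting to process.env.
function loadConfig(env = process.env) {
  if (!env.MONGO_URI) {
    throw new Error('MONGO_URI is required');
  }
  const port = Number.parseInt(env.PORT ?? '5000', 10);
  const batchSize = Number.parseInt(env.BATCH_SIZE ?? '1000', 10);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error('PORT must be a positive integer');
  }
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error('BATCH_SIZE must be a positive integer');
  }
  return { mongoUri: env.MONGO_URI, port, batchSize };
}
```

Failing fast on a missing or malformed value at startup tends to be easier to debug than a failed insert midway through an upload.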
    

Running the Application

  1. Run the application

    npm run dev
  2. Access the application. With the PORT value from the example .env file, it is served at http://localhost:5000.

API Endpoints

  • Upload CSV File

    Endpoint: POST /upload

    • Description: Upload a CSV file for processing.
    • Request: Form-data with a file field named file.
    • Response: JSON containing the upload ID.
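
A client call to this endpoint might look like the sketch below. It assumes Node.js 18 or later for the global `fetch`, `FormData`, and `Blob`. The helper names `buildUploadForm` and `uploadCsv` are hypothetical. The exact shape of the JSON response is not documented here, so the parsed body is returned as-is.

```javascript
// Build multipart form data with the CSV in a field named "file".
function buildUploadForm(csvText, filename = 'data.csv') {
  const form = new FormData();
  form.append('file', new Blob([csvText], { type: 'text/csv' }), filename);
  return form;
}

// POST the CSV to /upload and return the parsed JSON response.
async function uploadCsv(baseUrl, csvText) {
  const response = await fetch(new URL('/upload', baseUrl), {
    method: 'POST',
    body: buildUploadForm(csvText),
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.json();
}
```

With the server running locally, a call such as `uploadCsv('http://localhost:5000', 'name,age\nAda,36\n')` would send a two-line CSV.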

Contributing

  1. Fork the repository
  2. Create a new branch
    git checkout -b feature-name
  3. Commit your changes
    git commit -m "Add some feature"
  4. Push to the branch
    git push origin feature-name
  5. Open a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.
