Skip to content

Each call to session.query() takes at least 13ms #29

@Musinux

Description

@Musinux

Hi chdb Team 👋

I'd like to raise an issue with the baseline time it takes to execute any query. The most minimal SELECT * FROM numbers(1) query takes ~13ms on my computer, and I have a hard time understanding why. Is this something that you have on your radar?

To give you context, my team wants to implement clickhouse in production, and we want to be able to run chdb as the backend for our test suite, that will cleanup the tables between each test, so such numbers add up with 30k tests.

Do you have any idea on how to optimize that? I've started stacking multiple queries in one call, but that's not a very clean solution. any help would be appreciated.

here's a sample script I've used for getting some benchmark.

const database = 'local';

const tableSchema = `
CREATE TABLE IF NOT EXISTS ${database}.chat_events
(
    timestamp                   DateTime64,
    tenant_id                   Int64,
    conversation_id             Int64,
    event_type                  String,
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (tenant_id, toStartOfDay(timestamp), timestamp)
PRIMARY KEY (tenant_id, toStartOfDay(timestamp));
`;


async function run() {
  let time = performance.now();
  const chdb = await import('chdb');
// took 100ms
  console.log(`Importing chdb took ${Math.trunc(performance.now() - time)}ms`);

  const {Session} = chdb;
  const sess = new Session();
  time = performance.now();
  sess.query(`CREATE DATABASE IF NOT EXISTS ${database}`);
// 42ms
  console.log(`Creating database took ${Math.trunc(performance.now() - time)}ms`);

  time = performance.now();
  sess.query(tableSchema);
// 19ms
  console.log(`Creating table took ${Math.trunc(performance.now() - time)}ms`);

  const rows: Array<any> = [];
  for (let i = 0; i < 10000; i++) {
    rows.push({
      timestamp: (new Date()).getTime(),
      tenant_id: 1,
      conversation_id: i,
      event_type: 'message',
    });
  }

  time = performance.now();
  sess.query(`INSERT INTO ${database}.chat_events FORMAT JSONEachRow ${JSON.stringify(rows)}`);
// 30ms
  console.log(`Inserting rows took ${Math.trunc(performance.now() - time)}ms`);

  time = performance.now();
  sess.query('SELECT * FROM numbers(1)');
// 15ms
  console.log(`SELECT * FROM numbers(1) session query took ${Math.trunc(performance.now() - time)}ms`);

  time = performance.now();
// 13ms
  chdb.query('SELECT * FROM numbers(1)');
  console.log(`SELECT * FROM numbers(1) query took ${Math.trunc(performance.now() - time)}ms`);
}

run();

the output:

$ npm run ts-node minimal-test.ts
Importing chdb took 100ms
Creating database took 42ms
Creating table took 19ms
Inserting rows took 30ms
SELECT * FROM numbers(1) session query took 15ms
SELECT * FROM numbers(1) query took 13ms

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions