This guide covers fulltext search capabilities in the MatrixOne Python SDK, including fulltext indexing, search modes, and advanced features.
MatrixOne provides powerful fulltext search capabilities for text analysis and information retrieval:
- Multiple Algorithms: Support for TF-IDF and BM25 algorithms
- Search Modes: Natural language and boolean search modes
- Client Interface: Easy-to-use client methods for fulltext operations
- ORM Integration: Seamless integration with SQLAlchemy models
- Multi-language Support: Support for various languages including Chinese
- JSON Search: Fulltext search within JSON documents
from matrixone import Client
# Create client and connect (see configuration_guide for connection options)
client = Client()
client.connect(host="127.0.0.1", port=6001, user="root", password="111", database="test")
# Create table with text content using create_table API
client.create_table("articles", {
"id": "int",
"title": "varchar(200)",
"content": "text",
"summary": "text",
"author": "varchar(100)",
"category": "varchar(50)",
"tags": "varchar(200)",
"metadata": "json"
}, primary_key="id")
# Create another table for documents
client.create_table("documents", {
"id": "int",
"filename": "varchar(255)",
"file_content": "text",
"description": "text",
"file_type": "varchar(50)"
}, primary_key="id")# Create fulltext index using fulltext_index API
client.fulltext_index.create("articles", name="idx_content", columns="content", algorithm="BM25")
client.fulltext_index.create("articles", name="idx_title", columns="title", algorithm="TF-IDF")
client.fulltext_index.create("documents", name="idx_file_content", columns="file_content", algorithm="BM25")
# Create fulltext index on multiple columns
client.fulltext_index.create("articles", name="idx_title_content", columns=["title", "content"], algorithm="BM25")
# Drop a fulltext index
client.fulltext_index.drop("articles", "idx_title")MatrixOne supports specialized parsers for different content types:
JSON Parser - For searching within JSON documents:
Note
JSON Parser Key Points:
- The JSON parser indexes VALUES in JSON documents, not keys
- Perfect for searching product specifications, user preferences, configuration data
- Supports nested JSON structures
- Works with both JSON columns and TEXT columns containing JSON
- Can be combined with regular SQL filters for powerful queries
from matrixone import Client, FulltextParserType
from matrixone.orm import declarative_base
from sqlalchemy import Column, BigInteger, Text
client = Client()
client.connect(host="127.0.0.1", port=6001, user="root", password="111", database="test")
# Enable fulltext functionality
client.fulltext_index.enable_fulltext()
# Method 1: Using ORM with JSON parser
Base = declarative_base()
class Product(Base):
__tablename__ = "products"
id = Column(BigInteger, primary_key=True)
details = Column(Text) # Stores JSON as text
# Define fulltext index with JSON parser in table definition
__table_args__ = (
FulltextIndex("ftidx_details", "details", parser=FulltextParserType.JSON),
)
# Create table with ORM (index is created automatically)
client.create_table(Product)
# Insert JSON data
products = [
{"id": 1, "details": '{"name": "Laptop", "brand": "Dell", "price": 1200}'},
{"id": 2, "details": '{"name": "Phone", "brand": "Apple", "price": 800}'},
{"id": 3, "details": '{"name": "Tablet", "brand": "Samsung", "price": 600}'},
]
client.batch_insert(Product, products)
# Search within JSON content
result = client.query(Product).filter(
boolean_match(Product.details).must("Dell")
).execute()
for row in result.fetchall():
print(f"Found: {row.details}")
# Method 2: Create JSON index on existing table
client.execute("CREATE TABLE json_docs (id INT PRIMARY KEY, data TEXT)")
client.execute(
"CREATE FULLTEXT INDEX ftidx_json ON json_docs (data) WITH PARSER json"
)
# Insert and search JSON data
client.execute(
"INSERT INTO json_docs VALUES "
"(1, '{\"title\": \"Python Tutorial\", \"tags\": [\"python\", \"programming\"]}'), "
"(2, '{\"title\": \"Java Guide\", \"tags\": [\"java\", \"programming\"]}'))"
)
result = client.execute(
"SELECT * FROM json_docs WHERE MATCH(data) AGAINST('python' IN BOOLEAN MODE)"
)
for row in result.fetchall():
print(f"ID: {row[0]}, Data: {row[1]}")NGRAM Parser - For Chinese and other Asian languages:
from matrixone import Client, FulltextParserType
from matrixone.orm import declarative_base
from matrixone.sqlalchemy_ext import FulltextIndex, natural_match, boolean_match
from sqlalchemy import Column, Integer, String, Text
client = Client()
client.connect(host="127.0.0.1", port=6001, user="root", password="111", database="test")
# Enable fulltext functionality
client.fulltext_index.enable_fulltext()
# Method 1: Using ORM with NGRAM parser for Chinese content
Base = declarative_base()
class ChineseArticle(Base):
__tablename__ = "chinese_articles"
id = Column(Integer, primary_key=True, autoincrement=True)
title = Column(String(200))
body = Column(Text)
# Define fulltext index with NGRAM parser for Chinese tokenization
__table_args__ = (
FulltextIndex("ftidx_chinese", ["title", "body"], parser=FulltextParserType.NGRAM),
)
# Create table with ORM (index is created automatically)
client.create_table(ChineseArticle)
# Insert Chinese content
articles = [
{"id": 1, "title": "神雕侠侣 第一回", "body": "越女采莲秋水畔,窄袖轻罗,暗露双金钏"},
{"id": 2, "title": "神雕侠侣 第二回", "body": "正自发痴,忽听左首屋中传出一人喝道"},
{"id": 3, "title": "神雕侠侣 第三回", "body": "郭靖在舟中潜运神功,数日间伤势便已痊愈了大半"},
]
client.batch_insert(ChineseArticle, articles)
# Search Chinese content with natural language mode
result = client.query(ChineseArticle).filter(
natural_match(ChineseArticle.title, ChineseArticle.body, query="神雕侠侣")
).execute()
print(f"Found {len(result.fetchall())} articles about 神雕侠侣")
# Search with boolean mode
result = client.query(ChineseArticle).filter(
boolean_match(ChineseArticle.title, ChineseArticle.body).must("郭靖")
).execute()
for row in result.fetchall():
print(f"Title: {row.title}, Body: {row.body[:20]}...")
# Method 2: Create NGRAM index on existing table
client.execute(
"CREATE TABLE chinese_docs ("
"id INT PRIMARY KEY, "
"title VARCHAR(200), "
"content TEXT"
")"
)
client.execute(
"CREATE FULLTEXT INDEX ftidx_ngram ON chinese_docs (title, content) "
"WITH PARSER ngram"
)
# Insert and search Chinese content
client.execute(
"INSERT INTO chinese_docs VALUES "
"(1, 'MO全文索引示例', '这是一个关于MO全文索引的例子'), "
"(2, 'ngram解析器', 'ngram解析器允许MO对中文进行分词')"
)
result = client.execute(
"SELECT * FROM chinese_docs "
"WHERE MATCH(title, content) AGAINST('全文索引' IN NATURAL LANGUAGE MODE)"
)
for row in result.fetchall():
print(f"Title: {row[1]}, Content: {row[2]}")Mixed Content (English + Chinese):
from matrixone import Client, FulltextParserType
from matrixone.orm import declarative_base
from matrixone.sqlalchemy_ext import FulltextIndex, natural_match
from sqlalchemy import Column, Integer, String, Text
client = Client()
client.connect(host="127.0.0.1", port=6001, user="root", password="111", database="test")
client.fulltext_index.enable_fulltext()
Base = declarative_base()
class MixedContent(Base):
__tablename__ = "mixed_articles"
id = Column(Integer, primary_key=True, autoincrement=True)
title = Column(String(255))
content = Column(Text)
# NGRAM parser works well for mixed English/Chinese content
__table_args__ = (
FulltextIndex("ftidx_mixed", ["title", "content"], parser=FulltextParserType.NGRAM),
)
client.create_table(MixedContent)
# Insert mixed content
articles = [
{"id": 1, "title": "MO全文索引示例", "content": "这是关于MO fulltext index的例子"},
{"id": 2, "title": "Python教程", "content": "Learn Python programming with 中文教程"},
]
client.batch_insert(MixedContent, articles)
# Search for Chinese terms
result = client.query(MixedContent).filter(
natural_match(MixedContent.title, MixedContent.content, query="全文索引")
).execute()
# Search for English terms
result = client.query(MixedContent).filter(
natural_match(MixedContent.title, MixedContent.content, query="Python")
).execute()# Insert articles using insert API
articles = [
{
"id": 1,
"title": "Introduction to Machine Learning",
"content": "Machine learning is a subset of artificial intelligence that focuses on algorithms and statistical models. It enables computers to learn and make decisions from data without being explicitly programmed.",
"summary": "An overview of machine learning concepts and applications",
"author": "John Doe",
"category": "Technology",
"tags": "AI, ML, algorithms",
"metadata": '{"language": "English", "difficulty": "beginner"}'
},
{
"id": 2,
"title": "Deep Learning Fundamentals",
"content": "Deep learning uses neural networks with multiple layers to model and understand complex patterns in data. It has revolutionized fields like computer vision, natural language processing, and speech recognition.",
"summary": "Understanding deep learning and neural networks",
"author": "Jane Smith",
"category": "Technology",
"tags": "deep learning, neural networks, AI",
"metadata": '{"language": "English", "difficulty": "intermediate"}'
}
]
for article in articles:
client.insert("articles", article)
# Insert documents using batch_insert API
documents = [
{
"id": 1,
"filename": "research_paper.pdf",
"file_content": "This research paper discusses advanced machine learning techniques and their applications in real-world scenarios.",
"description": "Academic research paper on ML",
"file_type": "PDF"
}
]
client.batch_insert("documents", documents)# Natural language search - automatically handles stopwords, stemming, and relevance scoring
# This mode is ideal for user queries and general search applications
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author"
).filter(natural_match("content", query="machine learning")).execute()
print("Natural language search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Boolean search with phrase matching - provides precise control over search terms
# Use phrase() for exact phrase matching, encourage() for boosting relevance
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author"
).filter(
boolean_match("content").phrase("deep learning").encourage("networks")
).execute()
print("Boolean search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Search with relevance scoring - returns a relevance score for ranking results
# Higher scores indicate better matches; useful for search result ranking
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author",
natural_match("content", query="artificial intelligence").label("relevance")
).execute()
print("Search with relevance scoring:")
for row in result.fetchall():
print(f" {row[1]} (Relevance: {row[4]:.4f})")
# Simple search without ordering - just get matching results
result = client.query(
"articles.id",
"articles.title",
"articles.content"
).filter(boolean_match("content").must("artificial intelligence")).execute()
print("Simple search results:")
for row in result.fetchall():
print(f" {row[1]}")
# Using ORM models for fulltext search
from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
Base = declarative_base()
class Article(Base):
__tablename__ = 'articles'
id = Column(Integer, primary_key=True)
title = Column(String(200))
content = Column(Text)
author = Column(String(100))
category = Column(String(50))
# Natural language search with model
result = client.query(Article).filter(
natural_match(Article.content, query="machine learning")
).execute()
print("Natural language search with model:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Boolean search with model
result = client.query(Article).filter(
boolean_match(Article.content).phrase("deep learning").encourage("networks")
).execute()
print("Boolean search with model:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Search with scoring using model
result = client.query(
Article.id,
Article.title,
Article.content,
Article.author,
natural_match(Article.content, query="artificial intelligence").label("relevance")
).execute()
print("Search with scoring using model:")
for row in result.fetchall():
print(f" {row[1]} (Relevance: {row[4]:.4f})")MatrixOne supports two fulltext search modes with different operators and use cases:
Best for: User-facing search, Google-style queries, general search applications
Features:
- Automatic stopword removal (e.g., "the", "a", "is", "to")
- Automatic word stemming (e.g., "running" → "run")
- Natural relevance scoring (BM25 or TF-IDF)
- No special operators needed
- Perfect for search boxes and user queries
Usage:
from matrixone.sqlalchemy_ext import natural_match
# Simple keyword search
result = client.query(Article).filter(
natural_match(Article.content, query="machine learning")
).execute()
# Multi-word natural query
result = client.query(Article).filter(
natural_match(Article.title, Article.content,
query="how to learn python programming")
).execute()
# Question-like queries (stopwords handled automatically)
result = client.query(Article).filter(
natural_match(Article.content, query="what is deep learning")
).execute()Best for: User queries, search boxes, general search
Best for: Precise control, advanced filters, complex logic
Boolean Mode Operators:
MUST (+ operator, required terms):
from matrixone.sqlalchemy_ext import boolean_match # Must contain "machine" AND "learning" result = client.query(Article).filter( boolean_match(Article.content).must("machine", "learning") ).execute() # Must contain "Python" result = client.query(Article).filter( boolean_match(Article.content).must("Python") ).execute()
MUST_NOT (- operator, excluded terms):
# Contains "programming" but NOT "legacy" result = client.query(Article).filter( boolean_match(Article.content) .must("programming") .must_not("legacy") ).execute() # Contains "learning" but NOT "deep" result = client.query(Article).filter( boolean_match(Article.content) .must("learning") .must_not("deep") ).execute()
ENCOURAGE (boost relevance, optional terms):
# Must have "Python", boost if has "data" or "science" result = client.query(Article).filter( boolean_match(Article.content) .must("Python") .encourage("data", "science") ).execute() # Articles with encouraged terms rank higher
DISCOURAGE (~ operator, reduce relevance):
# Must have "Python", discourage "legacy" (still matches but ranks lower) result = client.query(Article).filter( boolean_match(Article.content) .must("Python") .encourage("modern") .discourage("legacy") ).execute()
PHRASE ("" operator, exact phrase matching):
# Exact phrase "neural networks" result = client.query(Article).filter( boolean_match(Article.content).phrase("neural networks") ).execute() # Exact phrase "best practices" result = client.query(Article).filter( boolean_match(Article.content).phrase("best practices") ).execute()
GROUP (combine terms with OR logic):
from matrixone.sqlalchemy_ext.fulltext_search import group # Must contain either "programming" OR "development" result = client.query(Article).filter( boolean_match(Article.content) .must(group().medium("programming", "development")) ).execute()
Complex Boolean Queries:
# Complex example: All operators combined
result = client.query(Article).filter(
boolean_match(Article.content)
.must("learning") # Required term
.must(group().medium("machine", "deep")) # Required: either "machine" OR "deep"
.encourage("tutorial") # Optional: boosts relevance
.discourage("advanced") # Optional: reduces relevance
.must_not("legacy") # Excluded term
).execute()
# Practical example: Python articles
result = client.query(Article).filter(
boolean_match(Article.title, Article.content)
.must("Python") # Must contain Python
.encourage("beginner", "tutorial", "guide") # Prefer beginner content
.must_not("deprecated", "outdated") # Exclude old content
).execute()Quick Comparison:
| Feature | Natural Mode | Boolean Mode |
|---|---|---|
| Query Style | "how to learn python" | must("python") |
| Processing | Auto (stopwords/stem) | Exact terms |
| Best For | Search boxes | Advanced filters |
# Multi-column search - searches across multiple text columns simultaneously
# The columns must match exactly what's defined in your fulltext index
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author"
).filter(natural_match("title", "content", query="machine learning")).execute()
print("Multi-column search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Combined search with SQL filters - combines fulltext search with regular SQL conditions
# This allows you to filter by metadata while searching text content
# Method 1: Multiple conditions in single filter()
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author"
).filter(
natural_match("content", query="AI"),
"articles.category = 'Technology'"
).execute()
print("Filtered search results (single filter):")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Method 2: Chained filter() calls
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author"
).filter(boolean_match("content").must("AI")).filter("articles.category = 'Technology'").execute()
print("Filtered search results (chained filters):")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Method 3: Complex filtering with multiple conditions
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author",
"articles.category"
).filter(
boolean_match("content").encourage("programming"),
"articles.category = 'Programming'",
"articles.id > 1"
).execute()
print("Complex filtered search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]} - {row[4]}")
# Paginated search results - useful for large result sets
# LIMIT controls how many results to return, OFFSET skips the first N results
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author",
natural_match("content", query="learning").label("relevance")
).limit(2).offset(1).execute()
print("Paginated search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]} (Score: {row[4]:.4f})")
# Simple pagination without ordering - just get next N results
result = client.query(
"articles.id",
"articles.title",
"articles.content"
).filter(boolean_match("content").must("learning")).limit(2).offset(1).execute()
print("Simple paginated results:")
for row in result.fetchall():
print(f" {row[1]}")You can combine fulltext search with regular SQL filters in several ways:
# Method 1: Multiple conditions in single filter() call
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author",
"articles.category"
).filter(
boolean_match("content").must("python"), # Fulltext condition
"articles.category = 'Programming'", # SQL condition 1
"articles.id > 1", # SQL condition 2
"articles.author LIKE '%Smith%'" # SQL condition 3
).execute()
# Method 2: Chained filter() calls (more readable for complex queries)
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author",
"articles.category"
).filter(boolean_match("content").encourage("programming")) # Fulltext condition
.filter("articles.category = 'Programming'") # SQL condition 1
.filter("articles.id > 1") # SQL condition 2
.filter("articles.author LIKE '%Smith%'") # SQL condition 3
.execute()
# Method 3: Using ORM model attributes (when available)
from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
# Define ORM model
Base = declarative_base()
class Article(Base):
__tablename__ = 'articles'
id = Column(Integer, primary_key=True)
title = Column(String(200))
content = Column(Text)
author = Column(String(100))
category = Column(String(50))
# Using model class in queries
result = client.query(Article).filter(
boolean_match(Article.content).must("python"),
Article.category == "Programming",
Article.id > 1,
Article.author.like("%Smith%")
).execute()
# Using model with natural_match
result = client.query(Article).filter(
natural_match(Article.title, Article.content, query="machine learning")
).execute()
# Using model with scoring
result = client.query(
Article.id,
Article.title,
Article.content,
boolean_match(Article.content).encourage("python").label("score")
).execute()
# Method 4: Complex filtering with IN, BETWEEN, and other operators
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.category",
"articles.tags"
).filter(
natural_match("title", "content", query="machine learning"),
"articles.category IN ('AI', 'Technology', 'Programming')",
"articles.id BETWEEN 1 AND 10",
"articles.tags LIKE '%tutorial%'",
"articles.author IS NOT NULL"
).execute()
# Method 5: Combining with scoring
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.category",
boolean_match("title", "content").encourage("python").label("score")
).filter(
"articles.category = 'Programming'",
"articles.id > 1"
).limit(5).execute()
# Method 6: Using SQLAlchemy and_/or_ for complex conditions
from sqlalchemy import and_, or_
# Logical AND: Combine fulltext search with category filter
fulltext_condition = boolean_match("title", "content").must("python")
category_condition = "articles.category = 'Programming'"
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.category"
).filter(and_(fulltext_condition, category_condition)).execute()
# Logical OR: Combine different category conditions
programming_condition = "articles.category = 'Programming'"
ai_condition = "articles.category = 'AI'"
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.category"
).filter(or_(programming_condition, ai_condition)).execute()
# Filter by multiple values using SQL IN
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.category",
"articles.author"
).filter("articles.category IN ('Programming', 'AI', 'Technology')").execute()
# SQL IN with fulltext search
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.author"
).filter(
boolean_match("title", "content").encourage("python"),
"articles.author IN ('John Doe', 'Jane Smith', 'Bob Wilson')"
).execute()
# Complex nested logical conditions
result = client.query(
"articles.id",
"articles.title",
"articles.content",
"articles.category",
"articles.author"
).filter(
and_(
boolean_match("title", "content").encourage("programming"),
or_(
"articles.category IN ('Programming', 'AI')",
"articles.category = 'Technology'"
),
"articles.author IN ('John Doe', 'Jane Smith')",
"articles.id > 1"
)
).execute()# AND operator - both terms must be present in the document
# Use must() for required terms (AND logic)
result = client.query(
"articles.id",
"articles.title",
"articles.content"
).filter(boolean_match("content").must("machine", "learning")).execute()
print("AND search results:")
for row in result.fetchall():
print(f" {row[1]}")
# OR operator - at least one of the terms must be present
# Use group().medium() for OR logic within required conditions
result = client.query(
"articles.id",
"articles.title",
"articles.content"
).filter(boolean_match("content").must(group().medium("deep", "neural"))).execute()
print("OR search results:")
for row in result.fetchall():
print(f" {row[1]}")
# NOT operator (exclusion) - documents containing the excluded term are filtered out
# Use must_not() to exclude documents with specific terms
result = client.query(
"articles.id",
"articles.title",
"articles.content"
).filter(boolean_match("content").must("machine").must_not("learning")).execute()
print("NOT search results:")
for row in result.fetchall():
print(f" {row[1]}")
# Phrase search - exact phrase matching
# Use phrase() for exact phrase matching
result = client.query(
"articles.id",
"articles.title",
"articles.content"
).filter(boolean_match("content").phrase("artificial intelligence")).execute()
print("Phrase search results:")
for row in result.fetchall():
print(f" {row[1]}")
# Using ORM models with boolean search operators
from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
Base = declarative_base()
class Article(Base):
__tablename__ = 'articles'
id = Column(Integer, primary_key=True)
title = Column(String(200))
content = Column(Text)
author = Column(String(100))
category = Column(String(50))
# AND operator with model
result = client.query(Article).filter(
boolean_match(Article.content).must("machine", "learning")
).execute()
print("AND search with model:")
for row in result.fetchall():
print(f" {row[1]}")
# OR operator with model
result = client.query(Article).filter(
boolean_match(Article.content).must(group().medium("deep", "neural"))
).execute()
print("OR search with model:")
for row in result.fetchall():
print(f" {row[1]}")
# NOT operator with model
result = client.query(Article).filter(
boolean_match(Article.content).must("machine").must_not("learning")
).execute()
print("NOT search with model:")
for row in result.fetchall():
print(f" {row[1]}")
# Phrase search with model
result = client.query(Article).filter(
boolean_match(Article.content).phrase("artificial intelligence")
).execute()
print("Phrase search with model:")
for row in result.fetchall():
print(f" {row[1]}")import asyncio
from matrixone import AsyncClient
from matrixone.config import get_connection_params
async def async_fulltext_example():
# Get connection parameters
host, port, user, password, database = get_connection_params()
client = AsyncClient()
await client.connect(host=host, port=port, user=user, password=password, database=database)
# Create table using async create_table API
await client.create_table("async_articles", {
"id": "int",
"title": "varchar(200)",
"content": "text",
"author": "varchar(100)"
}, primary_key="id")
# Create fulltext index using async fulltext_index API
await client.fulltext_index.create("async_articles", name="idx_content", columns="content", algorithm="BM25")
# Insert data using async insert API
await client.insert("async_articles", {
"id": 1,
"title": "Async Article",
"content": "This is an article created using async operations for fulltext search testing.",
"author": "Async Author"
})
# Fulltext search using async query API
result = await client.query(
"async_articles.id",
"async_articles.title",
"async_articles.content",
"async_articles.author"
).filter(natural_match("content", query="async operations")).execute()
print("Async fulltext search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Using ORM models with async fulltext search
from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
Base = declarative_base()
class AsyncArticle(Base):
__tablename__ = 'async_articles'
id = Column(Integer, primary_key=True)
title = Column(String(200))
content = Column(Text)
author = Column(String(100))
category = Column(String(50))
# Async search with model using boolean_match
result = await client.query(AsyncArticle).filter(
boolean_match(AsyncArticle.content).must("async")
).execute()
print("Async search with model (boolean_match):")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Async search with model using natural_match
result = await client.query(AsyncArticle).filter(
natural_match(AsyncArticle.title, AsyncArticle.content, query="async operations")
).execute()
print("Async search with model (natural_match):")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Async search with model and scoring
result = await client.query(
AsyncArticle.id,
AsyncArticle.title,
AsyncArticle.content,
boolean_match(AsyncArticle.content).encourage("async").label("score")
).execute()
print("Async search with model and scoring:")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# Clean up
await client.drop_table("async_articles")
await client.disconnect()
asyncio.run(async_fulltext_example())from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
from sqlalchemy.orm import sessionmaker
from matrixone import Client
from matrixone.config import get_connection_params
from matrixone.sqlalchemy_ext.fulltext_search import boolean_match, natural_match
# Define ORM models
Base = declarative_base()
class Article(Base):
__tablename__ = 'orm_articles'
id = Column(Integer, primary_key=True, autoincrement=True)
title = Column(String(200), nullable=False)
content = Column(Text, nullable=False)
author = Column(String(100))
category = Column(String(50))
def orm_fulltext_example():
# Get connection parameters
host, port, user, password, database = get_connection_params()
client = Client()
client.connect(host=host, port=port, user=user, password=password, database=database)
# Create table using ORM model
client.create_table(Article)
# Create fulltext index
client.fulltext_index.create("orm_articles", name="idx_content", columns="content", algorithm="BM25")
# Create session
Session = sessionmaker(bind=client.get_sqlalchemy_engine())
session = Session()
# Insert data using ORM
article1 = Article(
title="ORM Article 1",
content="This article demonstrates fulltext search with ORM models in MatrixOne.",
author="ORM Author",
category="Technology"
)
session.add(article1)
session.commit()
# Natural language search - automatically processes query for optimal results
# Handles synonyms, stemming, and stopword removal automatically
result = client.query(Article).filter(natural_match(Article.content, "fulltext search")).execute()
print("Natural language search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Boolean search with must conditions - both terms are required
# Chain multiple must() calls for AND logic; all terms must be present
result = client.query(Article).filter(boolean_match(Article.content).must("fulltext").must("search")).execute()
print("Boolean search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Complex boolean search with multiple operators
# must() = required, encourage() = preferred but optional, must_not() = excluded
result = client.query(Article).filter(
boolean_match(Article.content)
.must("fulltext") # Required: must contain "fulltext"
.encourage("search") # Preferred: boost relevance if present
.must_not("legacy") # Excluded: filter out documents with "legacy"
).execute()
print("Complex boolean search results:")
for row in result.fetchall():
print(f" {row[1]} by {row[3]}")
# Clean up
client.drop_table(Article)
session.close()
client.disconnect()
orm_fulltext_example()Modern ORM-style fulltext queries with boolean_match and natural_match:
from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
from matrixone import Client
from matrixone.config import get_connection_params
from matrixone.sqlalchemy_ext.fulltext_search import boolean_match, natural_match, group
# Define ORM models
Base = declarative_base()
class Article(Base):
__tablename__ = 'advanced_articles'
id = Column(Integer, primary_key=True, autoincrement=True)
title = Column(String(200), nullable=False)
content = Column(Text, nullable=False)
tags = Column(String(500))
category = Column(String(50))
def advanced_orm_fulltext_example():
host, port, user, password, database = get_connection_params()
client = Client()
client.connect(host=host, port=port, user=user, password=password, database=database)
# Create table using ORM model
client.create_table(Article)
# Create fulltext index
client.fulltext_index.create("advanced_articles", name="idx_content_tags", columns=["content", "tags"], algorithm="BM25")
# Insert test data
articles = [
{"title": "Python Programming Guide", "content": "Learn Python programming from basics to advanced concepts.", "tags": "python,programming,tutorial", "category": "Programming"},
{"title": "Machine Learning with Python", "content": "Introduction to machine learning using Python and scikit-learn.", "tags": "python,machine-learning,AI", "category": "AI"},
{"title": "Web Development Tutorial", "content": "Build modern web applications with Python and Django framework.", "tags": "python,web,django", "category": "Web"}
]
client.batch_insert(Article, articles)
# 1. Natural language search - user-friendly, handles variations automatically
# Best for end-user search interfaces; processes "python programming" intelligently
result = client.query(Article).filter(natural_match(Article.content, "python programming")).execute()
print("Natural language search results:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 2. Basic boolean search - exact term matching with required conditions
# Must contain "python" - strict matching without stemming or variations
result = client.query(Article).filter(boolean_match(Article.content).must("python")).execute()
print("\nBoolean search - must contain 'python':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 3. Boolean search with exclusion - filter out unwanted results
# Required: "python", Excluded: "django" - finds Python articles without Django
result = client.query(Article).filter(
boolean_match(Article.content).must("python").must_not("django")
).execute()
print("\nBoolean search - must have 'python', must not have 'django':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 4. Boolean search with preference - boost relevance without filtering
# Required: "python", Preferred: "tutorial" - boosts tutorial results in ranking
result = client.query(Article).filter(
boolean_match(Article.content).must("python").encourage("tutorial")
).execute()
print("\nBoolean search - must have 'python', encourage 'tutorial':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 5. Boolean search with discouragement - lower ranking for certain terms
# Required: "python", Discouraged: "legacy" - lowers ranking of legacy content
result = client.query(Article).filter(
boolean_match(Article.content).must("python").discourage("legacy")
).execute()
print("\nBoolean search - must have 'python', discourage 'legacy':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 6. Group search - logical OR within required conditions
# Must contain either "programming" OR "machine" - flexible matching
result = client.query(Article).filter(
boolean_match(Article.content).must(group().medium("programming", "machine"))
).execute()
print("\nGroup search - must contain either 'programming' or 'machine':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 7. Phrase search - exact phrase matching
# Finds documents containing the exact phrase "machine learning"
result = client.query(Article).filter(
boolean_match(Article.content).phrase("machine learning")
).execute()
print("\nPhrase search - exact phrase 'machine learning':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 8. Prefix search - wildcard matching for word beginnings
# Finds words starting with "python" (e.g., "pythonic", "pythonista")
result = client.query(Article).filter(
boolean_match(Article.content).prefix("python")
).execute()
print("\nPrefix search - words starting with 'python':")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 9. Complex boolean search - combining multiple operators for sophisticated queries
# Required: "python" AND (either "programming" OR "machine")
# Preferred: "tutorial", Discouraged: "legacy" - advanced ranking control
result = client.query(Article).filter(
boolean_match(Article.content)
.must("python") # Must contain "python"
.must(group().medium("programming", "machine")) # Must contain either term
.encourage("tutorial") # Boost tutorial content
.discourage("legacy") # Lower legacy content ranking
).execute()
print("\nComplex boolean search:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 10. Combined fulltext and SQL filters - mix fulltext search with metadata filtering
# Fulltext search for content + SQL filter for category metadata
result = client.query(Article).filter(
boolean_match(Article.content).must("python") # Fulltext search
).filter(
Article.category == "Programming" # SQL filter
).execute()
print("\nCombined with regular filters:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 11. Limited results - control result presentation
# Return only top 2 results
result = client.query(Article).filter(
boolean_match(Article.content).must("python")
).limit(2).execute()
print("\nLimited results:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# Clean up
client.drop_table(Article)
client.disconnect()
advanced_orm_fulltext_example()Here are comprehensive examples showing all available operators in action:
from matrixone import Client
from matrixone.sqlalchemy_ext.fulltext_search import boolean_match, natural_match, group
from matrixone.config import get_connection_params
def complete_fulltext_examples():
host, port, user, password, database = get_connection_params()
client = Client()
client.connect(host=host, port=port, user=user, password=password, database=database)
# Create table and index
client.create_table("complete_articles", {
"id": "int",
"title": "varchar(200)",
"content": "text",
"tags": "varchar(500)",
"category": "varchar(50)"
}, primary_key="id")
client.fulltext_index.create("complete_articles", name="idx_complete", columns=["title", "content", "tags"], algorithm="BM25")
# Insert test data
articles = [
{"id": 1, "title": "Python Programming Guide", "content": "Learn Python programming from basics to advanced concepts.", "tags": "python,programming,tutorial", "category": "Programming"},
{"id": 2, "title": "Machine Learning with Python", "content": "Introduction to machine learning using Python and scikit-learn.", "tags": "python,machine-learning,AI", "category": "AI"},
{"id": 3, "title": "Web Development Tutorial", "content": "Build modern web applications with Python and Django framework.", "tags": "python,web,django", "category": "Web"},
{"id": 4, "title": "Legacy Python Code", "content": "This is deprecated Python code that should be avoided.", "tags": "python,legacy,deprecated", "category": "Legacy"}
]
client.batch_insert("complete_articles", articles)
# 1. Natural language search with relevance scoring
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
natural_match("title", "content", query="python programming").label("relevance")
).execute()
print("Natural language search with scoring:")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# 1b. Natural language search without ordering (simpler)
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(natural_match("title", "content", query="python programming")).execute()
print("Natural language search (simple):")
for row in result.fetchall():
print(f" {row[1]}")
# 2. Boolean search with must conditions (AND logic)
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").must("python", "programming")).execute()
print("\nBoolean search - must contain 'python' AND 'programming':")
for row in result.fetchall():
print(f" {row[1]}")
# 3. Boolean search with exclusion (NOT logic)
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").must("python").must_not("legacy")).execute()
print("\nBoolean search - must have 'python', must not have 'legacy':")
for row in result.fetchall():
print(f" {row[1]}")
# 4. Boolean search with preference (encourage)
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
boolean_match("title", "content").must("python").encourage("tutorial").label("score")
).execute()
print("\nBoolean search - must have 'python', encourage 'tutorial':")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# 5. Boolean search with discouragement
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
boolean_match("title", "content").must("python").discourage("legacy").label("score")
).execute()
print("\nBoolean search - must have 'python', discourage 'legacy':")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# 6. Group search with OR logic
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").must(group().medium("programming", "machine"))).execute()
print("\nGroup search - must contain either 'programming' OR 'machine':")
for row in result.fetchall():
print(f" {row[1]}")
# 7. Weighted group search
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
boolean_match("title", "content").encourage(group().high("tutorial").low("basic")).label("score")
).execute()
print("\nWeighted group search - prefer 'tutorial' over 'basic':")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# 8. Phrase search
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").phrase("machine learning")).execute()
print("\nPhrase search - exact phrase 'machine learning':")
for row in result.fetchall():
print(f" {row[1]}")
# 9. Prefix search
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").prefix("python")).execute()
print("\nPrefix search - words starting with 'python':")
for row in result.fetchall():
print(f" {row[1]}")
# 10. Complex boolean search combining multiple operators
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
boolean_match("title", "content")
.must("python") # Must contain "python"
.must(group().medium("programming", "machine")) # Must contain either term
.encourage("tutorial") # Boost tutorial content
.discourage("legacy") # Lower legacy content ranking
.label("complex_score")
).execute()
print("\nComplex boolean search:")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# 11. Combined fulltext and SQL filters (single filter with multiple conditions)
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
"complete_articles.category"
).filter(
boolean_match("title", "content").must("python"), # Fulltext search
"complete_articles.category = 'Programming'" # SQL filter
).execute()
print("\nCombined with regular filters (single filter):")
for row in result.fetchall():
print(f" {row[1]} - {row[3]}")
# 11b. Chained filter calls
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
"complete_articles.category"
).filter(boolean_match("title", "content").encourage("programming")).filter("complete_articles.category = 'Programming'").execute()
print("\nCombined with regular filters (chained):")
for row in result.fetchall():
print(f" {row[1]} - {row[3]}")
# 11c. Complex filtering with multiple SQL conditions
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
"complete_articles.category",
"complete_articles.tags"
).filter(
boolean_match("title", "content").must("python"),
"complete_articles.category = 'Programming'",
"complete_articles.id > 1",
"complete_articles.tags LIKE '%tutorial%'"
).execute()
print("\nComplex filtering with multiple conditions:")
for row in result.fetchall():
print(f" {row[1]} - {row[3]} - {row[4]}")
# 11d. Filtering with IN conditions
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
"complete_articles.category"
).filter(
boolean_match("title", "content").encourage("python"),
"complete_articles.category IN ('Programming', 'AI')"
).execute()
print("\nFiltering with IN conditions:")
for row in result.fetchall():
print(f" {row[1]} - {row[3]}")
# 11e. Filtering with range conditions
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(
boolean_match("title", "content").must("python"),
"complete_articles.id BETWEEN 1 AND 3"
).execute()
print("\nFiltering with range conditions:")
for row in result.fetchall():
print(f" {row[1]}")
# 11f. Using SQL IN for multiple value filtering
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
"complete_articles.category"
).filter(
boolean_match("title", "content").encourage("python"),
"complete_articles.category IN ('Programming', 'AI', 'Technology')"
).execute()
print("\nFiltering with SQL IN:")
for row in result.fetchall():
print(f" {row[1]} - {row[3]}")
# 11g. Complex logical conditions with and_/or_
from sqlalchemy import and_, or_
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content",
"complete_articles.category",
"complete_articles.tags"
).filter(
and_(
boolean_match("title", "content").must("python"),
or_(
"complete_articles.category IN ('Programming', 'AI')",
"complete_articles.category = 'Technology'"
),
"complete_articles.id > 1"
)
).execute()
print("\nComplex logical conditions:")
for row in result.fetchall():
print(f" {row[1]} - {row[3]}")
# 11h. Using ORM models with fulltext search
from sqlalchemy import Column, Integer, String, Text
from matrixone.orm import declarative_base
# Define ORM model
Base = declarative_base()
class ArticleModel(Base):
__tablename__ = 'complete_articles'
id = Column(Integer, primary_key=True)
title = Column(String(200))
content = Column(Text)
tags = Column(String(500))
category = Column(String(50))
# Using model with boolean_match
result = client.query(ArticleModel).filter(
boolean_match(ArticleModel.content).must("python")
).execute()
print("\nUsing ORM model with boolean_match:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# Using model with natural_match
result = client.query(ArticleModel).filter(
natural_match(ArticleModel.title, ArticleModel.content, query="machine learning")
).execute()
print("\nUsing ORM model with natural_match:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# Using model with scoring and ordering
result = client.query(
ArticleModel.id,
ArticleModel.title,
ArticleModel.content,
boolean_match(ArticleModel.content).encourage("python").label("score")
).execute()
print("\nUsing ORM model with scoring:")
for row in result.fetchall():
print(f" {row[1]} (Score: {row[3]:.4f})")
# Using model with SQLAlchemy and_
from sqlalchemy import and_
result = client.query(ArticleModel).filter(
and_(
boolean_match(ArticleModel.content).must("python"),
ArticleModel.category.in_(["Programming", "AI"]),
ArticleModel.id > 1
)
).execute()
print("\nUsing ORM model with and_:")
for row in result.fetchall():
print(f" {row[1]} - {row[4]}")
# 12. Limited results
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").must("python")).limit(2).execute()
print("\nLimited results:")
for row in result.fetchall():
print(f" {row[1]}")
# 12b. Simple limited results without ordering
result = client.query(
"complete_articles.id",
"complete_articles.title",
"complete_articles.content"
).filter(boolean_match("title", "content").must("python")).limit(2).execute()
print("\nSimple limited results:")
for row in result.fetchall():
print(f" {row[1]}")
# Clean up
client.drop_table("complete_articles")
client.disconnect()
complete_fulltext_examples()The boolean_match function provides powerful operators for precise fulltext search control:
Core Operators:
- `.must(term)` - Required term (AND logic) - Document must contain this term - Chain multiple .must() calls for AND conditions - Example: .must("python").must("programming") = "python AND programming"
- `.must_not(term)` - Excluded term (NOT logic) - Document must NOT contain this term - Filters out unwanted results - Example: .must("python").must_not("legacy") = "python NOT legacy"
- `.encourage(term)` - Preferred term (positive weight) - Boosts relevance score if term is present - Does not filter results if term is absent - Example: .must("python").encourage("tutorial") = "python, prefer tutorial"
- `.discourage(term)` - Discouraged term (negative weight) - Lowers relevance score if term is present - Does not filter results if term is absent - Example: .must("python").discourage("legacy") = "python, avoid legacy"
Group Operators:
- `.must(group().medium(term1, term2))` - Required group (OR logic) - Document must contain at least one term from the group - Example: .must(group().medium("python", "java")) = "python OR java"
- `.encourage(group().high(term1).low(term2))` - Weighted group - .high() gives higher weight, .low() gives lower weight - Example: .encourage(group().high("tutorial").low("basic")) = "prefer tutorial over basic"
Special Operators:
- `.phrase("exact phrase")` - Exact phrase matching - Finds documents containing the exact phrase - Example: .phrase("machine learning") = exact phrase match
- `.prefix("prefix")` - Prefix/wildcard matching - Finds words starting with the prefix - Example: .prefix("python") = matches "python", "pythonic", "pythonista"
Usage Patterns:
# Basic required search
boolean_match(Article.content).must("python")
# Multiple requirements (AND)
boolean_match(Article.content).must("python").must("programming")
# Required with exclusion (AND NOT)
boolean_match(Article.content).must("python").must_not("legacy")
# Required with preference (AND, prefer X)
boolean_match(Article.content).must("python").encourage("tutorial")
# Required with discouragement (AND, avoid X)
boolean_match(Article.content).must("python").discourage("deprecated")
# Group requirements (AND (OR))
boolean_match(Article.content).must(group().medium("python", "java"))
# Complex combination
boolean_match(Article.content)
.must("programming")
.must(group().medium("python", "java"))
.encourage("tutorial")
.discourage("legacy")
.phrase("best practices")from matrixone import Client
from matrixone.exceptions import QueryError, ConnectionError
from matrixone.config import get_connection_params
def error_handling_example():
client = None
try:
host, port, user, password, database = get_connection_params()
# Create client with error handling
client = Client()
client.connect(host=host, port=port, user=user, password=password, database=database)
# Create table with error handling
try:
client.create_table("error_articles", {
"id": "int",
"content": "text"
}, primary_key="id")
print("✓ Table created successfully")
except QueryError as e:
print(f"❌ Table creation failed: {e}")
# Create fulltext index with error handling
try:
client.fulltext_index.create("error_articles", name="idx_content", columns="content", algorithm="BM25")
print("✓ Fulltext index created successfully")
except QueryError as e:
print(f"❌ Fulltext index creation failed: {e}")
# Insert data with error handling
try:
client.insert("error_articles", {"id": 1, "content": "Test content for fulltext search"})
print("✓ Data inserted successfully")
except QueryError as e:
print(f"❌ Data insertion failed: {e}")
# Fulltext search with error handling
try:
result = client.query(
"error_articles.id",
"error_articles.content"
).filter(natural_match("content", query="test content")).execute()
print(f"✓ Fulltext search successful: {len(result.fetchall())} results")
except QueryError as e:
print(f"❌ Fulltext search failed: {e}")
except ConnectionError as e:
print(f"❌ Connection failed: {e}")
except Exception as e:
print(f"❌ Unexpected error: {e}")
finally:
# Always clean up
if client:
try:
client.drop_table("error_articles")
client.disconnect()
print("✓ Cleanup completed")
except Exception as e:
print(f"⚠️ Cleanup warning: {e}")
error_handling_example()- Choose the right algorithm: - BM25: Best for general fulltext search, handles modern document collections well - TF-IDF: Good for specific use cases, traditional approach with proven reliability - Recommendation: Start with BM25 for new applications
- Optimize index creation: - Create indexes after data insertion: Avoid rebuilding indexes during data loading - Use appropriate column types: TEXT for large content, VARCHAR for shorter text - Match index columns exactly: Columns in MATCH() must exactly match fulltext index definition - Consider multi-column indexes: Index related text columns together for better performance
- Use appropriate search modes: - Natural language mode: Best for user-facing search interfaces, handles variations automatically - Boolean mode: Best for programmatic queries, provides precise control over search terms - ORM boolean_match: Use for type-safe, chainable queries with modern syntax
- Pagination: - With scoring: Use .label("score") to get relevance scores for ranking results - Without scoring: Skip scoring for simple searches where ranking isn't important - Pagination: Use .limit() and .offset() for pagination (ordering is optional) - Performance: Scoring may be slower than simple searches, but provides better relevance
- Optimize search queries: - Use encourage() over must(): When terms are preferred but not required - Use discourage() for ranking: Lower unwanted content without filtering it out - Combine with SQL filters: Mix fulltext search with metadata filtering for better results - Use phrases for exact matches: Wrap exact phrases in quotes or use .phrase()
- Filter combination strategies: - Single filter() with multiple conditions: More efficient for simple combinations - Chained filter() calls: Better readability for complex queries - Use appropriate operators: IN, BETWEEN, LIKE, IS NULL for different filtering needs - Combine with scoring: Use .label() for ranked results - Performance consideration: More filters = more precise results but potentially slower queries
- SQLAlchemy logical operators for complex conditions: - and_(): Combine multiple conditions with AND logic (from sqlalchemy) - or_(): Combine multiple conditions with OR logic (from sqlalchemy) - not_(): Negate conditions (from sqlalchemy) - SQL IN: Filter by multiple values using SQL IN clause - Nested combinations: Use SQLAlchemy operators for complex nested conditions - Fulltext + logical operators: Combine fulltext search with SQLAlchemy expressions - Performance: Logical operators provide more control but may impact query performance
- Handle errors gracefully: - Always use try-catch blocks: Fulltext operations can fail due to index issues - Provide meaningful error messages: Help users understand what went wrong - Clean up resources properly: Always disconnect clients and close sessions - Validate query syntax: Check boolean operators before executing complex queries
- Performance optimization: - Use batch operations: Insert large datasets with batch_insert() instead of individual inserts - Create indexes strategically: Only index columns that will be searched - Limit result sets: Use LIMIT and OFFSET for pagination with large result sets - Monitor index usage: Regularly check which indexes are being used effectively
- Read the :doc:`api/fulltext_index` for detailed fulltext index API
- Check out the :doc:`api/fulltext_search` for fulltext search API
- Explore :doc:`vector_guide` for vector search capabilities
- Learn about :doc:`orm_guide` for ORM patterns with fulltext search
- Check out the :doc:`examples` for comprehensive usage examples