docs/guide/crawl-openai-custom.md: 23 additions & 2 deletions
@@ -1,12 +1,33 @@
 # User defined AI functions
 
-In order to meet the personalized needs of different users, x-crawl also provides user-customized AI functions. Providing openai instances means you can tailor and optimize the AI to your needs to better suit your crawling efforts.
+In order to meet the personalized needs of different users, x-crawl also provides user-customized AI functions. Providing AI instances means you can tailor and optimize the AI to your needs to better suit your crawling efforts.
+
+## Ollama
+
+Use the custom() method of the AI application instance.
+
+Example:
+
+```js{8}
+import { createCrawlOllama } from 'x-crawl'
+
+const crawlOllamaApp = createCrawlOllama({
+  model: 'Your model',
+  clientOptions: { ... }
+})
+
+const Ollama = crawlOllamaApp.custom()
+```
+
+The Ollama instance obtained by calling custom() is roughly the same as the one you would create with new Ollama() in the ollama-js custom-client example (https://github.com/ollama/ollama-js?tab=readme-ov-file#custom-client). The difference is that x-crawl passes the clientOptions supplied when the AI application instance was created straight through to new Ollama. What you get back is the intact Ollama instance; x-crawl does not rewrite it.
+
+## Openai
 
 Use the [custom()](/api/custom#custom) method of the AI application instance.
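A note on the hunk above: because custom() hands back the untouched ollama-js client and clientOptions is forwarded to new Ollama, the returned instance can be used exactly as you would use ollama-js directly. The sketch below is illustrative only; the host and model values are placeholders, and the chat() call comes from the linked ollama-js README, not from x-crawl itself.

```js
import { createCrawlOllama } from 'x-crawl'

// clientOptions is forwarded verbatim to new Ollama(clientOptions),
// so it takes the same shape as the ollama-js custom-client options.
const crawlOllamaApp = createCrawlOllama({
  model: 'llama3', // placeholder model name
  clientOptions: { host: 'http://127.0.0.1:11434' } // placeholder host
})

// custom() returns the intact ollama-js instance, so its own API is available directly.
const ollama = crawlOllamaApp.custom()

ollama
  .chat({
    model: 'llama3',
    messages: [{ role: 'user', content: 'Why is the sky blue?' }]
  })
  .then((res) => {
    console.log(res.message.content)
  })
```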
docs/guide/crawl-openai-help.md: 44 additions & 1 deletion
@@ -2,11 +2,54 @@
 
 It can provide you with intelligent answers and suggestions. Whether it is about crawling strategies, anti-crawling techniques, or data processing, you can ask the AI questions, and it will provide professional answers and suggestions based on its powerful learning and reasoning capabilities to help you complete your crawling tasks better.
 
+## Ollama
+
+Use the help() method of the AI application instance.
+
+Example:
+
+```js{8,17}
+import { createXCrawlOllama } from 'x-crawl'
+
+const xCrawlOllamaApp = createXCrawlOllama({
+  model: 'Your model',
+  clientOptions: { ... }
+})
+
+xCrawlOllamaApp.help('What is x-crawl').then((res) => {
+  console.log(res)
+  /*
+    res:
+    x-crawl is a flexible Node.js AI-assisted web crawling library. It offers powerful AI-assisted features that make web crawling more efficient, intelligent, and convenient. You can find more information and the source code on x-crawl's GitHub page: https://github.com/coder-hxl/x-crawl.
+  */
+})
+
+xCrawlOllamaApp
+  .help('Three major things to note about crawlers')
+  .then((res) => {
+    console.log(res)
+    /*
+      res:
+      There are several important aspects to consider when working with crawlers:
+
+      1. **Robots.txt:** It's important to respect the rules set in a website's robots.txt file. This file specifies which parts of a website can be crawled by search engines and other bots. Not following these rules can lead to your crawler being blocked or even legal issues.
+
+      2. **Crawl Delay:** It's a good practice to implement a crawl delay between your requests to a website. This helps to reduce the load on the server and also shows respect for the server resources.
+
+      3. **User-Agent:** Always set a descriptive User-Agent header for your crawler. This helps websites identify your crawler and allows them to contact you if there are any issues. Using a generic or misleading User-Agent can also lead to your crawler being blocked.
+
+      By keeping these points in mind, you can ensure that your crawler operates efficiently and ethically.
+    */
+  })
+```
+
+## Openai
+
 Use the [help()](/api/help#help) method of the AI application instance.
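For comparison, the Openai section that this hunk leaves in place follows the same pattern via the linked [help()](/api/help#help) API. A minimal sketch is shown below; the factory name createCrawlOpenAI and the clientOptions/apiKey option names are assumptions mirroring the Ollama example above, so check them against the linked API page before relying on them.

```js
import { createCrawlOpenAI } from 'x-crawl'

// Assumed option names, mirroring the Ollama example above;
// clientOptions would be forwarded to the OpenAI SDK client.
const crawlOpenAIApp = createCrawlOpenAI({
  clientOptions: { apiKey: 'Your API Key' } // placeholder key
})

crawlOpenAIApp.help('How long should a polite crawl delay be?').then((res) => {
  console.log(res)
})
```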