Commit 04dbec1

[8.1] Change 'Connecting' section to make connecting to SOBD clusters easier
Co-authored-by: Seth Michael Larson <[email protected]>
1 parent 2265b4f commit 04dbec1

1 file changed

+210
-54
lines changed

docs/guide/connecting.asciidoc

Lines changed: 210 additions & 54 deletions
@@ -3,58 +3,199 @@
33

44
This page contains the information you need to connect the Client with {es}.
55

6+
[discrete]
7+
[[connect-ec]]
8+
=== Connecting to Elastic Cloud
9+
10+
https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html[Elastic Cloud] is the easiest way to get started with Elasticsearch. When connecting to Elastic Cloud with the Python Elasticsearch client, always use the `cloud_id` parameter. You can find this value on the "Manage Deployment" page after you've created a cluster (look in the top-left if you're in Kibana).
11+
12+
We recommend using a Cloud ID whenever possible because it automatically configures your client for optimal use with Elastic Cloud, including HTTPS and HTTP compression.
13+
14+
[source,python]
15+
----
16+
from elasticsearch import Elasticsearch
17+
18+
# Password for the 'elastic' user generated by Elasticsearch
19+
ELASTIC_PASSWORD = "<password>"
20+
21+
# Found in the 'Manage Deployment' page
22+
CLOUD_ID = "deployment-name:dXMtZWFzdDQuZ2Nw..."
23+
24+
# Create the client instance
25+
client = Elasticsearch(
26+
cloud_id=CLOUD_ID,
27+
basic_auth=("elastic", ELASTIC_PASSWORD)
28+
)
29+
30+
# Successful response!
31+
client.info()
32+
# {'name': 'instance-0000000000', 'cluster_name': ...}
33+
----
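To make the Cloud ID value less opaque, here is a minimal sketch that decodes one using only the standard library. It assumes the commonly described layout `<name>:<base64 of "host$es_uuid$kibana_uuid">`. The deployment name, host, and UUIDs are made up for illustration, and `decode_cloud_id` is a hypothetical helper, not part of the client.

```python
import base64


def decode_cloud_id(cloud_id: str) -> dict:
    """Split a Cloud ID into its name, host, and Elasticsearch URL (illustrative only)."""
    name, _, encoded = cloud_id.partition(":")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Assumed layout: host$elasticsearch_uuid$kibana_uuid
    host, es_uuid, *_rest = decoded.split("$")
    return {"name": name, "host": host, "es_url": f"https://{es_uuid}.{host}"}


# Build a made-up Cloud ID so the example is self-contained.
payload = base64.b64encode(b"us-east4.gcp.example.com$abc123$def456").decode("ascii")
example_cloud_id = f"deployment-name:{payload}"

info = decode_cloud_id(example_cloud_id)
print(info["es_url"])
```

In practice you would pass the Cloud ID straight to `cloud_id=` and let the client do this work; the sketch only shows why a single string is enough to locate the deployment.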
634

735
[discrete]
8-
[[connect-url]]
9-
==== Connecting with URLs
36+
[[connect-self-managed-new]]
37+
=== Connecting to a self-managed cluster
38+
39+
By default, Elasticsearch starts with security features such as authentication and TLS enabled. To make requests successfully, configure the Python Elasticsearch client to use HTTPS with the generated CA certificate.
40+
41+
If you're just getting started with Elasticsearch we recommend reading the documentation on https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html[configuring] and https://www.elastic.co/guide/en/elasticsearch/reference/current/starting-elasticsearch.html[starting Elasticsearch] to ensure your cluster is running as expected.
1042

11-
A single node can be specified via a `scheme`, `host`, `port`, and optional `path_prefix`. These values can either be specified manually via a URL in a string, dictionary, `NodeConfig`, or a list of these values. You must specify at least `scheme`, `host` and `port` for each node. All of the following are valid configurations:
43+
When you start Elasticsearch for the first time you'll see a distinct block like the one below in the output from Elasticsearch (you may have to scroll up if it's been a while):
44+
45+
[source,sh]
46+
----
47+
\----------------------------------------------------------------------
48+
-> Elasticsearch security features have been automatically configured!
49+
-> Authentication is enabled and cluster connections are encrypted.
50+
51+
-> Password for the elastic user (reset with `bin/elasticsearch-reset-password -u elastic`):
52+
lhQpLELkjkrawaBoaz0Q
53+
54+
-> HTTP CA certificate SHA-256 fingerprint:
55+
a52dd93511e8c6045e21f16654b77c9ee0f34aea26d9f40320b531c474676228
56+
...
57+
\----------------------------------------------------------------------
58+
----
59+
60+
Note down the `elastic` user password and HTTP CA fingerprint for the next sections. In the examples below they will be stored in the variables `ELASTIC_PASSWORD` and `CERT_FINGERPRINT` respectively.
61+
62+
There are two options for verifying the HTTPS connection, depending on your circumstances: verifying with the CA certificate itself, or verifying with the HTTP CA certificate fingerprint.
63+
64+
[discrete]
65+
==== Verifying HTTPS with CA certificates
66+
67+
Using the `ca_certs` option is the default way the Python Elasticsearch client verifies an HTTPS connection.
68+
69+
The generated root CA certificate can be found in the `certs` directory in your Elasticsearch config location (`$ES_CONF_PATH/certs/http_ca.crt`). If you're running Elasticsearch in Docker there is https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html[additional documentation for retrieving the CA certificate].
70+
71+
Once you have the `http_ca.crt` file somewhere accessible, pass its path to the client via `ca_certs`:
1272

1373
[source,python]
14-
----------------------------
74+
----
1575
from elasticsearch import Elasticsearch
1676
17-
# Single node via URL
18-
es = Elasticsearch("http://localhost:9200")
77+
# Password for the 'elastic' user generated by Elasticsearch
78+
ELASTIC_PASSWORD = "<password>"
1979
20-
# Multiple nodes via URL
21-
es = Elasticsearch([
22-
"http://localhost:9200",
23-
"http://localhost:9201",
24-
"http://localhost:9202"
25-
])
80+
# Create the client instance
81+
client = Elasticsearch(
82+
"https://localhost:9200",
83+
ca_certs="/path/to/http_ca.crt",
84+
basic_auth=("elastic", ELASTIC_PASSWORD)
85+
)
2686
27-
# Single node via dictionary
28-
es = Elasticsearch({"scheme": "http", "host": "localhost", "port": 9200})
87+
# Successful response!
88+
client.info()
89+
# {'name': 'instance-0000000000', 'cluster_name': ...}
90+
----
2991

30-
# Multiple nodes via dictionary
31-
es = Elasticsearch([
32-
{"scheme": "http", "host": "localhost", "port": 9200},
33-
{"scheme": "http", "host": "localhost", "port": 9201},
34-
])
35-
----------------------------
92+
NOTE: If you don't specify `ca_certs` or `ssl_assert_fingerprint` then the https://certifiio.readthedocs.io[certifi package] will be used for `ca_certs` by default if available.
3693

3794
[discrete]
38-
[[connect-ec]]
39-
==== Connecting to Elastic Cloud
95+
==== Verifying HTTPS with certificate fingerprints (Python 3.10 or later)
4096

41-
Cloud ID is an easy way to configure your client to work with your Elastic Cloud
42-
deployment. Combine the `cloud_id` with either `basic_auth` or `api_key` to
43-
authenticate with your Elastic Cloud deployment.
97+
NOTE: This method **requires Python 3.10 or later**. It isn't available with the `aiohttp` HTTP client library, so it can't be used with `AsyncElasticsearch`.
4498

45-
Using `cloud_id` enables TLS verification and HTTP compression by default and
46-
sets the port to 443 unless otherwise overwritten via the port parameter or the
47-
port value encoded within `cloud_id`. Using Cloud ID also disables sniffing as
48-
a proxy is in use.
99+
This method of verifying the HTTPS connection takes advantage of the certificate fingerprint value noted down earlier. Take this SHA256 fingerprint value and pass it to the Python Elasticsearch client via `ssl_assert_fingerprint`:
49100

50101
[source,python]
51-
----------------------------
102+
----
52103
from elasticsearch import Elasticsearch
53104
54-
es = Elasticsearch(
55-
cloud_id="cluster-1:dXMa5Fx..."
105+
# Fingerprint from either the Elasticsearch startup output or the openssl commands below.
106+
# Colons and uppercase/lowercase don't matter when using
107+
# the 'ssl_assert_fingerprint' parameter
108+
CERT_FINGERPRINT = "A5:2D:D9:35:11:E8:C6:04:5E:21:F1:66:54:B7:7C:9E:E0:F3:4A:EA:26:D9:F4:03:20:B5:31:C4:74:67:62:28"
109+
110+
# Password for the 'elastic' user generated by Elasticsearch
111+
ELASTIC_PASSWORD = "<password>"
112+
113+
client = Elasticsearch(
114+
"https://localhost:9200",
115+
ssl_assert_fingerprint=CERT_FINGERPRINT,
116+
basic_auth=("elastic", ELASTIC_PASSWORD)
117+
)
118+
119+
# Successful response!
120+
client.info()
121+
# {'name': 'instance-0000000000', 'cluster_name': ...}
122+
----
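The comment in the example says that colons and letter case don't matter for `ssl_assert_fingerprint`. As a rough illustration of why the two spellings are interchangeable, this sketch normalizes the fingerprint printed at startup and the colon-separated form and compares them. `normalize_fingerprint` is a hypothetical helper written for this example, not a client function.

```python
def normalize_fingerprint(fingerprint: str) -> str:
    """Drop colon separators and lowercase the hex digits."""
    return fingerprint.replace(":", "").lower()


# Value as printed in the Elasticsearch startup output
startup_value = "a52dd93511e8c6045e21f16654b77c9ee0f34aea26d9f40320b531c474676228"

# Same value in colon-separated uppercase form
colon_value = (
    "A5:2D:D9:35:11:E8:C6:04:5E:21:F1:66:54:B7:7C:9E:"
    "E0:F3:4A:EA:26:D9:F4:03:20:B5:31:C4:74:67:62:28"
)

same = normalize_fingerprint(startup_value) == normalize_fingerprint(colon_value)
print(same)
```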
123+
124+
The certificate fingerprint can be calculated using `openssl x509` with the certificate file:
125+
126+
[source,sh]
127+
----
128+
openssl x509 -fingerprint -sha256 -noout -in /path/to/http_ca.crt
129+
----
130+
131+
If you don't have access to the generated CA file from Elasticsearch, you can use the following script to output the root CA fingerprint of the Elasticsearch instance with `openssl s_client`:
132+
133+
[source,sh]
134+
----
135+
# Replace the values of 'localhost' and '9200' with the
136+
# corresponding host and port values for the cluster.
137+
openssl s_client -connect localhost:9200 -servername localhost -showcerts </dev/null 2>/dev/null \
138+
| openssl x509 -fingerprint -sha256 -noout -in /dev/stdin
139+
----
140+
141+
The output of `openssl x509` will look something like this:
142+
143+
[source,sh]
144+
----
145+
SHA256 Fingerprint=A5:2D:D9:35:11:E8:C6:04:5E:21:F1:66:54:B7:7C:9E:E0:F3:4A:EA:26:D9:F4:03:20:B5:31:C4:74:67:62:28
146+
----
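If `openssl` isn't at hand, the same SHA-256 fingerprint can be computed in Python with the standard library. This is a minimal sketch, assuming the file contains exactly one PEM certificate and begins with the `-----BEGIN CERTIFICATE-----` line. `pem_sha256_fingerprint` is a hypothetical helper name, and the path in the usage comment is a placeholder.

```python
import hashlib
import ssl


def pem_sha256_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 fingerprint of one PEM certificate as colon-separated uppercase hex."""
    # The fingerprint is the hash of the DER-encoded certificate bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Usage (placeholder path):
# with open("/path/to/http_ca.crt") as f:
#     print(pem_sha256_fingerprint(f.read()))
```

The returned string uses the same colon-separated layout as the `openssl x509` output, so it can be passed to `ssl_assert_fingerprint` as-is.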
147+
148+
149+
[discrete]
150+
[[connect-no-security]]
151+
=== Connecting without security enabled
152+
153+
WARNING: Running Elasticsearch without security enabled is not recommended.
154+
155+
If your cluster is configured with https://www.elastic.co/guide/en/elasticsearch/reference/current/security-settings.html[security explicitly disabled] then you can connect via HTTP:
156+
157+
[source,python]
158+
----
159+
from elasticsearch import Elasticsearch
160+
161+
# Create the client instance
162+
client = Elasticsearch("http://localhost:9200")
163+
164+
# Successful response!
165+
client.info()
166+
# {'name': 'instance-0000000000', 'cluster_name': ...}
167+
----
168+
169+
[discrete]
170+
[[connect-url]]
171+
=== Connecting to multiple nodes
172+
173+
The Python Elasticsearch client supports sending API requests to multiple nodes in the cluster. This spreads work more evenly across the cluster instead of hammering the same node over and over with requests. To configure the client with multiple nodes, pass a list of URLs; each URL is used as a separate node in the pool.
174+
175+
[source,python]
176+
----
177+
from elasticsearch import Elasticsearch
178+
179+
# List of nodes to use, with different hosts and ports.
180+
NODES = [
181+
"https://localhost:9200",
182+
"https://localhost:9201",
183+
"https://localhost:9202",
184+
]
185+
186+
# Password for the 'elastic' user generated by Elasticsearch
187+
ELASTIC_PASSWORD = "<password>"
188+
189+
client = Elasticsearch(
190+
NODES,
191+
ca_certs="/path/to/http_ca.crt",
192+
basic_auth=("elastic", ELASTIC_PASSWORD)
56193
)
57-
----------------------------
194+
----
195+
196+
By default, nodes are selected using round-robin, but alternate node selection strategies can be configured with the `node_selector_class` parameter.
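To picture what round-robin selection means for the three nodes above, here is a small sketch built on `itertools.cycle`. It only illustrates the ordering of requests and is not the client's own selector implementation.

```python
from itertools import cycle

NODES = [
    "https://localhost:9200",
    "https://localhost:9201",
    "https://localhost:9202",
]

# Each call to next() hands out the following node, wrapping around at the end.
selector = cycle(NODES)
picked = [next(selector) for _ in range(4)]

for node in picked:
    print(node)
```

The fourth request goes back to the first node, which is how the load ends up spread evenly across the list.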
197+
198+
NOTE: If your Elasticsearch cluster is behind a load balancer, as it is when using Elastic Cloud, you won't need to configure multiple nodes. Instead, use the load balancer host and port.
58199

59200

60201
[discrete]
@@ -66,12 +207,13 @@ providers. All authentication methods are supported on the client constructor
66207
or via the per-request `.options()` method:
67208

68209
[source,python]
69-
----------------------------
210+
----
70211
from elasticsearch import Elasticsearch
71212
72213
# Authenticate from the constructor
73214
es = Elasticsearch(
74-
"http://localhost:9200",
215+
"https://localhost:9200",
216+
ca_certs="/path/to/http_ca.crt",
75217
basic_auth=("username", "password")
76218
)
77219
@@ -90,7 +232,7 @@ for i in range(10):
90232
index="example-index",
91233
document={"field": i}
92234
)
93-
----------------------------
235+
----
94236

95237

96238
[discrete]
@@ -101,14 +243,16 @@ HTTP Basic authentication uses the `basic_auth` parameter by passing in a userna
101243
password within a tuple:
102244

103245
[source,python]
104-
----------------------------
246+
----
105247
from elasticsearch import Elasticsearch
106248
107249
# Adds the HTTP header 'Authorization: Basic <base64 username:password>'
108250
es = Elasticsearch(
251+
"https://localhost:9200",
252+
ca_certs="/path/to/http_ca.crt",
109253
basic_auth=("username", "password")
110254
)
111-
----------------------------
255+
----
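The comment in the example mentions the `Authorization: Basic <base64 username:password>` header. As a sketch of what that value looks like, the snippet below builds it with the standard library. `basic_auth_header` is a hypothetical helper for illustration; the client constructs the header for you when you pass `basic_auth`.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("username", "password")
print(header)
```

Note that base64 is an encoding, not encryption, which is one reason the examples on this page connect over HTTPS.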
112256

113257

114258
[discrete]
@@ -121,14 +265,15 @@ https://www.elastic.co/guide/en/elasticsearch/reference/master/security-api-crea
121265
and https://www.elastic.co/guide/en/elasticsearch/reference/master/security-api-get-token.html[Bearer Tokens].
122266

123267
[source,python]
124-
----------------------------
268+
----
125269
from elasticsearch import Elasticsearch
126270
127271
# Adds the HTTP header 'Authorization: Bearer token-value'
128272
es = Elasticsearch(
273+
"https://localhost:9200",
129274
bearer_auth="token-value"
130275
)
131-
----------------------------
276+
----
132277

133278

134279
[discrete]
@@ -140,14 +285,16 @@ cluster. Note that you need the values of `id` and `api_key` to
140285
https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-create-api-key.html[authenticate via an API Key].
141286

142287
[source,python]
143-
----------------------------
288+
----
144289
from elasticsearch import Elasticsearch
145290
146291
# Adds the HTTP header 'Authorization: ApiKey <base64 api_key.id:api_key.api_key>'
147292
es = Elasticsearch(
293+
"https://localhost:9200",
294+
ca_certs="/path/to/http_ca.crt",
148295
api_key=("api_key.id", "api_key.api_key")
149296
)
150-
----------------------------
297+
----
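The example's comment describes the header as `Authorization: ApiKey <base64 api_key.id:api_key.api_key>`. The sketch below shows that encoding for an `(id, api_key)` tuple. Passing an already-encoded string through unchanged is an assumption made for this illustration, and `api_key_header` is a hypothetical helper, not a client function.

```python
import base64
from typing import Tuple, Union


def api_key_header(api_key: Union[str, Tuple[str, str]]) -> str:
    """Build the value of an 'Authorization: ApiKey ...' header."""
    if isinstance(api_key, tuple):
        key_id, secret = api_key
        # Join id and secret with a colon, then base64-encode the pair.
        api_key = base64.b64encode(f"{key_id}:{secret}".encode("utf-8")).decode("ascii")
    # Assumption: a plain string is treated as already base64-encoded.
    return f"ApiKey {api_key}"


from_tuple = api_key_header(("api_key.id", "api_key.api_key"))
encoded = base64.b64encode(b"api_key.id:api_key.api_key").decode("ascii")
from_string = api_key_header(encoded)
print(from_tuple == from_string)
```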
151298

152299
[discrete]
153300
[[compatibility-mode]]
@@ -181,52 +328,61 @@ IMPORTANT: The async client shouldn't be used within Function-as-a-Service as a
181328
==== GCP Cloud Functions
182329

183330
[source,python]
184-
----------------------------
331+
----
185332
from elasticsearch import Elasticsearch
186333
334+
# Client initialization
187335
client = Elasticsearch(
188-
... # Client initialization
336+
cloud_id="deployment-name:ABCD...",
337+
api_key=...
189338
)
190339
191340
def main(request):
192-
... # Use the client
341+
# Use the client
342+
    client.search(index=..., query={"match_all": {}})
193343
194-
----------------------------
344+
----
195345

196346
[discrete]
197347
[[connecting-faas-aws]]
198348
==== AWS Lambda
199349

200350
[source,python]
201-
----------------------------
351+
----
202352
from elasticsearch import Elasticsearch
203353
354+
# Client initialization
204355
client = Elasticsearch(
205-
... # Client initialization
356+
cloud_id="deployment-name:ABCD...",
357+
api_key=...
206358
)
207359
208360
def main(event, context):
209-
... # Use the client
361+
# Use the client
362+
    client.search(index=..., query={"match_all": {}})
210363
211-
----------------------------
364+
----
212365

213366
[discrete]
214367
[[connecting-faas-azure]]
215368
==== Azure Functions
216369

217370
[source,python]
218-
----------------------------
371+
----
219372
import azure.functions as func
220373
from elasticsearch import Elasticsearch
221374
375+
# Client initialization
222376
client = Elasticsearch(
223-
... # Client initialization
377+
cloud_id="deployment-name:ABCD...",
378+
api_key=...
224379
)
225380
226381
def main(request: func.HttpRequest) -> func.HttpResponse:
227-
... # Use the client
382+
# Use the client
383+
    client.search(index=..., query={"match_all": {}})
228384
229-
----------------------------
385+
----
230386

231387
Resources used to assess these recommendations:
232388
