Beginner's Guide to ElasticSearch
Introduction
ElasticSearch is a search server based on Apache Lucene. It provides a distributed full-text search engine that is accessible through a RESTful interface.
ElasticSearch is schema-less, and uses JSON instead of XML. It is open-source and built in Java, which means you can run ElasticSearch on any platform, as Java is platform independent.
ElasticSearch is a document-based store. It is an alternative to traditional document stores, so it can be used to replace other document stores like MongoDB or RavenDB.
Fast and Scalable
ElasticSearch is very fast at searching and is highly scalable. If your current document store is not giving you the read performance you need, or is not scaling as well as you would like, ElasticSearch is worth considering.
Terminology
For all those coming from traditional MySQL databases, here is a table comparing ElasticSearch terminology with traditional relational database terminology:
| MySQL (RDBMS) Terminology | ElasticSearch Terminology |
|---|---|
| Database | Index |
| Table | Type |
| Row | Document |
How to Set Up ElasticSearch
To get started, download the ElasticSearch archive and unzip it into the folder where you want to install ElasticSearch.
To run it, open a command window, go to the bin folder inside that installation folder, and type elasticsearch. Make sure you have the JAVA_HOME environment variable defined.
Interacting with ElasticSearch
To check whether ElasticSearch has been correctly installed and started locally, use the following URL in your browser:
http://localhost:9200/
It should show you an output like:
{
"name" : "Domo",
"cluster_name" : "elasticsearch_root",
"version" : {
"number" : "2.2.0",
"build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
"build_timestamp" : "2016-01-27T13:32:39Z",
"build_snapshot" : false,
"lucene_version" : "5.4.1"
},
"tagline" : "You Know, for Search"
}
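The same check can be scripted. The sketch below uses only the Python standard library and assumes a node is listening on localhost:9200. The helper names summarize_root_info and fetch_root_info are illustrative and are not part of any ElasticSearch client.

```python
import json
from urllib.request import urlopen


def summarize_root_info(body):
    """Pick the node name, cluster name and version numbers out of the root response."""
    info = json.loads(body)
    version = info.get("version", {})
    return {
        "name": info.get("name"),
        "cluster_name": info.get("cluster_name"),
        "version": version.get("number"),
        "lucene_version": version.get("lucene_version"),
    }


def fetch_root_info(base_url="http://localhost:9200/"):
    """Call the root endpoint of a running node (needs a live server)."""
    with urlopen(base_url, timeout=5) as response:
        return summarize_root_info(response.read().decode("utf-8"))
```

Calling fetch_root_info() against a running node returns a small dictionary; if the call raises a connection error, the server is most likely not started.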
Once ElasticSearch has started, you can use any REST API client, such as Postman or Fiddler.
RESTful APIs are used to interact with ElasticSearch. The generic pattern used to make a RESTful call is shown below:
REST API Format : http://host:port/[index]/[type]/[_action/id]
HTTP Methods used: GET, POST, PUT, DELETE
- To get a list of all available indices in your ElasticSearch, use the following URL:
http://localhost:9200/_cat/indices
- To get the details of an index (say, company), such as its settings and mappings, use the following URL:
http://localhost:9200/company?pretty
- To search a type within an index, use a URL such as:
http://localhost:9200/company/employee/_search
The first part (localhost) denotes the host (server) where your ElasticSearch is running, and 9200 is the default port. The second part (company) is the index, followed by the type name (employee), followed by the action (_search).
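As a rough illustration of the URL pattern described above, the following Python sketch assembles a URL from its parts. The function name build_url and its parameter names are made up for this example.

```python
def build_url(host="localhost", port=9200, index=None, doc_type=None, action_or_id=None):
    """Assemble http://host:port/[index]/[type]/[_action/id], leaving out absent parts."""
    segments = [part for part in (index, doc_type, action_or_id) if part]
    return "http://{}:{}/{}".format(host, port, "/".join(segments))


# Search URL for the employee type in the company index.
print(build_url(index="company", doc_type="employee", action_or_id="_search"))
# http://localhost:9200/company/employee/_search
```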
ElasticSearch lets you use the HTTP methods GET, POST, PUT and DELETE, along with a payload in JSON format.
In this tutorial, I assume you are using a REST API client such as Postman or Fiddler to run the RESTful calls shown in the following sections.
Let's take a look at how to create an index, insert data into it and then retrieve data from ElasticSearch.
Creating an Index
http://localhost:9200/company
PUT
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "analysis": {
      "analyzer": {
        "analyzer-name": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": "lowercase"
        }
      }
    }
  },
  "mappings": {
    "employee": {
      "properties": {
        "age": {
          "type": "long"
        },
        "experience": {
          "type": "long"
        },
        "name": {
          "type": "string",
          "analyzer": "analyzer-name"
        }
      }
    }
  }
}
Once you run the above command, this is the response received:
{
  "acknowledged": true
}
The above command creates an index named company with a type named employee, which has the fields age, experience and name.
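For readers who prefer a script to a REST client, here is a minimal Python sketch of the same request using only the standard library. It keeps settings and mappings as separate top-level keys of the request body. The constant INDEX_BODY and the function create_index_request are names chosen for this example, and the URL assumes a local node on the default port.

```python
import json
from urllib.request import Request

# Request body: "settings" and "mappings" are sibling top-level keys.
INDEX_BODY = {
    "settings": {
        "index": {"number_of_shards": 1, "number_of_replicas": 1},
        "analysis": {
            "analyzer": {
                "analyzer-name": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": "lowercase",
                }
            }
        },
    },
    "mappings": {
        "employee": {
            "properties": {
                "age": {"type": "long"},
                "experience": {"type": "long"},
                "name": {"type": "string", "analyzer": "analyzer-name"},
            }
        }
    },
}


def create_index_request(index="company", base_url="http://localhost:9200"):
    """Build (but do not send) the PUT request that creates the index."""
    return Request(
        "{}/{}".format(base_url, index),
        data=json.dumps(INDEX_BODY).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to urllib.request.urlopen sends it to a running server.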
What Are Analyzers
ElasticSearch is a text-based search engine built on Apache Lucene. Before text is indexed, it is split into terms and those terms are normalized. This process is called analysis, and it is performed by analyzers.
The Analysis process involves:
- Splitting the text into tokens
- Standardizing these tokens so they become searchable.
Analysis consists of three steps:
- Character filtering
- Tokenization
- Token filtering
Character filters are applied to the input string to add, remove or change characters before tokenization (for example, stripping HTML markup). Tokenizers split the string into a stream of tokens. The resulting tokens are then passed through token filters, which transform them as required; for example, a token filter can change the tokens to uppercase.
In the index created above, we added the following analyzer:
"analysis": {
  "analyzer": {
    "analyzer-name": {
      "type": "custom",
      "tokenizer": "keyword",
      "filter": "lowercase"
    }
  }
}
Here we created a custom analyzer named "analyzer-name", with the following components:
- "type": "custom"
  An analyzer of type custom allows you to combine a tokenizer with zero or more token filters and zero or more character filters. The analyzer defined above uses one tokenizer and one token filter, and no character filter.
- "tokenizer": "keyword"
  This tokenizer emits the entire input as a single token.
- "filter": "lowercase"
  The lowercase filter converts every token passed to it to lowercase.
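To make the effect of this analyzer concrete, here is a small Python simulation of the three stages. It does not call ElasticSearch or Lucene; the functions keyword_tokenizer, lowercase_filter and analyze are stand-ins written for this example.

```python
def keyword_tokenizer(text):
    """Emit the whole input as one token, as the keyword tokenizer does."""
    return [text]


def lowercase_filter(tokens):
    """Lowercase every token, as the lowercase token filter does."""
    return [token.lower() for token in tokens]


def analyze(text, char_filters=(), tokenizer=keyword_tokenizer,
            token_filters=(lowercase_filter,)):
    """Run character filters, then the tokenizer, then the token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("Andrew Smith"))  # ['andrew smith']
```

Because the keyword tokenizer does not split on whitespace, the whole name is kept as a single lowercase term.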
Inserting Data
We need to pass the document as a JSON object in the request body when making the HTTP API call. A POST to the type URL lets ElasticSearch generate the document ID.
http://localhost:9200/company/employee/
POST
{
  "name": "Andrew",
  "age": 45,
  "experience": 10
}
Response:
{
  "_index": "company",
  "_type": "employee",
  "_id": "AVM8D42POa82oxyTa_Pu",
  "_version": 1,
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}
This way we can insert one document at a time. To insert multiple documents in a single request, use the Bulk API of ElasticSearch.
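The Bulk API expects a newline-delimited body in which each document is preceded by an action line, and the body must end with a newline. The Python sketch below builds such a body for a POST to http://localhost:9200/company/employee/_bulk. The function name build_bulk_payload and the sample employees are invented for this example.

```python
import json


def build_bulk_payload(documents):
    """Return a bulk body: one action line plus one source line per document."""
    lines = []
    for document in documents:
        # An empty "index" action takes the index and type from the URL.
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(document))
    return "\n".join(lines) + "\n"


# Made-up sample data, for illustration only.
employees = [
    {"name": "Maria", "age": 38, "experience": 12},
    {"name": "John", "age": 29, "experience": 4},
]
payload = build_bulk_payload(employees)
```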
Retrieving Data
To read all records of a type within an index, use a URL of the following form with the GET HTTP method:
http://localhost:9200/company/employee/_search
The same pattern applies to any index and type:
http://localhost:9200/vehicles/car/_search
http://localhost:9200/vehicles/bike/_search
http://localhost:9200/vehicles/truck/_search
The last three URLs search in an index named vehicles, which has the document types car, bike and truck. Each of these types holds documents with data specific to cars, bikes and trucks respectively.
You can perform many other operations using ElasticSearch's REST APIs, such as:
- Checking the status of the ElasticSearch server
- Performing CRUD (Create, Read, Update and Delete) and search operations against your indices
- Performing operations like paging, sorting, filtering, scripting and aggregations
Retrieving Data with Conditional Search
- Fetch all documents: The above-mentioned URL can be rewritten using the match_all query to return all documents of a type within an index.
Most REST clients (such as Postman) don't accept a body with a GET method, so you can use a POST instead. I have shown the examples with a GET method.
http://localhost:9200/company/employee/_search
GET
{
  "query": {
    "match_all": {}
  }
}
Response:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "company",
        "_type": "employee",
        "_id": "AVM8D42POa82oxyTa_Pu",
        "_score": 1.0,
        "_source": {
          "name": "Andrew",
          "age": 45,
          "experience": 10
        }
      }
    ]
  }
}
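If your client cannot send a body with GET, the search endpoint also accepts the same body with POST. A minimal Python sketch using the standard library follows; the helper names search_request and run_search are illustrative, and the URL assumes the local node and the company index used in this tutorial.

```python
import json
from urllib.request import Request, urlopen

SEARCH_URL = "http://localhost:9200/company/employee/_search"


def search_request(query, url=SEARCH_URL):
    """Build a POST request whose JSON body wraps the given query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(query, url=SEARCH_URL):
    """Send the search and return the decoded response (needs a live server)."""
    with urlopen(search_request(query, url), timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, run_search({"match_all": {}}) sends the match_all query shown above.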
- Fetch all employees with a particular name:
To retrieve all employees with the name ‘Andrew’, use a match query and specify the field and the value to look for within it.
http://localhost:9200/company/employee/_search
GET
{
  "query": {
    "match": {
      "name": "Andrew"
    }
  }
}
The response:
{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "company",
        "_type": "employee",
        "_id": "AVM8D42POa82oxyTa_Pu",
        "_score": 0.30685282,
        "_source": {
          "name": "Andrew",
          "age": 45,
          "experience": 10
        }
      }
    ]
  }
}
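In a search response, the matching documents sit under hits.hits, and each original document is in its _source field. The Python sketch below pulls them out of an already decoded response; the function names extract_sources and summarize_hits are made up for this example.

```python
def extract_sources(response):
    """Return the original documents (_source) of all hits in a search response."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits]


def summarize_hits(response):
    """Return (id, score, name) tuples; name is a field of the employee type."""
    hits = response.get("hits", {}).get("hits", [])
    return [(hit["_id"], hit["_score"], hit["_source"].get("name")) for hit in hits]
```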
- Fetch all employees whose age is greater than or equal to a number:
http://localhost:9200/company/employee/_search
GET
{
  "query": {
    "range": {
      "age": { "gte": 35 }
    }
  }
}
The response:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "company",
        "_type": "employee",
        "_id": "AVM8D42POa82oxyTa_Pu",
        "_score": 0.30685282,
        "_source": {
          "name": "Andrew",
          "age": 45,
          "experience": 10
        }
      }
    ]
  }
}
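A range clause accepts the bounds gt, gte, lt and lte, and they can be combined to describe an interval. The Python sketch below builds such a clause and rejects unknown bound names; the function range_query is a helper invented for this example.

```python
ALLOWED_BOUNDS = ("gt", "gte", "lt", "lte")


def range_query(field, **bounds):
    """Build a range clause such as {"range": {"age": {"gte": 35}}}."""
    if not bounds:
        raise ValueError("at least one bound is required")
    unknown = sorted(set(bounds) - set(ALLOWED_BOUNDS))
    if unknown:
        raise ValueError("unsupported bounds: {}".format(", ".join(unknown)))
    return {"range": {field: dict(bounds)}}


# Employees aged 35 or older, then employees aged 35 up to (but not including) 50.
at_least_35 = range_query("age", gte=35)
between_35_and_50 = range_query("age", gte=35, lt=50)
```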
- Fetch data with multiple conditions:
You can also combine multiple clauses in a single bool query. Clauses under must have to match; clauses under should are optional here and raise the score of documents that match them:
http://localhost:9200/company/employee/_search
GET
{
  "query": {
    "bool": {
      "must": { "match": { "name": "Andrew" }},
      "should": { "range": { "age": { "gte": 35 }}}
    }
  }
}
The response:
{
  "took": 31,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.04500804,
    "hits": [
      {
        "_index": "company",
        "_type": "employee",
        "_id": "AVM8D42POa82oxyTa_Pu",
        "_score": 0.04500804,
        "_source": {
          "name": "Andrew",
          "age": 45,
          "experience": 10
        }
      }
    ]
  }
}
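When several conditions are assembled in code, it helps to build the bool query from its parts and wrap it in the top-level query key that the search endpoint expects. The following Python sketch does that; the function bool_query is a helper written for this example, not an ElasticSearch API.

```python
def bool_query(must=None, should=None, must_not=None):
    """Build a search body with a bool query, leaving out empty sections."""
    clauses = {}
    for name, value in (("must", must), ("should", should), ("must_not", must_not)):
        if value:
            clauses[name] = value
    return {"query": {"bool": clauses}}


# Same conditions as the request above: the name must match, the age range is optional.
body = bool_query(
    must={"match": {"name": "Andrew"}},
    should={"range": {"age": {"gte": 35}}},
)
```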
Summary
ElasticSearch will help you resolve many search optimization problems in your existing applications. It is useful for giving your users a quality search experience and also letting them find what they are really looking for.