logstash設定教學

設定檔編寫

logstash基本元素

input, filter, 和 output,每個元素可能一個或多個

建立一個基本的pipeline

下面建立一個基本的pipeline,從stdin讀取資料,然後輸出到stdout。我們將建立一個first-pipeline.conf並且放在C:\ELK\logstash-7.17.3\logstash-7.17.3(與bin同一個資料夾)

1
2
3
4
5
6
7
8
9
10
11
12
13
# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
stdin { }
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
stdout {}
}

檢查pipeline語法

打開powershell,利用以下指令檢查pipeline語法,注意要確認沒有任何錯誤,因為檢查程式如果找不到檔案位置最後也會顯示檢查OK。
我們在C:\ELK\logstash-7.17.3\logstash-7.17.3輸入以下指令bin/logstash -f first-pipeline1.conf --config.test_and_exit

輸出如下

1
2
3
4
5
6
7
8
9
10
11
Using LS_JAVA_HOME defined java: C:\ELK\logstash-7.17.3\logstash-7.17.3\jdk\
WARNING: Using LS_JAVA_HOME while Logstash distribution comes with a bundled JDK.
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Sending Logstash logs to C:/ELK/logstash-7.17.3/logstash-7.17.3/logs which is now configured via log4j2.properties
[2023-05-05T14:24:31,585][INFO ][logstash.runner ] Log4j configuration path used is: C:\ELK\logstash-7.17.3\logstash-7.17.3\config\log4j2.properties
[2023-05-05T14:24:31,601][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.17.3", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.14.1+1 on 11.0.14.1+1 +indy +jit [mswin32-x86_64]"}
[2023-05-05T14:24:31,601][INFO ][logstash.runner ] JVM bootstrap flags: [-Xms1g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[2023-05-05T14:24:31,761][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2023-05-05T14:24:32,816][INFO ][org.reflections.Reflections] Reflections took 78 ms to scan 1 urls, producing 119 keys and 419 values
Configuration OK
[2023-05-05T14:24:33,781][INFO ][logstash.runner ] Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash

自動重載設定檔

檢查通過後我們用--config.reload.automatic選項重新載入設定檔,如此一來就不用一直重啟程式。在過程中你可能會看到忽略pipelines.yml的警告,因為我們已經在命令中明確指定要用的設定檔了,所以可以忽略這個警告,之後我們再學習使用pipelines.yml

bin/logstash -f first-pipeline.conf --config.reload.automatic

等到看到[main] Pipeline started {"pipeline.id"=>"main"}這個訊息後,我們可以直接輸入一些文字,按下enter,就可以看到輸入的文字被輸出到stdout了。
例如入Hello wordl後按下Enter,結果如下

1
2
3
4
5
6
7
Hello World
{
"@version" => "1",
"host" => "eastcoastVM01",
"message" => "Hello World\r",
"@timestamp" => 2023-05-05T06:32:33.396Z
}

使用 Grok Filter 外掛程式(Plugin)

Grok是眾多logstash外掛程式的其中一個,這裡可以看到更多關於logstash的外掛程式。

grok filter plugin可以幫我們將沒有結構化的log轉化成結構化的log

多個資料來源

以下範例是多個資料來源,分別來自twitter和firebeat,輸出也有多個,分別為elasticsearch和寫到檔案

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
input {
twitter {
consumer_key => "enter_your_consumer_key_here"
consumer_secret => "enter_your_secret_here"
keywords => ["cloud"]
oauth_token => "enter_your_access_token_here"
oauth_token_secret => "enter_your_access_token_secret_here"
}
beats {
port => "5044"
}
}
output {
elasticsearch {
hosts => ["IP Address 1:port1", "IP Address 2:port2", "IP Address 3"]
}
file {
path => "/path/to/target/file"
}
}

建立一個接收rabbitmq訊號的範例

解析rabbitmq訊號

在input中我們可以加入input plugin,這裡我們使用rabbitmq plugin,先建立一個最簡單的範例並且將解析結果輸出到console。
在rabbitmq plugin中的說明文件有提到,預設的輸出是json codec

設定rabbitmq的參數在這裡,我們會需要設定rabbitmq的連線資訊,範例如下。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
rabbitmq {
host => "localhost"
port => 5672
username => guest
password => guest
}
}
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
stdout {}
}

抽取事件中的欄位

有時候我們會需要依據事件中欄位的資料做一些處理,Field Reference可以幫助我們抽取事件中的欄位。以下事件為範例
,如果要取得第一階的欄位如agent, ip, request, response, ua,只需要一個[]就可以取得例如[request],如果是第二階欄位如os則要用[ua][os]

1
2
3
4
5
6
7
8
9
10
11
12
{
"agent": "Mozilla/5.0 (compatible; MSIE 9.0)",
"ip": "192.168.24.44",
"request": "/index.html"
"response": {
"status": 200,
"bytes": 52353
},
"ua": {
"os": "Windows 7"
}
}

對欄位操作Mutate filter plugin

https://www.elastic.co/guide/en/logstash/7.17/plugins-filters-mutate.html#plugins-filters-mutate-replace

檢查子欄位是否存在

https://github.com/elastic/logstash/issues/10215#issuecomment-447912618

加入一個filed,其值是一個sub field

https://stackoverflow.com/a/39126309