Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

promtail syslog receiver aborts on non-UTF8 logs #1783

Closed
candlerb opened this issue Mar 9, 2020 · 4 comments · Fixed by #2127
Closed

promtail syslog receiver aborts on non-UTF8 logs #1783

candlerb opened this issue Mar 9, 2020 · 4 comments · Fixed by #2127
Labels
component/agent keepalive An issue or PR that will be kept alive and never marked as stale. type/bug Somehing is not working as expected

Comments

@candlerb
Copy link
Contributor

candlerb commented Mar 9, 2020

Describe the bug
When promtail receives a message which is not valid UTF8 over the TCP syslog receiver, it logs an error and drops the TCP connection on the floor.

To Reproduce
This is tested with promtail-linux-amd64 1.3.0 from binary release.

  1. Configure rsyslog to forward to promtail

    *.*     action(type="omfwd" protocol="tcp"
                   target="127.0.0.1" port="5140"
                   Template="RSYSLOG_SyslogProtocol23Format"
                   TCP_Framing="octet-counted")
    
  2. Send a packet containing binary goop, e.g. this one captured from a Netgear switch

    echo -ne '<189>Mar  8 16:21:20 10.26.1.7-1 TRAPMGR[101083328]: traputil.c(625) 30766 %% LLDP-MED Topology Change Detected: ChassisIDSubtype: 4, ChassisID: HM~\xe0\xf0\x91, DeviceClass: 1' |
      nc -w1 -u localhost 514
    

What happens (as recorded in /var/log/syslog by rsyslog itself):

Mar  8 16:21:20 10.26.1.7-1 TRAPMGR[101083328]: traputil.c(625) 30766 %% LLDP-MED Topology Change Detected: ChassisIDSubtype: 4, ChassisID: HM~<E0><F0><91>, DeviceClass: 1
Mar  8 16:21:21 ix-loki promtail-linux-amd64[3256]: level=warn ts=2020-03-08T16:21:21.16213922Z caller=syslogtarget.go:174 msg="error parsing syslog stream" err="expecting a free-form optional message in UTF-8 (starting with or without BOM) [col 172]"
Mar  8 16:21:21 ix-loki rsyslogd: omfwd: TCPSendBuf error -2027, destruct TCP Connection to 127.0.0.1:5140 [v8.32.0 try http://www.rsyslog.com/e/2027 ]
Mar  8 16:21:21 ix-loki rsyslogd: action 'action 2' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.32.0 try http://www.rsyslog.com/e/2007 ]
Mar  8 16:21:21 ix-loki rsyslogd: action 'action 2' resumed (module 'builtin:omfwd') [v8.32.0 try http://www.rsyslog.com/e/2359 ]
  • rsyslog forwards the message to promtail
  • promtail logs an error ("expecting a free-form optional message in UTF-8") and closes the TCP connection
  • rsyslog logs an error about the closed TCP connection and backs off

Expected behavior
The log to be received and processed. Note that RFC5424 section 6 explicitly allows non-UTF8 messages as long as they don't begin with the Unicode BOM, and therefore rsyslog is not violating the protocol.

      MSG             = MSG-ANY / MSG-UTF8
      MSG-ANY         = *OCTET ; not starting with BOM
      MSG-UTF8        = BOM UTF-8-STRING
      BOM             = %xEF.BB.BF

Possibly, the non-UTF8 text may be converted to replacement character uFFFD, or codepoints u80 to uFF, or perhaps surrogate escape sequences in the range uDC80 to uDCFF, before forwarding to loki (since it has to be JSON-wrapped). But storing the log in a partially modified form is better than not storing it and aborting.

Environment:

  • Ubuntu 18.04
  • rsyslog 8.2002 from Adiscon PPA
  • promtail-linux-amd64 1.3.0 binary package from github releases

Screenshots, Promtail config, or terminal output
Promtail config:

# Promtail can expose prometheus metrics
server:
  http_listen_address: 127.0.0.1
  http_listen_port: 9080
  grpc_listen_port: 0

# How to find the loki server(s)
clients:
  - url: http://127.0.0.1:3100/loki/api/v1/push

# Logs to read and forward
scrape_configs:
  - job_name: syslog
    syslog:
      listen_address: 127.0.0.1:5140
    relabel_configs:
      - source_labels: [__syslog_message_severity]
        target_label: severity
      - source_labels: [__syslog_message_facility]
        target_label: facility
      - source_labels: [__syslog_message_hostname]
        target_label: host
      - source_labels: [__syslog_message_app_name]
        target_label: app
@candlerb
Copy link
Contributor Author

candlerb commented Mar 9, 2020

Possible resolutions I can see:

  1. Attempt to decode the string as UTF8; if it fails then decode again as latin-1 (which will give codepoints u80 to uFF)
  2. Only decode the string as UTF8 if it begins with BOM; otherwise decode as latin-1
  3. Decode as UTF8 and replace invalid characters with RuneError (uFFFD)
  4. Decode as UTF8 and replace invalid characters with surrogate escape uDC80 to uDCFF
  5. Decode as UTF8 and replace invalid characters with some ASCII sequence such as \ x N N or < N N > (the latter is what less does)

2 is fully reversible. The problem with 2 is that there might be some devices out there already sending UTF8 inside syslog but not declaring it with a BOM, and these would be mis-decoded (and this would be a change in behaviour).

The problem with 3 is that there is data loss.

The problem with 4 is that surrogate escapes are not widely used, and are technically still invalid unicode and hence cannot safely be carried in JSON.

5 is not safely reversible, unless you also replace a bare \ with \\ in all messages.

My inclination is a combination of 1 and 2:

  • If the message starts with BOM, then decode as UTF8 and replace invalid runes with uFFFD
  • If the message does not start with BOM, then decode as UTF8, but if it fails anywhere, re-decode the entire message as latin-1

EDIT: reference https://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences

@candlerb
Copy link
Contributor Author

candlerb commented Mar 9, 2020

Until this is fixed, here is a workaround when using rsyslog: you can use the mmutf8fix module to replace invalid utf8 characters with a single ASCII character (unfortunately the Unicode replacement character is not possible).

Put this before the action which forwards to promtail:

module(load="mmutf8fix")
action(type="mmutf8fix" replacementChar="?")

This module is available as part of rsyslog from the Adiscon v8-stable repository, in separate package "rsyslog-mmutf8fix".

@cyriltovena
Copy link
Contributor

Thank you Brian for the workaround and possible solutions, we'll look into it !

🥇

@cyriltovena cyriltovena added component/agent keepalive An issue or PR that will be kept alive and never marked as stale. kind/bug labels Mar 13, 2020
@bastjan
Copy link
Contributor

bastjan commented May 2, 2020

We started seeing this too. Thanks for the workaround!

The problem is related to influxdata/go-syslog#21 and I started fixing upstream: influxdata/go-syslog#35.

The idea is to relax the parser to accept any encoding and add an option to enforce a compliant MSG. This would stop promtail from dropping the connection when receiving a non UTF8 message and pass it as-is.

The encoding could then be fixed or reencoded with a fallback encoding in the syslog receiver part of promtail or more generically in a later pipeline stage (new transform stage?).

Something like:

scrape_configs:
- job_name: syslog
  syslog:
    listen_address: 127.0.0.1:5140
  relabel_configs:
  - source_labels: [__syslog_message_hostname]
    target_label: host

  pipeline_stages:
  - match:
    selector: '{host=~"ex-.*"}'
    stages:
    - encoding:
        reencode_with_fallback: iso-latin-1

@chaudum chaudum added the type/bug Somehing is not working as expected label Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/agent keepalive An issue or PR that will be kept alive and never marked as stale. type/bug Somehing is not working as expected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants