OpenSSL engine and async support

Posted by Grant Zhang 
Grant Zhang
OpenSSL engine and async support
February 04, 2017 01:00AM
This patch set adds the basic support for OpenSSL crypto engine and
async mode.

Changes since V2:
- support keyword "algo"
- ensure SSL engines are initialized before loading certs.
- limit one async fd per SSL connection
- better integrate with event cache

Changes since V1:
- add multiple engine support
- allow default algorithms to be specified for an engine
- remove the support for engine identifier "all" since (a) it is not possible
to specify default algorithms for all engines and (b) "all" makes it hard to
figure out which engine handles which crypto algorithms.
- address Willy's other comments.
This patch adds the global 'ssl-engine' keyword. The first argument is an
engine identifier, optionally followed by the 'algo' keyword and a list of
default algorithms the engine will handle.

If the openssl version is too old, an error is reported when the option
is used.
---
doc/configuration.txt | 16 ++++++
include/common/mini-clist.h | 7 +++
src/ssl_sock.c | 119 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 141 insertions(+), 1 deletion(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index c2ede71..ecd1769 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -589,6 +589,7 @@ The following keywords are supported in the "global" section :
- spread-checks
- server-state-base
- server-state-file
+ - ssl-engine
- tune.buffers.limit
- tune.buffers.reserve
- tune.bufsize
@@ -1240,6 +1241,21 @@ spread-checks <0..50, in percent>
and +/- 50%. A value between 2 and 5 seems to show good results. The
default value remains at 0.

+ssl-engine <name> [algo <comma-separated list of algorithms>]
+ Sets the OpenSSL engine to <name>. List of valid values for <name> may be
+ obtained using the command "openssl engine". This statement may be used
+ multiple times, it will simply enable multiple crypto engines. Referencing an
+ unsupported engine will prevent haproxy from starting. Note that many engines
+ will lead to lower HTTPS performance than pure software with recent
+ processors. The optional command "algo" sets the default algorithms an ENGINE
+ will supply using the OpenSSL function ENGINE_set_default_string(). A value
+ of "ALL" uses the engine for all cryptographic operations. If no list of
+ algo is specified then the value of "ALL" is used. A comma-separated list
+ of different algorithms may be specified, including: RSA, DSA, DH, EC, RAND,
+ CIPHERS, DIGESTS, PKEY, PKEY_CRYPTO, PKEY_ASN1. This is the same format that
+ openssl configuration file uses:
+ https://www.openssl.org/docs/man1.0.2/apps/config.html
+
tune.buffers.limit <number>
Sets a hard limit on the number of buffers which may be allocated per process.
The default value is zero which means unlimited. The minimum non-zero value
diff --git a/include/common/mini-clist.h b/include/common/mini-clist.h
index da24b33..7000927 100644
--- a/include/common/mini-clist.h
+++ b/include/common/mini-clist.h
@@ -61,6 +61,13 @@ struct cond_wordlist {
char *s;
};

+/* this is the same as above with the additional pointer to an argument. */
+struct arg1_wordlist {
+ struct list list;
+ void *arg1;
+ char *word;
+};
+
/* First undefine some macros which happen to also be defined on OpenBSD,
* in sys/queue.h, used by sys/event.h
*/
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 232a497..b173d77 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -52,6 +52,7 @@
#ifndef OPENSSL_NO_DH
#include <openssl/dh.h>
#endif
+#include <openssl/engine.h>

#include <import/lru.h>
#include <import/xxhash.h>
@@ -168,6 +169,9 @@ static struct {
struct list tlskeys_reference = LIST_HEAD_INIT(tlskeys_reference);
#endif

+struct list openssl_engines = LIST_HEAD_INIT(openssl_engines);
+static unsigned int openssl_engines_initialized;
+
#ifndef OPENSSL_NO_DH
static int ssl_dh_ptr_index = -1;
static DH *global_dh = NULL;
@@ -262,6 +266,55 @@ struct ocsp_cbk_arg {
};
};

+static int ssl_init_single_engine(const char *engine_id, const char *def_algorithms)
+{
+ int err_code = ERR_ABORT;
+ ENGINE *engine;
+
+ /* grab the structural reference to the engine */
+ engine = ENGINE_by_id(engine_id);
+ if (engine == NULL) {
+ Alert("ssl-engine %s: failed to get structural reference\n", engine_id);
+ goto fail_get;
+ }
+
+ if (!ENGINE_init(engine)) {
+ /* the engine couldn't initialise, release it */
+ Alert("ssl-engine %s: failed to initialize\n", engine_id);
+ goto fail_init;
+ }
+
+ if (ENGINE_set_default_string(engine, def_algorithms) == 0) {
+ Alert("ssl-engine %s: failed on ENGINE_set_default_string\n", engine_id);
+ goto fail_set_method;
+ }
+ err_code = 0;
+
+fail_set_method:
+ /* release the functional reference from ENGINE_init() */
+ ENGINE_finish(engine);
+
+fail_init:
+ /* release the structural reference from ENGINE_by_id() */
+ ENGINE_free(engine);
+
+fail_get:
+ return err_code;
+}
+
+static int ssl_init_engines(void)
+{
+ int err_code = 0;
+ struct arg1_wordlist *wl, *wlb;
+
+ list_for_each_entry_safe(wl, wlb, &openssl_engines, list) {
+ err_code = ssl_init_single_engine(wl->word, wl->arg1);
+ if (err_code == ERR_ABORT)
+ break;
+ }
+ return err_code;
+}
+
/*
* This function returns the number of seconds elapsed
* since the Epoch, 1970-01-01 00:00:00 +0000 (UTC) and the
@@ -2329,7 +2382,6 @@ static int ssl_sock_load_cert_chain_file(SSL_CTX *ctx, const char *file, struct
if (BIO_read_filename(in, file) <= 0)
goto end;

-
passwd_cb = SSL_CTX_get_default_passwd_cb(ctx);
passwd_cb_userdata = SSL_CTX_get_default_passwd_cb_userdata(ctx);

@@ -2513,6 +2565,12 @@ int ssl_sock_load_cert(char *path, struct bind_conf *bind_conf, char **err)
int j;
#endif

+ if (!openssl_engines_initialized) {
+ if (ssl_init_engines())
+ return 1;
+ openssl_engines_initialized = 1;
+ }
+
if (stat(path, &buf) == 0) {
dir = opendir(path);
if (!dir)
@@ -2654,6 +2712,12 @@ int ssl_sock_load_cert_list_file(char *file, struct bind_conf *bind_conf, struct
return 1;
}

+ if (!openssl_engines_initialized) {
+ if (ssl_init_engines())
+ return 1;
+ openssl_engines_initialized = 1;
+ }
+
while (fgets(thisline, sizeof(thisline), f) != NULL) {
int arg, newarg, cur_arg, i, ssl_b = 0, ssl_e = 0;
char *end;
@@ -6375,6 +6439,48 @@ static int ssl_parse_global_ca_crt_base(char **args, int section_type, struct pr
return 0;
}

+/* parse the "ssl-engine" keyword in global section.
+ * Returns <0 on alert, >0 on warning, 0 on success.
+ */
+static int ssl_parse_global_ssl_engine(char **args, int section_type, struct proxy *curpx,
+ struct proxy *defpx, const char *file, int line,
+ char **err)
+{
+ char *algo;
+ struct arg1_wordlist *wl;
+
+ if (*(args[1]) == 0) {
+ memprintf(err, "global statement '%s' expects a valid engine name as an argument.", args[0]);
+ return -1;
+ }
+
+ if (*(args[2]) == 0) {
+ memprintf(err, "statement '%s' expects algorithm names as an argument.", args[0]);
+ /* if no list of algorithms is given, it defaults to ALL */
+ algo = strdup("ALL");
+ goto add_engine;
+ }
+
+ /* otherwise the expected format is ssl-engine <engine_name> algo <list of algo> */
+ if (strcmp(args[2], "algo") != 0) {
+ memprintf(err, "global statement '%s' expects to have algo keyword.", args[0]);
+ return -1;
+ }
+
+ if (*(args[3]) == 0) {
+ memprintf(err, "global statement '%s' expects algorithm names as an argument.", args[0]);
+ return -1;
+ }
+ algo = strdup(args[3]);
+
+add_engine:
+ wl = calloc(1, sizeof(*wl));
+ wl->word = strdup(args[1]);
+ wl->arg1 = algo;
+ LIST_ADD(&openssl_engines, &wl->list);
+ return 0;
+}
+
/* parse the "ssl-default-bind-ciphers" / "ssl-default-server-ciphers" keywords
* in global section. Returns <0 on alert, >0 on warning, 0 on success.
*/
@@ -6948,6 +7054,7 @@ static struct cfg_kw_list cfg_kws = {ILH, {
#ifndef OPENSSL_NO_DH
{ CFG_GLOBAL, "ssl-dh-param-file", ssl_parse_global_dh_param_file },
#endif
+ { CFG_GLOBAL, "ssl-engine", ssl_parse_global_ssl_engine },
{ CFG_GLOBAL, "tune.ssl.cachesize", ssl_parse_global_int },
#ifndef OPENSSL_NO_DH
{ CFG_GLOBAL, "tune.ssl.default-dh-param", ssl_parse_global_default_dh },
@@ -7015,6 +7122,7 @@ static void __ssl_sock_init(void)
srv_register_keywords(&srv_kws);
cfg_register_keywords(&cfg_kws);
cli_register_kw(&cli_kws);
+ ENGINE_load_builtin_engines();
#if (defined SSL_CTRL_SET_TLSEXT_TICKET_KEY_CB && TLS_TICKETS_NO > 0)
hap_register_post_check(tlskeys_finalize_config);
#endif
@@ -7067,6 +7175,7 @@ static void __ssl_sock_init(void)
__attribute__((destructor))
static void __ssl_sock_deinit(void)
{
+ struct arg1_wordlist *wl, *wlb;
#if (defined SSL_CTRL_SET_TLSEXT_HOSTNAME && !defined SSL_NO_GENERATE_CERTIFICATES)
lru64_destroy(ssl_ctx_lru_tree);
#endif
@@ -7093,6 +7202,14 @@ static void __ssl_sock_deinit(void)
}
#endif

+ /* free up engine list */
+ list_for_each_entry_safe(wl, wlb, &openssl_engines, list) {
+ free(wl->arg1);
+ free(wl->word);
+ LIST_DEL(&wl->list);
+ free(wl);
+ }
+
ERR_remove_state(0);
ERR_free_strings();

--
1.9.1
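
The argument handling documented for 'ssl-engine' above can be sketched outside HAProxy as follows. This is a minimal standalone model under stated assumptions, not the patch's code: struct engine_conf, dup_str() and parse_ssl_engine() are hypothetical names, errors are returned as static strings instead of going through memprintf(), and unlike the patch it sets no error message on the default-to-"ALL" path and it checks allocation results.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one parsed "ssl-engine" line (not a HAProxy type). */
struct engine_conf {
	char *name;  /* engine identifier, e.g. "qat" */
	char *algo;  /* default algorithm list; "ALL" when none is given */
};

/* Portable string duplication (strdup() is not part of ISO C99). */
static char *dup_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Parse "ssl-engine <name> [algo <list>]".
 * <args> uses the layout the patch relies on: args[0] is the keyword and
 * the word list ends with an empty string.
 * Returns 0 on success, -1 on error with *err set to a static message.
 */
static int parse_ssl_engine(const char **args, struct engine_conf *out,
                            const char **err)
{
	const char *algo = "ALL"; /* default when no list is given */

	out->name = NULL;
	out->algo = NULL;
	*err = NULL;

	if (*args[1] == '\0') {
		*err = "expects a valid engine name as an argument";
		return -1;
	}

	if (*args[2] != '\0') {
		/* anything after the engine name must be: algo <list> */
		if (strcmp(args[2], "algo") != 0) {
			*err = "expects the 'algo' keyword";
			return -1;
		}
		if (*args[3] == '\0') {
			*err = "expects algorithm names after 'algo'";
			return -1;
		}
		algo = args[3];
	}

	out->name = dup_str(args[1]);
	out->algo = dup_str(algo);
	if (!out->name || !out->algo) {
		free(out->name);
		free(out->algo);
		out->name = NULL;
		out->algo = NULL;
		*err = "out of memory";
		return -1;
	}
	return 0;
}
```

Called with the words { "ssl-engine", "qat", "" }, this yields name "qat" and algo "ALL"; with { "ssl-engine", "qat", "algo", "RSA,DSA", "" } it keeps the list as given. The caller owns both strings and releases them with free().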
Grant Zhang
[PATCH V3 2/2] RFC: add openssl async support
February 04, 2017 01:00AM
ssl-async is a global configuration parameter which enables asynchronous
processing in OpenSSL for all SSL connections haproxy handles. With
SSL_MODE_ASYNC set, TLS I/O operations may indicate a retry with
SSL_ERROR_WANT_ASYNC if an asynchronous-capable engine is used to perform
cryptographic operations.
---
doc/configuration.txt | 5 ++
include/types/connection.h | 2 +
include/types/fd.h | 1 +
src/ssl_sock.c | 112 +++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 120 insertions(+)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index ecd1769..8190920 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -590,6 +590,7 @@ The following keywords are supported in the "global" section :
- server-state-base
- server-state-file
- ssl-engine
+ - ssl-async
- tune.buffers.limit
- tune.buffers.reserve
- tune.bufsize
@@ -1256,6 +1257,10 @@ ssl-engine <name> [algo <comma-separated list of algorithms>]
openssl configuration file uses:
https://www.openssl.org/docs/man1.0.2/apps/config.html

+ssl-async
+ Adds SSL_MODE_ASYNC mode to the SSL context. This enables asynchronous TLS
+ I/O operations if an asynchronous capable SSL engine is used.
+
tune.buffers.limit <number>
Sets a hard limit on the number of buffers which may be allocated per process.
The default value is zero which means unlimited. The minimum non-zero value
diff --git a/include/types/connection.h b/include/types/connection.h
index c644dd5..1c8e1a8 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -303,6 +303,8 @@ struct connection {
struct sockaddr_storage from; /* client address, or address to spoof when connecting to the server */
struct sockaddr_storage to; /* address reached by the client, or address to connect to */
} addr; /* addresses of the remote side, client for producer and server for consumer */
+
+ OSSL_ASYNC_FD async_fd;
};

/* proxy protocol v2 definitions */
diff --git a/include/types/fd.h b/include/types/fd.h
index 7f63093..f3d03f8 100644
--- a/include/types/fd.h
+++ b/include/types/fd.h
@@ -100,6 +100,7 @@ struct fdtab {
unsigned char updated:1; /* 1 if this fd is already in the update list */
unsigned char linger_risk:1; /* 1 if we must kill lingering before closing */
unsigned char cloned:1; /* 1 if a cloned socket, requires EPOLL_CTL_DEL on close */
+ unsigned char async:1; /* 1 if this fd is async ssl fd */
};

/* less often used information */
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index b173d77..ec5777f 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -53,6 +53,7 @@
#include <openssl/dh.h>
#endif
#include <openssl/engine.h>
+#include <openssl/async.h>

#include <import/lru.h>
#include <import/xxhash.h>
@@ -136,6 +137,7 @@ static struct xprt_ops ssl_sock;
static struct {
char *crt_base; /* base directory path for certificates */
char *ca_base; /* base directory path for CAs and CRLs */
+ int async; /* whether we use ssl async mode */

char *listen_default_ciphers;
char *connect_default_ciphers;
@@ -315,6 +317,54 @@ static int ssl_init_engines(void)
return err_code;
}

+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+void ssl_async_fd_handler(int fd)
+{
+ struct connection *conn = fdtab[fd].owner;
+ int conn_fd = conn->t.sock.fd;
+
+ /* crypto engine is available, let's notify the associated
+ * connection that it can pursue its processing.
+ */
+ conn_fd_handler(conn_fd);
+}
+
+/*
+ * openssl async fd handler
+ */
+void ssl_async_process_fds(struct connection *conn)
+{
+ OSSL_ASYNC_FD add_fd;
+ OSSL_ASYNC_FD del_fd;
+ size_t num_add_fds = 0;
+ size_t num_del_fds = 0;
+ SSL *ssl = conn->xprt_ctx;
+
+ SSL_get_changed_async_fds(ssl, &add_fd, &num_add_fds, &del_fd,
+ &num_del_fds);
+
+ if (num_add_fds == 0 && num_del_fds == 0)
+ return;
+
+ /* we don't support more than 1 async fds */
+ if (num_add_fds > 1 || num_del_fds > 1)
+ return;
+
+ if (num_del_fds)
+ fd_stop_both(del_fd);
+
+ if (num_add_fds) {
+ conn->async_fd = add_fd;
+ fdtab[add_fd].async = 1;
+ fdtab[add_fd].state = 0;
+ fdtab[add_fd].owner = conn;
+ fdtab[add_fd].iocb = ssl_async_fd_handler;
+ fd_insert(add_fd);
+ fd_want_recv(add_fd);
+ }
+}
+#endif
+
/*
* This function returns the number of seconds elapsed
* since the Epoch, 1970-01-01 00:00:00 +0000 (UTC) and the
@@ -2930,6 +2980,10 @@ int ssl_sock_prepare_ctx(struct bind_conf *bind_conf, struct ssl_bind_conf *ssl_
Alert("OpenSSL random data generator initialization failed.\n");
cfgerr++;
}
+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ if (global_ssl.async)
+ sslmode |= SSL_MODE_ASYNC;
+#endif

if (conf_ssl_options & BC_SSL_O_NO_SSLV3)
ssloptions |= SSL_OP_NO_SSLv3;
@@ -3365,6 +3419,11 @@ int ssl_sock_prepare_srv_ctx(struct server *srv)
#endif
#endif
SSL_CTX_set_options(srv->ssl_ctx.ctx, options);
+
+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ if (global_ssl.async)
+ mode |= SSL_MODE_ASYNC;
+#endif
SSL_CTX_set_mode(srv->ssl_ctx.ctx, mode);

if (global.ssl_server_verify == SSL_SERVER_VERIFY_REQUIRED)
@@ -3821,6 +3880,14 @@ int ssl_sock_handshake(struct connection *conn, unsigned int flag)
if (ret <= 0) {
/* handshake may have not been completed, let's find why */
ret = SSL_get_error(conn->xprt_ctx, ret);
+
+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ if (ret == SSL_ERROR_WANT_ASYNC) {
+ ssl_async_process_fds(conn);
+ return 0;
+ }
+#endif
+
if (ret == SSL_ERROR_WANT_WRITE) {
/* SSL handshake needs to write, L4 connection may not be ready */
__conn_sock_stop_recv(conn);
@@ -3906,6 +3973,12 @@ int ssl_sock_handshake(struct connection *conn, unsigned int flag)
/* handshake did not complete, let's find why */
ret = SSL_get_error(conn->xprt_ctx, ret);

+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ if (ret == SSL_ERROR_WANT_ASYNC) {
+ ssl_async_process_fds(conn);
+ return 0;
+ }
+#endif
if (ret == SSL_ERROR_WANT_WRITE) {
/* SSL handshake needs to write, L4 connection may not be ready */
__conn_sock_stop_recv(conn);
@@ -4091,6 +4164,12 @@ static int ssl_sock_to_buf(struct connection *conn, struct buffer *buf, int coun
}
else {
ret = SSL_get_error(conn->xprt_ctx, ret);
+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ if (ret == SSL_ERROR_WANT_ASYNC) {
+ ssl_async_process_fds(conn);
+ break;
+ }
+#endif
if (ret == SSL_ERROR_WANT_WRITE) {
/* handshake is running, and it needs to enable write */
conn->flags |= CO_FL_SSL_WAIT_HS;
@@ -4192,6 +4271,13 @@ static int ssl_sock_from_buf(struct connection *conn, struct buffer *buf, int fl
}
else {
ret = SSL_get_error(conn->xprt_ctx, ret);
+
+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ if (ret == SSL_ERROR_WANT_ASYNC) {
+ ssl_async_process_fds(conn);
+ break;
+ }
+#endif
if (ret == SSL_ERROR_WANT_WRITE) {
if (SSL_renegotiate_pending(conn->xprt_ctx)) {
/* handshake is running, and it may need to re-enable write */
@@ -4226,6 +4312,15 @@ static int ssl_sock_from_buf(struct connection *conn, struct buffer *buf, int fl
static void ssl_sock_close(struct connection *conn) {

if (conn->xprt_ctx) {
+ if (global_ssl.async) {
+ /* the async fd is created and owned by the SSL engine, which is
+ * responsible for fd closure. Here we are done with the async fd
+ * thus disable the polling on it, as well as clean up fdtab entry.
+ */
+ fd_stop_both(conn->async_fd);
+ fdtab[conn->async_fd].async = 0;
+ fdtab[conn->async_fd].state = 0;
+ }
SSL_free(conn->xprt_ctx);
conn->xprt_ctx = NULL;
sslconns--;
@@ -6439,6 +6534,22 @@ static int ssl_parse_global_ca_crt_base(char **args, int section_type, struct pr
return 0;
}

+/* parse the "ssl-async" keyword in global section.
+ * Returns <0 on alert, >0 on warning, 0 on success.
+ */
+static int ssl_parse_global_ssl_async(char **args, int section_type, struct proxy *curpx,
+ struct proxy *defpx, const char *file, int line,
+ char **err)
+{
+#if OPENSSL_VERSION_NUMBER >= 0x1010000fL
+ global_ssl.async = 1;
+ return 0;
+#else
+ memprintf(err, "'%s': openssl library does not support async mode", args[0]);
+ return -1;
+#endif
+}
+
/* parse the "ssl-engine" keyword in global section.
* Returns <0 on alert, >0 on warning, 0 on success.
*/
@@ -7054,6 +7165,7 @@ static struct cfg_kw_list cfg_kws = {ILH, {
#ifndef OPENSSL_NO_DH
{ CFG_GLOBAL, "ssl-dh-param-file", ssl_parse_global_dh_param_file },
#endif
+ { CFG_GLOBAL, "ssl-async", ssl_parse_global_ssl_async },
{ CFG_GLOBAL, "ssl-engine", ssl_parse_global_ssl_engine },
{ CFG_GLOBAL, "tune.ssl.cachesize", ssl_parse_global_int },
#ifndef OPENSSL_NO_DH
--
1.9.1
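
The "one async fd per SSL connection" rule applied in ssl_async_process_fds() can be illustrated with a toy model that involves neither OpenSSL nor the HAProxy fd table. Everything here is an assumption made for illustration: struct toy_conn, toy_async_update(), the action enum and the NO_FD sentinel are hypothetical names. The sentinel in particular is an addition of this sketch, so that "no async fd registered" can be told apart from fd 0; the patch itself does not reset conn->async_fd when an fd is deleted.

```c
#include <stddef.h>

#define NO_FD (-1) /* sentinel: the connection has no async fd registered */

/* Outcome of processing one batch of changed async fds. */
enum async_action {
	ASYNC_NONE,        /* nothing changed */
	ASYNC_UNSUPPORTED, /* more than one fd added or deleted: ignored */
	ASYNC_UPDATED      /* the connection's async fd was changed */
};

/* Hypothetical stand-in for the connection; only the async fd is modeled. */
struct toy_conn {
	int async_fd;
};

/* Apply the fds reported as added/deleted since the last call.
 * At most one fd may be added and one deleted per call, mirroring the
 * single-fd limit of the patch.
 */
static enum async_action toy_async_update(struct toy_conn *conn,
                                          const int *add_fds, size_t num_add,
                                          const int *del_fds, size_t num_del)
{
	if (num_add == 0 && num_del == 0)
		return ASYNC_NONE;

	/* more than one async fd per connection is not supported */
	if (num_add > 1 || num_del > 1)
		return ASYNC_UNSUPPORTED;

	/* forget the fd only if it is the one currently tracked */
	if (num_del && conn->async_fd == del_fds[0])
		conn->async_fd = NO_FD;

	/* a real implementation would register the fd with the poller here */
	if (num_add)
		conn->async_fd = add_fds[0];

	return ASYNC_UPDATED;
}
```

Starting from async_fd == NO_FD, adding fd 7 leaves the connection tracking 7, a batch that adds two fds at once is rejected without changing state, and deleting fd 7 returns the connection to NO_FD. A close path built on this model could skip the poller cleanup whenever async_fd is still NO_FD.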
Emeric Brun
Re: OpenSSL engine and async support
March 15, 2017 12:10PM
Hi Grant,

When using an engine, if there is an error parsing the configuration, haproxy gets stuck on a futex and does not exit:

[[email protected] ~]# cat haproxy/h.conf
global
ssl-engine qat
# ssl-async
tune.ssl.default-dh-param 2048

listen ss
mode tcp
bind 0.0.0.0:8080
server ssl 127.0.0.1:8443 ssl foobar verify none

listen gg
mode http
bind 0.0.0.0:8443 ssl crt /root/2048.pem
redirect location /

[[email protected] ~]# strace ./haproxy/haproxy -f ./haproxy/h.conf
....
write(2, "[ALERT] 073/120342 (2474) : ", 28[ALERT] 073/120342 (2474) : ) = 28
write(2, "Error(s) found in configuration "..., 56Error(s) found in configuration file : ./haproxy/h.conf
) = 56
write(2, "[WARNING] 073/120342 (2474) : ", 30[WARNING] 073/120342 (2474) : ) = 30
write(2, "config : missing timeouts for pr"..., 273config : missing timeouts for proxy 'ss'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
) = 273
write(2, "[ALERT] 073/120342 (2474) : ", 28[ALERT] 073/120342 (2474) : ) = 28
write(2, "Proxy 'ss', server 'ssl' [./hapr"..., 356Proxy 'ss', server 'ssl' [./haproxy/h.conf:9] verify is enabled by default but no CA file specified. If you're running on a LAN where you're certain to trust the server's certificate, please set an explicit 'verify none' statement on the 'server' line, or use 'ssl-server-verify none' in the global section to disable server-side verifications by default.
) = 356
write(2, "[WARNING] 073/120342 (2474) : ", 30[WARNING] 073/120342 (2474) : ) = 30
write(2, "config : missing timeouts for pr"..., 273config : missing timeouts for proxy 'gg'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
) = 273
mmap(NULL, 4324792, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14122d0000
write(2, "[ALERT] 073/120342 (2474) : ", 28[ALERT] 073/120342 (2474) : ) = 28
write(2, "Fatal errors found in configurat"..., 37Fatal errors found in configuration.
) = 37
futex(0x1a204a0, FUTEX_WAIT_PRIVATE, 2, NULL
Emeric Brun
Re: OpenSSL engine and async support
March 15, 2017 12:50PM
Hi Grant,

Another issue:

I'm using this configuration:

global
ssl-engine qat algo RSA
ssl-async
tune.ssl.default-dh-param 2048

listen ss
mode tcp
bind 0.0.0.0:8080
server ssl 127.0.0.1:8443 ssl no-ssl-reuse verify none

listen gg
mode http
bind 0.0.0.0:8443 ssl crt /root/2048.pem
redirect location /

I am unable to perform a clear-text request through port 8080. There is no issue if I disable the engine or if I send the request directly over SSL to port 8443.

R,
Emeric
Emeric Brun
Re: OpenSSL engine and async support
March 15, 2017 03:50PM
Hi Grant,

There are some inconsistencies between the engine and the client used:

Here is the conf:
global
tune.ssl.default-dh-param 2048
ssl-engine qat
ssl-async

listen gg
mode http
bind 0.0.0.0:8443 ssl crt /root/2048.pem
redirect location /

openssl s_client -connect works well but curl fails:
[email protected]:~/inject$ curl -k https://10.0.0.109:8443/
curl: (35) gnutls_handshake() failed: Bad record MAC


If I comment out the ssl-engine line, the issue disappears.

R,
Emeric

the conf:
Grant Zhang
Re: OpenSSL engine and async support
March 15, 2017 05:10PM
Hi Emeric,

Thanks for testing. I will try to reproduce the issues locally and report back.

Regards,

Grant

Emeric Brun
Re: OpenSSL engine and async support
March 15, 2017 06:10PM
Hi John,

I'm not sure that the issue is related to your patch; I may have hit an issue in the QAT engine.

I've run some tests using openssl s_server.

A curl request shows this error:
[[email protected] bin]# ./openssl s_server -accept 9443 -engine qat -cert /root/2048.pem
ERROR
140267076605760:error:1408F119:SSL routines:ssl3_get_record:decryption failed or bad record mac:ssl/record/ssl3_record.c:602:
shutting down SSL
CONNECTION CLOSED

And using haproxy as the client also fails, with this error:
140267076605760:error:800910C8:lib(128):qat_rsa_priv_enc:rsa from to null:qat_rsa.c:917:
140267076605760:error:141EC044:SSL routines:tls_construct_server_key_exchange:internal error:ssl/statem/statem_srvr.c:2453:
shutting down SSL
CONNECTION CLOSED

R,
Emeric


Grant Zhang
Re: OpenSSL engine and async support
March 15, 2017 06:30PM
Hi Emeric

Maybe you ran into the OpenSSL 1.1 SNI issue. Does your test branch have the following patch:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=d3850603933c9319528375088a9b28b9b345246b

If not, could you please give it a try?

Thanks,

Grant
Willy Tarreau
Re: OpenSSL engine and async support
March 15, 2017 07:10PM
Hi Grant,

On Wed, Mar 15, 2017 at 10:20:01AM -0700, Grant Zhang wrote:
> Maybe you run into the openssl 1.1 SNI issue. Does your test branch have the following patch:
> http://git.haproxy.org/?p=haproxy.git;a=commit;h=d3850603933c9319528375088a9b28b9b345246b

I think not because Emeric had issues applying your patch on top of recent
SSL changes so he had to roll back to the date of your submission, which
predates this fix. That makes sense indeed.

Willy
Emeric Brun
Re: OpenSSL engine and async support
March 16, 2017 02:20PM
Hi Grant,

Indeed, I don't have this patch. But it seems that the issue is related to the qat engine:
I'm unable to complete a handshake using curl or haproxy in SSL client mode against openssl s_server -engine qat (no issue without the engine).

I'm currently talking with the Intel folks and trying to solve it.

I'll keep you informed.

R,
Emeric
Emeric Brun
Re: OpenSSL engine and async support
March 21, 2017 03:00PM
Hi Grant,

To keep you informed:

We fixed my qat engine configuration and the second error is gone.

But I still see 'bad record' errors if the client uses OpenSSL 1.1 or GnuTLS.

There is no issue if the client uses OpenSSL 1.0.x.

The same error occurs using the engine with haproxy or with openssl s_server, so the problem is not on your side.

R,
Emeric
Grant Zhang
Re: OpenSSL engine and async support
March 21, 2017 10:10PM
> On Mar 21, 2017, at 06:56, Emeric Brun <[email protected]> wrote:
>
> Hi Grant,
>
>>>
>>> I'm not sure that the issue is related to your patch, i may reach an issue int QAT engine
>>>
>>> I've made some test using openssl s_server.
>>>
>>> Doing a curl request shows this error:
>>> [[email protected] bin]# ./openssl s_server -accept 9443 -engine qat -cert /root/2048.pem
>>> ERROR
>>> 140267076605760:error:1408F119:SSL routines:ssl3_get_record:decryption failed or bad record mac:ssl/record/ssl3_record.c:602:
>>> shutting down SSL
>>> CONNECTION CLOSED
>>>
>>> And using the haproxy as client also fails with this error:
>>> 140267076605760:error:800910C8:lib(128):qat_rsa_priv_enc:rsa from to null:qat_rsa.c:917:
>>> 140267076605760:error:141EC044:SSL routines:tls_construct_server_key_exchange:internal error:ssl/statem/statem_srvr.c:2453:
>>> shutting down SSL
>>> CONNECTION CLOSED
>>>
>>> R,
>>> Emeric
>>
>> Maybe you run into the openssl 1.1 SNI issue. Does your test branch have the following patch:
>> http://git.haproxy.org/?p=haproxy.git;a=commit;h=d3850603933c9319528375088a9b28b9b345246b
>>
>> If not, could you please give a try?
>>
>> Thanks,
>>
>> Grant
>>
>>
>
> To keep you informed:
>
> We fixed my qat engine configuration and the second error is gone.
>
> But i still notice 'bad record' errors if the client uses opensslv1.1 or gnutls.
>
> There is no issue if the client uses opensslv1.0.x
>
> Same error using the engine with haproxy or openssl s_server. So the problem is not on your side.
>
> R,
> Emeric

Hey Emeric,

Thank you very much for the information. Hopefully the s_server + qat issue can be addressed soon.

Regards,

Grant
Emeric Brun
Re: OpenSSL engine and async support
March 27, 2017 11:30AM
Hi Grant,

> Hey Emeric,
>
> Thank you very much for the information. Hopefully the s_server + qat issue could be addressed soon.
>
> Regards,
>
> Grant
>
>
>


The Intel guys told me that the bug is related to the PRF and asked me to recompile the engine using '--disable_qat_prf'. With that I can run some tests with the qat engine, but I'm facing stability issues:

[[email protected] haproxy]# /usr/local/ssl/bin/openssl speed -engine qat -elapsed -async_jobs 8 rsa2048
[WARNING][e_qat.c:1531:bind_qat()] QAT Warnings enabled.
engine "qat" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing 2048 bit private rsa's for 10s: 13442 2048 bit private RSA's in 10.01s
Doing 2048 bit public rsa's for 10s: 290503 2048 bit public RSA's in 10.00s
OpenSSL 1.1.0e 16 Feb 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/local/ssl/ssl\"" -DENGINESDIR="\"/usr/local/ssl/lib/engines-1.1\"" -Wa,--noexecstack
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000745s 0.000034s   1342.9  29050.3


A benchmark using haproxy with the qat engine stalls at ~450 connections/sec.

After stopping the injection, the haproxy process keeps burning CPU while doing nothing (top shows ~50% of one core, mainly in user):

Here is the trace:

[[email protected] ~]# strace -p 27085
Process 27085 attached
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0
epoll_wait(3, {}, 200, 1000) = 0

epoll_wait wakes up once per second, which seems normal.

If I continue to inject while re-using the same key (session resumption, no RSA computation), I observe ~1500 connections/sec.

But after stopping the injection, the process burns 156% of CPU while doing nothing (core 1: 20% in user and 80% in system; core 2: 76% in user):

Here is the trace:
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15

epoll_wait wakes up in a very fast loop.

Once this point is reached, restarting the injection sometimes crashes haproxy with a segfault.

Here is my haproxy config:
global
    tune.ssl.default-dh-param 2048
    ssl-engine qat
    ssl-async

listen gg
    mode http
    bind 0.0.0.0:9443 ssl crt /root/2048.pem ciphers AES
    redirect location /

R,
Emeric
Grant Zhang
Re: OpenSSL engine and async support
March 28, 2017 12:00AM
> On Mar 27, 2017, at 02:21, Emeric Brun <[email protected]> wrote:
> Intel's guys told me that the bug is related to prf and asked me to recompile the engine using '--disable_qat_prf'. Doing that i can do some tests iwth the qat engine but i'm facing stability issues:
>
> [[email protected] haproxy]# /usr/local/ssl/bin/openssl speed -engine qat -elapsed -async_jobs 8 rsa2048
> [WARNING][e_qat.c:1531:bind_qat()] QAT Warnings enabled.
> engine "qat" set.
> You have chosen to measure elapsed time instead of user CPU time.
> Doing 2048 bit private rsa's for 10s: 13442 2048 bit private RSA's in 10.01s
> Doing 2048 bit public rsa's for 10s: 290503 2048 bit public RSA's in 10.00s
> OpenSSL 1.1.0e 16 Feb 2017
> built on: reproducible build, date unspecified
> options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
> compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DOPENSSLDIR="\"/usr/local/ssl/ssl\"" -DENGINESDIR="\"/usr/local/ssl/lib/engines-1.1\"" -Wa,--noexecstack
> sign verify sign/s verify/s
> rsa 2048 bits 0.000745s 0.000034s 1342.9 29050.3

Hmm, those numbers are lower than what I've seen:

[email protected]:~$ sudo /tmp/openssl_1.1.0_install/bin/openssl speed -engine qat -elapsed -async_jobs 8 rsa2048
engine "qat" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing 2048 bit private rsa's for 10s: 102435 2048 bit private RSA's in 10.00s
Doing 2048 bit public rsa's for 10s: 498989 2048 bit public RSA's in 10.00s
OpenSSL 1.1.0b 26 Sep 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -DDSO_DLFCN -DHAVE_DLFCN_H -DNDEBUG -DOPENSSL_THREADS -DOPENSSL_NO_STATIC_ENGINE -DOPENSSL_PIC -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DENGINE_CONF_DEBUG -DOPENSSLDIR="\"/tmp/openssl_1.1.0_install\"" -DENGINESDIR="\"/tmp/openssl_1.1.0_install/lib/engines-1.1\"" -Wa,--noexecstack
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000098s 0.000020s  10243.5  49898.9

With -async_jobs set to 72, I am able to get:
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000025s 0.000006s  40496.4 170126.1
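
For reference, the sign/s column is simply the operation count divided by the elapsed time, so the two -async_jobs 8 runs can be compared directly. A small Python sketch using only the counts and timings printed in the two outputs above (the helper name ops_per_second is made up for illustration):

```python
def ops_per_second(count: int, elapsed_s: float) -> float:
    """Operations per second, as reported in the sign/s and verify/s columns."""
    return count / elapsed_s


# 2048-bit private RSA operations, both runs with -async_jobs 8.
emeric_sign = ops_per_second(13442, 10.01)   # figures from Emeric's run
grant_sign = ops_per_second(102435, 10.00)   # figures from Grant's run

print(f"Emeric: {emeric_sign:.1f} sign/s")
print(f"Grant:  {grant_sign:.1f} sign/s")
print(f"ratio:  {grant_sign / emeric_sign:.1f}x")
```

The two rates match the sign/s values in the tables, and the gap between the two setups comes out at roughly 7.6x.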

What kind of qat device do you have? Mine is an 8955 (Coleto Creek).

> Doing a benchmark using haproxy and qat engine stall to ~450 connections/sec

At one point during my tests I was getting around 500 connections/sec with the qat engine. IIRC the issue was fixed when I raised the number of open files (ulimit -n) on the test client side. Your issue is probably not the same, though.
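
The open-file limit can also be checked and raised from a script on the load-generator side. A minimal sketch, assuming a Unix-like client where Python's standard resource module is available; the helper name and the 65536 target are arbitrary illustrations, not values taken from this thread:

```python
import resource


def raise_nofile_soft_limit(target: int) -> tuple:
    """Try to raise the soft RLIMIT_NOFILE toward `target`, capped by the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited hard limit imposes no cap; otherwise stay at or below it.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft != resource.RLIM_INFINITY and wanted > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError):
            pass  # keep the current limit if the system refuses the change
    return resource.getrlimit(resource.RLIMIT_NOFILE)


before = resource.getrlimit(resource.RLIMIT_NOFILE)
after = raise_nofile_soft_limit(65536)  # hypothetical target
print("soft/hard before:", before)
print("soft/hard after: ", after)
```

This only affects the calling process and its children, which is the same scope as running ulimit -n in the shell that starts the test client.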

>
> Stopping the injection, the haproxy process continue to steal cpu doing nothing (top shows ~50% of one core, mainly in user):
Hmm, an idle haproxy process with qat enabled consumes about 5% of a core in my test. 50% is too much:-(.

>
> here thre trace:
>
> [[email protected] ~]# strace -p 27085
> Process 27085 attached
> epoll_wait(3, {}, 200, 1000) = 0
> epoll_wait(3, {}, 200, 1000) = 0
> epoll_wait(3, {}, 200, 1000) = 0
> epoll_wait(3, {}, 200, 1000) = 0
> epoll_wait(3, {}, 200, 1000) = 0
>
> The epoll awake all seconds, seems normal.
>
> If i continue to inject re-using the same key (session resuming,no rsa computation), i observe ~1500 connections/src
>
> But stopping the injection the process steal 156% of cpu doing nothing ( core 1 20% in user and 80% in system, and core 2 76% in user):
>
> Here the trace:
> epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
> epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
> epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
> epoll_wait(3, {{EPOLLIN|EPOLLRDHUP, {u32=57, u64=57}}, {EPOLLIN|EPOLLRDHUP, {u32=56, u64=56}}, {EPOLLIN|EPOLLRDHUP, {u32=55, u64=55}}, {EPOLLIN|EPOLLRDHUP, {u32=54, u64=54}}, {EPOLLIN|EPOLLRDHUP, {u32=53, u64=53}}, {EPOLLIN|EPOLLRDHUP, {u32=52, u64=52}}, {EPOLLIN|EPOLLRDHUP, {u32=51, u64=51}}, {EPOLLIN|EPOLLRDHUP, {u32=50, u64=50}}, {EPOLLIN|EPOLLRDHUP, {u32=49, u64=49}}, {EPOLLIN|EPOLLRDHUP, {u32=48, u64=48}}, {EPOLLIN|EPOLLRDHUP, {u32=47, u64=47}}, {EPOLLIN|EPOLLRDHUP, {u32=45, u64=45}}, {EPOLLIN|EPOLLRDHUP, {u32=42, u64=42}}, {EPOLLIN|EPOLLRDHUP, {u32=44, u64=44}}, {EPOLLIN|EPOLLRDHUP, {u32=43, u64=43}}}, 200, 1000) = 15
>
> epoll_wait awake in very fast loop.
>
> When this point is reached, some of time, re-starting the injection will crash haproxy in segfault.
>
> Here my haproxy's config:
> global
> tune.ssl.default-dh-param 2048
> ssl-engine qat
> ssl-async
>
> listen gg
> mode http
Hmm, I have been testing with tcp mode (which is what we use in our prod env). Any chance you could run a test in tcp mode and see whether you can reproduce the same issue?

Thanks,

Grant
Willy Tarreau
Re: OpenSSL engine and async support
March 28, 2017 07:50AM
Hi guys,

On Mon, Mar 27, 2017 at 02:57:38PM -0700, Grant Zhang wrote:
> > sign verify sign/s verify/s
> > rsa 2048 bits 0.000745s 0.000034s 1342.9 29050.3
>
> Hmm, the numbers are less than what I've seen:
(...)

This is an Atom C2518 and it seems that --disable-prf has cut the performance
in half. We should receive an 8920 soon.

> > Stopping the injection, the haproxy process continue to steal cpu doing nothing (top shows ~50% of one core, mainly in user):
> Hmm, an idle haproxy process with qat enabled consumes about 5% of a core in
> my test. 50% is too much:-(.

In theory it should not consume anything anymore if it has nothing to do,
so maybe the 5% you observed will help understand what is happening.

> > Here my haproxy's config:
> > global
> > tune.ssl.default-dh-param 2048
> > ssl-engine qat
> > ssl-async
> >
> > listen gg
> > mode http
> Hmm, I have been testing with tcp mode (which is what we use in prod env).
> Any chance you have a test with tcp mode and see if you could reproduce the
> same issue?

I don't think there will be any difference but you're right, it would be
useful to know. However what I suspect based on epoll_wait() reporting
events that are not dealt with is that the fd owner is NULL (but it should
not be without the FD being closed) or is still set but does not point to
the proper object after too fast a reuse (eg: async engine vs connection).
This would also explain the crash.
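
The fast loop itself is what any level-triggered poller does when a ready fd is reported but nobody handles it. A minimal sketch of that behaviour, using a socketpair and Python's selectors module (which picks epoll on Linux) instead of haproxy's own poller; count_wakeups is a hypothetical helper written for this illustration:

```python
import selectors
import socket


def count_wakeups(drain: bool, rounds: int = 5) -> int:
    """Poll a readable socket `rounds` times; return how many polls reported it."""
    a, b = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(a, selectors.EVENT_READ)
    b.send(b"x")  # one pending byte makes `a` readable
    wakeups = 0
    try:
        for _ in range(rounds):
            events = sel.select(timeout=0.05)
            if events:
                wakeups += 1
                if drain:
                    a.recv(16)  # handling the event clears the readiness
    finally:
        sel.close()
        a.close()
        b.close()
    return wakeups


ignored = count_wakeups(drain=False)  # event reported but never handled
handled = count_wakeups(drain=True)   # event handled on the first wake-up
print("wake-ups when ignored:", ignored)
print("wake-ups when handled:", handled)
```

When the pending byte is never read, every poll returns immediately with the same fd, which is the pattern visible in the strace output; once it is read, the poller goes back to sleeping until its timeout.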

One test you could both run consists in disabling epoll and trying with
poll() by starting haproxy with "-de" (or "noepoll" in the global section).

Cheers,
Willy
Emeric Brun
Re: OpenSSL engine and async support
March 28, 2017 10:50AM
Hi Grant,
On 03/28/2017 07:38 AM, Willy Tarreau wrote:
> Hi guys,
>
> On Mon, Mar 27, 2017 at 02:57:38PM -0700, Grant Zhang wrote:
>>> sign verify sign/s verify/s
>>> rsa 2048 bits 0.000745s 0.000034s 1342.9 29050.3
>>
>> Hmm, the numbers are less than what I've seen:
> (...)
>
> This is an atom C2518 and it seems that --disable-prf has cut the performance
> in half. We should receive a 8920 soon.
>
>>> Stopping the injection, the haproxy process continue to steal cpu doing nothing (top shows ~50% of one core, mainly in user):
>> Hmm, an idle haproxy process with qat enabled consumes about 5% of a core in
>> my test. 50% is too much:-(.
>
> In theory it should not consume anything anymore if it has nothing to do,
> so maybe the 5% you observed will help understand what is happening.

I've just noticed 50% CPU usage right at start-up as soon as the engine is enabled (with or without ssl-async):
global
    tune.ssl.default-dh-param 2048
    ssl-engine qat
    # ssl-async

listen gg
    mode http
    bind 0.0.0.0:9443 ssl crt /root/2048.pem ciphers AES
    redirect location


R,
Emeric
Grant Zhang
Re: OpenSSL engine and async support
April 01, 2017 02:10AM
Hi Emeric,

Sorry for my delayed reply.


On 03/28/2017 01:47 AM, Emeric Brun wrote:
>
>> This is an atom C2518 and it seems that --disable-prf has cut the performance
>> in half. We should receive a 8920 soon.
>>
>>>> Stopping the injection, the haproxy process continue to steal cpu doing nothing (top shows ~50% of one core, mainly in user):
>>> Hmm, an idle haproxy process with qat enabled consumes about 5% of a core in
>>> my test. 50% is too much:-(.
>> In theory it should not consume anything anymore if it has nothing to do,
>> so maybe the 5% you observed will help understand what is happening.
> I've just noticed 50% cpu usage directly at start-up if we enable the engine (w or wout ssl-async):
> global
> tune.ssl.default-dh-param 2048
> ssl-engine qat
> # ssl-async
>
> listen gg
> mode http
> bind 0.0.0.0:9443 ssl crt /root/2048.pem ciphers AES
> redirect location
Somehow I cannot reproduce the cpu usage issue using the above config.
In my test with the above config, when haproxy is idle, pidstat shows 4%
cpu usage

11:49:14 PM 359247 3.33 1.33 0.00 4.67 1 haproxy_nodebug
11:49:17 PM 359247 3.33 1.33 0.00 4.67 1 haproxy_nodebug
11:49:20 PM 359247 2.67 1.33 0.00 4.00 1 haproxy_nodebug

When it is under load test the cpu usage jumps to 100%(single process mode):
11:51:26 PM 359247 85.67 21.67 0.00 107.33 8 haproxy_nodebug

I am not sure whether it is the different hardware(c2000 vs. 895X), or
some difference in software. Just something to check:
* your kernel version (I tested with 4.4/4.7/4.9 without problem
though), and qat driver version?
* openssl version (1.1.0b-e?)
* are you using the latest QAT_ENGINE https://github.com/01org/QAT_Engine
* I assume you use qat_contig_mem kernel module?
* are you using the following config file for your c2000 card?
https://github.com/01org/QAT_Engine/blob/master/qat/config/c2xxx/multi_process_optimized/c2xxx_qa_dev0.conf

Thanks,

Grant
Emeric Brun
Re: OpenSSL engine and async support
April 10, 2017 04:50PM
Hi Grant,

On 04/01/2017 02:01 AM, Grant Zhang wrote:
> Hi Emeric,
>
> Sorry for my delayed reply.
>
>
> On 03/28/2017 01:47 AM, Emeric Brun wrote:
>>
>>> This is an atom C2518 and it seems that --disable-prf has cut the performance
>>> in half. We should receive a 8920 soon.
>>>
>>>>> Stopping the injection, the haproxy process continue to steal cpu doing nothing (top shows ~50% of one core, mainly in user):
>>>> Hmm, an idle haproxy process with qat enabled consumes about 5% of a core in
>>>> my test. 50% is too much:-(.
>>> In theory it should not consume anything anymore if it has nothing to do,
>>> so maybe the 5% you observed will help understand what is happening.
>> I've just noticed 50% cpu usage directly at start-up if we enable the engine (w or wout ssl-async):
>> global
>> tune.ssl.default-dh-param 2048
>> ssl-engine qat
>> # ssl-async
>>
>> listen gg
>> mode http
>> bind 0.0.0.0:9443 ssl crt /root/2048.pem ciphers AES
>> redirect location
> Somehow I cannot reproduce the cpu usage issue using the above config. In my test with the above config, when haproxy is idle, pidstat shows 4% cpu usage
>
> 11:49:14 PM 359247 3.33 1.33 0.00 4.67 1 haproxy_nodebug
> 11:49:17 PM 359247 3.33 1.33 0.00 4.67 1 haproxy_nodebug
> 11:49:20 PM 359247 2.67 1.33 0.00 4.00 1 haproxy_nodebug
>
> When it is under load test the cpu usage jumps to 100%(single process mode):
> 11:51:26 PM 359247 85.67 21.67 0.00 107.33 8 haproxy_nodebug
>
> I am not sure whether it is the different hardware(c2000 vs. 895X), or some difference in software. Just something to check:
We've just received the dh8920, but the dh89xx qat config fails to load with it.

> * your kernel version (I tested with 4.4/4.7/4.9 without problem though), and qat driver version?
I'm using CentOS as described in Intel's doc:
[[email protected] QAT_Engine]# uname -a
Linux centos 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

and for qat
qatmux.l.2.6.0-60 (QAT1.5)
> * openssl version (1.1.0b-e?)
compiled 1.1.0e
> * are you using the latest QAT_ENGINE https://github.com/01org/QAT_Engine
Yes, I am.
> * I assume you use qat_contig_mem kernel module?
Yes, I am.
> * are you using the following config file for your c2000 card? https://github.com/01org/QAT_Engine/blob/master/qat/config/c2xxx/multi_process_optimized/c2xxx_qa_dev0.conf
I'm using the one provided with the driver, reviewed and patched by the Intel guys: the provided file did not match my chip, because it is written for 2 engines and mine has only one.
> Thanks,
>
> Grant
>
Could you provide patches rebased on the current dev master branch?

R,
Emeric
Grant Zhang
Re: OpenSSL engine and async support
April 10, 2017 05:20PM
> On Apr 10, 2017, at 07:42, Emeric Brun <[email protected]> wrote:
>

>> * openssl version (1.1.0b-e?)
> compiled 1.1.0e
>>
>>
> Could you provide patches rebased on current dev master branch?
I am kinda busy with another project, but I will try to provide rebased patches this week.

Thanks,

Grant