[PATCH RFC 00/10] crypto: engine: permit to batch requests

16 views
Skip to first unread message

Corentin Labbe

unread,
Jan 14, 2020, 8:59:52 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
Hello

The sun8i-ce hardware can work on multiple requests in one batch.
For this it use a task descriptor, and chain them.
For the moment, the driver does not use this mechanism and do requests
one at a time and issue an irq for each.

Using the chaning will permit to issue less interrupts, and increase
thoughput.

But the crypto/engine can enqueue lots of requests but can ran them only
one by one.

This serie introduce a way to batch requests in crypto/engine by
- setting a batch limit (1 by default)
- refactor the prepare/unprepare code to permit to have x requests
prepared/unprepared at the same time.

For testing the serie, the selftest are not enough, since it issue
request one at a time.
I have used LUKS for testing it.

Please give me what you think about this serie, specially maintainers
which have hardware with the same kind of capability.

Regards

Corentin Labbe (10):
crypto: sun8i-ce: move iv data to request context
crypto: sun8i-ce: increase task list size
crypto: sun8i-ce: split into prepare/run/unprepare
crypto: sun8i-ce: introduce the slot number
crypto: engine: transform cur_req in an array
crypto: engine: introduce ct
crypto: sun8i-ce: handle slot > 0
crypto: engine: add slot parameter
crypto: engine: permit to batch requests
crypto: sun8i-ce: use the new batch mechanism

crypto/crypto_engine.c | 76 +++++++----
.../allwinner/sun8i-ce/sun8i-ce-cipher.c | 121 +++++++++++++-----
.../crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 17 ++-
drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h | 17 ++-
drivers/crypto/omap-aes-gcm.c | 2 +-
drivers/crypto/omap-aes.c | 4 +-
drivers/crypto/omap-des.c | 4 +-
drivers/crypto/stm32/stm32-cryp.c | 8 +-
drivers/crypto/stm32/stm32-hash.c | 4 +-
include/crypto/engine.h | 27 +++-
10 files changed, 201 insertions(+), 79 deletions(-)

--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 8:59:53 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
Instead of storing IV data in the channel context, store them in the
request context.
Storing them in the channel structure was conceptualy wrong since they
are per request related.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
.../allwinner/sun8i-ce/sun8i-ce-cipher.c | 27 +++++++++----------
drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h | 10 ++++---
2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index 75e2bef2b363..6108cea0e0bd 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -91,7 +91,6 @@ static int sun8i_ce_cipher(struct skcipher_request *areq)
struct scatterlist *sg;
unsigned int todo, len, offset, ivsize;
dma_addr_t addr_iv = 0, addr_key = 0;
- void *backup_iv = NULL;
u32 common, sym;
int flow, i;
int nr_sgs = 0;
@@ -154,24 +153,24 @@ static int sun8i_ce_cipher(struct skcipher_request *areq)

ivsize = crypto_skcipher_ivsize(tfm);
if (areq->iv && crypto_skcipher_ivsize(tfm) > 0) {
- chan->ivlen = ivsize;
- chan->bounce_iv = kzalloc(ivsize, GFP_KERNEL | GFP_DMA);
- if (!chan->bounce_iv) {
+ rctx->ivlen = ivsize;
+ rctx->bounce_iv = kzalloc(ivsize, GFP_KERNEL | GFP_DMA);
+ if (!rctx->bounce_iv) {
err = -ENOMEM;
goto theend_key;
}
if (rctx->op_dir & CE_DECRYPTION) {
- backup_iv = kzalloc(ivsize, GFP_KERNEL);
- if (!backup_iv) {
+ rctx->backup_iv = kzalloc(ivsize, GFP_KERNEL);
+ if (!rctx->backup_iv) {
err = -ENOMEM;
goto theend_key;
}
offset = areq->cryptlen - ivsize;
- scatterwalk_map_and_copy(backup_iv, areq->src, offset,
- ivsize, 0);
+ scatterwalk_map_and_copy(rctx->backup_iv, areq->src,
+ offset, ivsize, 0);
}
- memcpy(chan->bounce_iv, areq->iv, ivsize);
- addr_iv = dma_map_single(ce->dev, chan->bounce_iv, chan->ivlen,
+ memcpy(rctx->bounce_iv, areq->iv, ivsize);
+ addr_iv = dma_map_single(ce->dev, rctx->bounce_iv, rctx->ivlen,
DMA_TO_DEVICE);
cet->t_iv = cpu_to_le32(addr_iv);
if (dma_mapping_error(ce->dev, addr_iv)) {
@@ -252,17 +251,17 @@ static int sun8i_ce_cipher(struct skcipher_request *areq)
theend_iv:
if (areq->iv && ivsize > 0) {
if (addr_iv)
- dma_unmap_single(ce->dev, addr_iv, chan->ivlen,
+ dma_unmap_single(ce->dev, addr_iv, rctx->ivlen,
DMA_TO_DEVICE);
offset = areq->cryptlen - ivsize;
if (rctx->op_dir & CE_DECRYPTION) {
- memcpy(areq->iv, backup_iv, ivsize);
- kzfree(backup_iv);
+ memcpy(areq->iv, rctx->backup_iv, ivsize);
+ kzfree(rctx->backup_iv);
} else {
scatterwalk_map_and_copy(areq->iv, areq->dst, offset,
ivsize, 0);
}
- kfree(chan->bounce_iv);
+ kfree(rctx->bounce_iv);
}

theend_key:
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
index 8f8404c84a4d..49507ef2ec63 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
@@ -129,8 +129,6 @@ struct ce_task {
/*
* struct sun8i_ce_flow - Information used by each flow
* @engine: ptr to the crypto_engine for this flow
- * @bounce_iv: buffer which contain the IV
- * @ivlen: size of bounce_iv
* @complete: completion for the current task on this flow
* @status: set to 1 by interrupt if task is done
* @t_phy: Physical address of task
@@ -139,8 +137,6 @@ struct ce_task {
*/
struct sun8i_ce_flow {
struct crypto_engine *engine;
- void *bounce_iv;
- unsigned int ivlen;
struct completion complete;
int status;
dma_addr_t t_phy;
@@ -183,10 +179,16 @@ struct sun8i_ce_dev {
* struct sun8i_cipher_req_ctx - context for a skcipher request
* @op_dir: direction (encrypt vs decrypt) for this request
* @flow: the flow to use for this request
+ * @backup_iv: buffer which contain the next IV to store
+ * @bounce_iv: buffer which contain a copy of IV
+ * @ivlen: size of bounce_iv
*/
struct sun8i_cipher_req_ctx {
u32 op_dir;
int flow;
+ void *backup_iv;
+ void *bounce_iv;
+ unsigned int ivlen;
};

/*
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 8:59:55 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
The CE can handle more than 1 task at once, so this patch increase the size of
the task list up to 8.
For the moment I did not see more gain beyong this value.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 4 ++--
drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h | 1 +
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
index f72346a44e69..e8bf7bf31061 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
@@ -321,7 +321,7 @@ static void sun8i_ce_free_chanlist(struct sun8i_ce_dev *ce, int i)
while (i >= 0) {
crypto_engine_exit(ce->chanlist[i].engine);
if (ce->chanlist[i].tl)
- dma_free_coherent(ce->dev, sizeof(struct ce_task),
+ dma_free_coherent(ce->dev, sizeof(struct ce_task) * MAXTASK,
ce->chanlist[i].tl,
ce->chanlist[i].t_phy);
i--;
@@ -356,7 +356,7 @@ static int sun8i_ce_allocate_chanlist(struct sun8i_ce_dev *ce)
goto error_engine;
}
ce->chanlist[i].tl = dma_alloc_coherent(ce->dev,
- sizeof(struct ce_task),
+ sizeof(struct ce_task) * MAXTASK,
&ce->chanlist[i].t_phy,
GFP_KERNEL);
if (!ce->chanlist[i].tl) {
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
index 49507ef2ec63..049b3175d755 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
@@ -73,6 +73,7 @@
#define CE_MAX_CLOCKS 3

#define MAXFLOW 4
+#define MAXTASK 8

/*
* struct ce_clock - Describe clocks used by sun8i-ce
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 8:59:57 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
This patch split the do_one_request into three.
Prepare will handle all DMA mapping and initialisation of the task
structure.
Unprepare will clean all DMA mapping.
And the do_one_request will be limited to just excuting the task.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
.../allwinner/sun8i-ce/sun8i-ce-cipher.c | 70 ++++++++++++++++---
drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h | 4 ++
2 files changed, 66 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index 6108cea0e0bd..401f39f144ea 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -78,8 +78,9 @@ static int sun8i_ce_cipher_fallback(struct skcipher_request *areq)
return err;
}

-static int sun8i_ce_cipher(struct skcipher_request *areq)
+static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req)
{
+ struct skcipher_request *areq = container_of(async_req, struct skcipher_request, base);
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
struct sun8i_cipher_tfm_ctx *op = crypto_skcipher_ctx(tfm);
struct sun8i_ce_dev *ce = op->ce;
@@ -237,7 +238,9 @@ static int sun8i_ce_cipher(struct skcipher_request *areq)
}

chan->timeout = areq->cryptlen;
- err = sun8i_ce_run_task(ce, flow, crypto_tfm_alg_name(areq->base.tfm));
+ rctx->nr_sgs = nr_sgs;
+ rctx->nr_sgd = nr_sgd;
+ return 0;

theend_sgs:
if (areq->src == areq->dst) {
@@ -271,13 +274,64 @@ static int sun8i_ce_cipher(struct skcipher_request *areq)
return err;
}

-static int sun8i_ce_handle_cipher_request(struct crypto_engine *engine, void *areq)
+int sun8i_ce_cipher_run(struct crypto_engine *engine, void *areq)
{
- int err;
struct skcipher_request *breq = container_of(areq, struct skcipher_request, base);
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(breq);
+ struct sun8i_cipher_tfm_ctx *op = crypto_skcipher_ctx(tfm);
+ struct sun8i_ce_dev *ce = op->ce;
+ struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(breq);
+ int flow, err;

- err = sun8i_ce_cipher(breq);
+ flow = rctx->flow;
+ err = sun8i_ce_run_task(ce, flow, crypto_tfm_alg_name(breq->base.tfm));
crypto_finalize_skcipher_request(engine, breq, err);
+ return 0;
+}
+
+static int sun8i_ce_cipher_unprepare(struct crypto_engine *engine, void *async_req)
+{
+ struct skcipher_request *areq = container_of(async_req, struct skcipher_request, base);
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
+ struct sun8i_cipher_tfm_ctx *op = crypto_skcipher_ctx(tfm);
+ struct sun8i_ce_dev *ce = op->ce;
+ struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(areq);
+ struct sun8i_ce_flow *chan;
+ struct ce_task *cet;
+ unsigned int ivsize, offset;
+ int nr_sgs = rctx->nr_sgs;
+ int nr_sgd = rctx->nr_sgd;
+ int flow;
+
+ flow = rctx->flow;
+ chan = &ce->chanlist[flow];
+ cet = chan->tl;
+ ivsize = crypto_skcipher_ivsize(tfm);
+
+ if (areq->src == areq->dst) {
+ dma_unmap_sg(ce->dev, areq->src, nr_sgs, DMA_BIDIRECTIONAL);
+ } else {
+ if (nr_sgs > 0)
+ dma_unmap_sg(ce->dev, areq->src, nr_sgs, DMA_TO_DEVICE);
+ dma_unmap_sg(ce->dev, areq->dst, nr_sgd, DMA_FROM_DEVICE);
+ }
+
+ if (areq->iv && ivsize > 0) {
+ if (cet->t_iv)
+ dma_unmap_single(ce->dev, cet->t_iv, rctx->ivlen,
+ DMA_TO_DEVICE);
+ offset = areq->cryptlen - ivsize;
+ if (rctx->op_dir & CE_DECRYPTION) {
+ memcpy(areq->iv, rctx->backup_iv, ivsize);
+ kzfree(rctx->backup_iv);
+ } else {
+ scatterwalk_map_and_copy(areq->iv, areq->dst, offset,
+ ivsize, 0);
+ }
+ kfree(rctx->bounce_iv);
+ }
+
+ dma_unmap_single(ce->dev, cet->t_key, op->keylen, DMA_TO_DEVICE);

return 0;
}
@@ -347,9 +401,9 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
crypto_tfm_alg_driver_name(&sktfm->base),
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));

- op->enginectx.op.do_one_request = sun8i_ce_handle_cipher_request;
- op->enginectx.op.prepare_request = NULL;
- op->enginectx.op.unprepare_request = NULL;
+ op->enginectx.op.do_one_request = sun8i_ce_cipher_run;
+ op->enginectx.op.prepare_request = sun8i_ce_cipher_prepare;
+ op->enginectx.op.unprepare_request = sun8i_ce_cipher_unprepare;

err = pm_runtime_get_sync(op->ce->dev);
if (err < 0)
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
index 049b3175d755..2d3325a13bf1 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
@@ -183,6 +183,8 @@ struct sun8i_ce_dev {
* @backup_iv: buffer which contain the next IV to store
* @bounce_iv: buffer which contain a copy of IV
* @ivlen: size of bounce_iv
+ * @nr_sgs: The number of source SG (as given by dma_map_sg())
+ * @nr_sgd: The number of destination SG (as given by dma_map_sg())
*/
struct sun8i_cipher_req_ctx {
u32 op_dir;
@@ -190,6 +192,8 @@ struct sun8i_cipher_req_ctx {
void *backup_iv;
void *bounce_iv;
unsigned int ivlen;
+ int nr_sgs;
+ int nr_sgd;
};

/*
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 8:59:58 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
This patch adds the slot position, for choosing which task in the task
list should be prepared/unprepared.
The slot is for the moment is always 0 until the infrastructure will handle
non 0 value.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 6 ++++--
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index 401f39f144ea..9c1f6c9eaaf9 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -96,6 +96,7 @@ static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req
int flow, i;
int nr_sgs = 0;
int nr_sgd = 0;
+ int slot = 0;
int err = 0;

algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
@@ -114,7 +115,7 @@ static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req

chan = &ce->chanlist[flow];

- cet = chan->tl;
+ cet = &chan->tl[slot];
memset(cet, 0, sizeof(struct ce_task));

cet->t_id = cpu_to_le32(flow);
@@ -301,11 +302,12 @@ static int sun8i_ce_cipher_unprepare(struct crypto_engine *engine, void *async_r
unsigned int ivsize, offset;
int nr_sgs = rctx->nr_sgs;
int nr_sgd = rctx->nr_sgd;
+ int slot = 0;
int flow;

flow = rctx->flow;
chan = &ce->chanlist[flow];
- cet = chan->tl;
+ cet = &chan->tl[slot];
ivsize = crypto_skcipher_ivsize(tfm);

if (areq->src == areq->dst) {
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
index e8bf7bf31061..bd355f4b95f3 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
@@ -120,7 +120,7 @@ int sun8i_ce_run_task(struct sun8i_ce_dev *ce, int flow, const char *name)
/* Be sure all data is written before enabling the task */
wmb();

- v = 1 | (ce->chanlist[flow].tl->t_common_ctl & 0x7F) << 8;
+ v = 1 | (ce->chanlist[flow].tl[0].t_common_ctl & 0x7F) << 8;
writel(v, ce->base + CE_TLR);
mutex_unlock(&ce->mlock);

--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 9:00:00 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
We will store the number of request in a batch in engine->ct.
This patch adds all loop to unprepare all requests of a batch.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
crypto/crypto_engine.c | 30 ++++++++++++++++++------------
include/crypto/engine.h | 2 ++
2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
index b72873550587..591dea5ddeec 100644
--- a/crypto/crypto_engine.c
+++ b/crypto/crypto_engine.c
@@ -28,6 +28,7 @@ static void crypto_finalize_request(struct crypto_engine *engine,
bool finalize_cur_req = false;
int ret;
struct crypto_engine_ctx *enginectx;
+ int i = 0;

spin_lock_irqsave(&engine->queue_lock, flags);
if (engine->cur_reqs[0].req == req)
@@ -35,17 +36,20 @@ static void crypto_finalize_request(struct crypto_engine *engine,
spin_unlock_irqrestore(&engine->queue_lock, flags);

if (finalize_cur_req) {
- enginectx = crypto_tfm_ctx(engine->cur_reqs[0].req->tfm);
- if (engine->cur_reqs[0].prepared &&
- enginectx->op.unprepare_request) {
- ret = enginectx->op.unprepare_request(engine, engine->cur_reqs[0].req);
- if (ret)
- dev_err(engine->dev, "failed to unprepare request\n");
- }
- engine->cur_reqs[0].req->complete(engine->cur_reqs[0].req, err);
+ do {
+ enginectx = crypto_tfm_ctx(engine->cur_reqs[i].req->tfm);
+ if (engine->cur_reqs[i].prepared &&
+ enginectx->op.unprepare_request) {
+ ret = enginectx->op.unprepare_request(engine, engine->cur_reqs[i].req);
+ if (ret)
+ dev_err(engine->dev, "failed to unprepare request\n");
+ }
+ engine->cur_reqs[i].prepared = false;
+ engine->cur_reqs[i].req->complete(engine->cur_reqs[i].req, err);
+ } while (++i < engine->ct);
spin_lock_irqsave(&engine->queue_lock, flags);
- engine->cur_reqs[0].prepared = false;
- engine->cur_reqs[0].req = NULL;
+ while (engine->ct-- > 0)
+ engine->cur_reqs[engine->ct].req = NULL;
spin_unlock_irqrestore(&engine->queue_lock, flags);
} else {
req->complete(req, err);
@@ -109,13 +113,14 @@ static void crypto_pump_requests(struct crypto_engine *engine,
goto out;
}

+ engine->ct = 0;
/* Get the fist request from the engine queue to handle */
backlog = crypto_get_backlog(&engine->queue);
async_req = crypto_dequeue_request(&engine->queue);
if (!async_req)
goto out;

- engine->cur_reqs[0].req = async_req;
+ engine->cur_reqs[engine->ct].req = async_req;
if (backlog)
backlog->complete(backlog, -EINPROGRESS);

@@ -144,8 +149,9 @@ static void crypto_pump_requests(struct crypto_engine *engine,
ret);
goto req_err;
}
- engine->cur_reqs[0].prepared = true;
+ engine->cur_reqs[engine->ct].prepared = true;
}
+ engine->ct++;
if (!enginectx->op.do_one_request) {
dev_err(engine->dev, "failed to do request\n");
ret = -EINVAL;
diff --git a/include/crypto/engine.h b/include/crypto/engine.h
index 362134e226f4..021136a47b93 100644
--- a/include/crypto/engine.h
+++ b/include/crypto/engine.h
@@ -50,6 +50,7 @@ struct cur_req {
* @priv_data: the engine private data
* @rmax: The number of request which can be processed in one batch
* @cur_reqs: A list for requests to be processed
+ * @ct: How many requests will be handled in one batch
*/
struct crypto_engine {
char name[ENGINE_NAME_LEN];
@@ -73,6 +74,7 @@ struct crypto_engine {
void *priv_data;
int rmax;
struct cur_req *cur_reqs;
+ int ct;
};

/*
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 9:00:00 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
For having the ability of doing a batch of request in one do_one_request(), we
should be able to store them in an array. (for unpreparing them later).
This patch converts cur_req in an array of request, but for the moment
hardcode the maximum to 1.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
crypto/crypto_engine.c | 32 ++++++++++++++++++--------------
include/crypto/engine.h | 19 +++++++++++++++----
2 files changed, 33 insertions(+), 18 deletions(-)

diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
index eb029ff1e05a..b72873550587 100644
--- a/crypto/crypto_engine.c
+++ b/crypto/crypto_engine.c
@@ -30,26 +30,27 @@ static void crypto_finalize_request(struct crypto_engine *engine,
struct crypto_engine_ctx *enginectx;

spin_lock_irqsave(&engine->queue_lock, flags);
- if (engine->cur_req == req)
+ if (engine->cur_reqs[0].req == req)
finalize_cur_req = true;
spin_unlock_irqrestore(&engine->queue_lock, flags);

if (finalize_cur_req) {
- enginectx = crypto_tfm_ctx(req->tfm);
- if (engine->cur_req_prepared &&
+ enginectx = crypto_tfm_ctx(engine->cur_reqs[0].req->tfm);
+ if (engine->cur_reqs[0].prepared &&
enginectx->op.unprepare_request) {
- ret = enginectx->op.unprepare_request(engine, req);
+ ret = enginectx->op.unprepare_request(engine, engine->cur_reqs[0].req);
if (ret)
dev_err(engine->dev, "failed to unprepare request\n");
}
+ engine->cur_reqs[0].req->complete(engine->cur_reqs[0].req, err);
spin_lock_irqsave(&engine->queue_lock, flags);
- engine->cur_req = NULL;
- engine->cur_req_prepared = false;
+ engine->cur_reqs[0].prepared = false;
+ engine->cur_reqs[0].req = NULL;
spin_unlock_irqrestore(&engine->queue_lock, flags);
+ } else {
+ req->complete(req, err);
}

- req->complete(req, err);
-
kthread_queue_work(engine->kworker, &engine->pump_requests);
}

@@ -74,7 +75,7 @@ static void crypto_pump_requests(struct crypto_engine *engine,
spin_lock_irqsave(&engine->queue_lock, flags);

/* Make sure we are not already running a request */
- if (engine->cur_req)
+ if (engine->cur_reqs[0].req)
goto out;

/* If another context is idling then defer */
@@ -114,7 +115,7 @@ static void crypto_pump_requests(struct crypto_engine *engine,
if (!async_req)
goto out;

- engine->cur_req = async_req;
+ engine->cur_reqs[0].req = async_req;
if (backlog)
backlog->complete(backlog, -EINPROGRESS);

@@ -143,14 +144,14 @@ static void crypto_pump_requests(struct crypto_engine *engine,
ret);
goto req_err;
}
- engine->cur_req_prepared = true;
+ engine->cur_reqs[0].prepared = true;
}
if (!enginectx->op.do_one_request) {
dev_err(engine->dev, "failed to do request\n");
ret = -EINVAL;
goto req_err;
}
- ret = enginectx->op.do_one_request(engine, async_req);
+ ret = enginectx->op.do_one_request(engine, engine->cur_reqs[0].req);
if (ret) {
dev_err(engine->dev, "Failed to do one request from queue: %d\n", ret);
goto req_err;
@@ -158,7 +159,7 @@ static void crypto_pump_requests(struct crypto_engine *engine,
return;

req_err:
- crypto_finalize_request(engine, async_req, ret);
+ crypto_finalize_request(engine, engine->cur_reqs[0].req, ret);
return;

out:
@@ -411,10 +412,13 @@ struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
engine->running = false;
engine->busy = false;
engine->idling = false;
- engine->cur_req_prepared = false;
engine->priv_data = dev;
snprintf(engine->name, sizeof(engine->name),
"%s-engine", dev_name(dev));
+ engine->rmax = 1;
+ engine->cur_reqs = devm_kzalloc(dev, sizeof(struct cur_req) * engine->rmax, GFP_KERNEL);
+ if (!engine->cur_reqs)
+ return NULL;

crypto_init_queue(&engine->queue, CRYPTO_ENGINE_MAX_QLEN);
spin_lock_init(&engine->queue_lock);
diff --git a/include/crypto/engine.h b/include/crypto/engine.h
index e29cd67f93c7..362134e226f4 100644
--- a/include/crypto/engine.h
+++ b/include/crypto/engine.h
@@ -18,13 +18,23 @@
#include <crypto/skcipher.h>

#define ENGINE_NAME_LEN 30
+
+/*
+ * struct cur_req - Represent a request to be processed
+ * @prepared: Does the request was prepared
+ * @req: The request to be processed
+ */
+struct cur_req {
+ bool prepared;
+ struct crypto_async_request *req;
+};
+
/*
* struct crypto_engine - crypto hardware engine
* @name: the engine name
* @idling: the engine is entering idle state
* @busy: request pump is busy
* @running: the engine is on working
- * @cur_req_prepared: current request is prepared
* @list: link with the global crypto engine list
* @queue_lock: spinlock to syncronise access to request queue
* @queue: the crypto queue of the engine
@@ -38,14 +48,14 @@
* @kworker: kthread worker struct for request pump
* @pump_requests: work struct for scheduling work to the request pump
* @priv_data: the engine private data
- * @cur_req: the current request which is on processing
+ * @rmax: The number of request which can be processed in one batch
+ * @cur_reqs: A list for requests to be processed
*/
struct crypto_engine {
char name[ENGINE_NAME_LEN];
bool idling;
bool busy;
bool running;
- bool cur_req_prepared;

struct list_head list;
spinlock_t queue_lock;
@@ -61,7 +71,8 @@ struct crypto_engine {
struct kthread_work pump_requests;

void *priv_data;
- struct crypto_async_request *cur_req;
+ int rmax;
+ struct cur_req *cur_reqs;
};

/*
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 9:00:02 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
Handle the fact a slot could be different than 0.
This imply:
- linking two task via next
- set interrupt flag just before running the batch in the last task.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 8 +++++++-
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 2 ++
2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index 9c1f6c9eaaf9..d56c992fbf93 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -99,6 +99,9 @@ static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req
int slot = 0;
int err = 0;

+ if (slot < 0 || slot >= MAXTASK)
+ return -EINVAL;
+
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);

dev_dbg(ce->dev, "%s %s %u %x IV(%p %u) key=%u\n", __func__,
@@ -120,8 +123,11 @@ static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req

cet->t_id = cpu_to_le32(flow);
common = ce->variant->alg_cipher[algt->ce_algo_id];
- common |= rctx->op_dir | CE_COMM_INT;
+ common |= rctx->op_dir;
cet->t_common_ctl = cpu_to_le32(common);
+ if (slot > 0)
+ chan->tl[slot - 1].next = cpu_to_le32(chan->t_phy + 176 * slot);
+
/* CTS and recent CE (H6) need length in bytes, in word otherwise */
if (ce->variant->has_t_dlen_in_bytes)
cet->t_dlen = cpu_to_le32(areq->cryptlen);
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
index bd355f4b95f3..39bf684c0ff5 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
@@ -106,6 +106,8 @@ int sun8i_ce_run_task(struct sun8i_ce_dev *ce, int flow, const char *name)
#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
ce->chanlist[flow].stat_req++;
#endif
+ /* mark last one */
+ ce->chanlist[flow].tl[ce->chanlist[flow].engine->ct - 1].t_common_ctl |= cpu_to_le32(CE_COMM_INT);

mutex_lock(&ce->mlock);

--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 9:00:04 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
prepare/unprepare functions should be able to know which slot in a batch
should be used for preparing a request.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
crypto/crypto_engine.c | 4 ++--
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 6 ++----
drivers/crypto/omap-aes-gcm.c | 2 +-
drivers/crypto/omap-aes.c | 4 ++--
drivers/crypto/omap-des.c | 4 ++--
drivers/crypto/stm32/stm32-cryp.c | 8 ++++----
drivers/crypto/stm32/stm32-hash.c | 4 ++--
include/crypto/engine.h | 4 ++--
8 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
index 591dea5ddeec..e23a398ba330 100644
--- a/crypto/crypto_engine.c
+++ b/crypto/crypto_engine.c
@@ -40,7 +40,7 @@ static void crypto_finalize_request(struct crypto_engine *engine,
enginectx = crypto_tfm_ctx(engine->cur_reqs[i].req->tfm);
if (engine->cur_reqs[i].prepared &&
enginectx->op.unprepare_request) {
- ret = enginectx->op.unprepare_request(engine, engine->cur_reqs[i].req);
+ ret = enginectx->op.unprepare_request(engine, engine->cur_reqs[i].req, i);
if (ret)
dev_err(engine->dev, "failed to unprepare request\n");
}
@@ -143,7 +143,7 @@ static void crypto_pump_requests(struct crypto_engine *engine,
enginectx = crypto_tfm_ctx(async_req->tfm);

if (enginectx->op.prepare_request) {
- ret = enginectx->op.prepare_request(engine, async_req);
+ ret = enginectx->op.prepare_request(engine, async_req, engine->ct);
if (ret) {
dev_err(engine->dev, "failed to prepare request: %d\n",
ret);
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index d56c992fbf93..41d18c18d1d1 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -78,7 +78,7 @@ static int sun8i_ce_cipher_fallback(struct skcipher_request *areq)
return err;
}

-static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req)
+static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req, int slot)
{
struct skcipher_request *areq = container_of(async_req, struct skcipher_request, base);
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
@@ -96,7 +96,6 @@ static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req
int flow, i;
int nr_sgs = 0;
int nr_sgd = 0;
- int slot = 0;
int err = 0;

if (slot < 0 || slot >= MAXTASK)
@@ -296,7 +295,7 @@ int sun8i_ce_cipher_run(struct crypto_engine *engine, void *areq)
return 0;
}

-static int sun8i_ce_cipher_unprepare(struct crypto_engine *engine, void *async_req)
+static int sun8i_ce_cipher_unprepare(struct crypto_engine *engine, void *async_req, int slot)
{
struct skcipher_request *areq = container_of(async_req, struct skcipher_request, base);
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
@@ -308,7 +307,6 @@ static int sun8i_ce_cipher_unprepare(struct crypto_engine *engine, void *async_r
unsigned int ivsize, offset;
int nr_sgs = rctx->nr_sgs;
int nr_sgd = rctx->nr_sgd;
- int slot = 0;
int flow;

flow = rctx->flow;
diff --git a/drivers/crypto/omap-aes-gcm.c b/drivers/crypto/omap-aes-gcm.c
index 32dc00dc570b..0fea3dd40378 100644
--- a/drivers/crypto/omap-aes-gcm.c
+++ b/drivers/crypto/omap-aes-gcm.c
@@ -213,7 +213,7 @@ static int omap_aes_gcm_handle_queue(struct omap_aes_dev *dd,
return 0;
}

-static int omap_aes_gcm_prepare_req(struct crypto_engine *engine, void *areq)
+static int omap_aes_gcm_prepare_req(struct crypto_engine *engine, void *areq, int slot)
{
struct aead_request *req = container_of(areq, struct aead_request,
base);
diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index 824ddf2a66ff..4d9caa7ce8da 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -427,7 +427,7 @@ static int omap_aes_handle_queue(struct omap_aes_dev *dd,
}

static int omap_aes_prepare_req(struct crypto_engine *engine,
- void *areq)
+ void *areq, int slot)
{
struct skcipher_request *req = container_of(areq, struct skcipher_request, base);
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(
@@ -632,7 +632,7 @@ static int omap_aes_ctr_decrypt(struct skcipher_request *req)
}

static int omap_aes_prepare_req(struct crypto_engine *engine,
- void *req);
+ void *req, int slot);
static int omap_aes_crypt_req(struct crypto_engine *engine,
void *req);

diff --git a/drivers/crypto/omap-des.c b/drivers/crypto/omap-des.c
index 8eda43319204..92c5fb1d4b83 100644
--- a/drivers/crypto/omap-des.c
+++ b/drivers/crypto/omap-des.c
@@ -524,7 +524,7 @@ static int omap_des_handle_queue(struct omap_des_dev *dd,
}

static int omap_des_prepare_req(struct crypto_engine *engine,
- void *areq)
+ void *areq, int slot)
{
struct skcipher_request *req = container_of(areq, struct skcipher_request, base);
struct omap_des_ctx *ctx = crypto_skcipher_ctx(
@@ -711,7 +711,7 @@ static int omap_des_cbc_decrypt(struct skcipher_request *req)
}

static int omap_des_prepare_req(struct crypto_engine *engine,
- void *areq);
+ void *areq, int slot);
static int omap_des_crypt_req(struct crypto_engine *engine,
void *areq);

diff --git a/drivers/crypto/stm32/stm32-cryp.c b/drivers/crypto/stm32/stm32-cryp.c
index d347a1d6e351..7c65b526d20e 100644
--- a/drivers/crypto/stm32/stm32-cryp.c
+++ b/drivers/crypto/stm32/stm32-cryp.c
@@ -684,7 +684,7 @@ static int stm32_cryp_cpu_start(struct stm32_cryp *cryp)

static int stm32_cryp_cipher_one_req(struct crypto_engine *engine, void *areq);
static int stm32_cryp_prepare_cipher_req(struct crypto_engine *engine,
- void *areq);
+ void *areq, int slot);

static int stm32_cryp_init_tfm(struct crypto_skcipher *tfm)
{
@@ -700,7 +700,7 @@ static int stm32_cryp_init_tfm(struct crypto_skcipher *tfm)

static int stm32_cryp_aead_one_req(struct crypto_engine *engine, void *areq);
static int stm32_cryp_prepare_aead_req(struct crypto_engine *engine,
- void *areq);
+ void *areq, int slot);

static int stm32_cryp_aes_aead_init(struct crypto_aead *tfm)
{
@@ -1015,7 +1015,7 @@ static int stm32_cryp_prepare_req(struct skcipher_request *req,
}

static int stm32_cryp_prepare_cipher_req(struct crypto_engine *engine,
- void *areq)
+ void *areq, int slot)
{
struct skcipher_request *req = container_of(areq,
struct skcipher_request,
@@ -1039,7 +1039,7 @@ static int stm32_cryp_cipher_one_req(struct crypto_engine *engine, void *areq)
return stm32_cryp_cpu_start(cryp);
}

-static int stm32_cryp_prepare_aead_req(struct crypto_engine *engine, void *areq)
+static int stm32_cryp_prepare_aead_req(struct crypto_engine *engine, void *areq, int slot)
{
struct aead_request *req = container_of(areq, struct aead_request,
base);
diff --git a/drivers/crypto/stm32/stm32-hash.c b/drivers/crypto/stm32/stm32-hash.c
index 167b80eec437..4a696c48126c 100644
--- a/drivers/crypto/stm32/stm32-hash.c
+++ b/drivers/crypto/stm32/stm32-hash.c
@@ -824,7 +824,7 @@ static int stm32_hash_hw_init(struct stm32_hash_dev *hdev,
}

static int stm32_hash_one_request(struct crypto_engine *engine, void *areq);
-static int stm32_hash_prepare_req(struct crypto_engine *engine, void *areq);
+static int stm32_hash_prepare_req(struct crypto_engine *engine, void *areq, int slot);

static int stm32_hash_handle_queue(struct stm32_hash_dev *hdev,
struct ahash_request *req)
@@ -832,7 +832,7 @@ static int stm32_hash_handle_queue(struct stm32_hash_dev *hdev,
return crypto_transfer_hash_request_to_engine(hdev->engine, req);
}

-static int stm32_hash_prepare_req(struct crypto_engine *engine, void *areq)
+static int stm32_hash_prepare_req(struct crypto_engine *engine, void *areq, int slot)
{
struct ahash_request *req = container_of(areq, struct ahash_request,
base);
diff --git a/include/crypto/engine.h b/include/crypto/engine.h
index 021136a47b93..55d3dbc2498c 100644
--- a/include/crypto/engine.h
+++ b/include/crypto/engine.h
@@ -85,9 +85,9 @@ struct crypto_engine {
*/
struct crypto_engine_op {
int (*prepare_request)(struct crypto_engine *engine,
- void *areq);
+ void *areq, int slot);
int (*unprepare_request)(struct crypto_engine *engine,
- void *areq);
+ void *areq, int slot);
int (*do_one_request)(struct crypto_engine *engine,
void *areq);
};
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 9:00:07 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
Now everything is ready, this patch permits to choose the number of
request to batch.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
crypto/crypto_engine.c | 32 +++++++++++++++++++++++++++-----
include/crypto/engine.h | 2 ++
2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
index e23a398ba330..e9cd9ec9a732 100644
--- a/crypto/crypto_engine.c
+++ b/crypto/crypto_engine.c
@@ -114,6 +114,7 @@ static void crypto_pump_requests(struct crypto_engine *engine,
}

engine->ct = 0;
+retry:
/* Get the fist request from the engine queue to handle */
backlog = crypto_get_backlog(&engine->queue);
async_req = crypto_dequeue_request(&engine->queue);
@@ -151,7 +152,10 @@ static void crypto_pump_requests(struct crypto_engine *engine,
}
engine->cur_reqs[engine->ct].prepared = true;
}
- engine->ct++;
+ if (++engine->ct < engine->rmax && engine->queue.qlen > 0) {
+ spin_lock_irqsave(&engine->queue_lock, flags);
+ goto retry;
+ }
if (!enginectx->op.do_one_request) {
dev_err(engine->dev, "failed to do request\n");
ret = -EINVAL;
@@ -393,15 +397,18 @@ int crypto_engine_stop(struct crypto_engine *engine)
EXPORT_SYMBOL_GPL(crypto_engine_stop);

/**
- * crypto_engine_alloc_init - allocate crypto hardware engine structure and
+ * crypto_engine_alloc_init2 - allocate crypto hardware engine structure and
* initialize it.
* @dev: the device attached with one hardware engine
* @rt: whether this queue is set to run as a realtime task
+ * @rmax: The number of request that the engine can batch in one
+ * @qlen: The size of the crypto queue
*
* This must be called from context that can sleep.
* Return: the crypto engine structure on success, else NULL.
*/
-struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
+struct crypto_engine *crypto_engine_alloc_init2(struct device *dev, bool rt,
+ int rmax, int qlen)
{
struct sched_param param = { .sched_priority = MAX_RT_PRIO / 2 };
struct crypto_engine *engine;
@@ -421,12 +428,12 @@ struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
engine->priv_data = dev;
snprintf(engine->name, sizeof(engine->name),
"%s-engine", dev_name(dev));
- engine->rmax = 1;
+ engine->rmax = rmax;
engine->cur_reqs = devm_kzalloc(dev, sizeof(struct cur_req) * engine->rmax, GFP_KERNEL);
if (!engine->cur_reqs)
return NULL;

- crypto_init_queue(&engine->queue, CRYPTO_ENGINE_MAX_QLEN);
+ crypto_init_queue(&engine->queue, qlen);
spin_lock_init(&engine->queue_lock);

engine->kworker = kthread_create_worker(0, "%s", engine->name);
@@ -443,6 +450,21 @@ struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)

return engine;
}
+EXPORT_SYMBOL_GPL(crypto_engine_alloc_init2);
+
+/**
+ * crypto_engine_alloc_init - allocate crypto hardware engine structure and
+ * initialize it.
+ * @dev: the device attached with one hardware engine
+ * @rt: whether this queue is set to run as a realtime task
+ *
+ * This must be called from context that can sleep.
+ * Return: the crypto engine structure on success, else NULL.
+ */
+struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
+{
+ return crypto_engine_alloc_init2(dev, rt, 1, CRYPTO_ENGINE_MAX_QLEN);
+}
EXPORT_SYMBOL_GPL(crypto_engine_alloc_init);

/**
diff --git a/include/crypto/engine.h b/include/crypto/engine.h
index 55d3dbc2498c..fe0dfea8bf07 100644
--- a/include/crypto/engine.h
+++ b/include/crypto/engine.h
@@ -115,6 +115,8 @@ void crypto_finalize_skcipher_request(struct crypto_engine *engine,
int crypto_engine_start(struct crypto_engine *engine);
int crypto_engine_stop(struct crypto_engine *engine);
struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt);
+struct crypto_engine *crypto_engine_alloc_init2(struct device *dev, bool rt,
+ int rmax, int qlen);
int crypto_engine_exit(struct crypto_engine *engine);

#endif /* _CRYPTO_ENGINE_H */
--
2.24.1

Corentin Labbe

unread,
Jan 14, 2020, 9:00:07 AM1/14/20
to alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, iuliana...@nxp.com, horia....@nxp.com, aymen....@nxp.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com, Corentin Labbe
Now all infrastructure to batch request are in place, it is time to use
it.
Introduce some debug for it also.

Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
---
.../crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 14 ++++++++------
drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c | 9 ++++++---
drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h | 2 ++
3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index 41d18c18d1d1..fe5374788304 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -103,20 +103,22 @@ static int sun8i_ce_cipher_prepare(struct crypto_engine *engine, void *async_req

algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);

- dev_dbg(ce->dev, "%s %s %u %x IV(%p %u) key=%u\n", __func__,
+ dev_dbg(ce->dev, "%s %s %u %x IV(%p %u) key=%u slot=%d\n", __func__,
crypto_tfm_alg_name(areq->base.tfm),
areq->cryptlen,
rctx->op_dir, areq->iv, crypto_skcipher_ivsize(tfm),
- op->keylen);
-
-#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
- algt->stat_req++;
-#endif
+ op->keylen, slot);

flow = rctx->flow;

chan = &ce->chanlist[flow];

+#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
+ algt->stat_req++;
+ if (chan->engine->ct + 1 > chan->tmax)
+ chan->tmax = chan->engine->ct + 1;
+#endif
+
cet = &chan->tl[slot];
memset(cet, 0, sizeof(struct ce_task));

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
index 39bf684c0ff5..7cd98c227357 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-core.c
@@ -104,7 +104,7 @@ int sun8i_ce_run_task(struct sun8i_ce_dev *ce, int flow, const char *name)
int err = 0;

#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
- ce->chanlist[flow].stat_req++;
+ ce->chanlist[flow].stat_req += ce->chanlist[flow].engine->ct;
#endif
/* mark last one */
ce->chanlist[flow].tl[ce->chanlist[flow].engine->ct - 1].t_common_ctl |= cpu_to_le32(CE_COMM_INT);
@@ -287,7 +287,10 @@ static int sun8i_ce_dbgfs_read(struct seq_file *seq, void *v)
int i;

for (i = 0; i < MAXFLOW; i++)
- seq_printf(seq, "Channel %d: nreq %lu\n", i, ce->chanlist[i].stat_req);
+ seq_printf(seq, "Channel %d: nreq %lu tmax %d eqlen=%d/%d\n", i,
+ ce->chanlist[i].stat_req, ce->chanlist[i].tmax,
+ ce->chanlist[i].engine->queue.qlen,
+ ce->chanlist[i].engine->queue.max_qlen);

for (i = 0; i < ARRAY_SIZE(ce_algs); i++) {
if (!ce_algs[i].ce)
@@ -345,7 +348,7 @@ static int sun8i_ce_allocate_chanlist(struct sun8i_ce_dev *ce)
for (i = 0; i < MAXFLOW; i++) {
init_completion(&ce->chanlist[i].complete);

- ce->chanlist[i].engine = crypto_engine_alloc_init(ce->dev, true);
+ ce->chanlist[i].engine = crypto_engine_alloc_init2(ce->dev, true, MAXTASK, MAXTASK * 2);
if (!ce->chanlist[i].engine) {
dev_err(ce->dev, "Cannot allocate engine\n");
i--;
diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
index 2d3325a13bf1..22bb15fea476 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h
@@ -135,6 +135,7 @@ struct ce_task {
* @t_phy: Physical address of task
* @tl: pointer to the current ce_task for this flow
* @stat_req: number of request done by this flow
+ * @tmax: The maximum number of tasks done in one batch
*/
struct sun8i_ce_flow {
struct crypto_engine *engine;
@@ -145,6 +146,7 @@ struct sun8i_ce_flow {
struct ce_task *tl;
#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
unsigned long stat_req;
+ int tmax;
#endif
};

--
2.24.1

Corentin Labbe

unread,
Jan 16, 2020, 8:16:08 AM1/16/20
to Iuliana Prodan, alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, Horia Geanta, Aymen Sghaier, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com
On Thu, Jan 16, 2020 at 11:33:24AM +0000, Iuliana Prodan wrote:
> On 1/14/2020 3:59 PM, Corentin Labbe wrote:
> > Hello
> >
> > The sun8i-ce hardware can work on multiple requests in one batch.
> > For this it use a task descriptor, and chain them.
> > For the moment, the driver does not use this mechanism and do requests
> > one at a time and issue an irq for each.
> >
> > Using the chaning will permit to issue less interrupts, and increase
> > thoughput.
> >
> > But the crypto/engine can enqueue lots of requests but can ran them only
> > one by one.
> >
> > This serie introduce a way to batch requests in crypto/engine by
> > - setting a batch limit (1 by default)
> > - refactor the prepare/unprepare code to permit to have x requests
> > prepared/unprepared at the same time.
> >
> > For testing the serie, the selftest are not enough, since it issue
> > request one at a time.
> > I have used LUKS for testing it.
> >
> > Please give me what you think about this serie, specially maintainers
> > which have hardware with the same kind of capability.
> >
> Hi,
>
> I'm working on CAAM, on adding support for crypto-engine.
> These modifications are not working on CAAM.
> They seem to be specific to requests that are linked. CAAM can work on
> multiple request, at the same time, but they are processed independently.
> So, I believe the parallelization is a good idea, but the requests still
> need to be independent.
> I'll follow up with comments on each patch.

Hello

Thanks for the review.
Yes my serie is for doing "linked" request.
For the CAAM, if you can do multiple request independently, why not having x crypto engine ? (like sun8i-ce/sun8i-ss/amlogic)

>
> Also, IMO you should send the patches for crypto-engine improvements in
> a separate series from the one for allwinner driver.

For this RFC serie, I tried to do real atomic patch, for let people see the whole process.

Regards

Corentin Labbe

unread,
Jan 16, 2020, 8:21:13 AM1/16/20
to Iuliana Prodan, alexandr...@st.com, da...@davemloft.net, her...@gondor.apana.org.au, mcoquel...@gmail.com, mri...@kernel.org, we...@csie.org, Horia Geanta, Aymen Sghaier, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linux...@st-md-mailman.stormreply.com, linux...@googlegroups.com
On Thu, Jan 16, 2020 at 11:34:19AM +0000, Iuliana Prodan wrote:
> On 1/14/2020 4:00 PM, Corentin Labbe wrote:
> > We will store the number of request in a batch in engine->ct.
> > This patch adds all loop to unprepare all requests of a batch.
> >
> > Signed-off-by: Corentin Labbe <clabbe....@gmail.com>
> > ---
> > crypto/crypto_engine.c | 30 ++++++++++++++++++------------
> > include/crypto/engine.h | 2 ++
> > 2 files changed, 20 insertions(+), 12 deletions(-)
> >
> > diff --git a/crypto/crypto_engine.c b/crypto/crypto_engine.c
> > index b72873550587..591dea5ddeec 100644
> > --- a/crypto/crypto_engine.c
> > +++ b/crypto/crypto_engine.c
> > @@ -28,6 +28,7 @@ static void crypto_finalize_request(struct crypto_engine *engine,
> > bool finalize_cur_req = false;
> > int ret;
> > struct crypto_engine_ctx *enginectx;
> > + int i = 0;
> >
> > spin_lock_irqsave(&engine->queue_lock, flags);
> > if (engine->cur_reqs[0].req == req)
> You're checking here just the first request, but do the completion for
> all? Why? Shouldn't we check for each request if it was done by hw or not?

The first request is a sort of key for the whole batch.
>
> I've also seen that the do_one_request is called only on the first
> request, from the batch.

Since the request are linked, this is not a problem.
But I miss this explanaition in the code.

>
> In your driver you do the prepare/unprepare for the whole batch at once,
> but not all drivers, who uses crypto-engine, are doing this (see virtio,
> amlogic, stm32). And I don't know if they can...

prepare is optionnal, and unprepare is optional even if prepare is done.
Furthermore, doing prepare/unprepare is optional per request.
I have tested this serie on sun8i-ss and amlogic which dont use prepare/unprepare.

Reply all
Reply to author
Forward
0 new messages